Eric Lee / smarc-fsl-linux-kernel

15 Jan, 2016

1 commit

ea535e418 dma-debug: switch check from _text to _stext

In include/asm-generic/sections.h:

/*
* Usage guidelines:
* _text, _data: architecture specific, don't use them in
* arch-independent code
* [_stext, _etext]: contains .text.* sections, may also contain
* .rodata.*
* and/or .init.* sections

_text is not guaranteed across architectures. Architectures such as ARM
may reuse parts which are not actually text and erroneously trigger a bug.
Switch to using _stext which is guaranteed to contain text sections.

Came out of https://lkml.kernel.org/g/

Signed-off-by: Laura Abbott
Reviewed-by: Kees Cook
Cc: Russell King
Cc: Arnd Bergmann
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Laura Abbott
2016-01-15 08:00:49 +0800
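
The range check behind the dma-debug commit above can be modelled in a few lines. This is a minimal sketch, not the kernel code: the helper name `overlap` and all addresses are hypothetical, chosen only to show how a buffer that sits between `_text` and `_stext` trips a check against `_text` but not one against `_stext`.

```python
def overlap(addr, length, start, end):
    """Return True if [addr, addr + length) intersects [start, end)."""
    return not (addr + length <= start or addr >= end)


# Hypothetical section layout, for illustration only.
TEXT = 0x8000    # stands in for _text (architecture specific)
STEXT = 0x9000   # stands in for _stext (start of real text)
ETEXT = 0xA000   # stands in for _etext

# A buffer that lives in the non-text area between _text and _stext.
buf_addr, buf_len = 0x8800, 0x100

old_check = overlap(buf_addr, buf_len, TEXT, ETEXT)    # check against _text
new_check = overlap(buf_addr, buf_len, STEXT, ETEXT)   # check against _stext
print(old_check, new_check)
```

With this layout the `_text`-based check flags the buffer even though it holds no code, while the `_stext`-based check does not.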

14 Jan, 2016

1 commit

d080827f8 Merge tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
"The bulk of this has appeared in -next and independently received a
build success notification from the kbuild robot. The 'for-4.5/block-dax'
topic branch was rebased over the weekend to drop the "block
device end-of-life" rework that Al would like to see re-implemented
with a notifier, and to address bug reports against the badblocks
integration.

There is pending feedback against "libnvdimm: Add a poison list and
export badblocks" received last week. Linda identified some localized
fixups that we will handle incrementally.

Summary:

- Media error handling: The 'badblocks' implementation that
originated in md-raid is up-levelled to a generic capability of a
block device. This initial implementation is limited to being
consulted in the pmem block-i/o path. Later, 'badblocks' will be
consulted when creating dax mappings.

- Raw block device dax: For virtualization and other cases that want
large contiguous mappings of persistent memory, add the capability
to dax-mmap a block device directly.

- Increased /dev/mem restrictions: Add an option to treat all
io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access
while a driver is actively using an address range. This behavior
is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be
overridden by the existing "iomem=relaxed" kernel command line
option.

- Miscellaneous fixes include a 'pfn'-device huge page alignment fix,
block device shutdown crash fix, and other small libnvdimm fixes"

* tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits)
block: kill disk_{check|set|clear|alloc}_badblocks
libnvdimm, pmem: nvdimm_read_bytes() badblocks support
pmem, dax: disable dax in the presence of bad blocks
pmem: fail io-requests to known bad blocks
libnvdimm: convert to statically allocated badblocks
libnvdimm: don't fail init for full badblocks list
block, badblocks: introduce devm_init_badblocks
block: clarify badblocks lifetime
badblocks: rename badblocks_free to badblocks_exit
libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h
libnvdimm: Add a poison list and export badblocks
nfit_test: Enable DSMs for all test NFITs
md: convert to use the generic badblocks code
block: Add badblock management for gendisks
badblocks: Add core badblock management code
block: fix del_gendisk() vs blkdev_ioctl crash
block: enable dax for raw block devices
block: introduce bdev_file_inode()
restrict /dev/mem to idle io memory ranges
arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
...

Linus Torvalds
2016-01-14 11:15:14 +0800

13 Jan, 2016

5 commits

c17488d06 Merge tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
"Not much new with tracing for this release. Mostly just clean ups and
minor fixes.

Here's what else is new:

- A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for
those that want both.

- New selftest to test the instance create and delete

- Better debug output when ftrace fails"

* tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits)
ftrace: Fix the race between ftrace and insmod
ftrace: Add infrastructure for delayed enabling of module functions
x86: ftrace: Fix the comments for ftrace_modify_code_direct()
tracing: Fix comment to use tracing_on over tracing_enable
metag: ftrace: Fix the comments for ftrace_modify_code
sh: ftrace: Fix the comments for ftrace_modify_code()
ia64: ftrace: Fix the comments for ftrace_modify_code()
ftrace: Clean up ftrace_module_init() code
ftrace: Join functions ftrace_module_init() and ftrace_init_module()
tracing: Introduce TRACE_EVENT_FN_COND macro
tracing: Use seq_buf_used() in seq_buf_to_user() instead of len
bpf: Constify bpf_verifier_ops structure
ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too
ftrace: Remove use of control list and ops
ftrace: Fix output of enabled_functions for showing tramp
ftrace: Fix a typo in comment
ftrace: Show all tramps registered to a record on ftrace_bug()
ftrace: Add variable ftrace_expected for archs to show expected code
ftrace: Add new type to distinguish what kind of ftrace_bug()
tracing: Update cond flag when enabling or disabling a trigger
...

Linus Torvalds
2016-01-13 12:04:15 +0800
aee3bfa33 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:

1) Support busy polling generically, for all NAPI drivers. From Eric
Dumazet.

2) Add byte/packet counter support to nft_ct, from Florian Westphal.

3) Add RSS/XPS support to mvneta driver, from Gregory Clement.

4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
Frederic Sowa.

5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.

6) Add support for VLAN device bridging to mlxsw switch driver, from
Ido Schimmel.

7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.

8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.

9) Reorganize wireless drivers into per-vendor directories just like we
do for ethernet drivers. From Kalle Valo.

10) Provide a way for administrators to "destroy" connected sockets via the
SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.

11) Add support to add/remove multicast routes via netlink, from Nikolay
Aleksandrov.

12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.

13) Add forwarding and packet duplication facilities to nf_tables, from
Pablo Neira Ayuso.

14) Dead route support in MPLS, from Roopa Prabhu.

15) TSO support for thunderx chips, from Sunil Goutham.

16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.

17) Rationalize, consolidate, and more completely document the checksum
offloading facilities in the networking stack. From Tom Herbert.

18) Support aborting an ongoing scan in mac80211/cfg80211, from
Vidyullatha Kanchanapally.

19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
net: bnxt: always return values from _bnxt_get_max_rings
net: bpf: reject invalid shifts
phonet: properly unshare skbs in phonet_rcv()
dwc_eth_qos: Fix dma address for multi-fragment skbs
phy: remove an unneeded condition
mdio: remove an unneed condition
mdio_bus: NULL dereference on allocation error
net: Fix typo in netdev_intersect_features
net: freescale: mac-fec: Fix build error from phy_device API change
net: freescale: ucc_geth: Fix build error from phy_device API change
bonding: Prevent IPv6 link local address on enslaved devices
IB/mlx5: Add flow steering support
net/mlx5_core: Export flow steering API
net/mlx5_core: Make ipv4/ipv6 location more clear
net/mlx5_core: Enable flow steering support for the IB driver
net/mlx5_core: Initialize namespaces only when supported by device
net/mlx5_core: Set priority attributes
net/mlx5_core: Connect flow tables
net/mlx5_core: Introduce modify flow table command
net/mlx5_core: Managing root flow table
...

Linus Torvalds
2016-01-13 10:57:02 +0800
c597b6bcd Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu:
"Algorithms:
- Add RSA padding algorithm

Drivers:
- Add GCM mode support to atmel
- Add atmel support for SAMA5D2 devices
- Add cipher modes to talitos
- Add rockchip driver for rk3288
- Add qat support for C3XXX and C62X"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (103 commits)
crypto: hifn_795x, picoxcell - use ablkcipher_request_cast
crypto: qat - fix SKU definiftion for c3xxx dev
crypto: qat - Fix random config build issue
crypto: ccp - use to_pci_dev and to_platform_device
crypto: qat - Rename dh895xcc mmp firmware
crypto: 842 - remove WARN inside printk
crypto: atmel-aes - add debug facilities to monitor register accesses.
crypto: atmel-aes - add support to GCM mode
crypto: atmel-aes - change the DMA threshold
crypto: atmel-aes - fix the counter overflow in CTR mode
crypto: atmel-aes - fix atmel-ctr-aes driver for RFC 3686
crypto: atmel-aes - create sections to regroup functions by usage
crypto: atmel-aes - fix typo and indentation
crypto: atmel-aes - use SIZE_IN_WORDS() helper macro
crypto: atmel-aes - improve performances of data transfer
crypto: atmel-aes - fix atmel_aes_remove()
crypto: atmel-aes - remove useless AES_FLAGS_DMA flag
crypto: atmel-aes - reduce latency of DMA completion
crypto: atmel-aes - remove unused 'err' member of struct atmel_aes_dev
crypto: atmel-aes - rework crypto request completion
...

Linus Torvalds
2016-01-13 10:51:14 +0800
33caf82ac Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc vfs updates from Al Viro:
"All kinds of stuff. That probably should've been 5 or 6 separate
branches, but by the time I'd realized how large and mixed that bag
had become it had been too close to -final to play with rebasing.

Some fs/namei.c cleanups there, memdup_user_nul() introduction and
switching open-coded instances, burying long-dead code, whack-a-mole
of various kinds, several new helpers for ->llseek(), assorted
cleanups and fixes from various people, etc.

One piece probably deserves special mention - Neil's
lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
called without ->i_mutex and tries to avoid ever taking it. That, of
course, means that it's not useful for any directory modifications,
but things like getting inode attributes in nfsd readdirplus are fine
with that. I really should've asked for moratorium on lookup-related
changes this cycle, but since I hadn't done that early enough... I
*am* asking for that for the coming cycle, though - I'm going to try
and get conversion of i_mutex to rwsem with ->lookup() done under lock
taken shared.

There will be a patch closer to the end of the window, along the lines
of the one Linus had posted last May - mechanical conversion of
->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
inode_is_locked()/inode_lock_nested(). To quote Linus back then:

-----
| This is an automated patch using
|
| sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
| sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
| sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
| sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
| sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
|
| with a very few manual fixups
-----

I'm going to send that once the ->i_mutex-affecting stuff in -next
gets mostly merged (or when Linus says he's about to stop taking
merges)"
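
To see what the quoted sed expressions do to individual source lines, here is a rough Python equivalent. It is only a sketch for experimenting with the patterns: the names `RULES` and `convert_line` are invented for this example, the regexes are hand translations of the sed rules into Python syntax, and the actual conversion was done with sed plus manual fixups as the quote says.

```python
import re

# Hand-translated versions of the quoted sed rules (sed's \( \) groups
# become plain ( ) in Python, and literal parentheses are escaped).
RULES = [
    (r"mutex_lock\(&(.*)->i_mutex\)", r"inode_lock(\1)"),
    (r"mutex_unlock\(&(.*)->i_mutex\)", r"inode_unlock(\1)"),
    (r"mutex_lock_nested\(&(.*)->i_mutex,[ ]*I_MUTEX_([A-Z0-9_]*)\)",
     r"inode_lock_nested(\1, I_MUTEX_\2)"),
    (r"mutex_is_locked\(&(.*)->i_mutex\)", r"inode_is_locked(\1)"),
    (r"mutex_trylock\(&(.*)->i_mutex\)", r"inode_trylock(\1)"),
]


def convert_line(line):
    """Apply each rule once per line, like sed's s/// without the g flag."""
    for pattern, replacement in RULES:
        line = re.sub(pattern, replacement, line, count=1)
    return line


print(convert_line("mutex_lock(&inode->i_mutex);"))
print(convert_line("mutex_lock_nested(&dir->i_mutex, I_MUTEX_PARENT);"))
```

Because `.*` is greedy, an expression such as `d_inode(dentry)` is captured whole, which is one reason a purely mechanical pass still needs the "very few manual fixups" mentioned in the quote for unusual lines.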

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
nfsd: don't hold i_mutex over userspace upcalls
fs:affs:Replace time_t with time64_t
fs/9p: use fscache mutex rather than spinlock
proc: add a reschedule point in proc_readfd_common()
logfs: constify logfs_block_ops structures
fcntl: allow to set O_DIRECT flag on pipe
fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
fs: xattr: Use kvfree()
[s390] page_to_phys() always returns a multiple of PAGE_SIZE
nbd: use ->compat_ioctl()
fs: use block_device name vsprintf helper
lib/vsprintf: add %*pg format specifier
fs: use gendisk->disk_name where possible
poll: plug an unused argument to do_poll
amdkfd: don't open-code memdup_user()
cdrom: don't open-code memdup_user()
rsxx: don't open-code memdup_user()
mtip32xx: don't open-code memdup_user()
[um] mconsole: don't open-code memdup_user_nul()
[um] hostaudio: don't open-code memdup_user()
...

Linus Torvalds
2016-01-13 09:11:47 +0800
ca9706a28 Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull iov_iter infrastructure updates from Al Viro:
"A couple of iov_iter updates"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
iov_iter: export import_single_range()
iov_iter: constify {csum_and_,}copy_to_iter()

Linus Torvalds
2016-01-13 08:49:58 +0800

12 Jan, 2016

4 commits

fb591fbd0 Merge tag 'mmc-v4.5' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC updates from Ulf Hansson:
"MMC core:
- Optimize boot time by detecting cards simultaneously
- Make runtime resume default behavior for MMC/SD
- Enable MMC/SD/SDIO devices to suspend/resume asynchronously
- Allow more than 8 partitions per card
- Introduce MMC_CAP2_NO_SDIO to prevent unsupported SDIO commands
- Support the standard DT wakeup-source property
- Fix driver strength switching for HS200 and HS400
- Fix switch command timeout
- Fix invalid vdd in voltage switch power cycle for SDIO

MMC host:
- sdhci: Restore behavior when setting VDD via external regulator
- sdhci: A couple of changes/fixes related to the dma support
- sdhci-tegra: Add Tegra210 support
- sdhci-tegra: Support for UHS-I cards including tuning support
- sdhci-of-at91: Add PM support
- sh_mmcif: Rework dma channel handling
- mvsdio: Delete platform data code path"

* tag 'mmc-v4.5' of git://git.linaro.org/people/ulf.hansson/mmc: (52 commits)
mmc: dw_mmc: remove the unused quirks
mmc: sdhci-pci: use to_pci_dev()
mmc: cb710: use to_platform_device()
mmc: tegra: use correct accessor for misc ctrl register
mmc: tegra: enable UHS-I modes
mmc: tegra: implement UHS tuning
mmc: tegra: disable SPI_MODE_CLKEN
mmc: tegra: implement module external clock change
mmc: sdhci: restore behavior when setting VDD via external regulator
mmc: It is not an error for the card to be removed while suspended
mmc: block: Allow more than 8 partitions per card
mmc: core: Optimize boot time by detecting cards simultaneously
mmc: dw_mmc: use resource_size_t to store physical address
mmc: core: fix __mmc_switch timeout caused by preempt
mmc: usdhi6rol0: handle NULL data in timeout
mmc: of_mmc_spi: Add IRQF_ONESHOT to interrupt flags
mmc: mediatek: change some dev_err to dev_dbg
mmc: enable MMC/SD/SDIO device to suspend/resume asynchronously
mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
mmc: sdhci: 64-bit DMA actually has 4-byte alignment
...

Linus Torvalds
2016-01-12 11:39:09 +0800
0f8c79010 Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue update from Tejun Heo:
"Workqueue changes for v4.5. One cleanup patch and three to improve
the debuggability.

Workqueue now has a stall detector which dumps workqueue state if any
worker pool hasn't made forward progress over a certain amount of time
(30s by default) and also triggers a warning if a workqueue which can
be used in memory reclaim path tries to wait on something which can't
be.

These should make workqueue hangs a lot easier to debug."

* 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: simplify the apply_workqueue_attrs_locked()
workqueue: implement lockup detector
watchdog: introduce touch_softlockup_watchdog_sched()
workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue

Linus Torvalds
2016-01-12 10:53:13 +0800
5cb52b5e1 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
"Kernel side changes:

- Intel Knights Landing support. (Harish Chegondi)

- Intel Broadwell-EP uncore PMU support. (Kan Liang)

- Core code improvements. (Peter Zijlstra.)

- Event filter, LBR and PEBS fixes. (Stephane Eranian)

- Enable cycles:pp on Intel Atom. (Stephane Eranian)

- Add cycles:ppp support for Skylake. (Andi Kleen)

- Various x86 NMI overhead optimizations. (Andi Kleen)

- Intel PT enhancements. (Takao Indoh)

- AMD cache events fix. (Vince Weaver)

Tons of tooling changes:

- Show random perf tool tips in the 'perf report' bottom line
(Namhyung Kim)

- perf report now defaults to --group if the perf.data file has
grouped events, try it with:

# perf record -e '{cycles,instructions}' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
# perf report
# Samples: 1K of event 'anon group { cycles, instructions }'
# Event count (approx.): 1955219195
#
# Overhead Command Shared Object Symbol

2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
1.05% 0.33% firefox libxul.so [.] js::SetObjectElement
1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno
0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab
0.65% 0.86% firefox libxul.so [.] js::ValueToId
0.64% 0.23% JS Helper libxul.so [.] js::SplayTree::splay
0.62% 1.27% firefox libxul.so [.] js::GetIterator
0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining

- Introduce the 'perf stat record/report' workflow:

Generate perf.data files from 'perf stat', to tap into the
scripting capabilities perf has instead of defining a 'perf stat'
specific scripting support to calculate event ratios, etc.

Simple example:

$ perf stat record -e cycles usleep 1

Performance counter stats for 'usleep 1':

1,134,996 cycles

0.000670644 seconds time elapsed

$ perf stat report

Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':

1,134,996 cycles

0.000670644 seconds time elapsed

$

It generates PERF_RECORD_ userspace records to store the details:

$ perf report -D | grep PERF_RECORD
0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x12a [0x40]: PERF_RECORD_STAT_CONFIG
0x16a [0x30]: PERF_RECORD_STAT
-1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x1da [0x18]: PERF_RECORD_STAT_ROUND
[acme@ssdandy linux]$

An effort was made to make perf.data files generated like this to
not generate cryptic messages when processed by older tools.

The 'perf script' bits need rebasing, will go up later.

- Make command line options always available, even when they depend
on some feature being enabled, warning the user about use of such
options (Wang Nan)

- Support hw breakpoint events (mem:0xAddress) in the default output
mode in 'perf script' (Wang Nan)

- Fixes and improvements for supporting annotating ARM binaries,
support ARM call and jump instructions, more work needed to have
arch specific stuff separated into tools/perf/arch/*/annotate/
(Russell King)

- Add initial 'perf config' command, for now just with a --list
command to list the contents of the configuration file in use and a
basic man page describing its format, commands for doing edits and
detailed documentation are being reviewed and proof-read. (Taeung
Song)

- Allow BPF scriptlets to specify arguments to be fetched using DWARF
info, using a prologue generated at compile/build time (He Kuang,
Wang Nan)

- Allow attaching BPF scriptlets to module symbols (Wang Nan)

- Allow attaching BPF scriptlets to userspace code using uprobe (Wang
Nan)

- BPF programs now can specify 'perf probe' tunables via its section
name, separating key=val values using semicolons (Wang Nan)

Testing some of these new BPF features:

Use case: get callchains when receiving SSL packets, filter them in the
kernel, at arbitrary place.

# cat ssl.bpf.c
#define SEC(NAME) __attribute__((section(NAME), used))

struct pt_regs;

SEC("func=__inet_lookup_established hnum")
int func(struct pt_regs *ctx, int err, unsigned short port)
{
return err == 0 && port == 443;
}

char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf record -a -g -e ssl.bpf.c
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
# perf script | head -30
swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
1163ffa start_kernel ([kernel.vmlinux].init.text)
11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)

qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
430a br_handle_frame_finish ([bridge])
48bc br_handle_frame ([bridge])
855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
#

- Use 'perf probe' various options to list functions, see what
variables can be collected at any given point, experiment first
collecting without a filter, then filter, use it together with
'perf trace', 'perf top', with or without callchains, if it
explodes, please tell us!

- Introduce a new callchain mode: "folded", that will list per line
representations of all callchains for a given histogram entry,
facilitating 'perf report' output processing by other tools, such
as Brendan Gregg's flamegraph tools (Namhyung Kim)

E.g:

# perf report | grep -v ^# | head
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
|
---cpu_startup_entry
|
|--12.07%--start_secondary
|
--6.30%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
#

Becomes, in "folded" mode:

# perf report -g folded | grep -v ^# | head -5
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
#

The user can also select one of "count", "period" or "percent" as
the first column.

... and lots of infrastructure enhancements, plus fixes and other
changes, features I failed to list - see the shortlog and the git log
for details"
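
As an illustration of why the "folded" callchain mode is convenient for other tools, the sketch below parses folded lines like the ones in the example output into (percent, frames) pairs. It is an assumption-laden toy, not part of perf: the function name `parse_folded` is invented here, and it assumes that callchain lines consist of exactly two whitespace-separated fields and that symbol names contain no spaces.

```python
def parse_folded(lines):
    """Extract (percent, [frames]) from 'perf report -g folded' style lines.

    Histogram-entry lines (which carry command, DSO and symbol columns)
    have more than two fields and are skipped.
    """
    chains = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or not fields[0].endswith("%"):
            continue
        percent = float(fields[0].rstrip("%"))
        chains.append((percent, fields[1].split(";")))
    return chains


sample = [
    "18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry",
    "12.07% cpu_startup_entry;start_secondary",
    "6.30% cpu_startup_entry;rest_init;start_kernel;"
    "x86_64_start_reservations;x86_64_start_kernel",
]

for percent, frames in parse_folded(sample):
    print(percent, " -> ".join(frames))
```

One line per callchain means a consumer only needs a split on whitespace and on `;`, rather than a parser for the tree-shaped default output.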

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (271 commits)
perf evlist: Add --trace-fields option to show trace fields
perf record: Store data mmaps for dwarf unwind
perf libdw: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Use find_map function in access_dso_mem
perf evlist: Remove perf_evlist__(enable|disable)_event functions
perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
perf report: Show random usage tip on the help line
perf hists: Export a couple of hist functions
perf diff: Use perf_hpp__register_sort_field interface
perf tools: Add overhead/overhead_children keys defaults via string
perf tools: Remove list entry from struct sort_entry
perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
perf script: Align event name properly
perf tools: Add missing headers in perf's MANIFEST
perf tools: Do not show trace command if it's not compiled in
perf report: Change default to use event group view
perf top: Decay periods in callchains
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
tools lib: Sync tools/lib/find_bit.c with the kernel
...

Linus Torvalds
2016-01-12 06:39:17 +0800
24af98c4c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
"So we have a laundry list of locking subsystem changes:

- continuing barrier API and code improvements

- futex enhancements

- atomics API improvements

- pvqspinlock enhancements: in particular lock stealing and adaptive
spinning

- qspinlock micro-enhancements"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
futex: Cleanup the goto confusion in requeue_pi()
futex: Remove pointless put_pi_state calls in requeue()
futex: Document pi_state refcounting in requeue code
futex: Rename free_pi_state() to put_pi_state()
futex: Drop refcount if requeue_pi() acquired the rtmutex
locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation
lcoking/barriers, arch: Use smp barriers in smp_store_release()
locking/cmpxchg, arch: Remove tas() definitions
locking/pvqspinlock: Queue node adaptive spinning
locking/pvqspinlock: Allow limited lock stealing
locking/pvqspinlock: Collect slowpath lock statistics
sched/core, locking: Document Program-Order guarantees
locking, sched: Introduce smp_cond_acquire() and use it
locking/pvqspinlock, x86: Optimize the PV unlock code path
locking/qspinlock: Avoid redundant read of next pointer
locking/qspinlock: Prefetch the next node cacheline
locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg()
atomics: Add test for atomic operations with _relaxed variants

Linus Torvalds
2016-01-12 06:18:38 +0800

09 Jan, 2016

3 commits

90a545e98 restrict /dev/mem to idle io memory ranges

This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
semantics by default. If userspace really believes it is safe to access
the memory region it can also perform the extra step of disabling an
active driver. This protects device address ranges with read side
effects and otherwise directs userspace to use the driver.

Persistent memory presents a large "mistake surface" to /dev/mem as now
accidental writes can corrupt a filesystem.

In general if a device driver is busily using a memory region it already
informs other parts of the kernel to not touch it via
request_mem_region(). /dev/mem should honor the same safety restriction
by default. Debugging a device driver from userspace becomes more
difficult with this enabled. Any application using /dev/mem or mmap of
sysfs pci resources will now need to perform the extra step of either:

1/ Disabling the driver, for example:

echo > /dev/bus//drivers//unbind

2/ Rebooting with "iomem=relaxed" on the command line

3/ Recompiling with CONFIG_IO_STRICT_DEVMEM=n

Traditional users of /dev/mem like dosemu are unaffected because the
first 1MB of memory is not subject to the IO_STRICT_DEVMEM restriction.
Legacy X configurations use /dev/mem to talk to graphics hardware, but
that functionality has since moved to kernel graphics drivers.

Cc: Arnd Bergmann
Cc: Russell King
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Acked-by: Kees Cook
Acked-by: Ingo Molnar
Signed-off-by: Dan Williams

Dan Williams
2016-01-09 22:30:49 +0800
21266be9e arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug

Let all the archs that implement devmem_is_allowed() opt-in to a common
definition of CONFIG_STRICT_DEVMEM in lib/Kconfig.debug.

Cc: Kees Cook
Cc: Russell King
Cc: Will Deacon
Cc: Benjamin Herrenschmidt
Cc: Martin Schwidefsky
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "David S. Miller"
Acked-by: Catalin Marinas
Acked-by: Heiko Carstens
[heiko: drop 'default y' for s390]
Acked-by: Ingo Molnar
Suggested-by: Arnd Bergmann
Signed-off-by: Dan Williams

Dan Williams
2016-01-09 22:30:49 +0800
6108209c4 Merge branch 'for-linus' into work.misc

Al Viro
2016-01-09 10:20:11 +0800

07 Jan, 2016

1 commit

1031bc589 lib/vsprintf: add %*pg format specifier

This allows printing a block_device name directly.
Currently one has to use bdevname() with a temporary char buffer.
That is inefficient because it bloats stack usage in deep IO call traces.

Example:
%pg -> sda, sda1 or loop0p1

[AV: fixed a minor braino - position updates should not be dependent
upon having reached the end of the buffer]

Signed-off-by: Dmitry Monakhov
Signed-off-by: Al Viro

Dmitry Monakhov
2016-01-07 01:55:29 +0800

06 Jan, 2016

1 commit

3104fb3dd Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull RCU changes from Paul E. McKenney:

- Adding transitivity uniformly to rcu_node structure ->lock
acquisitions. (This is implemented by the first two commits
on top of v4.4-rc2 due to the pervasive nature of this change.)

- Documentation updates, including RCU requirements.

- Expedited grace-period changes.

- Miscellaneous fixes.

- Linked-list fixes, courtesy of KTSAN.

- Torture-test updates.

- Late-breaking fix to sysrq-generated crash.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2016-01-06 18:41:48 +0800

04 Jan, 2016

1 commit

16e5c1fc3 convert a bunch of open-coded instances of memdup_user_nul()

A _lot_ of ->write() instances were open-coding it; some are
converted to memdup_user_nul(), a lot more remain...

Signed-off-by: Al Viro

Al Viro
2016-01-04 23:26:58 +0800

01 Jan, 2016

1 commit

c07f30ad6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

David S. Miller
2016-01-01 07:20:10 +0800

24 Dec, 2015

1 commit

ff078d8fc tracing: Use seq_buf_used() in seq_buf_to_user() instead of len

commit 5ac48378414d ("tracing: Use trace_seq_used() and seq_buf_used()
instead of len") changed the tracing code to use trace_seq_used() and
seq_buf_used() instead of using the seq_buf len directly to avoid
overflow issues, but missed a spot in seq_buf_to_user() that makes use
of s->len.

Cleaned up the code a bit as well per suggestion of Steve Rostedt.

Link: http://lkml.kernel.org/r/1447703848-2951-1-git-send-email-jsnitsel@redhat.com

Signed-off-by: Jerry Snitselaar
Signed-off-by: Steven Rostedt

Jerry Snitselaar
2015-12-24 03:27:20 +0800
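
The hazard fixed by the seq_buf commit above can be illustrated with a toy model. This is a loose sketch, not the kernel structure: the `SeqBuf` class and its methods are invented for illustration, and it assumes that an overflowing write marks the buffer by setting `len` to `size + 1` and that the "used" helper clamps `len` to `size`.

```python
class SeqBuf:
    """Toy stand-in for a fixed-size sequence buffer (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.len = 0
        self.data = bytearray(size)

    def write(self, payload):
        """Append payload if it fits, otherwise mark the buffer as overflowed."""
        if self.len + len(payload) <= self.size:
            self.data[self.len:self.len + len(payload)] = payload
            self.len += len(payload)
        else:
            self.len = self.size + 1  # assumed overflow marker

    def used(self):
        """Number of bytes that may safely be read: never more than size."""
        return min(self.len, self.size)


buf = SeqBuf(8)
buf.write(b"abcd")
buf.write(b"too long!")   # does not fit, so the buffer is marked overflowed
print(buf.len, buf.used())
```

In this model a reader that trusts `len` after an overflow would ask for one byte more than the buffer holds, whereas `used()` stays within bounds.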

23 Dec, 2015

1 commit

5ca636b98 crypto: 842 - remove WARN inside printk

Remove the WARN() from the beN_to_cpu macro, which is used as a param to a
pr_debug() call. With a certain kernel config, this printk-in-printk
results in the no_printk() macro trying to recursively call the
no_printk() macro, and since macros can't recursively call themselves
a build error results.

Reported-by: Randy Dunlap
Signed-off-by: Dan Streetman
Signed-off-by: Herbert Xu

Dan Streetman
2015-12-23 18:20:01 +0800

22 Dec, 2015

1 commit

28ff4fda9 mmc: kconfig: replace FAULT_INJECTION with FAULT_INJECTION_DEBUG_FS ... Browse Code »

Fault-injection capability for MMC IO uses debugfs entries to configure
the attributes.
FAULT_INJECTION_DEBUG_FS must be enabled to use FAIL_MMC_REQUEST.

Replace FAULT_INJECTION with FAULT_INJECTION_DEBUG_FS.
Also remove 'select DEBUG_FS' since FAULT_INJECTION_DEBUG_FS depends on
it.

Signed-off-by: Adrien Schildknecht
Signed-off-by: Ulf Hansson

Adrien Schildknecht
2015-12-22 18:32:06 +0800

19 Dec, 2015

2 commits

179ccc0a7 rhashtable: Kill harmless RCU warning in rhashtable_walk_init ... Browse Code »

The commit c6ff5268293ef98e48a99597e765ffc417e39fa5 ("rhashtable:
Fix walker list corruption") causes a suspicious RCU usage warning
because we no longer hold ht->mutex when we dereference ht->tbl.

However, this is a false positive because we now hold ht->lock
which also guarantees that ht->tbl won't disappear from under us.

This patch kills the warning by using rcu_dereference_protected.

Reported-by: kernel test robot
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2015-12-19 12:44:18 +0800
9dd2af834 bpf, test: add couple of test cases ... Browse Code »

Add a couple of test cases for the interpreter and also the JITs, e.g. to
test that when imm32 moves are done, the upper 32 bits of the regs are
zero-extended.

Without JIT:

[...]
[ 1114.129301] test_bpf: #43 MOV REG64 jited:0 128 PASS
[ 1114.130626] test_bpf: #44 MOV REG32 jited:0 139 PASS
[ 1114.132055] test_bpf: #45 LD IMM64 jited:0 124 PASS
[...]

With JIT (generated code can as usual be nicely verified with the help of
bpf_jit_disasm tool):

[...]
[ 1062.726782] test_bpf: #43 MOV REG64 jited:1 6 PASS
[ 1062.726890] test_bpf: #44 MOV REG32 jited:1 6 PASS
[ 1062.726993] test_bpf: #45 LD IMM64 jited:1 6 PASS
[...]
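
The property these tests check can be modelled in plain C. This is a sketch under assumptions: mov32_imm_sketch() is a hypothetical model of a 32-bit immediate move into a 64-bit register, not code taken from test_bpf.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a 32-bit immediate move into a 64-bit register: the low
 * 32 bits take the immediate and the upper 32 bits are cleared,
 * whatever the register held before.
 */
static uint64_t mov32_imm_sketch(uint64_t old_reg, int32_t imm)
{
    uint64_t result;

    (void)old_reg;                      /* previous contents are discarded */
    result = (uint64_t)(uint32_t)imm;   /* zero-extend, do not sign-extend */
    assert((result >> 32) == 0);        /* upper half must be zero */
    return result;
}
```

A JIT that left stale upper bits in place, or sign-extended a negative immediate, would fail a comparison against this model.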

Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Daniel Borkmann
2015-12-19 05:04:51 +0800

18 Dec, 2015

3 commits

b3e0d3d7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
drivers/net/geneve.c

Here we had an overlapping change, where in 'net' the extraneous stats
bump was being removed whilst in 'net-next' the final argument to
udp_tunnel6_xmit_skb() was being changed.

Signed-off-by: David S. Miller

David S. Miller
2015-12-18 11:08:28 +0800
73796d8bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
people reported this... From Arnd Bergmann.

2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.

3) Fix spurious EBUSY in rhashtable, from Herbert Xu.

4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.

5) Fix race with work structure access in pppoe driver causing
corruptions, from Guillaume Nault.

6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
actually succeeded or not, from Sergei Shtylyov.

7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from
Bjørn Mork.

8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.

9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
Leitner.

10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.

11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
properly as well, from Jiri Benc.

12) Handle request sockets properly in xfrm layer, from Eric Dumazet.

13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
Shelar.

14) sk->sk_policy[] needs RCU protection, and as a result
xfrm_policy_destroy() needs to free policies using an RCU grace
period, from Eric Dumazet.

15) SCTP needs to clone ipv6 tx options in order to avoid use after
free, from Eric Dumazet.

16) Missing kbuild export of ila.h, from Stephen Hemminger.

17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
Tobias Klauser.

18) Validate protocol value range in ->create() methods, from Hannes
Frederic Sowa.

19) Fix early socket demux races that result in illegal dst reuse, from
Eric Dumazet.

20) Validate socket address length in pptp code, from WANG Cong.

21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
packets, from Vlad Yasevich.

22) Fix memory leaks in nl80211 registry code, from Ola Olsson.

23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and
qlcnic. From Dan Carpenter.

24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
example, AF_ALG will interpret it as an async call. From Tadeusz
Struk.

25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
Eric Dumazet.

26) rhashtable does not enforce the minimum table size early enough,
breaking how we calculate the per-cpu lock allocations. From
Herbert Xu.

27) Fix FCC port lockup in 82xx driver, from Martin Roth.

28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.

29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
sock_setsockopt() wrt. timestamp handling. From WANG Cong.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
net: check both type and procotol for tcp sockets
drivers: net: xgene: fix Tx flow control
tcp: restore fastopen with no data in SYN packet
af_unix: Revert 'lock_interruptible' in stream receive code
fou: clean up socket with kfree_rcu
82xx: FCC: Fixing a bug causing to FCC port lock-up
gianfar: Don't enable RX Filer if not supported
net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
rhashtable: Fix walker list corruption
rhashtable: Enforce minimum size on initial hash table
inet: tcp: fix inetpeer_set_addr_v4()
ipv6: automatically enable stable privacy mode if stable_secret set
net: fix uninitialized variable issue
bluetooth: Validate socket address length in sco_sock_bind().
net_sched: make qdisc_tree_decrease_qlen() work for non mq
ser_gigaset: remove unnecessary kfree() calls from release method
ser_gigaset: fix deallocation of platform device structure
ser_gigaset: turn nonsense checks into WARN_ON
ser_gigaset: fix up NULL checks
qlcnic: fix a timeout loop
...

Linus Torvalds
2015-12-18 06:05:22 +0800
d7637d01b Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm ... Browse Code »

Pull libnvdimm fixes from Dan Williams:

- Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
These are tagged for -stable. The scatterlist impact is potentially
corrupted dma addresses on HIGHMEM enabled platforms.

- A minor locking fix for the NFIT hot-add implementation that is new
in 4.4-rc. This would only trigger in the case a hot-add raced
driver removal.

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dma-debug: Fix dma_debug_entry offset calculation
Revert "scatterlist: use sg_phys()"
nfit: acpi_nfit_notify(): Do not leave device locked

Linus Torvalds
2015-12-18 03:20:13 +0800

17 Dec, 2015

2 commits

0354aec19 dma-debug: Fix dma_debug_entry offset calculation ... Browse Code »

dma-debug uses struct dma_debug_entry to keep track of dma coherent
memory allocation requests. The virtual address is converted into a pfn
and an offset. Previously, the offset was calculated using an incorrect
bit mask. As a result, we saw incorrect error messages from dma-debug
like the following:

"DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000"

Cacheline 0x03e00000 does not exist on our platform.
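
The difference between the two masks can be sketched as follows. The 4 KiB page size, the SKETCH_* macros and both helper names are assumptions made for illustration; the message above only states that an incorrect bit mask was used.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                                  /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))           /* keeps page-number bits */

/* Wrong: masking with the page mask keeps the page base, not the offset. */
static uintptr_t offset_wrong_sketch(uintptr_t addr)
{
    return addr & SKETCH_PAGE_MASK;
}

/* Right: the offset within the page is the low bits, i.e. the inverted mask. */
static uintptr_t offset_in_page_sketch(uintptr_t addr)
{
    uintptr_t off = addr & ~SKETCH_PAGE_MASK;

    assert(off < SKETCH_PAGE_SIZE);     /* an in-page offset is always below the page size */
    return off;
}
```

With the wrong mask, the "offset" is a page-aligned address, so any value derived from it, such as a cacheline number, can land far outside the range that really exists on the platform.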

Cc:
Fixes: 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()")
Signed-off-by: Daniel Mentz
Signed-off-by: Dan Williams

Daniel Mentz
2015-12-17 03:24:26 +0800
c6ff52682 rhashtable: Fix walker list corruption ... Browse Code »

The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable:
Fix sleeping inside RCU critical section in walk_stop") introduced
a new spinlock for the walker list. However, it did not convert
all existing users of the list over to the new spin lock. Some
continued to use the old mutex for this purpose. This obviously
led to corruption of the list.

The fix is to use the spin lock everywhere where we touch the list.

This also allows us to do rcu_read_lock before we take the lock in
rhashtable_walk_start. With the old mutex this would've deadlocked
but it's safe with the new spin lock.

Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...")
Reported-by: Colin Ian King
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2015-12-17 00:13:14 +0800

16 Dec, 2015

1 commit

3a324606b rhashtable: Enforce minimum size on initial hash table ... Browse Code »

William Hua wrote:
>
> I wasn't aware there was an enforced minimum size. I simply set the
> nelem_hint in the rhashtable_params struct to 1, expecting it to grow as
> needed. This caused a segfault afterwards when trying to insert an
> element.

OK we're doing the size computation before we enforce the limit
on min_size.

---8<---
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2015-12-16 23:44:08 +0800

09 Dec, 2015

2 commits

46c749eac rhashtable: Remove unnecessary wmb for future_tbl ... Browse Code »

The patch 9497df88ab5567daa001829051c5f87161a81ff0 ("rhashtable:
Fix reader/rehash race") added a pair of barriers. In fact the
wmb is superfluous because every subsequent write to the old or
new hash table uses rcu_assign_pointer, which itself carries a
full barrier prior to the assignment.

Therefore we may remove the explicit wmb.

Signed-off-by: Herbert Xu
Acked-by: Thomas Graf
Signed-off-by: David S. Miller

Herbert Xu
2015-12-09 11:46:32 +0800
82607adcf workqueue: implement lockup detector ... Browse Code »

Workqueue stalls can happen from a variety of usage bugs such as
missing WQ_MEM_RECLAIM flag or concurrency managed work item
indefinitely staying RUNNING. These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.

To alleviate the situation, this patch implements a workqueue lockup
detector. It periodically monitors all worker_pools and, if any pool
has failed to make forward progress for longer than the threshold
duration, triggers a warning and dumps the workqueue state as follows.

BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
workqueue events_power_efficient: flags=0x80
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
pending: check_lifetime, neigh_periodic_work
workqueue cgroup_pidlist_destroy: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
pending: cgroup_pidlist_destroy_work_fn
...

The detection mechanism is controlled through kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.
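
The core check can be sketched as a comparison between the time of a pool's last forward progress and a threshold. Everything here is an assumption made for illustration: the struct, its field names and pool_is_stalled_sketch() are hypothetical and do not reproduce the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pool bookkeeping, not the kernel's worker_pool. */
struct pool_sketch {
    unsigned long last_progress;  /* timestamp of the last forward progress */
    int           has_pending;    /* non-zero if work items are queued */
};

/*
 * A pool counts as stalled when it has pending work and has made no
 * progress for longer than `thresh` ticks. Unsigned subtraction keeps
 * the comparison correct across timestamp wraparound.
 */
static int pool_is_stalled_sketch(const struct pool_sketch *p,
                                  unsigned long now, unsigned long thresh)
{
    assert(p != NULL);

    if (!p->has_pending)          /* an idle pool is not a lockup */
        return 0;
    return now - p->last_progress > thresh;
}
```

A periodic timer would run this check over every pool and, on a positive result, print the warning and dump the workqueue state.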

v2: Decoupled from softlockup control knobs.

Signed-off-by: Tejun Heo
Acked-by: Don Zickus
Cc: Ulrich Obergfell
Cc: Michal Hocko
Cc: Chris Mason
Cc: Andrew Morton

Tejun Heo
2015-12-09 00:29:47 +0800

07 Dec, 2015

2 commits

e12675853 iov_iter: export import_single_range() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2015-12-07 09:42:19 +0800
36f7a8a4c iov_iter: constify {csum_and_,}copy_to_iter() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2015-12-07 09:42:15 +0800

06 Dec, 2015

2 commits

153a4334c x86/headers: Don't include asm/processor.h in asm/atomic.h ... Browse Code »

asm/atomic.h doesn't really need asm/processor.h anymore. Everything
it uses has moved to other header files. So remove that include.

processor.h is a nasty header that includes lots of
other headers and makes it prone to include loops. Removing the
include here makes asm/atomic.h a "leaf" header that can
be safely included in most other headers.

The only fallout is in the lib/atomic tester which relied on
this implicit include. Give it an explicit include.
(the include is in ifdef because the user is also in ifdef)

Signed-off-by: Andi Kleen
Signed-off-by: Peter Zijlstra (Intel)
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/1449018060-1742-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar

Andi Kleen
2015-12-06 19:56:03 +0800
a90099d9f Revert "rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation" ... Browse Code »

This reverts commit d3716f18a7d841565c930efde30737a3557eee69.

vmalloc cannot be used in BH disabled contexts, even
with GFP_ATOMIC. And we certainly want to support
rhashtable users inserting entries with software
interrupts disabled.

Signed-off-by: David S. Miller

David S. Miller
2015-12-06 11:47:11 +0800

05 Dec, 2015

2 commits

d3716f18a rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation ... Browse Code »

When an rhashtable user pounds rhashtable hard with back-to-back
insertions we may end up growing the table in GFP_ATOMIC context.
Unfortunately when the table reaches a certain size this often
fails because we don't have enough physically contiguous pages
to hold the new table.

Eric Dumazet suggested (and in fact wrote this patch) using
__vmalloc instead which can be used in GFP_ATOMIC context.

Reported-by: Phil Sutter
Suggested-by: Eric Dumazet
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2015-12-05 05:53:05 +0800
3cf92222a rhashtable: Prevent spurious EBUSY errors on insertion ... Browse Code »

Thomas and Phil observed that under stress rhashtable insertion
sometimes failed with EBUSY, even though this error should only
ever be seen when we're under attack and our hash chain length
has grown to an unacceptable level, even after a rehash.

It turns out that the logic for detecting whether there is an
existing rehash is faulty. In particular, when two threads both
try to grow the same table at the same time, one of them may see
the newly grown table and thus erroneously conclude that it had
been rehashed. This is what leads to the EBUSY error.

This patch fixes this by remembering the current last table we
used during insertion so that rhashtable_insert_rehash can detect
when another thread has also done a resize/rehash. When this is
detected we will give up our resize/rehash and simply retry the
insertion with the new table.

Reported-by: Thomas Graf
Reported-by: Phil Sutter
Signed-off-by: Herbert Xu
Tested-by: Phil Sutter
Signed-off-by: David S. Miller

Herbert Xu
2015-12-05 03:38:26 +0800

04 Dec, 2015

1 commit

c39d0454e net: Add support for CHANGEUPPER notifier error injection ... Browse Code »

Since CHANGEUPPER can now fail, add support for it in the newly
introduced netdev notifier error injection infrastructure.

Signed-off-by: Ido Schimmel
Signed-off-by: Jiri Pirko
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Ido Schimmel
2015-12-04 00:49:23 +0800

02 Dec, 2015

1 commit

02fff96a7 net: add support for netdev notifier error injection ... Browse Code »

This module allows inserting errors in some of netdevice's notifier
events. All network drivers use these notifiers to signal various events
and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
afterwards. Until recently I had to run failure tests by injecting
a custom module, but now this infrastructure makes it trivial to test
these failure paths. Some of the recent bugs I fixed were found using
this module.
Here's an example:
$ cd /sys/kernel/debug/notifier-error-inject/netdev
$ echo -22 > actions/NETDEV_CHANGEMTU/error
$ ip link set eth0 mtu 1024
RTNETLINK answers: Invalid argument

CC: Akinobu Mita
CC: "David S. Miller"
CC: netdev
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Nikolay Aleksandrov
2015-12-02 04:31:57 +0800

24 Nov, 2015

1 commit

1c97be677 list: Use WRITE_ONCE() when adding to lists and hlists ... Browse Code »

Code that does lockless emptiness testing of non-RCU lists is relying
on the list-addition code to write the list head's ->next pointer
atomically. This commit therefore adds WRITE_ONCE() to list-addition
pointer stores that could affect the head's ->next pointer.
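
A userspace sketch of the idea, assuming a simplified doubly linked list. write_once_ptr() is a hypothetical stand-in for WRITE_ONCE() built on a volatile store; it keeps the compiler from tearing or duplicating the store but is not a memory barrier.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Stand-in for WRITE_ONCE(): a single volatile pointer store. */
static void write_once_ptr(struct list_head **slot, struct list_head *val)
{
    *(struct list_head * volatile *)slot = val;
}

/* Insert `entry` right after `head`; the store to head->next goes last. */
static void list_add_sketch(struct list_head *entry, struct list_head *head)
{
    struct list_head *next;

    assert(entry != NULL && head != NULL);

    next = head->next;
    next->prev = entry;
    entry->next = next;
    entry->prev = head;
    write_once_ptr(&head->next, entry);   /* the store lockless readers observe */
}

/* Lockless emptiness test: a single volatile read of head->next. */
static int list_empty_lockless_sketch(const struct list_head *head)
{
    return *(struct list_head * const volatile *)&head->next == head;
}
```

The emptiness test reads only head->next, which is why that one store is the one that needs to be performed as a single access.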

Reported-by: Dmitry Vyukov
Signed-off-by: Paul E. McKenney

Paul E. McKenney
2015-11-24 02:37:35 +0800