Eric Lee / smarc-fsl-linux-kernel

23 Sep, 2016

1 commit

d6989d4bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

David S. Miller
2016-09-23 18:46:57 +0800

21 Sep, 2016

1 commit

a4f1f9ac8 lib/win_minmax: windowed min or max estimator

This commit introduces a generic library to estimate either the min or
max value of a time-varying variable over a recent time window. This
is code originally from Kathleen Nichols. The current form of the code
is from Van Jacobson.

A single struct minmax_sample will track the estimated windowed-max
value of the series if you call minmax_running_max() or the estimated
windowed-min value of the series if you call minmax_running_min().

Nearly equivalent code is already in place for minimum RTT estimation
in the TCP stack. This commit extracts that code and generalizes it to
handle both min and max. Moving the code here reduces the footprint
and complexity of the TCP code base and makes the filter generally
available for other parts of the codebase, including an upcoming TCP
congestion control module.

This library works well for time series where the measurements are
smoothly increasing or decreasing.
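
The three-sample idea behind such a filter can be sketched in userspace C as follows. This is a minimal illustration under stated assumptions, not the code added by this commit: the `sketch_` names and the struct layout are hypothetical, and only the windowed-max direction is shown. The tracker keeps the best, second-best and third-best samples so that a replacement is ready when the best one ages out of the window.

```c
#include <stdint.h>

/* One (time, value) measurement. */
struct sketch_sample {
	uint32_t t;
	uint32_t v;
};

/* s[0] is the best (max) sample, s[1] the 2nd best, s[2] the 3rd best. */
struct sketch_minmax {
	struct sketch_sample s[3];
};

/* Forget history: all three slots hold the new measurement. */
static uint32_t sketch_reset(struct sketch_minmax *m, uint32_t t, uint32_t v)
{
	struct sketch_sample val = { t, v };

	m->s[0] = m->s[1] = m->s[2] = val;
	return m->s[0].v;
}

/* Promote the backup samples once the best one has left the window. */
static uint32_t sketch_subwin_update(struct sketch_minmax *m, uint32_t win,
				     const struct sketch_sample *val)
{
	uint32_t dt = val->t - m->s[0].t;

	if (dt > win) {
		/* Best sample expired: shift 2nd and 3rd choices up. */
		m->s[0] = m->s[1];
		m->s[1] = m->s[2];
		m->s[2] = *val;
		if (val->t - m->s[0].t > win) {
			m->s[0] = m->s[1];
			m->s[1] = m->s[2];
			m->s[2] = *val;
		}
	} else if (m->s[1].t == m->s[0].t && dt > win / 4) {
		/* A quarter of the window passed without a 2nd choice. */
		m->s[2] = m->s[1] = *val;
	} else if (m->s[2].t == m->s[1].t && dt > win / 2) {
		/* Half the window passed without a 3rd choice. */
		m->s[2] = *val;
	}
	return m->s[0].v;
}

/* Feed one measurement; returns the current windowed-max estimate. */
static uint32_t sketch_running_max(struct sketch_minmax *m, uint32_t win,
				   uint32_t t, uint32_t meas)
{
	struct sketch_sample val = { t, meas };

	if (val.v >= m->s[0].v ||	/* found a new max? */
	    val.t - m->s[2].t > win)	/* nothing left in the window? */
		return sketch_reset(m, t, meas);

	if (val.v >= m->s[1].v)
		m->s[2] = m->s[1] = val;
	else if (val.v >= m->s[2].v)
		m->s[2] = val;

	return sketch_subwin_update(m, win, &val);
}
```

A caller would seed the tracker with sketch_reset() and then pass every new measurement to sketch_running_max(); a windowed-min variant would flip the comparisons.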

Signed-off-by: Van Jacobson
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Signed-off-by: Nandita Dukkipati
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Neal Cardwell
2016-09-21 12:22:59 +0800

20 Sep, 2016

1 commit

ca26893f0 rhashtable: Add rhlist interface

The insecure_elasticity setting is an ugly wart brought out by
users who need to insert duplicate objects (that is, distinct
objects with identical keys) into the same table.

In fact, those users have a much bigger problem. Once those
duplicate objects are inserted, they don't have an interface to
find them (unless you count the walker interface which walks
over the entire table).

Some users have resorted to doing a manual walk over the hash
table which is of course broken because they don't handle the
potential existence of multiple hash tables. The result is that
they will break sporadically when they encounter a hash table
resize/rehash.

This patch provides a way out for those users, at the expense
of an extra pointer per object. Essentially each object is now
a list of objects carrying the same key. The hash table will
only see the lists so nothing changes as far as rhashtable is
concerned.

To use this new interface, you need to insert a struct rhlist_head
into your objects instead of struct rhash_head. While the hash
table is unchanged, for type-safety you'll need to use struct
rhltable instead of struct rhashtable. All the existing interfaces
have been duplicated for rhlist, including the hash table walker.

One missing feature is nulls marking because AFAIK the only potential
user of it does not need duplicate objects. Should anyone need
this it shouldn't be too hard to add.
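
The "each object is a list of objects carrying the same key" idea can be illustrated with a small userspace sketch. Everything here is hypothetical (a fixed-size table, integer keys, the `sketch_` names); it is not the rhltable API, only a picture of how one extra pointer per object lets duplicates hang off a single bucket entry.

```c
#include <stddef.h>

#define SKETCH_BUCKETS 8

struct sketch_obj {
	int key;
	int value;
	struct sketch_obj *next;     /* next distinct key in this bucket */
	struct sketch_obj *dup_next; /* extra pointer: next object with the same key */
};

struct sketch_table {
	struct sketch_obj *bucket[SKETCH_BUCKETS];
};

static unsigned int sketch_hash(int key)
{
	return (unsigned int)key % SKETCH_BUCKETS;
}

/* Return the head of the list of objects carrying @key, or NULL. */
static struct sketch_obj *sketch_lookup(struct sketch_table *t, int key)
{
	struct sketch_obj *o;

	for (o = t->bucket[sketch_hash(key)]; o; o = o->next)
		if (o->key == key)
			return o;
	return NULL;
}

/* Insert @obj; duplicates are chained behind the existing head. */
static void sketch_insert(struct sketch_table *t, struct sketch_obj *obj)
{
	struct sketch_obj *head = sketch_lookup(t, obj->key);
	unsigned int b = sketch_hash(obj->key);

	obj->next = NULL;
	obj->dup_next = NULL;
	if (head) {
		/* Same key already present: the bucket chain is untouched. */
		obj->dup_next = head->dup_next;
		head->dup_next = obj;
		return;
	}
	obj->next = t->bucket[b];
	t->bucket[b] = obj;
}

/* Count the objects stored under @key by walking the duplicate list. */
static int sketch_count(struct sketch_table *t, int key)
{
	struct sketch_obj *o;
	int n = 0;

	for (o = sketch_lookup(t, key); o; o = o->dup_next)
		n++;
	return n;
}
```

The bucket chain only ever sees one entry per key, which is why the hash table itself needs no changes in this scheme.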

Signed-off-by: Herbert Xu
Acked-by: Thomas Graf
Signed-off-by: David S. Miller

Herbert Xu
2016-09-20 16:43:36 +0800

18 Sep, 2016

1 commit

d4690f1e1 fix iov_iter_fault_in_readable()

... by turning it into what used to be multipages counterpart

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2016-09-18 05:05:30 +0800

16 Sep, 2016

1 commit

5c0ca3f56 test_bpf: fix the dummy skb after dissector changes

Commit d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan
info from skb->vlan_tci") made flow dissector look at vlan_proto
when vlan is present. Since test_bpf sets skb->vlan_tci to ~0
(including VLAN_TAG_PRESENT) we have to populate skb->vlan_proto.

Fixes false negative on test #24:
test_bpf: #24 LD_PAYLOAD_OFF jited:0 175 ret 0 != 42 FAIL (1 times)

Signed-off-by: Jakub Kicinski
Reviewed-by: Dinan Gunawardena
Acked-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Jakub Kicinski
2016-09-16 07:17:15 +0800

13 Sep, 2016

1 commit

b20b378d4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/qlogic/qed/qed_dcbx.c
drivers/net/phy/Kconfig

All conflicts were cases of overlapping commits.

Signed-off-by: David S. Miller

David S. Miller
2016-09-13 06:52:44 +0800

07 Sep, 2016

1 commit

60175ccdf Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree. Most relevant updates are the removal of per-conntrack timers to
use a workqueue/garbage collection approach instead from Florian
Westphal, the hash and numgen expression for nf_tables from Laura
Garcia, updates on nf_tables hash set to honor the NLM_F_EXCL flag,
removal of ip_conntrack sysctl and many other incremental updates on our
Netfilter codebase.

More specifically, they are:

1) Retrieve only 4 bytes to fetch ports in case of non-linear skb
transport area in dccp, sctp, tcp, udp and udplite protocol
conntrackers, from Gao Feng.

2) Missing whitespace on error message in physdev match, from Hangbin Liu.

3) Skip redundant IPv4 checksum calculation in nf_dup_ipv4, from Liping Zhang.

4) Add nf_ct_expires() helper function and use it, from Florian Westphal.

5) Replace opencoded nf_ct_kill() call in IPVS conntrack support, also
from Florian.

6) Rename nf_tables set implementation to nft_set_{name}.c

7) Introduce the hash expression to allow arbitrary hashing of selector
concatenations, from Laura Garcia Liebana.

8) Remove ip_conntrack sysctl backward compatibility code, this code has
been around for a long time already, and we have two interfaces to do
this already: nf_conntrack sysctl and ctnetlink.

9) Use nf_conntrack_get_ht() helper function whenever possible, instead
of opencoding fetch of hashtable pointer and size, patch from Liping Zhang.

10) Add quota expression for nf_tables.

11) Add number generator expression for nf_tables, this supports
incremental and random generators that can be combined with maps,
very useful for load balancing purposes, again from Laura Garcia Liebana.

12) Fix a typo in a debug message in FTP conntrack helper, from Colin Ian King.

13) Introduce a nft_chain_parse_hook() helper function to parse chain hook
configuration, this is used by a follow up patch to perform better chain
update validation.

14) Add rhashtable_lookup_get_insert_key() to rhashtable and use it from the
nft_set_hash implementation to honor the NLM_F_EXCL flag.

15) Missing nulls check in nf_conntrack from nf_conntrack_tuple_taken(),
patch from Florian Westphal.

16) Don't use the DYING bit to know if the conntrack event has been already
delivered, instead use a state variable to track event re-delivery
states, also from Florian.

17) Remove the per-conntrack timer, use the workqueue approach that was
discussed during the NFWS, from Florian Westphal.

18) Use the netlink conntrack table dump path to kill stale entries,
again from Florian.

19) Add a garbage collector to get rid of stale conntracks, from
Florian.

20) Reschedule garbage collector if eviction rate is high.

21) Get rid of the __nf_ct_kill_acct() helper.

22) Use ARPHRD_ETHER instead of hardcoded 1 from ARP logger.

23) Make nf_log_set() interface assertive on unsupported families.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-09-07 03:45:26 +0800

02 Sep, 2016

2 commits

e6173ba42 lib/test_hash.c: fix warning in preprocessor symbol evaluation

Some versions of gcc don't like tests for the value of an undefined
preprocessor symbol, even in the #else branch of an #ifndef:

lib/test_hash.c:224:7: warning: "HAVE_ARCH__HASH_32" is not defined [-Wundef]
#elif HAVE_ARCH__HASH_32 != 1
^
lib/test_hash.c:229:7: warning: "HAVE_ARCH_HASH_32" is not defined [-Wundef]
#elif HAVE_ARCH_HASH_32 != 1
^
lib/test_hash.c:234:7: warning: "HAVE_ARCH_HASH_64" is not defined [-Wundef]
#elif HAVE_ARCH_HASH_64 != 1
^

Seen with gcc 4.9, not seen with 4.1.2.

Change the logic to only check the value inside an #ifdef to fix this.
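
The pattern described above can be sketched as follows. The macro name SKETCH_HAVE_ARCH_HASH is hypothetical and stands in for the HAVE_ARCH_* symbols; the point is only that the value test sits inside an #ifdef, so the preprocessor never evaluates an undefined symbol and -Wundef stays quiet.

```c
/* Hypothetical feature macro; an architecture header may or may not define it. */
/* #define SKETCH_HAVE_ARCH_HASH 1 */

#ifdef SKETCH_HAVE_ARCH_HASH
/* Only reached when the symbol exists, so testing its value is safe. */
#if SKETCH_HAVE_ARCH_HASH != 1
#error "SKETCH_HAVE_ARCH_HASH must be 1 if defined"
#endif
#define SKETCH_USES_ARCH_HASH 1
#else
#define SKETCH_USES_ARCH_HASH 0
#endif

/* Report which branch the preprocessor selected. */
static int sketch_uses_arch_hash(void)
{
	return SKETCH_USES_ARCH_HASH;
}
```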

Fixes: 468a9428521e7d00 ("<linux/hash.h>: Add support for architecture-specific functions")
Link: http://lkml.kernel.org/r/20160829214952.1334674-4-arnd@arndb.de
Signed-off-by: Geert Uytterhoeven
Signed-off-by: Arnd Bergmann
Acked-by: George Spelvin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2016-09-02 08:52:01 +0800

ed76b7a13 lib/test_hash.c: fix warning in two-dimensional array init

lib/test_hash.c: In function 'test_hash_init':
lib/test_hash.c:146:2: warning: missing braces around initializer [-Wmissing-braces]

Fixes: 468a9428521e7d00 ("<linux/hash.h>: Add support for architecture-specific functions")
Link: http://lkml.kernel.org/r/20160829214952.1334674-3-arnd@arndb.de
Signed-off-by: Geert Uytterhoeven
Signed-off-by: Arnd Bergmann
Acked-by: George Spelvin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2016-09-02 08:52:01 +0800

31 Aug, 2016

1 commit

0d025d271 mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS

There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

This is a static warning which happens when object size and copy size
are both const, and copy size > object size. I didn't see any false
positives for this one. So the function warning attribute seems to
be working fine here.

Note this scenario is always a bug and so I think it should be
changed to *always* be an error, regardless of
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

This is another static warning which happens when I enable
__compiletime_object_size() for new compilers (and
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS). It happens when object size
is const, but copy size is *not*. In this case there's no way to
compare the two at build time, so it gives the warning. (Note the
warning is a byproduct of the fact that gcc has no way of knowing
whether the overflow function will be called, so the call isn't dead
code and the warning attribute is activated.)

So this warning seems to only indicate "this is an unusual pattern,
maybe you should check it out" rather than "this is a bug".

I get 102(!) of these warnings with allyesconfig and the
__compiletime_object_size() gcc check removed. I don't know if there
are any real bugs hiding in there, but from looking at a small
sample, I didn't see any. According to Kees, it does sometimes find
real bugs. But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

This is a runtime warning where object size is const, and copy size >
object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

2fb0815c9ee6 ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size(). But in fact,
__compiletime_object_size() seems to be working fine. The false
positives were instead triggered by #2 above. (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.

Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf
Cc: Kees Cook
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H . Peter Anvin"
Cc: Andy Lutomirski
Cc: Steven Rostedt
Cc: Brian Gerst
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Byungchul Park
Cc: Nilay Vaish
Signed-off-by: Linus Torvalds

Josh Poimboeuf
2016-08-31 01:10:21 +0800

30 Aug, 2016

1 commit

6abdd5f59 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

All three conflicts were cases of simple overlapping
changes.

Signed-off-by: David S. Miller

David S. Miller
2016-08-30 12:54:02 +0800

27 Aug, 2016

1 commit

9dbeea7f0 rhashtable: fix a memory leak in alloc_bucket_locks()

If vmalloc() was successful, do not attempt a kmalloc_array()

Fixes: 4cf0b354d92e ("rhashtable: avoid large lock-array allocations")
Reported-by: CAI Qian
Signed-off-by: Eric Dumazet
Cc: Florian Westphal
Acked-by: Herbert Xu
Tested-by: CAI Qian
Signed-off-by: David S. Miller

Eric Dumazet
2016-08-27 12:59:53 +0800

26 Aug, 2016

1 commit

5ca8cc5bf rhashtable: add rhashtable_lookup_get_insert_key()

This patch modifies __rhashtable_insert_fast() so it returns the
existing object that clashes with the one that you want to insert.
In case the object is successfully inserted, NULL is returned.
Otherwise, you get an error via ERR_PTR().

This patch adapts the existing callers of __rhashtable_insert_fast()
so they handle this new logic, and it adds a new
rhashtable_lookup_get_insert_key() interface to fetch this existing
object.

nf_tables needs this change to improve handling of EEXIST cases via
honoring the NLM_F_EXCL flag and by checking if the data part of the
mapping matches what we have.
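
The "return the clashing object, or NULL on successful insert" contract can be sketched in userspace C. This is an illustration only: the table is a tiny array, all `sketch_` names are hypothetical, errors are reported through an out-parameter rather than ERR_PTR(), and the caller policy in sketch_add() is an assumption about how an exclusive-create flag and a data comparison might be combined.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_SLOTS 4

struct sketch_entry {
	int key;
	int data;
};

struct sketch_set {
	struct sketch_entry *slot[SKETCH_SLOTS];
	int used;
};

/*
 * Returns the already-present entry with the same key, or NULL if @e was
 * inserted. A full table is reported through *err.
 */
static struct sketch_entry *sketch_lookup_get_insert(struct sketch_set *s,
						     struct sketch_entry *e,
						     int *err)
{
	int i;

	*err = 0;
	for (i = 0; i < s->used; i++)
		if (s->slot[i]->key == e->key)
			return s->slot[i];
	if (s->used == SKETCH_SLOTS) {
		*err = -ENOSPC;
		return NULL;
	}
	s->slot[s->used++] = e;
	return NULL;
}

/*
 * Caller policy: with @excl set, any clash is an error; without it, a clash
 * is tolerated only when the existing mapping carries the same data.
 */
static int sketch_add(struct sketch_set *s, struct sketch_entry *e, int excl)
{
	int err;
	struct sketch_entry *old = sketch_lookup_get_insert(s, e, &err);

	if (err)
		return err;
	if (!old)
		return 0;
	if (excl || old->data != e->data)
		return -EEXIST;
	return 0;
}
```

Getting the existing object back in the same call is what lets the caller compare the data part without a second lookup.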

Cc: Herbert Xu
Cc: Thomas Graf
Signed-off-by: Pablo Neira Ayuso
Acked-by: Herbert Xu

Pablo Neira Ayuso
2016-08-26 23:29:41 +0800

20 Aug, 2016

1 commit

246779dd0 rhashtable: Remove GFP flag from rhashtable_walk_init

The commit 8f6fd83c6c5ec66a4a70c728535ddcdfef4f3697 ("rhashtable:
accept GFP flags in rhashtable_walk_init") added a GFP flag argument
to rhashtable_walk_init because some users wish to use the walker
in an unsleepable context.

In fact we don't need to allocate memory in rhashtable_walk_init
at all. The walker is always paired with an iterator so we could
just stash ourselves there.

This patch does that by introducing a new enter function to replace
the existing init function. This way we don't have to churn all
the existing users again.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2016-08-20 05:40:24 +0800

18 Aug, 2016

1 commit

184ca8234 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
Fietkau.

2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

3) Fix some tg3 ethtool logic bugs, and one that would cause no
interrupts to be generated when rx-coalescing is set to 0. From
Satish Baddipadige and Siva Reddy Kallam.

4) QLCNIC mailbox corruption and napi budget handling fix from Manish
Chopra.

5) Fix fib_trie logic when walking the trie during /proc/net/route
output that can access a stale node pointer. From David Forster.

6) Several sctp_diag fixes from Phil Sutter.

7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

8) Checksum fixup fixes in bpf from Daniel Borkmann.

9) Memory leaks in nfnetlink, from Liping Zhang.

10) Use after free in rxrpc, from David Howells.

11) Use after free in new skb_array code of macvtap driver, from Jason
Wang.

12) Calipso resource leak, from Colin Ian King.

13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

15) Fix lockdep splats in macsec, from Sabrina Dubroca.

16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
handling.

17) Various tc-action bug fixes, from CONG Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net_sched: allow flushing tc police actions
net_sched: unify the init logic for act_police
net_sched: convert tcf_exts from list to pointer array
net_sched: move tc offload macros to pkt_cls.h
net_sched: fix a typo in tc_for_each_action()
net_sched: remove an unnecessary list_del()
net_sched: remove the leftover cleanup_a()
mlxsw: spectrum: Allow packets to be trapped from any PG
mlxsw: spectrum: Unmap 802.1Q FID before destroying it
mlxsw: spectrum: Add missing rollbacks in error path
mlxsw: reg: Fix missing op field fill-up
mlxsw: spectrum: Trap loop-backed packets
mlxsw: spectrum: Add missing packet traps
mlxsw: spectrum: Mark port as active before registering it
mlxsw: spectrum: Create PVID vPort before registering netdevice
mlxsw: spectrum: Remove redundant errors from the code
mlxsw: spectrum: Don't return upon error in removal path
i40e: check for and deal with non-contiguous TCs
ixgbe: Re-enable ability to toggle VLAN filtering
ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
...

Linus Torvalds
2016-08-18 08:26:58 +0800

16 Aug, 2016

1 commit

12311959e rhashtable: fix shift by 64 when shrinking

I got this:

================================================================================
UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
Workqueue: events rht_deferred_worker
0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
Call Trace:
dump_stack+0xac/0xfc
? _atomic_dec_and_lock+0xc4/0xc4
ubsan_epilogue+0xd/0x8a
__ubsan_handle_shift_out_of_bounds+0x255/0x29a
? __ubsan_handle_out_of_bounds+0x180/0x180
? nl80211_req_set_reg+0x256/0x2f0
? print_context_stack+0x8a/0x160
? amd_pmu_reset+0x341/0x380
rht_deferred_worker+0x1618/0x1790
? rht_deferred_worker+0x1618/0x1790
? rhashtable_jhash2+0x370/0x370
? process_one_work+0x6fd/0x1970
process_one_work+0x79f/0x1970
? process_one_work+0x6fd/0x1970
? try_to_grab_pending+0x4c0/0x4c0
? worker_thread+0x1c4/0x1340
worker_thread+0x55f/0x1340
? __schedule+0x4df/0x1d40
? process_one_work+0x1970/0x1970
? process_one_work+0x1970/0x1970
kthread+0x237/0x390
? __kthread_parkme+0x280/0x280
? _raw_spin_unlock_irq+0x33/0x50
ret_from_fork+0x1f/0x40
? __kthread_parkme+0x280/0x280
================================================================================

roundup_pow_of_two() is undefined when called with an argument of 0, so
let's avoid the call and just fall back to ht->p.min_size (which should
never be smaller than HASH_MIN_SIZE).
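
The guard can be sketched in userspace C as below. The helper names are hypothetical, and the 3/2 headroom factor is an assumption used for illustration; the point is that the power-of-two rounding is simply skipped for an empty table and the minimum size is used instead.

```c
/* Smallest power of two >= n; only meaningful for 1 <= n <= 2^31. */
static unsigned int sketch_roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Target size when shrinking a table that holds @nelems elements. */
static unsigned int sketch_shrink_target(unsigned int nelems,
					 unsigned int min_size)
{
	unsigned int size = 0;

	/* Never round up zero: fall through to min_size instead. */
	if (nelems)
		size = sketch_roundup_pow2(nelems * 3 / 2);
	if (size < min_size)
		size = min_size;
	return size;
}
```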

Cc: Herbert Xu
Signed-off-by: Vegard Nossum
Acked-by: Herbert Xu
Signed-off-by: David S. Miller

Vegard Nossum
2016-08-16 02:10:09 +0800

15 Aug, 2016

1 commit

4cf0b354d rhashtable: avoid large lock-array allocations

Sander reports the following splat after netfilter nat bysrc table got
converted to rhashtable:

swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..]
warn_alloc_failed+0xdd/0x140
__alloc_pages_nodemask+0x3e1/0xcf0
alloc_pages_current+0x8d/0x110
kmalloc_order+0x1f/0x70
__kmalloc+0x129/0x140
bucket_table_alloc+0xc1/0x1d0
rhashtable_insert_rehash+0x5d/0xe0
nf_nat_setup_info+0x2ef/0x400

The failure happens when allocating the spinlock array.
Even with GFP_KERNEL it's unlikely for such a large allocation
to succeed.

Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition
to adding NOWARN for atomic allocations this also makes the bucket-array
sizing more conservative.

In commit 095dc8e0c3686 ("tcp: fix/cleanup inet_ehash_locks_alloc()"),
Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'".
In other words, consider the size needed by a single spinlock when
determining the number of locks per cpu. With two 64-byte cache lines and
4 bytes per spinlock, this gives 32 locks per cpu.

Resulting size of the lock-array (sizeof(spinlock) == 4):

cpus:    1    2    4    8   16   32   64
old:    1k   1k   4k   8k  16k  16k  16k
new:   128  256  512   1k   2k   4k   8k

8k allocation should have decent chance of success even
with GFP_ATOMIC, and should not fail with GFP_KERNEL.

With 72-byte spinlock (LOCKDEP):

cpus:    1    2
old:    9k  18k
new:   ~2k  ~4k
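
The "new" rows above follow from simple arithmetic, sketched here in userspace C. The constants come from the figures quoted in this message (two 64-byte cache lines, 4-byte spinlock); the `sketch_` names and the rounding of the lock count up to a power of two are assumptions for illustration.

```c
#include <stddef.h>

#define SKETCH_CACHELINE_BYTES  64u /* cache line size used in the message */
#define SKETCH_LINES_PER_CPU     2u /* "2 cache lines per cpu" budget */
#define SKETCH_PLAIN_LOCK_BYTES  4u /* sizeof(spinlock) without LOCKDEP */

/* Smallest power of two >= n, for n >= 1. */
static size_t sketch_pow2_ceil(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* 2 * 64 / 4 = 32 locks per cpu, independent of the actual lock size. */
static size_t sketch_locks_per_cpu(void)
{
	return SKETCH_LINES_PER_CPU * SKETCH_CACHELINE_BYTES /
	       SKETCH_PLAIN_LOCK_BYTES;
}

/* Bytes needed for the lock array on a machine with @cpus cpus. */
static size_t sketch_lock_array_bytes(size_t cpus, size_t lock_bytes)
{
	return sketch_pow2_ceil(cpus * sketch_locks_per_cpu()) * lock_bytes;
}
```

For example, 64 cpus with 4-byte locks give 2048 locks and 8192 bytes, matching the 8k entry, while one cpu with 72-byte LOCKDEP locks gives 32 * 72 = 2304 bytes, the "~2k" entry.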

Reported-by: Sander Eikelenboom
Suggested-by: Thomas Graf
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2016-08-15 12:12:57 +0800

09 Aug, 2016

2 commits

1bd4403d8 unsafe_[get|put]_user: change interface to use a error target label

When I initially added the unsafe_[get|put]_user() helpers in commit
5b24a7a2aa20 ("Add 'unsafe' user access functions for batched
accesses"), I made the mistake of modeling the interface on our
traditional __[get|put]_user() functions, which return zero on success,
or -EFAULT on failure.

That interface is fairly easy to use, but it's actually fairly nasty for
good code generation, since it essentially forces the caller to check
the error value for each access.

In particular, since the error handling is already internally
implemented with an exception handler, and we already use "asm goto" for
various other things, we could fairly easily make the error cases just
jump directly to an error label instead, and avoid the need for explicit
checking after each operation.

So switch the interface to pass in an error label, rather than checking
the error value in the caller. Best do it now before we start growing
more users (the signal handling code in particular would be a good place
to use the new interface).

So rather than

if (unsafe_get_user(x, ptr))
... handle error ..

the interface is now

unsafe_get_user(x, ptr, label);

where an error during the user mode fetch will now just cause a jump to
'label' in the caller.

Right now the actual _implementation_ of this all still ends up being a
"if (err) goto label", and does not take advantage of any exception
label tricks, but for "unsafe_put_user()" in particular it should be
fairly straightforward to convert to using the exception table model.

Note that "unsafe_get_user()" is much harder to convert to a clever
exception table model, because current versions of gcc do not allow the
use of "asm goto" (for the exception) with output values (for the actual
value to be fetched). But that is hopefully not a limitation in the
long term.

[ Also note that it might be a good idea to switch unsafe_get_user() to
actually _return_ the value it fetches from user space, but this
commit only changes the error handling semantics ]
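
The shape of the label-based interface can be imitated in plain userspace C. This is only an analogy: sketch_get() is a hypothetical macro in which a NULL pointer plays the role of a faulting user address, and it is implemented as the same "if (err) goto label" described above, with no exception table involved.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a faulting access: a NULL pointer "faults". */
#define sketch_get(x, ptr, label)		\
	do {					\
		if ((ptr) == NULL)		\
			goto label;		\
		(x) = *(ptr);			\
	} while (0)

/* Batch of three reads with a single error exit and no per-access checks. */
static int sketch_sum3(const int *a, const int *b, const int *c, int *out)
{
	int x, y, z;

	sketch_get(x, a, efault);
	sketch_get(y, b, efault);
	sketch_get(z, c, efault);
	*out = x + y + z;
	return 0;

efault:
	return -EFAULT;
}
```

The caller's straight-line path contains no explicit error tests, which is the property that a later exception-table implementation could exploit.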

Signed-off-by: Linus Torvalds

Linus Torvalds
2016-08-09 04:02:01 +0800

3b3bf80b9 rhashtable-test: Fix max_size parameter description

Looks like a simple copy'n'paste error.

Fixes: 1aa661f5c3df1 ("rhashtable-test: Measure time to insert, remove & traverse entries")
Signed-off-by: Phil Sutter
Signed-off-by: David S. Miller

Phil Sutter
2016-08-09 03:52:42 +0800

04 Aug, 2016

2 commits

9049fc745 dynamic_debug: add jump label support

Although dynamic debug is often only used for debug builds, sometimes
it's enabled for production builds as well. Minimize its impact by using
jump labels. This reduces the text section by 7000+ bytes in the kernel
image below. It does increase data, but this should only be referenced
when changing the direction of the branches, and hence usually not in
cache.

   text    data     bss      dec     hex filename
8194852 4879776  925696 14000324  d5a0c4 vmlinux.pre
8187337 4960224  925696 14073257  d6bda9 vmlinux.post

Link: http://lkml.kernel.org/r/d165b465e8c89bc582d973758d40be44c33f018b.1467837322.git.jbaron@akamai.com
Signed-off-by: Jason Baron
Cc: "David S. Miller"
Cc: Arnd Bergmann
Cc: Benjamin Herrenschmidt
Cc: Chris Metcalf
Cc: Heiko Carstens
Cc: Joe Perches
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jason Baron
2016-08-04 20:50:07 +0800

00085f1ef dma-mapping: use unsigned long for dma_attrs

The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:

1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safety and checking for const correctness because the
attributes are passed by value.
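
The two points above can be illustrated with a short sketch. The attribute names below are hypothetical placeholders rather than real DMA attribute definitions; the sketch only shows that a plain unsigned long lets the caller set bits inline and that the callee receives its own copy.

```c
/* Hypothetical attribute bits for illustration only. */
#define SKETCH_ATTR_WRITE_COMBINE     (1UL << 0)
#define SKETCH_ATTR_NO_KERNEL_MAPPING (1UL << 1)

/* Attributes arrive by value, so the callee cannot alter the caller's copy. */
static int sketch_wants_write_combine(unsigned long attrs)
{
	return (attrs & SKETCH_ATTR_WRITE_COMBINE) != 0;
}

/* No attributes is simply 0 instead of a NULL pointer. */
static unsigned long sketch_no_attrs(void)
{
	return 0;
}
```

A call site then reads sketch_wants_write_combine(SKETCH_ATTR_WRITE_COMBINE | SKETCH_ATTR_NO_KERNEL_MAPPING), with no stack object to initialize first.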

Semantic patches for this change (at least most of them):

virtual patch
virtual context

@r@
identifier f, attrs;

@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}

@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)

and

// Options: --all-includes
virtual patch
virtual context

@r@
identifier f, attrs;
type t;

@@
t f(..., struct dma_attrs *attrs);

@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski
Acked-by: Vineet Gupta
Acked-by: Robin Murphy
Acked-by: Hans-Christian Noren Egtvedt
Acked-by: Mark Salter [c6x]
Acked-by: Jesper Nilsson [cris]
Acked-by: Daniel Vetter [drm]
Reviewed-by: Bart Van Assche
Acked-by: Joerg Roedel [iommu]
Acked-by: Fabien Dessenne [bdisp]
Reviewed-by: Marek Szyprowski [vb2-core]
Acked-by: David Vrabel [xen]
Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
Acked-by: Joerg Roedel [iommu]
Acked-by: Richard Kuo [hexagon]
Acked-by: Geert Uytterhoeven [m68k]
Acked-by: Gerald Schaefer [s390]
Acked-by: Bjorn Andersson
Acked-by: Hans-Christian Noren Egtvedt [avr32]
Acked-by: Vineet Gupta [arc]
Acked-by: Robin Murphy [arm64 and dma-iommu]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Krzysztof Kozlowski
2016-08-04 20:50:07 +0800

03 Aug, 2016

8 commits

d52bd54db Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:

- the rest of ocfs2

- various hotfixes, mainly MM

- quite a bit of misc stuff - drivers, fork, exec, signals, etc.

- printk updates

- firmware

- checkpatch

- nilfs2

- more kexec stuff than usual

- rapidio updates

- w1 things

* emailed patches from Andrew Morton: (111 commits)
ipc: delete "nr_ipc_ns"
kcov: allow more fine-grained coverage instrumentation
init/Kconfig: add clarification for out-of-tree modules
config: add android config fragments
init/Kconfig: ban CONFIG_LOCALVERSION_AUTO with allmodconfig
relay: add global mode support for buffer-only channels
init: allow blacklisting of module_init functions
w1:omap_hdq: fix regression
w1: add helper macro module_w1_family
w1: remove need for ida and use PLATFORM_DEVID_AUTO
rapidio/switches: add driver for IDT gen3 switches
powerpc/fsl_rio: apply changes for RIO spec rev 3
rapidio: modify for rev.3 specification changes
rapidio: change inbound window size type to u64
rapidio/idt_gen2: fix locking warning
rapidio: fix error handling in mbox request/release functions
rapidio/tsi721_dma: advance queue processing from transfer submit call
rapidio/tsi721: add messaging mbox selector parameter
rapidio/tsi721: add PCIe MRRS override parameter
rapidio/tsi721_dma: add channel mask and queue size parameters
...

Linus Torvalds
2016-08-03 09:08:07 +0800

a4691deab kcov: allow more fine-grained coverage instrumentation

For more targeted fuzzing, it's better to disable kernel-wide
instrumentation and instead enable it on a per-subsystem basis. This
follows the pattern of UBSAN and allows you to compile in the kcov
driver without instrumenting the whole kernel.

To instrument a part of the kernel, you can use either

# for a single file in the current directory
KCOV_INSTRUMENT_filename.o := y

or

# for all the files in the current directory (excluding subdirectories)
KCOV_INSTRUMENT := y

or

# (same as above)
ccflags-y += $(CFLAGS_KCOV)

or

# for all the files in the current directory (including subdirectories)
subdir-ccflags-y += $(CFLAGS_KCOV)

Link: http://lkml.kernel.org/r/1464008380-11405-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum
Cc: Dmitry Vyukov
Cc: Quentin Casasnovas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vegard Nossum
2016-08-03 07:35:43 +0800

a9bfd3321 crc32: use ktime_get_ns() for measurement

The crc32 test function measures the elapsed time in nanoseconds, but
uses 'struct timespec' for that. We want to remove timespec from the
kernel for y2038 compatibility, and ktime_get_ns() also helps make the
code simpler here.

It is also slightly better to use monotonic time, as we are only
interested in the time difference.

Link: http://lkml.kernel.org/r/20160617143932.3289626-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann
Cc: "David S . Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arnd Bergmann
2016-08-03 07:35:08 +0800

f003a1f18 lib/iommu-helper: skip to next segment

When a large enough area in the iommu bitmap is found but would span a
boundary we continue the search starting from the next bit position.
For large allocations this can lead to several useless invocations of
bitmap_find_next_zero_area() and iommu_is_span_boundary().

Continue the search from the start of the next segment (which is the
next bit position such that we'll not cross the same segment boundary
again).

Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1606081910070.3211@schleppi
Signed-off-by: Sebastian Ott
Reviewed-by: Gerald Schaefer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sebastian Ott
2016-08-03 07:35:07 +0800

6b1d174b0 ratelimit: extend to print suppressed messages on release

Extend the ratelimiting facility to print the number of suppressed lines
when it is being released.

This use case is aimed at short-lived, burst-like users for which we
want to output the suppressed-lines stats only once, after the ratelimit
state has been disposed of. For an example, see /dev/kmsg usage in a
follow-on patch.

Also, change the printk() line we issue on release to not use
"callbacks" as it is misleading: we're not suppressing callbacks but
printk() calls.

This has been separated from a previous patch by Linus.

Link: http://lkml.kernel.org/r/20160716061745.15795-2-bp@alien8.de
Signed-off-by: Borislav Petkov
Cc: Dave Young
Cc: Franck Bui
Cc: Greg Kroah-Hartman
Cc: Ingo Molnar
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Uwe Kleine-König
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Borislav Petkov
2016-08-03 07:35:06 +0800

901d805c3 UBSAN: fix typo in format string

handle_object_size_mismatch() used %pk to format a kernel pointer with
pr_err(). This seemed to be a misspelling for %pK, but using this to
format a kernel pointer does not make much sense here.

Therefore use %p instead, like in handle_missaligned_access().

Link: http://lkml.kernel.org/r/20160730083010.11569-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Nicolas Iooss
Acked-by: Andrey Ryabinin
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nicolas Iooss
2016-08-03 05:31:41 +0800

05eb6e726 radix-tree: account nodes to memcg only if explicitly requested

Radix trees may be used not only for storing page cache pages, so
unconditionally accounting radix tree nodes to the current memory cgroup
is bad: if a radix tree node is used for storing data shared among
different cgroups we risk pinning dead memory cgroups forever.

So let's only account radix tree nodes if it was explicitly requested by
passing __GFP_ACCOUNT to INIT_RADIX_TREE. Currently, we only want to
account page cache entries, so mark mapping->page_tree so.

Fixes: 58e698af4c63 ("radix-tree: account radix_tree_node to memory cgroup")
Link: http://lkml.kernel.org/r/1470057188-7864-1-git-send-email-vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: [4.6+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2016-08-03 05:31:41 +0800
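
The opt-in accounting described in 05eb6e726 above can be illustrated with a minimal userspace sketch. It is not the kernel implementation: the `sketch_` names and the flag values are hypothetical stand-ins for `INIT_RADIX_TREE`, `GFP_KERNEL` and `__GFP_ACCOUNT`, and only the decision "charge nodes to a memory cgroup if the owner asked for it" is modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel GFP flags; the values are illustrative. */
typedef unsigned int gfp_t;
#define SKETCH_GFP_KERNEL  0x01u
#define SKETCH_GFP_ACCOUNT 0x02u

/* Simplified tree root: only the allocation mask matters for this sketch. */
struct sketch_tree_root {
	gfp_t gfp_mask;
};

/* Records the caller's mask, in the spirit of INIT_RADIX_TREE(root, mask). */
static void sketch_init_tree(struct sketch_tree_root *root, gfp_t mask)
{
	root->gfp_mask = mask;
}

/* Nodes are accounted only when the tree owner explicitly opted in. */
static bool sketch_node_is_accounted(const struct sketch_tree_root *root)
{
	return (root->gfp_mask & SKETCH_GFP_ACCOUNT) != 0;
}
```

A tree shared among cgroups would be initialised without the accounting flag, while a page cache mapping would pass it, so only the latter has its nodes charged.
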
f716a85cd Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild ... Browse Code »

Pull kbuild updates from Michal Marek:

- GCC plugin support by Emese Revfy from grsecurity, with a fixup from
Kees Cook. The plugins are meant to be used for static analysis of
the kernel code. Two plugins are provided already.

- reduction of the gcc commandline by Arnd Bergmann.

- IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada

- bin2c fix by Michael Tautschnig

- setlocalversion fix by Wolfram Sang

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
gcc-plugins: disable under COMPILE_TEST
kbuild: Abort build on bad stack protector flag
scripts: Fix size mismatch of kexec_purgatory_size
kbuild: make samples depend on headers_install
Kbuild: don't add obj tree in additional includes
Kbuild: arch: look for generated headers in obtree
Kbuild: always prefix objtree in LINUXINCLUDE
Kbuild: avoid duplicate include path
Kbuild: don't add ../../ to include path
vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
kconfig.h: use already defined macros for IS_REACHABLE() define
export.h: use __is_defined() to check if __KSYM_* is defined
kconfig.h: use __is_defined() to check if MODULE is defined
kbuild: setlocalversion: print error to STDERR
Add sancov plugin
Add Cyclomatic complexity GCC plugin
GCC plugin infrastructure
Shared library support

Linus Torvalds
2016-08-03 04:37:12 +0800

02 Aug, 2016

1 commit

f38d2e531 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 ... Browse Code »

Pull crypto fixes from Herbert Xu:
"This fixes a number of regressions in the marvell cesa driver caused
by the chaining work, and a regression in lib/mpi that leads to a
GFP_KERNEL allocation with preemption disabled"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: marvell - Don't copy IV vectors from the _process op for ciphers
lib/mpi: Fix SG miter leak
crypto: marvell - Update cache with input sg only when it is unmapped
crypto: marvell - Don't chain at DMA level when backlog is disabled
crypto: marvell - Fix memory leaks in TDMA chain for cipher requests

Linus Torvalds
2016-08-02 02:28:42 +0800

31 Jul, 2016

1 commit

d761f3ed6 Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull x86 microcode updates from Thomas Gleixner:

- more work to make the microcode loader robust

- a fix for the microcode load precedence

- fixes for initrd loading with randomized memory

- less printk noise on SMP machines

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm, x86/microcode: Add __PAGE_OFFSET_BASE define on 32-bit
x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
x86/microcode: Remove unused symbol exports
x86/microcode/intel: Do not issue microcode updates messages on each CPU
Documentation/microcode: Document some aspects for more clarity
x86/microcode/AMD: Make amd_ucode_patch[] static
x86/microcode/intel: Unexport save_mc_for_early()
x86/microcode/intel: Rename load_microcode_early() to find_microcode_patch()
x86/microcode: Propagate save_microcode_in_initrd() retval
x86/microcode: Get rid of find_cpio_data()'s dummy offset arg
lib/cpio: Make find_cpio_data()'s offset arg optional
x86/microcode: Fix suspend to RAM with builtin microcode
x86/microcode: Fix loading precedence

Linus Torvalds
2016-07-31 04:18:33 +0800

29 Jul, 2016

6 commits

4816c9406 lib/mpi: Fix SG miter leak ... Browse Code »

In mpi_read_raw_from_sgl we may leak the SG miter resources after
reading the leading zeroes. This patch fixes this by stopping the
iteration once the leading zeroes have been read.

Fixes: 127827b9c295 ("lib/mpi: Do not do sg_virt")
Reported-by: Nicolai Stange
Tested-by: Nicolai Stange
Signed-off-by: Herbert Xu

Herbert Xu
2016-07-29 18:30:16 +0800
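
The general pattern behind 4816c9406 above, releasing an iterator's held mapping when a scan exits early, can be sketched in userspace C. This is an illustration under stated assumptions, not the code from lib/mpi: `sketch_iter` and its helpers are hypothetical, and the `mapped` flag merely stands in for whatever resource a real scatterlist iterator holds.

```c
#include <stdbool.h>
#include <stddef.h>

/* One contiguous piece of a scattered buffer. */
struct sketch_chunk {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical iterator; "mapped" models a resource held per chunk. */
struct sketch_iter {
	const struct sketch_chunk *chunks;
	size_t nchunks;
	size_t next;
	bool mapped;
};

/* Release the previous chunk, then map the next one if there is any. */
static bool sketch_iter_next(struct sketch_iter *it)
{
	it->mapped = false;
	if (it->next >= it->nchunks)
		return false;
	it->next++;
	it->mapped = true;
	return true;
}

/* Explicitly drop whatever the iterator still holds. */
static void sketch_iter_stop(struct sketch_iter *it)
{
	it->mapped = false;
}

/* Count leading zero bytes, stopping the iterator even on early exit. */
static size_t sketch_count_leading_zeros(struct sketch_iter *it)
{
	size_t zeros = 0;
	bool done = false;

	while (!done && sketch_iter_next(it)) {
		const struct sketch_chunk *c = &it->chunks[it->next - 1];

		for (size_t i = 0; i < c->len; i++) {
			if (c->data[i] != 0) {
				done = true;
				break;
			}
			zeros++;
		}
	}
	sketch_iter_stop(it);
	return zeros;
}
```

Without the final stop call, leaving the loop at the first non-zero byte would keep the last chunk mapped, which is the kind of leak the commit message describes.
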
1c88e19b0 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge more updates from Andrew Morton:
"The rest of MM"

* emailed patches from Andrew Morton: (101 commits)
mm, compaction: simplify contended compaction handling
mm, compaction: introduce direct compaction priority
mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
mm, page_alloc: make THP-specific decisions more generic
mm, page_alloc: restructure direct compaction handling in slowpath
mm, page_alloc: don't retry initial attempt in slowpath
mm, page_alloc: set alloc_flags only once in slowpath
lib/stackdepot.c: use __GFP_NOWARN for stack allocations
mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
mm, kasan: account for object redzone in SLUB's nearest_obj()
mm: fix use-after-free if memory allocation failed in vma_adjust()
zsmalloc: Delete an unnecessary check before the function call "iput"
mm/memblock.c: fix index adjustment error in __next_mem_range_rev()
mem-hotplug: alloc new page from a nearest neighbor node when mem-offline
mm: optimize copy_page_to/from_iter_iovec
mm: add cond_resched() to generic_swapfile_activate()
Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"
mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
mm: hwpoison: remove incorrect comments
make __section_nr() more efficient
...

Linus Torvalds
2016-07-29 07:36:48 +0800
87cc271d5 lib/stackdepot.c: use __GFP_NOWARN for stack allocations ... Browse Code »

This (large, atomic) allocation attempt can fail. We expect and handle
that, so avoid the scary warning.

Link: http://lkml.kernel.org/r/20160720151905.GB19146@node.shutemov.name
Cc: Andrey Ryabinin
Cc: Alexander Potapenko
Cc: Michal Hocko
Cc: Rik van Riel
Cc: David Rientjes
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2016-07-29 07:07:41 +0800
80a9201a5 mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB ... Browse Code »

For KASAN builds:
- switch SLUB allocator to using stackdepot instead of storing the
allocation/deallocation stacks in the objects;
- change the freelist hook so that parts of the freelist can be put
into the quarantine.

[aryabinin@virtuozzo.com: fixes]
Link: http://lkml.kernel.org/r/1468601423-28676-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1468347165-41906-3-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko
Cc: Andrey Konovalov
Cc: Christoph Lameter
Cc: Dmitry Vyukov
Cc: Steven Rostedt (Red Hat)
Cc: Joonsoo Kim
Cc: Kostya Serebryany
Cc: Andrey Ryabinin
Cc: Kuthonuzo Luruo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Potapenko
2016-07-29 07:07:41 +0800
3fa6c5073 mm: optimize copy_page_to/from_iter_iovec ... Browse Code »

copy_page_to_iter_iovec() and copy_page_from_iter_iovec() copy some data
to userspace or from userspace. These functions have a fast path where
they map a page using kmap_atomic and a slow path where they use kmap.

kmap is slower than kmap_atomic, so the fast path is preferred.

However, on kernels without highmem support, kmap just calls
page_address, so there is no need to avoid kmap. On kernels without
highmem support, the fast path just increases code size (and cache
footprint) and it doesn't improve copy performance in any way.

This patch enables the fast path only if CONFIG_HIGHMEM is defined.

Code size reduced by this patch:
x86 (without highmem) 928
x86-64 960
sparc64 848
alpha 1136
pa-risc 1200

[akpm@linux-foundation.org: use IS_ENABLED(), per Andi]
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1607221711410.4818@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka
Cc: Hugh Dickins
Cc: Michal Hocko
Cc: Alexander Viro
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mikulas Patocka
2016-07-29 07:07:41 +0800
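
The compile-time gating described in 3fa6c5073 above can be sketched as follows. This is a simplified userspace illustration, not the kernel code: `SKETCH_HIGHMEM` is a hypothetical stand-in for `IS_ENABLED(CONFIG_HIGHMEM)`, and both branches use `memcpy` where the kernel would map the page with kmap_atomic or kmap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical switch standing in for IS_ENABLED(CONFIG_HIGHMEM). */
#ifndef SKETCH_HIGHMEM
#define SKETCH_HIGHMEM 0
#endif

enum sketch_path { SKETCH_PATH_ATOMIC, SKETCH_PATH_PLAIN };

/* Copy len bytes and report which mapping path was taken. */
static enum sketch_path sketch_copy_page(void *dst, const void *src, size_t len)
{
	if (SKETCH_HIGHMEM) {
		/* Fast path: the kernel would use an atomic mapping here. */
		memcpy(dst, src, len);
		return SKETCH_PATH_ATOMIC;
	}
	/* Without highmem the page is already addressable; one path suffices. */
	memcpy(dst, src, len);
	return SKETCH_PATH_PLAIN;
}
```

Because the condition is a compile-time constant, a compiler can drop the unused branch entirely, which is where a code size reduction of the kind listed in the commit would come from.
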
554828ee0 Merge branch 'salted-string-hash' ... Browse Code »

This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
fs/dcache.c: Save one 32-bit multiply in dcache lookup
vfs: make the string hashes salt the hash

Linus Torvalds
2016-07-29 03:26:31 +0800
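
The idea in 554828ee0 above, seeding the string hash with a base pointer instead of mixing it in at the end, can be sketched in userspace C. The `sketch_` functions are hypothetical and the byte-mixing step is only a simple multiplicative mix chosen for illustration; the kernel's actual hash functions and final folding differ by architecture and configuration.

```c
#include <stddef.h>
#include <stdint.h>

/* Start the running hash from the salt instead of from zero. */
static uintptr_t sketch_init_hash(const void *salt)
{
	return (uintptr_t)salt;
}

/* Fold one byte into the running hash with a simple multiplicative mix. */
static uintptr_t sketch_partial_hash(unsigned char c, uintptr_t prev)
{
	return (prev + ((uintptr_t)c << 4) + (c >> 4)) * 11;
}

/* Hash a string of the given length; pass a NULL salt for a pure string hash. */
static uintptr_t sketch_full_hash(const void *salt, const char *name, size_t len)
{
	uintptr_t hash = sketch_init_hash(salt);

	for (size_t i = 0; i < len; i++)
		hash = sketch_partial_hash((unsigned char)name[i], hash);
	return hash;
}
```

With this shape, the same name under two different parent objects hashes differently, while callers that only care about the string pass NULL and get a value that depends on the string alone.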

28 Jul, 2016

2 commits

818e607b5 Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random ... Browse Code »

Pull random driver updates from Ted Ts'o:
"A number of improvements for the /dev/random driver; the most
important is the use of a ChaCha20-based CRNG for /dev/urandom, which
is faster, more efficient, and easier to make scalable for
silly/abusive userspace programs that want to read from /dev/urandom
in a tight loop on NUMA systems.

This set of patches also improves entropy gathering on VMs running on
Microsoft Azure, and will take advantage of a hw random number
generator (if present) to initialize the /dev/urandom pool"

(It turns out that the random tree hadn't been in linux-next this time
around, because it had been dropped earlier as being too quiet. Oh
well).

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: strengthen input validation for RNDADDTOENTCNT
random: add backtracking protection to the CRNG
random: make /dev/urandom scalable for silly userspace programs
random: replace non-blocking pool with a Chacha20-based CRNG
random: properly align get_random_int_hash
random: add interrupt callback to VMBus IRQ handler
random: print a warning for the first ten uninitialized random users
random: initialize the non-blocking pool via add_hwgenerator_randomness()

Linus Torvalds
2016-07-28 06:11:55 +0800
468fc7ed5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:

1) Unified UDP encapsulation offload methods for drivers, from
Alexander Duyck.

2) Make DSA binding more sane, from Andrew Lunn.

3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.

4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.

5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
packets as soon as the device sees them, with the option to mirror
the packet on TX via the same interface. From Brenden Blanco and
others.

6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.

7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.

8) Simplify netlink conntrack entry layout, from Florian Westphal.

9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
Schimmel, Yotam Gigi, and Jiri Pirko.

10) Add SKB array infrastructure and convert tun and macvtap over to it.
From Michael S Tsirkin and Jason Wang.

11) Support qdisc packet injection in pktgen, from John Fastabend.

12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.

13) Add NV congestion control support to TCP, from Lawrence Brakmo.

14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.

15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.

16) Support MPLS over IPV4, from Simon Horman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
xgene: Fix build warning with ACPI disabled.
be2net: perform temperature query in adapter regardless of its interface state
l2tp: Correctly return -EBADF from pppol2tp_getname.
net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
net: ipmr/ip6mr: update lastuse on entry change
macsec: ensure rx_sa is set when validation is disabled
tipc: dump monitor attributes
tipc: add a function to get the bearer name
tipc: get monitor threshold for the cluster
tipc: make cluster size threshold for monitoring configurable
tipc: introduce constants for tipc address validation
net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
MAINTAINERS: xgene: Add driver and documentation path
Documentation: dtb: xgene: Add MDIO node
dtb: xgene: Add MDIO node
drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
drivers: net: xgene: Use exported functions
drivers: net: xgene: Enable MDIO driver
drivers: net: xgene: Add backward compatibility
drivers: net: phy: xgene: Add MDIO driver
...

Linus Torvalds
2016-07-28 03:03:20 +0800

27 Jul, 2016

1 commit

df15929f8 Merge branch 'linus' into x86/microcode, to pick up merge window changes ... Browse Code »

Signed-off-by: Ingo Molnar

Ingo Molnar
2016-07-27 18:35:35 +0800