Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

07 Mar, 2014

7 commits

f7324acd9 tcp: Use NET_ADD_STATS instead of NET_ADD_STATS_BH in tcp_event_new_data_sent()

Can be invoked from non-BH context.

Based upon a patch by Eric Dumazet.

Fixes: f19c29e3e391 ("tcp: snmp stats for Fast Open, SYN rtx, and data pkts")
Reported-by: Sergey Senozhatsky
Signed-off-by: David S. Miller

David S. Miller
2014-03-07 04:19:43 +0800
072256d1f bonding: make slave status notifications GFP_ATOMIC

Currently we're using GFP_KERNEL, however there are some path(s) where we
can hold some spinlocks, specifically bond->curr_slave_lock:

[ 4.722916] BUG: sleeping function called from invalid context at mm/slub.c:965
[ 4.724438] in_atomic(): 1, irqs_disabled(): 0, pid: 940, name: ifup-eth
[ 4.726034] 5 locks held by ifup-eth/940:
...snip...
[ 4.734646] #4: (&bond->curr_slave_lock){+...+.}, at: [] bond_enslave+0xda6/0xdd0 [bonding]
...snip...
[ 4.759081] [] bond_change_active_slave+0x191/0x3b0 [bonding]
[ 4.760917] [] bond_select_active_slave+0xf7/0x1d0 [bonding]
[ 4.762751] [] bond_enslave+0xdae/0xdd0 [bonding]
...snip...

As it's out of hot path and is a really rare event - change the gfp_t flags
to GFP_ATOMIC to avoid sleeping under spinlock.

v2: convert new notify calls to GFP_ATOMIC.

CC: Thomas Glanzmann
CC: Ding Tianhong
CC: Jay Vosburgh
CC: Andy Gospodarek
Signed-off-by: Veaceslav Falico
Signed-off-by: David S. Miller

Veaceslav Falico
2014-03-07 04:19:43 +0800
e90c14835 inet: remove now unused flag DST_NOPEER

Commit e688a604807647 ("net: introduce DST_NOPEER dst flag") introduced
DST_NOPEER because of crashes in ipv6_select_ident called from
udp6_ufo_fragment.

Since commit 916e4cf46d0204 ("ipv6: reuse ip6_frag_id from
ip6_ufo_append_data") we don't call ipv6_select_ident any more from
ip6_ufo_append_data, thus this flag lost its purpose and can be removed.

Cc: Eric Dumazet
Signed-off-by: Hannes Frederic Sowa
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2014-03-07 02:15:52 +0800
e08f53f56 Merge branch 'r8152'

Hayes Wang says:

====================
r8152: cleanups

Deal with some empty lines and spaces, replace some tp->netdev with netdev,
and remove the unnecessary function.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-03-07 02:15:17 +0800
05e0f1aad r8152: remove rtl8152_get_stats

rtl8152_get_stats() returns a pointer to the struct net_device_stats.
This pointer can be obtained from struct net_device directly.

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

hayeswang
2014-03-07 02:15:12 +0800
d104eafa6 r8152: replace tp->netdev with netdev

Replace some tp->netdev with netdev.

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

hayeswang
2014-03-07 02:15:12 +0800
db8515eff r8152: deal with the empty line and space

Add or remove some empty lines. Replace the spaces with the tabs.

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

hayeswang
2014-03-07 02:15:12 +0800

06 Mar, 2014

1 commit

67ddc87f1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/wireless/ath/ath9k/recv.c
drivers/net/wireless/mwifiex/pcie.c
net/ipv6/sit.c

The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.

The two wireless conflicts were overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2014-03-06 09:32:02 +0800

05 Mar, 2014

10 commits

6092c79fd ieee802154: fix whitespace issues in Kconfig

This patch fixes some whitespace issues in the Kconfig files of the IEEE
802.15.4 subsystem.

Signed-off-by: Alexander Aring
Signed-off-by: David S. Miller

Alexander Aring
2014-03-05 09:12:44 +0800
9aa69bc3c at86rf230: add help and 212 to Kconfig menu entry

Since commit 8fad346f366a72978ea942abd06bd501ebd39c22
(ieee802154: add basic support for RF212 to at86rf230 driver)
we support at86rf212 as well.

Signed-off-by: Alexander Aring
Signed-off-by: David S. Miller

Alexander Aring
2014-03-05 09:12:43 +0800
e50287be7 be2net: dma_sync each RX frag before passing it to the stack

The driver currently maps a page for DMA, divides the page into multiple
frags and posts them to the HW. It un-maps the page after data is received
on all the frags of the page. This scheme doesn't work when bounce buffers
are used for DMA (swiotlb=force kernel param).

This patch fixes this problem by calling dma_sync_single_for_cpu() for each
frag (excepting the last one) so that the data is copied from the bounce
buffers. The page is un-mapped only when DMA finishes on the last frag of
the page.
(Thanks Ben H. for suggesting the dma_sync API!)
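
The per-frag decision described above can be sketched as follows. This is a minimal user-space illustration in Python, not driver code: the function name and the action strings are hypothetical, and it assumes the frags of one page complete in posting order.

```python
def rx_frag_actions(frags_per_page):
    """Return the DMA action taken as each frag of one mapped page completes.

    Every frag except the last is synced for the CPU so that data held in
    a bounce buffer is copied back; the page is unmapped only when the
    last frag of the page completes.
    """
    actions = []
    for index in range(frags_per_page):
        last_frag = index == frags_per_page - 1
        actions.append("unmap_page" if last_frag else "sync_frag_for_cpu")
    return actions
```

With four frags per page, the first three completions each trigger a sync and only the fourth triggers the unmap.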

This patch also renames the "last_page_user" field of be_rx_page_info{}
struct to "last_frag" to improve readability of the fixed code.

Reported-by: Li Fengmao
Signed-off-by: Sathya Perla
Signed-off-by: David S. Miller

Sathya Perla
2014-03-05 05:17:53 +0800
9e82e7f4a Merge branch 'mpls_tc'

Simon Wunderlich says:

====================
This series contains a header file proposal for MPLS labels. These
labels do not seem to be properly defined in the kernel so far. We are
developing a wired/wireless 802.21/MPLS switch and need to check the
MPLS labels to use the traffic control info for transmissions over
802.11 networks.

Changes to third version:

* rename mpls_label_stack to mpls_label (thanks Neil)
* fix over-indented closing brace (thanks Sergei)
* add Johannes' Ack
====================

Signed-off-by: David S. Miller

David S. Miller
2014-03-05 02:51:13 +0800
960d97f95 cfg80211: add MPLS and 802.21 classification

MPLS labels may contain traffic control information, which should be
evaluated and used by the wireless subsystem if present.

Also check for IEEE 802.21 which is always network control traffic.

Signed-off-by: Simon Wunderlich
Signed-off-by: Mathias Kretschmer
Acked-by: Johannes Berg
Signed-off-by: David S. Miller

Simon Wunderlich
2014-03-05 02:51:06 +0800
f3baa393f UAPI: add MPLS label stack definition

Labels for the Multiprotocol Label Switching are defined in RFC 3032
which was superseded by RFC 5462. Add the definition to UAPI and a stub
header for include/linux.
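
As an illustration of the layout those RFCs describe, the sketch below packs and unpacks one 32-bit label stack entry: a 20-bit label, a 3-bit traffic class, a bottom-of-stack bit and an 8-bit TTL. It is written in Python for brevity; the function and constant names are hypothetical and are not the identifiers defined by the new UAPI header.

```python
# Bit positions of one MPLS label stack entry (RFC 3032; the 3-bit field
# was renamed from "Exp" to "Traffic Class" by RFC 5462).
LABEL_SHIFT, LABEL_MASK = 12, 0xFFFFF
TC_SHIFT, TC_MASK = 9, 0x7
S_SHIFT, S_MASK = 8, 0x1
TTL_MASK = 0xFF


def encode_mpls_entry(label, tc, bottom, ttl):
    """Pack the four fields into one 32-bit label stack entry."""
    return (
        ((label & LABEL_MASK) << LABEL_SHIFT)
        | ((tc & TC_MASK) << TC_SHIFT)
        | ((bottom & S_MASK) << S_SHIFT)
        | (ttl & TTL_MASK)
    )


def decode_mpls_entry(entry):
    """Split a 32-bit label stack entry (host byte order) into its fields."""
    return {
        "label": (entry >> LABEL_SHIFT) & LABEL_MASK,
        "tc": (entry >> TC_SHIFT) & TC_MASK,
        "bottom": (entry >> S_SHIFT) & S_MASK,
        "ttl": entry & TTL_MASK,
    }
```

On the wire the entry is big-endian, so a real parser would convert from network byte order before applying the shifts.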

Signed-off-by: Simon Wunderlich
Signed-off-by: Mathias Kretschmer
Signed-off-by: David S. Miller

Simon Wunderlich
2014-03-05 02:51:06 +0800
b62faf3cd if_ether.h: add IEEE 802.21 Ethertype

Add the Ethertype for IEEE Std 802.21 - Media Independent Handover
Protocol. This Ethertype is used for network control messages.

Signed-off-by: Simon Wunderlich
Signed-off-by: Mathias Kretschmer
Signed-off-by: David S. Miller

Simon Wunderlich
2014-03-05 02:51:06 +0800
c3bebc71c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on
error. From Eytan Lifshitz.

2) Unintentional switch case fallthrough in nft_reject_inet_eval(),
from Patrick McHardy.

3) Must check if payload length is a power of 2 in
nft_payload_select_ops(), from Nikolay Aleksandrov.

4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the
correct place when we invoke skb_checksum_setup(). From Wei Liu.

5) TUN driver should not advertise HW vlan offload features in
vlan_features. Fix from Fernando Luis Vazquez Cao.

6) IPV6_VTI needs to select NET_IP_TUNNEL to avoid build errors, fix
from Steffen Klassert.

7) Add missing locking in xfrm_migrate_state_find(), we must hold the
per-namespace xfrm_state_lock while traversing the lists. Fix from
Steffen Klassert.

8) Missing locking in ath9k driver, access to tid->sched must be done
under ath_txq_lock(). Fix from Stanislaw Gruszka.

9) Fix two bugs in TCP fastopen. First respect the size argument given
to tcp_sendmsg() in the fastopen path, and secondly prevent
tcp_send_syn_data() from potentially using order-5 allocations.
From Eric Dumazet.

10) Fix handling of default neigh garbage collection params, from Jiri
Pirko.

11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation
is in use. From Eric Dumazet.

12) Missing initialization of Realtek r8169 driver's statistics
seqlocks. Fix from Kyle McMartin.

13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding
driver, from Ding Tianhong.

14) Bonding slave release race can cause divide by zero, fix from
Nikolay Aleksandrov.

15) Overzealous return from neigh_periodic_work() causes reachability
time to not be computed. Fix from Duan Jiong.

16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when
a specific target is specified and found. From Hans Schillstrom.

17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera.

18) Tail loss probe can calculate bogus RTTs due to missing packet
marking on retransmit. Fix from Yuchung Cheng.

19) We cannot do skb_dst_drop() in iptunnel_pull_header() because
multicast loopback detection in later code paths needs access to
skb_rtable(). Fix from Xin Long.

20) The macvlan driver regresses in that it propagates lower device
offload support disables into itself, causing severe slowdowns when
running over a bridge. Provide the software offloads always on
macvlan devices to deal with this and the regression is gone. From
Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
macvlan: Add support for 'always_on' offload features
net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
net: cpsw: fix cpdma rx descriptor leak on down interface
be2net: isolate TX workarounds not applicable to Skyhawk-R
be2net: Fix skb double free in be_xmit_wrokarounds() failure path
be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode
be2net: Fix to reset transparent vlan tagging
qlcnic: dcb: a couple off by one bugs
tcp: fix bogus RTT on special retransmission
hsr: off by one sanity check in hsr_register_frame_in()
can: remove CAN FD compatibility for CAN 2.0 sockets
can: flexcan: factor out soft reset into seperate funtion
can: flexcan: flexcan_remove(): add missing netif_napi_del()
can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze
can: flexcan: factor out transceiver {en,dis}able into seperate functions
can: flexcan: fix transition from and to low power mode in chip_{en,dis}able
can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
can: flexcan: fix shutdown: first disable chip, then all interrupts
USB AX88179/178A: Support D-Link DUB-1312
...

Linus Torvalds
2014-03-05 00:44:32 +0800
16e3f5391 Merge tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of fixes here which ensure that regulators using the core
support for GPIO enables work in all cases by ensuring that helpers
are used consistently rather than open coding in places and hence not
having GPIO support in some of them"

* tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: core: Replace direct ops->disable usage
regulator: core: Replace direct ops->enable usage

Linus Torvalds
2014-03-05 00:41:42 +0800
3f803abf2 Merge branch 'akpm' (patches from Andrew Morton)

Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness
mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS
MAINTAINERS: add and correct types of some "T:" entries
MAINTAINERS: use tab for separator
rapidio/tsi721: fix tasklet termination in dma channel release
hfsplus: fix remount issue
zram: avoid null access when fail to alloc meta
sh: prefix sh-specific "CCR" and "CCR2" by "SH_"
ocfs2: fix quota file corruption
drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX
kallsyms: fix absolute addresses for kASLR
scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression
mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking
memcg: reparent charges of children before processing parent
memcg: fix endless loop in __mem_cgroup_iter_next()
lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock
dma debug: account for cachelines and read-only mappings in overlap tracking
mm: close PageTail race
MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors

Linus Torvalds
2014-03-05 00:29:39 +0800

04 Mar, 2014

22 commits

27329369c mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness

Jan Stancek reports manual page migration encountering allocation
failures after some pages when there is still plenty of memory free, and
bisected the problem down to commit 81c0a2bb515f ("mm: page_alloc: fair
zone allocator policy").

The problem is that GFP_THISNODE obeys the zone fairness allocation
batches on one hand, but doesn't reset them and wake kswapd on the other
hand. After a few of those allocations, the batches are exhausted and
the allocations fail.

Fixing this means either having GFP_THISNODE wake up kswapd, or
GFP_THISNODE not participating in zone fairness at all. The latter
seems safer as an acute bugfix, we can clean up later.

Reported-by: Jan Stancek
Signed-off-by: Johannes Weiner
Acked-by: Rik van Riel
Acked-by: Mel Gorman
Cc: [3.12+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-03-04 23:55:50 +0800
1ae71d031 mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS

When doing some NUMA tests on powerpc, I triggered an oops. I found
it is caused by the use of page->_last_cpupid. It should be initialized as
"-1 & LAST_CPUPID_MASK", not "-1". Otherwise, in task_numa_fault(),
the check (last_cpupid == (-1 & LAST_CPUPID_MASK)) is missed, which
finally causes an oops in task_numa_group(), since the number of online
cpus is less than the number of possible cpus. This happens with
CONFIG_SPARSE_VMEMMAP disabled.
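
The comparison at the heart of this bug can be shown with plain integer arithmetic. The sketch below is in Python; the field width is an assumed value chosen for illustration (the real width depends on the kernel configuration), and the helper name is hypothetical.

```python
# Assumed field width, for illustration only.
LAST_CPUPID_WIDTH = 14
LAST_CPUPID_MASK = (1 << LAST_CPUPID_WIDTH) - 1

# The "no previous cpupid" sentinel that task_numa_fault() compares against.
CPUPID_UNSET = -1 & LAST_CPUPID_MASK


def is_unset(last_cpupid):
    """Mirror of the check (last_cpupid == (-1 & LAST_CPUPID_MASK))."""
    return last_cpupid == CPUPID_UNSET


buggy_init = -1                      # plain -1 never equals the masked sentinel
fixed_init = -1 & LAST_CPUPID_MASK   # masked value matches the check
```

With the buggy initial value the check is false, so the code goes on to treat -1 as if it encoded a real cpu and pid.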

Call trace:

SMP NR_CPUS=64 NUMA PowerNV
Modules linked in:
CPU: 24 PID: 804 Comm: systemd-udevd Not tainted 3.13.0-rc1+ #32
task: c000001e2746aa80 ti: c000001e32c50000 task.ti: c000001e32c50000
REGS: c000001e32c53510 TRAP: 0300 Not tainted (3.13.0-rc1+)
MSR: 9000000000009032 CR: 28024424 XER: 20000000
CFAR: c000000000009324 DAR: 7265717569726857 DSISR: 40000000 SOFTE: 1
NIP .task_numa_fault+0x1470/0x2370
LR .task_numa_fault+0x1468/0x2370
Call Trace:
.task_numa_fault+0x1468/0x2370 (unreliable)
.do_numa_page+0x480/0x4a0
.handle_mm_fault+0x4ec/0xc90
.do_page_fault+0x3a8/0x890
handle_page_fault+0x10/0x30
Instruction dump:
3c82fefb 3884b138 48d9cff1 60000000 48000574 3c62fefb 3863af78 3c82fefb
3884b138 48d9cfd5 60000000 e93f0100 7d2907b4 5529063e 7d2a07b4
---[ end trace 15f2510da5ae07cf ]---

Signed-off-by: Liu Ping Fan
Signed-off-by: Aneesh Kumar K.V
Acked-by: Peter Zijlstra
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Liu Ping Fan
2014-03-04 23:55:50 +0800
cea8321cd MAINTAINERS: add and correct types of some "T:" entries

Tree location entries should start with the appropriate type.

Add git to some, hg to another.

Neaten tree type description.

Signed-off-by: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-03-04 23:55:49 +0800
b75f00507 MAINTAINERS: use tab for separator

Convert whitespace to single tab for separators.

Signed-off-by: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-03-04 23:55:49 +0800
04379dffd rapidio/tsi721: fix tasklet termination in dma channel release

This patch is a modification of the patch originally proposed by
Xiaotian Feng : https://lkml.org/lkml/2012/11/5/413
This new version disables DMA channel interrupts and ensures that the
tasklet will not be scheduled again before calling tasklet_kill().

Unfortunately the updated patch was not released at that time due to
planned rework of Tsi721 mport driver to use threaded interrupts (which
has yet to happen). Recently the issue was reported again:
https://lkml.org/lkml/2014/2/19/762.

Description from the original Xiaotian's patch:

"Some drivers use tasklet_disable in device remove/release process,
tasklet_disable will inc tasklet->count and return. If the tasklet is
not handled yet under some softirq pressure, the tasklet will be
placed on the tasklet_vec, never have a chance to be executed. This
might lead to a heavy loaded ksoftirqd, wakeup with pending_softirq,
but tasklet is disabled. tasklet_kill should be used in this case."

This patch is applicable to kernel versions starting from v3.5.

Signed-off-by: Alexandre Bounine
Cc: Matt Porter
Cc: Xiaotian Feng
Reviewed-by: Thomas Gleixner
Cc: Mike Galbraith
Cc: [3.5+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexandre Bounine
2014-03-04 23:55:49 +0800
bd2c00353 hfsplus: fix remount issue

The current implementation of the HFS+ driver has a small issue with the
remount option. For example, you are unable to remount from RO mode
into RW mode with the command "mount -o remount,rw /dev/loop0
/mnt/hfsplus". Executing the following sequence of commands results in
error messages:

mount /dev/loop0 /mnt/hfsplus
mount -o remount,ro /dev/loop0 /mnt/hfsplus
mount -o remount,rw /dev/loop0 /mnt/hfsplus

mount: you must specify the filesystem type

mount -t hfsplus -o remount,rw /dev/loop0 /mnt/hfsplus

mount: /mnt/hfsplus not mounted or bad option

The cause is a failure of the mount syscall:

mount("/dev/loop0", "/mnt/hfsplus", 0x2282a60, MS_MGC_VAL|MS_REMOUNT, NULL) = -1 EINVAL (Invalid argument)

Namely, the hfsplus_parse_options_remount() method receives an empty
"input" argument and returns false in that case. As a result,
hfsplus_remount() returns the -EINVAL error code.
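
The failure path, and the effect of treating an empty option string as valid, can be sketched in a few lines. This is a simplified Python model, not the driver's code: the function names and the `fixed` switch are hypothetical, and real option parsing is omitted.

```python
import errno


def parse_options_remount(options, fixed=True):
    """Return True when the remount options are acceptable.

    Before the fix an empty option string was reported as a parse
    failure; with the fix it counts as "nothing to parse" and succeeds.
    """
    if not options:
        return fixed
    return True  # parsing of non-empty option strings is omitted here


def remount(options, fixed=True):
    """Return 0 on success or -EINVAL when option parsing fails."""
    return 0 if parse_options_remount(options, fixed) else -errno.EINVAL
```

Calling `remount(None, fixed=False)` models the failing syscall shown in the strace output, while `remount(None)` models the behaviour after the change.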

This patch fixes the issue by returning true for an empty "input"
argument in the hfsplus_parse_options_remount() method.

Signed-off-by: Vyacheslav Dubeyko
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vyacheslav Dubeyko
2014-03-04 23:55:49 +0800
db5d711e2 zram: avoid null access when fail to alloc meta

zram_meta_alloc could fail so caller should check it. Otherwise, your
system will hang.

Signed-off-by: Minchan Kim
Acked-by: Jerome Marchand
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minchan Kim
2014-03-04 23:55:49 +0800
a5f6ea29f sh: prefix sh-specific "CCR" and "CCR2" by "SH_"

Commit bcf24e1daa94 ("mmc: omap_hsmmc: use the generic config for
omap2plus devices"), enabled the build for other platforms for compile
testing.

sh-allmodconfig now fails with:

include/linux/omap-dma.h:171:8: error: expected identifier before numeric constant
make[4]: *** [drivers/mmc/host/omap_hsmmc.o] Error 1

This happens because SuperH #defines "CCR", which is one of the enum
values in include/linux/omap-dma.h. There's a similar issue with "CCR2"
on sh2a.

As "CCR" and "CCR2" are too generic names for global #defines, prefix
them with "SH_" to fix this.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2014-03-04 23:55:49 +0800
15c34a760 ocfs2: fix quota file corruption

Global quota files are accessed from different nodes. Thus we cannot
cache offset of quota structure in the quota file after we drop our node
reference count to it because after that moment quota structure may be
freed and reallocated elsewhere by a different node resulting in
corruption of quota file.

Fix the problem by clearing dq_off when we are releasing dquot structure.
We also remove the DQ_READ_B handling because it is useless -
DQ_ACTIVE_B is set iff DQ_READ_B is set.

Signed-off-by: Jan Kara
Cc: Goldwyn Rodrigues
Cc: Joel Becker
Reviewed-by: Mark Fasheh
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-03-04 23:55:48 +0800
40d2d968e drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX

On exynos5250, exynos5420 and exynos5260 it was observed that, after one
cycle of S2R, the rtc-tick occurs at a much faster rate than the
rtc-tick occurring before S2R.

This patch fixes the above issue by correcting the wrong way of
save/restore of S3C2410_TICNT for TYPE_S3C64XX.

Signed-off-by: Vikas Sajjan
Cc: Grant Likely
Cc: Rob Herring
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vikas Sajjan
2014-03-04 23:55:48 +0800
0f55159d0 kallsyms: fix absolute addresses for kASLR

Currently symbols that are absolute addresses are incorrectly displayed
in /proc/kallsyms if the kernel is loaded with kASLR.

The problem was that the scripts/kallsyms.c file which generates the
array of symbol names and addresses uses a relocatable value for all
symbols, even absolute symbols. This patch fixes that.

Several kallsyms output in different boot states for comparison:

$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.nokaslr
0000000000000000 D __per_cpu_start
0000000000014280 D __per_cpu_end
ffffffff810001c8 T _stext
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr1
000000001f200000 D __per_cpu_start
000000001f214280 D __per_cpu_end
ffffffffa02001c8 T _stext
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr2
000000000d400000 D __per_cpu_start
000000000d414280 D __per_cpu_end
ffffffff8e4001c8 T _stext
$ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr-fixed
0000000000000000 D __per_cpu_start
0000000000014280 D __per_cpu_end
ffffffffadc001c8 T _stext
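
The listings above are consistent with a simple model: before the fix the randomization offset is added to every symbol, afterwards only to symbols that are not absolute. The Python sketch below reproduces the kaslr1 and kaslr-fixed values under that model. Which symbols count as absolute is supplied by hand here as an assumption; the function name and table layout are hypothetical and do not reflect how scripts/kallsyms.c stores its data.

```python
MASK64 = (1 << 64) - 1

# name -> (link-time address, is_absolute); values from the nokaslr listing.
SYMBOLS = {
    "__per_cpu_start": (0x0000000000000000, True),
    "__per_cpu_end": (0x0000000000014280, True),
    "_stext": (0xFFFFFFFF810001C8, False),
}


def displayed(symbols, slide, fixed):
    """Return the address shown for each symbol for a given KASLR slide."""
    shown = {}
    for name, (addr, absolute) in symbols.items():
        if fixed and absolute:
            shown[name] = addr                     # absolute: left untouched
        else:
            shown[name] = (addr + slide) & MASK64  # relocated with the kernel
    return shown
```

A slide of 0x1f200000 with `fixed=False` yields the kaslr1 listing, and a slide of 0x2cc00000 with `fixed=True` yields the kaslr-fixed listing.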

Signed-off-by: Andy Honig
Signed-off-by: Kees Cook
Cc: Michal Marek
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andy Honig
2014-03-04 23:55:48 +0800
5ec384d45 scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression

LZ4 as implemented in the kernel differs from the default method now
used by the reference implementation of LZ4. Until the in-kernel method
is updated to support the new default, passing the legacy flag (-l) to
the compressor is necessary. Without this flag the kernel-generated,
LZ4-compressed initramfs is junk.

Kyungsik said:

: It seems that lz4 supports legacy format with the same option as lz4c
: does. Just looking at the first few bytes of lz4 compressed image, we can
: see whether it is new format or not.
:
: It shows new format magic number without this patch. New format magic
: number is 0x184d2204.
:
: $ hexdump -C ./initramfs_data.cpio.lz4 |more
: 00000000 04 22 4d 18 64 70 b9 69 (Little Endian)
: ...
:
: Currently kernel supports legacy format only.
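
The check Kyungsik describes, looking at the first few bytes of the image, can be sketched as below. The new-format magic number is the one quoted above; the legacy magic number 0x184C2102 comes from the LZ4 frame format description and is not stated in this commit message. The function name is hypothetical.

```python
LZ4_NEW_MAGIC = 0x184D2204     # new frame format, as quoted above
LZ4_LEGACY_MAGIC = 0x184C2102  # legacy format (from the LZ4 format description)


def lz4_format(header):
    """Classify an LZ4 image by its first four bytes (little endian)."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes to read the magic number")
    magic = int.from_bytes(header[:4], "little")
    if magic == LZ4_NEW_MAGIC:
        return "new"
    if magic == LZ4_LEGACY_MAGIC:
        return "legacy"
    return "unknown"
```

Feeding it the eight bytes from the hexdump above (`04 22 4d 18 64 70 b9 69`) classifies the image as the new format, which is the case the kernel could not decompress at the time.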

Signed-off-by: Daniel M. Weeks
Cc: Michal Marek
Acked-by: Kyungsik Lee
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel M. Weeks
2014-03-04 23:55:48 +0800
9050d7eba mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking

Daniel Borkmann reported a VM_BUG_ON assertion failing:

------------[ cut here ]------------
kernel BUG at mm/mlock.c:528!
invalid opcode: 0000 [#1] SMP
Modules linked in: ccm arc4 iwldvm [...]
video
CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8
Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012
task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000
RIP: 0010:[] [] munlock_vma_pages_range+0x2e0/0x2f0
Call Trace:
do_munmap+0x18f/0x3b0
vm_munmap+0x41/0x60
SyS_munmap+0x22/0x30
system_call_fastpath+0x1a/0x1f
RIP munlock_vma_pages_range+0x2e0/0x2f0
---[ end trace a0088dcf07ae10f2 ]---

because munlock_vma_pages_range() thinks it's unexpectedly in the middle
of a THP page. This can be reproduced with default config since 3.11
kernels. A reproducer can be found in the kernel's selftest directory
for networking by running ./psock_tpacket [1].

The problem is that an order=2 compound page (allocated by
alloc_one_pg_vec_page()) is part of the munlocked VM_MIXEDMAP vma (mapped
by packet_mmap()) and mistaken for a THP page and assumed to be order=9.

The checks for THP in munlock came with commit ff6a6da60b89 ("mm:
accelerate munlock() treatment of THP pages"), i.e. since 3.9, but did
not trigger a bug. It just makes munlock_vma_pages_range() skip such
compound pages until the next 512-pages-aligned page, when it encounters
a head page. This is however not a problem for vma's where mlocking has
no effect anyway, but it can distort the accounting.

Since commit 7225522bb429 ("mm: munlock: batch non-THP page isolation
and munlock+putback using pagevec") this can trigger a VM_BUG_ON in
PageTransHuge() check.

This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a
list of flags that make vma's non-mlockable and non-mergeable. The
reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is
already on the VM_SPECIAL list, and both are intended for non-LRU pages
where mlocking makes no sense anyway. Related LKML discussion can be
found in [2].

[1] tools/testing/selftests/net/psock_tpacket
[2] https://lkml.org/lkml/2014/1/10/427

Signed-off-by: Vlastimil Babka
Signed-off-by: Daniel Borkmann
Reported-by: Daniel Borkmann
Tested-by: Daniel Borkmann
Cc: Thomas Hellstrom
Cc: John David Anglin
Cc: HATAYAMA Daisuke
Cc: Konstantin Khlebnikov
Cc: Carsten Otte
Cc: Jared Hulbert
Tested-by: Hannes Frederic Sowa
Cc: Kirill A. Shutemov
Acked-by: Rik van Riel
Cc: Andrea Arcangeli
Cc: [3.11.x+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vlastimil Babka
2014-03-04 23:55:48 +0800
4fb1a86fb memcg: reparent charges of children before processing parent

Sometimes the cleanup after memcg hierarchy testing gets stuck in
mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.

There may turn out to be several causes, but a major cause is this: the
workitem to offline parent can get run before workitem to offline child;
parent's mem_cgroup_reparent_charges() circles around waiting for the
child's pages to be reparented to its lrus, but it's holding
cgroup_mutex which prevents the child from reaching its
mem_cgroup_reparent_charges().

Further testing showed that an ordered workqueue for cgroup_destroy_wq
is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched
stage on the way can mess up the order before reaching the workqueue.

Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
all its children (and grandchildren, in the correct order) to have their
charges reparented first.
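
One way to read "children and grandchildren, in the correct order" is a post-order walk, where every descendant is handled before its ancestor. The Python sketch below shows that ordering on a small hierarchy; it is only an illustration of the idea, the function name is hypothetical, and the tree is modelled as a plain dict rather than kernel cgroup structures.

```python
def reparent_order(children, root):
    """Return the groups in post-order: descendants first, root last.

    `children` maps a group name to the list of its direct children.
    """
    order = []

    def visit(group):
        for child in children.get(group, []):
            visit(child)
        order.append(group)  # a group is handled after all its descendants

    visit(root)
    return order
```

For a parent with children `a` and `b`, where `a` has a child `a1`, the resulting order is `a1`, `a`, `b`, `parent`, so the parent never waits on charges still held below it.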

Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
Signed-off-by: Filipe Brandenburger
Signed-off-by: Hugh Dickins
Reviewed-by: Tejun Heo
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: [v3.10+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Filipe Brandenburger
2014-03-04 23:55:48 +0800
ce48225fe memcg: fix endless loop in __mem_cgroup_iter_next()

Commit 0eef615665ed ("memcg: fix css reference leak and endless loop in
mem_cgroup_iter") got the interaction with the commit a few before it
d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully
initialized") slightly wrong, and we didn't notice at the time.

It's elusive, and harder to get than the original, but for a couple of
days before rc1, I several times saw an endless loop similar to that
supposedly being fixed.

This time it was a tighter loop in __mem_cgroup_iter_next(): because we
can get here when our root has already been offlined, and the ordering
of conditions was such that we then just cycled around forever.

Fixes: 0eef615665ed ("memcg: fix css reference leak and endless loop in mem_cgroup_iter").
Signed-off-by: Hugh Dickins
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: Greg Thelen
Cc: [3.12+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2014-03-04 23:55:47 +0800
5f30fc94c lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock

Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of

BUG: sleeping function called from invalid context at kernel/fork.c:606
in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff
1 lock held by swapoff/1394:
#0: (rcu_read_lock){.+.+.+}, at: [] radix_tree_locate_item+0x1f/0x2b6

followed by

================================================
[ BUG: lock held when returning to user space! ]
3.14.0-rc1 #3 Not tainted
------------------------------------------------
swapoff/1394 is leaving the kernel with locks still held!
1 lock held by swapoff/1394:
#0: (rcu_read_lock){.+.+.+}, at: [] radix_tree_locate_item+0x1f/0x2b6

after which the system recovered nicely.

Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch.

Fixes e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff")

Signed-off-by: Hugh Dickins
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2014-03-04 23:55:47 +0800
3b7a6418c dma debug: account for cachelines and read-only mappings in overlap tracking

While debug_dma_assert_idle() checks if a given *page* is actively
undergoing dma the valid granularity of a dma mapping is a *cacheline*.
Sander's testing shows that the warning message "DMA-API: exceeded 7
overlapping mappings of pfn..." is falsely triggering. The test is
simply mapping multiple cachelines in a given page.

Ultimately we want overlap tracking to be valid as it is a real api
violation, so we need to track active mappings by cachelines. Update
the active dma tracking to use the page-frame-relative cacheline of the
mapping as the key, and update debug_dma_assert_idle() to check for all
possible mapped cachelines for a given page.
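
A page-frame-relative cacheline key can be sketched as follows. The page size (4 KiB) and cacheline size (64 bytes) are assumed values for illustration, and the function names are hypothetical; the point is only that two mappings in the same page get different keys unless they touch the same cacheline.

```python
PAGE_SHIFT = 12       # assumed 4 KiB pages
CACHELINE_SHIFT = 6   # assumed 64-byte cachelines
CACHELINES_PER_PAGE_SHIFT = PAGE_SHIFT - CACHELINE_SHIFT


def cacheline_key(pfn, offset):
    """Key for the cacheline at byte `offset` within page frame `pfn`."""
    return (pfn << CACHELINES_PER_PAGE_SHIFT) + (offset >> CACHELINE_SHIFT)


def page_keys(pfn):
    """All cacheline keys belonging to one page, for a per-page idle check."""
    first = pfn << CACHELINES_PER_PAGE_SHIFT
    return range(first, first + (1 << CACHELINES_PER_PAGE_SHIFT))
```

Under these assumptions a page has 64 keys, so a per-page idle check has to consider all of them rather than a single per-page entry.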

However, the need to track active mappings is only relevant when the
dma-mapping is writable by the device. In fact it is fairly standard
for read-only mappings to have hundreds or thousands of overlapping
mappings at once. Limiting the overlap tracking to writable
(!DMA_TO_DEVICE) eliminates this class of false-positive overlap
reports.

Note, the radix gang lookup is sub-optimal. It would be best if it
stopped fetching entries once the search passed a page boundary.
Nevertheless, this implementation does not perturb the original net_dma
failing case. That is to say the extra overhead does not show up in
terms of making the failing case pass due to a timing change.

References:
http://marc.info/?l=linux-netdev&m=139232263419315&w=2
http://marc.info/?l=linux-netdev&m=139217088107122&w=2

Signed-off-by: Dan Williams
Reported-by: Sander Eikelenboom
Reported-by: Dave Jones
Tested-by: Dave Jones
Tested-by: Sander Eikelenboom
Cc: Konrad Rzeszutek Wilk
Cc: Francois Romieu
Cc: Eric Dumazet
Cc: Wei Liu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Williams
2014-03-04 23:55:47 +0800
668f9abbd mm: close PageTail race

Commit bf6bddf1924e ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).

This results in a very rare NULL pointer dereference on the
aforementioned page_count(page). Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.

This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation. This includes a read memory barrier ensuring that, if
PageTail(head) is true, we return a head page that is neither
NULL nor dangling. The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set before the page
is marked as a tail page.
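
The pairing of a store barrier on the writer side with a read barrier on
the reader side can be sketched in user space with C11 atomics. This is an
analogue under stated assumptions, not the kernel code: it uses
release/acquire ordering instead of the kernel's barrier primitives, it
covers only the publish side of the race (not the re-check of the tail
flag), and struct sketch_page, sketch_prep_tail() and
sketch_compound_head() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page descriptor; not the kernel's struct page. */
struct sketch_page {
    struct sketch_page *first_page;  /* head page, valid once is_tail is set */
    atomic_bool is_tail;             /* stands in for PageTail() */
};

/* Writer: store first_page, then publish the tail flag with release
 * ordering, so the pointer store cannot be reordered after the flag. */
void sketch_prep_tail(struct sketch_page *tail, struct sketch_page *head)
{
    assert(head != NULL);
    tail->first_page = head;
    atomic_store_explicit(&tail->is_tail, true, memory_order_release);
}

/* Reader: an acquire load of the flag guarantees that, if the flag is
 * seen set, the earlier first_page store is visible as well. */
struct sketch_page *sketch_compound_head(struct sketch_page *page)
{
    if (atomic_load_explicit(&page->is_tail, memory_order_acquire))
        return page->first_page;
    return page;
}
```

A reader that observes the tail flag therefore never sees a NULL or stale
first_page in this model, which is the property the barriers are there to
provide.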

This is the safest way to ensure we see the head page that we are
expecting; PageTail(page) is already in the unlikely() path, and the
memory barriers are unfortunately required.

Hugetlbfs is the exception: we don't enforce a store memory barrier
during init since no race is possible.

Signed-off-by: David Rientjes
Cc: Holger Kiehl
Cc: Christoph Lameter
Cc: Rafael Aquini
Cc: Vlastimil Babka
Cc: Michal Hocko
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: "Kirill A. Shutemov"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2014-03-04 23:55:47 +0800
aa15aa0e8 MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors

We're more or less collecting EDAC patches already anyway, so let's record
that in MAINTAINERS so that get_maintainer sees it too.

Signed-off-by: Borislav Petkov
Acked-by: Mauro Carvalho Chehab
Cc: Doug Thompson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Borislav Petkov
2014-03-04 23:55:47 +0800
8b4703e9b macvlan: Add support for 'always_on' offload features

Macvlan currently inherits all of its features from the lower
device. When the lower device disables offload support, macvlan
disables offload support as well. This causes a performance
regression when using macvlan/macvtap in bridge mode.

It can be easily demonstrated by creating 2 namespaces using
macvlan in bridge mode and running netperf between them:

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    20.00    1204.61

To restore the performance, we add software offload features
to the list of "always_on" features for macvlan. This way
when a namespace or a guest using macvtap initially sends a
packet, this packet will not be segmented at macvlan level.
It will only be segmented when macvlan sends the packet
to the lower device.

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    20.00    5507.35
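
The "always_on" idea can be sketched as a feature-mask computation: inherit
whatever the lower device offers, then force the software offloads back in
so that segmentation is deferred until the packet reaches the lower device.
The SK_F_* bits and sketch_fix_features() below are hypothetical and chosen
for illustration; they do not reproduce the kernel's feature flags or the
exact set of features the patch enables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits for the sketch; not the kernel's values. */
#define SK_F_SG       (1u << 0)  /* scatter-gather */
#define SK_F_CSUM     (1u << 1)  /* checksum offload */
#define SK_F_GSO_SW   (1u << 2)  /* software segmentation offload */
#define SK_F_HW_VLAN  (1u << 3)  /* a feature only the hardware can provide */

/* Software features the upper device keeps regardless of the lower device. */
#define SK_ALWAYS_ON  (SK_F_SG | SK_F_CSUM | SK_F_GSO_SW)

/* Inherit what the lower device offers among the requested features,
 * then force the always-on software features back in. */
uint32_t sketch_fix_features(uint32_t requested, uint32_t lower)
{
    uint32_t inherited = requested & lower;
    uint32_t result = inherited | SK_ALWAYS_ON;

    /* Postcondition: software offloads are never dropped. */
    assert((result & SK_ALWAYS_ON) == SK_ALWAYS_ON);
    return result;
}
```

Even when the lower device reports no offloads at all, the upper device in
this model still advertises the software ones, which matches the behaviour
the throughput numbers above are meant to demonstrate.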

Fixes: 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 (macvtap: Add support of packet capture on macvtap device.)
Fixes: 797f87f83b60685ff8a13fa0572d2f10393c50d3 (macvlan: fix netdev feature propagation from lower device)
CC: Florian Westphal
CC: Christian Borntraeger
CC: Jason Wang
CC: Michael S. Tsirkin
Tested-by: Christian Borntraeger
Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller

Vlad Yasevich
2014-03-04 05:43:56 +0800
48235515c Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream...

For the mac80211 bits, Johannes says:

"This time I have a fix to get out of an 'infinite error state' in case
regulatory domain updates failed and two fixes for VHT associations: one
to not disconnect immediately when the AP uses more bandwidth than the
new regdomain would allow after a change due to association country
information getting used, and one for an issue in the code where
mac80211 doesn't correctly ignore a reserved field and then uses an HT
instead of VHT association."

For the iwlwifi bits, Emmanuel says:

"Johannes fixes a long standing bug in the AMPDU status reporting.
Max fixes the listen time which was way too long and causes trouble
to several APs."

Along with those, Bing Zhao marks the mwifiex_usb driver as _not_
supporting USB autosuspend after a number of problems with that have
been reported.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-03-04 05:42:47 +0800
ec0223ec4 net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable

RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->

peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.
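
The rule for deciding whether the peer is AUTH capable can be written down
as a small predicate. This is a minimal sketch assuming a trimmed-down peer
structure; struct sketch_peer, its ext_lists_auth field and
sketch_set_auth_capable() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of the peer state; not the kernel struct. */
struct sketch_peer {
    const void *peer_random;  /* RANDOM parameter copied from INIT, or NULL */
    const void *peer_hmacs;   /* HMAC-ALGO parameter copied from INIT, or NULL */
    const void *peer_chunks;  /* CHUNKS parameter (optional), or NULL */
    bool ext_lists_auth;      /* supported-extensions parameter lists AUTH */
    bool auth_capable;
};

/* The peer counts as AUTH capable only if it advertised AUTH and sent
 * both mandatory parameters; CHUNKS is optional and is not checked. */
void sketch_set_auth_capable(struct sketch_peer *peer)
{
    assert(peer != NULL);
    peer->auth_capable = peer->ext_lists_auth &&
                         peer->peer_random != NULL &&
                         peer->peer_hmacs != NULL;
}
```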

Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is
available, we set up a fake auth chunk and pass that on to
sctp_sf_authenticate(), which at the latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() by using asoc->asoc_shared_key, which is
NULL because the condition key_id == asoc->active_key_id is true
if the AUTH chunk was injected correctly from remote. This
happens no matter what the net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate(), as no key is activated in
sctp_auth_asoc_init_active_key() in either case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we did not receive the parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
we MUST silently discard, which we do from now on.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809f9 created a skb clone in that case.
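
The guard and the cleanup it requires can be sketched as follows. This is a
user-space model under stated assumptions: struct sketch_chunk, struct
sketch_asoc and sketch_check_auth() are hypothetical names, free() stands
in for releasing the skb clone, and freeing the association is left to the
caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum sketch_verdict { SKETCH_PROCESS, SKETCH_DISCARD };

/* Hypothetical stand-ins for the chunk and association state. */
struct sketch_chunk {
    void *auth_chunk;        /* cloned AUTH chunk buffer, or NULL if none */
};

struct sketch_asoc {
    bool peer_auth_capable;
};

/* Decide whether an AUTH chunk bundled with COOKIE-ECHO may be verified.
 * On discard, release the cloned buffer so it does not leak. */
enum sketch_verdict sketch_check_auth(bool auth_enable,
                                      const struct sketch_asoc *asoc,
                                      struct sketch_chunk *chunk)
{
    assert(asoc != NULL && chunk != NULL);

    if (chunk->auth_chunk == NULL)
        return SKETCH_PROCESS;      /* nothing to authenticate */

    if (!auth_enable || !asoc->peer_auth_capable) {
        free(chunk->auth_chunk);    /* stands in for freeing the skb clone */
        chunk->auth_chunk = NULL;
        return SKETCH_DISCARD;      /* silently drop the packet */
    }
    return SKETCH_PROCESS;          /* safe to run authentication */
}
```

The point of checking both conditions before any authentication step is
that no key has been activated when either one is false, so running the
HMAC code would operate on a NULL key.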

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann
Cc: Vlad Yasevich
Cc: Neil Horman
Acked-by: Vlad Yasevich
Signed-off-by: David S. Miller

Daniel Borkmann
2014-03-04 05:39:36 +0800