Eric Lee / smarc-fsl-linux-kernel

06 Jan, 2016

5 commits

73c20a8b7 net: sched: fix missing free per cpu on qstats ... Browse Code »

When a qdisc is using per cpu stats (currently just the ingress
qdisc) only the bstats are being freed. This also free's the qstats.

Fixes: b0ab6f92752b9f9d8 ("net: sched: enable per cpu qstats")
Signed-off-by: John Fastabend
Acked-by: Eric Dumazet
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller

John Fastabend
2016-01-06 14:40:21 +0800
f941461c9 ARM: net: bpf: fix zero right shift ... Browse Code »

The LSR instruction cannot be used to perform a zero right shift since a
0 as the immediate value (imm5) in the LSR instruction encoding means
that a shift of 32 is perfomed. See DecodeIMMShift() in the ARM ARM.

Make the JIT skip generation of the LSR if a zero-shift is requested.

This was found using american fuzzy lop.

Signed-off-by: Rabin Vincent
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Rabin Vincent
2016-01-06 14:32:09 +0800
60aa3b080 6pack: fix free memory scribbles ... Browse Code »

commit acf673a3187edf72068ee2f92f4dc47d66baed47 fixed a user triggerable free
memory scribble but in doing so replaced it with a different one that allows
the user to control the data and scribble even more.

sixpack_close is called by the tty layer in tty context. The tty context is
protected by sp_get() and sp_put(). However network layer activity via
sp_xmit() is not protected this way. We must therefore stop the queue
otherwise the user gets to dump a buffer mostly of their choice into freed
kernel pages.

Signed-off-by: Alan Cox
Signed-off-by: David S. Miller

One Thousand Gnomes
2016-01-06 14:25:01 +0800
55795ef54 net: filter: make JITs zero A for SKF_AD_ALU_XOR_X ... Browse Code »

The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value. All the BPF JITs fail to clear A if this is used as
the first instruction in a filter. This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs. Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent
Acked-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Rabin Vincent
2016-01-06 13:43:52 +0800
ff6219855 bridge: Only call /sbin/bridge-stp for the initial network namespace ... Browse Code »

[I stole this patch from Eric Biederman. He wrote:]

> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.

[Hannes: changed patch using netns_eq]

Cc: Eric W. Biederman
Signed-off-by: Eric W. Biederman
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2016-01-06 05:46:17 +0800

05 Jan, 2016

6 commits

c845acb32 af_unix: Fix splice-bind deadlock ... Browse Code »

On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
system call and AF_UNIX sockets,

http://lists.openwall.net/netdev/2015/11/06/24

The situation was analyzed as

(a while ago) A: socketpair()
B: splice() from a pipe to /mnt/regular_file
does sb_start_write() on /mnt
C: try to freeze /mnt
wait for B to finish with /mnt
A: bind() try to bind our socket to /mnt/new_socket_name
lock our socket, see it not bound yet
decide that it needs to create something in /mnt
try to do sb_start_write() on /mnt, block (it's
waiting for C).
D: splice() from the same pipe to our socket
lock the pipe, see that socket is connected
try to lock the socket, block waiting for A
B: get around to actually feeding a chunk from
pipe to file, try to lock the pipe. Deadlock.

on 2015/11/10 by Al Viro,

http://lists.openwall.net/netdev/2015/11/10/4

The patch fixes this by removing the kern_path_create related code from
unix_mknod and executing it as part of unix_bind prior acquiring the
readlock of the socket in question. This means that A (as used above)
will sb_start_write on /mnt before it acquires the readlock, hence, it
won't indirectly block B which first did a sb_start_write and then
waited for a thread trying to acquire the readlock. Consequently, A
being blocked by C waiting for B won't cause a deadlock anymore
(effectively, both A and B acquire two locks in opposite order in the
situation described above).

Dmitry Vyukov() tested the original patch.

Signed-off-by: Rainer Weikusat
Signed-off-by: David S. Miller

Rainer Weikusat
2016-01-05 12:22:49 +0800
b5bdacf3b net: Propagate lookup failure in l3mdev_get_saddr to caller ... Browse Code »

Commands run in a vrf context are not failing as expected on a route lookup:
root@kenny:~# ip ro ls table vrf-red
unreachable default

root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
ping: Warning: source address might be selected on device other than vrf-red.
PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.

--- 10.100.1.254 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

Since the vrf table does not have a route for 10.100.1.254 the ping
should have failed. The saddr lookup causes a full VRF table lookup.
Propogating a lookup failure to the user allows the command to fail as
expected:

root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
connect: No route to host

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2016-01-05 11:58:30 +0800
7ec2541aa r8152: add reset_resume function ... Browse Code »

When the reset_resume() is called, the flag of SELECTIVE_SUSPEND should be
cleared and reinitialize the device, whether the SELECTIVE_SUSPEND is set
or not. If reset_resume() is called, it means the power supply is cut or the
device is reset. That is, the device wouldn't be in runtime suspend state and
the reinitialization is necessary.

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

hayeswang
2016-01-05 11:02:08 +0800
55285bf09 connector: bump skb->users before callback invocation ... Browse Code »

Dmitry reports memleak with syskaller program.
Problem is that connector bumps skb usecount but might not invoke callback.

So move skb_get to where we invoke the callback.

Reported-by: Dmitry Vyukov
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2016-01-05 10:46:45 +0800
3934aa4c1 cxgb4: correctly handling failed allocation ... Browse Code »

Since t4_alloc_mem can be failed in memory pressure,
if not properly handled, NULL dereference could be happened.

Signed-off-by: Insu Yun
Signed-off-by: David S. Miller

Insu Yun
2016-01-05 06:18:42 +0800
b77357b69 qlcnic: correctly handle qlcnic_alloc_mbx_args ... Browse Code »

Since qlcnic_alloc_mbx_args can be failed,
return value should be checked.

Signed-off-by: Insu Yun
Signed-off-by: David S. Miller

Insu Yun
2016-01-05 06:14:30 +0800

01 Jan, 2016

5 commits

9c982e86d Merge tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci ... Browse Code »

Pull PCI bugfix from Bjorn Helgaas:
"Here's another fix for v4.4.

This fixes 32-bit config reads for the HiSilicon driver. Obviously
the driver is completely broken without this fix (apparently it
actually was tested internally, but got broken somehow in the process
of upstreaming it).

Summary:

HiSilicon host bridge driver
Fix 32-bit config reads (Dongdong Liu)"

* tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: hisi: Fix hisi_pcie_cfg_read() 32-bit reads

Linus Torvalds
2016-01-01 06:59:21 +0800
7c672dd60 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc ... Browse Code »

Pull sparc fixes from David Miller:
"Just some missing syscall wire ups"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Wire up mlock2 system call.
sparc: Add all necessary direct socket system calls.

Linus Torvalds
2016-01-01 06:46:49 +0800
8f5daf2a4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Prevent XFRM per-cpu counter updates for one namespace from being
applied to another namespace. Fix from DanS treetman.

2) Fix RCU de-reference in iwl_mvm_get_key_sta_id(), from Johannes
Berg.

3) Remove ethernet header assumption in nft_do_chain_netdev(), from
Pablo Neira Ayuso.

4) Fix cpsw PHY ident with multiple slaves and fixed-phy, from Pascal
Speck.

5) Fix use after free in sixpack_close and mkiss_close.

6) Fix VXLAN fw assertion on bnx2x, from Yuval Mintz.

7) natsemi doesn't check for DMA mapping errors, from Alexey
Khoroshilov.

8) Fix inverted test in ip6addrlbl_get(), from ANdrey Ryabinin.

9) Missing initialization of needed_headroom in geneve tunnel driver,
from Paolo Abeni.

10) Fix conntrack template leak in openvswitch, from Joe Stringer.

11) Mission initialization of wq->flags in sock_alloc_inode(), from
Nicolai Stange.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close
net, socket, socket_wq: fix missing initialization of flags
drivers: net: cpsw: fix error return code
openvswitch: Fix template leak in error cases.
sctp: label accepted/peeled off sockets
sctp: use GFP_USER for user-controlled kmalloc
qlcnic: fix a loop exit condition better
net: cdc_ncm: avoid changing RX/TX buffers on MTU changes
geneve: initialize needed_headroom
ipv6: honor ifindex in case we receive ll addresses in router advertisements
addrconf: always initialize sysctl table data
ipv6/addrlabel: fix ip6addrlbl_get()
switchdev: bridge: Pass ageing time as clock_t instead of jiffies
sh_eth: fix 16-bit descriptor field access endianness too
veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
net: usb: cdc_ncm: Adding Dell DW5813 LTE AT&T Mobile Broadband Card
net: usb: cdc_ncm: Adding Dell DW5812 LTE Verizon Mobile Broadband Card
natsemi: add checks for dma mapping errors
rhashtable: Kill harmless RCU warning in rhashtable_walk_init
openvswitch: correct encoding of set tunnel action attributes
...

Linus Torvalds
2016-01-01 06:40:43 +0800
42d85c52f sparc: Wire up mlock2 system call. ... Browse Code »

Signed-off-by: David S. Miller

David S. Miller
2016-01-01 04:38:56 +0800
8b30ca73b sparc: Add all necessary direct socket system calls. ... Browse Code »

The GLIBC folks would like to eliminate socketcall support
eventually, and this makes sense regardless so wire them
all up.

Signed-off-by: David S. Miller

David S. Miller
2016-01-01 04:18:02 +0800

31 Dec, 2015

4 commits

068d8bd33 sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close ... Browse Code »

In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().

So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".

But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.

Signed-off-by: Xin Long
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Xin Long
2015-12-31 05:57:16 +0800
a0ccc3f27 Merge tag 'wireless-drivers-for-davem-2015-12-28' of git://git.kernel.org/pub/sc… ... Browse Code »

…m/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
iwlwifi

* don't load firmware that won't exist for 7260
* fix RCU splat
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

David S. Miller
2015-12-31 05:42:31 +0800
574aab1e0 net, socket, socket_wq: fix missing initialization of flags ... Browse Code »

Commit ceb5d58b2170 ("net: fix sock_wake_async() rcu protection") from
the current 4.4 release cycle introduced a new flags member in
struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
from struct socket's flags member into that new place.

Unfortunately, the new flags field is never initialized properly, at least
not for the struct socket_wq instance created in sock_alloc_inode().

One particular issue I encountered because of this is that my GNU Emacs
failed to draw anything on my desktop -- i.e. what I got is a transparent
window, including the title bar. Bisection lead to the commit mentioned
above and further investigation by means of strace told me that Emacs
is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
reproducible 100% of times and the fact that properly initializing the
struct socket_wq ->flags fixes the issue leads me to the conclusion that
somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
preventing my Emacs from receiving any SIGIO's due to data becoming
available and it got stuck.

Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
member to zero.

Fixes: ceb5d58b2170 ("net: fix sock_wake_async() rcu protection")
Signed-off-by: Nicolai Stange
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Nicolai Stange
2015-12-31 05:38:01 +0800
c6169202e Merge branch 'for-linus' of git://git.kernel.dk/linux-block ... Browse Code »

Pull block fixes from Jens Axboe:
"Make the block layer great again.

Basically three amazing fixes in this pull request, split into 4
patches. Believe me, they should go into 4.4. Two of them fix a
regression, the third and last fixes an easy-to-trigger bug.

- Fix a bad irq enable through null_blk, for queue_mode=1 and using
timer completions. Add a block helper to restart a queue
asynchronously, and use that from null_blk. From me.

- Fix a performance issue in NVMe. Some devices (Intel Pxxxx) expose
a stripe boundary, and performance suffers if we cross it. We took
that into account for merging, but not for the newer splitting
code. Fix from Keith.

- Fix a kernel oops in lightnvm with multiple channels. From Matias"

* 'for-linus' of git://git.kernel.dk/linux-block:
lightnvm: wrong offset in bad blk lun calculation
null_blk: use async queue restart helper
block: add blk_start_queue_async()
block: Split bios on chunk boundaries

Linus Torvalds
2015-12-31 02:26:20 +0800

30 Dec, 2015

14 commits

866be88a1 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge misc fixes from Andrew Morton:
"9 fixes"

* emailed patches from Andrew Morton :
mm/vmstat: fix overflow in mod_zone_page_state()
ocfs2/dlm: clear migration_pending when migration target goes down
mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
ocfs2: fix flock panic issue
m32r: add io*_rep helpers
m32r: fix build failure
arch/x86/xen/suspend.c: include xen/xen.h
mm: memcontrol: fix possible memcg leak due to interrupted reclaim
ocfs2: fix BUG when calculate new backup super

Linus Torvalds
2015-12-30 10:04:41 +0800
e25bd6ca8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs fix from Al Viro:
"Fix for 3.15 breakage of fcntl64() in arm OABI compat. -stable
fodder"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()

Linus Torvalds
2015-12-30 09:46:29 +0800
6cdb18ad9 mm/vmstat: fix overflow in mod_zone_page_state() ... Browse Code »

mod_zone_page_state() takes a "delta" integer argument. delta contains
the number of pages that should be added or subtracted from a struct
zone's vm_stat field.

If a zone is larger than 8TB this will cause overflows. E.g. for a
zone with a size slightly larger than 8TB the line

mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages);

in mm/page_alloc.c:free_area_init_core() will result in a negative
result for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB
contain 0x8xxxxxxx pages which will be sign extended to a negative
value.

Fix this by changing the delta argument to long type.

This could fix an early boot problem seen on s390, where we have a 9TB
system with only one node. ZONE_DMA contains 2GB and ZONE_NORMAL the
rest. The system is trying to allocate a GFP_DMA page but ZONE_DMA is
completely empty, so it tries to reclaim pages in an endless loop.

This was seen on a heavily patched 3.10 kernel. One possible
explaination seem to be the overflows caused by mod_zone_page_state().
Unfortunately I did not have the chance to verify that this patch
actually fixes the problem, since I don't have access to the system
right now. However the overflow problem does exist anyway.

Given the description that a system with slightly less than 8TB does
work, this seems to be a candidate for the observed problem.

Signed-off-by: Heiko Carstens
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Heiko Carstens
2015-12-30 09:45:49 +0800
cc28d6d80 ocfs2/dlm: clear migration_pending when migration target goes down ... Browse Code »

We have found a BUG on res->migration_pending when migrating lock
resources. The situation is as follows.

dlm_mark_lockres_migration
res->migration_pending = 1;
__dlm_lockres_reserve_ast
dlm_lockres_release_ast returns with res->migration_pending remains
because other threads reserve asts
wait dlm_migration_can_proceed returns 1
>>>>>>> o2hb found that target goes down and remove target
from domain_map
dlm_migration_can_proceed returns 1
dlm_mark_lockres_migrating returns -ESHOTDOWN with
res->migration_pending still remains.

When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON
with res->migration_pending. So clear migration_pending when target is
down.

Signed-off-by: Jiufei Xue
Reviewed-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Junxiao Bi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

xuejiufei
2015-12-30 09:45:49 +0800
5f0f2887f mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() ... Browse Code »

test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range. pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.

Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.

This also prevents a crash from offlining memory devices with missing
sections. Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.

Signed-off-by: Andrew Banman
Acked-by: Alex Thorlton
Cc: Russ Anderson
Cc: Alex Thorlton
Cc: Yinghai Lu
Cc: Greg KH
Cc: Seth Jennings
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Banman
2015-12-30 09:45:49 +0800
b5a8bc338 ocfs2: fix flock panic issue ... Browse Code »

Commit 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()")
move flock/posix lock indentify code to locks_lock_inode_wait(), but
missed to set fl_flags to FL_FLOCK which caused the following kernel
panic on 4.4.0_rc5.

kernel BUG at fs/locks.c:1895!
invalid opcode: 0000 [#1] SMP
Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront
CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G O 4.4.0-rc5-next-20151217 #1
Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014
task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000
RIP: locks_lock_inode_wait+0x2e/0x160
Call Trace:
ocfs2_do_flock+0x91/0x160 [ocfs2]
ocfs2_flock+0x76/0xd0 [ocfs2]
SyS_flock+0x10f/0x1a0
entry_SYSCALL_64_fastpath+0x12/0x71
Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d
RIP locks_lock_inode_wait+0x2e/0x160
RSP
---[ end trace dfca74ec9b5b274c ]---

Fixes: 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()")
Signed-off-by: Junxiao Bi
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Joseph Qi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2015-12-30 09:45:49 +0800
92a8ed4c7 m32r: add io*_rep helpers ... Browse Code »

m32r allmodconfig was failing with the error:

error: implicit declaration of function 'read'

On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.

At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.

Signed-off-by: Sudip Mukherjee
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sudip Mukherjee
2015-12-30 09:45:49 +0800
6122192eb m32r: fix build failure ... Browse Code »

m32r allmodconfig is failing with:

In file included from ../include/linux/kvm_para.h:4:0,
from ../kernel/watchdog.c:26:
../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory

kvm_para.h was not included in the build.

Signed-off-by: Sudip Mukherjee
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sudip Mukherjee
2015-12-30 09:45:49 +0800
facca6168 arch/x86/xen/suspend.c: include xen/xen.h ... Browse Code »

Fix the build warning:

arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
if (xen_pv_domain())
^

Reported-by: kbuild test robot
Cc: Sasha Levin
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2015-12-30 09:45:49 +0800
6df38689e mm: memcontrol: fix possible memcg leak due to interrupted reclaim ... Browse Code »

Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break()
once enough pages have been reclaimed, in which case, in contrast to a
full round-trip over a cgroup sub-tree, the current position stored in
mem_cgroup_reclaim_iter of the target cgroup does not get invalidated
and so is left holding the reference to the last scanned cgroup. If the
target cgroup does not get scanned again (we might have just reclaimed
the last page or all processes might exit and free their memory
voluntary), we will leak it, because there is nobody to put the
reference held by the iterator.

The problem is easy to reproduce by running the following command
sequence in a loop:

mkdir /sys/fs/cgroup/memory/test
echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs
memhog 150M
echo $$ > /sys/fs/cgroup/memory/cgroup.procs
rmdir test

The cgroups generated by it will never get freed.

This patch fixes this issue by making mem_cgroup_iter avoid taking
reference to the current position. In order not to hit use-after-free
bug while running reclaim in parallel with cgroup deletion, we make use
of ->css_released cgroup callback to clear references to the dying
cgroup in all reclaim iterators that might refer to it. This callback
is called right before scheduling rcu work which will free css, so if we
access iter->position from rcu read section, we might be sure it won't
go away under us.

[hannes@cmpxchg.org: clean up css ref handling]
Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Vladimir Davydov
Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Cc: [3.19+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2015-12-30 09:45:49 +0800
5c9ee4cbf ocfs2: fix BUG when calculate new backup super ... Browse Code »

When resizing, it firstly extends the last gd. Once it should backup
super in the gd, it calculates new backup super and update the
corresponding value.

But it currently doesn't consider the situation that the backup super is
already done. And in this case, it still sets the bit in gd bitmap and
then decrease from bg_free_bits_count, which leads to a corrupted gd and
trigger the BUG in ocfs2_block_group_set_bits:

BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);

So check whether the backup super is done and then do the updates.

Signed-off-by: Joseph Qi
Reviewed-by: Jiufei Xue
Reviewed-by: Yiwen Jiang
Cc: Mark Fasheh
Cc: Joel Becker
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-12-30 09:45:49 +0800
c1e3334fa drivers: net: cpsw: fix error return code ... Browse Code »

Propagate the return value of platform_get_irq on failure.

A simplified version of the semantic match that finds the two cases where
no error code is returned at all is as follows:
(http://coccinelle.lip6.fr/)

//
@@
identifier ret; expression e1,e2;
@@
(
if ($ret < 0\|ret != 0$)
{ ... return ret; }
|
ret = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
//

Signed-off-by: Julia Lawall
Signed-off-by: David S. Miller

Julia Lawall
2015-12-30 04:31:10 +0800
90c7afc96 openvswitch: Fix template leak in error cases. ... Browse Code »

Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
reference leak on helper objects, but inadvertently introduced a leak on
the ct template.

Previously, ct_info.ct->general.use was initialized to 0 by
nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
returned successful. If an error occurred while adding the helper or
adding the action to the actions buffer, the __ovs_ct_free_action()
cleanup would use nf_ct_put() to free the entry; However, this relies on
atomic_dec_and_test(ct_info.ct->general.use). This reference must be
incremented first, or nf_ct_put() will never free it.

Fix the issue by acquiring a reference to the template immediately after
allocation.

Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
Signed-off-by: Joe Stringer
Signed-off-by: David S. Miller

Joe Stringer
2015-12-30 04:27:52 +0800
76cc404bf [PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64() ... Browse Code »

Cc: stable@vger.kernel.org # 3.15+
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro

Al Viro
2015-12-30 02:04:21 +0800

29 Dec, 2015

6 commits

c3293a9ac lightnvm: wrong offset in bad blk lun calculation ... Browse Code »

dev->nr_luns reports the total number of luns available in a device
while dev->luns_per_chnl is the number of luns per channel.

When multiple channels are available, the offset is calculated from a
channel and lun id into a linear array. As it multiplies with
the total number of luns, we go out of bound when channel id > 0 and
causes the kernel to panic when we read a protected kernel memory area.

Signed-off-by: Matias Bjørling
Signed-off-by: Jens Axboe

Matias Bjørling
2015-12-29 23:28:32 +0800
1e60508c3 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma ... Browse Code »

Pull rdma fixes from Doug Ledford:
"Three late 4.4-rc fixes.

The first two were very small in terms of number of lines, the third
is more lines of change than I like this late in the cycle, but there
are positive test results from Avagotech and from my own test setup
with the target hardware, and given the problem was a 100% failure
case, I sent it through.

- A previous patch updated the mlx4 driver to use vmalloc when there
was not enough memory to get a contiguous region large enough for
our needs, so we need kvfree() whenever we free that item. We
missed one place, so fix that now.

- A previous patch added code to match incoming packets against a
specific device, but failed to compensate for devices that have
both InfiniBand and Ethernet ports. Fix that.

- Under certain vlan conditions, the ocrdma driver would fail to
bring up any vlan interfaces and would print out a circular locking
failure. Fix that"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/be2net: Remove open and close entry points
RDMA/ocrdma: Depend on async link events from CNA
RDMA/ocrdma: Dispatch only port event when port state changes
RDMA/ocrdma: Fix vlan-id assignment in qp parameters
IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srq
IB/cma: cma_match_net_dev needs to take into account port_num

Linus Torvalds
2015-12-29 08:45:14 +0800
48cc661e7 null_blk: use async queue restart helper ... Browse Code »

If null_blk is run in NULL_IRQ_TIMER mode and with queue_mode NULL_Q_RQ,
we need to restart the queue from the hrtimer interrupt. We can't
directly invoke the request_fn from that context, so punt the queue run
to async kblockd context.

Tested-by: Rabin Vincent
Signed-off-by: Jens Axboe

Jens Axboe
2015-12-29 04:07:09 +0800
21491412f block: add blk_start_queue_async() ... Browse Code »

We currently only have an inline/sync helper to restart a stopped
queue. If drivers need an async version, they have to roll their
own. Add a generic helper instead.

Signed-off-by: Jens Axboe

Jens Axboe
2015-12-29 04:07:07 +0800
851334217 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 ... Browse Code »

Pull crypto fix from Herbert Xu:
"This fixes a bug in the algif_skcipher interface that can trigger a
kernel WARN_ON from user-space. It does so by using the new skcipher
interface which unlike the previous ablkcipher does not need to create
extra geniv objects which is what was used to trigger the WARN_ON"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_skcipher - Use new skcipher interface

Linus Torvalds
2015-12-29 02:44:41 +0800
2c7143d4f Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security ... Browse Code »

Pull key handling bugfix from James Morris:
"Fix a race between keyctl_read() and keyctl_revoke()"

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Fix race between read and revoke

Linus Torvalds
2015-12-29 02:35:19 +0800