Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

14 Aug, 2014

16 commits

2485c5dd5 iovec: make sure the caller actually wants anything in memcpy_fromiovecend

[ Upstream commit 06ebb06d49486676272a3c030bfeef4bd969a8e6 ]

Check for cases when the caller requests 0 bytes instead of running off
and dereferencing potentially invalid iovecs.
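
The guard described above can be illustrated with a user-space sketch. This is not the kernel function: copy_from_iovec_end() is a hypothetical name, memcpy() stands in for the user-copy primitive, and the caller is assumed to pass an iovec array that covers offset + len bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * User-space sketch (hypothetical name): copy `len` bytes out of an iovec
 * array, starting `offset` bytes into the logical stream.
 */
static int copy_from_iovec_end(unsigned char *dst, const struct iovec *iov,
                               size_t offset, size_t len)
{
    /* Nothing requested: return before touching any iovec entry. */
    if (len == 0)
        return 0;

    assert(dst != NULL && iov != NULL);

    /* Skip entries that lie entirely before the offset. */
    while (offset >= iov->iov_len) {
        offset -= iov->iov_len;
        iov++;
    }

    /* Copy piecewise until the request is satisfied. */
    while (len > 0) {
        size_t avail = iov->iov_len - offset;
        size_t chunk = len < avail ? len : avail;

        memcpy(dst, (const unsigned char *)iov->iov_base + offset, chunk);
        dst += chunk;
        len -= chunk;
        offset = 0;
        iov++;
    }
    return 0;
}
```

Without the early return, a zero-length request with an offset at or past the end of the array would make the skip loop walk beyond the last entry.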

Signed-off-by: Sasha Levin
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Sasha Levin
2014-08-14 09:51:49 +0800
05e584976 net: Correctly set segment mac_len in skb_segment().

[ Upstream commit fcdfe3a7fa4cb74391d42b6a26dc07c20dab1d82 ]

When performing segmentation, the mac_len value is copied right
out of the original skb. However, this value is not always set correctly
(like when the packet is VLAN-tagged) and we'll end up copying a bad
value.

One way to demonstrate this is to configure a VM which tags
packets internally and turn off VLAN acceleration on the forwarding
bridge port. The packets show up corrupt like this:
16:18:24.985548 52:54:00:ab:be:25 > 52:54:00:26:ce:a3, ethertype 802.1Q
(0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0,
0x0000: 8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d.
0x0010: 0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0..
0x0020: 29e3 01c9 f871 0000 0101 080a 000a e833 )....q.........3
0x0030: 000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp
0x0040: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
0x0050: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
0x0060: 6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
...

This also leads to awful throughput as GSO packets are dropped and
cause retransmissions.

The solution is to set the mac_len using the values already available
in the new skb. We've already adjusted all of the header offsets, so we
might as well correctly figure out the mac_len using skb_reset_mac_len().
After this change, packets are segmented correctly and performance
is restored.
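
As a rough illustration of "derive, don't copy": the sketch below recomputes the link-layer header length from the header offsets, which is the idea behind skb_reset_mac_len(). The struct and function names are hypothetical stand-ins, not the sk_buff API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few sk_buff fields involved. */
struct pkt_hdrs {
    uint16_t mac_header;      /* offset of the Ethernet header */
    uint16_t network_header;  /* offset of the L3 header */
    uint16_t mac_len;         /* cached length of the link-layer header */
};

/* Recompute mac_len from the offsets instead of trusting a copied value. */
static void pkt_reset_mac_len(struct pkt_hdrs *p)
{
    assert(p->network_header >= p->mac_header);
    p->mac_len = (uint16_t)(p->network_header - p->mac_header);
}
```

For a VLAN-tagged frame the network header sits 18 bytes after the MAC header, so a stale copied value of 14 is replaced by 18.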

CC: Eric Dumazet
Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Vlad Yasevich
2014-08-14 09:51:49 +0800
e629ca252 macvlan: Initialize vlan_features to turn on offload support.

[ Upstream commit 081e83a78db9b0ae1f5eabc2dedecc865f509b98 ]

Macvlan devices do not initialize vlan_features. As a result,
any vlan devices configured on top of macvlans perform very poorly.
Initialize vlan_features based on the vlan features of the lower-level
device.

Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Vlad Yasevich
2014-08-14 09:51:49 +0800
478ac554b net: sctp: inherit auth_capable on INIT collisions

[ Upstream commit 1be9a950c646c9092fb3618197f7b6bfb50e82aa ]

Jason reported an oops caused by SCTP on his ARM machine with
SCTP authentication enabled:

Internal error: Oops: 17 [#1] ARM
CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
task: c6eefa40 ti: c6f52000 task.ti: c6f52000
PC is at sctp_auth_calculate_hmac+0xc4/0x10c
LR is at sg_init_table+0x20/0x38
pc : [] lr : [] psr: 40000013
sp : c6f538e8 ip : 00000000 fp : c6f53924
r10: c6f50d80 r9 : 00000000 r8 : 00010000
r7 : 00000000 r6 : c7be4000 r5 : 00000000 r4 : c6f56254
r3 : c00c8170 r2 : 00000001 r1 : 00000008 r0 : c6f1e660
Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 0005397f Table: 06f28000 DAC: 00000015
Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
Stack: (0xc6f538e8 to 0xc6f54000)
[...]
Backtrace:
[] (sctp_auth_calculate_hmac+0x0/0x10c) from [] (sctp_packet_transmit+0x33c/0x5c8)
[] (sctp_packet_transmit+0x0/0x5c8) from [] (sctp_outq_flush+0x7fc/0x844)
[] (sctp_outq_flush+0x0/0x844) from [] (sctp_outq_uncork+0x24/0x28)
[] (sctp_outq_uncork+0x0/0x28) from [] (sctp_side_effects+0x1134/0x1220)
[] (sctp_side_effects+0x0/0x1220) from [] (sctp_do_sm+0xac/0xd4)
[] (sctp_do_sm+0x0/0xd4) from [] (sctp_assoc_bh_rcv+0x118/0x160)
[] (sctp_assoc_bh_rcv+0x0/0x160) from [] (sctp_inq_push+0x6c/0x74)
[] (sctp_inq_push+0x0/0x74) from [] (sctp_rcv+0x7d8/0x888)

While we already had various kinds of bugs in that area
ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
we/peer is AUTH capable") and b14878ccb7fa ("net: sctp: cache
auth_enable per endpoint"), this one is a bit of a different
kind.

A bit more background on why SCTP authentication is
needed can be found in RFC4895:

SCTP uses 32-bit verification tags to protect itself against
blind attackers. These values are not changed during the
lifetime of an SCTP association.

Looking at new SCTP extensions, there is the need to have a
method of proving that an SCTP chunk(s) was really sent by
the original peer that started the association and not by a
malicious attacker.

To cause this bug, we're triggering an INIT collision between
peers; a normal SCTP handshake where both sides intend to
authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
parameters that are being negotiated among peers:

---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->

...

Since such collisions can also happen with verification tags,
the RFC4895 for AUTH rather vaguely says under section 6.1:

In case of INIT collision, the rules governing the handling
of this Random Number follow the same pattern as those for
the Verification Tag, as explained in Section 5.2.4 of
RFC 2960 [5]. Therefore, each endpoint knows its own Random
Number and the peer's Random Number after the association
has been established.

In RFC2960, section 5.2.4, we're eventually hitting Action B:

B) In this case, both sides may be attempting to start an
association at about the same time but the peer endpoint
started its INIT after responding to the local endpoint's
INIT. Thus it may have picked a new Verification Tag not
being aware of the previous Tag it had sent this endpoint.
The endpoint should stay in or enter the ESTABLISHED
state but it MUST update its peer's Verification Tag from
the State Cookie, stop any init or cookie timers that may
running and send a COOKIE ACK.

In other words, the handling of the Random parameter is the
same as behavior for the Verification Tag as described in
Action B of section 5.2.4.

Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
side effect interpreter, and in fact it properly copies over
peer_{random, hmacs, chunks} parameters from the newly created
association to update the existing one.

Also, the old asoc_shared_key is released and, based on
the new params, recomputed via sctp_auth_asoc_init_active_key().
However, the issue observed in this case is that the previous
asoc->peer.auth_capable was 0, and has *not* been updated, so
that instead of creating a new secret, we're doing an early
return from the function sctp_auth_asoc_init_active_key()
leaving asoc->asoc_shared_key as NULL. However, we now have to
authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).

That in fact causes the server side when responding with ...

active_key_id is still inherited from the
endpoint, and the same as encoded into the chunk, it uses
asoc->asoc_shared_key, which is still NULL, as an asoc_key
and dereferences it in ...

crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)

... causing an oops. All this happens because sctp_make_cookie_ack()
called with the *new* association has the peer.auth_capable=1
and therefore marks the chunk with auth=1 after checking
sctp_auth_send_cid(), but it is *actually* sent later on over
the then *updated* association's transport that didn't initialize
its shared key due to peer.auth_capable=0. Since control chunks
in that case are not sent by the temporary association, which
is scheduled for deletion, they are issued for xmit via
SCTP_CMD_REPLY in the interpreter with the context of the
*updated* association. peer.auth_capable was 0 in the updated
association (which went from COOKIE_WAIT into ESTABLISHED state),
since all previous processing that performed sctp_process_init()
was being done on temporary associations, that we eventually
throw away each time.

The correct fix is to update to the new peer.auth_capable
value as well in the collision case via sctp_assoc_update(),
so that in case the collision migrated from 0 -> 1,
sctp_auth_asoc_init_active_key() can properly recalculate
the secret. This therefore fixes the observed server panic.
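
The failure mode can be modelled in a few lines. This is a heavily reduced sketch, not the SCTP code: the struct, the function names and the `inherit` switch are all hypothetical and only mirror the reasoning above (an early return on a stale auth_capable leaves the key NULL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of an SCTP association. */
struct assoc_model {
    bool peer_auth_capable;   /* models asoc->peer.auth_capable */
    const char *shared_key;   /* models asoc->asoc_shared_key */
};

/* Models the early return: no key is derived for a non-AUTH peer. */
static void model_init_active_key(struct assoc_model *a)
{
    if (!a->peer_auth_capable)
        return;
    a->shared_key = "derived-secret";  /* placeholder for the real derivation */
}

/* Models the collision update; `inherit` toggles the fixed behaviour. */
static void model_assoc_update(struct assoc_model *old,
                               const struct assoc_model *fresh, bool inherit)
{
    assert(old != NULL && fresh != NULL);
    old->shared_key = NULL;                 /* the old key is released */
    if (inherit)
        old->peer_auth_capable = fresh->peer_auth_capable;
    model_init_active_key(old);
}
```

With `inherit` false the updated association ends up with a NULL key even though chunks are marked for authentication; with `inherit` true the key is recomputed.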

Fixes: 730fc3d05cd4 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Jason Gunthorpe
Signed-off-by: Daniel Borkmann
Tested-by: Jason Gunthorpe
Cc: Vlad Yasevich
Acked-by: Vlad Yasevich
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Daniel Borkmann
2014-08-14 09:51:48 +0800
6980baed3 bna: fix performance regression

[ Upstream commit c36c9d50cc6af5c5bfcc195f21b73f55520c15f9 ]

The recent commit "e29aa33 bna: Enable Multi Buffer RX" is causing
a performance regression. It does not properly update the 'cmpl' pointer
at the end of the loop in the NAPI handler bnad_cq_process(). As a result,
only one packet is processed per NAPI schedule.

Signed-off-by: Ivan Vecera
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Ivan Vecera
2014-08-14 09:51:48 +0800
4fe35cff3 tcp: Fix integer-overflow in TCP vegas

[ Upstream commit 1f74e613ded11517db90b2bd57e9464d9e0fb161 ]

In vegas we do a multiplication of the cwnd and the rtt. This
may overflow and thus their result is stored in a u64. However, we first
need to cast the cwnd so that actually 64-bit arithmetic is done.

Then, we need to do do_div to allow this to be used on 32-bit arches.
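
A small user-space sketch of why the cast matters. The function names are hypothetical; plain `/` is used here because user-space C supports 64-bit division directly, whereas the kernel uses do_div() on 32-bit arches.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy form: the product is computed in 32 bits, then widened. */
static uint64_t target_cwnd_narrow(uint32_t cwnd, uint32_t base_rtt,
                                   uint32_t rtt)
{
    uint64_t t = cwnd * base_rtt;   /* wraps modulo 2^32 before the store */

    assert(rtt != 0);
    return t / rtt;
}

/* Fixed form: widen one operand first so the multiply is 64-bit. */
static uint64_t target_cwnd_wide(uint32_t cwnd, uint32_t base_rtt,
                                 uint32_t rtt)
{
    uint64_t t = (uint64_t)cwnd * base_rtt;

    assert(rtt != 0);
    return t / rtt;
}
```

With cwnd = 100000, base_rtt = 100000 and rtt = 1000, the wide form yields 10000000 while the narrow form yields 1410065, because 10^10 does not fit in 32 bits.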

Cc: Stephen Hemminger
Cc: Neal Cardwell
Cc: Eric Dumazet
Cc: David Laight
Cc: Doug Leith
Fixes: 8d3a564da34e (tcp: tcp_vegas cong avoid fix)
Signed-off-by: Christoph Paasch
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Christoph Paasch
2014-08-14 09:51:48 +0800
c2b2fb6a9 ip: make IP identifiers less predictable

[ Upstream commit 04ca6973f7c1a0d8537f2d9906a0cf8e69886d75 ]

In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways of exploiting Linux IP identifier generation to
infer whether two machines are exchanging packets.

With commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.

This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.

Note that prandom_u32_max(1) returns 0, so if the generator is used at least
once per jiffy, this patch inserts no hole in the ID suite and does not
increase collision probability.

This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.
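
The perturbation can be sketched as follows. This is a single-threaded illustration under stated assumptions: struct id_slot, reserve_id() and demo_rnd_below() are hypothetical names, the random source is injected so the behaviour is reproducible, and the atomic operations a kernel implementation needs are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generator slot: a counter plus a last-use timestamp. */
struct id_slot {
    uint32_t id;
    uint32_t stamp;   /* tick of last use, standing in for jiffies */
};

/* Must return a value in [0, n); stands in for prandom_u32_max(). */
typedef uint32_t (*rnd_below_fn)(uint32_t n);

/* Deterministic stand-in for demonstration only; returns 0 for n == 1. */
static uint32_t demo_rnd_below(uint32_t n)
{
    return n / 2;
}

/* Reserve `segs` consecutive IDs and return the first one. */
static uint16_t reserve_id(struct id_slot *s, uint32_t now, uint32_t segs,
                           rnd_below_fn rnd_below)
{
    uint32_t delta = 0;

    assert(segs > 0);
    if (s->stamp != now) {
        delta = rnd_below(now - s->stamp);  /* idle ticks bound the jump */
        s->stamp = now;
    }
    s->id += segs + delta;
    return (uint16_t)(s->id - segs);
}
```

When the slot is used on consecutive ticks the bound is 1, so the added delta is 0 and the IDs stay contiguous; after an idle period the next ID jumps by a random amount smaller than the number of idle ticks.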

We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.

For IPv6, adds saddr into the hash as well, but not nexthdr.

If I ping the patched target, we can see the IDs are now hard to predict.

21:57:11.008086 IP (...)
A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
target > A: ICMP echo reply, seq 1, length 64

21:57:12.013133 IP (...)
A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
target > A: ICMP echo reply, seq 2, length 64

21:57:13.016580 IP (...)
A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
target > A: ICMP echo reply, seq 3, length 64

[1] TCP sessions use a per-flow ID generator not changed by this patch.

Signed-off-by: Eric Dumazet
Reported-by: Jeffrey Knockel
Reported-by: Jedidiah R. Crandall
Cc: Willy Tarreau
Cc: Hannes Frederic Sowa
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2014-08-14 09:51:48 +0800
6ea4adaf4 inetpeer: get rid of ip_id_count

[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

Linux kernels used the inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
is about 20.

4) If server deals with many tcp flows, we have a high probability of
not finding the inet_peer, allocating a fresh one, inserting it in
the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation does not have to be 'perfect'.

The goal is to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkins hash using the dst IP
as a key.
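
A minimal sketch of the "array of generators" idea, assuming a fixed-size table indexed by a hash of the destination address. The table size, the names and the mixing function are illustrative choices; in particular mix32() is a simple multiplicative hash used as a stand-in, not the Jenkins hash the commit refers to, and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdint.h>

#define ID_SLOTS 2048u   /* table size chosen for this sketch */

static uint32_t id_table[ID_SLOTS];

/* Stand-in mixer (multiplicative hash), keyed with a caller-supplied seed. */
static uint32_t mix32(uint32_t x, uint32_t seed)
{
    x ^= seed;
    x *= 0x9E3779B1u;
    return x ^ (x >> 15);
}

/* Reserve `segs` IDs from the slot selected by the destination address. */
static uint16_t select_ident(uint32_t daddr, uint32_t seed, uint32_t segs)
{
    uint32_t *slot = &id_table[mix32(daddr, seed) % ID_SLOTS];
    uint32_t first = *slot;

    assert(segs > 0);
    *slot += segs;
    return (uint16_t)first;
}
```

Destinations that hash to the same slot share a counter, which is acceptable given that the goal is only to avoid duplicates over a short period.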

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() are no longer needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2014-08-14 09:51:48 +0800
6e2a4fd1e tcp: Fix integer-overflows in TCP veno

[ Upstream commit 45a07695bc64b3ab5d6d2215f9677e5b8c05a7d0 ]

In veno we do a multiplication of the cwnd and the rtt. This
may overflow and thus their result is stored in a u64. However, we first
need to cast the cwnd so that actually 64-bit arithmetic is done.

A first attempt at fixing 76f1017757aa0 ([TCP]: TCP Veno congestion
control) was made by 159131149c2 (tcp: Overflow bug in Vegas), but it
failed to add the required cast in tcp_veno_cong_avoid().

Fixes: 76f1017757aa0 ([TCP]: TCP Veno congestion control)
Signed-off-by: Christoph Paasch
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Christoph Paasch
2014-08-14 09:51:48 +0800
b28977247 ip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"

[ Upstream commit 95cb5745983c222867cc9ac593aebb2ad67d72c0 ]

IPv4 tunnels created with "local any remote $ip" didn't work properly since
7d442fab0 (ipv4: Cache dst in tunnels). 99% of packets sent via those tunnels
had src addr = 0.0.0.0. That was because only dst_entry was cached, although
fl4.saddr has to be cached too. Every time ip_tunnel_xmit used cached dst_entry
(tunnel_rtable_get returned non-NULL), fl4.saddr was initialized with
tnl_params->saddr (= 0 in our case), and wasn't changed until iptunnel_xmit().

This patch adds saddr to ip_tunnel->dst_cache, fixing this issue.
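
The shape of the fix can be sketched as a cache entry that stores the source address next to the route, so that a cache hit restores both. All names below are hypothetical and the types are simplified stand-ins for the kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cached route (dst_entry). */
struct route_model {
    int ifindex;
};

/* Cache entry: the route plus the source address chosen with it. */
struct tunnel_dst_cache {
    const struct route_model *dst;
    uint32_t saddr;
};

static void tunnel_cache_set(struct tunnel_dst_cache *c,
                             const struct route_model *dst, uint32_t saddr)
{
    assert(c != NULL);
    c->dst = dst;
    c->saddr = saddr;
}

/* On a hit, also hand back the cached source address. */
static const struct route_model *
tunnel_cache_get(const struct tunnel_dst_cache *c, uint32_t *saddr)
{
    assert(c != NULL && saddr != NULL);
    if (c->dst != NULL)
        *saddr = c->saddr;
    return c->dst;
}
```

On a miss the caller's saddr is left untouched, so the normal route lookup can still fill it in.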

Reported-by: Sergey Popov
Signed-off-by: Dmitry Popov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Dmitry Popov
2014-08-14 09:51:48 +0800
4521c12b5 net: phy: re-apply PHY fixups during phy_register_device

[ Upstream commit d92f5dec6325079c550889883af51db1b77d5623 ]

Commit 87aa9f9c61ad ("net: phy: consolidate PHY reset in phy_init_hw()")
moved the call to phy_scan_fixups() in phy_init_hw() after a software
reset is performed.

By the time phy_init_hw() is called in phy_device_register(), no driver
has been bound to this PHY yet, so all the checks in phy_init_hw()
against the PHY driver and the PHY driver's config_init function will
return 0. We will therefore never call phy_scan_fixups() as we should.

Fix this by calling phy_scan_fixups() and checking its return value to
restore the intended functionality.

This broke PHY drivers which do register an early PHY fixup callback to
intercept the PHY probing and do things like changing the 32-bit unique
PHY identifier when a pseudo-PHY address has been used, as well as
board-specific PHY fixups that need to be applied during driver probe
time.

Reported-by: Hauke Mehrtens
Reported-by: Jonas Gorski
Signed-off-by: Florian Fainelli
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Florian Fainelli
2014-08-14 09:51:48 +0800
d68ff2f91 net: sendmsg: fix NULL pointer dereference

[ Upstream commit 40eea803c6b2cfaab092f053248cbeab3f368412 ]

Sasha's report:
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel with the KASAN patchset, I've stumbled on the following spew:
>
> [ 4448.949424] ==================================================================
> [ 4448.951737] AddressSanitizer: user-memory-access on address 0
> [ 4448.952988] Read of size 2 by thread T19638:
> [ 4448.954510] CPU: 28 PID: 19638 Comm: trinity-c76 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813
> [ 4448.956823] ffff88046d86ca40 0000000000000000 ffff880082f37e78 ffff880082f37a40
> [ 4448.958233] ffffffffb6e47068 ffff880082f37a68 ffff880082f37a58 ffffffffb242708d
> [ 4448.959552] 0000000000000000 ffff880082f37a88 ffffffffb24255b1 0000000000000000
> [ 4448.961266] Call Trace:
> [ 4448.963158] dump_stack (lib/dump_stack.c:52)
> [ 4448.964244] kasan_report_user_access (mm/kasan/report.c:184)
> [ 4448.965507] __asan_load2 (mm/kasan/kasan.c:352)
> [ 4448.966482] ? netlink_sendmsg (net/netlink/af_netlink.c:2339)
> [ 4448.967541] netlink_sendmsg (net/netlink/af_netlink.c:2339)
> [ 4448.968537] ? get_parent_ip (kernel/sched/core.c:2555)
> [ 4448.970103] sock_sendmsg (net/socket.c:654)
> [ 4448.971584] ? might_fault (mm/memory.c:3741)
> [ 4448.972526] ? might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3740)
> [ 4448.973596] ? verify_iovec (net/core/iovec.c:64)
> [ 4448.974522] ___sys_sendmsg (net/socket.c:2096)
> [ 4448.975797] ? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
> [ 4448.977030] ? lock_release_holdtime (kernel/locking/lockdep.c:273)
> [ 4448.978197] ? lock_release_non_nested (kernel/locking/lockdep.c:3434 (discriminator 1))
> [ 4448.979346] ? check_chain_key (kernel/locking/lockdep.c:2188)
> [ 4448.980535] __sys_sendmmsg (net/socket.c:2181)
> [ 4448.981592] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
> [ 4448.982773] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
> [ 4448.984458] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1500 (discriminator 2))
> [ 4448.985621] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
> [ 4448.986754] SyS_sendmmsg (net/socket.c:2201)
> [ 4448.987708] tracesys (arch/x86/kernel/entry_64.S:542)
> [ 4448.988929] ==================================================================

This report means that we've come to netlink_sendmsg() with msg->msg_name == NULL and msg->msg_namelen > 0.

After this report there was no usual "Unable to handle kernel NULL pointer dereference"
and this gave me a clue that address 0 is mapped and contains a valid socket address structure.

This bug was introduced in f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic).
Commit message states that:
"Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address."
But in fact this affects sendto when address 0 is mapped and contains
a socket address structure. In such a case the address copy-in will succeed, and the
verify_iovec() function will successfully exit with msg->msg_namelen > 0
and msg->msg_name == NULL.

This patch fixes it by setting msg_namelen to 0 if msg_name == NULL.
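
The invariant being restored is that a NULL name never travels with a non-zero length. A user-space sketch using the POSIX struct msghdr; normalize_msg_name() is a hypothetical helper, not the kernel's verify_iovec().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: make name and length agree before handlers run. */
static void normalize_msg_name(struct msghdr *m)
{
    assert(m != NULL);
    if (m->msg_name == NULL || m->msg_namelen == 0) {
        m->msg_name = NULL;
        m->msg_namelen = 0;
    }
}
```

A protocol handler that only checks msg_namelen before reading the address then cannot be led to dereference a NULL msg_name.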

Cc: Hannes Frederic Sowa
Cc: Eric Dumazet
Cc:
Reported-by: Sasha Levin
Signed-off-by: Andrey Ryabinin
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Andrey Ryabinin
2014-08-14 09:51:48 +0800
e0d1b8941 bnx2x: fix crash during TSO tunneling

[ Upstream commit fe26566d8a05151ba1dce75081f6270f73ec4ae1 ]

When a TSO packet is transmitted, an additional BD w/o mapping is used
to describe the packet. The BD needs special handling in tx
completion.

kernel: Call Trace:
kernel: [] dump_stack+0x19/0x1b
kernel: [] warn_slowpath_common+0x61/0x80
kernel: [] warn_slowpath_fmt+0x5c/0x80
kernel: [] ? find_iova+0x4d/0x90
kernel: [] intel_unmap_page.part.36+0x142/0x160
kernel: [] intel_unmap_page+0x26/0x30
kernel: [] bnx2x_free_tx_pkt+0x157/0x2b0 [bnx2x]
kernel: [] bnx2x_tx_int+0xac/0x220 [bnx2x]
kernel: [] ? read_tsc+0x9/0x20
kernel: [] bnx2x_poll+0xbb/0x3c0 [bnx2x]
kernel: [] net_rx_action+0x15a/0x250
kernel: [] __do_softirq+0xf7/0x290
kernel: [] call_softirq+0x1c/0x30
kernel: [] do_softirq+0x55/0x90
kernel: [] irq_exit+0x115/0x120
kernel: [] do_IRQ+0x58/0xf0
kernel: [] common_interrupt+0x6d/0x6d
kernel: [] ? clockevents_notify+0x127/0x140
kernel: [] ? cpuidle_enter_state+0x4f/0xc0
kernel: [] cpuidle_idle_call+0xc5/0x200
kernel: [] arch_cpu_idle+0xe/0x30
kernel: [] cpu_startup_entry+0xf5/0x290
kernel: [] start_secondary+0x265/0x27b
kernel: ---[ end trace 11aa7726f18d7e80 ]---

Fixes: a848ade408b ("bnx2x: add CSUM and TSO support for encapsulation protocols")
Reported-by: Yulong Pei
Cc: Michal Schmidt
Signed-off-by: Dmitry Kravkov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Dmitry Kravkov
2014-08-14 09:51:48 +0800
a052e55c0 xfrm: Fix installation of AH IPsec SAs

[ Upstream commit a0e5ef53aac8e5049f9344857d8ec5237d31e58b ]

The SPI check introduced in ea9884b3acf3311c8a11db67bfab21773f6f82ba
was intended for IPComp SAs but actually prevented AH SAs from getting
installed (depending on the SPI).

Fixes: ea9884b3acf3 ("xfrm: check user specified spi for IPComp")
Cc: Fan Du
Signed-off-by: Tobias Brunner
Signed-off-by: Steffen Klassert
Signed-off-by: Greg Kroah-Hartman

Tobias Brunner
2014-08-14 09:51:48 +0800
c4b76e186 xfrm: Fix refcount imbalance in xfrm_lookup

[ Upstream commit b7eea4545ea775df957460f58eb56085a8892856 ]

xfrm_lookup must return a dst_entry with a refcount for the caller.
Git commit 1a1ccc96abb ("xfrm: Remove caching of xfrm_policy_sk_bundles")
removed this refcount for the socket policy case accidentally.
This patch restores it and sets DST_NOCACHE flag to make sure
that the dst_entry is freed when the refcount becomes null.

Fixes: 1a1ccc96abb ("xfrm: Remove caching of xfrm_policy_sk_bundles")
Signed-off-by: Steffen Klassert
Signed-off-by: Greg Kroah-Hartman

Steffen Klassert
2014-08-14 09:51:47 +0800
34ada3629 net: bcmgenet: correctly pad short packets

[ Upstream commit 474ea9cafc459976827a477f2c30eaf6313cb7c1 ]

Packets shorter than ETH_ZLEN were not padded with zeroes, hence leaking
potentially sensitive information. This bug has been present since the
driver got accepted in commit 1c1008c793fa46703a2fee469f4235e1c7984333
("net: bcmgenet: add main driver file").
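
What "padded with zeroes" means can be sketched in user space. The helper name and the buffer-capacity convention are hypothetical; 60 is the value of ETH_ZLEN, the minimum Ethernet frame length without the FCS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_FRAME_LEN 60u   /* ETH_ZLEN: minimum frame length without FCS */

/*
 * Zero-fill a short frame up to the minimum length.
 * Returns the new frame length, or 0 if the buffer is too small to pad.
 */
static size_t pad_short_frame(unsigned char *buf, size_t len, size_t cap)
{
    assert(buf != NULL && len <= cap);
    if (len >= MIN_FRAME_LEN)
        return len;
    if (cap < MIN_FRAME_LEN)
        return 0;
    memset(buf + len, 0, MIN_FRAME_LEN - len);  /* no stale bytes leak out */
    return MIN_FRAME_LEN;
}
```

If the length were simply raised to 60 without the memset, whatever bytes happened to follow the payload in the buffer would go out on the wire.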

Signed-off-by: Florian Fainelli
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Florian Fainelli
2014-08-14 09:51:47 +0800

08 Aug, 2014

24 commits

0617859f5 Linux 3.15.9

Greg Kroah-Hartman
2014-08-08 07:53:58 +0800
8f0d9e956 x86/espfix/xen: Fix allocation of pages for paravirt page tables

commit 8762e5092828c4dc0f49da5a47a644c670df77f3 upstream.

init_espfix_ap() is currently off by one level when informing the hypervisor
that allocated pages will be used for ministacks' page tables.

The most immediate effect of this on a PV guest is that if
'stack_page = __get_free_page()' returns a non-zeroed-out page the hypervisor
will refuse to use it for a page table (which it shouldn't be anyway). This will
result in warnings by both Xen and Linux.

More importantly, a subsequent write to that page (again, by a PV guest) is
likely to result in fatal page fault.

Signed-off-by: Boris Ostrovsky
Link: http://lkml.kernel.org/r/1404926298-5565-1-git-send-email-boris.ostrovsky@oracle.com
Reviewed-by: Konrad Rzeszutek Wilk
Signed-off-by: H. Peter Anvin
Signed-off-by: Greg Kroah-Hartman

Boris Ostrovsky
2014-08-08 07:53:54 +0800
849c17730 lib/btree.c: fix leak of whole btree nodes

commit c75b53af2f0043aff500af0a6f878497bef41bca upstream.

I use btree from 3.14-rc2 in my own module. When the btree module is
removed, a warning arises:

kmem_cache_destroy btree_node: Slab cache still has objects
CPU: 13 PID: 9150 Comm: rmmod Tainted: GF O 3.14.0-rc2 #1
Hardware name: Inspur NF5270M3/NF5270M3, BIOS CHEETAH_2.1.3 09/10/2013
Call Trace:
dump_stack+0x49/0x5d
kmem_cache_destroy+0xcf/0xe0
btree_module_exit+0x10/0x12 [btree]
SyS_delete_module+0x198/0x1f0
system_call_fastpath+0x16/0x1b

The cause is that it doesn't release the last btree node, when height = 1
and fill = 1.

[akpm@linux-foundation.org: remove unneeded test of NULL]
Signed-off-by: Minfei Huang
Cc: Joern Engel
Cc: Johannes Berg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Minfei Huang
2014-08-08 07:53:53 +0800
9425aebd9 net/l2tp: don't fall back on UDP [get|set]sockopt

commit 3cf521f7dc87c031617fd47e4b7aa2593c2f3daf upstream.

The l2tp [get|set]sockopt() code has fallen back to the UDP functions
for socket option levels != SOL_PPPOL2TP since day one, but that has
never actually worked, since the l2tp socket isn't an inet socket.

As David Miller points out:

"If we wanted this to work, it'd have to look up the tunnel and then
use tunnel->sk, but I wonder how useful that would be"

Since this can never have worked so nobody could possibly have depended
on that functionality, just remove the broken code and return -EINVAL.

Reported-by: Sasha Levin
Acked-by: James Chapman
Acked-by: David Miller
Cc: Phil Turnbull
Cc: Vegard Nossum
Cc: Willy Tarreau
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Sasha Levin
2014-08-08 07:53:53 +0800
809c61dac xtensa: add fixup for double exception raised in window overflow

commit 17290231df16eeee5dfc198dbf5ee4b419996dcd upstream.

There are two FIXMEs in the double exception handler 'for the extremely
unlikely case'. This case gets hit by gcc during kernel build once in
a few hours, resulting in an unrecoverable exception condition.

Provide missing fixup routine to handle this case. Double exception
literals now need 8 more bytes, add them to the linker script.

Also replace bbsi instructions with bbsi.l as we're branching depending
on 8th and 7th LSB-based bits of exception address.

This may be tested by adding the explicit DTLB invalidation to window
overflow handlers, like the following:

# --- a/arch/xtensa/kernel/vectors.S
# +++ b/arch/xtensa/kernel/vectors.S
# @@ -592,6 +592,14 @@ ENDPROC(_WindowUnderflow4)
# ENTRY_ALIGN64(_WindowOverflow8)
#
# s32e a0, a9, -16
# + bbsi.l a9, 31, 1f
# + rsr a0, ccount
# + bbsi.l a0, 4, 1f
# + pdtlb a0, a9
# + idtlb a0
# + movi a0, 9
# + idtlb a0
# +1:
# l32e a0, a1, -12
# s32e a2, a9, -8
# s32e a1, a9, -12

Signed-off-by: Max Filippov
Signed-off-by: Greg Kroah-Hartman

Max Filippov
2014-08-08 07:53:53 +0800
03b3b027c x86/xen: no need to explicitly register an NMI callback

commit ea9f9274bf4337ba7cbab241c780487651642d63 upstream.

Remove xen_enable_nmi() to fix a 64-bit guest crash when registering
the NMI callback on Xen 3.1 and earlier.

It's not needed since the NMI callback is set by a set_trap_table
hypercall (in xen_load_idt() or xen_write_idt_entry()).

It's also broken since it only set the current VCPU's callback.

Signed-off-by: David Vrabel
Reported-by: Vitaly Kuznetsov
Tested-by: Vitaly Kuznetsov
Cc: Steven Noonan
Signed-off-by: Greg Kroah-Hartman

David Vrabel
2014-08-08 07:53:53 +0800
6e285bbe0 drm/i915: Ignore VBT backlight presence check on HP Chromebook 14

commit 724cb06fa9b1e1ffd98188275543fdb3b8eaca4f upstream.

commit c675949ec58ca50d5a3ae3c757892f1560f6e896
drm/i915: do not setup backlight if not available according to VBT

caused a regression on the HP Chromebook 14 (with Celeron 2955U CPU),
which has a misconfigured VBT. Apply quirk to ignore the VBT backlight
presence check during backlight setup.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79813
Signed-off-by: Scot Doyle
Tested-by: Stefan Nagy
Cc: Jani Nikula
Cc: Daniel Vetter
Signed-off-by: Daniel Vetter
Signed-off-by: Greg Kroah-Hartman

Scot Doyle
2014-08-08 07:53:53 +0800
124231f18 staging: vt6655: Fix Warning on boot handle_irq_event_percpu.

commit 6cff1f6ad4c615319c1a146b2aa0af1043c5e9f5 upstream.

WARNING: CPU: 0 PID: 929 at /home/apw/COD/linux/kernel/irq/handle.c:147 handle_irq_event_percpu+0x1d1/0x1e0()
irq 17 handler device_intr+0x0/0xa80 [vt6655_stage] enabled interrupts

Using spin_lock_irqsave appears to fix this.

Signed-off-by: Malcolm Priestley
Signed-off-by: Greg Kroah-Hartman

Malcolm Priestley
2014-08-08 07:53:53 +0800
710dfba5f ARM: dts: dra7-evm: Make VDDA_1V8_PHY supply always on

commit e120fb459693bbc1ac3eabdd65c3659d7cfbfd2a upstream.

After clarification from the hardware team it was found that
this 1.8V PHY supply can't be switched OFF when SoC is Active.

Since the PHY IPs don't contain isolation logic built in the design to
allow the power rail to be switched off, there is a very high risk
of IP reliability and additional leakage paths which can result in
additional power consumption.

The only scenario where this rail can be switched off is part of Power on
reset sequencing, but it needs to be kept always-on during operation.

This patch is required for proper functionality of USB, SATA
and PCIe on DRA7-evm.

CC: Rajendra Nayak
CC: Tero Kristo
Signed-off-by: Roger Quadros
Signed-off-by: Tony Lindgren
Signed-off-by: Greg Kroah-Hartman

Roger Quadros
2014-08-08 07:53:53 +0800
7a4564003 vfs: fix check for fallocate on active swapfile

commit 6d2b6170c8914c6c69256b687651fb16d7ec3e18 upstream.

Fix the broken check for calling sys_fallocate() on an active swapfile,
introduced by commit 0790b31b69374ddadefe ("fs: disallow all fallocate
operation on active swapfile").

Signed-off-by: Eric Biggers
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman

Eric Biggers
2014-08-08 07:53:53 +0800
b3faa01ff pinctrl: dra: dt-bindings: Fix pull enable/disable

commit 23d9cec07c589276561c13b180577c0b87930140 upstream.

The DRA74/72 control module pins have a weak pull up and pull down.
This is configured by bit offset 17. If BIT(17) is 1, a pull up is
selected, else a pull down is selected.

However, this pull resistor is applied based on BIT(16) -
PULLUDENABLE - if BIT(16) is *0*, then pull as defined in BIT(17) is
applied, else no weak pulls are applied. We defined this in reverse.
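
The bit semantics described above can be written down as a small decoder. The macro, enum and function names are hypothetical and chosen for this sketch; only the meaning of BIT(16) and BIT(17) is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

#define PAD_PULL_DISABLE (1u << 16)  /* PULLUDENABLE: 0 = pull applied, 1 = none */
#define PAD_PULL_UP      (1u << 17)  /* 1 = pull-up, 0 = pull-down */

static_assert((PAD_PULL_DISABLE & PAD_PULL_UP) == 0, "bits must not overlap");

enum pad_pull { PAD_NO_PULL, PAD_PULLED_DOWN, PAD_PULLED_UP };

/* Decode the weak-pull state from a pad configuration register value. */
static enum pad_pull decode_pull(uint32_t reg)
{
    if (reg & PAD_PULL_DISABLE)
        return PAD_NO_PULL;
    return (reg & PAD_PULL_UP) ? PAD_PULLED_UP : PAD_PULLED_DOWN;
}
```

Note that the enable bit is active-low: a register value of 0 selects an active pull-down, and setting BIT(16) turns the weak pull off regardless of BIT(17).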

Reference: Table 18-5 (Description of the pad configuration register
bits) in Technical Reference Manual Revision (DRA74x revision Q:
SPRUHI2Q Revised June 2014 and DRA72x revision F: SPRUHP2F - Revised
June 2014)

Fixes: 6e58b8f1daaf1a ("ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board")
Signed-off-by: Nishanth Menon
Tested-by: Felipe Balbi
Acked-by: Felipe Balbi
Signed-off-by: Tony Lindgren
Signed-off-by: Greg Kroah-Hartman

Nishanth Menon
2014-08-08 07:53:53 +0800
84dd7a67c x86_64/entry/xen: Do not invoke espfix64 on Xen

commit 7209a75d2009dbf7745e2fd354abf25c3deb3ca3 upstream.

This moves the espfix64 logic into native_iret. To make this work,
it gets rid of the native patch for INTERRUPT_RETURN:
INTERRUPT_RETURN on native kernels is now 'jmp native_iret'.

This changes the 16-bit SS behavior on Xen from OOPSing to leaking
some bits of the Xen hypervisor's RSP (I think).

[ hpa: this is a nonzero cost on native, but probably not enough to
measure. Xen needs to fix this in their own code, probably doing
something equivalent to espfix64. ]

Signed-off-by: Andy Lutomirski
Link: http://lkml.kernel.org/r/7b8f1d8ef6597cb16ae004a43c56980a7de3cf94.1406129132.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin
Signed-off-by: Greg Kroah-Hartman

Andy Lutomirski
2014-08-08 07:53:53 +0800
639d979e0 x86, espfix: Make it possible to disable 16-bit support

commit 34273f41d57ee8d854dcd2a1d754cbb546cb548f upstream.

Embedded systems, which may be very memory-size-sensitive, are
extremely unlikely to ever encounter any 16-bit software, so make it
a CONFIG_EXPERT option to turn off support for any 16-bit software
whatsoever.

Signed-off-by: H. Peter Anvin
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Signed-off-by: Greg Kroah-Hartman

H. Peter Anvin
2014-08-08 07:53:53 +0800
9f8002a3b x86, espfix: Make espfix64 a Kconfig option, fix UML

commit 197725de65477bc8509b41388157c1a2283542bb upstream.

Make espfix64 a hidden Kconfig option. This fixes the x86-64 UML
build which had broken due to the non-existence of init_espfix_bsp()
in UML: since UML uses its own Kconfig, this option does not appear in
the UML build.

This also makes it possible to make support for 16-bit segments a
configuration option, for the people who want to minimize the size of
the kernel.

Reported-by: Ingo Molnar
Signed-off-by: H. Peter Anvin
Cc: Richard Weinberger
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Signed-off-by: Greg Kroah-Hartman

H. Peter Anvin
2014-08-08 07:53:53 +0800
caf3a27f5 x86, espfix: Fix broken header guard

commit 20b68535cd27183ebd3651ff313afb2b97dac941 upstream.

Header guard is #ifndef, not #ifdef...
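
For reference, a correctly formed guard looks like the sketch below. The
header and macro names are hypothetical and chosen only for illustration.
With #ifdef in place of #ifndef, the body would be skipped on the first
inclusion, because the guard macro is not defined yet.

```c
/* example_espfix.h: hypothetical header, for illustration only */
#ifndef EXAMPLE_ESPFIX_H
#define EXAMPLE_ESPFIX_H

/* Everything between the guard lines is seen exactly once per file. */
#define EXAMPLE_ESPFIX_VALUE 42

#endif /* EXAMPLE_ESPFIX_H */
```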

Reported-by: Fengguang Wu
Signed-off-by: H. Peter Anvin
Signed-off-by: Greg Kroah-Hartman

H. Peter Anvin
2014-08-08 07:53:53 +0800
3faa4ea91 x86, espfix: Move espfix definitions into a separate header file

commit e1fe9ed8d2a4937510d0d60e20705035c2609aea upstream.

Sparse warns that the percpu variables aren't declared before they are
defined. Rather than hacking around it, move espfix definitions into
a proper header file.

Reported-by: Fengguang Wu
Signed-off-by: H. Peter Anvin
Signed-off-by: Greg Kroah-Hartman

H. Peter Anvin
2014-08-08 07:53:53 +0800
6fd50a78a x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack

commit 3891a04aafd668686239349ea58f3314ea2af86b upstream.

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer. This
causes some 16-bit software to break, but it also leaks kernel state
to user space. We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.

In checkin:

b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are doing stuff like
running old Win16 binaries under Wine and expect it to work.

This works around this by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart. When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace. The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.
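
The aliasing idea can be modelled with plain address arithmetic. The sketch
below is an assumption-laden illustration, not the kernel code: the helper
names are invented, the base address is arbitrary, and the page table setup
and the #DF fixup are not shown. It only demonstrates why an alias whose
bits 31:16 match the user stack pointer leaks nothing when IRET restores
just the low 16 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: choose the alias of a ministack whose address bits
 * 31:16 equal bits 31:16 of the user stack pointer. Assumes the base
 * address has bits 31:16 clear, so the aliases sit 64K apart.
 */
static uint64_t espfix_alias_for(uint64_t ministack_base, uint32_t user_esp)
{
	return ministack_base | (uint64_t)(user_esp & 0xffff0000u);
}

/*
 * Model of what user space sees in %esp after IRET to a 16-bit stack
 * segment: the low 16 bits come from the IRET frame, bits 31:16 are
 * whatever the kernel stack pointer held at the time.
 */
static uint32_t esp_seen_by_user(uint64_t kernel_rsp, uint32_t user_esp)
{
	return (uint32_t)(kernel_rsp & 0xffff0000u) | (user_esp & 0xffffu);
}
```

Returning through the alias from espfix_alias_for() makes
esp_seen_by_user() equal to the original user_esp, whereas returning on an
ordinary kernel stack exposes bits 31:16 of that stack's address.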

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
and copy (as opposed to map) the IRET frame there, and for the
suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.

Reported-by: Brian Gerst
Signed-off-by: H. Peter Anvin
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk
Cc: Borislav Petkov
Cc: Andrew Lutomirski
Cc: Linus Torvalds
Cc: Dirk Hohndel
Cc: Arjan van de Ven
Cc: comex
Cc: Alexander van Heukelum
Cc: Boris Ostrovsky
Signed-off-by: Greg Kroah-Hartman

H. Peter Anvin
2014-08-08 07:53:53 +0800
4a4545cd2 Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"

commit 7ed6fb9b5a5510e4ef78ab27419184741169978a upstream.

This reverts commit fa81511bb0bbb2b1aace3695ce869da9762624ff in
preparation of merging in the proper fix (espfix64).

Signed-off-by: H. Peter Anvin
Signed-off-by: Greg Kroah-Hartman

H. Peter Anvin
2014-08-08 07:53:52 +0800
d7bbbf455 timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks

commit 504d58745c9ca28d33572e2d8a9990b43e06075d upstream.

clockevents_increase_min_delta() calls printk() from under
hrtimer_bases.lock. That causes lock inversion on scheduler locks because
printk() can call into the scheduler. Lockdep puts it as:

======================================================
[ INFO: possible circular locking dependency detected ]
3.15.0-rc8-06195-g939f04b #2 Not tainted
-------------------------------------------------------
trinity-main/74 is trying to acquire lock:
(&port_lock_key){-.....}, at: [] serial8250_console_write+0x8c/0x10c

but task is already holding lock:
(hrtimer_bases.lock){-.-...}, at: [] hrtimer_try_to_cancel+0x13/0x66

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #5 (hrtimer_bases.lock){-.-...}:
[] lock_acquire+0x92/0x101
[] _raw_spin_lock_irqsave+0x2e/0x3e
[] __hrtimer_start_range_ns+0x1c/0x197
[] perf_swevent_start_hrtimer.part.41+0x7a/0x85
[] task_clock_event_start+0x3a/0x3f
[] task_clock_event_add+0xd/0x14
[] event_sched_in+0xb6/0x17a
[] group_sched_in+0x44/0x122
[] ctx_sched_in.isra.67+0x105/0x11f
[] perf_event_sched_in.isra.70+0x47/0x4b
[] __perf_install_in_context+0x8b/0xa3
[] remote_function+0x12/0x2a
[] smp_call_function_single+0x2d/0x53
[] task_function_call+0x30/0x36
[] perf_install_in_context+0x87/0xbb
[] SYSC_perf_event_open+0x5c6/0x701
[] SyS_perf_event_open+0x17/0x19
[] syscall_call+0x7/0xb

-> #4 (&ctx->lock){......}:
[] lock_acquire+0x92/0x101
[] _raw_spin_lock+0x21/0x30
[] __perf_event_task_sched_out+0x1dc/0x34f
[] __schedule+0x4c6/0x4cb
[] schedule+0xf/0x11
[] work_resched+0x5/0x30

-> #3 (&rq->lock){-.-.-.}:
[] lock_acquire+0x92/0x101
[] _raw_spin_lock+0x21/0x30
[] __task_rq_lock+0x33/0x3a
[] wake_up_new_task+0x25/0xc2
[] do_fork+0x15c/0x2a0
[] kernel_thread+0x1a/0x1f
[] rest_init+0x1a/0x10e
[] start_kernel+0x303/0x308
[] i386_start_kernel+0x79/0x7d

-> #2 (&p->pi_lock){-.-...}:
[] lock_acquire+0x92/0x101
[] _raw_spin_lock_irqsave+0x2e/0x3e
[] try_to_wake_up+0x1d/0xd6
[] default_wake_function+0xb/0xd
[] __wake_up_common+0x39/0x59
[] __wake_up+0x29/0x3b
[] tty_wakeup+0x49/0x51
[] uart_write_wakeup+0x17/0x19
[] serial8250_tx_chars+0xbc/0xfb
[] serial8250_handle_irq+0x54/0x6a
[] serial8250_default_handle_irq+0x19/0x1c
[] serial8250_interrupt+0x38/0x9e
[] handle_irq_event_percpu+0x5f/0x1e2
[] handle_irq_event+0x2c/0x43
[] handle_level_irq+0x57/0x80
[] handle_irq+0x46/0x5c
[] do_IRQ+0x32/0x89
[] common_interrupt+0x2e/0x33
[] _raw_spin_unlock_irqrestore+0x3f/0x49
[] uart_start+0x2d/0x32
[] uart_write+0xc7/0xd6
[] n_tty_write+0xb8/0x35e
[] tty_write+0x163/0x1e4
[] redirected_tty_write+0x6d/0x75
[] vfs_write+0x75/0xb0
[] SyS_write+0x44/0x77
[] syscall_call+0x7/0xb

-> #1 (&tty->write_wait){-.....}:
[] lock_acquire+0x92/0x101
[] _raw_spin_lock_irqsave+0x2e/0x3e
[] __wake_up+0x15/0x3b
[] tty_wakeup+0x49/0x51
[] uart_write_wakeup+0x17/0x19
[] serial8250_tx_chars+0xbc/0xfb
[] serial8250_handle_irq+0x54/0x6a
[] serial8250_default_handle_irq+0x19/0x1c
[] serial8250_interrupt+0x38/0x9e
[] handle_irq_event_percpu+0x5f/0x1e2
[] handle_irq_event+0x2c/0x43
[] handle_level_irq+0x57/0x80
[] handle_irq+0x46/0x5c
[] do_IRQ+0x32/0x89
[] common_interrupt+0x2e/0x33
[] _raw_spin_unlock_irqrestore+0x3f/0x49
[] uart_start+0x2d/0x32
[] uart_write+0xc7/0xd6
[] n_tty_write+0xb8/0x35e
[] tty_write+0x163/0x1e4
[] redirected_tty_write+0x6d/0x75
[] vfs_write+0x75/0xb0
[] SyS_write+0x44/0x77
[] syscall_call+0x7/0xb

-> #0 (&port_lock_key){-.....}:
[] __lock_acquire+0x9ea/0xc6d
[] lock_acquire+0x92/0x101
[] _raw_spin_lock_irqsave+0x2e/0x3e
[] serial8250_console_write+0x8c/0x10c
[] call_console_drivers.constprop.31+0x87/0x118
[] console_unlock+0x1d7/0x398
[] vprintk_emit+0x3da/0x3e4
[] printk+0x17/0x19
[] clockevents_program_min_delta+0x104/0x116
[] clockevents_program_event+0xe7/0xf3
[] tick_program_event+0x1e/0x23
[] hrtimer_force_reprogram+0x88/0x8f
[] __remove_hrtimer+0x5b/0x79
[] hrtimer_try_to_cancel+0x49/0x66
[] hrtimer_cancel+0xd/0x18
[] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
[] task_clock_event_stop+0x20/0x64
[] task_clock_event_del+0xd/0xf
[] event_sched_out+0xab/0x11e
[] group_sched_out+0x1d/0x66
[] ctx_sched_out+0xaf/0xbf
[] __perf_event_task_sched_out+0x1ed/0x34f
[] __schedule+0x4c6/0x4cb
[] schedule+0xf/0x11
[] work_resched+0x5/0x30

other info that might help us debug this:

Chain exists of:
&port_lock_key --> &ctx->lock --> hrtimer_bases.lock

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(hrtimer_bases.lock);
                               lock(&ctx->lock);
                               lock(hrtimer_bases.lock);
  lock(&port_lock_key);

*** DEADLOCK ***

4 locks held by trinity-main/74:
#0: (&rq->lock){-.-.-.}, at: [] __schedule+0xed/0x4cb
#1: (&ctx->lock){......}, at: [] __perf_event_task_sched_out+0x1dc/0x34f
#2: (hrtimer_bases.lock){-.-...}, at: [] hrtimer_try_to_cancel+0x13/0x66
#3: (console_lock){+.+...}, at: [] vprintk_emit+0x3c7/0x3e4

stack backtrace:
CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04b #2
00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570
8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0
8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003
Call Trace:
[] dump_stack+0x16/0x18
[] print_circular_bug+0x18f/0x19c
[] __lock_acquire+0x9ea/0xc6d
[] lock_acquire+0x92/0x101
[] ? serial8250_console_write+0x8c/0x10c
[] ? wait_for_xmitr+0x76/0x76
[] _raw_spin_lock_irqsave+0x2e/0x3e
[] ? serial8250_console_write+0x8c/0x10c
[] serial8250_console_write+0x8c/0x10c
[] ? lock_release+0x191/0x223
[] ? wait_for_xmitr+0x76/0x76
[] call_console_drivers.constprop.31+0x87/0x118
[] console_unlock+0x1d7/0x398
[] vprintk_emit+0x3da/0x3e4
[] printk+0x17/0x19
[] clockevents_program_min_delta+0x104/0x116
[] tick_program_event+0x1e/0x23
[] hrtimer_force_reprogram+0x88/0x8f
[] __remove_hrtimer+0x5b/0x79
[] hrtimer_try_to_cancel+0x49/0x66
[] hrtimer_cancel+0xd/0x18
[] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
[] task_clock_event_stop+0x20/0x64
[] task_clock_event_del+0xd/0xf
[] event_sched_out+0xab/0x11e
[] group_sched_out+0x1d/0x66
[] ctx_sched_out+0xaf/0xbf
[] __perf_event_task_sched_out+0x1ed/0x34f
[] ? __dequeue_entity+0x23/0x27
[] ? pick_next_task_fair+0xb1/0x120
[] __schedule+0x4c6/0x4cb
[] ? trace_hardirqs_off_caller+0xd7/0x108
[] ? trace_hardirqs_off+0xb/0xd
[] ? rcu_irq_exit+0x64/0x77

Fix the problem by using printk_deferred() which does not call into the
scheduler.
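
The general pattern, queueing a message while a sensitive lock is held and
emitting it only later, can be illustrated with a small user-space analogy.
This is not how printk_deferred() is implemented; every name below is
hypothetical and the fixed-size queue is a simplification.

```c
#include <stdio.h>

#define DEFERRED_SLOTS 8
#define DEFERRED_LEN   64

static char deferred_buf[DEFERRED_SLOTS][DEFERRED_LEN];
static int deferred_count;

/* Queue a message without touching the output path; callable under a lock. */
static int log_deferred(const char *msg)
{
	if (deferred_count == DEFERRED_SLOTS)
		return -1; /* queue full, message dropped */
	snprintf(deferred_buf[deferred_count], DEFERRED_LEN, "%s", msg);
	deferred_count++;
	return 0;
}

/* Emit the queued messages; call only after the lock has been released. */
static int flush_deferred(FILE *out)
{
	int flushed = deferred_count;

	for (int i = 0; i < deferred_count; i++)
		fprintf(out, "%s\n", deferred_buf[i]);
	deferred_count = 0;
	return flushed;
}
```

The point of the split is that the code running under the lock never enters
the output path, so it cannot pull in the locks that path depends on.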

Reported-by: Fengguang Wu
Signed-off-by: Jan Kara
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2014-08-08 07:53:52 +0800
0c2377944 sched_clock: Avoid corrupting hrtimer tree during suspend

commit f723aa1817dd8f4fe005aab52ba70c8ab0ef9457 upstream.

During suspend we call sched_clock_poll() to update the epoch and
accumulated time and reprogram the sched_clock_timer to fire
before the next wrap-around time. Unfortunately,
sched_clock_poll() doesn't restart the timer, instead it relies
on the hrtimer layer to do that and during suspend we aren't
calling that function from the hrtimer layer. Instead, we're
reprogramming the expires time while the hrtimer is enqueued,
which can cause the hrtimer tree to be corrupted. Furthermore, we
restart the timer during suspend but we update the epoch during
resume which seems counter-intuitive.

Let's fix this by saving the accumulated state and canceling the
timer during suspend. On resume we can update the epoch and
restart the timer similar to what we would do if we were starting
the clock for the first time.

Fixes: a08ca5d1089d "sched_clock: Use an hrtimer instead of timer"
Signed-off-by: Stephen Boyd
Signed-off-by: John Stultz
Link: http://lkml.kernel.org/r/1406174630-23458-1-git-send-email-john.stultz@linaro.org
Cc: Ingo Molnar
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Stephen Boyd
2014-08-08 07:53:52 +0800
3a5e0137b printk: rename printk_sched to printk_deferred

commit aac74dc495456412c4130a1167ce4beb6c1f0b38 upstream.

After learning we'll need some sort of deferred printk functionality in
the timekeeping core, Peter suggested we rename the printk_sched function
so it can be reused by the subsystems that need it.

This only changes the function name. No logic changes.

Signed-off-by: John Stultz
Reviewed-by: Steven Rostedt
Cc: Jan Kara
Cc: Peter Zijlstra
Cc: Jiri Bohac
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

John Stultz
2014-08-08 07:53:52 +0800
54926c1bc dm cache: fix race affecting dirty block count

commit 44fa816bb778edbab6b6ddaaf24908dd6295937e upstream.

nr_dirty is updated without locking, causing it to drift so that it is
non-zero (either a small positive integer, or a very large one when an
underflow occurs) even when there are no actual dirty blocks. This was
due to a race between the workqueue and map function accessing nr_dirty
in parallel without proper protection.

People were seeing underruns due to a race on increment/decrement of
nr_dirty, see: https://lkml.org/lkml/2014/6/3/648

Fix this by using an atomic_t for nr_dirty.
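
As a user-space illustration of the same idea, the sketch below uses C11
atomics in place of the kernel's atomic_t. The struct and function names
are hypothetical; only the counter name nr_dirty comes from the commit.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the cache state that owns the counter. */
struct cache_stats {
	atomic_int nr_dirty;
};

static void stats_init(struct cache_stats *s)
{
	atomic_init(&s->nr_dirty, 0);
}

/* Each update is a single atomic read-modify-write, so no update is lost. */
static void mark_dirty(struct cache_stats *s)
{
	atomic_fetch_add(&s->nr_dirty, 1);
}

static void mark_clean(struct cache_stats *s)
{
	atomic_fetch_sub(&s->nr_dirty, 1);
}

static int dirty_count(struct cache_stats *s)
{
	return atomic_load(&s->nr_dirty);
}
```

With a plain integer, two contexts doing "load, modify, store" in parallel
can overwrite each other's result, which is the drift described above.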

Reported-by: roma1390@gmail.com
Signed-off-by: Anssi Hannula
Signed-off-by: Joe Thornber
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Anssi Hannula
2014-08-08 07:53:52 +0800
71881a3da dm bufio: fully initialize shrinker

commit d8c712ea471ce7a4fd1734ad2211adf8469ddddc upstream.

1d3d4437eae1 ("vmscan: per-node deferred work") added a flags field to
struct shrinker assuming that all shrinkers were zero filled. The dm
bufio shrinker is not zero filled, which leaves arbitrary kmalloc() data
in flags. So far the only defined flags bit is SHRINKER_NUMA_AWARE.
But there are proposed patches which add other bits to shrinker.flags
(e.g. memcg awareness).

Rather than simply initializing the shrinker, this patch uses kzalloc()
when allocating the dm_bufio_client to ensure that the embedded shrinker
and any other similar structures are zeroed.

This fixes theoretical over-aggressive shrinking of dm bufio objects.
If the uninitialized dm_bufio_client.shrinker.flags contains
SHRINKER_NUMA_AWARE then shrink_slab() would call the dm shrinker for
each NUMA node rather than just once. This has been broken since 3.12.
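
A user-space analogy of the fix, with calloc() playing the role of
kzalloc(), is sketched below. The struct and function names are invented
for the example and the layout is not that of dm_bufio_client.

```c
#include <stdlib.h>

/* Hypothetical embedded structure with a field added after the fact. */
struct shrinker_like {
	unsigned long flags;
};

/* Hypothetical client that embeds it, loosely mirroring the situation. */
struct client_like {
	int block_size;
	struct shrinker_like shrinker;
};

/*
 * Allocate a zero-filled client. Fields that this function never assigns,
 * such as shrinker.flags, start out as 0 instead of leftover heap data.
 */
static struct client_like *client_create(int block_size)
{
	struct client_like *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->block_size = block_size;
	return c;
}
```

Zeroing the whole object also covers any field added to an embedded
structure later, which initializing individual members would not.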

Signed-off-by: Greg Thelen
Acked-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Greg Thelen
2014-08-08 07:53:52 +0800
49ab5de52 iio: buffer: Fix demux table creation

commit 61bd55ce1667809f022be88da77db17add90ea4e upstream.

When creating the demux table we need to iterate over the selected scan mask for
the buffer to get the samples which should be copied to the destination buffer.
Right now the code uses the mask which contains all active channels, which means
the demux table contains entries which cause it to copy all the samples from
source to destination buffer one by one without doing any demuxing.
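
The difference between the two masks can be shown with a simplified model.
This sketch is not the IIO core code: all names are hypothetical, every
channel is assumed to occupy exactly one slot, and adjacent copies are not
merged.

```c
#include <stddef.h>
#include <stdint.h>

/* One copy instruction: source slot index to destination slot index. */
struct demux_entry {
	size_t from;
	size_t to;
};

/*
 * Build the copy table for one buffer. active_mask describes the channels
 * present in the source scan; buffer_mask describes the channels this
 * buffer selected and is assumed to be a subset of active_mask.
 * Returns the number of entries written.
 */
static size_t build_demux(uint32_t active_mask, uint32_t buffer_mask,
			  struct demux_entry *table, size_t max_entries)
{
	size_t n = 0, from = 0, to = 0;

	for (unsigned int ch = 0; ch < 32; ch++) {
		uint32_t bit = UINT32_C(1) << ch;

		if (!(active_mask & bit))
			continue; /* channel is not captured at all */
		if (buffer_mask & bit) {
			if (n == max_entries)
				return n;
			table[n].from = from;
			table[n].to = to++;
			n++;
		}
		from++; /* step past this sample in the source scan */
	}
	return n;
}
```

Passing the active mask as buffer_mask reproduces the bug: every source
slot gets an entry with from equal to to, so nothing is demuxed.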

Signed-off-by: Lars-Peter Clausen
Signed-off-by: Jonathan Cameron
Signed-off-by: Greg Kroah-Hartman

Lars-Peter Clausen
2014-08-08 07:53:52 +0800