07 Jan, 2012
40 commits
-
commit 50b8d257486a45cba7b65ca978986ed216bbcc10 upstream.
Test-case:

	#include <assert.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>

	int main(void)
	{
		int pid, status;

		pid = fork();
		if (!pid) {
			for (;;) {
				if (!fork())
					return 0;
				if (waitpid(-1, &status, 0) < 0) {
					printf("ERR!! wait: %m\n");
					return 0;
				}
			}
		}

		assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
		assert(waitpid(-1, NULL, 0) == pid);

		assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
					PTRACE_O_TRACEFORK) == 0);

		do {
			ptrace(PTRACE_CONT, pid, 0, 0);
			pid = waitpid(-1, NULL, 0);
		} while (pid > 0);

		return 1;
	}

It fails because ->real_parent sees its child in EXIT_DEAD state
while the tracer is going to change the state back to EXIT_ZOMBIE
in wait_task_zombie().

The offending commit is 823b018e which moved the EXIT_DEAD check,
but in fact we should not blame it. The original code was not
correct as well because it didn't take ptrace_reparented() into
account and because we can't really trust ->ptrace.

This patch adds the additional check to close this particular
race but it doesn't solve the whole problem. We simply can't
rely on ->ptrace in this case, it can be cleared if the tracer
is multithreaded by the exiting ->parent.

I think we should kill EXIT_DEAD altogether, we should always
remove the soon-to-be-reaped child from ->children or at least
we should never do the DEAD->ZOMBIE transition. But this is too
complex for 3.2.

Reported-and-tested-by: Denys Vlasenko
Tested-by: Lukasz Michalik
Acked-by: Tejun Heo
Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit 157e8bf8b4823bfcdefa6c1548002374b61f61df upstream.
This reverts commit c0afabd3d553c521e003779c127143ffde55a16f.
It causes failures on Toshiba laptops - instead of disabling the alarm,
it actually seems to enable it on the affected laptops, resulting in
(for example) the laptop powering on automatically five minutes after
shutdown.

There's a patch for it that appears to work for at least some people,
but it's too late to play around with this, so revert for now and try
again in the next merge window.

See for example
http://bugs.debian.org/652869

Reported-and-bisected-by: Andreas Friedrich (Toshiba Tecra)
Reported-by: Antonio-M. Corbi Bellot (Toshiba Portege R500)
Reported-by: Marco Santos (Toshiba Portege Z830)
Reported-by: Christophe Vu-Brugier (Toshiba Portege R830)
Cc: Jonathan Nieder
Requested-by: John Stultz
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit f9fab10bbd768b0e5254e53a4a8477a94bfc4b96 upstream.
vfork parent uninterruptibly and unkillably waits for its child to
exec/exit. This wait is of unbounded length. Ignore such waits
in the hung_task detector.

Signed-off-by: Mandeep Singh Baines
Reported-by: Sasha Levin
LKML-Reference:
Cc: Linus Torvalds
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Andrew Morton
Cc: John Kacur
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit 4376eee92e5a8332b470040e672ea99cd44c826a upstream.
If we end up with no power states, don't look up
current vddc.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=44130

agd5f: fix patch formatting
Signed-off-by: Alex Deucher
Signed-off-by: Dave Airlie
Signed-off-by: Greg Kroah-Hartman -
Commit be4f1ac828776bbc7868a68b465cd8eedb733cfd upstream.
Since Linux 2.6.36 the writeback code has introduced various measures for
live lock prevention during sync(). Unfortunately some of these are
actively harmful for the XFS model, where the inode gets marked dirty for
metadata from the data I/O handler.

The older_than_this checks are now more strictly enforced since

	writeback: avoid livelocking WB_SYNC_ALL writeback

by only calling into __writeback_inodes_sb and thus only sampling the
current cut off time once. But on a slow enough device the previous
asynchronous sync pass might not have fully completed yet, and thus XFS
might mark metadata dirty only after that sampling of the cut off time for
the blocking pass already happened. I have not reproduced this
myself on a real system, but by introducing artificial delay into the
XFS I/O completion workqueues it can be reproduced easily.

Fix this by iterating over all XFS inodes in ->sync_fs and logging all that
are dirty. This might log inodes that only got redirtied after the
previous pass, but given how cheap delayed logging of inodes is it
isn't a major concern for performance.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Tested-by: Mark Tinguely
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
Signed-off-by: Greg Kroah-Hartman -
Commit 0b8fd3033c308e4088760aa1d38ce77197b4e074 upstream.
If the writeback code writes back an inode because it has expired we currently
use the non-blocking ->write_inode path. This means any inode that is pinned
is skipped. With delayed logging and a workload that has very little log
traffic otherwise it is very likely that an inode that gets constantly
written to is always pinned, and thus we keep refusing to write it. The VM
writeback code at that point redirties it and doesn't try to write it again
for another 30 seconds. This means under certain scenarios time based
metadata writeback never happens.

Fix this by calling into xfs_log_inode for kupdate in addition to data
integrity syncs, and thus transfer the inode to the log ASAP.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Tested-by: Mark Tinguely
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
Signed-off-by: Greg Kroah-Hartman -
commit 63a741757d15320a25ebf5778f8651cce2ed0611 upstream.
This fixes an odd bug found on a Dell PowerEdge 1850/0RC130
(BIOS A05 01/09/2006) where all of the modules doing pci_set_dma_mask
would fail with:

ata_piix 0000:00:1f.1: enabling device (0005 -> 0007)
ata_piix 0000:00:1f.1: can't derive routing for PCI INT A
ata_piix 0000:00:1f.1: BMDMA: failed to set dma mask, falling back to PIO

The issue was that the Xen-SWIOTLB was allocated such that the end of
the buffer was straddling a page (and also above 4GB). The fix was
spotted by Kalev Leonid which was to piggyback on git commit
e79f86b2ef9c0a8c47225217c1018b7d3d90101c "swiotlb: Use page alignment
for early buffer allocation" which:

	We could call free_bootmem_late() if swiotlb is not used, and
	it will shrink to page alignment.

	So alloc them with page alignment at first, to avoid lose two pages

And doing that fixes the outstanding issue.

Suggested-by: "Kalev, Leonid"
Reported-and-Tested-by: "Taylor, Neal E"
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Greg Kroah-Hartman -
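To illustrate the page-alignment idea behind the Xen-SWIOTLB fix above (commit 63a741757d15), here is a minimal user-space sketch. It is not the kernel code: the 4 KiB page size and the demo_* helper names are assumptions made for this example only. It rounds a requested size up to a whole number of pages, so that a buffer starting on a page boundary also ends on one instead of straddling a page.

```c
#include <stddef.h>

/* Assumed page size for the sketch; the real value is architecture dependent. */
#define DEMO_PAGE_SIZE 4096ul

/* Round n up to the next multiple of the page size. */
static unsigned long demo_page_align(unsigned long n)
{
	return (n + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* Return 1 when [start, start + size) ends exactly on a page boundary. */
static int demo_ends_on_page_boundary(unsigned long start, unsigned long size)
{
	return ((start + size) & (DEMO_PAGE_SIZE - 1)) == 0;
}
```

With a page-aligned start address, passing the size through demo_page_align() first guarantees demo_ends_on_page_boundary() holds.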
commit 3d6271f92e98094584fd1e609a9969cd33e61122 upstream.
Without turning the MADC clock on, no MADC conversions occur.

$ cat /sys/class/hwmon/hwmon0/device/in8_input
[ 53.428436] twl4030_madc twl4030_madc: conversion timeout!
cat: read error: Resource temporarily unavailable

Signed-off-by: Kyle Manna
Signed-off-by: Samuel Ortiz
Signed-off-by: Greg Kroah-Hartman -
commit d0e84caeb4cd535923884735906e5730329505b4 upstream.
If the twl4030-madc device wasn't registered, and another device, such
as twl4030-madc-hwmon, calls twl4030_madc_conversion(), a NULL pointer is
dereferenced.

Signed-off-by: Kyle Manna
Signed-off-by: Samuel Ortiz
Signed-off-by: Greg Kroah-Hartman -
commit 66cc5b8e50af87b0bbd0f179d76d2826f4549c13 upstream.
Worst case this fixes the following error:
[ 72.086212] (NULL device *): conversion timeout!

Best case it prevents a crash

Signed-off-by: Kyle Manna
Signed-off-by: Samuel Ortiz -
commit e178ccb33569da17dc897a08a3865441b813bdfb upstream.
A mutex is locked on entry into twl4030_madc_conversion().
Immediate return on some error conditions leaves the
mutex locked.

This patch ensures that mutex is always unlocked before
leaving the function.

Signed-off-by: Sanjeev Premi
Cc: Keerthy
Signed-off-by: Samuel Ortiz
Signed-off-by: Greg Kroah-Hartman -
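The locking fix described above for twl4030_madc_conversion() (commit e178ccb33569) follows a common pattern: route every error path through a single exit label that drops the lock. The sketch below is a user-space analogue using a pthread mutex, not the driver code; struct demo_madc, demo_conversion() and the error conditions are hypothetical names chosen for illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the real twl4030_madc struct. */
struct demo_madc {
	pthread_mutex_t lock;
	int busy;
};

/* Returns 0 on success or a negative errno; the lock is released on every path. */
static int demo_conversion(struct demo_madc *madc, const int *req)
{
	int ret;

	pthread_mutex_lock(&madc->lock);
	if (!req) {
		ret = -EINVAL;
		goto out;	/* an early "return" here would leave the mutex locked */
	}
	if (madc->busy) {
		ret = -EBUSY;
		goto out;
	}
	ret = 0;
out:
	pthread_mutex_unlock(&madc->lock);
	return ret;
}
```

Because all paths leave through the out label, a failed call can be followed immediately by another call without blocking on a stale lock.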
commit 96f1f05af76b601ab21a7dc603ae0a1cea4efc3d upstream.
Since we configure all the queues as CHAINABLE, we need to update the
byte count for all the queues, not only the AGGREGATABLE ones.

Not doing so can confuse the SCD and make the fw assert.

Signed-off-by: Emmanuel Grumbach
Signed-off-by: Wey-Yi Guy
Signed-off-by: John W. Linville -
[ Upstream commit b9eda06f80b0db61a73bd87c6b0eb67d8aca55ad ]
Signed-off-by: Stephen Rothwell
Acked-by: Eric Dumazet
Acked-by: David Miller
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 9f28a2fc0bd77511f649c0a788c7bf9a5fd04edb ]
Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
removed IP route cache garbage collector a bit too soon, as this gc was
responsible for expired routes cleanup, releasing their neighbour
reference.

As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
their neighbour cache.

Reintroduce the garbage collection, since we'll have to wait until our
neighbour lookups become refcount-less to not depend on this stuff.

Reported-by: Robert Gladewitz
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit d01ff0a049f749e0bf10a35bb23edd012718c8c2 ]
After resetting ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush the route cache, or it will continue to receive packets with
a local source address, which should be dropped.

Signed-off-by: Weiping Pan
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit a76c0adf60f6ca5ff3481992e4ea0383776b24d2 ]
When checking whether a DATA chunk fits into the estimated rwnd a
full sizeof(struct sk_buff) is added to the needed chunk size. This
quickly exhausts the available rwnd space and leads to packets being
sent which are much below the PMTU limit. This can lead to much worse
performance.

The reason for this behaviour was to avoid putting too much memory
pressure on the receiver. The concept is not completely irrational
because a Linux receiver does in fact clone an skb for each DATA chunk
delivered. However, Linux also reserves half the available socket
buffer space for data structures therefore usage of it is already
accounted for.

When proposing to change this the last time it was noted that this
behaviour was introduced to solve a performance issue caused by rwnd
overusage in combination with small DATA chunks.

Trying to reproduce this I found that with the sk_buff overhead removed,
the performance would improve significantly unless socket buffer limits
are increased.

The following numbers have been gathered using a patched iperf
supporting SCTP over a live 1 Gbit ethernet network. The -l option
was used to limit DATA chunk sizes. The numbers listed are based on
the average of 3 test runs each. Default values have been used for
sk_(r|w)mem.

Chunk
Size    Unpatched        No Overhead
-------------------------------------
   4    15.2 Kbit [!]    12.2 Mbit [!]
   8    35.8 Kbit [!]    26.0 Mbit [!]
  16    95.5 Kbit [!]    54.4 Mbit [!]
  32   106.7 Mbit       102.3 Mbit
  64   189.2 Mbit       188.3 Mbit
 128   331.2 Mbit       334.8 Mbit
 256   537.7 Mbit       536.0 Mbit
 512   766.9 Mbit       766.6 Mbit
1024   810.1 Mbit       808.6 Mbit

Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
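The effect described in the SCTP rwnd entry above (commit a76c0adf60f6) can be seen with simple arithmetic. The sketch below is not the SCTP code; the 232-byte overhead is an assumed placeholder for sizeof(struct sk_buff), whose real value depends on the kernel build, and demo_chunks_in_rwnd() is a hypothetical helper. It counts how many DATA chunks the sender believes fit into a given rwnd when a fixed overhead is charged per chunk.

```c
#include <stddef.h>

/* Assumed per-chunk bookkeeping cost; a placeholder for sizeof(struct sk_buff). */
#define DEMO_SKB_OVERHEAD 232u

/* Number of chunks of chunk_len bytes the sender believes fit into rwnd. */
static unsigned int demo_chunks_in_rwnd(unsigned int rwnd,
					unsigned int chunk_len,
					unsigned int overhead)
{
	return rwnd / (chunk_len + overhead);
}
```

Under these assumptions a 65536-byte rwnd holds 264 16-byte chunks with the overhead charged and 4096 without it, which is why small chunk sizes are the ones most affected.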
[ Upstream commit 2692ba61a82203404abd7dd2a027bda962861f74 ]
Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
limiting the autoclose value. If userspace passes in -1 on 32-bit
platform, the overflow check didn't work and autoclose would be set
to 0xffffffff.

This patch defines a max_autoclose (in seconds) for limiting the value
and exposes it through sysctl, with the following intentions.

1) Avoid overflowing autoclose * HZ.

2) Keep the default autoclose bound consistent across 32- and 64-bit
   platforms (INT_MAX / HZ in this patch).

3) Keep the autoclose value consistent between setsockopt() and
   getsockopt() calls.

Suggested-by: Vlad Yasevich
Signed-off-by: Xi Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
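The bound described in the autoclose entry above (commit 2692ba61a822) can be sketched in plain C. This is an illustration, not the SCTP implementation: DEMO_HZ is an assumed tick rate (the kernel's HZ is a build-time option) and the demo_* names are hypothetical. The point is that clamping the value in seconds to INT_MAX / HZ keeps the later multiplication by HZ within the range of an int.

```c
#include <limits.h>

#define DEMO_HZ 1000	/* assumed tick rate for the sketch */

/* Upper bound in seconds so that autoclose * DEMO_HZ cannot exceed INT_MAX. */
static const unsigned int demo_max_autoclose = INT_MAX / DEMO_HZ;

/* Clamp a user-supplied autoclose value (seconds) to the configured bound. */
static unsigned int demo_clamp_autoclose(unsigned int requested)
{
	return requested > demo_max_autoclose ? demo_max_autoclose : requested;
}

/* Convert the clamped value to ticks; the product always fits in an int. */
static int demo_autoclose_ticks(unsigned int requested)
{
	return (int)(demo_clamp_autoclose(requested) * DEMO_HZ);
}
```

A caller passing -1 (0xffffffff as an unsigned value) ends up with the bound rather than an overflowed tick count, and reading the value back returns the same clamped number that was stored.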
[ Upstream commit 3f1e6d3fd37bd4f25e5b19f1c7ca21850426c33f ]
gred_change_vq() is called under sch_tree_lock(sch).
This means a spinlock is held, and we are not allowed to sleep in this
context.

We might pre-allocate memory using GFP_KERNEL before taking spinlock,
but this is not suitable for stable material.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit cd7816d14953c8af910af5bb92f488b0b277e29d ]
Previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c
makes IP-Config wait for carrier on at least one network device.

Before waiting (predefined value 120s), check that at least one device
was successfully brought up. Otherwise (e.g. buggy bootloader
which does not set the MAC address) there is no point in waiting
for carrier.

Cc: Micha Nelissen
Cc: Holger Brunck
Signed-off-by: Gerlando Falauto
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 7838f2ce36b6ab5c13ef20b1857e3bbd567f1759 ]
Userspace may not provide TCA_OPTIONS, in fact tc currently does
not do so if no arguments are specified on the command line.
Return EINVAL instead of panicking.

Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
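The TCA_OPTIONS entry above (commit 7838f2ce36b6) comes down to validating an optional input before dereferencing it. The following is a minimal user-space sketch of that check, not the qdisc code: struct demo_attr and demo_qdisc_change() are hypothetical, simplified stand-ins for the netlink attribute and the change handler.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified attribute; not the kernel's struct nlattr. */
struct demo_attr {
	int limit;
};

/* Reject a missing options attribute with -EINVAL instead of dereferencing NULL. */
static int demo_qdisc_change(const struct demo_attr *opt, int *limit_out)
{
	if (opt == NULL)
		return -EINVAL;
	*limit_out = opt->limit;
	return 0;
}
```

A caller that supplies no options gets an error code back and the output parameter is left untouched.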
[ Upstream commit 9cef310fcdee12b49b8b4c96fd8f611c8873d284 ]
Received non stream protocol packets were calling llc_cmsg_rcv that used a
skb after that skb was released by sk_eat_skb. This caused received STP
packets to generate kernel panics.

Signed-off-by: Alexandru Juncu
Signed-off-by: Kunjan Naik
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit a454daceb78844a09c08b6e2d8badcb76a5d73b9 ]
Signed-off-by: Djalal Harouni
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit a03ffcf873fe0f2565386ca8ef832144c42e67fa ]
x86 jump instruction size is 2 or 5 bytes (near/long jump), not 2 or 6
bytes.

In case a conditional jump is followed by a long jump, conditional jump
target is one byte past the start of target instruction.

Signed-off-by: Markus Kötter
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
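The size rule from the x86 jump entry above (commit a03ffcf873fe) can be written down as a small helper. This is a sketch, not the JIT code, and the demo_* names are hypothetical. It relies on the standard encodings of the unconditional jmp: opcode plus 8-bit displacement (2 bytes) or opcode plus 32-bit displacement (5 bytes). A conditional jump that skips over such a jmp must use that same size as its displacement; assuming 6 would land one byte into the following instruction.

```c
#include <stdint.h>

/* Encoded size of an unconditional x86 jmp: 2 bytes with an 8-bit
 * displacement, 5 bytes with a 32-bit displacement. */
static int demo_jmp_size(int32_t offset)
{
	return (offset >= INT8_MIN && offset <= INT8_MAX) ? 2 : 5;
}

/* Displacement for a conditional jump that skips the jmp following it:
 * exactly the encoded size of that jmp. */
static int demo_skip_jmp_displacement(int32_t jmp_offset)
{
	return demo_jmp_size(jmp_offset);
}
```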
[ A combination of upstream commits 1d299bc7732c34d85bd43ac1a8745f5a2fed2078 and
e88d2468718b0789b4c33da2f7e1cef2a1eee279 ]

Although we provide a proper way for a debugger to control whether
syscall restart occurs, we run into problems because orig_i0 is not
saved and restored properly.

Luckily we can solve this problem without having to make debuggers
aware of the issue. Across system calls, several registers are
considered volatile and can be safely clobbered.

Therefore we use the pt_regs save area of one of those registers, %g6,
as a place to save and restore orig_i0.

Debuggers transparently will do the right thing because they save and
restore this register already.

Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 2e8ecdc008a16b9a6c4b9628bb64d0d1c05f9f92 ]
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit a52312b88c8103e965979a79a07f6b34af82ca4b ]
Properly return the original destination buffer pointer.
Signed-off-by: David S. Miller
Tested-by: Kjetil Oftedal
Signed-off-by: Greg Kroah-Hartman -
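The entry above (commit a52312b88c81) says only that the original destination buffer pointer must be returned; it does not name the routine. Assuming a copy routine with memcpy()-style semantics, the contract can be sketched in portable C as follows. demo_memcpy() is a hypothetical illustration, not the sparc assembly that the commit touches.

```c
#include <stddef.h>

/* Byte-wise copy that keeps the original destination pointer for the
 * return value instead of returning the advanced cursor. */
static void *demo_memcpy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;	/* working cursor; dst itself stays untouched */
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
	return dst;
}
```

Returning the advanced cursor d instead of dst would be the kind of mistake this contract rules out.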
[ Upstream commit 21f74d361dfd6a7d0e47574e315f780d8172084a ]
This is setting things up so that we can correct the return
value, so that it properly returns the original destination
buffer pointer.

Signed-off-by: David S. Miller
Tested-by: Kjetil Oftedal
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 045b7de9ca0cf09f1adc3efa467f668b89238390 ]
Signed-off-by: David S. Miller
Tested-by: Kjetil Oftedal
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 3e37fd3153ac95088a74f5e7c569f7567e9f993a ]
To handle the large physical addresses, just make a simple wrapper
around remap_pfn_range() like MIPS does.

Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 0b64120cceb86e93cb1bda0dc055f13016646907 ]
Some of the sun4v code patching occurs in inline functions visible
to, and usable by, modules.

Therefore we have to patch them up during module load.

Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit b1f44e13a525d2ffb7d5afe2273b7169d6f2222e ]
The "(insn & 0x01800000) != 0x01800000" test matches 'restore'
but that is a legitimate place to see the %lo() part of a 32-bit
symbol relocation, particularly in tail calls.

Signed-off-by: David S. Miller
Tested-by: Sergei Trofimovich
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 7cc8583372a21d98a23b703ad96cab03180b5030 ]
This silently was working for many years and stopped working on
Niagara-T3 machines.

We need to set the MSIQ to VALID before we can set its state to IDLE.
On Niagara-T3, setting the state to IDLE first was causing HV_EINVAL
errors. The hypervisor documentation says, rather ambiguously, that
the MSIQ must be "initialized" before one can set the state.

I previously understood this to mean merely that a successful setconf()
operation has been performed on the MSIQ, which we have done at this
point. But it seems to also mean that it has been set VALID too.

Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
Upstream commit: 911ae9434f83e7355d343f6c2be3ef5b00ea7aed
There's a bug in the MSIX backup and restore routines that causes a crash on
non-x86 (direct access to PCI space not via read/write). These routines are
unnecessary and were removed by the above commit, so also remove them from
stable to fix the crash.

Signed-off-by: Nagalakshmi Nandigama
Signed-off-by: James Bottomley
Signed-off-by: Greg Kroah-Hartman -
commit b0365c8d0cb6e79eb5f21418ae61ab511f31b575 upstream.
If a huge page is enqueued under the protection of hugetlb_lock, then the
operation is atomic and safe.

Signed-off-by: Hillf Danton
Reviewed-by: Michal Hocko
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit 77e00f2ea94abee1ad13bdfde19cf7aa25992b0e upstream.
We already do this for cayman, need to also do it for
BTC parts. The default memory and voltage setup is not
adequate for advanced operation. Continuing will
result in an unusable display.

Signed-off-by: Alex Deucher
Cc: Jean Delvare
Signed-off-by: Dave Airlie
Signed-off-by: Greg Kroah-Hartman -
commit e67d668e147c3b4fec638c9e0ace04319f5ceccd upstream.
This patch makes use of the set_memory_x() kernel API in order
to make necessary BIOS calls to source NMIs.

This is needed for SLES11 SP2 and the latest upstream kernel as it appears
the NX Execute Disable has grown in its control.

Signed-off-by: Thomas Mingarelli
Signed-off-by: Wim Van Sebroeck
Signed-off-by: Greg Kroah-Hartman -
commit e6780f7243eddb133cc20ec37fa69317c218b709 upstream.
It was found (by Sasha) that if you use a futex located in the gate
area we get stuck in an uninterruptible infinite loop, much like the
ZERO_PAGE issue.

While looking at this problem, PeterZ realized you'll get into similar
trouble when hitting any install_special_pages() mapping. And are there
still drivers setting up their own special mmaps without page->mapping,
and without special VM or pte flags to make get_user_pages fail?

In most cases, if page->mapping is NULL, we do not need to retry at all:
Linus points out that even /proc/sys/vm/drop_caches poses no problem,
because it ends up using remove_mapping(), which takes care not to
interfere when the page reference count is raised.

But there is still one case which does need a retry: if memory pressure
called shmem_writepage in between get_user_pages_fast dropping page
table lock and our acquiring page lock, then the page gets switched from
filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
Fault it back in to get the page->mapping needed for key->shared.inode.

Reported-by: Sasha Levin
Signed-off-by: Hugh Dickins
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit 55205c916e179e09773d98d290334d319f45ac6b upstream.
This change fixes a linking problem, which happens if oprofile
is selected to be compiled as built-in:

`oprofile_arch_exit' referenced in section `.init.text' of
arch/arm/oprofile/built-in.o: defined in discarded section
`.exit.text' of arch/arm/oprofile/built-in.o

The problem appeared after commit 87121ca504, which
introduced oprofile_arch_exit() calls from an __init function. Note
that the aforementioned commit has been backported to stable
branches, and the problem is known to be reproduced at least
with 3.0.13 and 3.1.5 kernels.

Signed-off-by: Vladimir Zapolskiy
Signed-off-by: Robert Richter
Cc: Will Deacon
Cc: oprofile-list
Link: http://lkml.kernel.org/r/20111222151540.GB16765@erda.amd.com
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman -
commit 3b6e3c73851a9a4b0e6ed9d378206341dd65e8a5 upstream.
When getting a cmd irq during an ongoing data transfer
with dma, the dma job was never terminated. This is now
corrected.

Tested-by: Linus Walleij
Signed-off-by: Per Forlin
Signed-off-by: Ulf Hansson
Signed-off-by: Russell King
Signed-off-by: Greg Kroah-Hartman -
commit b63038d6f4ca5d1849ce01d9fc5bb9cb426dec73 upstream.
The interrupt was previously enabled and then correctly cleared.
Now we also handle it correctly.

Tested-by: Linus Walleij
Signed-off-by: Ulf Hansson
Signed-off-by: Russell King
Signed-off-by: Greg Kroah-Hartman