07 Jan, 2012

40 commits

  • Greg Kroah-Hartman
     
  • commit b25bfda38236f349cde0d1b28952f4eea2148d3f upstream.

    Don't do aggregation-related work in the 'AP mode client power save
    handling' path if aggregation is not enabled in the driver; otherwise
    it leads to a panic, because those data structures are never
    initialized in 'ath_tx_node_init' when aggregation is disabled
    (a sketch of the guard follows this entry).

    EIP is at ath_tx_aggr_wakeup+0x37/0x80 [ath9k]
    EAX: e8c09a20 EBX: f2a304e8 ECX: 00000001 EDX: 00000000
    ESI: e8c085e0 EDI: f2a304ac EBP: f40e1ca4 ESP: f40e1c8c
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    Process swapper/1 (pid: 0, ti=f40e0000 task=f408e860
    task.ti=f40dc000)
    Stack:
    0001e966 e8c09a20 00000000 f2a304ac e8c085e0 f2a304ac
    f40e1cb0 f8186741
    f8186700 f40e1d2c f922988d f2a304ac 00000202 00000001
    c0b4ba43 00000000
    0000000f e8eb75c0 e8c085e0 205b0001 34383220 f2a304ac
    f2a30000 00010020
    Call Trace:
    [] ath9k_sta_notify+0x41/0x50 [ath9k]
    [] ? ath9k_get_survey+0x110/0x110 [ath9k]
    [] ieee80211_sta_ps_deliver_wakeup+0x9d/0x350
    [mac80211]
    [] ? __module_address+0x95/0xb0
    [] ap_sta_ps_end+0x63/0xa0 [mac80211]
    [] ieee80211_rx_h_sta_process+0x156/0x2b0
    [mac80211]
    [] ieee80211_rx_handlers+0xce/0x510 [mac80211]
    [] ? trace_hardirqs_on+0xb/0x10
    [] ? skb_queue_tail+0x3e/0x50
    [] ieee80211_prepare_and_rx_handle+0x111/0x750
    [mac80211]
    [] ieee80211_rx+0x349/0xb20 [mac80211]
    [] ? ieee80211_rx+0x99/0xb20 [mac80211]
    [] ath_rx_tasklet+0x818/0x1d00 [ath9k]
    [] ? ath9k_tasklet+0x35/0x1c0 [ath9k]
    [] ? ath9k_tasklet+0x35/0x1c0 [ath9k]
    [] ath9k_tasklet+0xf3/0x1c0 [ath9k]
    [] tasklet_action+0xbe/0x180

    Cc: Senthil Balasubramanian
    Cc: Rajkumar Manoharan
    Reported-by: Ashwin Mendonca
    Tested-by: Ashwin Mendonca
    Signed-off-by: Mohammed Shafi Shajakhan
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Mohammed Shafi Shajakhan
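
    A minimal sketch of the guard described above, using a stub type
    rather than the real ath9k structures (the field and function names
    here are assumptions, not the literal patch):

    /* Only touch per-TID aggregation state if it was ever set up. */
    struct ath_node_sketch {
            int aggr_enabled;   /* set only when ath_tx_node_init() did the TID setup */
            /* ... per-TID queues would live here ... */
    };

    static void sta_ps_wakeup(struct ath_node_sketch *an)
    {
            if (!an->aggr_enabled)  /* aggregation disabled in the driver */
                    return;         /* nothing was initialized, nothing to wake */

            /* walk the TID queues and resume transmission (omitted) */
    }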
     
  • commit 8a88951b5878dc475dcd841cefc767e36397d14e upstream.

    This is a simple, temporary fix for 3.2; more changes are needed in
    this area.

    1. do_signal_stop() assumes that a running, untraced thread in a
    stopped thread group is not possible. This was our goal, but it has
    not yet been achieved: a stopped-but-resumed tracee can clone a
    running thread, which can initiate another group stop.

    Remove WARN_ON_ONCE(!current->ptrace).

    2. A new thread always starts with ->jobctl = 0. If it is auto-attached
    and its group is stopped, __ptrace_unlink() sets JOBCTL_STOP_PENDING,
    but the JOBCTL_STOP_SIGMASK part is zero; this triggers WARN_ON(!signr)
    in do_jobctl_trap() if another debugger attaches.

    Change __ptrace_unlink() to set the artificial SIGSTOP for report.

    Alternatively, we could change ptrace_init_task() to copy signr from
    current, but this could copy it for no reason and would hide similar
    problems.

    Acked-by: Tejun Heo
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Oleg Nesterov
     
  • commit 50b8d257486a45cba7b65ca978986ed216bbcc10 upstream.

    Test-case:

    #include <assert.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ptrace.h>
    #include <sys/wait.h>

    int main(void)
    {
            int pid, status;

            pid = fork();
            if (!pid) {
                    for (;;) {
                            if (!fork())
                                    return 0;
                            if (waitpid(-1, &status, 0) < 0) {
                                    printf("ERR!! wait: %m\n");
                                    return 0;
                            }
                    }
            }

            assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
            assert(waitpid(-1, NULL, 0) == pid);

            assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
                          PTRACE_O_TRACEFORK) == 0);

            do {
                    ptrace(PTRACE_CONT, pid, 0, 0);
                    pid = waitpid(-1, NULL, 0);
            } while (pid > 0);

            return 1;
    }

    It fails because ->real_parent sees its child in EXIT_DEAD state
    while the tracer is going to change the state back to EXIT_ZOMBIE
    in wait_task_zombie().

    The offending commit is 823b018e, which moved the EXIT_DEAD check,
    but in fact we should not blame it. The original code was not
    correct either, because it didn't take ptrace_reparented() into
    account and because we can't really trust ->ptrace.

    This patch adds an additional check to close this particular
    race, but it doesn't solve the whole problem. We simply can't
    rely on ->ptrace in this case; it can be cleared by the exiting
    ->parent if the tracer is multithreaded.

    I think we should kill EXIT_DEAD altogether, we should always
    remove the soon-to-be-reaped child from ->children or at least
    we should never do the DEAD->ZOMBIE transition. But this is too
    complex for 3.2.

    Reported-and-tested-by: Denys Vlasenko
    Tested-by: Lukasz Michalik
    Acked-by: Tejun Heo
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Oleg Nesterov
     
  • commit 157e8bf8b4823bfcdefa6c1548002374b61f61df upstream.

    This reverts commit c0afabd3d553c521e003779c127143ffde55a16f.

    It causes failures on Toshiba laptops - instead of disabling the alarm,
    it actually seems to enable it on the affected laptops, resulting in
    (for example) the laptop powering on automatically five minutes after
    shutdown.

    There's a patch for it that appears to work for at least some people,
    but it's too late to play around with this, so revert for now and try
    again in the next merge window.

    See for example

    http://bugs.debian.org/652869

    Reported-and-bisected-by: Andreas Friedrich (Toshiba Tecra)
    Reported-by: Antonio-M. Corbi Bellot (Toshiba Portege R500)
    Reported-by: Marco Santos (Toshiba Portege Z830)
    Reported-by: Christophe Vu-Brugier (Toshiba Portege R830)
    Cc: Jonathan Nieder
    Requested-by: John Stultz
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     
  • commit f9fab10bbd768b0e5254e53a4a8477a94bfc4b96 upstream.

    A vfork parent waits uninterruptibly and unkillably for its child to
    exec or exit, and this wait is of unbounded length. Ignore such waits
    in the hung_task detector (a small userspace illustration follows
    this entry).

    Signed-off-by: Mandeep Singh Baines
    Reported-by: Sasha Levin
    LKML-Reference:
    Cc: Linus Torvalds
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: John Kacur
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Mandeep Singh Baines
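
    A small userspace illustration (not part of the patch) of why such a
    wait is unbounded yet perfectly healthy:

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
            pid_t pid = vfork();

            if (pid == 0) {
                    /* Child: the parent is blocked, uninterruptibly, until we
                     * _exit() or exec.  (Strictly, POSIX only permits _exit()/exec
                     * after vfork(); the sleep is just to make the stall visible.) */
                    sleep(600);
                    _exit(0);
            }

            /* Not reached until the child above finishes - potentially far
             * longer than any hung-task timeout, with nothing actually wrong. */
            printf("vfork parent resumed after child %d\n", (int)pid);
            return 0;
    }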
     
  • commit 4376eee92e5a8332b470040e672ea99cd44c826a upstream.

    If we end up with no power states, don't look up
    current vddc.

    fixes:
    https://bugs.freedesktop.org/show_bug.cgi?id=44130

    agd5f: fix patch formatting

    Signed-off-by: Alex Deucher
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

    Alexander Müller
     
  • commit 3d6271f92e98094584fd1e609a9969cd33e61122 upstream.

    Without turning the MADC clock on, no MADC conversions occur.

    $ cat /sys/class/hwmon/hwmon0/device/in8_input
    [ 53.428436] twl4030_madc twl4030_madc: conversion timeout!
    cat: read error: Resource temporarily unavailable

    Signed-off-by: Kyle Manna
    Signed-off-by: Samuel Ortiz
    Signed-off-by: Greg Kroah-Hartman

    Kyle Manna
     
  • commit 96f1f05af76b601ab21a7dc603ae0a1cea4efc3d upstream.

    Since we configure all the queues as CHAINABLE, we need to update the
    byte count for all the queues, not only the AGGREGATABLE ones.

    Not doing so can confuse the SCD and make the fw assert.

    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Wey-Yi Guy
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Emmanuel Grumbach
     
  • [ Upstream commit b9eda06f80b0db61a73bd87c6b0eb67d8aca55ad ]

    Signed-off-by: Stephen Rothwell
    Acked-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Stephen Rothwell
     
  • [ Upstream commit 9f28a2fc0bd77511f649c0a788c7bf9a5fd04edb ]

    Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
    removed the IP route cache garbage collector a bit too soon, as this
    gc was responsible for cleaning up expired routes and releasing their
    neighbour references.

    As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
    their neighbour cache.

    Reintroduce the garbage collection, since we'll have to wait until
    neighbour lookups become refcount-less before we can stop depending
    on this.

    Reported-by: Robert Gladewitz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit e688a604807647c9450f9c12a7cb6d027150a895 ]

    Chris Boot reported crashes occurring in ipv6_select_ident().

    [ 461.457562] RIP: 0010:[] []
    ipv6_select_ident+0x31/0xa7

    [ 461.578229] Call Trace:
    [ 461.580742]
    [ 461.582870] [] ? udp6_ufo_fragment+0x124/0x1a2
    [ 461.589054] [] ? ipv6_gso_segment+0xc0/0x155
    [ 461.595140] [] ? skb_gso_segment+0x208/0x28b
    [ 461.601198] [] ? ipv6_confirm+0x146/0x15e
    [nf_conntrack_ipv6]
    [ 461.608786] [] ? nf_iterate+0x41/0x77
    [ 461.614227] [] ? dev_hard_start_xmit+0x357/0x543
    [ 461.620659] [] ? nf_hook_slow+0x73/0x111
    [ 461.626440] [] ? br_parse_ip_options+0x19a/0x19a
    [bridge]
    [ 461.633581] [] ? dev_queue_xmit+0x3af/0x459
    [ 461.639577] [] ? br_dev_queue_push_xmit+0x72/0x76
    [bridge]
    [ 461.646887] [] ? br_nf_post_routing+0x17d/0x18f
    [bridge]
    [ 461.653997] [] ? nf_iterate+0x41/0x77
    [ 461.659473] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.665485] [] ? nf_hook_slow+0x73/0x111
    [ 461.671234] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.677299] [] ?
    nf_bridge_update_protocol+0x20/0x20 [bridge]
    [ 461.684891] [] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
    [ 461.691520] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.697572] [] ? NF_HOOK.constprop.8+0x3c/0x56
    [bridge]
    [ 461.704616] [] ?
    nf_bridge_push_encap_header+0x1c/0x26 [bridge]
    [ 461.712329] [] ? br_nf_forward_finish+0x8a/0x95
    [bridge]
    [ 461.719490] [] ?
    nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
    [ 461.727223] [] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
    [ 461.734292] [] ? nf_iterate+0x41/0x77
    [ 461.739758] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.746203] [] ? nf_hook_slow+0x73/0x111
    [ 461.751950] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.758378] [] ? NF_HOOK.constprop.4+0x56/0x56
    [bridge]

    This is caused by the bridge netfilter's special dst_entry
    (fake_rtable), a shared entry to which attaching an inetpeer makes no
    sense.

    The problem has been present since commit 87c48fa3b46 (ipv6: make
    fragment identifications less predictable).

    Introduce a DST_NOPEER dst flag and make sure ipv6_select_ident() and
    __ip_select_ident() fall back to the 'no peer attached' handling.

    Reported-by: Chris Boot
    Tested-by: Chris Boot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit bb3c36863e8001fc21a88bebfdead4da4c23e848 ]

    After commit 8e2ec639173f325977818c45011ee176ef2b11f6 ("ipv6: don't
    use inetpeer to store metrics for routes.") the test in rt6_alloc_cow()
    for setting the ANYCAST flag is now wrong.

    'rt' will now always have a plen of 128, because it is set explicitly
    to 128 by ip6_rt_copy().

    So to restore the semantics of the test, check the destination prefix
    length of 'ort'.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit d01ff0a049f749e0bf10a35bb23edd012718c8c2 ]

    After resetting ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
    we should flush the route cache; otherwise it will continue to accept
    packets with a local source address that should instead be dropped.

    Signed-off-by: Weiping Pan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Weiping Pan
     
  • [ Upstream commit c0ed1c14a72ca9ebacd51fb94a8aca488b0d361e ]

    flow_cache_flush() might sleep but can be called from
    atomic context via the xfrm garbage collector. So add
    a flow_cache_flush_deferred() function and use it when
    the xfrm garbage collector is invoked from within the
    packet path (a sketch of the pattern follows this entry).

    Signed-off-by: Steffen Klassert
    Acked-by: Timo Teräs
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Steffen Klassert
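
    A sketch of the deferral pattern (the shape is assumed rather than
    quoted from net/core/flow.c): the sleeping flush is pushed to a
    workqueue so that atomic-context callers never sleep themselves.

    #include <linux/workqueue.h>

    extern void flow_cache_flush(void);     /* the existing, possibly-sleeping flush */

    static void flow_cache_flush_task(struct work_struct *work)
    {
            flow_cache_flush();             /* runs in process context, may sleep */
    }

    static DECLARE_WORK(flow_cache_flush_work, flow_cache_flush_task);

    void flow_cache_flush_deferred(void)
    {
            schedule_work(&flow_cache_flush_work);  /* safe from atomic context */
    }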
     
  • [ Upstream commit a76c0adf60f6ca5ff3481992e4ea0383776b24d2 ]

    When checking whether a DATA chunk fits into the estimated rwnd, a
    full sizeof(struct sk_buff) is added to the needed chunk size. This
    quickly exhausts the available rwnd space and leads to packets being
    sent that are far below the PMTU limit, which can badly hurt
    performance (a rough back-of-the-envelope calculation follows this
    entry).

    The reason for this behaviour was to avoid putting too much memory
    pressure on the receiver. The concept is not completely irrational,
    because a Linux receiver does in fact clone an skb for each DATA chunk
    delivered. However, Linux also reserves half the available socket
    buffer space for data structures, so this usage is already accounted
    for.

    When this change was last proposed, it was noted that the behaviour
    had been introduced to solve a performance issue caused by rwnd
    overuse in combination with small DATA chunks.

    Trying to reproduce this I found that with the sk_buff overhead removed,
    the performance would improve significantly unless socket buffer limits
    are increased.

    The following numbers have been gathered using a patched iperf
    supporting SCTP over a live 1 Gbit ethernet network. The -l option
    was used to limit DATA chunk sizes. The numbers listed are based on
    the average of 3 test runs each. Default values have been used for
    sk_(r|w)mem.

    Chunk Size    Unpatched         No Overhead
    --------------------------------------------
             4     15.2 Kbit [!]     12.2 Mbit [!]
             8     35.8 Kbit [!]     26.0 Mbit [!]
            16     95.5 Kbit [!]     54.4 Mbit [!]
            32    106.7 Mbit        102.3 Mbit
            64    189.2 Mbit        188.3 Mbit
           128    331.2 Mbit        334.8 Mbit
           256    537.7 Mbit        536.0 Mbit
           512    766.9 Mbit        766.6 Mbit
          1024    810.1 Mbit        808.6 Mbit

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Thomas Graf
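
    A rough back-of-the-envelope calculation behind the small-chunk rows
    above (the sk_buff size is an assumption for illustration, not a
    value measured on this kernel):

    /* Assume sizeof(struct sk_buff) is roughly 200 bytes.
     *
     *      4-byte chunk:   charged    4 + ~200 = ~204 bytes of rwnd,
     *                      so ~98% of the window is pure overhead.
     *   1024-byte chunk:   charged 1024 + ~200 = ~1224 bytes of rwnd,
     *                      so only ~16% of the window is overhead.
     *
     * With a default-sized receive window, tiny chunks therefore exhaust
     * the rwnd after a few kilobytes of payload, which is consistent with
     * the Kbit-range throughput in the unpatched column. */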
     
  • [ Upstream commit 2692ba61a82203404abd7dd2a027bda962861f74 ]

    Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
    limiting the autoclose value. If userspace passes in -1 on a 32-bit
    platform, the overflow check doesn't work and autoclose gets set to
    0xffffffff (a toy illustration of the overflow follows this entry).

    This patch defines a max_autoclose (in seconds) for limiting the value
    and exposes it through sysctl, with the following intentions.

    1) Avoid overflowing autoclose * HZ.

    2) Keep the default autoclose bound consistent across 32- and 64-bit
    platforms (INT_MAX / HZ in this patch).

    3) Keep the autoclose value consistent between setsockopt() and
    getsockopt() calls.

    Suggested-by: Vlad Yasevich
    Signed-off-by: Xi Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xi Wang
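
    A toy userspace illustration of the overflow being closed (the HZ
    value and the 32-bit jiffies width are assumptions of the example,
    not kernel code):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
            const unsigned int HZ = 250;                /* example tick rate */
            unsigned int autoclose = (unsigned int)-1;  /* userspace passed -1 */
            unsigned int max_autoclose = INT_MAX / HZ;  /* bound added by the patch */

            /* emulate a 32-bit unsigned long: the multiplication wraps */
            printf("unclamped: %u jiffies\n", autoclose * HZ);

            if (autoclose > max_autoclose)
                    autoclose = max_autoclose;          /* keeps autoclose * HZ <= INT_MAX */
            printf("clamped:   %u jiffies\n", autoclose * HZ);
            return 0;
    }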
     
  • [ Upstream commit 3f1e6d3fd37bd4f25e5b19f1c7ca21850426c33f ]

    gred_change_vq() is called under sch_tree_lock(sch).

    This means a spinlock is held, and we are not allowed to sleep in this
    context.

    We might pre-allocate memory using GFP_KERNEL before taking the
    spinlock, but that is not suitable for stable material (a schematic of
    both options follows this entry).

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
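
    A schematic of the two approaches mentioned above, in kernel style
    (the allocation and locking primitives are real, but the surrounding
    shape is assumed, not the literal sch_gred.c change):

    /* Stable-friendly: allocate atomically while sch_tree_lock() is held.
     *
     *         q = kzalloc(sizeof(*q), GFP_ATOMIC);    never sleeps
     *         if (!q)
     *                 return -ENOMEM;                 fail under memory pressure
     *
     * Nicer long term: pre-allocate before taking the lock.
     *
     *         prealloc = kzalloc(sizeof(*prealloc), GFP_KERNEL);   may sleep
     *         sch_tree_lock(sch);
     *         ... install prealloc, or free it if it was not needed ...
     *         sch_tree_unlock(sch);
     */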
     
  • [ Upstream commit cd7816d14953c8af910af5bb92f488b0b277e29d ]

    The previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c made
    IP-Config wait for carrier on at least one network device.

    Before waiting (for a predefined 120 seconds), check that at least one
    device was successfully brought up. Otherwise (e.g. with a buggy
    bootloader which does not set the MAC address) there is no point in
    waiting for carrier.

    Cc: Micha Nelissen
    Cc: Holger Brunck
    Signed-off-by: Gerlando Falauto
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Gerlando Falauto
     
  • [ Upstream commit 7838f2ce36b6ab5c13ef20b1857e3bbd567f1759 ]

    Userspace may not provide TCA_OPTIONS; in fact, tc currently does
    not do so if no arguments are specified on the command line.
    Return EINVAL instead of panicking (a sketch of the check follows
    this entry).

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Thomas Graf
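
    A sketch of the defensive check (generic names; the real change lives
    in the qdisc's netlink init/change handler):

    #include <errno.h>
    #include <stddef.h>

    struct nlattr;                          /* opaque netlink attribute */

    /* 'opt' carries TCA_OPTIONS - tc may legitimately omit it */
    static int qdisc_change_sketch(struct nlattr *opt)
    {
            if (opt == NULL)
                    return -EINVAL;         /* reject cleanly instead of oopsing */

            /* ... parse the nested attributes inside opt ... */
            return 0;
    }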
     
  • [ Upstream commit 9cef310fcdee12b49b8b4c96fd8f611c8873d284 ]

    Received non-stream-protocol packets called llc_cmsg_rcv(), which used
    an skb after that skb had been released by sk_eat_skb(). This caused
    received STP packets to generate kernel panics.

    Signed-off-by: Alexandru Juncu
    Signed-off-by: Kunjan Naik
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alex Juncu
     
  • [ Upstream commit a454daceb78844a09c08b6e2d8badcb76a5d73b9 ]

    Signed-off-by: Djalal Harouni
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Djalal Harouni
     
  • [ Upstream commit a03ffcf873fe0f2565386ca8ef832144c42e67fa ]

    The x86 jump instruction is 2 or 5 bytes (near/long jump), not 2 or 6
    bytes.

    When a conditional jump is followed by a long jump, the conditional
    jump's target ends up one byte past the start of the target
    instruction (the encodings are spelled out after this entry).

    Signed-off-by: Markus Kötter
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Markus Kötter
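
    For reference, the standard x86 branch encodings behind that
    statement (not quoted from the JIT source):

    /*   jmp rel8     EB xx               2 bytes  (short jump)
     *   jmp rel32    E9 xx xx xx xx      5 bytes  (long jump - not 6)
     *   jcc rel8     7x xx               2 bytes  (short conditional jump)
     *   jcc rel32    0F 8x xx xx xx xx   6 bytes  (long conditional jump)
     *
     * Sizing an unconditional long jump as 6 bytes makes any branch aimed
     * just past it land one byte beyond the start of the instruction it
     * was meant to reach. */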
     
  • [ A combination of upstream commits 1d299bc7732c34d85bd43ac1a8745f5a2fed2078 and
    e88d2468718b0789b4c33da2f7e1cef2a1eee279 ]

    Although we provide a proper way for a debugger to control whether
    syscall restart occurs, we run into problems because orig_i0 is not
    saved and restored properly.

    Luckily we can solve this problem without having to make debuggers
    aware of the issue. Across system calls, several registers are
    considered volatile and can be safely clobbered.

    Therefore we use the pt_regs save area of one of those registers, %g6,
    as a place to save and restore orig_i0.

    Debuggers transparently will do the right thing because they save and
    restore this register already.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit 2e8ecdc008a16b9a6c4b9628bb64d0d1c05f9f92 ]

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit a52312b88c8103e965979a79a07f6b34af82ca4b ]

    Properly return the original destination buffer pointer (a userspace
    illustration of the required semantics follows this entry).

    Signed-off-by: David S. Miller
    Tested-by: Kjetil Oftedal
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
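
    A userspace illustration of the semantics being restored here
    (standard C behaviour, shown for context rather than taken from the
    sparc assembly):

    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            char dst[16];
            const char *src = "return value";

            /* ISO C: memcpy() returns its first argument, the destination */
            char *ret = memcpy(dst, src, strlen(src) + 1);
            assert(ret == dst);

            printf("%s\n", ret);    /* callers may chain on the returned pointer */
            return 0;
    }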
     
  • [ Upstream commit 21f74d361dfd6a7d0e47574e315f780d8172084a ]

    This is setting things up so that we can correct the return
    value, so that it properly returns the original destination
    buffer pointer.

    Signed-off-by: David S. Miller
    Tested-by: Kjetil Oftedal
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit 045b7de9ca0cf09f1adc3efa467f668b89238390 ]

    Signed-off-by: David S. Miller
    Tested-by: Kjetil Oftedal
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit 3e37fd3153ac95088a74f5e7c569f7567e9f993a ]

    To handle the large physical addresses, just make a simple wrapper
    around remap_pfn_range() like MIPS does.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit 0b64120cceb86e93cb1bda0dc055f13016646907 ]

    Some of the sun4v code patching occurs in inline functions visible
    to, and usable by, modules.

    Therefore we have to patch them up during module load.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit b1f44e13a525d2ffb7d5afe2273b7169d6f2222e ]

    The "(insn & 0x01800000) != 0x01800000" test matches 'restore'
    but that is a legitimate place to see the %lo() part of a 32-bit
    symbol relocation, particularly in tail calls.

    Signed-off-by: David S. Miller
    Tested-by: Sergei Trofimovich
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • [ Upstream commit 7cc8583372a21d98a23b703ad96cab03180b5030 ]

    This was silently working for many years and stopped working on
    Niagara-T3 machines.

    We need to set the MSIQ to VALID before we can set its state to IDLE.

    On Niagara-T3, setting the state to IDLE first was causing HV_EINVAL
    errors. The hypervisor documentation says, rather ambiguously, that
    the MSIQ must be "initialized" before one can set the state.

    I previously understood this to mean merely that a successful setconf()
    operation has been performed on the MSIQ, which we have done at this
    point. But it seems to also mean that it has been set VALID too.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David S. Miller
     
  • Upstream commit: 911ae9434f83e7355d343f6c2be3ef5b00ea7aed

    There's a bug in the MSIX backup and restore routines that causes a
    crash on non-x86 (direct access to PCI space, not via read/write).
    These routines are unnecessary and were removed by the above commit,
    so also remove them from stable to fix the crash.

    Signed-off-by: Nagalakshmi Nandigama
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

    Nagalakshmi Nandigama
     
  • commit e26a51148f3ebd859bca8bf2e0f212839b447f62 upstream.

    commit 8aacc9f550 ("mm/mempolicy.c: fix pgoff in mbind vma merge") is
    a slightly incorrect fix.

    Why? Consider the following case.

    1. map 4 pages of a file at offset 0

    [0123]

    2. map 2 pages just after the first mapping of the same file but with
    page offset 2

    [0123][23]

    3. mbind() 2 pages from the first mapping at offset 2.
    mbind_range() should treat the new vma as

    [0123][23]
      |23|
      mbind vma

    but it does

    [0123][23]
    |01|
    mbind vma

    Oops. It then does the wrong vma merge and splitting ([01][0123] or
    similar).

    This patch fixes it.

    [testcase]
    test result - before the patch

    case4: 126: test failed. expect '2,4', actual '2,2,2'
    case5: passed
    case6: passed
    case7: passed
    case8: passed
    case_n: 246: test failed. expect '4,2', actual '1,4'

    ------------[ cut here ]------------
    kernel BUG at mm/filemap.c:135!
    invalid opcode: 0000 [#4] SMP DEBUG_PAGEALLOC

    (snip long bug on messages)

    test result - after the patch

    case4: passed
    case5: passed
    case6: passed
    case7: passed
    case8: passed
    case_n: passed

    source: mbind_vma_test.c
    ============================================================
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <numa.h>
    #include <numaif.h>

    static unsigned long pagesize;
    void* mmap_addr;
    struct bitmask *nmask;
    char buf[1024];
    FILE *file;
    char retbuf[10240] = "";
    int mapped_fd;

    char *rubysrc = "ruby -e '\
            pid = %d; \
            vstart = 0x%llx; \
            vend = 0x%llx; \
            s = `pmap -q #{pid}`; \
            rary = []; \
            s.each_line {|line|; \
                    ary=line.split(\" \"); \
                    addr = ary[0].to_i(16); \
                    if(addr >= vstart and addr < vend) then \
                            rary.push(ary[1].to_i()/4); \
                    end; \
            }; \
            print rary.join(\",\"); \
    '";

    void init(void)
    {
            void* addr;
            char buf[128];

            nmask = numa_allocate_nodemask();
            numa_bitmask_setbit(nmask, 0);

            pagesize = getpagesize();

            sprintf(buf, "%s", "mbind_vma_XXXXXX");
            mapped_fd = mkstemp(buf);
            if (mapped_fd == -1)
                    perror("mkstemp "), exit(1);
            unlink(buf);

            if (lseek(mapped_fd, pagesize*8, SEEK_SET) < 0)
                    perror("lseek "), exit(1);
            if (write(mapped_fd, "\0", 1) < 0)
                    perror("write "), exit(1);

            addr = mmap(NULL, pagesize*8, PROT_NONE,
                        MAP_SHARED, mapped_fd, 0);
            if (addr == MAP_FAILED)
                    perror("mmap "), exit(1);

            if (mprotect(addr+pagesize, pagesize*6, PROT_READ|PROT_WRITE) < 0)
                    perror("mprotect "), exit(1);

            mmap_addr = addr + pagesize;

            /* make page populate */
            memset(mmap_addr, 0, pagesize*6);
    }

    void fin(void)
    {
            void* addr = mmap_addr - pagesize;
            munmap(addr, pagesize*8);

            memset(buf, 0, sizeof(buf));
            memset(retbuf, 0, sizeof(retbuf));
    }

    void mem_bind(int index, int len)
    {
            int err;

            err = mbind(mmap_addr+pagesize*index, pagesize*len,
                        MPOL_BIND, nmask->maskp, nmask->size, 0);
            if (err)
                    perror("mbind "), exit(err);
    }

    void mem_interleave(int index, int len)
    {
            int err;

            err = mbind(mmap_addr+pagesize*index, pagesize*len,
                        MPOL_INTERLEAVE, nmask->maskp, nmask->size, 0);
            if (err)
                    perror("mbind "), exit(err);
    }

    void mem_unbind(int index, int len)
    {
            int err;

            err = mbind(mmap_addr+pagesize*index, pagesize*len,
                        MPOL_DEFAULT, NULL, 0, 0);
            if (err)
                    perror("mbind "), exit(err);
    }

    void Assert(char *expected, char *value, char *name, int line)
    {
            if (strcmp(expected, value) == 0) {
                    fprintf(stderr, "%s: passed\n", name);
                    return;
            }
            else {
                    fprintf(stderr, "%s: %d: test failed. expect '%s', actual '%s'\n",
                            name, line,
                            expected, value);
                    // exit(1);
            }
    }

    /*
    AAAA
    PPPPPPNNNNNN
    might become
    PPNNNNNNNNNN
    case 4 below
    */
    void case4(void)
    {
            init();
            sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

            mem_bind(0, 4);
            mem_unbind(2, 2);

            file = popen(buf, "r");
            fread(retbuf, sizeof(retbuf), 1, file);
            Assert("2,4", retbuf, "case4", __LINE__);

            fin();
    }

    /*
    AAAA
    PPPPPPNNNNNN
    might become
    PPPPPPPPPPNN
    case 5 below
    */
    void case5(void)
    {
            init();
            sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

            mem_bind(0, 2);
            mem_bind(2, 2);

            file = popen(buf, "r");
            fread(retbuf, sizeof(retbuf), 1, file);
            Assert("4,2", retbuf, "case5", __LINE__);

            fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPPPPPPPPP 6
    */
    void case6(void)
    {
            init();
            sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

            mem_bind(0, 2);
            mem_bind(4, 2);
            mem_bind(2, 2);

            file = popen(buf, "r");
            fread(retbuf, sizeof(retbuf), 1, file);
            Assert("6", retbuf, "case6", __LINE__);

            fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPPPPPXXXX 7
    */
    void case7(void)
    {
            init();
            sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

            mem_bind(0, 2);
            mem_interleave(4, 2);
            mem_bind(2, 2);

            file = popen(buf, "r");
            fread(retbuf, sizeof(retbuf), 1, file);
            Assert("4,2", retbuf, "case7", __LINE__);

            fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPNNNNNNNN 8
    */
    void case8(void)
    {
            init();
            sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

            mem_bind(0, 2);
            mem_interleave(4, 2);
            mem_interleave(2, 2);

            file = popen(buf, "r");
            fread(retbuf, sizeof(retbuf), 1, file);
            Assert("2,4", retbuf, "case8", __LINE__);

            fin();
    }

    void case_n(void)
    {
            init();
            sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

            /* make redundant mappings [0][1234][34][7] */
            mmap(mmap_addr + pagesize*4, pagesize*2, PROT_READ|PROT_WRITE,
                 MAP_FIXED|MAP_SHARED, mapped_fd, pagesize*3);

            /* Expect to do nothing. */
            mem_unbind(2, 2);

            file = popen(buf, "r");
            fread(retbuf, sizeof(retbuf), 1, file);
            Assert("4,2", retbuf, "case_n", __LINE__);

            fin();
    }

    int main(int argc, char** argv)
    {
            case4();
            case5();
            case6();
            case7();
            case8();
            case_n();

            return 0;
    }
    =============================================================

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Johannes Weiner
    Cc: Minchan Kim
    Cc: Caspar Zhang
    Cc: KOSAKI Motohiro
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    KOSAKI Motohiro
     
  • commit b0365c8d0cb6e79eb5f21418ae61ab511f31b575 upstream.

    If a huge page is enqueued under the protection of hugetlb_lock, then the
    operation is atomic and safe.

    Signed-off-by: Hillf Danton
    Reviewed-by: Michal Hocko
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Hillf Danton
     
  • commit 77e00f2ea94abee1ad13bdfde19cf7aa25992b0e upstream.

    We already do this for cayman; it needs to be done for
    BTC parts as well. The default memory and voltage setup is not
    adequate for advanced operation. Continuing would
    result in an unusable display.

    Signed-off-by: Alex Deucher
    Cc: Jean Delvare
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

    Alex Deucher
     
  • commit e67d668e147c3b4fec638c9e0ace04319f5ceccd upstream.

    This patch makes use of the set_memory_x() kernel API in order
    to make necessary BIOS calls to source NMIs.

    This is needed for SLES11 SP2 and the latest upstream kernel as it appears
    the NX Execute Disable has grown in its control.

    Signed-off-by: Thomas Mingarelli
    Signed-off-by: Wim Van Sebroeck
    Signed-off-by: Greg Kroah-Hartman

    Mingarelli, Thomas
     
  • commit e6780f7243eddb133cc20ec37fa69317c218b709 upstream.

    It was found (by Sasha) that if you use a futex located in the gate
    area we get stuck in an uninterruptible infinite loop, much like the
    ZERO_PAGE issue.

    While looking at this problem, PeterZ realized you'll get into similar
    trouble when hitting any install_special_pages() mapping. And are there
    still drivers setting up their own special mmaps without page->mapping,
    and without special VM or pte flags to make get_user_pages fail?

    In most cases, if page->mapping is NULL, we do not need to retry at all:
    Linus points out that even /proc/sys/vm/drop_caches poses no problem,
    because it ends up using remove_mapping(), which takes care not to
    interfere when the page reference count is raised.

    But there is still one case which does need a retry: if memory pressure
    called shmem_writepage in between get_user_pages_fast dropping page
    table lock and our acquiring page lock, then the page gets switched from
    filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
    Fault it back in to get the page->mapping needed for key->shared.inode.

    Reported-by: Sasha Levin
    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Hugh Dickins
     
  • commit 55205c916e179e09773d98d290334d319f45ac6b upstream.

    This change fixes a linking problem, which happens if oprofile
    is selected to be compiled as built-in:

    `oprofile_arch_exit' referenced in section `.init.text' of
    arch/arm/oprofile/built-in.o: defined in discarded section
    `.exit.text' of arch/arm/oprofile/built-in.o

    The problem appeared after commit 87121ca504, which introduced
    oprofile_arch_exit() calls from an __init function (a schematic of
    the section mismatch follows this entry). Note that the
    aforementioned commit has been backported to stable branches, and
    the problem is known to be reproducible at least with the 3.0.13
    and 3.1.5 kernels.

    Signed-off-by: Vladimir Zapolskiy
    Signed-off-by: Robert Richter
    Cc: Will Deacon
    Cc: oprofile-list
    Link: http://lkml.kernel.org/r/20111222151540.GB16765@erda.amd.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Vladimir Zapolskiy
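
    A schematic of the underlying section mismatch (generic names, not
    the oprofile sources): when the code is built in, .exit.text is
    discarded at link time, so nothing reachable from .init.text may
    live there.

    #include <linux/init.h>
    #include <linux/module.h>

    static void __exit foo_arch_exit(void)  /* placed in the discarded .exit.text */
    {
            /* free counters, unregister callbacks, ... */
    }

    static int __init foo_init(void)
    {
            int err = -ENODEV;              /* pretend registration failed */

            if (err)
                    foo_arch_exit();        /* built-in: reference from .init.text
                                               into a discarded section */
            return err;
    }
    module_init(foo_init);

    /* The fix drops the __exit annotation from the function used on the
     * error path, so it is kept even when the code is built in. */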
     
  • commit 3b6e3c73851a9a4b0e6ed9d378206341dd65e8a5 upstream.

    When a cmd irq was received during an ongoing DMA data
    transfer, the DMA job was never terminated. This is now
    corrected.

    Tested-by: Linus Walleij
    Signed-off-by: Per Forlin
    Signed-off-by: Ulf Hansson
    Signed-off-by: Russell King
    Signed-off-by: Greg Kroah-Hartman

    Ulf Hansson