08 Aug, 2014

34 commits

  • commit 6d2b6170c8914c6c69256b687651fb16d7ec3e18 upstream.

    Fix the broken check for calling sys_fallocate() on an active swapfile,
    introduced by commit 0790b31b69374ddadefe ("fs: disallow all fallocate
    operation on active swapfile").

    Signed-off-by: Eric Biggers
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 23d9cec07c589276561c13b180577c0b87930140 upstream.

    The DRA74/72 control module pins have a weak pull up and pull down.
    This is configured by bit offset 17. If BIT(17) is 1, a pull up is
    selected, else a pull down is selected.

    However, this pull resistor is applied based on BIT(16) -
    PULLUDENABLE - if BIT(16) is *0*, then pull as defined in BIT(17) is
    applied, else no weak pulls are applied. We defined this in reverse.

    Reference: Table 18-5 (Description of the pad configuration register
    bits) in Technical Reference Manual Revision (DRA74x revision Q:
    SPRUHI2Q Revised June 2014 and DRA72x revision F: SPRUHP2F - Revised
    June 2014)
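
    As a rough illustration of the bit semantics described in this
    message, the sketch below decodes a pad configuration value. The
    macro, enum and function names are hypothetical and are not taken
    from the dts headers; only the bit positions (16 and 17) come from
    the text.

```c
#include <stdint.h>

/* Bit positions as described in the commit message. */
#define PAD_PULL_DISABLE (UINT32_C(1) << 16) /* PULLUDENABLE: 0 = weak pull applied */
#define PAD_PULL_UP      (UINT32_C(1) << 17) /* 1 = pull up, 0 = pull down */

/* Hypothetical result type, for illustration only. */
enum pad_pull { PAD_PULL_NONE, PAD_PULL_IS_DOWN, PAD_PULL_IS_UP };

/* Decode the weak pull selected by a pad configuration register value. */
static enum pad_pull pad_decode_pull(uint32_t reg)
{
    if (reg & PAD_PULL_DISABLE)
        return PAD_PULL_NONE;   /* bit 16 set: no weak pull at all */
    return (reg & PAD_PULL_UP) ? PAD_PULL_IS_UP : PAD_PULL_IS_DOWN;
}
```

    Note that the enable bit is active low, which is the detail the
    original definitions had reversed.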

    Fixes: 6e58b8f1daaf1a ("ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board")
    Signed-off-by: Nishanth Menon
    Tested-by: Felipe Balbi
    Acked-by: Felipe Balbi
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Nishanth Menon
     
  • commit 7209a75d2009dbf7745e2fd354abf25c3deb3ca3 upstream.

    This moves the espfix64 logic into native_iret. To make this work,
    it gets rid of the native patch for INTERRUPT_RETURN:
    INTERRUPT_RETURN on native kernels is now 'jmp native_iret'.

    This changes the 16-bit SS behavior on Xen from OOPSing to leaking
    some bits of the Xen hypervisor's RSP (I think).

    [ hpa: this is a nonzero cost on native, but probably not enough to
    measure. Xen needs to fix this in their own code, probably doing
    something equivalent to espfix64. ]

    Signed-off-by: Andy Lutomirski
    Link: http://lkml.kernel.org/r/7b8f1d8ef6597cb16ae004a43c56980a7de3cf94.1406129132.git.luto@amacapital.net
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 34273f41d57ee8d854dcd2a1d754cbb546cb548f upstream.

    Embedded systems, which may be very memory-size-sensitive, are
    extremely unlikely to ever encounter any 16-bit software, so make it
    a CONFIG_EXPERT option to turn off support for any 16-bit software
    whatsoever.

    Signed-off-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit 197725de65477bc8509b41388157c1a2283542bb upstream.

    Make espfix64 a hidden Kconfig option. This fixes the x86-64 UML
    build which had broken due to the non-existence of init_espfix_bsp()
    in UML: since UML uses its own Kconfig, this option does not appear in
    the UML build.

    This also makes it possible to make support for 16-bit segments a
    configuration option, for the people who want to minimize the size of
    the kernel.

    Reported-by: Ingo Molnar
    Signed-off-by: H. Peter Anvin
    Cc: Richard Weinberger
    Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit 20b68535cd27183ebd3651ff313afb2b97dac941 upstream.

    Header guard is #ifndef, not #ifdef...

    Reported-by: Fengguang Wu
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit e1fe9ed8d2a4937510d0d60e20705035c2609aea upstream.

    Sparse warns that the percpu variables aren't declared before they are
    defined. Rather than hacking around it, move espfix definitions into
    a proper header file.

    Reported-by: Fengguang Wu
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit 3891a04aafd668686239349ea58f3314ea2af86b upstream.

    The IRET instruction, when returning to a 16-bit segment, only
    restores the bottom 16 bits of the user space stack pointer. This
    causes some 16-bit software to break, but it also leaks kernel state
    to user space. We have a software workaround for that ("espfix") for
    the 32-bit kernel, but it relies on a nonzero stack segment base which
    is not available in 64-bit mode.

    In checkin:

    b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

    we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
    the logic that 16-bit support is crippled on 64-bit kernels anyway (no
    V86 support), but it turns out that people are doing stuff like
    running old Win16 binaries under Wine and expect it to work.

    This works around this by creating percpu "ministacks", each of which
    is mapped 2^16 times 64K apart. When we detect that the return SS is
    on the LDT, we copy the IRET frame to the ministack and use the
    relevant alias to return to userspace. The ministacks are mapped
    readonly, so if IRET faults we promote #GP to #DF which is an IST
    vector and thus has its own stack; we then do the fixup in the #DF
    handler.

    (Making #GP an IST exception would make the msr_safe functions unsafe
    in NMI/MC context, and quite possibly have other effects.)
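
    The arithmetic behind the leak and the aliasing trick can be
    sketched in plain C. This is only a model under stated assumptions:
    the function names are hypothetical, and ministack_base is assumed
    to be an address whose bits 31:16 are clear so that the alias is
    selected by those bits alone. It is not the kernel's entry code.

```c
#include <stdint.h>

/*
 * What a 16-bit stack segment observes after IRET without a workaround:
 * only the low 16 bits come from the saved user stack pointer, the upper
 * 16 bits of ESP are left over from the kernel stack pointer.
 */
static uint32_t esp_seen_after_iret16(uint64_t kernel_sp, uint32_t user_sp)
{
    return ((uint32_t)kernel_sp & 0xffff0000u) | (user_sp & 0x0000ffffu);
}

/*
 * Pick the ministack alias whose bits 31:16 match the user stack pointer.
 * Aliases are 64K apart, so those bits select which alias is used.
 * ministack_base is a hypothetical address with bits 31:16 clear.
 */
static uint64_t espfix_alias(uint64_t ministack_base, uint32_t user_sp)
{
    return ministack_base | (user_sp & 0xffff0000u);
}
```

    Returning from the chosen alias means the bits that IRET fails to
    restore already equal the user's own upper bits, so nothing about
    the kernel stack location is revealed.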

    Special thanks to:

    - Andy Lutomirski, for the suggestion of using very small stack slots
    and copy (as opposed to map) the IRET frame there, and for the
    suggestion to mark them readonly and let the fault promote to #DF.
    - Konrad Wilk for paravirt fixup and testing.
    - Borislav Petkov for testing help and useful comments.

    Reported-by: Brian Gerst
    Signed-off-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
    Cc: Konrad Rzeszutek Wilk
    Cc: Borislav Petkov
    Cc: Andrew Lutomirski
    Cc: Linus Torvalds
    Cc: Dirk Hohndel
    Cc: Arjan van de Ven
    Cc: comex
    Cc: Alexander van Heukelum
    Cc: Boris Ostrovsky
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit 7ed6fb9b5a5510e4ef78ab27419184741169978a upstream.

    This reverts commit fa81511bb0bbb2b1aace3695ce869da9762624ff in
    preparation of merging in the proper fix (espfix64).

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin
     
  • commit 504d58745c9ca28d33572e2d8a9990b43e06075d upstream.

    clockevents_increase_min_delta() calls printk() from under
    hrtimer_bases.lock. That causes lock inversion on scheduler locks because
    printk() can call into the scheduler. Lockdep puts it as:

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.15.0-rc8-06195-g939f04b #2 Not tainted
    -------------------------------------------------------
    trinity-main/74 is trying to acquire lock:
    (&port_lock_key){-.....}, at: [] serial8250_console_write+0x8c/0x10c

    but task is already holding lock:
    (hrtimer_bases.lock){-.-...}, at: [] hrtimer_try_to_cancel+0x13/0x66

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #5 (hrtimer_bases.lock){-.-...}:
    [] lock_acquire+0x92/0x101
    [] _raw_spin_lock_irqsave+0x2e/0x3e
    [] __hrtimer_start_range_ns+0x1c/0x197
    [] perf_swevent_start_hrtimer.part.41+0x7a/0x85
    [] task_clock_event_start+0x3a/0x3f
    [] task_clock_event_add+0xd/0x14
    [] event_sched_in+0xb6/0x17a
    [] group_sched_in+0x44/0x122
    [] ctx_sched_in.isra.67+0x105/0x11f
    [] perf_event_sched_in.isra.70+0x47/0x4b
    [] __perf_install_in_context+0x8b/0xa3
    [] remote_function+0x12/0x2a
    [] smp_call_function_single+0x2d/0x53
    [] task_function_call+0x30/0x36
    [] perf_install_in_context+0x87/0xbb
    [] SYSC_perf_event_open+0x5c6/0x701
    [] SyS_perf_event_open+0x17/0x19
    [] syscall_call+0x7/0xb

    -> #4 (&ctx->lock){......}:
    [] lock_acquire+0x92/0x101
    [] _raw_spin_lock+0x21/0x30
    [] __perf_event_task_sched_out+0x1dc/0x34f
    [] __schedule+0x4c6/0x4cb
    [] schedule+0xf/0x11
    [] work_resched+0x5/0x30

    -> #3 (&rq->lock){-.-.-.}:
    [] lock_acquire+0x92/0x101
    [] _raw_spin_lock+0x21/0x30
    [] __task_rq_lock+0x33/0x3a
    [] wake_up_new_task+0x25/0xc2
    [] do_fork+0x15c/0x2a0
    [] kernel_thread+0x1a/0x1f
    [] rest_init+0x1a/0x10e
    [] start_kernel+0x303/0x308
    [] i386_start_kernel+0x79/0x7d

    -> #2 (&p->pi_lock){-.-...}:
    [] lock_acquire+0x92/0x101
    [] _raw_spin_lock_irqsave+0x2e/0x3e
    [] try_to_wake_up+0x1d/0xd6
    [] default_wake_function+0xb/0xd
    [] __wake_up_common+0x39/0x59
    [] __wake_up+0x29/0x3b
    [] tty_wakeup+0x49/0x51
    [] uart_write_wakeup+0x17/0x19
    [] serial8250_tx_chars+0xbc/0xfb
    [] serial8250_handle_irq+0x54/0x6a
    [] serial8250_default_handle_irq+0x19/0x1c
    [] serial8250_interrupt+0x38/0x9e
    [] handle_irq_event_percpu+0x5f/0x1e2
    [] handle_irq_event+0x2c/0x43
    [] handle_level_irq+0x57/0x80
    [] handle_irq+0x46/0x5c
    [] do_IRQ+0x32/0x89
    [] common_interrupt+0x2e/0x33
    [] _raw_spin_unlock_irqrestore+0x3f/0x49
    [] uart_start+0x2d/0x32
    [] uart_write+0xc7/0xd6
    [] n_tty_write+0xb8/0x35e
    [] tty_write+0x163/0x1e4
    [] redirected_tty_write+0x6d/0x75
    [] vfs_write+0x75/0xb0
    [] SyS_write+0x44/0x77
    [] syscall_call+0x7/0xb

    -> #1 (&tty->write_wait){-.....}:
    [] lock_acquire+0x92/0x101
    [] _raw_spin_lock_irqsave+0x2e/0x3e
    [] __wake_up+0x15/0x3b
    [] tty_wakeup+0x49/0x51
    [] uart_write_wakeup+0x17/0x19
    [] serial8250_tx_chars+0xbc/0xfb
    [] serial8250_handle_irq+0x54/0x6a
    [] serial8250_default_handle_irq+0x19/0x1c
    [] serial8250_interrupt+0x38/0x9e
    [] handle_irq_event_percpu+0x5f/0x1e2
    [] handle_irq_event+0x2c/0x43
    [] handle_level_irq+0x57/0x80
    [] handle_irq+0x46/0x5c
    [] do_IRQ+0x32/0x89
    [] common_interrupt+0x2e/0x33
    [] _raw_spin_unlock_irqrestore+0x3f/0x49
    [] uart_start+0x2d/0x32
    [] uart_write+0xc7/0xd6
    [] n_tty_write+0xb8/0x35e
    [] tty_write+0x163/0x1e4
    [] redirected_tty_write+0x6d/0x75
    [] vfs_write+0x75/0xb0
    [] SyS_write+0x44/0x77
    [] syscall_call+0x7/0xb

    -> #0 (&port_lock_key){-.....}:
    [] __lock_acquire+0x9ea/0xc6d
    [] lock_acquire+0x92/0x101
    [] _raw_spin_lock_irqsave+0x2e/0x3e
    [] serial8250_console_write+0x8c/0x10c
    [] call_console_drivers.constprop.31+0x87/0x118
    [] console_unlock+0x1d7/0x398
    [] vprintk_emit+0x3da/0x3e4
    [] printk+0x17/0x19
    [] clockevents_program_min_delta+0x104/0x116
    [] clockevents_program_event+0xe7/0xf3
    [] tick_program_event+0x1e/0x23
    [] hrtimer_force_reprogram+0x88/0x8f
    [] __remove_hrtimer+0x5b/0x79
    [] hrtimer_try_to_cancel+0x49/0x66
    [] hrtimer_cancel+0xd/0x18
    [] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
    [] task_clock_event_stop+0x20/0x64
    [] task_clock_event_del+0xd/0xf
    [] event_sched_out+0xab/0x11e
    [] group_sched_out+0x1d/0x66
    [] ctx_sched_out+0xaf/0xbf
    [] __perf_event_task_sched_out+0x1ed/0x34f
    [] __schedule+0x4c6/0x4cb
    [] schedule+0xf/0x11
    [] work_resched+0x5/0x30

    other info that might help us debug this:

    Chain exists of:
    &port_lock_key --> &ctx->lock --> hrtimer_bases.lock

    Possible unsafe locking scenario:

    CPU0 CPU1
    ---- ----
    lock(hrtimer_bases.lock);
    lock(&ctx->lock);
    lock(hrtimer_bases.lock);
    lock(&port_lock_key);

    *** DEADLOCK ***

    4 locks held by trinity-main/74:
    #0: (&rq->lock){-.-.-.}, at: [] __schedule+0xed/0x4cb
    #1: (&ctx->lock){......}, at: [] __perf_event_task_sched_out+0x1dc/0x34f
    #2: (hrtimer_bases.lock){-.-...}, at: [] hrtimer_try_to_cancel+0x13/0x66
    #3: (console_lock){+.+...}, at: [] vprintk_emit+0x3c7/0x3e4

    stack backtrace:
    CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04b #2
    00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570
    8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0
    8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003
    Call Trace:
    [] dump_stack+0x16/0x18
    [] print_circular_bug+0x18f/0x19c
    [] __lock_acquire+0x9ea/0xc6d
    [] lock_acquire+0x92/0x101
    [] ? serial8250_console_write+0x8c/0x10c
    [] ? wait_for_xmitr+0x76/0x76
    [] _raw_spin_lock_irqsave+0x2e/0x3e
    [] ? serial8250_console_write+0x8c/0x10c
    [] serial8250_console_write+0x8c/0x10c
    [] ? lock_release+0x191/0x223
    [] ? wait_for_xmitr+0x76/0x76
    [] call_console_drivers.constprop.31+0x87/0x118
    [] console_unlock+0x1d7/0x398
    [] vprintk_emit+0x3da/0x3e4
    [] printk+0x17/0x19
    [] clockevents_program_min_delta+0x104/0x116
    [] tick_program_event+0x1e/0x23
    [] hrtimer_force_reprogram+0x88/0x8f
    [] __remove_hrtimer+0x5b/0x79
    [] hrtimer_try_to_cancel+0x49/0x66
    [] hrtimer_cancel+0xd/0x18
    [] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
    [] task_clock_event_stop+0x20/0x64
    [] task_clock_event_del+0xd/0xf
    [] event_sched_out+0xab/0x11e
    [] group_sched_out+0x1d/0x66
    [] ctx_sched_out+0xaf/0xbf
    [] __perf_event_task_sched_out+0x1ed/0x34f
    [] ? __dequeue_entity+0x23/0x27
    [] ? pick_next_task_fair+0xb1/0x120
    [] __schedule+0x4c6/0x4cb
    [] ? trace_hardirqs_off_caller+0xd7/0x108
    [] ? trace_hardirqs_off+0xb/0xd
    [] ? rcu_irq_exit+0x64/0x77

    Fix the problem by using printk_deferred() which does not call into the
    scheduler.

    Reported-by: Fengguang Wu
    Signed-off-by: Jan Kara
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit f723aa1817dd8f4fe005aab52ba70c8ab0ef9457 upstream.

    During suspend we call sched_clock_poll() to update the epoch and
    accumulated time and reprogram the sched_clock_timer to fire
    before the next wrap-around time. Unfortunately,
    sched_clock_poll() doesn't restart the timer, instead it relies
    on the hrtimer layer to do that and during suspend we aren't
    calling that function from the hrtimer layer. Instead, we're
    reprogramming the expires time while the hrtimer is enqueued,
    which can cause the hrtimer tree to be corrupted. Furthermore, we
    restart the timer during suspend but we update the epoch during
    resume which seems counter-intuitive.

    Let's fix this by saving the accumulated state and canceling the
    timer during suspend. On resume we can update the epoch and
    restart the timer similar to what we would do if we were starting
    the clock for the first time.

    Fixes: a08ca5d1089d "sched_clock: Use an hrtimer instead of timer"
    Signed-off-by: Stephen Boyd
    Signed-off-by: John Stultz
    Link: http://lkml.kernel.org/r/1406174630-23458-1-git-send-email-john.stultz@linaro.org
    Cc: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Stephen Boyd
     
  • commit aac74dc495456412c4130a1167ce4beb6c1f0b38 upstream.

    After learning we'll need some sort of deferred printk functionality in
    the timekeeping core, Peter suggested we rename the printk_sched function
    so it can be reused by needed subsystems.

    This only changes the function name. No logic changes.

    Signed-off-by: John Stultz
    Reviewed-by: Steven Rostedt
    Cc: Jan Kara
    Cc: Peter Zijlstra
    Cc: Jiri Bohac
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    John Stultz
     
  • commit 44fa816bb778edbab6b6ddaaf24908dd6295937e upstream.

    nr_dirty is updated without locking, causing it to drift so that it is
    non-zero (either a small positive integer, or a very large one when an
    underflow occurs) even when there are no actual dirty blocks. This was
    due to a race between the workqueue and map function accessing nr_dirty
    in parallel without proper protection.

    People were seeing under runs due to a race on increment/decrement of
    nr_dirty, see: https://lkml.org/lkml/2014/6/3/648

    Fix this by using an atomic_t for nr_dirty.
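
    A userspace analogue of this kind of fix, using C11 atomics rather
    than the kernel's atomic_t, might look like the sketch below. The
    struct and function names are hypothetical stand-ins and do not
    mirror the driver's code.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the state that owns the dirty counter. */
struct cache_stats {
    atomic_long nr_dirty;
};

/* Called when a block becomes dirty; safe against concurrent callers. */
static void mark_dirty(struct cache_stats *s)
{
    atomic_fetch_add(&s->nr_dirty, 1);
}

/* Called when a dirty block has been written back. */
static void mark_clean(struct cache_stats *s)
{
    atomic_fetch_sub(&s->nr_dirty, 1);
}

/* Read the current number of dirty blocks. */
static long dirty_count(struct cache_stats *s)
{
    return atomic_load(&s->nr_dirty);
}
```

    With a plain integer, two contexts doing a read-modify-write at the
    same time can lose an update, which is how the counter drifts.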

    Reported-by: roma1390@gmail.com
    Signed-off-by: Anssi Hannula
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Anssi Hannula
     
  • commit d8c712ea471ce7a4fd1734ad2211adf8469ddddc upstream.

    1d3d4437eae1 ("vmscan: per-node deferred work") added a flags field to
    struct shrinker assuming that all shrinkers were zero filled. The dm
    bufio shrinker is not zero filled, which leaves arbitrary kmalloc() data
    in flags. So far the only defined flags bit is SHRINKER_NUMA_AWARE.
    But there are proposed patches which add other bits to shrinker.flags
    (e.g. memcg awareness).

    Rather than simply initializing the shrinker, this patch uses kzalloc()
    when allocating the dm_bufio_client to ensure that the embedded shrinker
    and any other similar structures are zeroed.

    This fixes theoretical over aggressive shrinking of dm bufio objects.
    If the uninitialized dm_bufio_client.shrinker.flags contains
    SHRINKER_NUMA_AWARE then shrink_slab() would call the dm shrinker for
    each numa node rather than just once. This has been broken since 3.12.
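
    The same idea in userspace terms, with calloc() standing in for
    kzalloc(): zeroing the whole containing object guarantees that any
    embedded structure starts out with all fields cleared. The struct
    layouts below are hypothetical and much smaller than the real ones.

```c
#include <stdlib.h>

/* Hypothetical embedded structure with a flags field. */
struct shrinker_like {
    unsigned long flags;
    long batch;
};

/* Hypothetical client object that embeds the structure above. */
struct client_like {
    int block_size;
    struct shrinker_like shrinker;
};

/* Allocate a client with every byte zeroed, embedded members included. */
static struct client_like *client_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct client_like));
}
```

    A field added to the embedded structure later is then covered
    automatically, without touching the allocation site again.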

    Signed-off-by: Greg Thelen
    Acked-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Greg Thelen
     
  • commit 61bd55ce1667809f022be88da77db17add90ea4e upstream.

    When creating the demux table we need to iterate over the selected scan mask for
    the buffer to get the samples which should be copied to the destination buffer.
    Right now the code uses the mask which contains all active channels, which means
    the demux table contains entries which cause it to copy all the samples from
    source to destination buffer one by one without doing any demuxing.
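
    A simplified model of the intended behaviour is sketched below. It
    assumes 32-bit masks and ignores per-channel sample sizes, and the
    function name is hypothetical; it only shows which mask drives the
    selection and which mask defines positions in the source scan.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * For each channel selected in buffer_mask, record its position within the
 * source scan, which carries one sample per channel set in active_mask.
 * Returns the number of entries written to src_index.
 */
static size_t build_demux(uint32_t active_mask, uint32_t buffer_mask,
                          size_t *src_index, size_t max_entries)
{
    size_t n = 0;   /* entries produced so far */
    size_t pos = 0; /* position of the current channel in the source scan */

    for (unsigned bit = 0; bit < 32; bit++) {
        uint32_t m = UINT32_C(1) << bit;

        if (!(active_mask & m))
            continue;           /* channel not captured at all */
        if ((buffer_mask & m) && n < max_entries)
            src_index[n++] = pos; /* this buffer wants the sample */
        pos++;
    }
    return n;
}
```

    Passing the active mask as both arguments reproduces the buggy
    behaviour: every sample is copied and nothing is demuxed.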

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman

    Lars-Peter Clausen
     
  • commit 9b2a4d35a6ceaf217be61ed8eb3c16986244f640 upstream.

    val2 should be zero

    This will make no difference for correct inputs but will reject
    incorrect ones with a decimal part in the value written to the sysfs
    interface.
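
    The check amounts to rejecting any write whose fractional part is
    non-zero. A minimal sketch, with a hypothetical function name and a
    plain int standing in for the device setting:

```c
#include <errno.h>

/*
 * Accept only whole-number writes: val is the integer part and val2 the
 * fractional part of the value written by userspace.
 */
static int write_whole_value(int val, int val2, int *stored)
{
    if (val2 != 0)
        return -EINVAL; /* a decimal part is not meaningful here */
    *stored = val;
    return 0;
}
```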

    Signed-off-by: Peter Meerwald
    Cc: Oleksandr Kravchenko
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman

    Peter Meerwald
     
  • commit 381676d5e86596b11e22a62f196e192df6091373 upstream.

    The userspace interface for acceleration sensors is documented as using
    m/s^2 units [Documentation/ABI/testing/sysfs-bus-iio]

    The fullscale raw values for the BMA180 correspond to -/+ 1, 1.5, 2, etc G
    depending on the selected mode.

    The scale table was converting to G rather than m/s^2.
    Change the scaling table to match the documented interface.
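
    The unit conversion involved is a multiplication by standard
    gravity. The sketch below shows only that step; the function name
    is hypothetical and the driver's actual scale table values are not
    reproduced here.

```c
/* Standard gravity in m/s^2. */
#define STANDARD_GRAVITY 9.80665

/* Convert an acceleration expressed in G to m/s^2. */
static double g_to_ms2(double accel_g)
{
    return accel_g * STANDARD_GRAVITY;
}
```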

    See commit 71702e6e, iio: mma8452: Use correct acceleration units,
    for a related fix.

    Signed-off-by: Peter Meerwald
    Cc: Oleksandr Kravchenko
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman

    Peter Meerwald
     
  • commit b6328a07bd6b3d31b64f85864fe74f3b08c010ca upstream.

    The acpi_pnp_match() function is used for finding the ACPI device
    object that should be associated with the given PNP device.
    Unfortunately, the check used by that function is not strict enough
    and may cause success to be returned for a wrong ACPI device object.

    To fix that, use the observation that the pointer to the ACPI
    device object in question is already stored in the data field
    in struct pnp_dev, so acpi_pnp_match() can simply use that
    field to do its job.

    This problem was uncovered in 3.14 by commit 202317a573b2 (ACPI / scan:
    Add acpi_device objects for all device nodes in the namespace).

    Fixes: 202317a573b2 (ACPI / scan: Add acpi_device objects for all device nodes in the namespace)
    Reported-and-tested-by: Vinson Lee
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • commit 4aa0abed3a2a11b7d71ad560c1a3e7631c5a31cd upstream.

    byReAssocCount is incremented every second, resulting in a
    disassociated message being sent every 10 seconds whether
    connected or not.

    byReAssocCount should only advance while eCommandState
    is in WLAN_ASSOCIATE_WAIT

    Change existing scope to if condition.

    Signed-off-by: Malcolm Priestley
    Signed-off-by: Greg Kroah-Hartman

    Malcolm Priestley
     
  • commit 2bcf2e92c3918ce62ab4e934256e47e9a16d19c3 upstream.

    Paul Furtado has reported the following GPF:

    general protection fault: 0000 [#1] SMP
    Modules linked in: ipv6 dm_mod xen_netfront coretemp hwmon x86_pkg_temp_thermal crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode pcspkr ext4 jbd2 mbcache raid0 xen_blkfront
    CPU: 3 PID: 3062 Comm: java Not tainted 3.16.0-rc5 #1
    task: ffff8801cfe8f170 ti: ffff8801d2ec4000 task.ti: ffff8801d2ec4000
    RIP: e030:mem_cgroup_oom_synchronize+0x140/0x240
    RSP: e02b:ffff8801d2ec7d48 EFLAGS: 00010283
    RAX: 0000000000000001 RBX: ffff88009d633800 RCX: 000000000000000e
    RDX: fffffffffffffffe RSI: ffff88009d630200 RDI: ffff88009d630200
    RBP: ffff8801d2ec7da8 R08: 0000000000000012 R09: 00000000fffffffe
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff88009d633800
    R13: ffff8801d2ec7d48 R14: dead000000100100 R15: ffff88009d633a30
    FS: 00007f1748bb4700(0000) GS:ffff8801def80000(0000) knlGS:0000000000000000
    CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f4110300308 CR3: 00000000c05f7000 CR4: 0000000000002660
    Call Trace:
    pagefault_out_of_memory+0x18/0x90
    mm_fault_error+0xa9/0x1a0
    __do_page_fault+0x478/0x4c0
    do_page_fault+0x2c/0x40
    page_fault+0x28/0x30
    Code: 44 00 00 48 89 df e8 40 ca ff ff 48 85 c0 49 89 c4 74 35 4c 8b b0 30 02 00 00 4c 8d b8 30 02 00 00 4d 39 fe 74 1b 0f 1f 44 00 00 8b 7e 10 be 01 00 00 00 e8 42 d2 04 00 4d 8b 36 4d 39 fe 75
    RIP mem_cgroup_oom_synchronize+0x140/0x240

    Commit fb2a6fc56be6 ("mm: memcg: rework and document OOM waiting and
    wakeup") has moved mem_cgroup_oom_notify outside of memcg_oom_lock
    assuming it is protected by the hierarchical OOM-lock.

    Although this is true for the notification part the protection doesn't
    cover unregistration of event which can happen in parallel now so
    mem_cgroup_oom_notify can see already unlinked and/or freed
    mem_cgroup_eventfd_list.

    Fix this by using memcg_oom_lock also in mem_cgroup_oom_notify.

    Addresses https://bugzilla.kernel.org/show_bug.cgi?id=80881

    Fixes: fb2a6fc56be6 (mm: memcg: rework and document OOM waiting and wakeup)
    Signed-off-by: Michal Hocko
    Reported-by: Paul Furtado
    Tested-by: Paul Furtado
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Michal Hocko
     
  • commit b104a35d32025ca740539db2808aa3385d0f30eb upstream.

    The page allocator relies on __GFP_WAIT to determine if ALLOC_CPUSET
    should be set in allocflags. ALLOC_CPUSET controls if a page allocation
    should be restricted only to the set of allowed cpuset mems.

    Transparent hugepages clears __GFP_WAIT when defrag is disabled to prevent
    the fault path from using memory compaction or direct reclaim. Thus, it
    is unfairly able to allocate outside of its cpuset mems restriction as a
    side-effect.

    This patch ensures that ALLOC_CPUSET is only cleared when the gfp mask is
    truly GFP_ATOMIC by verifying it is also not a thp allocation.
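
    The decision can be modelled as below. The flag values are invented
    for this sketch, and the second flag is only assumed to be some bit
    that identifies a thp allocation; neither is taken from the kernel
    headers.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define SK_GFP_WAIT     0x1u /* allocation may sleep and reclaim */
#define SK_GFP_THP_MARK 0x2u /* assumed bit that identifies a thp allocation */

/* Truly atomic: cannot wait and is not a thp allocation. */
static bool sketch_is_atomic(unsigned int gfp)
{
    return !(gfp & (SK_GFP_WAIT | SK_GFP_THP_MARK));
}

/* The cpuset restriction is dropped only for truly atomic allocations. */
static bool sketch_restrict_to_cpuset(unsigned int gfp)
{
    return !sketch_is_atomic(gfp);
}
```

    A thp allocation with the wait bit cleared therefore stays confined
    to its cpuset, which is the behaviour the patch restores.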

    Signed-off-by: David Rientjes
    Reported-by: Alex Thorlton
    Tested-by: Alex Thorlton
    Cc: Bob Liu
    Cc: Dave Hansen
    Cc: Hedi Berriche
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Srivatsa S. Bhat
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    David Rientjes
     
  • commit f6789593d5cea42a4ecb1cbeab6a23ade5ebbba7 upstream.

    Under memory pressure, it is possible for dirty_thresh, calculated by
    global_dirty_limits() in balance_dirty_pages(), to equal zero. Then, if
    strictlimit is true, bdi_dirty_limits() tries to resolve the proportion:

    bdi_bg_thresh : bdi_thresh = background_thresh : dirty_thresh

    by dividing by zero.
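
    A guarded version of the proportion could be sketched as follows.
    Returning 0 when dirty_thresh is zero is an assumption made for
    this example, the function name is hypothetical, and overflow of
    the intermediate product is ignored.

```c
#include <stdint.h>

/*
 * Solve bdi_bg_thresh : bdi_thresh = background_thresh : dirty_thresh
 * for bdi_bg_thresh, without dividing by zero.
 */
static uint64_t bdi_bg_thresh_sketch(uint64_t bdi_thresh,
                                     uint64_t background_thresh,
                                     uint64_t dirty_thresh)
{
    if (dirty_thresh == 0)
        return 0; /* assumed fallback when the global limit is zero */
    return bdi_thresh * background_thresh / dirty_thresh;
}
```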

    Signed-off-by: Maxim Patlasov
    Acked-by: Rik van Riel
    Cc: Michal Hocko
    Cc: KOSAKI Motohiro
    Cc: Wu Fengguang
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Maxim Patlasov
     
  • commit 89fb4cd1f717a871ef79fa7debbe840e3225cd54 upstream.

    Flush commands don't transfer data and thus need to be special cased
    in the I/O completion handler so that we can propagate errors to
    the block layer and filesystem.

    Signed-off-by: James Bottomley
    Reported-by: Steven Haber
    Tested-by: Steven Haber
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    James Bottomley
     
  • commit 0193ed8225e1a79ed64632106ec3cc81798cb13c upstream.

    This is a bug fix for the situation when function tsi721_desc_get() fails
    to obtain a free transaction descriptor.

    The bug usually results in a memory access crash dump when data transfer
    scatter-gather list has more entries than size of hardware buffer
    descriptors ring. This fix ensures that error is properly returned to a
    caller instead of an invalid entry.

    This patch is applicable to kernel versions starting from v3.5.

    Signed-off-by: Alexandre Bounine
    Cc: Matt Porter
    Cc: Andre van Herk
    Cc: Stef van Os
    Cc: Vinod Koul
    Cc: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Alexandre Bounine
     
  • commit 08b9939997df30e42a228e1ecb97f99e9c8ea84e upstream.

    This reverts commit 277d916fc2e959c3f106904116bb4f7b1148d47a as it was
    at least breaking iwlwifi by setting the IEEE80211_TX_CTL_NO_PS_BUFFER
    flag in all kinds of interface modes, not only for AP mode where it is
    appropriate.

    To avoid reintroducing the original problem, explicitly check for probe
    request frames in the multicast buffering code.

    Fixes: 277d916fc2e9 ("mac80211: move "bufferable MMPDU" check to fix AP mode scan")
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
     
  • commit 8c26d458394be44e135d1c6bd4557e1c4e1a0535 upstream.

    tsc can be NULL (mac80211 currently always passes NULL),
    resulting in a NULL dereference. Check before copying it.

    Signed-off-by: Eliad Peller
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Eliad Peller
     
  • commit c01fac1c77a00227f706a1654317023e3f4ac7f0 upstream.

    If an aggregation session fails, frames still end up in the driver queue
    with IEEE80211_TX_CTL_AMPDU set.
    This causes tx for the affected station/tid to stall, since
    ath_tx_get_tid_subframe stops returning packets to send.

    Fix this by clearing IEEE80211_TX_CTL_AMPDU as long as no aggregation
    session is running.

    Reported-by: Antonio Quartulli
    Signed-off-by: Felix Fietkau
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Felix Fietkau
     
  • commit 811a2407a3cf7bbd027fbe92d73416f17485a3d8 upstream.

    On LPAE, each level 1 (pgd) page table entry maps 1GiB, and the level 2
    (pmd) entries map 2MiB.

    When the identity mapping is created on LPAE, the pgd pointers are copied
    from the swapper_pg_dir. If we find that we need to modify the contents
    of a pmd, we allocate a new empty pmd table and insert it into the
    appropriate 1GB slot, before then filling it with the identity mapping.

    However, if the 1GB slot covers the kernel lowmem mappings, we obliterate
    those mappings.

    When replacing a PMD, first copy the old PMD contents to the new PMD, so
    that we preserve the existing mappings, particularly the mappings of the
    kernel itself.
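
    In outline, the fix is a copy-before-replace. The sketch below
    models a pmd table as a flat array of 512 entries (1GiB per pgd
    entry divided by 2MiB per pmd entry); the type and function names
    are hypothetical and no real page table manipulation is shown.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PTRS_PER_PMD_SKETCH 512 /* 1GiB slot / 2MiB per entry */

typedef uint64_t pmd_entry_t; /* hypothetical entry type */

/*
 * Allocate a replacement pmd table. If an old table exists, start from a
 * copy of it so that mappings already present in the slot are preserved.
 */
static pmd_entry_t *pmd_table_replace(const pmd_entry_t *old_table)
{
    pmd_entry_t *new_table = calloc(PTRS_PER_PMD_SKETCH, sizeof(*new_table));

    if (!new_table)
        return NULL;
    if (old_table)
        memcpy(new_table, old_table,
               PTRS_PER_PMD_SKETCH * sizeof(*new_table));
    return new_table;
}
```

    Identity mapping entries can then be written into the copy without
    losing whatever else lived in the same 1GB slot.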

    [rewrote commit message and added code comment -- rmk]

    Fixes: ae2de101739c ("ARM: LPAE: Add identity mapping support for the 3-level page table format")
    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Russell King
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
     
  • commit 823a19cd3b91b0729d7417f1848413846be61712 upstream.

    If init_mm.brk is not section aligned, the LPAE fixup code will miss
    updating the final PMD. Fix this by aligning map_end.
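
    Rounding an address up to a section boundary can be sketched as
    below, assuming a section is the 2MiB region mapped by one pmd
    entry on LPAE. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define SECTION_SIZE_SKETCH (UINT64_C(2) << 20) /* assumed 2MiB section */

/* Round addr up to the next multiple of size; size must be a power of two. */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
    return (addr + size - 1) & ~(size - 1);
}
```

    An end address that is already aligned is returned unchanged, so
    the fixup loop still stops at the same place in that case.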

    Fixes: a77e0c7b2774 ("ARM: mm: Recreate kernel mappings in early_paging_init()")
    Signed-off-by: Russell King
    Signed-off-by: Greg Kroah-Hartman

    Russell King
     
  • commit 33753cd2ba41c72a0756edc5dc094d91602deda5 upstream.

    This patch adds bch8 ecc software fallback which is mostly used by
    omap3s because they lack hardware elm support.

    Fixes: 0611c41934ab35ce84dea34ab291897ad3cbc7be (ARM: OMAP2+: gpmc:
    update gpmc_hwecc_bch_capable() for new platforms and ECC schemes)
    Signed-off-by: Christoph Fritz
    Reviewed-by: Pekon Gupta
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Christoph Fritz
     
  • commit 28c9770bcbd2b6dbab99669825a2f8fa69e6d35b upstream.

    Fix the address of the L2 controller register in the hi3620 SoC.
    This has been wrong from the point that the file was merged
    in v3.14.

    Signed-off-by: Haojian Zhuang
    Acked-by: Wei Xu
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Haojian Zhuang
     
  • commit 4c63f83c2c2e16a13ce274ee678e28246bd33645 upstream.

    The AF_ALG socket was missing a security label (e.g. SELinux)
    which means that socket was in "unlabeled" state.

    This was recently demonstrated in the cryptsetup package
    (cryptsetup v1.6.5 and later.)
    See https://bugzilla.redhat.com/show_bug.cgi?id=1115120

    This patch clones the sock's label from the parent sock
    and resolves the issue (similar to AF_BLUETOOTH protocol family).

    Signed-off-by: Milan Broz
    Acked-by: Paul Moore
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Milan Broz
     
  • commit f3c400ef473e00c680ea713a66196b05870b3710 upstream.

    Fix the same alignment bug as in arm64 - we need to pass residue
    unprocessed bytes as the last argument to blkcipher_walk_done.

    Signed-off-by: Mikulas Patocka
    Acked-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
     
  • commit 8903461c9bc56fcb041fb92d054e2529951770b6 upstream.

    In the recent commit b50a6c584bb4 "Clear MMCR2 when enabling PMU", I
    screwed up the handling of MMCR2 for tasks using EBB.

    We must make sure we set MMCR2 *before* ebb_switch_in(), otherwise we
    overwrite the value of MMCR2 that userspace may have written. That
    potentially breaks a task that uses EBB and manually uses MMCR2 for
    event freezing.

    Fixes: b50a6c584bb4 ("powerpc/perf: Clear MMCR2 when enabling PMU")
    Signed-off-by: Michael Ellerman
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

    Michael Ellerman
     

01 Aug, 2014

6 commits

  • Greg Kroah-Hartman
     
  • commit aff008ad813c7cf3cfe7b532e7ba2c526c136f22 upstream.

    Commits 9ec36ca (of/irq: do irq resolution in platform_get_irq)
    and ad69674 (of/irq: do irq resolution in platform_get_irq_byname)
    change the semantics of platform_get_irq and platform_get_irq_byname
    to always rely on devicetree information if devicetree is enabled
    and if a devicetree node is attached to the device. The functions
    now return an error if the devicetree data does not include interrupt
    information, even if the information is available as platform resource
    data.

    This causes mfd client drivers to fail if the interrupt number is
    passed via platform resources. Therefore, if of_irq_get fails, try
    platform_get_resource as a method of last resort. This restores the
    original functionality for drivers depending on platform resources
    to get irq information.

    Cc: Russell King
    Cc: Tony Lindgren
    Cc: Grant Likely
    Cc: Grygorii Strashko
    Signed-off-by: Guenter Roeck
    Acked-by: Rob Herring
    [ Guenter Roeck: backported to 3.15 ]
    Signed-off-by: Guenter Roeck

    Guenter Roeck
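
    The lookup order this entry restores can be sketched as a userspace
    model. Everything here is an assumption made for illustration: the
    model_* names, the struct layout and the error value are hypothetical,
    and the sketch ignores details of the real platform_get_irq such as
    probe deferral.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error value for the model. */
#define MODEL_ENXIO 6

/* Hypothetical stand-in for a platform device; not the kernel struct. */
struct model_device {
    int has_of_node;   /* nonzero if a devicetree node is attached */
    int of_irq;        /* IRQ described in devicetree, <= 0 if absent */
    int resource_irq;  /* IRQ from platform resources, -1 if absent */
};

/* Models a devicetree lookup: positive IRQ on success. */
static int model_of_irq_get(const struct model_device *dev)
{
    return dev->of_irq;
}

/*
 * Try devicetree first; if that yields nothing, fall back to the
 * platform resource instead of returning an error straight away.
 */
static int model_get_irq(const struct model_device *dev)
{
    assert(dev != NULL);

    if (dev->has_of_node) {
        int irq = model_of_irq_get(dev);

        if (irq > 0)
            return irq;
        /* devicetree had no interrupt data: fall through */
    }

    if (dev->resource_irq >= 0)
        return dev->resource_irq;   /* method of last resort */

    return -MODEL_ENXIO;
}
```

    A device with a devicetree node but no interrupt property still gets
    its IRQ from platform resources in this model, which is the case the
    mfd client drivers depend on.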
     
  • commit 02df00eb0019e7d15a1fcddebe4d020226c1ccda upstream.

    The non-split wiphy state shouldn't be increased in size
    so move the new set_qos_map command into the split if
    statement.

    Fixes: fa9ffc745610 ("cfg80211: Add support for QoS mapping")
    Reviewed-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
     
  • commit c118678bc79e8241f9d3434d9324c6400d72f48a upstream.

    Ingo Korb reported that "repeated mapping of the same file on tmpfs
    using remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when
    the process exits".

    He bisected the bug to d7c1755179b8 ("mm: implement ->map_pages for
    shmem/tmpfs"), although the bug was actually added by commit
    8c6e50b0290c ("mm: introduce vm_ops->map_pages()").

    The problem is caused by calling do_fault_around for a _non-linear_
    fault. In this case pgoff is shifted and might become negative during
    calculation.

    Faulting around a non-linear page fault makes no sense and breaks the
    logic in do_fault_around because pgoff is shifted.

    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Ingo Korb
    Tested-by: Ingo Korb
    Cc: Hugh Dickins
    Cc: Sasha Levin
    Cc: Dave Jones
    Cc: Ning Qu
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
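
    The "negative" pgoff mentioned in this entry can be shown with plain
    unsigned arithmetic. This is a minimal sketch, not the kernel's
    do_fault_around: the model_* helpers are hypothetical, and the guard
    only illustrates the idea of skipping fault-around for non-linear
    faults.

```c
#include <assert.h>
#include <limits.h>

/*
 * Fault-around starts some pages before the faulting page, so the page
 * offset is moved back by the same amount. The offset is unsigned, so
 * moving back further than pgoff wraps around to a huge value.
 */
static unsigned long model_fault_around_start(unsigned long pgoff,
                                              unsigned long pages_before)
{
    return pgoff - pages_before;
}

/*
 * For a linear mapping pgoff tracks the address, so the subtraction is
 * safe. For a non-linear mapping pgoff is unrelated to the address, so
 * the model simply refuses to fault around.
 */
static int model_should_fault_around(int nonlinear)
{
    return !nonlinear;
}
```

    With pgoff 1 and three pages of look-behind, the start offset wraps to
    ULONG_MAX - 1 in this model, which is the kind of bogus offset that a
    non-linear fault can produce.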
     
  • commit e052dbf554610e2104c5a7518c4d8374bed701bb upstream.

    The hwrng core asks for random data in the hwrng_register() call itself
    as of commit d9e7972619. This doesn't play well with virtio -- the
    DRIVER_OK bit is only set by virtio core on a successful probe, and
    we're not yet out of our probe routine when this call is made. This
    causes the host to not acknowledge any requests we put in the virtqueue,
    and the insmod or kernel boot process just waits for data to arrive from
    the host, which never happens.

    CC: Kees Cook
    CC: Jason Cooper
    CC: Herbert Xu
    Reviewed-by: Jason Cooper
    Signed-off-by: Amit Shah
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Amit Shah
     
  • commit 2062afb4f804afef61cbe62a30cac9a46e58e067 upstream.

    Michel Dänzer and a couple of other people reported inexplicable random
    oopses in the scheduler, and the cause turns out to be gcc mis-compiling
    the load_balance() function when debugging is enabled. The gcc bug
    apparently goes back to gcc-4.5, but slight optimization changes mean
    that it now shows up as a problem in 4.9.0 and 4.9.1.

    The instruction scheduling problem causes gcc to schedule a spill
    operation to before the stack frame has been created, which in turn can
    corrupt the spilled value if an interrupt comes in. There may be other
    effects of this bug too, but that's the code generation problem seen in
    Michel's case.

    This is fixed in current gcc HEAD, but the workaround as suggested by
    Markus Trippelsdorf is pretty simple: use -fno-var-tracking-assignments
    when compiling the kernel, which disables the gcc code that causes the
    problem. This can result in slightly worse debug information for
    variable accesses, but that is infinitely preferable to actual code
    generation problems.

    Doing this unconditionally (not just for CONFIG_DEBUG_INFO) also allows
    non-debug builds to verify that the debug build would be identical: we
    can do

    export GCC_COMPARE_DEBUG=1

    to make gcc internally verify that the result of the build is
    independent of the "-g" flag (it will make the compiler build everything
    twice, toggling the debug flag, and compare the results).

    Without the "-fno-var-tracking-assignments" option, the build would fail
    (even with 4.8.3 that didn't show the actual stack frame bug) with a gcc
    compare failure.

    See also gcc bugzilla:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801

    Reported-by: Michel Dänzer
    Suggested-by: Markus Trippelsdorf
    Cc: Jakub Jelinek
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds