12 Jan, 2015

8 commits

  • Linus Torvalds
     
  • Pull ARM fixes from Russell King:
    "Three small fixes from over the Christmas period, and wiring up the
    new execveat syscall for ARM"

    * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
    ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error
    ARM: 8253/1: mm: use phys_addr_t type in map_lowmem() for kernel mem region
    ARM: 8249/1: mm: dump: don't skip regions
    ARM: wire up execveat syscall

    Linus Torvalds
     
  • Pull x86 fixes from Ingo Molnar:
    "Misc fixes: two vdso fixes, two kbuild fixes and a boot failure fix
    with certain odd memory mappings"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, vdso: Use asm volatile in __getcpu
    x86/build: Clean auto-generated processor feature files
    x86: Fix mkcapflags.sh bash-ism
    x86: Fix step size adjustment during initial memory mapping
    x86_64, vdso: Fix the vdso address randomization algorithm

    Linus Torvalds
     
  • Pull scheduler fixes from Ingo Molnar:
    "Misc fixes: group scheduling corner case fix, two deadline scheduler
    fixes, effective_load() overflow fix, nested sleep fix, 6144 CPUs
    system fix"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group()
    sched/deadline: Avoid double-accounting in case of missed deadlines
    sched/deadline: Fix migration of SCHED_DEADLINE tasks
    sched: Fix odd values in effective_load() calculations
    sched, fanotify: Deal with nested sleeps
    sched: Fix KMALLOC_MAX_SIZE overflow during cpumask allocation

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Mostly tooling fixes, but also some kernel side fixes: uncore PMU
    driver fix, user regs sampling fix and an instruction decoder fix that
    unbreaks PEBS precise sampling"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes
    perf/x86_64: Improve user regs sampling
    perf: Move task_pt_regs sampling into arch code
    x86: Fix off-by-one in instruction decoder
    perf hists browser: Fix segfault when showing callchain
    perf callchain: Free callchains when hist entries are deleted
    perf hists: Fix children sort key behavior
    perf diff: Fix to sort by baseline field by default
    perf list: Fix --raw-dump option
    perf probe: Fix crash in dwarf_getcfi_elf
    perf probe: Fix to fall back to find probe point in symbols
    perf callchain: Append callchains only when requested
    perf ui/tui: Print backtrace symbols when segfault occurs
    perf report: Show progress bar for output resorting

    Linus Torvalds
     
  • Pull locking fixes from Ingo Molnar:
    "A liblockdep fix and a mutex_unlock() mutex-debugging fix"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    mutex: Always clear owner field upon mutex_unlock()
    tools/liblockdep: Fix debug_check thinko in mutex destroy

    Linus Torvalds
     
  • Fix for BUG_ON(anon_vma->degree) splashes in unlink_anon_vmas() ("kernel
    BUG at mm/rmap.c:399!") caused by commit 7a3ef208e662 ("mm: prevent
    endless growth of anon_vma hierarchy")

    anon_vma_clone() is usually called with a copy of the source vma as the
    destination argument. If the source vma has an anon_vma, it should
    already be in dst->anon_vma. A NULL dst->anon_vma is used as a sign that
    the call comes from anon_vma_fork(). In this case anon_vma_clone() looks
    for an anon_vma to reuse.

    vma_adjust() calls it differently, and this breaks the anon_vma reuse
    logic: anon_vma_clone() links the vma to the old anon_vma and updates the
    degree counters, but vma_adjust() overrides vma->anon_vma right after
    that. As a result, the final unlink_anon_vmas() decrements the degree of
    the wrong anon_vma.

    This patch assigns ->anon_vma before calling anon_vma_clone().

    Signed-off-by: Konstantin Khlebnikov
    Reported-and-tested-by: Chris Clayton
    Reported-and-tested-by: Oded Gabbay
    Reported-and-tested-by: Chih-Wei Huang
    Acked-by: Rik van Riel
    Acked-by: Vlastimil Babka
    Cc: Daniel Forrest
    Cc: Michal Hocko
    Cc: stable@vger.kernel.org # to match back-porting of 7a3ef208e662
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Commit fee7e49d4514 ("mm: propagate error from stack expansion even for
    guard page") made sure that we return the error properly for stack
    growth conditions. It also theorized that counting the guard page
    towards the stack limit might break something, but also said "Let's see
    if anybody notices".

    Somebody did notice. Apparently android-x86 sets the stack limit very
    close to the limit indeed, and including the guard page in the rlimit
    check causes the android 'zygote' process problems.

    So this adds the (fairly trivial) code to make the stack rlimit check be
    against the actual real stack size, rather than the size of the vma that
    includes the guard page.
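
    The check described above can be sketched as follows. This is a toy
    model, not the kernel's code: toy_stack_fits() and TOY_PAGE_SIZE are
    hypothetical names, and a single guard page of 4096 bytes is assumed.

```c
#include <stdbool.h>

#define TOY_PAGE_SIZE 4096UL /* assumed page size for this sketch */

/*
 * vma_size is the size of the stack vma, which in this toy model
 * includes exactly one guard page. Only the usable stack is compared
 * against the rlimit.
 */
static bool toy_stack_fits(unsigned long vma_size, unsigned long rlimit)
{
    unsigned long actual_size = vma_size;

    if (actual_size >= TOY_PAGE_SIZE)
        actual_size -= TOY_PAGE_SIZE; /* do not count the guard page */

    return actual_size <= rlimit;
}
```

    With an 8 MiB limit, a vma of 8 MiB plus one guard page passes this
    check, while it would fail if the guard page were counted.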

    Reported-and-tested-by: Chih-Wei Huang
    Cc: Jay Foad
    Cc: stable@kernel.org # to match back-porting of fee7e49d4514
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

11 Jan, 2015

3 commits


10 Jan, 2015

10 commits

  • Pull sound fixes from Takashi Iwai:
    "A few small regression or stable fixes: a Nvidia HDMI ID addition,
    a regression fix for CAIAQ stream count, a typo fix for GPIO setup
    with STAC/IDT HD-audio codecs, and a Fireworks big-endian fix"

    * tag 'sound-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: fireworks: fix an endianness bug for transaction length
    ALSA: hda - Add new GPU codec ID 0x10de0072 to snd-hda
    ALSA: hda - Fix wrong gpio_dir & gpio_mask hint setups for IDT/STAC codecs
    ALSA: snd-usb-caiaq: fix stream count check

    Linus Torvalds
     
  • Pull HID updates from Jiri Kosina:

    - bounds checking fixes in logitech and roccat drivers, from Peter Wu
    and Dan Carpenter

    - double-kfree fix in i2c-hid driver on bus shutdown, from Mika
    Westerberg

    - a couple of various small driver fixes

    - a few device id additions

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
    HID: roccat: potential out of bounds in pyra_sysfs_write_settings()
    HID: Add a new id 0x501a for Genius MousePen i608X
    HID: logitech-hidpp: prefix the name with "Logitech"
    HID: logitech-hidpp: avoid unintended fall-through
    HID: Allow HID_BATTERY_STRENGTH to be enabled
    HID: i2c-hid: Do not free buffers in i2c_hid_stop()
    HID: add battery quirk for USB_DEVICE_ID_APPLE_ALU_WIRELESS_2011_ISO keyboard
    HID: logitech-hidpp: check WTP report length
    HID: logitech-dj: check report length

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "I'm briefly working between holidays and LCA, so this is close to a
    couple of weeks of fixes,

    Two sets of amdkfd fixes, this is a new feature this kernel, and this
    pull fixes a few issues since it got merged, ordering when built-in to
    kernel and also the iommu vs gpu ordering patch, it also reworks the
    ioctl before the initial release.

    Otherwise:
    - radeon: some misc fixes all over, hdmi, 4k, dpm
    - nouveau: mcp77 init fixes, oops fix, bug on fix, msi fix
    - i915: power fixes, revert VGACNTR patch

    Probably be quieter next week since I'll be at LCA anyways"

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (33 commits)
    drm/amdkfd: rewrite kfd_ioctl() according to drm_ioctl()
    drm/amdkfd: reformat IOCTL definitions to drm-style
    drm/amdkfd: Do copy_to/from_user in general kfd_ioctl()
    drm/radeon: integer underflow in radeon_cp_dispatch_texture()
    drm/radeon: adjust default bapm settings for KV
    drm/radeon: properly filter DP1.2 4k modes on non-DP1.2 hw
    drm/radeon: fix sad_count check for dce3
    drm/radeon: KV has three PPLLs (v2)
    drm/amdkfd: unmap VMIDPASID when relesing VMID (non-HWS)
    drm/radeon: Init amdkfd only if it was compiled
    amdkfd: actually allocate longs for the pasid bitmask
    drm/nouveau/nouveau: Do not BUG_ON(!spin_is_locked()) on UP
    drm/nv4c/mc: disable msi
    drm/nouveau/fb/ram/mcp77: enable NISO poller
    drm/nouveau/fb/ram/mcp77: use carveout reg to determine size
    drm/nouveau/fb/ram/mcp77: subclass nouveau_ram
    drm/nouveau: wake up the card if necessary during gem callbacks
    drm/nouveau/device: Add support for GK208B, resolves bug 86935
    drm/nouveau: fix missing return statement in nouveau_ttm_tt_unpopulate
    drm/nouveau/bios: fix oops on pre-nv50 chipsets
    ...

    Linus Torvalds
     
  • Pull arm64 fixes from Will Deacon:
    "Here is a handful of minor arm64 fixes discovered and fixed over the
    Christmas break. The main part is adding some missing #includes that
    we seem to be getting transitively but have started causing problems
    in -next.

    - Fix early mapping fixmap corruption by EFI runtime services
    - Fix __NR_compat_syscalls off-by-one
    - Add missing sanity checks for some 32-bit registers
    - Add some missing #includes which we get transitively
    - Remove unused prepare_to_copy() macro"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64/efi: add missing call to early_ioremap_reset()
    arm64: fix missing asm/io.h include in kernel/smp_spin_table.c
    arm64: fix missing asm/alternative.h include in kernel/module.c
    arm64: fix missing linux/bug.h include in asm/arch_timer.h
    arm64: fix missing asm/pgtable-hwdef.h include in asm/processor.h
    arm64: sanity checks: add missing AArch32 registers
    arm64: Remove unused prepare_to_copy()
    arm64: Correct __NR_compat_syscalls for bpf

    Linus Torvalds
     
  • Pull kgdb/kdb fixes from Jason Wessel:
    "These have been around since 3.17 and in kgdb-next for the last 9
    weeks and some will go back to -stable.

    Summary of changes:

    Cleanups
    - kdb: Remove unused command flags, repeat flags and KDB_REPEAT_NONE

    Fixes
    - kgdb/kdb: Allow access on a single core, if a CPU round up is
    deemed impossible, which will allow inspection of the now "trashed"
    kernel
    - kdb: Add enable mask for the command groups
    - kdb: access controls to restrict sensitive commands"

    * tag 'for_linus-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
    kernel/debug/debug_core.c: Logging clean-up
    kgdb: timeout if secondary CPUs ignore the roundup
    kdb: Allow access to sensitive commands to be restricted by default
    kdb: Add enable mask for groups of commands
    kdb: Categorize kdb commands (similar to SysRq categorization)
    kdb: Remove KDB_REPEAT_NONE flag
    kdb: Use KDB_REPEAT_* values as flags
    kdb: Rename kdb_register_repeat() to kdb_register_flags()
    kdb: Rename kdb_repeat_t to kdb_cmdflags_t, cmd_repeat to cmd_flags
    kdb: Remove currently unused kdbtab_t->cmd_flags

    Linus Torvalds
     
  • Pull two nfsd bugfixes from Bruce Fields.

    * 'for-3.19' of git://linux-nfs.org/~bfields/linux:
    rpc: fix xdr_truncate_encode to handle buffer ending on page boundary
    nfsd: fix fi_delegees leak when fi_had_conflict returns true

    Linus Torvalds
     
  • Pull two Ceph fixes from Sage Weil:
    "These are both pretty trivial: a sparse warning fix and size_t printk
    thing"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    libceph: fix sparse endianness warnings
    ceph: use %zu for len in ceph_fill_inline_data()

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "None of these are huge, but my commit does fix a regression from 3.18
    that could cause lost files during log replay.

    This also adds Dave Sterba to the list of Btrfs maintainers. It
    doesn't mean we're doing things differently, but Dave has really been
    helping with the maintainer workload for years"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: don't delay inode ref updates during log replay
    Btrfs: correctly get tree level in tree_backref_for_extent
    Btrfs: call inode_dec_link_count() on mkdir error path
    Btrfs: abort transaction if we don't find the block group
    Btrfs, scrub: uninitialized variable in scrub_extent_for_parity()
    Btrfs: add more maintainers

    Linus Torvalds
     
  • Merge misc fixes from Andrew Morton:
    "12 fixes"

    * emailed patches from Andrew Morton:
    mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed
    memcg: fix destination cgroup leak on task charges migration
    mm: memcontrol: switch soft limit default back to infinity
    mm/debug_pagealloc: remove obsolete Kconfig options
    vfs: renumber FMODE_NONOTIFY and add to uniqueness check
    arch/blackfin/mach-bf533/boards/stamp.c: add linux/delay.h
    ocfs2: fix the wrong directory passed to ocfs2_lookup_ino_from_name() when link file
    MAINTAINERS: update rydberg's addresses
    mm: protect set_page_dirty() from ongoing truncation
    mm: prevent endless growth of anon_vma hierarchy
    exit: fix race between wait_consider_task() and wait_task_zombie()
    ocfs2: remove bogus check in dlm_process_recovery_data

    Linus Torvalds
     
    In the v3.19-rc3 tree, when CONFIG_ARM_LPAE and CONFIG_DEBUG_RODATA are
    enabled, the image fails to compile with the following error:

    arch/arm/mm/init.c:661:14: error: ‘PMD_SECT_RDONLY’ undeclared here (not in a function)

    It seems that '80d6b0c ARM: mm: allow text and rodata sections to be read-only'
    and 'ded9477 ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE'
    commits crossed. 80d6b0c uses PMD_SECT_RDONLY macro but ded9477 renames it
    and uses software bits L_PMD_SECT_RDONLY instead.

    The fix is to use L_PMD_SECT_RDONLY instead of PMD_SECT_RDONLY, as
    ded9477 does in other places.

    Signed-off-by: Victor Kamensky
    Acked-by: Will Deacon
    Signed-off-by: Russell King

    Victor Kamensky
     

09 Jan, 2015

19 commits

  • This is a static checker fix. We write some binary settings to the
    sysfs file. One of the settings is the "->startup_profile". There
    isn't any checking to make sure it fits into the
    pyra->profile_settings[] array in the profile_activated() function.

    I added a check to pyra_sysfs_write_settings() in both places because
    I wasn't positive that the other callers were correct.
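
    A minimal sketch of this kind of bounds check, assuming a hypothetical
    array length of 5. The names toy_settings, TOY_PROFILE_COUNT, and
    toy_validate_settings() are illustrative and are not the driver's own.

```c
#define TOY_PROFILE_COUNT 5 /* hypothetical length of profile_settings[] */

struct toy_settings {
    unsigned char startup_profile; /* index supplied by user space */
};

/* Returns 0 if the index is usable, -1 if it would overrun the array. */
static int toy_validate_settings(const struct toy_settings *s)
{
    if (s->startup_profile >= TOY_PROFILE_COUNT)
        return -1; /* reject before the value is used as an index */
    return 0;
}
```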

    Signed-off-by: Dan Carpenter
    Signed-off-by: Jiri Kosina

    Dan Carpenter
     
    Currently, if DEBUG_MUTEXES is enabled, the mutex->owner field is
    cleared only if debug_locks is active. This exposes a race to other users of
    the field where the mutex->owner may be still set to a stale value,
    potentially upsetting mutex_spin_on_owner() among others.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=87955
    Signed-off-by: Chris Wilson
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Davidlohr Bueso
    Cc: Daniel Vetter
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420540175-30204-1-git-send-email-chris@chris-wilson.co.uk
    Signed-off-by: Ingo Molnar

    Chris Wilson
     
  • When alloc_fair_sched_group() in sched_create_group() fails,
    free_sched_group() is called, and free_fair_sched_group() is called by
    free_sched_group(). Since destroy_cfs_bandwidth() is called by
    free_fair_sched_group() without calling init_cfs_bandwidth(),
    RCU stall occurs at hrtimer_cancel():

    INFO: rcu_sched self-detected stall on CPU { 1} (t=60000 jiffies g=13074 c=13073 q=0)
    Task dump for CPU 1:
    (fprintd) R running task 0 6249 1 0x00000088
    ...
    Call Trace:
    sched_show_task+0xa8/0x110
    dump_cpu_task+0x3d/0x50
    rcu_dump_cpu_stacks+0x90/0xd0
    rcu_check_callbacks+0x491/0x700
    update_process_times+0x4b/0x80
    tick_sched_handle.isra.20+0x36/0x50
    tick_sched_timer+0x42/0x70
    __run_hrtimer+0x69/0x1a0
    ? tick_sched_handle.isra.20+0x50/0x50
    hrtimer_interrupt+0xef/0x230
    local_apic_timer_interrupt+0x3b/0x70
    smp_apic_timer_interrupt+0x45/0x60
    apic_timer_interrupt+0x6d/0x80
    ? lock_hrtimer_base.isra.23+0x18/0x50
    ? __kmalloc+0x211/0x230
    hrtimer_try_to_cancel+0x22/0xd0
    ? __kmalloc+0x211/0x230
    hrtimer_cancel+0x22/0x30
    free_fair_sched_group+0x25/0xd0
    free_sched_group+0x16/0x40
    sched_create_group+0x4b/0x80
    sched_autogroup_create_attach+0x43/0x1c0
    sys_setsid+0x7c/0x110
    system_call_fastpath+0x12/0x17

    Check whether init_cfs_bandwidth() was called before calling
    destroy_cfs_bandwidth().
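
    The guard can be illustrated with a toy model. The struct, the
    "initialized" flag, and the toy_* functions below are assumptions made
    for this sketch; the real code may detect the uninitialized state
    differently.

```c
#include <stdbool.h>

struct toy_bandwidth {
    bool initialized;  /* hypothetical marker set by toy_init() */
    int  timer_active; /* stands in for an armed timer */
};

static void toy_init(struct toy_bandwidth *b)
{
    b->initialized = true;
    b->timer_active = 1;
}

/* Returns true if teardown ran, false if it was skipped. */
static bool toy_destroy(struct toy_bandwidth *b)
{
    if (!b->initialized)
        return false; /* never set up, so there is no timer to cancel */

    b->timer_active = 0;
    b->initialized = false;
    return true;
}
```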

    Signed-off-by: Tetsuo Handa
    [ Move the check into destroy_cfs_bandwidth() to aid compilability. ]
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Paul Turner
    Cc: Ben Segall
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/201412252210.GCC30204.SOMVFFOtQJFLOH@I-love.SAKURA.ne.jp
    Signed-off-by: Ingo Molnar

    Tetsuo Handa
     
  • The dl_runtime_exceeded() function is supposed to check if
    a SCHED_DEADLINE task must be throttled, by checking if its
    current runtime is
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Juri Lelli
    Cc: Dario Faggioli
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418813432-20797-3-git-send-email-luca.abeni@unitn.it
    Signed-off-by: Ingo Molnar

    Luca Abeni
     
  • According to global EDF, tasks should be migrated between runqueues
    without checking if their scheduling deadlines and runtimes are valid.
    However, SCHED_DEADLINE currently performs such a check:
    a migration happens doing:

    deactivate_task(rq, next_task, 0);
    set_task_cpu(next_task, later_rq->cpu);
    activate_task(later_rq, next_task, 0);

    which ends up calling dequeue_task_dl(), setting the new CPU, and then
    calling enqueue_task_dl().

    enqueue_task_dl() then calls enqueue_dl_entity(), which calls
    update_dl_entity(), which can modify scheduling deadline and runtime,
    breaking global EDF scheduling.

    As a result, some of the properties of global EDF are not respected:
    for example, a taskset {(30, 80), (40, 80), (120, 170)} scheduled on
    two cores can have unbounded response times for the third task even
    if 30/80+40/80+120/170 = 1.5809 < 2

    This can be fixed by invoking update_dl_entity() only in case of
    wakeup, or if this is a new SCHED_DEADLINE task.
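
    The utilization figure quoted above can be reproduced with a short
    calculation. The struct and function names are hypothetical; each pair
    is read as (runtime, period), which is an assumption about the notation.

```c
struct toy_dl_task {
    double runtime; /* first value of each pair */
    double period;  /* second value of each pair */
};

/* Sum of runtime/period over all tasks in the set. */
static double toy_total_utilization(const struct toy_dl_task *tasks, int n)
{
    double sum = 0.0;

    for (int i = 0; i < n; i++)
        sum += tasks[i].runtime / tasks[i].period;

    return sum;
}
```

    For the taskset {(30, 80), (40, 80), (120, 170)} the sum is about
    1.5809, which is below the capacity of two cores.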

    Signed-off-by: Luca Abeni
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Juri Lelli
    Cc: Dario Faggioli
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418813432-20797-2-git-send-email-luca.abeni@unitn.it
    Signed-off-by: Ingo Molnar

    Luca Abeni
     
  • In effective_load(), we compute (long w * unsigned long tg->shares) / long W.
    When w is negative, it is converted to unsigned long, so the product
    becomes insanely large. Fix this by casting tg->shares to long.
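
    The conversion problem can be shown in isolation. The two functions
    below are hypothetical stand-ins for the expression, not the scheduler
    code itself.

```c
/*
 * Mixed signed/unsigned arithmetic: w is converted to unsigned long
 * before the multiplication, so a negative w turns into a huge value.
 */
static long toy_scale_buggy(long w, unsigned long shares, long W)
{
    return (long)((w * shares) / W);
}

/* Casting shares to long keeps the whole expression signed. */
static long toy_scale_fixed(long w, unsigned long shares, long W)
{
    return (w * (long)shares) / W;
}
```

    With w = -4096, shares = 1024 and W = 2048, the fixed version yields
    -2048, while the buggy version yields a large positive number.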

    Reported-by: Sasha Levin
    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Dave Jones
    Cc: Andrey Ryabinin
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20141219002956.GA25405@intel.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • As per e23738a7300a ("sched, inotify: Deal with nested sleeps").

    fanotify_read() is a wait loop with sleeps in it. Wait loops rely on
    task_struct::state and sleeps do too, since that's the only means of
    actually sleeping. Therefore the nested sleeps destroy the wait loop
    state and the wait loop breaks the sleep functions that assume
    TASK_RUNNING (mutex_lock).

    Fix this by using the new woken_wake_function and wait_woken() stuff,
    which registers wakeups in wait and thereby allows shrinking the
    task_struct::state changes to the actual sleep part.

    Reported-by: Yuanhan Liu
    Reported-by: Sedat Dilek
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Takashi Iwai
    Cc: Al Viro
    Cc: Eric Paris
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20141216152838.GZ3337@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • There was another report of a boot failure with a #GP fault in the
    uncore SBOX initialization. The earlier workaround was not enough
    for this system.

    The boot was failing while trying to initialize the third SBOX.

    This patch detects parts with only two SBOXes and limits the number
    of SBOX units to two there.

    Stable material, as it affects boot problems on 3.18.

    Tested-by: Andreas Oehler
    Signed-off-by: Andi Kleen
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: Yan, Zheng
    Link: http://lkml.kernel.org/r/1420583675-9163-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar

    Andi Kleen
     
  • Perf reports user regs for kernel-mode samples so that samples can
    be backtraced through user code. The old code was very broken in
    syscall context, resulting in useless backtraces.

    The new code, in contrast, is still dangerously racy, but it should
    at least work most of the time.

    Tested-by: Jiri Olsa
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Stephane Eranian
    Cc: Andrew Morton
    Cc: chenggang.qcg@taobao.com
    Cc: Wu Fengguang
    Cc: Namhyung Kim
    Cc: Mike Galbraith
    Cc: Arjan van de Ven
    Cc: David Ahern
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/243560c26ff0f739978e2459e203f6515367634d.1420396372.git.luto@amacapital.net
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • On x86_64, at least, task_pt_regs may be only partially initialized
    in many contexts, so x86_64 should not use it without extra care
    from interrupt context, let alone NMI context.

    This will allow x86_64 to override the logic and will supply some
    scratch space to use to make a cleaner copy of user regs.

    Tested-by: Jiri Olsa
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Stephane Eranian
    Cc: chenggang.qcg@taobao.com
    Cc: Wu Fengguang
    Cc: Namhyung Kim
    Cc: Mike Galbraith
    Cc: Arjan van de Ven
    Cc: David Ahern
    Cc: Arnaldo Carvalho de Melo
    Cc: Catalin Marinas
    Cc: Jean Pihet
    Cc: Linus Torvalds
    Cc: Mark Salter
    Cc: Russell King
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/e431cd4c18c2e1c44c774f10758527fb2d1025c4.1420396372.git.luto@amacapital.net
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • Stephane reported that the PEBS fixup was broken by the recent commit to
    the instruction decoder. The decoder had an off-by-one error, so it
    could not decode the last instruction and always bailed.
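
    An off-by-one of this general kind can be sketched with a buffer bounds
    check. This is an illustration only: the functions below are
    hypothetical and are not the decoder's actual check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Off-by-one: rejects a read that ends exactly at the end of the buffer. */
static bool toy_can_read_buggy(size_t pos, size_t n, size_t len)
{
    return pos + n < len;
}

/* Correct: a read of n bytes at pos may end exactly at len. */
static bool toy_can_read_fixed(size_t pos, size_t n, size_t len)
{
    return pos + n <= len;
}
```

    For a 15-byte buffer, a 3-byte item at offset 12 is the last item. The
    buggy check refuses it, the fixed check accepts it.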

    Reported-by: Stephane Eranian
    Fixes: 6ba48ff46f76 ("x86: Remove arbitrary instruction size limit in instruction decoder")
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org # 3.18
    Cc: Jiri Olsa
    Cc: Liang Kan
    Cc: Arnaldo Carvalho de Melo
    Cc: Dave Hansen
    Cc: Jim Keniston
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Link: http://lkml.kernel.org/r/20141216104614.GV3337@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • …it/acme/linux into perf/urgent

    Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

    - Free callchains when hist entries are deleted, plugging a massive leak in
    'top -g', where hist_entries (and its callchains) are decayed over time. (Namhyung Kim)

    - Fix segfault when showing callchain in the hists browser (report & top) (Namhyung Kim)

    - Fix children sort key behavior, and also the 'perf test 32' test that
    was failing due to reliance on undefined behaviour (Namhyung Kim)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
    stuck in a busy loop with nothing left to balance, but
    kswapd_try_to_sleep() failing to sleep. Their analysis found the cause
    to be a combination of several factors:

    1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait

    2. The process has been killed (by OOM in this case), but has not yet been
    scheduled to remove itself from the waitqueue and die.

    3. kswapd checks for throttled processes in prepare_kswapd_sleep():

    if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
            wake_up(&pgdat->pfmemalloc_wait);
            return false; // kswapd will not go to sleep
    }

    However, for a process that was already killed, wake_up() does not remove
    the process from the waitqueue, since try_to_wake_up() checks its state
    first and returns false when the process is no longer waiting.

    4. kswapd is running on the same CPU as the only CPU that the process is
    allowed to run on (through cpus_allowed, or possibly single-cpu system).

    5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
    encounters no voluntary preemption points and repeatedly fails
    prepare_kswapd_sleep(), blocking the process from running and removing
    itself from the waitqueue, which would let kswapd sleep.

    So, the source of the problem is that we prevent kswapd from going to
    sleep as long as there are processes waiting on the pfmemalloc_wait queue,
    and a process waiting on a queue is guaranteed to be removed from the
    queue only when it gets scheduled. This was done to make sure that no
    process is left sleeping on pfmemalloc_wait when kswapd itself goes to
    sleep.

    However, it isn't necessary to postpone kswapd sleep until the
    pfmemalloc_wait queue actually empties. To prevent processes from being
    left sleeping, it's actually enough to guarantee that all processes
    waiting on pfmemalloc_wait queue have been woken up by the time we put
    kswapd to sleep.

    This patch therefore fixes this issue by substituting 'wake_up' with
    'wake_up_all' and removing 'return false' in the code snippet from
    prepare_kswapd_sleep() above. Note that if any process puts itself in
    the queue after this waitqueue_active() check, or after the wake up
    itself, it means that the process will also wake up kswapd - and since
    we are under prepare_to_wait(), the wake up won't be missed. Also we
    update the comment in prepare_kswapd_sleep() to describe more clearly
    the races it is preventing.
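
    The difference between the old and new behaviour can be modelled in
    user space. Everything here is a simplified assumption: toy_pgdat, the
    toy_* functions, and the idea that the killed waiter never gets to run
    and therefore stays on the queue.

```c
#include <stdbool.h>

struct toy_pgdat {
    int  queued_waiters; /* entries still on the wait queue */
    bool balanced;       /* nothing left for kswapd to reclaim */
};

/* Old logic: any queued waiter vetoes sleeping. */
static bool toy_prepare_sleep_old(const struct toy_pgdat *p)
{
    if (p->queued_waiters > 0)
        return false; /* wake_up() would be issued, then bail out */
    return p->balanced;
}

/* New logic: wake everyone, then decide on balance alone. */
static bool toy_prepare_sleep_new(const struct toy_pgdat *p)
{
    /* wake_up_all() would be issued here if queued_waiters > 0 */
    return p->balanced;
}

/* Count failed attempts to sleep while the waiter is never scheduled. */
static int toy_spins(bool (*prepare)(const struct toy_pgdat *),
                     const struct toy_pgdat *p, int max_tries)
{
    int tries = 0;

    while (tries < max_tries && !prepare(p))
        tries++;

    return tries;
}
```

    With one stuck waiter and a balanced node, the old logic exhausts every
    attempt, while the new logic lets kswapd sleep on the first try.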

    Fixes: 5515061d22f0 ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
    Signed-off-by: Vlastimil Babka
    Signed-off-by: Vladimir Davydov
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Acked-by: Michal Hocko
    Acked-by: Rik van Riel
    Cc: [3.6+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • We are supposed to take one css reference per each memory page and per
    each swap entry accounted to a memory cgroup. However, during task
    charges migration we take a reference to the destination cgroup twice
    per each swap entry: first in mem_cgroup_do_precharge()->try_charge()
    and then in mem_cgroup_move_swap_account(), permanently leaking the
    destination cgroup.

    The hunk taking the second reference seems to be a leftover from the
    pre-00501b531c472 ("mm: memcontrol: rewrite charge API") era. Remove it
    to fix the leak.

    Fixes: e8ea14cc6ead (mm: memcontrol: take a css reference for each charged page)
    Signed-off-by: Vladimir Davydov
    Cc: Johannes Weiner
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • Commit 3e32cb2e0a12 ("mm: memcontrol: lockless page counters")
    accidentally switched the soft limit default from infinity to zero,
    which turns all memcgs with even a single page into soft limit excessors
    and engages soft limit reclaim on all of them during global memory
    pressure. This makes global reclaim generally more aggressive, but also
    inverts the meaning of existing soft limit configurations where unset
    soft limits are usually more generous than set ones.
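
    The effect of the default can be seen in a one-line comparison. This is
    a sketch under assumptions: toy_exceeds_soft_limit() and
    TOY_LIMIT_INFINITY are hypothetical, with ULONG_MAX standing in for
    "no soft limit".

```c
#include <limits.h>
#include <stdbool.h>

#define TOY_LIMIT_INFINITY ULONG_MAX /* stand-in for an unset soft limit */

/* A group is in excess when its usage is above its soft limit. */
static bool toy_exceeds_soft_limit(unsigned long usage_pages,
                                   unsigned long soft_limit)
{
    return usage_pages > soft_limit;
}
```

    With a default of zero, a group holding a single page is already in
    excess. With a default of infinity, it never is.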

    Signed-off-by: Johannes Weiner
    Acked-by: Michal Hocko
    Acked-by: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • These are obsolete since commit e30825f1869a ("mm/debug-pagealloc:
    prepare boottime configurable") was merged. So remove them.

    [pebolle@tiscali.nl: find obsolete Kconfig options]
    Signed-off-by: Joonsoo Kim
    Cc: Paul Bolle
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc. The
    clashing O_PATH value was added in commit 5229645bdc35 ("vfs: add
    nonconflicting values for O_PATH") but this can't be changed as it is
    user-visible.

    FMODE_NONOTIFY is only used internally in the kernel, but it is in the
    same numbering space as the other O_* flags, as indicated by the comment
    at the top of include/uapi/asm-generic/fcntl.h (and its use in
    fs/notify/fanotify/fanotify_user.c). So renumber it to avoid the clash.

    All of this has happened before (commit 12ed2e36c98a: "fanotify:
    FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will
    happen again -- so update the uniqueness check in fcntl_init() to
    include __FMODE_NONOTIFY.
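
    One way to check that a set of flags does not clash is to OR them
    together and count the set bits. The sketch below assumes every flag
    has exactly one bit set; the function names and flag values are
    hypothetical, and this is not a copy of fcntl_init().

```c
#include <stdbool.h>

/* Count the set bits in v. */
static int toy_popcount(unsigned int v)
{
    int bits = 0;

    for (; v; v &= v - 1) /* clears the lowest set bit each round */
        bits++;

    return bits;
}

/*
 * If n single-bit flags are all distinct, their OR has exactly n bits
 * set. Any clash makes two flags share a bit and lowers the count.
 */
static bool toy_flags_unique(const unsigned int *flags, int n)
{
    unsigned int all = 0;

    for (int i = 0; i < n; i++)
        all |= flags[i];

    return toy_popcount(all) == n;
}
```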

    Signed-off-by: David Drysdale
    Acked-by: David S. Miller
    Acked-by: Jan Kara
    Cc: Heinrich Schuchardt
    Cc: Alexander Viro
    Cc: Arnd Bergmann
    Cc: Stephen Rothwell
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Drysdale
     
  • build error

    arch/blackfin/mach-bf533/boards/stamp.c:834:2: error: implicit declaration of function 'mdelay'

    Signed-off-by: Oleg Nesterov
    Reported-by: Wu Fengguang
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • In ocfs2_link(), the parent directory inode passed to
    ocfs2_lookup_ino_from_name() is wrong. The dir parameter is the parent
    of new_dentry, not of old_dentry. We should get old_dir from old_dentry
    and look up old_dentry in old_dir in case another node has removed the
    old dentry.

    With this change, hard linking works again when paths are relative with
    at least one subdirectory. This is how the problem could be reproduced:

    # mkdir a
    # mkdir b
    # touch a/test
    # ln a/test b/test
    ln: failed to create hard link `b/test' => `a/test': No such file or directory

    However when creating links in the same dir, it worked well.

    Now the link gets created.

    Fixes: 0e048316ff57 ("ocfs2: check existence of old dentry in ocfs2_link()")
    Signed-off-by: joyce.xue
    Reported-by: Szabo Aron - UBIT
    Cc: Mark Fasheh
    Cc: Joel Becker
    Tested-by: Aron Szabo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei