28 Jul, 2014

5 commits


27 Jul, 2014

2 commits

  • Michel Dänzer and a couple of other people reported inexplicable random
    oopses in the scheduler, and the cause turns out to be gcc mis-compiling
    the load_balance() function when debugging is enabled. The gcc bug
    apparently goes back to gcc-4.5, but slight optimization changes mean
    that it now showed up as a problem in 4.9.0 and 4.9.1.

    The instruction scheduling problem causes gcc to schedule a spill
    operation to before the stack frame has been created, which in turn can
    corrupt the spilled value if an interrupt comes in. There may be other
    effects of this bug too, but that's the code generation problem seen in
    Michel's case.

    This is fixed in current gcc HEAD, but the workaround as suggested by
    Markus Trippelsdorf is pretty simple: use -fno-var-tracking-assignments
    when compiling the kernel, which disables the gcc code that causes the
    problem. This can result in slightly worse debug information for
    variable accesses, but that is infinitely preferable to actual code
    generation problems.

    Doing this unconditionally (not just for CONFIG_DEBUG_INFO) also allows
    non-debug builds to verify that the debug build would be identical: we
    can do

    export GCC_COMPARE_DEBUG=1

    to make gcc internally verify that the result of the build is
    independent of the "-g" flag (it will make the compiler build everything
    twice, toggling the debug flag, and compare the results).

    Without the "-fno-var-tracking-assignments" option, the build would fail
    (even with 4.8.3 that didn't show the actual stack frame bug) with a gcc
    compare failure.

    See also gcc bugzilla:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801

    Reported-by: Michel Dänzer
    Suggested-by: Markus Trippelsdorf
    Cc: Jakub Jelinek
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Shortly before 3.16-rc1, Dave Jones reported:

    WARNING: CPU: 3 PID: 19721 at fs/xfs/xfs_aops.c:971
    xfs_vm_writepage+0x5ce/0x630 [xfs]()
    CPU: 3 PID: 19721 Comm: trinity-c61 Not tainted 3.15.0+ #3
    Call Trace:
    xfs_vm_writepage+0x5ce/0x630 [xfs]
    shrink_page_list+0x8f9/0xb90
    shrink_inactive_list+0x253/0x510
    shrink_lruvec+0x563/0x6c0
    shrink_zone+0x3b/0x100
    shrink_zones+0x1f1/0x3c0
    try_to_free_pages+0x164/0x380
    __alloc_pages_nodemask+0x822/0xc90
    alloc_pages_vma+0xaf/0x1c0
    handle_mm_fault+0xa31/0xc50
    etc.

    970 if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
    971 PF_MEMALLOC))

    I did not respond at the time, because a glance at the PageDirty block
    in shrink_page_list() quickly shows that this is impossible: we don't do
    writeback on file pages (other than tmpfs) from direct reclaim nowadays.
    Dave was hallucinating, but it would have been disrespectful to say so.

    However, my own /var/log/messages now shows similar complaints

    WARNING: CPU: 1 PID: 28814 at fs/ext4/inode.c:1881 ext4_writepage+0xa7/0x38b()
    WARNING: CPU: 0 PID: 27347 at fs/ext4/inode.c:1764 ext4_writepage+0xa7/0x38b()

    from stressing some mmotm trees during July.

    Could a dirty xfs or ext4 file page somehow get marked PageSwapBacked,
    so fail shrink_page_list()'s page_is_file_cache() test, and so proceed
    to mapping->a_ops->writepage()?

    Yes, 3.16-rc1's commit 68711a746345 ("mm, migration: add destination
    page freeing callback") has provided such a way to compaction: if
    migrating a SwapBacked page fails, its newpage may be put back on the
    list for later use with PageSwapBacked still set, and nothing will clear
    it.

    Whether that can do anything worse than issue WARN_ON_ONCEs, and get
    some statistics wrong, is unclear: easier to fix than to think through
    the consequences.

    Fixing it here, before the put_new_page(), addresses the bug directly,
    but is probably the worst place to fix it. Page migration is doing too
    many parts of the job on too many levels: fixing it in
    move_to_new_page() to complement its SetPageSwapBacked would be
    preferable, except why is it (and newpage->mapping and newpage->index)
    done there, rather than down in migrate_page_move_mapping(), once we are
    sure of success? Not a cleanup to get into right now, especially not
    with memcg cleanups coming in 3.17.

    Reported-by: Dave Jones
    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
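
The stale-flag problem described in the entry above can be illustrated with a toy user-space model. This is only a sketch: `struct toy_page`, `toy_put_new_page()`, `toy_get_new_page()` and `toy_migrate()` are hypothetical names invented for illustration, not the kernel's page-migration code. It shows why a flag set on a destination page should be cleared before a failed migration hands that page back to a pool for reuse.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page: one flag and a free-list link (hypothetical). */
struct toy_page {
    bool swap_backed;
    struct toy_page *next;
};

/* Pool of destination pages kept for later reuse (hypothetical). */
struct toy_page *toy_free_list = NULL;

/* Return an unused destination page to the pool. */
void toy_put_new_page(struct toy_page *page)
{
    page->next = toy_free_list;
    toy_free_list = page;
}

/* Take a destination page from the pool, or NULL if the pool is empty. */
struct toy_page *toy_get_new_page(void)
{
    struct toy_page *page = toy_free_list;

    if (page)
        toy_free_list = page->next;
    return page;
}

/*
 * Attempt to "migrate" src into newpage. The flag is copied up front,
 * before the outcome is known; "succeed" simulates that outcome.
 */
bool toy_migrate(const struct toy_page *src, struct toy_page *newpage,
                 bool succeed)
{
    if (src->swap_backed)
        newpage->swap_backed = true;

    if (succeed)
        return true;

    /* Failure path: clear the flag before the page goes back to the pool. */
    newpage->swap_backed = false;
    toy_put_new_page(newpage);
    return false;
}
```

Without the clearing step on the failure path, the next caller of `toy_get_new_page()` would receive a page that still claims to be swap-backed, which is the toy analogue of the situation the commit message describes.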
     

26 Jul, 2014

13 commits


25 Jul, 2014

6 commits

  • Promote one fix for 3.16

    This fix was necessary after

    9c15a24b038f ("x86/mce: Improve mcheck_init_device() error handling")

    went in. What this patch did was, among others, check the return value
    of misc_register and exit early if it encountered an error. Original
    code sloppily didn't do that.

    However,

    cef12ee52b05 ("xen/mce: Add mcelog support for Xen platform")

    made it so that xen's init routine xen_late_init_mcelog runs first. This
    was needed for the xen mcelog device which is supposed to be independent
    from the baremetal one.

    Initially it was reported that misc_register() fails often on xen and
    that's why it needed fixing. However, it is *supposed* to fail by
    design, when running in dom0 so that the xen mcelog device file gets
    registered first.

    And *then* you need the notifier *not* unregistered on the error path so
    that the timer does get deleted properly in the CPU hotplug notifier.

    Btw, this fix is needed also on baremetal in the unlikely event that
    misc_register(&mce_chrdev_device) fails there too.

    I was unsure whether to rush it in now and decided to delay it to 3.17.
    However, xen people wanted it promoted as it breaks xen when doing cpu
    hotplug there. So, after a bit of simmering in tip/master for initial
    smoke testing, let's move it to 3.16. It fixes a semi-regression which
    got introduced in 3.16 so no need for stable tagging.

    tip/x86/ras contains that exact same commit but we can't remove it
    there as it is not the last one. It won't cause any merge issues, as I
    confirmed locally but I should state here the special situation of this
    one fix explicitly anyway.

    Thanks.

    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     
  • This is a halfway fix for hawaii acceleration. More fixes to come
    but hopefully isolated to userspace.

    Signed-off-by: Jérôme Glisse
    Cc: stable@vger.kernel.org
    Signed-off-by: Dave Airlie

    Jerome Glisse
     
  • two more radeon fixes.

    * 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux:
    drm/radeon: fix irq ring buffer overflow handling
    drm/radeon: fix error handling in radeon_vm_bo_set_addr

    Dave Airlie
     
  • This time in time! Just 32bit-pae fix from Hugh, semaphores fun from Chris
    and a fix for runtime pm cherry-picked from next.

    Paulo is still working on a fix for runtime pm when X does cursor fun when
    the display is off, but that one isn't ready yet.

    * tag 'drm-intel-fixes-2014-07-24' of git://anongit.freedesktop.org/drm-intel:
    drm/i915: Simplify i915_gem_release_all_mmaps()
    drm/i915: fix freeze with blank screen booting highmem
    drm/i915: Reorder the semaphore deadlock check, again

    Dave Airlie
     
  • alloc_bootmem and related functions always return a zeroed region of
    memory. Thus a memset after calls to these functions is unnecessary.

    The following Coccinelle semantic patch was used for making the change:

    @@
    expression E,E1;
    @@

    E = \(alloc_bootmem\|alloc_bootmem_low\|alloc_bootmem_pages\|alloc_bootmem_low_pages\)(...)
    ... when != E
    - memset(E,0,E1);

    Signed-off-by: Himangi Saraogi
    Acked-by: Julia Lawall
    Signed-off-by: Helge Deller

    HIMANGI SARAOGI
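
As a user-space analogy for the cleanup in the entry above, the sketch below uses the standard `calloc()`, which returns zero-filled memory, in place of the bootmem allocator. The function name `alloc_zeroed_table()` is hypothetical, and the code is not taken from the patch; it only illustrates why a `memset()` to zero after a zeroing allocator adds nothing.

```c
#include <stdlib.h>

/*
 * User-space analogy only: calloc() returns zero-filled memory, much as
 * the entry above says alloc_bootmem() and friends do, so no memset()
 * is needed after a successful call.
 */
int *alloc_zeroed_table(size_t nentries)
{
    int *table = calloc(nentries, sizeof(*table));

    /* A memset(table, 0, nentries * sizeof(*table)) here would be redundant. */
    return table;
}
```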
     
  • The sa_restorer field in struct sigaction is obsolete and no longer in
    the parisc implementation. However, the core code assumes the field is
    present if SA_RESTORER is defined. So, the define needs to be removed.

    Signed-off-by: John David Anglin
    Cc:
    Signed-off-by: Helge Deller

    John David Anglin
     

24 Jul, 2014

14 commits

  • Temperature limit clamps are applied after converting the temperature
    from milli-degrees C to degrees C, so either the clamp limit needs
    to be specified in degrees C, not milli-degrees C, or clamping must
    happen before converting to degrees C. Use the latter method to avoid
    overflows.

    vrm is a u8, so the written value needs to be limited to [0, 255].

    Cc: Axel Lin
    Cc: stable@vger.kernel.org
    Signed-off-by: Guenter Roeck
    Reviewed-by: Jean Delvare

    Guenter Roeck
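
The clamp-before-convert ordering described in the entry above can be sketched in user-space C. Everything here is an assumption made for illustration: the register range (a signed 8-bit value in whole degrees C), the rounding step, and the names `clamp_long()`, `temp_to_reg()` and `vrm_to_reg()` are hypothetical and are not taken from the driver that was patched.

```c
#include <limits.h>

/* Hypothetical register range: a signed 8-bit value in whole degrees C. */
#define TEMP_MIN_MC (-128L * 1000L)
#define TEMP_MAX_MC (127L * 1000L)

/* Limit val to [lo, hi]; user-space stand-in for the kernel's clamp_val(). */
long clamp_long(long val, long lo, long hi)
{
    if (val < lo)
        return lo;
    if (val > hi)
        return hi;
    return val;
}

/*
 * Clamp in milli-degrees first, then round to whole degrees. After the
 * clamp, adding the rounding offset cannot overflow, even when the
 * caller passes LONG_MAX or LONG_MIN.
 */
signed char temp_to_reg(long millideg)
{
    millideg = clamp_long(millideg, TEMP_MIN_MC, TEMP_MAX_MC);
    millideg += (millideg < 0) ? -500 : 500;
    return (signed char)(millideg / 1000);
}

/* A value stored in an 8-bit unsigned field must be limited to [0, 255]. */
unsigned char vrm_to_reg(unsigned long val)
{
    return (unsigned char)(val > UCHAR_MAX ? UCHAR_MAX : val);
}
```

Doing the rounding addition before the clamp would risk signed overflow for inputs near `LONG_MAX`, which is why the sketch clamps while the value is still in milli-degrees.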
     
  • Currently, umount on a symlink blocks a following umount:

    /vz is separate mount

    # ls /vz/ -al | grep test
    drwxr-xr-x. 2 root root 4096 Jul 19 01:14 testdir
    lrwxrwxrwx. 1 root root 11 Jul 19 01:16 testlink -> /vz/testdir
    # umount -l /vz/testlink
    umount: /vz/testlink: not mounted (expected)

    # lsof /vz
    # umount /vz
    umount: /vz: device is busy. (unexpected)

    In this case mountpoint_last() gets an extra refcount on path->mnt.

    Signed-off-by: Vasily Averin
    Acked-by: Ian Kent
    Acked-by: Jeff Layton
    Cc: stable@vger.kernel.org
    Signed-off-by: Christoph Hellwig

    Vasily Averin
     
  • The following warnings:

    fs/direct-io.c: In function ‘__blockdev_direct_IO’:
    fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    fs/direct-io.c:913:16: note: ‘to’ was declared here
    fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    fs/direct-io.c:913:10: note: ‘from’ was declared here

    are false positive because dio_get_page() either fails, or sets both
    'from' and 'to'.

    Paul Bolle said ...
    Maybe it's better to move initializing "to" and "from" out of
    dio_get_page(). That _might_ make it easier for both the reader and
    the compiler to understand what's going on. Something like this:

    Christoph Hellwig said ...
    The fix of moving the code definitely looks nicer, while I think
    uninitialized_var is a horrible wart that won't get anywhere near my code.

    Boaz Harrosh: I agree with Christoph and Paul

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Christoph Hellwig

    Boaz Harrosh
     
  • Pull nfsd bugfix from Bruce Fields:
    "Another regression from the xdr encoding rewrite"

    * 'for-3.16' of git://linux-nfs.org/~bfields/linux:
    NFSD: Fix crash encoding lock reply on 32-bit

    Linus Torvalds
     
  • Pull arm64 fix from Catalin Marinas:
    "Fix arm64 regression introduced by limiting the CMA buffer to ZONE_DMA
    on platforms where RAM starts above 4GB (and ZONE_DMA becoming 0)"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: Create non-empty ZONE_DMA when DRAM starts above 4GB

    Linus Torvalds
     
  • Pull Xtensa fixes from Chris Zankel:
    - resolve FIXMEs in double exception handler for window overflow. This
    fix makes native building of linux on xtensa host possible;
    - fix sysmem region removal issue introduced in 3.15.

    * tag 'xtensa-next-20140721' of git://github.com/czankel/xtensa-linux:
    xtensa: fix sysmem reservation at the end of existing block
    xtensa: add fixup for double exception raised in window overflow

    Linus Torvalds
     
  • Pull pin control fixes from Linus Walleij:
    "Here are three pin control fixes for the v3.16 series. Sorry that
    some of these arrive late, the summer heat in Sweden makes me slow.

    - an IRQ handling fix for the STi driver, also for stable
    - another IRQ fix for the RCAR GPIO driver
    - a MAINTAINERS entry"

    * tag 'pinctrl-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    gpio: rcar: Add support for DT IRQ flags
    MAINTAINERS: Add entry for the Renesas pin controller driver
    pinctrl: st: Fix irqmux handler

    Linus Torvalds
     
  • Pull libata regression fix from Tejun Heo:
    "The last libata/for-3.16-fixes pull contained a regression introduced
    by 1871ee134b73 ("libata: support the ata host which implements a
    queue depth less than 32") which in turn was a fix for a regression
    introduced earlier while changing queue tag order to accommodate hard
    drives which perform poorly if tags are not allocated in circular
    order (ugh...).

    The regression happens only for SAS controllers making use of libata
    to serve ATA devices. They don't fill an ata_host field which is used
    by the new tag allocation function leading to NULL dereference.

    This patch adds a new intermediate field ata_host->n_tags which is
    initialized for both SAS and !SAS cases to fix the issue"

    * 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
    libata: introduce ata_host->n_tags to avoid oops on SAS controllers

    Linus Torvalds
     
  • Pull input layer fixes from Dmitry Torokhov:
    "A few fixups for the input subsystem"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: document INPUT_PROP_TOPBUTTONPAD
    Input: fix defuzzing logic
    Input: sirfsoc-onkey - fix GPL v2 license string typo
    Input: st-keyscan - fix 'defined but not used' compiler warnings
    Input: synaptics - add min/max quirk for pnp-id LEN2002 (Edge E531)
    Input: i8042 - add Acer Aspire 5710 to nomux blacklist
    Input: ti_am335x_tsc - warn about incorrect spelling
    Input: wacom - cleanup multitouch code when touch_max is 2

    Linus Torvalds
     
  • Pull powerpc fixes from Ben Herrenschmidt:
    "Here is a handful of powerpc fixes for 3.16. They are all pretty
    simple and self contained and should still make this release"

    * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    powerpc: use _GLOBAL_TOC for memmove
    powerpc/pseries: dynamically added OF nodes need to call of_node_init
    powerpc: subpage_protect: Increase the array size to take care of 64TB
    powerpc: Fix bugs in emulate_step()
    powerpc: Disable doorbells on Power8 DD1.x

    Linus Torvalds
     
  • Pull slab fix from Mike Snitzer:
    "This fixes the broken duplicate slab name check in
    kmem_cache_sanity_check() that has been repeatedly reported (as
    recently as today against Fedora rawhide).

    Pekka seemed to have it staged for a late 3.15-rc in his 'slab/urgent'
    branch but never sent a pull request, see:
    https://lkml.org/lkml/2014/5/23/648"

    * tag 'urgent-slab-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    slab_common: fix the check for duplicate slab names

    Linus Torvalds
     
  • Merge fixes from Andrew Morton:
    "10 fixes"

    * emailed patches from Andrew Morton:
    mm: hugetlb: fix copy_hugetlb_page_range()
    simple_xattr: permit 0-size extended attributes
    mm/fs: fix pessimization in hole-punching pagecache
    shmem: fix splicing from a hole while it's punched
    shmem: fix faulting into a hole, not taking i_mutex
    mm: do not call do_fault_around for non-linear fault
    sh: also try passing -m4-nofpu for SH2A builds
    zram: avoid lockdep splat by revalidate_disk
    mm/rmap.c: fix pgoff calculation to handle hugepage correctly
    coredump: fix the setting of PF_DUMPCORE

    Linus Torvalds
     
  • Commit 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
    migration/hwpoisoned entry") changed the order of
    huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
    in some workloads like hugepage-backed heap allocation via libhugetlbfs.
    This patch fixes it.

    The test program for the problem is shown below:

    $ cat heap.c
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define HPS 0x200000

    int main() {
        int i;
        char *p = malloc(HPS);
        memset(p, '1', HPS);
        for (i = 0; i < 5; i++) {
            if (!fork()) {
                memset(p, '2', HPS);
                p = malloc(HPS);
                memset(p, '3', HPS);
                free(p);
                return 0;
            }
        }
        sleep(1);
        free(p);
        return 0;
    }

    $ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap

    Fixes 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
    migration/hwpoisoned entry"), so is applicable to -stable kernels which
    include it.

    Signed-off-by: Naoya Horiguchi
    Reported-by: Guillaume Morin
    Suggested-by: Guillaume Morin
    Acked-by: Hugh Dickins
    Cc: [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
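
The ordering issue in the entry above can be pictured with a toy model. The entry does not spell out which order is the correct one; the sketch below assumes that the parent's entry is write-protected first and read afterwards. All names (`TOY_PTE_WRITE`, `toy_set_wrprotect()`, `toy_get()`, `toy_copy_entry()`) are hypothetical stand-ins, not the kernel's hugetlb helpers.

```c
#include <stdbool.h>

#define TOY_PTE_WRITE 0x1UL   /* hypothetical "writable" bit */

/* Remove write permission from the parent's entry in place. */
void toy_set_wrprotect(unsigned long *ptep)
{
    *ptep &= ~TOY_PTE_WRITE;
}

/* Read the current value of an entry. */
unsigned long toy_get(const unsigned long *ptep)
{
    return *ptep;
}

/*
 * Copy the parent's entry to the child at fork time. For a
 * copy-on-write mapping the parent is write-protected first and the
 * entry is read afterwards, so the child inherits the read-only value.
 * Reading before write-protecting would hand the child a stale,
 * still-writable entry.
 */
void toy_copy_entry(unsigned long *dst_pte, unsigned long *src_pte, bool cow)
{
    unsigned long entry;

    if (cow)
        toy_set_wrprotect(src_pte);
    entry = toy_get(src_pte);
    *dst_pte = entry;
}
```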
     
  • If a filesystem uses simple_xattr to support user extended attributes,
    LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
    memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
    values of zero size. Fix that off-by-one (but apparently no filesystem
    needs them yet).

    Signed-off-by: Hugh Dickins
    Cc: Al Viro
    Cc: Jeff Layton
    Cc: Aristeu Rozanski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
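
The off-by-one in the wrap-around test described in the entry above can be sketched in user-space C. This is a minimal illustration, not the kernel function: `struct xattr_sketch` and `xattr_sketch_alloc()` are hypothetical names, and `malloc()` stands in for the kernel allocator. The point is the comparison: a wrapped length is strictly smaller than the header size, so `<` catches overflow while still admitting a zero-size value, whereas `<=` would also reject size 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header followed by the attribute value bytes. */
struct xattr_sketch {
    size_t size;
    char value[];
};

/* Allocate a header plus "size" value bytes; NULL on overflow or failure. */
struct xattr_sketch *xattr_sketch_alloc(const void *value, size_t size)
{
    struct xattr_sketch *x;
    size_t len = sizeof(*x) + size;

    /*
     * Unsigned wrap-around leaves len smaller than the header alone.
     * With "<=" a legitimate size of 0 (len == sizeof(*x)) would be
     * rejected as well.
     */
    if (len < sizeof(*x))
        return NULL;

    x = malloc(len);
    if (!x)
        return NULL;

    x->size = size;
    if (size)
        memcpy(x->value, value, size);
    return x;
}
```

The caller owns the returned object and releases it with `free()`.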