25 Jul, 2014

1 commit

  • The sa_restorer field in struct sigaction is obsolete and no longer in
    the parisc implementation. However, the core code assumes the field is
    present if SA_RESTORER is defined. So, the define needs to be removed.

    Signed-off-by: John David Anglin
    Cc:
    Signed-off-by: Helge Deller

    John David Anglin
     

24 Jul, 2014

19 commits

  • Pull nfsd bugfix from Bruce Fields:
    "Another regression from the xdr encoding rewrite"

    * 'for-3.16' of git://linux-nfs.org/~bfields/linux:
    NFSD: Fix crash encoding lock reply on 32-bit

    Linus Torvalds
     
  • Pull arm64 fix from Catalin Marinas:
    "Fix arm64 regression introduced by limiting the CMA buffer to ZONE_DMA
    on platforms where RAM starts above 4GB (and ZONE_DMA becoming 0)"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: Create non-empty ZONE_DMA when DRAM starts above 4GB

    Linus Torvalds
     
  • Pull Xtensa fixes from Chris Zankel:
    - resolve FIXMEs in double exception handler for window overflow. This
    fix makes native building of linux on xtensa host possible;
    - fix sysmem region removal issue introduced in 3.15.

    * tag 'xtensa-next-20140721' of git://github.com/czankel/xtensa-linux:
    xtensa: fix sysmem reservation at the end of existing block
    xtensa: add fixup for double exception raised in window overflow

    Linus Torvalds
     
  • Pull pin control fixes from Linus Walleij:
    "Here are three pin control fixes for the v3.16 series. Sorry that
    some of these arrive late, the summer heat in Sweden makes me slow.

    - an IRQ handling fix for the STi driver, also for stable
    - another IRQ fix for the RCAR GPIO driver
    - a MAINTAINERS entry"

    * tag 'pinctrl-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    gpio: rcar: Add support for DT IRQ flags
    MAINTAINERS: Add entry for the Renesas pin controller driver
    pinctrl: st: Fix irqmux handler

    Linus Torvalds
     
  • Pull libata regression fix from Tejun Heo:
    "The last libata/for-3.16-fixes pull contained a regression introduced
    by 1871ee134b73 ("libata: support the ata host which implements a
    queue depth less than 32") which in turn was a fix for a regression
    introduced earlier while changing queue tag order to accommodate hard
    drives which perform poorly if tags are not allocated in circular
    order (ugh...).

    The regression happens only for SAS controllers making use of libata
    to serve ATA devices. They don't fill an ata_host field which is used
    by the new tag allocation function leading to NULL dereference.

    This patch adds a new intermediate field ata_host->n_tags which is
    initialized for both SAS and !SAS cases to fix the issue"

    * 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
    libata: introduce ata_host->n_tags to avoid oops on SAS controllers

    Linus Torvalds
     
  • Pull input layer fixes from Dmitry Torokhov:
    "A few fixups for the input subsystem"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: document INPUT_PROP_TOPBUTTONPAD
    Input: fix defuzzing logic
    Input: sirfsoc-onkey - fix GPL v2 license string typo
    Input: st-keyscan - fix 'defined but not used' compiler warnings
    Input: synaptics - add min/max quirk for pnp-id LEN2002 (Edge E531)
    Input: i8042 - add Acer Aspire 5710 to nomux blacklist
    Input: ti_am335x_tsc - warn about incorrect spelling
    Input: wacom - cleanup multitouch code when touch_max is 2

    Linus Torvalds
     
  • Pull powerpc fixes from Ben Herrenschmidt:
    "Here is a handful of powerpc fixes for 3.16. They are all pretty
    simple and self contained and should still make this release"

    * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    powerpc: use _GLOBAL_TOC for memmove
    powerpc/pseries: dynamically added OF nodes need to call of_node_init
    powerpc: subpage_protect: Increase the array size to take care of 64TB
    powerpc: Fix bugs in emulate_step()
    powerpc: Disable doorbells on Power8 DD1.x

    Linus Torvalds
     
  • Pull slab fix from Mike Snitzer:
    "This fixes the broken duplicate slab name check in
    kmem_cache_sanity_check() that has been repeatedly reported (as
    recently as today against Fedora rawhide).

    Pekka seemed to have it staged for a late 3.15-rc in his 'slab/urgent'
    branch but never sent a pull request, see:
    https://lkml.org/lkml/2014/5/23/648"

    * tag 'urgent-slab-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    slab_common: fix the check for duplicate slab names

    Linus Torvalds
     
  • Merge fixes from Andrew Morton:
    "10 fixes"

    * emailed patches from Andrew Morton:
    mm: hugetlb: fix copy_hugetlb_page_range()
    simple_xattr: permit 0-size extended attributes
    mm/fs: fix pessimization in hole-punching pagecache
    shmem: fix splicing from a hole while it's punched
    shmem: fix faulting into a hole, not taking i_mutex
    mm: do not call do_fault_around for non-linear fault
    sh: also try passing -m4-nofpu for SH2A builds
    zram: avoid lockdep splat by revalidate_disk
    mm/rmap.c: fix pgoff calculation to handle hugepage correctly
    coredump: fix the setting of PF_DUMPCORE

    Linus Torvalds
     
  • Commit 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
    migration/hwpoisoned entry") changed the order of
    huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
    in some workloads like hugepage-backed heap allocation via libhugetlbfs.
    This patch fixes it.
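
    Why the order of the two calls matters can be sketched in user space.
    This is a toy model only: struct toy_pte and both helpers are
    hypothetical, and it assumes the breakage comes from reading the
    parent's entry before write-protecting it, so that the child inherits
    a still-writable copy.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a huge page table entry; not the kernel's pte_t. */
struct toy_pte {
    unsigned long pfn;
    bool writable;
};

/* Assumed buggy order: snapshot first, write-protect afterwards.
 * The snapshot handed to the child is stale and still writable. */
static struct toy_pte copy_entry_buggy(struct toy_pte *src)
{
    struct toy_pte entry = *src;   /* plays the role of huge_ptep_get() */
    src->writable = false;         /* plays the role of huge_ptep_set_wrprotect() */
    return entry;
}

/* Assumed fixed order: write-protect first, then read the entry,
 * so parent and child both end up read-only and copy-on-write works. */
static struct toy_pte copy_entry_fixed(struct toy_pte *src)
{
    src->writable = false;
    struct toy_pte entry = *src;
    return entry;
}
```

    With the fixed order the child's entry is read-only, so the first
    write in either process takes a copy-on-write fault.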

    The test program for the problem is shown below:

    $ cat heap.c
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define HPS 0x200000

    int main() {
        int i;
        char *p = malloc(HPS);
        memset(p, '1', HPS);
        for (i = 0; i < 5; i++) {
            if (!fork()) {
                memset(p, '2', HPS);
                p = malloc(HPS);
                memset(p, '3', HPS);
                free(p);
                return 0;
            }
        }
        sleep(1);
        free(p);
        return 0;
    }

    $ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap

    Fixes 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
    migration/hwpoisoned entry"), so is applicable to -stable kernels which
    include it.

    Signed-off-by: Naoya Horiguchi
    Reported-by: Guillaume Morin
    Suggested-by: Guillaume Morin
    Acked-by: Hugh Dickins
    Cc: [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • If a filesystem uses simple_xattr to support user extended attributes,
    LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
    memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
    values of zero size. Fix that off-by-one (but apparently no filesystem
    needs them yet).
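
    The off-by-one can be illustrated with a small user-space sketch.
    TOY_HDR_SIZE and both helpers are hypothetical; the sketch assumes the
    wrap-around test compares the summed length against the header size,
    with "<=" being the mistaken comparison and "<" the corrected one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header size; stands in for sizeof(struct simple_xattr). */
#define TOY_HDR_SIZE ((size_t)32)

/* Overly strict test: a zero-size value gives len == header size,
 * which is rejected as if the addition had wrapped around. */
static bool alloc_len_ok_buggy(size_t size)
{
    size_t len = TOY_HDR_SIZE + size;   /* unsigned, so overflow wraps */
    return !(len <= TOY_HDR_SIZE);
}

/* Corrected test: only a genuine wrap makes len smaller than the header. */
static bool alloc_len_ok_fixed(size_t size)
{
    size_t len = TOY_HDR_SIZE + size;
    return !(len < TOY_HDR_SIZE);
}
```

    A zero-size value passes only the corrected check, while a size near
    SIZE_MAX is rejected by both.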

    Signed-off-by: Hugh Dickins
    Cc: Al Viro
    Cc: Jeff Layton
    Cc: Aristeu Rozanski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • I wanted to revert my v3.1 commit d0823576bf4b ("mm: pincer in
    truncate_inode_pages_range"), to keep truncate_inode_pages_range() in
    synch with shmem_undo_range(); but have stepped back - a change to
    hole-punching in truncate_inode_pages_range() is a change to
    hole-punching in every filesystem (except tmpfs) that supports it.

    If there's a logical proof why no filesystem can depend for its own
    correctness on the pincer guarantee in truncate_inode_pages_range() - an
    instant when the entire hole is removed from pagecache - then let's
    revisit later. But the evidence is that only tmpfs suffered from the
    livelock, and we have no intention of extending hole-punch to ramfs. So
    for now just add a few comments (to match or differ from those in
    shmem_undo_range()), and fix one silliness noticed in d0823576bf4b...

    Its "index == start" addition to the hole-punch termination test was
    incomplete: it opened a way for the end condition to be missed, and the
    loop go on looking through the radix_tree, all the way to end of file.
    Fix that pessimization by resetting index when detected in inner loop.

    Note that it's actually hard to hit this case, without the obsessive
    concurrent faulting that trinity does: normally all pages are removed in
    the initial trylock_page() pass, and this loop finds nothing to do. I
    had to "#if 0" out the initial pass to reproduce bug and test fix.

    Signed-off-by: Hugh Dickins
    Cc: Sasha Levin
    Cc: Konstantin Khlebnikov
    Cc: Lukas Czerner
    Cc: Dave Jones
    Acked-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • shmem_fault() is the actual culprit in trinity's hole-punch starvation,
    and the most significant cause of such problems: since a page faulted is
    one that then appears page_mapped(), needing unmap_mapping_range() and
    i_mmap_mutex to be unmapped again.

    But it is not the only way in which a page can be brought into a hole in
    the radix_tree while that hole is being punched; and Vlastimil's testing
    implies that if enough other processors are busy filling in the hole,
    then shmem_undo_range() can be kept from completing indefinitely.

    shmem_file_splice_read() is the main other user of SGP_CACHE, which can
    instantiate shmem pagecache pages in the read-only case (without holding
    i_mutex, so perhaps concurrently with a hole-punch). Probably it's
    silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
    ought to be safe, but might bring surprises - not a change to be rushed.

    shmem_read_mapping_page_gfp() is an internal interface used by
    drivers/gpu/drm GEM (and next by uprobes): it should be okay. And
    shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
    called internally by the kernel (perhaps for a stacking filesystem,
    which might rely on holes to be reserved): it's unclear whether it could
    be provoked to keep hole-punch busy or not.

    We could apply the same umbrella as now used in shmem_fault() to
    shmem_file_splice_read() and the others; but it looks ugly, and use over
    a range raises questions - should it actually be per page? can these get
    starved themselves?

    The origin of this part of the problem is my v3.1 commit d0823576bf4b
    ("mm: pincer in truncate_inode_pages_range"), once it was duplicated
    into shmem.c. It seemed like a nice idea at the time, to ensure
    (barring RCU lookup fuzziness) that there's an instant when the entire
    hole is empty; but the indefinitely repeated scans to ensure that make
    it vulnerable.

    Revert that "enhancement" to hole-punch from shmem_undo_range(), but
    retain the unproblematic rescanning when it's truncating; add a couple
    of comments there.

    Remove the "indices[0] >= end" test: that is now handled satisfactorily
    by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
    to be worth avoiding here.

    But if we do not always loop indefinitely, we do need to handle the case
    of swap swizzled back to page before shmem_free_swap() gets it: add a
    retry for that case, as suggested by Konstantin Khlebnikov; and for the
    case of page swizzled back to swap, as suggested by Johannes Weiner.

    Signed-off-by: Hugh Dickins
    Reported-by: Sasha Levin
    Suggested-by: Vlastimil Babka
    Cc: Konstantin Khlebnikov
    Cc: Johannes Weiner
    Cc: Lukas Czerner
    Cc: Dave Jones
    Cc: [3.1+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Commit f00cdc6df7d7 ("shmem: fix faulting into a hole while it's
    punched") was buggy: Sasha sent a lockdep report to remind us that
    grabbing i_mutex in the fault path is a no-no (write syscall may already
    hold i_mutex while faulting user buffer).

    We tried a completely different approach (see following patch) but that
    proved inadequate: good enough for a rational workload, but not good
    enough against trinity - which forks off so many mappings of the object
    that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
    into serious starvation when concurrent faults force the puncher to fall
    back to single-page unmap_mapping_range() searches of the i_mmap tree.

    So return to the original umbrella approach, but keep away from i_mutex
    this time. We really don't want to bloat every shmem inode with a new
    mutex or completion, just to protect this unlikely case from trinity.
    So extend the original with wait_queue_head on stack at the hole-punch
    end, and wait_queue item on the stack at the fault end.

    This involves further use of i_lock to guard against the races: lockdep
    has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
    i_lock around wake_up_bit(), which is comparable to what we do here.
    i_lock is more convenient, but we could switch to shmem's info->lock.

    This issue has been tagged with CVE-2014-4171, which will require commit
    f00cdc6df7d7 and this and the following patch to be backported: we
    suggest to 3.1+, though in fact the trinity forkbomb effect might go
    back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
    not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
    Anyone running trinity on 3.0 and earlier? I don't think we need care.

    Signed-off-by: Hugh Dickins
    Reported-by: Sasha Levin
    Tested-by: Sasha Levin
    Cc: Vlastimil Babka
    Cc: Konstantin Khlebnikov
    Cc: Johannes Weiner
    Cc: Lukas Czerner
    Cc: Dave Jones
    Cc: [3.1+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Ingo Korb reported that "repeated mapping of the same file on tmpfs
    using remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when
    the process exits".

    He bisected the bug to d7c1755179b8 ("mm: implement ->map_pages for
    shmem/tmpfs"), although the bug was actually added by commit
    8c6e50b0290c ("mm: introduce vm_ops->map_pages()").

    The problem is caused by calling do_fault_around for a _non-linear_
    fault. In this case pgoff is shifted and might become negative during
    calculation.

    Faulting around non-linear page-fault makes no sense and breaks the
    logic in do_fault_around because pgoff is shifted.
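
    A rough user-space sketch of the arithmetic, under stated assumptions:
    the window size, the helper names and the guard function are all
    hypothetical, and the sketch assumes the fault-around code derives the
    first file offset of its window by subtracting a page count from
    pgoff. For a linear mapping pgoff is at least that count; for a
    non-linear one it is arbitrary, so the unsigned subtraction can wrap.

```c
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12
#define TOY_FAULT_AROUND_PAGES 16UL   /* illustrative window size */

/* Number of pages between the window start and the faulting address. */
static unsigned long window_offset(unsigned long address, unsigned long vm_start)
{
    unsigned long window_bytes = TOY_FAULT_AROUND_PAGES << TOY_PAGE_SHIFT;
    unsigned long start = address & ~(window_bytes - 1);

    if (start < vm_start)
        start = vm_start;   /* never start before the mapping itself */
    return (address - start) >> TOY_PAGE_SHIFT;
}

/* File offset of the first page in the window. Wraps to a huge value
 * when pgoff < off, which can only happen for a non-linear fault. */
static unsigned long window_start_pgoff(unsigned long pgoff, unsigned long off)
{
    return pgoff - off;
}

/* The guard in sketch form: skip fault-around for non-linear faults. */
static bool may_fault_around(bool nonlinear_fault)
{
    return !nonlinear_fault;
}
```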

    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Ingo Korb
    Tested-by: Ingo Korb
    Cc: Hugh Dickins
    Cc: Sasha Levin
    Cc: Dave Jones
    Cc: Ning Qu
    Cc: "Kirill A. Shutemov"
    Cc: [3.15.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • When compiling a SH2A kernel (e.g. se7206_defconfig or rsk7203_defconfig)
    using sh4-linux-gcc, linking fails with:

    net/built-in.o: In function `__sk_run_filter':
    net/core/filter.c:566: undefined reference to `__fpscr_values'
    net/core/filter.c:269: undefined reference to `__fpscr_values'
    ...
    net/built-in.o:net/core/filter.c:580: more undefined references to `__fpscr_values' follow

    This happens because sh4-linux-gcc doesn't support the "-m2a-nofpu",
    which is thus filtered out by "$(call cc-option, ...)".

    As compiling using sh4-linux-gcc is useful for compile coverage, also
    try passing "-m4-nofpu" (which is presumably filtered out when using a
    real sh2a-linux toolchain) to disable the generation of FPU instructions
    and references to __fpscr_values[].

    Signed-off-by: Geert Uytterhoeven
    Cc: Guenter Roeck
    Cc: Tony Breeds
    Cc: Alexei Starovoitov
    Cc: Fengguang Wu
    Cc: Daniel Borkmann
    Cc: Magnus Damm
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • Sasha reported lockdep warning [1] introduced by [2].

    It could be fixed by doing disk revalidation out of the init_lock. It's
    okay because disk capacity change is protected by init_lock so that
    revalidate_disk always sees up-to-date value so there is no race.

    [1] https://lkml.org/lkml/2014/7/3/735
    [2] zram: revalidate disk after capacity change

    Fixes 2e32baea46ce ("zram: revalidate disk after capacity change").

    Signed-off-by: Minchan Kim
    Reported-by: Sasha Levin
    Cc: "Alexander E. Patrakov"
    Cc: Nitin Gupta
    Cc: Jerome Marchand
    Cc: Sergey Senozhatsky
    CC:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • I triggered VM_BUG_ON() in vma_address() when I tried to migrate an
    anonymous hugepage with mbind() in the kernel v3.16-rc3. This is
    because pgoff's calculation in rmap_walk_anon() fails to consider
    compound_order() only to have an incorrect value.

    This patch introduces page_to_pgoff(), which gets the page's offset in
    PAGE_CACHE_SIZE.
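
    The conversion can be sketched as follows. struct toy_page and
    toy_page_to_pgoff() are hypothetical simplifications, and the sketch
    assumes a hugepage's index counts in units of the huge page size, so
    it has to be scaled by the compound order to get an offset in base
    pages.

```c
/* Hypothetical, simplified page descriptor; not the kernel's struct page. */
struct toy_page {
    unsigned long index;   /* offset in units of this page's own size */
    unsigned int order;    /* 0 for a base page, e.g. 9 for 2MB over 4KB pages */
};

/* Offset in base pages: scale the index by the page's compound order. */
static unsigned long toy_page_to_pgoff(const struct toy_page *page)
{
    return page->index << page->order;
}
```

    For a base page the shift is zero and the index is returned unchanged.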

    Kirill pointed out that page cache tree should natively handle
    hugepages, and in order to make hugetlbfs fit it, page->index of
    hugetlbfs page should be in PAGE_CACHE_SIZE. This is beyond this patch,
    but page_to_pgoff() contains the point to be fixed in a single function.

    Signed-off-by: Naoya Horiguchi
    Acked-by: Kirill A. Shutemov
    Cc: Joonsoo Kim
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Cc: Hillf Danton
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • Commit 079148b919d0 ("coredump: factor out the setting of PF_DUMPCORE")
    cleaned up the setting of PF_DUMPCORE by removing it from all the
    linux_binfmt->core_dump() and moving it to zap_threads(). But this ended
    up clearing all the previously set flags. This causes issues during
    core generation when tsk->flags is checked again (e.g. for PF_USED_MATH
    to dump floating point registers). Fix this.
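
    A minimal sketch of how setting one flag can clear the others,
    assuming the culprit is a plain assignment where a bitwise OR was
    intended. The TOY_ flag values and helper names are illustrative only.

```c
/* Illustrative flag values; the real definitions live in kernel headers. */
#define TOY_PF_DUMPCORE  0x00000200u
#define TOY_PF_USED_MATH 0x00002000u

/* Assumed buggy form: assignment replaces every previously set flag. */
static unsigned int mark_dumping_buggy(unsigned int flags)
{
    flags = TOY_PF_DUMPCORE;
    return flags;
}

/* Assumed fixed form: OR the new bit in and keep the rest. */
static unsigned int mark_dumping_fixed(unsigned int flags)
{
    flags |= TOY_PF_DUMPCORE;
    return flags;
}
```

    A later check for TOY_PF_USED_MATH succeeds only with the second form.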

    Signed-off-by: Silesh C V
    Acked-by: Oleg Nesterov
    Cc: Mandeep Singh Baines
    Cc: [3.10+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Silesh C V
     

23 Jul, 2014

5 commits

  • Commit 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
    on space" forgot to free conf->data in nfsd4_encode_lockt, and in
    nfsd4_encode_lock_denied before setting conf->data to NULL, causing a leak.

    Worse, kfree() can be called on an uninitialized pointer in the case of
    a successful lock (or one that fails for a reason other than a conflict).

    (Note that lock->lk_denied.ld_owner.data appears it should be zero here,
    until you notice that it's one arm of a union the other arm of which is
    written to in the successful case by the

    memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
    sizeof(stateid_t));

    in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.)
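
    The union overlap can be reproduced with a toy layout. Everything
    below is hypothetical: the field order (an 8-byte client id followed
    by a length and a data pointer) is an assumption, and pointer width is
    modelled with fixed-size integers so both cases can be compared on one
    machine.

```c
#include <stdint.h>
#include <string.h>

#define TOY_STATEID_SIZE 16

/* "Denied" arm with a 32-bit data pointer: data sits at byte offset 12. */
struct toy_denied32 {
    uint32_t clientid[2];
    uint32_t owner_len;
    uint32_t owner_data;
};

/* Same arm with a 64-bit data pointer: data sits at byte offset 16. */
struct toy_denied64 {
    uint32_t clientid[2];
    uint32_t owner_len;
    uint32_t pad;           /* padding a 64-bit ABI would insert */
    uint64_t owner_data;
};

union toy_resp32 { uint8_t stateid[TOY_STATEID_SIZE]; struct toy_denied32 denied; };
union toy_resp64 { uint8_t stateid[TOY_STATEID_SIZE]; struct toy_denied64 denied; };

/* Write a 16-byte stateid, then report whether owner_data was clobbered. */
static int data_clobbered32(void)
{
    union toy_resp32 r;

    memset(&r, 0, sizeof(r));
    memset(r.stateid, 0xff, sizeof(r.stateid));
    return r.denied.owner_data != 0;
}

static int data_clobbered64(void)
{
    union toy_resp64 r;

    memset(&r, 0, sizeof(r));
    memset(r.stateid, 0xff, sizeof(r.stateid));
    return r.denied.owner_data != 0;
}
```

    Only the 32-bit layout places the data pointer inside the first 16
    bytes, which is why the stale pointer shows up there alone.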

    Signed-off-by: Kinglong Mee
    Fixes: 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space"
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • 1871ee134b73 ("libata: support the ata host which implements a queue
    depth less than 32") directly used ata_port->scsi_host->can_queue from
    ata_qc_new() to determine the number of tags supported by the host;
    unfortunately, SAS controllers doing SATA don't initialize ->scsi_host
    leading to the following oops.

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
    IP: [] ata_qc_new_init+0x188/0x1b0
    PGD 0
    Oops: 0002 [#1] SMP
    Modules linked in: isci libsas scsi_transport_sas mgag200 drm_kms_helper ttm
    CPU: 1 PID: 518 Comm: udevd Not tainted 3.16.0-rc6+ #62
    Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
    task: ffff880c1a00b280 ti: ffff88061a000000 task.ti: ffff88061a000000
    RIP: 0010:[] [] ata_qc_new_init+0x188/0x1b0
    RSP: 0018:ffff88061a003ae8 EFLAGS: 00010012
    RAX: 0000000000000001 RBX: ffff88000241ca80 RCX: 00000000000000fa
    RDX: 0000000000000020 RSI: 0000000000000020 RDI: ffff8806194aa298
    RBP: ffff88061a003ae8 R08: ffff8806194a8000 R09: 0000000000000000
    R10: 0000000000000000 R11: ffff88000241ca80 R12: ffff88061ad58200
    R13: ffff8806194aa298 R14: ffffffff814e67a0 R15: ffff8806194a8000
    FS: 00007f3ad7fe3840(0000) GS:ffff880627620000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000058 CR3: 000000061a118000 CR4: 00000000001407e0
    Stack:
    ffff88061a003b20 ffffffff814e96e1 ffff88000241ca80 ffff88061ad58200
    ffff8800b6bf6000 ffff880c1c988000 ffff880619903850 ffff88061a003b68
    ffffffffa0056ce1 ffff88061a003b48 0000000013d6e6f8 ffff88000241ca80
    Call Trace:
    [] ata_sas_queuecmd+0xa1/0x430
    [] sas_queuecommand+0x191/0x220 [libsas]
    [] scsi_dispatch_cmd+0x10e/0x300
    [] scsi_request_fn+0x2f5/0x550
    [] __blk_run_queue+0x33/0x40
    [] queue_unplugged+0x2a/0x90
    [] blk_flush_plug_list+0x1b4/0x210
    [] blk_finish_plug+0x14/0x50
    [] __do_page_cache_readahead+0x198/0x1f0
    [] force_page_cache_readahead+0x31/0x50
    [] page_cache_sync_readahead+0x3e/0x50
    [] generic_file_read_iter+0x496/0x5a0
    [] blkdev_read_iter+0x37/0x40
    [] new_sync_read+0x7e/0xb0
    [] vfs_read+0x94/0x170
    [] SyS_read+0x46/0xb0
    [] ? SyS_lseek+0x91/0xb0
    [] system_call_fastpath+0x16/0x1b
    Code: 00 00 00 88 50 29 83 7f 08 01 19 d2 83 e2 f0 83 ea 50 88 50 34 c6 81 1d 02 00 00 40 c6 81 17 02 00 00 00 5d c3 66 0f 1f 44 00 00 14 25 58 00 00 00

    Fix it by introducing ata_host->n_tags which is initialized to
    ATA_MAX_QUEUE - 1 in ata_host_init() for SAS controllers and set to
    scsi_host_template->can_queue in ata_host_register() for !SAS ones.
    As SAS hosts are never registered, this will give them the same
    ATA_MAX_QUEUE - 1 as before. Note that we can't use
    scsi_host->can_queue directly for SAS hosts anyway as they can go
    higher than the libata maximum.
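
    A simplified user-space model of the scheme described above. The toy_
    names, the value 32 for the queue size and the bitmap allocator are
    illustrative assumptions; the point is only that the tag limit lives
    in the host structure, which exists for SAS and non-SAS alike.

```c
#define TOY_ATA_MAX_QUEUE 32U   /* illustrative queue size */

struct toy_host {
    unsigned int n_tags;        /* number of usable tags */
};

/* Every host, SAS or not, gets the default at init time. */
static void toy_host_init(struct toy_host *host)
{
    host->n_tags = TOY_ATA_MAX_QUEUE - 1;
}

/* Only registered (non-SAS) hosts override it with their queue depth. */
static void toy_host_register(struct toy_host *host, unsigned int can_queue)
{
    host->n_tags = can_queue;
}

/* Circular scan starting after last_tag, skipping tags the host lacks.
 * Returns the allocated tag, or -1 if none is free. */
static int toy_alloc_tag(const struct toy_host *host, unsigned long *busy,
                         unsigned int last_tag)
{
    for (unsigned int i = 0; i < TOY_ATA_MAX_QUEUE; i++) {
        unsigned int tag = (i + last_tag + 1) % TOY_ATA_MAX_QUEUE;

        if (tag >= host->n_tags)
            continue;
        if (!(*busy & (1UL << tag))) {
            *busy |= 1UL << tag;
            return (int)tag;
        }
    }
    return -1;
}
```

    A host that is never registered keeps the default, which matches the
    behaviour SAS hosts had before.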

    Signed-off-by: Tejun Heo
    Reported-by: Mike Qiu
    Reported-by: Jesse Brandeburg
    Reported-by: Peter Hurley
    Reported-by: Peter Zijlstra
    Tested-by: Alexey Kardashevskiy
    Fixes: 1871ee134b73 ("libata: support the ata host which implements a queue depth less than 32")
    Cc: Kevin Hao
    Cc: Dan Williams
    Cc: stable@vger.kernel.org

    Tejun Heo
     
  • ZONE_DMA is created to allow 32-bit only devices to access memory in the
    absence of an IOMMU. On systems where the memory starts above 4GB, it is
    expected that some devices have a DMA offset hardwired to be able to
    access the bottom of the memory. Linux currently supports DT bindings
    for the DMA offsets but they are not (easily) available early during
    boot.

    This patch tries to guess a DMA offset and assumes that ZONE_DMA
    corresponds to the 32-bit mask above the start of DRAM.
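
    The guess can be sketched in plain C. toy_max_zone_dma_phys() is a
    hypothetical helper, and rounding the start of DRAM down to a 4GB
    boundary to obtain the offset is an assumption about how "the 32-bit
    mask above the start of DRAM" is computed.

```c
#include <stdint.h>

/* Upper physical limit of the DMA zone for a given DRAM range. */
static uint64_t toy_max_zone_dma_phys(uint64_t dram_start, uint64_t dram_end)
{
    /* Guessed DMA offset: start of DRAM rounded down to a 4GB boundary. */
    uint64_t offset = dram_start & 0xffffffff00000000ULL;
    uint64_t limit = offset + (1ULL << 32);

    /* The zone cannot extend past the end of DRAM. */
    return limit < dram_end ? limit : dram_end;
}
```

    With DRAM starting at 512GB the zone covers the first 4GB of memory
    instead of being empty.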

    Fixes: 2d5a5612bc (arm64: Limit the CMA buffer to 32-bit if ZONE_DMA)
    Signed-off-by: Catalin Marinas
    Reported-by: Mark Salter
    Tested-by: Mark Salter
    Tested-by: Anup Patel

    Catalin Marinas
     
  • Signed-off-by: Peter Hutterer
    Signed-off-by: Dmitry Torokhov

    Peter Hutterer
     
  • …erg/linux into for-3.16-rcX

    Mike Snitzer
     

22 Jul, 2014

15 commits

  • memmove may be called from module code copy_pages(btrfs), and it may
    call memcpy, which may call back to C code, so it needs to use
    _GLOBAL_TOC to set up r2 correctly.

    This fixes the following error when I tried to boot an le guest:

    Vector: 300 (Data Access) at [c000000073f97210]
    pc: c000000000015004: enable_kernel_altivec+0x24/0x80
    lr: c000000000058fbc: enter_vmx_copy+0x3c/0x60
    sp: c000000073f97490
    msr: 8000000002009033
    dar: d000000001d50170
    dsisr: 40000000
    current = 0xc0000000734c0000
    paca = 0xc00000000fff0000 softe: 0 irq_happened: 0x01
    pid = 815, comm = mktemp
    enter ? for help
    [c000000073f974f0] c000000000058fbc enter_vmx_copy+0x3c/0x60
    [c000000073f97510] c000000000057d34 memcpy_power7+0x274/0x840
    [c000000073f97610] d000000001c3179c copy_pages+0xfc/0x110 [btrfs]
    [c000000073f97660] d000000001c3c248 memcpy_extent_buffer+0xe8/0x160 [btrfs]
    [c000000073f97700] d000000001be4be8 setup_items_for_insert+0x208/0x4a0 [btrfs]
    [c000000073f97820] d000000001be50b4 btrfs_insert_empty_items+0xf4/0x140 [btrfs]
    [c000000073f97890] d000000001bfed30 insert_with_overflow+0x70/0x180 [btrfs]
    [c000000073f97900] d000000001bff174 btrfs_insert_dir_item+0x114/0x2f0 [btrfs]
    [c000000073f979a0] d000000001c1f92c btrfs_add_link+0x10c/0x370 [btrfs]
    [c000000073f97a40] d000000001c20e94 btrfs_create+0x204/0x270 [btrfs]
    [c000000073f97b00] c00000000026d438 vfs_create+0x178/0x210
    [c000000073f97b50] c000000000270a70 do_last+0x9f0/0xe90
    [c000000073f97c20] c000000000271010 path_openat+0x100/0x810
    [c000000073f97ce0] c000000000272ea8 do_filp_open+0x58/0xd0
    [c000000073f97dc0] c00000000025ade8 do_sys_open+0x1b8/0x300
    [c000000073f97e30] c00000000000a008 syscall_exit+0x0/0x7c

    Signed-off-by: Benjamin Herrenschmidt

    Li Zhong
     
  • Commit 75b57ecf9 refactored device tree nodes to use kobjects such that they
    can be exposed via /sysfs. A secondary commit 0829f6d1f furthered this rework
    by moving the kobject initialization logic out of of_node_add into its own
    of_node_init function. The initial commit removed the existing kref_init calls
    in the pseries dlpar code with the assumption kobject initialization would
    occur in of_node_add. The second commit had the side effect of triggering a
    BUG_ON during DLPAR, migration and suspend/resume operations as a result of
    dynamically added nodes being uninitialized.

    This patch fixes this by adding of_node_init calls in place of the previously
    removed kref_init calls.

    Fixes: 0829f6d1f69e ("of: device_node kobject lifecycle fixes")
    Cc: stable@vger.kernel.org
    Signed-off-by: Tyrel Datwyler
    Acked-by: Nathan Fontenot
    Acked-by: Grant Likely
    Signed-off-by: Benjamin Herrenschmidt

    Tyrel Datwyler
     
    We now support TASK_SIZE of 64TB, hence the array should be 8.

    Fixes the below crash:

    Unable to handle kernel paging request for data at address 0x000100bd
    Faulting instruction address: 0xc00000000004f914
    cpu 0x13: Vector: 300 (Data Access) at [c000000fea75fa90]
    pc: c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0
    lr: c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0
    sp: c000000fea75fd10
    msr: 9000000000009032
    dar: 100bd
    dsisr: 40000000
    current = 0xc000000fea6ae490
    paca = 0xc00000000fb8ab00 softe: 0 irq_happened: 0x00
    pid = 8237, comm = a.out
    enter ? for help
    [c000000fea75fe30] c00000000000a164 syscall_exit+0x0/0x98

    Signed-off-by: Benjamin Herrenschmidt

    Aneesh Kumar K.V
     
  • This fixes some bugs in emulate_step(). First, the setting of the carry
    bit for the arithmetic right-shift instructions was not correct on 64-bit
    machines because we were masking with a mask of type int rather than
    unsigned long. Secondly, the sld (shift left doubleword) instruction was
    using the wrong instruction field for the register containing the shift
    count.
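
    The carry computation for a 64-bit arithmetic right shift can be
    sketched like this. toy_srad_carry() is a hypothetical helper, not the
    emulation code itself; it assumes carry is set when the source is
    negative and at least one 1 bit is shifted out, and it builds the mask
    in a 64-bit type so shift counts of 32 and above are well defined.

```c
#include <stdint.h>

/* Carry out of an arithmetic right shift by sh bits, with sh in [0, 63]. */
static int toy_srad_carry(int64_t val, unsigned int sh)
{
    /* 64-bit mask of the bits that will be shifted out. An int-typed
     * "1 << sh" would not be valid for sh >= 31. */
    uint64_t mask = (UINT64_C(1) << sh) - 1;

    return val < 0 && ((uint64_t)val & mask) != 0;
}
```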

    Signed-off-by: Paul Mackerras
    Signed-off-by: Benjamin Herrenschmidt

    Paul Mackerras
     
  • These processors do not currently support doorbell IPIs, so remove them
    from the feature list if we are at DD 1.xx for the 0x004d part.

    This fixes a regression caused by d4e58e5928f8 (powerpc/powernv: Enable
    POWER8 doorbell IPIs). With that patch the kernel would hang at boot
    when calling smp_call_function_many, as the doorbell would not be
    received by the target CPUs:

    .smp_call_function_many+0x2bc/0x3c0 (unreliable)
    .on_each_cpu_mask+0x30/0x100
    .cpuidle_register_driver+0x158/0x1a0
    .cpuidle_register+0x2c/0x110
    .powernv_processor_idle_init+0x23c/0x2c0
    .do_one_initcall+0xd4/0x260
    .kernel_init_freeable+0x25c/0x33c
    .kernel_init+0x1c/0x120
    .ret_from_kernel_thread+0x58/0x7c

    Fixes: d4e58e5928f8 (powerpc/powernv: Enable POWER8 doorbell IPIs)
    Signed-off-by: Joel Stanley
    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
     
  • Pull networking fixes from David Miller:

    1) Null termination fix in dns_resolver got the pointer dereferencing
    wrong, fix from Ben Hutchings.

    2) ip_options_compile() has a benign but real buffer overflow when
    parsing options. From Eric Dumazet.

    3) Table updates can crash in netfilter's nftables if none of the state
    flags indicate an actual change, from Pablo Neira Ayuso.

    4) Fix race in nf_tables dumping, also from Pablo.

    5) GRE-GRO support broke the forwarding path because the segmentation
    state was not fully initialized in these paths, from Jerry Chu.

    6) sunvnet driver leaks objects and potentially crashes on module
    unload, from Sowmini Varadhan.

    7) We can accidentally generate the same handle for several u32
    classifier filters, fix from Cong Wang.

    8) Several edge case bug fixes in fragment handling in xen-netback,
    from Zoltan Kiss.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
    ipv4: fix buffer overflow in ip_options_compile()
    batman-adv: fix TT VLAN inconsistency on VLAN re-add
    batman-adv: drop QinQ claim frames in bridge loop avoidance
    dns_resolver: Null-terminate the right string
    xen-netback: Fix pointer incrementation to avoid incorrect logging
    xen-netback: Fix releasing header slot on error path
    xen-netback: Fix releasing frag_list skbs in error path
    xen-netback: Fix handling frag_list on grant op error path
    net_sched: avoid generating same handle for u32 filters
    net: huawei_cdc_ncm: add "subclass 3" devices
    net: qmi_wwan: add two Sierra Wireless/Netgear devices
    wan/x25_asy: integer overflow in x25_asy_change_mtu()
    net: ppp: fix creating PPP pass and active filters
    net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
    sunvnet: clean up objects created in vnet_new() on vnet_exit()
    r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
    net-gre-gro: Fix a bug that breaks the forwarding path
    netfilter: nf_tables: 64bit stats need some extra synchronization
    netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
    netfilter: nf_tables: safe RCU iteration on list when dumping
    ...

    Linus Torvalds
     
  • Pull sparc fix from David Miller:
    "Need to hook up the new renameat2 system call"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    sparc: Hook up renameat2 syscall.

    Linus Torvalds
     
  • Pull IDE fixes from David Miller:
    - fix interrupt registry for some Atari IDE chipsets.
    - adjust Kconfig dependencies for x86_32 specific chips.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
    ide: Fix SC1200 dependencies
    ide: Fix CS5520 and CS5530 dependencies
    m68k/atari - ide: do not register interrupt if host->get_lock is set

    Linus Torvalds
     
  • …it/rostedt/linux-trace

    Pull trace fix from Steven Rostedt:
    "Tony Luck found that using the "uptime" trace clock that uses jiffies
    as a counter was converted to nanoseconds (silly), and after 1 hour 11
    minutes and 34 seconds, this monotonic clock would wrap, causing havoc
    with the tracing system and making the clock useless.

    He converted that clock to use jiffies_64 and made it into a counter
    instead of nanosecond conversions, and displayed the clock with the
    straight jiffy count, which works much better than it did in the past"
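
    The figure of 1 hour 11 minutes and 34 seconds is consistent with a
    32-bit microsecond value overflowing. That cause is an assumption, not
    something the message above states. The arithmetic, with hypothetical
    names:

```c
#include <stdint.h>

struct toy_hms {
    unsigned int hours;
    unsigned int minutes;
    unsigned int seconds;
};

/* Time until a 32-bit count of microseconds wraps, in whole seconds. */
static struct toy_hms toy_wrap_time_32bit_usecs(void)
{
    uint64_t total_seconds = (UINT64_C(1) << 32) / 1000000;   /* 4294 */
    struct toy_hms t;

    t.hours = (unsigned int)(total_seconds / 3600);
    t.minutes = (unsigned int)((total_seconds % 3600) / 60);
    t.seconds = (unsigned int)(total_seconds % 60);
    return t;
}
```

    2^32 microseconds is 4294 whole seconds, which splits into 1 hour, 11
    minutes and 34 seconds.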

    * tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Fix wraparound problems in "uptime" trace clock

    Linus Torvalds
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Antonio Quartulli says:

    ====================
    pull request [net]: batman-adv 20140721

    here you have two fixes that we have been testing for quite some time
    (this is why they arrived a bit late in the rc cycle).

    Patch 1) ensures that BLA packets get dropped and not forwarded to the
    mesh even if they reach batman-adv within QinQ frames. Forwarding them
    into the mesh means messing up the TT database of other nodes, which
    can generate all kinds of unexpected behaviours during route computation.

    Patch 2) avoids a couple of race conditions triggered upon fast VLAN
    deletion-addition. Such race conditions are pretty dangerous because
    they not only create inconsistencies in the TT database of the nodes
    in the network, but such a scenario is also unrecoverable (unless
    nodes are rebooted).
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • There is a benign buffer overflow in ip_options_compile spotted by
    AddressSanitizer[1] :

    It's benign because we can always access one extra byte in skb->head
    (because header is followed by struct skb_shared_info), and in this case
    this byte is not even used.

    [28504.910798] ==================================================================
    [28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
    [28504.913170] Read of size 1 by thread T15843:
    [28504.914026] [] ip_options_compile+0x121/0x9c0
    [28504.915394] [] ip_options_get_from_user+0xad/0x120
    [28504.916843] [] do_ip_setsockopt.isra.15+0x8df/0x1630
    [28504.918175] [] ip_setsockopt+0x30/0xa0
    [28504.919490] [] tcp_setsockopt+0x5b/0x90
    [28504.920835] [] sock_common_setsockopt+0x5f/0x70
    [28504.922208] [] SyS_setsockopt+0xa2/0x140
    [28504.923459] [] system_call_fastpath+0x16/0x1b
    [28504.924722]
    [28504.925106] Allocated by thread T15843:
    [28504.925815] [] ip_options_get_from_user+0x35/0x120
    [28504.926884] [] do_ip_setsockopt.isra.15+0x8df/0x1630
    [28504.927975] [] ip_setsockopt+0x30/0xa0
    [28504.929175] [] tcp_setsockopt+0x5b/0x90
    [28504.930400] [] sock_common_setsockopt+0x5f/0x70
    [28504.931677] [] SyS_setsockopt+0xa2/0x140
    [28504.932851] [] system_call_fastpath+0x16/0x1b
    [28504.934018]
    [28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
    [28504.934377] of 40-byte region [ffff880026382800, ffff880026382828)
    [28504.937144]
    [28504.937474] Memory state around the buggy address:
    [28504.938430] ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.939884] ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.941294] ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.942504] ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.943483] ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.945573] ^
    [28504.946277] ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.094949] ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.096114] ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.097116] ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.098472] ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.099804] Legend:
    [28505.100269] f - 8 freed bytes
    [28505.100884] r - 8 redzone bytes
    [28505.101649] . - 8 allocated bytes
    [28505.102406] x=1..7 - x allocated bytes + (8-x) redzone bytes
    [28505.103637] ==================================================================

    [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
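
The report above shows a one-byte read immediately past a 40-byte option buffer. A minimal sketch of how a type/length option walker can avoid that kind of read follows. It assumes the usual IPv4 option layout (single-byte End-of-List and No-Op options, all others carrying a length byte). `walk_options` is a hypothetical function for illustration, not the code in `ip_options_compile`.

```c
#include <stddef.h>
#include <stdint.h>

enum { OPT_END = 0, OPT_NOOP = 1 };

/*
 * Walk a buffer of IPv4-style options.
 * Returns 0 if every option is well-formed, -1 otherwise.
 * The length byte is read only after confirming that it lies inside buf.
 */
static int walk_options(const uint8_t *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t type = buf[i];
        uint8_t optlen;

        if (type == OPT_END)
            return 0;
        if (type == OPT_NOOP) {
            i++;
            continue;
        }
        /* A multi-byte option needs at least a type and a length byte. */
        if (len - i < 2)
            return -1;
        optlen = buf[i + 1];
        /* The length covers the type and length bytes and must fit. */
        if (optlen < 2 || optlen > len - i)
            return -1;
        i += optlen;
    }
    return 0;
}
```

If the last byte of a 40-byte buffer holds a multi-byte option type, the check on `len - i` rejects it before `buf[40]` is touched.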
     
  • Pull media fixes from Mauro Carvalho Chehab:
    "A series of driver fixes:
    - fix DVB-S tuning with tda10071
    - fix tuner probe on af9035 when the device has a bad eeprom
    - some fixes for the new si2168/2157 drivers
    - one Kconfig build fix (for omap4iss)
    - fixes at vpif error path
    - don't lock saa7134 ioctl at driver's base core level, as it now
    uses V4L2 and VB2 locking schema
    - fix audio at hdpvr driver
    - fix the aspect ratio at the digital timings table
    - one new USB ID (at gspca_pac7302): Genius i-Look 317 webcam"

    * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    [media] gspca_pac7302: Add new usb-id for Genius i-Look 317
    [media] tda10071: fix returned symbol rate calculation
    [media] tda10071: fix spec inversion reporting
    [media] tda10071: add missing DVB-S2/PSK-8 FEC AUTO
    [media] tda10071: force modulation to QPSK on DVB-S
    [media] hdpvr: fix two audio bugs
    [media] davinci: vpif: missing unlocks on error
    [media] af9035: override tuner id when bad value set into eeprom
    [media] saa7134: use unlocked_ioctl instead of ioctl
    [media] media: v4l2-core: v4l2-dv-timings.c: Cleaning up code wrong value used in aspect ratio
    [media] si2168: firmware download fix
    [media] si2157: add one missing parenthesis
    [media] si2168: add one missing parenthesis
    [media] staging: tighten omap4iss dependencies

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:
    "Final block fixes for 3.16

    Four small fixes that should go into 3.16, have been queued up for a
    bit and delayed due to vacation and other euro duties. But here they
    are. The pull request contains:

    - Fix for a reported crash with shared tagging on SCSI from Christoph

    - A regression fix for drbd. From Lars Ellenberg.

    - Hooking up the compat ioctl for BLKZEROOUT, which requires no
    translation. From Mikulas.

    - A fix for a regression where we would crash on queue exit if the
    root_blkg is gone/not there. From Tejun"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    block: provide compat ioctl for BLKZEROOUT
    blkcg: don't call into policy draining if root_blkg is already gone
    drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
    block: don't assume last put of shared tags is for the host

    Linus Torvalds
     
  • Pull libata fixes from Tejun Heo:
    "Late libata fixes.

    The most important one is from Kevin Hao which makes sure that libata
    only allocates tags inside the max tag number the controller supports.
    libata always had this problem but the recent tag allocation change
    and addition of support for sata_fsl which only supports queue depth
    of 16 exposed the issue.

    Hans de Goede agreed to become the maintainer of libahci_platform
    which is under higher than usual development pressure from all the new
    controllers popping up from the ARM world"

    * 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
    ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
    drivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()
    libata: EH should handle AMNF error condition as a media error
    libata: support the ata host which implements a queue depth less than 32
    MAINTAINERS: Add Hans de Goede as ahci-platform maintainer

    Linus Torvalds
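
The libata message above describes keeping tag allocation inside the maximum tag number the controller supports. The idea can be sketched as a bitmap allocator bounded by the controller's queue depth. This is a simplified illustration under assumed names (`alloc_tag`, `MAX_TAGS`), not libata's actual allocator.

```c
#include <stdint.h>

#define MAX_TAGS 32u /* assumed upper bound on tags, for illustration only */

/*
 * Hand out the lowest free tag in [0, max_queue), or -1 if none is free.
 * in_use is a bitmap with one bit per tag; max_queue is the queue depth
 * the controller supports (16 for the example in the message above).
 */
static int alloc_tag(uint32_t *in_use, unsigned int max_queue)
{
    unsigned int tag;

    if (max_queue > MAX_TAGS)
        max_queue = MAX_TAGS;

    for (tag = 0; tag < max_queue; tag++) {
        uint32_t bit = UINT32_C(1) << tag;

        if (!(*in_use & bit)) {
            *in_use |= bit;
            return (int)tag;
        }
    }
    return -1; /* every tag the controller can accept is busy */
}
```

With `max_queue` set to 16, the allocator never returns tags 16 through 31, even though the bitmap has room for them.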