24 Jul, 2014

12 commits

  • Pull slab fix from Mike Snitzer:
    "This fixes the broken duplicate slab name check in
    kmem_cache_sanity_check() that has been repeatedly reported (as
    recently as today against Fedora rawhide).

    Pekka seemed to have it staged for a late 3.15-rc in his 'slab/urgent'
    branch but never sent a pull request, see:
    https://lkml.org/lkml/2014/5/23/648"

    * tag 'urgent-slab-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    slab_common: fix the check for duplicate slab names

    Linus Torvalds
     
  • Merge fixes from Andrew Morton:
    "10 fixes"

    * emailed patches from Andrew Morton:
    mm: hugetlb: fix copy_hugetlb_page_range()
    simple_xattr: permit 0-size extended attributes
    mm/fs: fix pessimization in hole-punching pagecache
    shmem: fix splicing from a hole while it's punched
    shmem: fix faulting into a hole, not taking i_mutex
    mm: do not call do_fault_around for non-linear fault
    sh: also try passing -m4-nofpu for SH2A builds
    zram: avoid lockdep splat by revalidate_disk
    mm/rmap.c: fix pgoff calculation to handle hugepage correctly
    coredump: fix the setting of PF_DUMPCORE

    Linus Torvalds
     
  • Commit 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
    migration/hwpoisoned entry") changed the order of
    huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
    in some workloads like hugepage-backed heap allocation via libhugetlbfs.
    This patch fixes it.
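
    The ordering problem can be illustrated with a toy model. This is only a
    sketch: toy_pte, toy_get() and toy_wrprotect() are hypothetical stand-ins
    for the huge page table entry, huge_ptep_get() and
    huge_ptep_set_wrprotect(), and the real copy path does much more.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a huge page table entry: only the write bit matters. */
typedef struct { bool writable; } toy_pte;

/* Analogue of huge_ptep_get(): snapshot the current entry. */
static toy_pte toy_get(const toy_pte *src) { return *src; }

/* Analogue of huge_ptep_set_wrprotect(): clear the write bit in place. */
static void toy_wrprotect(toy_pte *src) { src->writable = false; }

/* Problematic order: the snapshot is taken before the source entry is
 * write-protected, so the copy handed to the child is still writable. */
static toy_pte copy_entry_snapshot_first(toy_pte *src)
{
    toy_pte entry = toy_get(src);
    toy_wrprotect(src);
    return entry;
}

/* Intended order: write-protect first, then read the entry to copy. */
static toy_pte copy_entry_wrprotect_first(toy_pte *src)
{
    toy_wrprotect(src);
    return toy_get(src);
}
```

    With the snapshot taken first, parent and child disagree about whether
    the page is writable, which is the kind of mismatch a fork-heavy
    workload on a hugepage-backed heap would expose.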

    The test program for the problem is shown below:

    $ cat heap.c
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define HPS 0x200000    /* 2 MB */

    int main() {
        int i;
        char *p = malloc(HPS);
        memset(p, '1', HPS);
        for (i = 0; i < 5; i++) {
            if (!fork()) {
                /* child: write to the inherited buffer, then allocate again */
                memset(p, '2', HPS);
                p = malloc(HPS);
                memset(p, '3', HPS);
                free(p);
                return 0;
            }
        }
        sleep(1);
        free(p);
        return 0;
    }

    $ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap

    Fixes 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
    migration/hwpoisoned entry"), so is applicable to -stable kernels which
    include it.

    Signed-off-by: Naoya Horiguchi
    Reported-by: Guillaume Morin
    Suggested-by: Guillaume Morin
    Acked-by: Hugh Dickins
    Cc: stable@vger.kernel.org [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • If a filesystem uses simple_xattr to support user extended attributes,
    LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
    memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
    values of zero size. Fix that off-by-one (but apparently no filesystem
    needs them yet).
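
    The off-by-one can be sketched in userspace as below. HDR_SIZE is a
    hypothetical stand-in for the size of the xattr header, and the "<="
    versus "<" comparison is an assumption about the shape of the check, not
    a copy of simple_xattr_alloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header size standing in for sizeof(struct simple_xattr). */
#define HDR_SIZE ((size_t)32)

/* Off-by-one variant: "<=" also rejects size == 0, where len == HDR_SIZE. */
static bool alloc_len_ok_buggy(size_t size)
{
    size_t len = HDR_SIZE + size;   /* wraps around for huge sizes */
    return !(len <= HDR_SIZE);
}

/* Corrected variant: only a genuine wrap-around makes len < HDR_SIZE. */
static bool alloc_len_ok_fixed(size_t size)
{
    size_t len = HDR_SIZE + size;
    return !(len < HDR_SIZE);
}
```

    Both variants still reject sizes large enough to wrap the addition; only
    the treatment of a zero-size value differs.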

    Signed-off-by: Hugh Dickins
    Cc: Al Viro
    Cc: Jeff Layton
    Cc: Aristeu Rozanski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • I wanted to revert my v3.1 commit d0823576bf4b ("mm: pincer in
    truncate_inode_pages_range"), to keep truncate_inode_pages_range() in
    synch with shmem_undo_range(); but have stepped back - a change to
    hole-punching in truncate_inode_pages_range() is a change to
    hole-punching in every filesystem (except tmpfs) that supports it.

    If there's a logical proof why no filesystem can depend for its own
    correctness on the pincer guarantee in truncate_inode_pages_range() - an
    instant when the entire hole is removed from pagecache - then let's
    revisit later. But the evidence is that only tmpfs suffered from the
    livelock, and we have no intention of extending hole-punch to ramfs. So
    for now just add a few comments (to match or differ from those in
    shmem_undo_range()), and fix one silliness noticed in d0823576bf4b...

    Its "index == start" addition to the hole-punch termination test was
    incomplete: it opened a way for the end condition to be missed, and the
    loop go on looking through the radix_tree, all the way to end of file.
    Fix that pessimization by resetting index when detected in inner loop.

    Note that it's actually hard to hit this case, without the obsessive
    concurrent faulting that trinity does: normally all pages are removed in
    the initial trylock_page() pass, and this loop finds nothing to do. I
    had to "#if 0" out the initial pass to reproduce bug and test fix.

    Signed-off-by: Hugh Dickins
    Cc: Sasha Levin
    Cc: Konstantin Khlebnikov
    Cc: Lukas Czerner
    Cc: Dave Jones
    Acked-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • shmem_fault() is the actual culprit in trinity's hole-punch starvation,
    and the most significant cause of such problems: since a page faulted is
    one that then appears page_mapped(), needing unmap_mapping_range() and
    i_mmap_mutex to be unmapped again.

    But it is not the only way in which a page can be brought into a hole in
    the radix_tree while that hole is being punched; and Vlastimil's testing
    implies that if enough other processors are busy filling in the hole,
    then shmem_undo_range() can be kept from completing indefinitely.

    shmem_file_splice_read() is the main other user of SGP_CACHE, which can
    instantiate shmem pagecache pages in the read-only case (without holding
    i_mutex, so perhaps concurrently with a hole-punch). Probably it's
    silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
    ought to be safe, but might bring surprises - not a change to be rushed.

    shmem_read_mapping_page_gfp() is an internal interface used by
    drivers/gpu/drm GEM (and next by uprobes): it should be okay. And
    shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
    called internally by the kernel (perhaps for a stacking filesystem,
    which might rely on holes to be reserved): it's unclear whether it could
    be provoked to keep hole-punch busy or not.

    We could apply the same umbrella as now used in shmem_fault() to
    shmem_file_splice_read() and the others; but it looks ugly, and use over
    a range raises questions - should it actually be per page? can these get
    starved themselves?

    The origin of this part of the problem is my v3.1 commit d0823576bf4b
    ("mm: pincer in truncate_inode_pages_range"), once it was duplicated
    into shmem.c. It seemed like a nice idea at the time, to ensure
    (barring RCU lookup fuzziness) that there's an instant when the entire
    hole is empty; but the indefinitely repeated scans to ensure that make
    it vulnerable.

    Revert that "enhancement" to hole-punch from shmem_undo_range(), but
    retain the unproblematic rescanning when it's truncating; add a couple
    of comments there.

    Remove the "indices[0] >= end" test: that is now handled satisfactorily
    by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
    to be worth avoiding here.

    But if we do not always loop indefinitely, we do need to handle the case
    of swap swizzled back to page before shmem_free_swap() gets it: add a
    retry for that case, as suggested by Konstantin Khlebnikov; and for the
    case of page swizzled back to swap, as suggested by Johannes Weiner.

    Signed-off-by: Hugh Dickins
    Reported-by: Sasha Levin
    Suggested-by: Vlastimil Babka
    Cc: Konstantin Khlebnikov
    Cc: Johannes Weiner
    Cc: Lukas Czerner
    Cc: Dave Jones
    Cc: stable@vger.kernel.org [3.1+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Commit f00cdc6df7d7 ("shmem: fix faulting into a hole while it's
    punched") was buggy: Sasha sent a lockdep report to remind us that
    grabbing i_mutex in the fault path is a no-no (write syscall may already
    hold i_mutex while faulting user buffer).

    We tried a completely different approach (see following patch) but that
    proved inadequate: good enough for a rational workload, but not good
    enough against trinity - which forks off so many mappings of the object
    that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
    into serious starvation when concurrent faults force the puncher to fall
    back to single-page unmap_mapping_range() searches of the i_mmap tree.

    So return to the original umbrella approach, but keep away from i_mutex
    this time. We really don't want to bloat every shmem inode with a new
    mutex or completion, just to protect this unlikely case from trinity.
    So extend the original with wait_queue_head on stack at the hole-punch
    end, and wait_queue item on the stack at the fault end.

    This involves further use of i_lock to guard against the races: lockdep
    has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
    i_lock around wake_up_bit(), which is comparable to what we do here.
    i_lock is more convenient, but we could switch to shmem's info->lock.

    This issue has been tagged with CVE-2014-4171, which will require commit
    f00cdc6df7d7 and this and the following patch to be backported: we
    suggest to 3.1+, though in fact the trinity forkbomb effect might go
    back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
    not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
    Anyone running trinity on 3.0 and earlier? I don't think we need care.

    Signed-off-by: Hugh Dickins
    Reported-by: Sasha Levin
    Tested-by: Sasha Levin
    Cc: Vlastimil Babka
    Cc: Konstantin Khlebnikov
    Cc: Johannes Weiner
    Cc: Lukas Czerner
    Cc: Dave Jones
    Cc: stable@vger.kernel.org [3.1+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Ingo Korb reported that "repeated mapping of the same file on tmpfs
    using remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when
    the process exits".

    He bisected the bug to d7c1755179b8 ("mm: implement ->map_pages for
    shmem/tmpfs"), although the bug was actually added by commit
    8c6e50b0290c ("mm: introduce vm_ops->map_pages()").

    The problem is caused by calling do_fault_around for a _non-linear_
    fault. In this case pgoff is shifted and might become negative during
    calculation.

    Faulting around non-linear page-fault makes no sense and breaks the
    logic in do_fault_around because pgoff is shifted.
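
    A rough userspace sketch of why the offset can go negative. It assumes
    4 KiB pages and a fault-around window that begins at or before the
    faulting address; window_start_underflows() is a hypothetical helper and
    not the kernel's do_fault_around().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */

/* Returns true if stepping back from the faulting page to the start of the
 * fault-around window would push the file offset below zero. */
static bool window_start_underflows(unsigned long pgoff,
                                    unsigned long fault_addr,
                                    unsigned long window_start_addr)
{
    unsigned long off = (fault_addr - window_start_addr) >> TOY_PAGE_SHIFT;

    return off > pgoff;   /* pgoff - off would wrap around */
}
```

    In a linear mapping the file offset grows in step with the address, so
    the faulting page's offset is at least as large as its distance from the
    window start. A non-linear mapping can place a low file offset at a high
    address, and the subtraction then wraps.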

    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Ingo Korb
    Tested-by: Ingo Korb
    Cc: Hugh Dickins
    Cc: Sasha Levin
    Cc: Dave Jones
    Cc: Ning Qu
    Cc: "Kirill A. Shutemov"
    Cc: stable@vger.kernel.org [3.15.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • When compiling a SH2A kernel (e.g. se7206_defconfig or rsk7203_defconfig)
    using sh4-linux-gcc, linking fails with:

    net/built-in.o: In function `__sk_run_filter':
    net/core/filter.c:566: undefined reference to `__fpscr_values'
    net/core/filter.c:269: undefined reference to `__fpscr_values'
    ...
    net/built-in.o:net/core/filter.c:580: more undefined references to `__fpscr_values' follow

    This happens because sh4-linux-gcc doesn't support "-m2a-nofpu", which
    is thus filtered out by "$(call cc-option, ...)".

    As compiling using sh4-linux-gcc is useful for compile coverage, also
    try passing "-m4-nofpu" (which is presumably filtered out when using a
    real sh2a-linux toolchain) to disable the generation of FPU instructions
    and references to __fpscr_values[].

    Signed-off-by: Geert Uytterhoeven
    Cc: Guenter Roeck
    Cc: Tony Breeds
    Cc: Alexei Starovoitov
    Cc: Fengguang Wu
    Cc: Daniel Borkmann
    Cc: Magnus Damm
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • Sasha reported lockdep warning [1] introduced by [2].

    It can be fixed by revalidating the disk outside of init_lock. That is
    safe because the disk capacity change itself is protected by init_lock,
    so revalidate_disk always sees an up-to-date value and there is no race.

    [1] https://lkml.org/lkml/2014/7/3/735
    [2] zram: revalidate disk after capacity change

    Fixes 2e32baea46ce ("zram: revalidate disk after capacity change").

    Signed-off-by: Minchan Kim
    Reported-by: Sasha Levin
    Cc: "Alexander E. Patrakov"
    Cc: Nitin Gupta
    Cc: Jerome Marchand
    Cc: Sergey Senozhatsky
    CC: stable@vger.kernel.org
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
    I triggered VM_BUG_ON() in vma_address() when I tried to migrate an
    anonymous hugepage with mbind() in the kernel v3.16-rc3. This is
    because the pgoff calculation in rmap_walk_anon() fails to take
    compound_order() into account and so ends up with an incorrect value.

    This patch introduces page_to_pgoff(), which gets the page's offset in
    units of PAGE_CACHE_SIZE.

    Kirill pointed out that page cache tree should natively handle
    hugepages, and in order to make hugetlbfs fit it, page->index of
    hugetlbfs page should be in PAGE_CACHE_SIZE. This is beyond this patch,
    but page_to_pgoff() contains the point to be fixed in a single function.
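
    The idea can be sketched in userspace as follows. struct toy_page and
    toy_page_to_pgoff() are hypothetical simplifications, and the sketch
    assumes PAGE_CACHE_SIZE equals the base page size so that small pages
    need no scaling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified page descriptor (not the kernel's struct page). */
struct toy_page {
    unsigned long index;    /* offset in units of the page's own size */
    unsigned int order;     /* log2 of base pages per page, 0 if small */
    bool is_huge_head;      /* true for the head page of a hugepage */
};

/* Offset of the page in base-page units, in the spirit of page_to_pgoff(). */
static unsigned long toy_page_to_pgoff(const struct toy_page *page)
{
    if (page->is_huge_head)
        return page->index << page->order;   /* scale by the hugepage size */
    return page->index;
}
```

    For a 2 MB hugepage on 4 KiB base pages the order is 9, so index 3 maps
    to base-page offset 1536 rather than 3.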

    Signed-off-by: Naoya Horiguchi
    Acked-by: Kirill A. Shutemov
    Cc: Joonsoo Kim
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Cc: Hillf Danton
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
    Commit 079148b919d0 ("coredump: factor out the setting of PF_DUMPCORE")
    cleaned up the setting of PF_DUMPCORE by removing it from all the
    linux_binfmt->core_dump() and moving it to zap_threads(). But this ended
    up clearing all the previously set flags. This causes issues during
    core generation when tsk->flags is checked again (e.g. for PF_USED_MATH
    to dump floating point registers). Fix this.
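
    The difference between overwriting and OR-ing a flag word can be sketched
    as below. The TOY_PF_* values are illustrative bit positions and the two
    helpers are hypothetical; they only model the flag arithmetic.

```c
#include <assert.h>

/* Illustrative flag bits; the real definitions live in the kernel headers. */
#define TOY_PF_DUMPCORE  0x00000200u
#define TOY_PF_USED_MATH 0x00002000u

/* Overwriting pattern: plain assignment discards every flag already set. */
static unsigned int set_dumpcore_overwrite(unsigned int flags)
{
    (void)flags;
    return TOY_PF_DUMPCORE;
}

/* Preserving pattern: OR the new bit into the existing flags. */
static unsigned int set_dumpcore_or(unsigned int flags)
{
    return flags | TOY_PF_DUMPCORE;
}
```

    With the overwriting pattern a later check for the floating point flag
    fails even though the task did use the FPU.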

    Signed-off-by: Silesh C V
    Acked-by: Oleg Nesterov
    Cc: Mandeep Singh Baines
    Cc: stable@vger.kernel.org [3.10+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Silesh C V
     

23 Jul, 2014

1 commit


22 Jul, 2014

12 commits

  • Pull networking fixes from David Miller:

    1) Null termination fix in dns_resolver got the pointer dereferencing
    wrong, fix from Ben Hutchings.

    2) ip_options_compile() has a benign but real buffer overflow when
    parsing options. From Eric Dumazet.

    3) Table updates can crash in netfilter's nftables if none of the state
    flags indicate an actual change, from Pablo Neira Ayuso.

    4) Fix race in nf_tables dumping, also from Pablo.

    5) GRE-GRO support broke the forwarding path because the segmentation
    state was not fully initialized in these paths, from Jerry Chu.

    6) sunvnet driver leaks objects and potentially crashes on module
    unload, from Sowmini Varadhan.

    7) We can accidentally generate the same handle for several u32
    classifier filters, fix from Cong Wang.

    8) Several edge case bug fixes in fragment handling in xen-netback,
    from Zoltan Kiss.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
    ipv4: fix buffer overflow in ip_options_compile()
    batman-adv: fix TT VLAN inconsistency on VLAN re-add
    batman-adv: drop QinQ claim frames in bridge loop avoidance
    dns_resolver: Null-terminate the right string
    xen-netback: Fix pointer incrementation to avoid incorrect logging
    xen-netback: Fix releasing header slot on error path
    xen-netback: Fix releasing frag_list skbs in error path
    xen-netback: Fix handling frag_list on grant op error path
    net_sched: avoid generating same handle for u32 filters
    net: huawei_cdc_ncm: add "subclass 3" devices
    net: qmi_wwan: add two Sierra Wireless/Netgear devices
    wan/x25_asy: integer overflow in x25_asy_change_mtu()
    net: ppp: fix creating PPP pass and active filters
    net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
    sunvnet: clean up objects created in vnet_new() on vnet_exit()
    r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
    net-gre-gro: Fix a bug that breaks the forwarding path
    netfilter: nf_tables: 64bit stats need some extra synchronization
    netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
    netfilter: nf_tables: safe RCU iteration on list when dumping
    ...

    Linus Torvalds
     
  • Pull sparc fix from David Miller:
    "Need to hook up the new renameat2 system call"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    sparc: Hook up renameat2 syscall.

    Linus Torvalds
     
  • Pull IDE fixes from David Miller:
    - fix interrupt registry for some Atari IDE chipsets.
    - adjust Kconfig dependencies for x86_32 specific chips.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
    ide: Fix SC1200 dependencies
    ide: Fix CS5520 and CS5530 dependencies
    m68k/atari - ide: do not register interrupt if host->get_lock is set

    Linus Torvalds
     
  • …it/rostedt/linux-trace

    Pull trace fix from Steven Rostedt:
    "Tony Luck found that using the "uptime" trace clock that uses jiffies
    as a counter was converted to nanoseconds (silly), and after 1 hour 11
    minutes and 34 seconds, this monotonic clock would wrap, causing havoc
    with the tracing system and making the clock useless.

    He converted that clock to use jiffies_64 and made it into a counter
    instead of nanosecond conversions, and displayed the clock with the
    straight jiffy count, which works much better than it did in the past"

    * tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Fix wraparound problems in "uptime" trace clock

    Linus Torvalds
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Antonio Quartulli says:

    ====================
    pull request [net]: batman-adv 20140721

    here you have two fixes that we have been testing for quite some time
    (this is why they arrived a bit late in the rc cycle).

    Patch 1) ensures that BLA packets get dropped and not forwarded to the
    mesh even if they reach batman-adv within QinQ frames. Forwarding them
    into the mesh means messing up the TT database of other nodes, which
    can generate all kinds of unexpected behaviours during route computation.

    Patch 2) avoids a couple of race conditions triggered upon fast VLAN
    deletion-addition. Such race conditions are pretty dangerous because
    they not only create inconsistencies in the TT database of the nodes
    in the network, but such a scenario is also unrecoverable (unless
    nodes are rebooted).
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
    There is a benign buffer overflow in ip_options_compile spotted by
    AddressSanitizer [1]:

    It's benign because we can always access one extra byte in skb->head
    (because the header is followed by struct skb_shared_info), and in this
    case this byte is not even used.

    [28504.910798] ==================================================================
    [28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
    [28504.913170] Read of size 1 by thread T15843:
    [28504.914026] [] ip_options_compile+0x121/0x9c0
    [28504.915394] [] ip_options_get_from_user+0xad/0x120
    [28504.916843] [] do_ip_setsockopt.isra.15+0x8df/0x1630
    [28504.918175] [] ip_setsockopt+0x30/0xa0
    [28504.919490] [] tcp_setsockopt+0x5b/0x90
    [28504.920835] [] sock_common_setsockopt+0x5f/0x70
    [28504.922208] [] SyS_setsockopt+0xa2/0x140
    [28504.923459] [] system_call_fastpath+0x16/0x1b
    [28504.924722]
    [28504.925106] Allocated by thread T15843:
    [28504.925815] [] ip_options_get_from_user+0x35/0x120
    [28504.926884] [] do_ip_setsockopt.isra.15+0x8df/0x1630
    [28504.927975] [] ip_setsockopt+0x30/0xa0
    [28504.929175] [] tcp_setsockopt+0x5b/0x90
    [28504.930400] [] sock_common_setsockopt+0x5f/0x70
    [28504.931677] [] SyS_setsockopt+0xa2/0x140
    [28504.932851] [] system_call_fastpath+0x16/0x1b
    [28504.934018]
    [28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
    [28504.934377] of 40-byte region [ffff880026382800, ffff880026382828)
    [28504.937144]
    [28504.937474] Memory state around the buggy address:
    [28504.938430] ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.939884] ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.941294] ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.942504] ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.943483] ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.945573] ^
    [28504.946277] ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.094949] ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.096114] ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.097116] ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.098472] ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.099804] Legend:
    [28505.100269] f - 8 freed bytes
    [28505.100884] r - 8 redzone bytes
    [28505.101649] . - 8 allocated bytes
    [28505.102406] x=1..7 - x allocated bytes + (8-x) redzone bytes
    [28505.103637] ==================================================================

    [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Pull media fixes from Mauro Carvalho Chehab:
    "A series of driver fixes:
    - fix DVB-S tuning with tda10071
    - fix tuner probe on af9035 when the device has a bad eeprom
    - some fixes for the new si2168/2157 drivers
    - one Kconfig build fix (for omap4iss)
    - fixes at vpif error path
    - don't lock saa7134 ioctl at driver's base core level, as it now
    uses V4L2 and VB2 locking schema
    - fix audio at hdpvr driver
    - fix the aspect ratio at the digital timings table
    - one new USB ID (at gspca_pac7302): Genius i-Look 317 webcam"

    * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    [media] gspca_pac7302: Add new usb-id for Genius i-Look 317
    [media] tda10071: fix returned symbol rate calculation
    [media] tda10071: fix spec inversion reporting
    [media] tda10071: add missing DVB-S2/PSK-8 FEC AUTO
    [media] tda10071: force modulation to QPSK on DVB-S
    [media] hdpvr: fix two audio bugs
    [media] davinci: vpif: missing unlocks on error
    [media] af9035: override tuner id when bad value set into eeprom
    [media] saa7134: use unlocked_ioctl instead of ioctl
    [media] media: v4l2-core: v4l2-dv-timings.c: Cleaning up code wrong value used in aspect ratio
    [media] si2168: firmware download fix
    [media] si2157: add one missing parenthesis
    [media] si2168: add one missing parenthesis
    [media] staging: tighten omap4iss dependencies

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:
    "Final block fixes for 3.16

    Four small fixes that should go into 3.16, have been queued up for a
    bit and delayed due to vacation and other euro duties. But here they
    are. The pull request contains:

    - Fix for a reported crash with shared tagging on SCSI from Christoph

    - A regression fix for drbd. From Lars Ellenberg.

    - Hooking up the compat ioctl for BLKZEROOUT, which requires no
    translation. From Mikulas.

    - A fix for a regression where we would crash on queue exit if the
    root_blkg is gone/not there. From Tejun"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    block: provide compat ioctl for BLKZEROOUT
    blkcg: don't call into policy draining if root_blkg is already gone
    drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
    block: don't assume last put of shared tags is for the host

    Linus Torvalds
     
  • Pull libata fixes from Tejun Heo:
    "Late libata fixes.

    The most important one is from Kevin Hao which makes sure that libata
    only allocates tags inside the max tag number the controller supports.
    libata always had this problem but the recent tag allocation change
    and addition of support for sata_fsl which only supports queue depth
    of 16 exposed the issue.

    Hans de Goede agreed to become the maintainer of libahci_platform
    which is under higher than usual development pressure from all the new
    controllers popping up from the ARM world"

    * 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
    ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
    drivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()
    libata: EH should handle AMNF error condition as a media error
    libata: support the ata host which implements a queue depth less than 32
    MAINTAINERS: Add Hans de Goede as ahci-platform maintainer

    Linus Torvalds
     
  • Pull kvm fixes from Paolo Bonzini:
    "These are mostly PPC changes for 3.16-new things. However, there is
    an x86 change too and it is a regression from 3.14. As it only
    affects nested virtualization and there were other changes in this
    area in 3.16, I am not nominating it for 3.15-stable"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: x86: Check for nested events if there is an injectable interrupt
    KVM: PPC: RTAS: Do byte swaps explicitly
    KVM: PPC: Book3S PR: Fix ABIv2 on LE
    KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
    PPC: Add _GLOBAL_TOC for 32bit
    KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value
    KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute

    Linus Torvalds
     
  • Pull s390 fixes from Martin Schwidefsky:
    "A couple of last minute bug fixes for 3.16, including a fix for ptrace
    to close a hole which allowed a user space program to write to the
    kernel address space"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390: fix restore of invalid floating-point-control
    s390/zcrypt: improve device probing for zcrypt adapter cards
    s390/ptrace: fix PSW mask check
    s390/MSI: Use standard mask and unmask funtions
    s390/3270: correct size detection with the read-partition command
    s390: require mvcos facility, not tod clock steering facility

    Linus Torvalds
     

21 Jul, 2014

15 commits

  • The "uptime" trace clock added in:

    commit 8aacf017b065a805d27467843490c976835eb4a5
    tracing: Add "uptime" trace clock that uses jiffies

    has wraparound problems when the system has been up more
    than 1 hour 11 minutes and 34 seconds. It converts jiffies
    to nanoseconds using:
    (u64)jiffies_to_usecs(jiffy) * 1000ULL
    but since jiffies_to_usecs() only returns a 32-bit value, it
    truncates at 2^32 microseconds. An additional problem on 32-bit
    systems is that the argument is "unsigned long", so fixing the
    return value only helps until 2^32 jiffies (49.7 days on a HZ=1000
    system).

    Avoid these problems by using jiffies_64 as our basis, and
    not converting to nanoseconds (we do convert to clock_t because
    user facing API must not be dependent on internal kernel
    HZ values).
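
    The quoted limits follow from simple arithmetic, sketched below in
    userspace. The helper names are hypothetical, and old_uptime_ns() only
    mimics the 32-bit truncation described above rather than calling the
    kernel's jiffies_to_usecs().

```c
#include <assert.h>
#include <stdint.h>

/* Whole seconds that fit into a 32-bit microsecond count (2^32 us). */
static uint64_t wrap_seconds(void)
{
    return (UINT64_C(1) << 32) / 1000000;
}

/* Tenths of a day until a 32-bit jiffies counter wraps at the given HZ. */
static uint64_t jiffies_wrap_tenth_days(uint64_t hz)
{
    return ((UINT64_C(1) << 32) / hz) / 8640;
}

/* Mimics the old conversion: the microsecond count is truncated to 32 bits
 * before being widened and scaled to nanoseconds. */
static uint64_t old_uptime_ns(uint64_t usecs_since_boot)
{
    uint32_t truncated = (uint32_t)usecs_since_boot;

    return (uint64_t)truncated * 1000ULL;
}
```

    2^32 microseconds is 4294 whole seconds, that is 1 hour 11 minutes and
    34 seconds, and 2^32 jiffies at HZ=1000 is about 49.7 days.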

    Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com

    Cc: stable@vger.kernel.org # 3.10+
    Fixes: 8aacf017b065 "tracing: Add "uptime" trace clock that uses jiffies"
    Signed-off-by: Tony Luck
    Signed-off-by: Steven Rostedt

    Tony Luck
     
    When a VLAN interface (on top of batX) is removed and
    re-added within a short timeframe TT does not have enough
    time to properly clean up. This creates an internal TT state
    mismatch as the newly created softif_vlan will be
    initialized from scratch with a TT client count of zero
    (even if TT entries for this VLAN still exist). The
    resulting TT messages are bogus due to the counter / tt
    client listing mismatch, thus creating inconsistencies on
    every node in the network.

    To fix this issue destroy_vlan() has to not free the VLAN
    object immediately but it has to be kept alive until all the
    TT entries for this VLAN have been removed. destroy_vlan()
    still removes the sysfs folder so that the user has the
    feeling that everything went fine.

    If the same VLAN is re-added before the old object is free'd,
    then the latter is resurrected and re-used.

    Implement such behaviour by increasing the reference counter
    of a softif_vlan object every time a new local TT entry for
    such VLAN is created and remove the object from the list
    only when all the TT entries have been destroyed.
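
    The reference counting scheme can be sketched as below. struct toy_vlan
    and its helpers are hypothetical simplifications, not batman-adv's
    softif_vlan code, and they ignore locking and RCU entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified VLAN object. */
struct toy_vlan {
    int refcount;   /* one reference for the interface, one per TT entry */
    bool in_list;   /* still reachable from the VLAN list */
};

static void toy_vlan_init(struct toy_vlan *v)
{
    v->refcount = 1;
    v->in_list = true;
}

/* Called when a local TT entry for this VLAN is created. */
static void toy_vlan_get(struct toy_vlan *v)
{
    v->refcount++;
}

/* Called on interface removal and whenever a TT entry is destroyed; the
 * object leaves the list only when the last reference is dropped. */
static void toy_vlan_put(struct toy_vlan *v)
{
    if (--v->refcount == 0)
        v->in_list = false;
}
```

    Removing the interface drops only its own reference, so the object stays
    on the list, and can be found again on a quick re-add, until the
    remaining TT entries are gone.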

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • Since bridge loop avoidance only supports untagged or simple 802.1q
    tagged VLAN claim frames, claim frames with stacked VLAN headers (QinQ)
    should be detected and dropped. Transporting them over the mesh may cause
    problems on the receivers, or create bogus entries in the local tt
    tables.

    Reported-by: Antonio Quartulli
    Signed-off-by: Simon Wunderlich
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Simon Wunderlich
     
  • *_result[len] is parsed as *(_result[len]) which is not at all what we
    want to touch here.
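
    A small userspace illustration of the precedence rule. terminate_result()
    is a hypothetical helper, not the dns_resolver code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the terminator through a pointer-to-pointer, as a caller-supplied
 * result buffer would be. The parentheses are required: postfix [] binds
 * tighter than unary *, so *result[len] would mean *(result[len]). */
static void terminate_result(char **result, size_t len)
{
    (*result)[len] = '\0';
}
```

    Without the parentheses the expression indexes the pointer-to-pointer
    itself, reading a char pointer from past the caller's variable and
    writing through whatever happens to be there.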

    Signed-off-by: Ben Hutchings
    Fixes: 84a7c0b1db1c ("dns_resolver: assure that dns_query() result is null-terminated")
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • Linus Torvalds
     
  • Zoltan Kiss says:

    ====================
    xen-netback: Fixing up xenvif_tx_check_gop

    This series fixes a lot of bugs on the error path around this function, which
    were introduced with my grant mapping series in 3.15. They apply to the latest
    net tree, but probably to net-next as well without any modification.
    I'll post another series which applies to 3.15 stable, as the problem was
    first discovered there. The only difference is that the "queue" variable name is
    replaced with "vif".
    ====================

    Signed-off-by: Zoltan Kiss
    Reported-by: Armin Zentai
    Signed-off-by: David S. Miller

    David S. Miller
     
    Because this pointer is incremented prematurely, the error log contains rubbish.

    Signed-off-by: Zoltan Kiss
    Reported-by: Armin Zentai
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: xen-devel@lists.xenproject.org
    Signed-off-by: David S. Miller

    Zoltan Kiss
     
  • This patch makes this function aware that the first frag and the header might
    share the same ring slot. That could happen if the first slot is bigger than
    PKT_PROT_LEN. Due to this the error path might release that slot twice or never,
    depending on the error scenario.
    xenvif_idx_release is also removed from xenvif_idx_unmap, and called separately.

    Signed-off-by: Zoltan Kiss
    Reported-by: Armin Zentai
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: xen-devel@lists.xenproject.org
    Signed-off-by: David S. Miller

    Zoltan Kiss
     
    When the grant operations fail, the skb is eventually freed, and the free
    path tries to release the frags, if there are any. For the main skb nr_frags
    is set to 0 to avoid this, but on the frag_list it iterates through the frags
    array and tries to call put_page on page pointers which contain garbage at
    that time.

    Signed-off-by: Zoltan Kiss
    Reported-by: Armin Zentai
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: xen-devel@lists.xenproject.org
    Signed-off-by: David S. Miller

    Zoltan Kiss
     
    The error handling for skbs with frag_list was completely wrong: it caused
    double unmap attempts to happen if the error was on the first skb. Move it to
    the right place in the loop.

    Signed-off-by: Zoltan Kiss
    Reported-by: Armin Zentai
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: xen-devel@lists.xenproject.org
    Signed-off-by: David S. Miller

    Zoltan Kiss
     
    When the kernel generates a handle for a u32 filter, it tries to start
    from the max in the bucket. So when we have a filter with the max (fff)
    handle, the kernel always generates the same handle for new filters.
    This can be shown by the following commands:

    tc qdisc add dev eth0 ingress
    tc filter add dev eth0 parent ffff: protocol ip pref 770 handle 800::fff u32 match ip protocol 1 0xff
    tc filter add dev eth0 parent ffff: protocol ip pref 770 u32 match ip protocol 1 0xff
    ...

    We get several u32 filters with the same handle:

    # tc filter show dev eth0 parent ffff:
    filter protocol ip pref 770 u32
    filter protocol ip pref 770 u32 fh 800: ht divisor 1
    filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
    match 00010000/00ff0000 at 8
    filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
    match 00010000/00ff0000 at 8
    filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
    match 00010000/00ff0000 at 8
    filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
    match 00010000/00ff0000 at 8

    Handles should be unique. This patch fixes it by looking up a bitmap,
    which guarantees that the handle is as unique as possible. For
    compatibility, we still start from 0x800.
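
    A bitmap-style lookup could look roughly like the sketch below. The
    names, the 12-bit id space and the fallback to the low ids are
    assumptions made for illustration; this is not the cls_u32 code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NODES 0x1000   /* assumed 12-bit id space, 0x000..0xfff */
#define TOY_FIRST_ID  0x800    /* starting point kept for compatibility */

/* Returns an unused id, or 0 if none is free.
 * used[i] is true when id i is already taken. */
static unsigned int pick_free_id(const bool used[TOY_MAX_NODES])
{
    unsigned int i;

    for (i = TOY_FIRST_ID; i < TOY_MAX_NODES; i++)   /* prefer 0x800 and up */
        if (!used[i])
            return i;
    for (i = 1; i < TOY_FIRST_ID; i++)               /* then try the low ids */
        if (!used[i])
            return i;
    return 0;
}
```

    Because the search consults every id already in use, an existing filter
    at fff no longer forces each new filter onto the same handle.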

    Cc: "David S. Miller"
    Signed-off-by: Cong Wang
    Signed-off-by: Cong Wang
    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Pull more IIO driver fixes from Greg KH:
    "Here are two IIO driver fixes for 3.16-rc6 that resolve some reported
    issues"

    * tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    iio: mma8452: Use correct acceleration units.
    iio:core: Handle error when mask type is not separate

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Here are two USB patches that resolve some reported issues, one with
    an odd HUB, and one in the chipidea driver"

    * tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
    usb: Check if port status is equal to RxDetect
    usb: chipidea: udc: Disable auto ZLP generation on ep0

    Linus Torvalds
     
  • Pull driver core fix from Greg KH:
    "Here is a single driver core fix that reverts an older patch that has
    been causing a number of reported problems with the platform devices.

    This revert has been in linux-next for a while with no reported issues"

    * tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    platform_get_irq: Revert to platform_get_resource if of_irq_get fails

    Linus Torvalds
     
  • Pull char/misc fix from Greg KH:
    "Here's a single hyper-v driver fix for a reported issue"

    * tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
    Drivers: hv: hv_fcopy: fix a race condition for SMP guest

    Linus Torvalds