15 Aug, 2015

6 commits

  • sem_lock() did not properly pair memory barriers:

    !spin_is_locked() and spin_unlock_wait() are both only control barriers.
    The code needs an acquire barrier, otherwise the cpu might perform read
    operations before the lock test.

    As no suitable primitive exists and since it seems no one wants another
    primitive, the code creates a local primitive within ipc/sem.c.
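
    The barrier pairing described above can be sketched in user space with
    C11 atomics. This is only an illustration, not the kernel patch: the
    toy_lock type and both helper names are hypothetical, and the kernel
    uses its own barrier primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a spinlock word: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } toy_lock;

/*
 * Test whether the lock is free. The relaxed load alone only gives a
 * control dependency; the acquire fence keeps later reads from being
 * performed before the lock test.
 */
static bool toy_lock_is_free_acquire(toy_lock *l)
{
    if (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
        return false;
    atomic_thread_fence(memory_order_acquire);
    return true;
}

/* Spin until the lock is observed free, then order later reads after it. */
static void toy_unlock_wait_acquire(toy_lock *l)
{
    while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
        ; /* busy-wait */
    atomic_thread_fence(memory_order_acquire);
}
```

    The fence is placed after the successful test on purpose: it pairs with
    the release performed by whoever dropped the lock.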

    With regards to -stable:

    The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
    nop (just a preprocessor redefinition to improve the readability). The
    bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
    starting from 3.10).

    Signed-off-by: Manfred Spraul
    Reported-by: Oleg Nesterov
    Acked-by: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Kirill Tkhai
    Cc: Ingo Molnar
    Cc: Josh Poimboeuf
    Cc: Davidlohr Bueso
    Cc: [3.10+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Manfred Spraul
     
  • Bug:

    ------------[ cut here ]------------
    kernel BUG at mm/huge_memory.c:1957!
    invalid opcode: 0000 [#1] SMP
    Modules linked in: snd_hda_codec_hdmi i915 rpcsec_gss_krb5 snd_hda_codec_realtek snd_hda_codec_generic nfsv4 dns_re
    CPU: 2 PID: 2576 Comm: test_huge Not tainted 4.2.0-rc5-mm1+ #27
    Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
    task: ffff880204e3d600 ti: ffff8800db16c000 task.ti: ffff8800db16c000
    RIP: split_huge_page_to_list+0xdb/0x120
    Call Trace:
    memory_failure+0x32e/0x7c0
    madvise_hwpoison+0x8b/0x160
    SyS_madvise+0x40/0x240
    ? do_page_fault+0x37/0x90
    entry_SYSCALL_64_fastpath+0x12/0x71
    Code: ff f0 41 ff 4c 24 30 74 0d 31 c0 48 83 c4 08 5b 41 5c 41 5d c9 c3 4c 89 e7 e8 e2 58 fd ff 48 83 c4 08 31 c0
    RIP split_huge_page_to_list+0xdb/0x120
    RSP
    ---[ end trace aee7ce0df8e44076 ]---

    Testcase:

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <sys/mman.h>

    #define MB (1024 * 1024)

    int main(void)
    {
        char *mem;

        /* 2 MB alignment so the range can be backed by huge pages */
        if (posix_memalign((void **)&mem, 2 * MB, 200 * MB))
            return 1;

        madvise(mem, 200 * MB, MADV_HWPOISON);

        free(mem);

        return 0;
    }

    A huge zero page is allocated on a page fault without the
    FAULT_FLAG_WRITE flag. get_user_pages_fast(), which is called in
    madvise_hwpoison(), will get the huge zero page if the page has not been
    allocated before. The huge zero page is a transparent huge page; however,
    it is not an anonymous page. memory_failure() will split the huge zero
    page and trigger BUG_ON(is_huge_zero_page(page));

    After commit 98ed2b0052e6 ("mm/memory-failure: give up error handling
    for non-tail-refcounted thp"), memory_failure() does not catch non-anon
    thp from the madvise_hwpoison path, and this bug occurs.

    Fix it by catching non-anon thp in memory_failure() so that the huge zero
    page is not split in the madvise_hwpoison path.

    After this patch:

    Injecting memory failure for page 0x202800 at 0x7fd8ae800000
    MCE: 0x202800: non anonymous thp
    [...]

    [akpm@linux-foundation.org: remove second split, per Wanpeng]
    Signed-off-by: Wanpeng Li
    Acked-by: Naoya Horiguchi
    Signed-off-by: Andrew Morton

    Signed-off-by: Linus Torvalds

    Wanpeng Li
     
  • After we acquire the sma->sem_perm lock in exit_sem(), we are protected
    against a racing IPC_RMID operation. Also at that point, we are the last
    user of sem_undo_list. Therefore it isn't required that we acquire or use
    ulp->lock.

    Signed-off-by: Herton R. Krzesinski
    Acked-by: Manfred Spraul
    Cc: Davidlohr Bueso
    Cc: Rafael Aquini
    CC: Aristeu Rozanski
    Cc: David Jeffery
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Herton R. Krzesinski
     
  • The current semaphore code allows a potential use after free: in
    exit_sem() we may free the task's sem_undo_list while another task is
    still looping through the same semaphore set and cleaning up the
    sem_undo list in freeary() (the task that called IPC_RMID for the same
    semaphore set).

    For example, with a test program [1] that keeps forking a lot of
    processes (which then do a semop call with the SEM_UNDO flag) while the
    parent removes the semaphore set with IPC_RMID right afterwards, and with
    a kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
    CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
    in the kernel log:

    Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
    000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b kkkkkkkk.kkkkkkk
    010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........
    Prev obj: start=ffff88003b45c180, len=64
    000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a .....N......ZZZZ
    010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff ...........7....
    Next obj: start=ffff88003b45c200, len=64
    000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a .....N......ZZZZ
    010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff ........h).
    [...]
    8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
    RIP [] spin_dump+0x53/0xc0
    RSP
    ---[ end trace 783ebb76612867a0 ]---
    NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
    Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
    CPU: 3 PID: 18053 Comm: test Tainted: G D 4.2.0-rc5+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
    RIP: native_read_tsc+0x0/0x20
    Call Trace:
    ? delay_tsc+0x40/0x70
    __delay+0xf/0x20
    do_raw_spin_lock+0x96/0x140
    _raw_spin_lock+0xe/0x10
    sem_lock_and_putref+0x11/0x70
    SYSC_semtimedop+0x7bf/0x960
    ? handle_mm_fault+0xbf6/0x1880
    ? dequeue_task_fair+0x79/0x4a0
    ? __do_page_fault+0x19a/0x430
    ? kfree_debugcheck+0x16/0x40
    ? __do_page_fault+0x19a/0x430
    ? __audit_syscall_entry+0xa8/0x100
    ? do_audit_syscall_entry+0x66/0x70
    ? syscall_trace_enter_phase1+0x139/0x160
    SyS_semtimedop+0xe/0x10
    SyS_semop+0x10/0x20
    entry_SYSCALL_64_fastpath+0x12/0x71
    Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
    Kernel panic - not syncing: softlockup: hung tasks

    I wasn't able to trigger any badness on a recent kernel without the
    proper debug config options enabled; however, I have softlockup reports
    on some kernel versions, in the semaphore code, which are similar to the
    above (the scenario is seen on some servers running IBM DB2, which uses
    semaphore syscalls).

    The patch here fixes the race against freeary, by acquiring or waiting
    on the sem_undo_list lock as necessary (exit_sem can race with freeary,
    while freeary sets un->semid to -1 and removes the same sem_undo from
    list_proc or when it removes the last sem_undo).

    After the patch I'm unable to reproduce the problem using the test case
    [1].

    [1] Test case used below:

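    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <errno.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/wait.h>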

    #define NSEM 1
    #define NSET 5

    int sid[NSET];

    void thread()
    {
        struct sembuf op;
        int s;
        uid_t pid = getuid();

        s = rand() % NSET;
        op.sem_num = pid % NSEM;
        op.sem_op = 1;
        op.sem_flg = SEM_UNDO;

        semop(sid[s], &op, 1);
        exit(EXIT_SUCCESS);
    }

    void create_set()
    {
        int i, j;
        pid_t p;
        union {
            int val;
            struct semid_ds *buf;
            unsigned short int *array;
            struct seminfo *__buf;
        } un;

        /* Create and initialize semaphore set */
        for (i = 0; i < NSET; i++) {
            sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
            if (sid[i] < 0) {
                perror("semget");
                exit(EXIT_FAILURE);
            }
        }
        un.val = 0;
        for (i = 0; i < NSET; i++) {
            for (j = 0; j < NSEM; j++) {
                if (semctl(sid[i], j, SETVAL, un) < 0)
                    perror("semctl");
            }
        }

        /* Launch threads that operate on semaphore set */
        for (i = 0; i < NSEM * NSET * NSET; i++) {
            p = fork();
            if (p < 0)
                perror("fork");
            if (p == 0)
                thread();
        }

        /* Free semaphore set */
        for (i = 0; i < NSET; i++) {
            if (semctl(sid[i], NSEM, IPC_RMID))
                perror("IPC_RMID");
        }

        /* Wait for forked processes to exit */
        while (wait(NULL)) {
            if (errno == ECHILD)
                break;
        };
    }

    int main(int argc, char **argv)
    {
        pid_t p;

        srand(time(NULL));

        while (1) {
            p = fork();
            if (p < 0) {
                perror("fork");
                exit(EXIT_FAILURE);
            }
            if (p == 0) {
                create_set();
                goto end;
            }

            /* Wait for forked processes to exit */
            while (wait(NULL)) {
                if (errno == ECHILD)
                    break;
            };
        }
    end:
        return 0;
    }

    [akpm@linux-foundation.org: use normal comment layout]
    Signed-off-by: Herton R. Krzesinski
    Acked-by: Manfred Spraul
    Cc: Davidlohr Bueso
    Cc: Rafael Aquini
    CC: Aristeu Rozanski
    Cc: David Jeffery
    Cc:
    Signed-off-by: Andrew Morton

    Signed-off-by: Linus Torvalds

    Herton R. Krzesinski
     
  • Hugetlbfs pages will get a refcount in get_any_page() or
    madvise_hwpoison() if soft offlining through madvise. The refcount which
    is held by the soft offline path should be released if we fail to isolate
    hugetlbfs pages.

    Fix it by reducing the refcount for both isolation success and failure.
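
    The balanced get/put described above can be illustrated with a toy
    model. Nothing here is kernel code: struct toy_page and the toy_*
    helpers are hypothetical, and the real isolation path holds references
    of its own that this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: not the kernel's struct page or its helpers. */
struct toy_page { int refcount; };

static void toy_get_page(struct toy_page *p)
{
    p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}

/*
 * Take a reference, attempt isolation, and drop the reference on both
 * outcomes so the count is balanced either way.
 * Returns 0 on success, -1 if isolation failed.
 */
static int toy_soft_offline(struct toy_page *p, bool isolate_ok)
{
    int ret;

    toy_get_page(p);
    ret = isolate_ok ? 0 : -1;
    toy_put_page(p); /* released on success and on failure */
    return ret;
}
```

    In this model the caller's count is the same before and after the call,
    whichever way the isolation attempt goes.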

    Signed-off-by: Wanpeng Li
    Acked-by: Naoya Horiguchi
    Cc: [3.9+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     
  • After trying to drain pages from the pagevec/pageset, we try to get a
    reference to the page again; however, the reference count of the page is
    not dropped if the page is still not on the LRU list.

    Fix it by adding a put_page() to drop the page reference taken by
    __get_any_page().

    Signed-off-by: Wanpeng Li
    Acked-by: Naoya Horiguchi
    Cc: [3.9+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     

14 Aug, 2015

8 commits

  • Pull ARM fixes from Russell King:
    "Another few small ARM fixes, mostly addressing some VDSO issues"

    * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
    ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
    ARM: 8409/1: Mark ret_fast_syscall as a function
    ARM: 8408/1: Fix the secondary_startup function in Big Endian case
    ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable

    Linus Torvalds
     
  • Commit 3f5159a9221f ("x86/asm/entry/32: Update -ENOSYS handling to match
    the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
    The proper error return value was never loaded into %rax, except if
    things just happened to go through the audit paths, which ended up
    reloading the return value.

    This moves the loading of %rax into the normal system call path, just to
    make sure the error case triggers it. It's kind of sad, since it adds a
    useless instruction to reload the register to the fast path, but it's
    not like that single load from the stack is going to be noticeable.

    Reported-by: David Drysdale
    Tested-by: Kees Cook
    Acked-by: Andy Lutomirski
    Cc: Denys Vlasenko
    Cc: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull device mapper fixes from Mike Snitzer:

    - two stable fixes for corruption seen in a snapshot of thinp metadata;
    metadata snapshots aren't widely used but help provide a consistent
    view of the metadata associated with an active thin-pool.

    - a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"

    * tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm cache policy smq: move 'dm-cache-default' module alias to SMQ
    dm btree: add ref counting ops for the leaves of top level btrees
    dm thin metadata: delete btrees when releasing metadata snapshot

    Linus Torvalds
     
  • Pull xen block driver fixes from Jens Axboe:
    "A few small bug fixes for xen-blk{front,back} that have been sitting
    over my vacation"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
    xen-blkfront: don't add indirect pages to list when !feature_persistent
    xen-blkfront: introduce blkfront_gather_backend_features()

    Linus Torvalds
     
  • Pull xen bug fixes from David Vrabel:

    - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.

    - fix a memory leak affecting backends in HVM guests.

    - fix PV domU hang with certain configurations.

    * tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
    Revert "xen/events/fifo: Handle linked events when closing a port"
    x86/xen: build "Xen PV" APIC driver for domU as well

    Linus Torvalds
     
  • This reverts commits 9a036b93a344 ("x86/signal/64: Remove 'fs' and 'gs'
    from sigcontext") and c6f2062935c8 ("x86/signal/64: Fix SS handling for
    signals delivered to 64-bit programs").

    They were cleanups, but they break dosemu by changing the signal return
    behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
    not actually changing any behavior - causes build problems).

    Reported-and-tested-by: Stas Sergeev
    Acked-by: Andy Lutomirski
    Cc: Ingo Molnar
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Workaround hw bug when acquiring PCI bus ownership of iwlwifi
    devices, from Emmanuel Grumbach.

    2) Falling back to vmalloc in conntrack should not emit a warning, from
    Pablo Neira Ayuso.

    3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
    Felipe Dominguez Vega.

    4) Rocker doesn't free netdev on device removal, from Ido Schimmel.

    5) UDP multicast early sock demux has route handling races, from Eric
    Dumazet.

    6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.

    7) Fix use-after-free in skb_set_peeked, from Herbert Xu.

    8) Don't advertise NETIF_F_FRAGLIST in virtio_net driver, this can lead
    to fraglists longer than the driver can support. From Jason Wang.

    9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.

    10) Fix interrupt storm in bna driver, from Ivan Vecera.

    11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.

    12) Fix inet request sock leak, from Eric Dumazet.

    13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
    driver, from LEROY Christophe.

    14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
    to get fixed any time soon. From Jakub Kicinski.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
    cosa: missing error code on failure in probe()
    gianfar: remove faulty filer optimizer
    gianfar: correct list membership accounting
    gianfar: correct filer table writing
    bonding: Gratuitous ARP gets dropped when first slave added
    net: dsa: Do not override PHY interface if already configured
    net: fs_enet: mask interrupts for TX partial frames.
    net: fs_enet: explicitly remove I flag on TX partial frames
    inet: fix possible request socket leak
    inet: fix races with reqsk timers
    mkiss: Fix error handling in mkiss_open()
    bnx2x: Free NVRAM lock at end of each page
    bnx2x: Prevent null pointer dereference on SKB release
    cxgb4: missing curly braces in t4_setup_debugfs()
    net-timestamp: Update skb_complete_tx_timestamp comment
    ipv6: don't reject link-local nexthop on other interface
    netlink: make sure -EBUSY won't escape from netlink_insert
    bna: fix interrupts storm caused by erroneous packets
    net: mvpp2: replace TX coalescing interrupts with hrtimer
    net: mvpp2: enable proper per-CPU TX buffers unmapping
    ...

    Linus Torvalds
     
  • Pull EDAC fix from Borislav Petkov:
    "A ppc4xx_edac fix for accessing ->csrows properly. This driver was
    missed during the conversion a couple of years ago"

    * tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
    EDAC, ppc4xx: Access mci->csrows array elements properly

    Linus Torvalds
     

13 Aug, 2015

14 commits

  • The commit

    de3910eb79ac ("edac: change the mem allocation scheme to
    make Documentation/kobject.txt happy")

    changed the memory allocation for the csrows member. But ppc4xx_edac was
    forgotten in the patch. Fix it.

    Signed-off-by: Michael Walle
    Cc:
    Cc: linux-edac
    Cc: Mauro Carvalho Chehab
    Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
    Signed-off-by: Borislav Petkov

    Michael Walle
     
  • If register_hdlc_device() fails, the current code returns 0 but we
    should return an error code instead.
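
    The general pattern behind this kind of fix can be sketched as follows.
    It is a minimal illustration, not the cosa driver: toy_register_device()
    and toy_probe() are hypothetical names, and -EIO is an arbitrary choice
    of error code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a registration call that can fail. */
static int toy_register_device(int should_fail)
{
    return should_fail ? -EIO : 0;
}

/*
 * Probe-style function: keep each step's result in err so the shared
 * error path returns it, rather than falling through with err == 0.
 */
static int toy_probe(int should_fail)
{
    int err;

    err = toy_register_device(should_fail);
    if (err < 0)
        goto out_cleanup;

    return 0;

out_cleanup:
    /* undo any earlier setup here */
    return err;
}
```

    Calling toy_probe(1) reports the negative errno to the caller, while
    toy_probe(0) returns 0.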

    Signed-off-by: Dan Carpenter
    Signed-off-by: David S. Miller

    Dan Carpenter
     
  • Jakub Kicinski says:

    ====================
    gianfar: filer changes

    respinning with examples as requested.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Current filer rule optimization is broken in several ways:
    (1) Can perform reads/writes beyond end of allocated tables.
    (gianfar_ethtool.c:1326).

    (2) It breaks badly for rules with more than 2 specifiers
    (e.g. matching ip, port, tos).

    Example:
    # ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
    Added rule with ID 254
    # ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
    Added rule with ID 253
    # ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
    Added rule with ID 252
    # ./filer_decode /sys/kernel/debug/gfar1/filer_raw
    00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
    01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
    02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
    03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
    04: TOS == 00000003 AND Q:00 ctrl:0000008a prop:00000003
    05: DIA == 0a000003 AND Q:11 ctrl:0000448c prop:0a000003
    06: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
    07: TOS == 00000002 AND Q:00 ctrl:0000008a prop:00000002
    08: DIA == 0a000002 AND Q:09 ctrl:0000248c prop:0a000002
    09: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
    0a: DPT == 00000001 AND Q:00 ctrl:0000008e prop:00000001
    0b: TOS == 00000001 CLE Q:01 ctrl:0000060a prop:00000001
    ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000

    (Entire cluster gets AND-ed together).

    (3) We observed that the masking rules it generates do not
    play well with clustering on P2020. Only first rule
    of the cluster would ever fire. Given that optimizer
    relies heavily on masking this is very hard to fix.

    Example:
    # ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 action 1
    Added rule with ID 254
    # ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 action 9
    Added rule with ID 253
    # ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 action 17
    Added rule with ID 252
    # ./filer_decode /sys/kernel/debug/gfar1/filer_raw
    00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
    01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
    02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
    03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
    04: DIA == 0a000003 Q:11 ctrl:0000440c prop:0a000003
    05: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
    06: DIA == 0a000002 Q:09 ctrl:0000240c prop:0a000002
    07: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
    08: DPT == 00000001 CLE Q:01 ctrl:0000060e prop:00000001
    ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000

    Which looks correct according to the spec but only the first
    (eth id 252)/last added rule for 10.0.0.3 will ever trigger.
    As if filer did not treat the AND CLE as cluster start but
    also kept AND-ing the rules. We found no errata covering this.

    The fact that nobody noticed (2) or (3) makes me think
    that this feature is not very widely used and we should just
    remove it.

    Reported-by: Aleksander Dutkowski
    Signed-off-by: Jakub Kicinski
    Acked-by: Claudiu Manoil
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • At a cost of one line let's make sure .count is correct
    when calling gfar_process_filer_changes().

    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • MAX_FILER_IDX is the last usable index. Using less-than
    will already guarantee that one entry for catch-all rule
    will be left, no need to subtract 1 here.
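
    The bound arithmetic can be checked with a small stand-alone sketch.
    TOY_MAX_FILER_IDX is a local name, and its value 0xFF is an assumption
    taken from the "ff:" catch-all entry in the filer dumps quoted in this
    log.

```c
#include <assert.h>

/* Assumed value for illustration; see the "ff:" entry in the filer dumps. */
#define TOY_MAX_FILER_IDX 0xFF

/* Count the indices a rule may occupy when the check is "idx < MAX". */
static int toy_usable_rule_slots(void)
{
    int n = 0;

    for (int idx = 0; idx < TOY_MAX_FILER_IDX; idx++)
        n++;
    return n;
}
```

    With a strict less-than, indices 0x00 through 0xFE are usable and index
    0xFF stays free for the catch-all rule, so subtracting 1 would waste one
    more entry.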

    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • When the first slave is added (such as during bootup) the first
    gratuitous ARP gets dropped. We don't see this drop during a failover.
    The packet gets dropped in qdisc (noop_enqueue).

    The fix is to delay the sending of gratuitous ARPs till the bond dev's
    carrier is present.

    It can also be worked around by setting num_grat_arp to more than 1.

    Signed-off-by: Venkat Venkatsubra
    Signed-off-by: David S. Miller

    Venkat Venkatsubra
     
  • In case we need to divert reads/writes using the slave MII bus, we may have
    already fetched a valid PHY interface property from Device Tree, and that
    mode is used by the PHY driver to make configuration decisions.

    If we could not fetch the "phy-mode" property, we assign
    PHY_INTERFACE_MODE_NA to p->phy_interface, so that we can check for that
    value to decide whether or not to override the interface.
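
    The override rule can be sketched in isolation. The enum and function
    below are hypothetical stand-ins; the kernel's phy_interface_t has many
    more modes and the real code lives in the DSA slave setup.

```c
#include <assert.h>

/* Local stand-ins for a few interface modes; TOY_MODE_NA means "not set". */
enum toy_phy_mode { TOY_MODE_NA = 0, TOY_MODE_RGMII, TOY_MODE_GMII };

/*
 * Keep a mode that was already fetched from Device Tree; only fall back
 * to the PHY's own mode when nothing was configured (TOY_MODE_NA).
 */
static enum toy_phy_mode toy_pick_mode(enum toy_phy_mode configured,
                                       enum toy_phy_mode from_phy)
{
    return configured == TOY_MODE_NA ? from_phy : configured;
}
```

    Using a dedicated "not available" value as the sentinel is what makes
    the check possible without a separate flag.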

    Fixes: 19334920eaf7 ("net: dsa: Set valid phy interface type")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Pull arm64 fix from Catalin Marinas:
    "Fix coarse clock monotonicity (VDSO timestamp off by one jiffy
    compared to the syscall one)"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: VDSO: fix coarse clock monotonicity regression

    Linus Torvalds
     
  • Pull amd drm fixes from Alex Deucher:
    "Dave is on vacation at the moment, so please pull these radeon and
    amdgpu fixes directly.

    Just a few minor things for 4.2:

    - add a new radeon pci id
    - fix a power management regression in amdgpu
    - fix HEVC command buffer validation in amdgpu"

    * 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux:
    drm/radeon: add new OLAND pci id
    Revert "drm/amdgpu: Configure doorbell to maximum slots"
    drm/amdgpu: add context buffer size check for HEVC

    Linus Torvalds
     
  • Signed-off-by: Alex Deucher
    Cc: stable@vger.kernel.org

    Alex Deucher
     
  • This reverts commit 78ad5cdd21f0d614983fc397338944e797ec70b9.
    This commit breaks dpm and suspend/resume on CZ.

    Alex Deucher
     
  • Signed-off-by: Boyuan Zhang
    Reviewed-by: Christian König

    Boyuan Zhang
     
  • Pull regmap fix from Mark Brown:
    "regmap: Fix handling of present bits on rbtree cache block resize

    When expanding a cache block we use krealloc() to resize the register
    present bitmap without initialising the newly allocated data (the
    original code was written for kzalloc()). Add an appropriate memset()
    to fix that"

    * tag 'regmap-fix-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
    regmap: regcache-rbtree: Clean new present bits on present bitmap resize
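
    A user-space analogue of the fix, using realloc() in place of
    krealloc(), might look like the sketch below. The helper name and macros
    are hypothetical. It assumes new_bits is non-zero and that any unused
    bits in the last old word were already zero.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(bits) (((bits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/*
 * Grow a bitmap from old_bits to new_bits. realloc() leaves the added
 * bytes uninitialised, so the new tail is zeroed explicitly.
 * Returns the new pointer, or NULL on failure (the old map stays valid).
 */
static unsigned long *toy_grow_bitmap(unsigned long *map,
                                      size_t old_bits, size_t new_bits)
{
    size_t old_words = WORDS_FOR(old_bits);
    size_t new_words = WORDS_FOR(new_bits);
    unsigned long *p;

    assert(new_bits >= old_bits);
    p = realloc(map, new_words * sizeof(*p));
    if (!p)
        return NULL;
    memset(p + old_words, 0, (new_words - old_words) * sizeof(*p));
    return p;
}
```

    Without the memset(), the newly added words could contain stale data and
    report registers as present that were never cached.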

    Linus Torvalds
     

12 Aug, 2015

9 commits

  • When creating dm-cache with the default policy, it will call
    request_module("dm-cache-default") to register the default policy.
    But the "dm-cache-default" alias was left referring to the MQ policy.
    Fix this by moving the module alias to SMQ.

    Fixes: bccab6a0 (dm cache: switch the "default" cache replacement policy from mq to smq)
    Signed-off-by: Yi Zhang
    Signed-off-by: Mike Snitzer

    Yi Zhang
     
  • When using nested btrees, the top leaves of the top levels contain
    block addresses for the root of the next tree down. If we shadow a
    shared leaf node the leaf values (sub tree roots) should be incremented
    accordingly.

    This is only an issue if there is metadata sharing in the top levels,
    which only occurs if metadata snapshots are being used (as is possible
    with dm-thinp). It could result in a block from the thinp metadata
    snap being reused early, thus corrupting the thinp metadata snap.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     
  • The device details and mapping trees were just being decremented
    before. Now btree_del() is called to do a deep delete.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     
  • Pull localmodconfig fix from Steven Rostedt:
    "Leonidas Spyropoulos found that modules like nouveau were being
    unselected by make localmodconfig even though their configs were set
    and the module was loaded and visible by lsmod.

    The reason for this was because streamline-config.pl only looks at
    Makefiles, and not Kbuild files. As these modules use Kbuild for
    their names, they too need to be checked by localmodconfig. This was
    fixed by Richard Weinberger"

    * tag 'localmodconfig-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-kconfig:
    localmodconfig: Use Kbuild files too

    Linus Torvalds
     
  • In kbuild it is allowed to define objects in files named "Makefile"
    and "Kbuild".
    Currently localmodconfig reads objects only from "Makefile"s and misses
    modules like nouveau.

    Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.at

    Cc: stable@vger.kernel.org
    Reported-and-tested-by: Leonidas Spyropoulos
    Signed-off-by: Richard Weinberger
    Signed-off-by: Steven Rostedt

    Richard Weinberger
     
  • Johan Hedberg says:

    ====================
    pull request: bluetooth 2015-08-11

    Here's an important regression fix for the 4.2-rc series that ensures
    user space isn't given invalid LTK values. The bug essentially prevents
    the encryption of subsequent LE connections, i.e. makes it impossible to
    pair devices over LE.

    Let me know if there are any issues pulling. Thanks.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We are not interested in interrupts for partially transmitted frames.
    Unlike SCC and FCC, the FEC doesn't handle the I bit in buffer
    descriptors, instead it defines two interrupt bits, TXB and TXF.

    We have to mask TXB in order to only get interrupts once the
    frame is fully transmitted.
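
    The masking idea can be shown with plain bit operations. The bit values
    and names below are illustrative assumptions, not taken from the driver
    or a datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions; treat the exact values as assumptions. */
#define TOY_EV_TXF UINT32_C(0x08000000) /* frame fully transmitted */
#define TOY_EV_TXB UINT32_C(0x04000000) /* one buffer transmitted */

/* Build the interrupt mask: drop TXB so only whole-frame events fire. */
static uint32_t toy_tx_irq_mask(uint32_t wanted)
{
    return wanted & ~TOY_EV_TXB;
}
```

    Passing TOY_EV_TXF | TOY_EV_TXB yields a mask with only TOY_EV_TXF set,
    so a multi-buffer frame raises a single interrupt on completion.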

    Signed-off-by: Christophe Leroy
    Signed-off-by: David S. Miller

    LEROY Christophe
     
  • We are not interested in interrupts for partially transmitted frames,
    we have to clear BD_ENET_TX_INTR explicitly otherwise it may remain
    from a previously used descriptor.
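
    Why the flag must be cleared explicitly is easier to see in a sketch of
    descriptor reuse. The flag values and the toy_prepare_tx_bd() helper are
    hypothetical and only model the status word handling.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; treat them as assumptions. */
#define TOY_BD_TX_READY 0x8000u
#define TOY_BD_TX_INTR  0x1000u
#define TOY_BD_TX_LAST  0x0800u

/* Compute the status word for a descriptor that is being reused. */
static uint16_t toy_prepare_tx_bd(uint16_t old_sc, int is_last_fragment)
{
    uint16_t sc = old_sc;

    /* Clear flags that may be left over from the descriptor's last use. */
    sc = (uint16_t)(sc & ~(TOY_BD_TX_INTR | TOY_BD_TX_LAST));

    /* Only the final fragment of a frame asks for an interrupt. */
    if (is_last_fragment)
        sc = (uint16_t)(sc | TOY_BD_TX_INTR | TOY_BD_TX_LAST);

    return (uint16_t)(sc | TOY_BD_TX_READY);
}
```

    A descriptor that last carried a final fragment still has the interrupt
    flag set, so reusing it for a partial fragment without the clearing step
    would raise an unwanted interrupt.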

    Signed-off-by: Christophe Leroy
    Signed-off-by: David S. Miller

    LEROY Christophe
     
  • Pull fbdev fixes from Tomi Valkeinen:
    - fix display regression on Versatile boards
    - fix OF node refcount bugs on omapdss
    - fix WARN about clock prepare on pxa3xx_gcu
    - fix mem leak in videomode helpers
    - fix fbconsole related boot problem on sun7i-a20-olinuxino-micro

    * tag 'fbdev-fixes-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
    fbcon: unconditionally initialize cursor blink interval
    video: Fix possible leak in of_get_videomode()
    video: fbdev: pxa3xx_gcu: prepare the clocks
    OMAPDSS: Fix omap_dss_find_output_by_port_node() port refcount decrement
    OMAPDSS: Fix node refcount leak in omapdss_of_get_next_port()
    fbdev: select versatile helpers for the integrator

    Linus Torvalds
     

11 Aug, 2015

3 commits

  • Since 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the
    real timekeeper last") it has become possible on ARM to:

    - Obtain a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE timestamp
    via syscall.
    - Subsequently obtain a timestamp for the same clock ID via VDSO which
    predates the first timestamp (by one jiffy).

    This is because ARM's update_vsyscall is deriving the coarse time
    using the __current_kernel_time interface, when it should really be
    using the timekeeper object provided to it by the timekeeping core.
    It happened to work before only because __current_kernel_time would
    access the same timekeeper object which had been passed to
    update_vsyscall. This is no longer the case.
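
    The ordering problem can be probed from user space roughly as follows.
    This is a sketch under assumptions: a 64-bit Linux system whose C
    library routes clock_gettime() through the vDSO, and hypothetical
    helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Return 1 if b is not earlier than a, else 0. */
static int ts_not_before(const struct timespec *a, const struct timespec *b)
{
    if (b->tv_sec != a->tv_sec)
        return b->tv_sec > a->tv_sec;
    return b->tv_nsec >= a->tv_nsec;
}

/*
 * Read the coarse clock via a raw system call, then via the C library
 * (which may use the vDSO), and check the second reading is not earlier.
 * Returns 1 if ordered, 0 if time went backwards, -1 on error.
 */
static int toy_coarse_clock_ordered(clockid_t clk)
{
    struct timespec via_syscall, via_libc;

    if (syscall(SYS_clock_gettime, clk, &via_syscall) != 0)
        return -1;
    if (clock_gettime(clk, &via_libc) != 0)
        return -1;
    return ts_not_before(&via_syscall, &via_libc);
}
```

    On an affected kernel, calling this in a loop with
    CLOCK_MONOTONIC_COARSE would be expected to return 0 occasionally; a
    single call is not a reliable detector.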

    Cc: stable@vger.kernel.org
    Fixes: 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the real timekeeper last")
    Signed-off-by: Nathan Lynch
    Acked-by: Will Deacon
    Signed-off-by: Russell King

    Nathan Lynch
     
  • The commit ccc9d90a9a8b5c4ad7e9708ec41f75ff9e98d61d "xenbus_client:
    Extend interface to support multi-page ring" removes the call to
    free_xenballooned_pages() in xenbus_unmap_ring_vfree_hvm(), leaking a
    page for every shared ring.

    Only backends running in HVM domains were affected.

    Signed-off-by: Julien Grall
    Cc:
    Reviewed-by: Boris Ostrovsky
    Reviewed-by: Wei Liu
    Signed-off-by: David Vrabel

    Julien Grall
     
  • This reverts commit fcdf31a7c162de0c93a2bee51df4688ab0a348f8.

    This was causing a WARNING whenever a PIRQ was closed since
    shutdown_pirq() is called with irqs disabled.

    Signed-off-by: David Vrabel
    Cc:

    David Vrabel