02 Mar, 2017

2 commits


28 Feb, 2017

1 commit

  • Apart from adding the helper function itself, the rest of the kernel is
    converted mechanically using:

    git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_count);/mmgrab\(\1\);/'
    git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_count);/mmgrab\(\&\1\);/'

    This is needed for a later patch that hooks into the helper, but might
    be a worthwhile cleanup on its own.
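
The conversion described above can be sketched in userspace as follows. This is a minimal model, not the kernel source: `struct mm_struct` is reduced to the one counter the message mentions, C11 `atomic_fetch_add` stands in for the kernel's `atomic_inc`, and `grab_twice` is a hypothetical call site invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-in for the kernel's struct mm_struct; only the
 * reference counter named in the commit message is modeled. */
struct mm_struct {
    atomic_int mm_count;
};

/* Sketch of the helper: take one reference on the mm structure.
 * Call sites use this instead of open-coding the increment. */
static inline void mmgrab(struct mm_struct *mm)
{
    atomic_fetch_add(&mm->mm_count, 1);
}

/* Hypothetical call site showing the old and new forms side by side. */
static int grab_twice(struct mm_struct *mm)
{
    atomic_fetch_add(&mm->mm_count, 1); /* old: open-coded increment */
    mmgrab(mm);                         /* new: same effect via helper */
    return atomic_load(&mm->mm_count);
}
```

Routing every increment through one function gives a later patch a single place to hook, which is the motivation the message states.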

    (Michal Hocko provided most of the kerneldoc comment.)

    Link: http://lkml.kernel.org/r/20161218123229.22952-1-vegard.nossum@oracle.com
    Signed-off-by: Vegard Nossum
    Acked-by: Michal Hocko
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vegard Nossum
     

28 Apr, 2016

1 commit

  • Some architectures (such as Alpha) rely on include/linux/sched.h definitions
    in their mmu_context.h files.

    So include sched.h before mmu_context.h.

    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: linux-kernel@vger.kernel.org
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

21 Feb, 2014

1 commit

  • finish_arch_post_lock_switch is called at the end of the task
    switch after all locks have been released. In concept it is paired
    with the switch_mm function, but the current code only does the
    call in finish_task_switch. Add the call to idle_task_exit and
    use_mm. One use case for the additional calls is s390 which will
    use finish_arch_post_lock_switch to wait for the completion of
    TLB flush operations.
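
The pairing described above might look like the following userspace model. Everything here is an assumption made for illustration: the `struct task` fields, the `flush_pending` flag that stands in for deferred architecture work, and the extra `struct task *` arguments, which exist only so the sketch can track state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_struct { int id; };

struct task {
    struct mm_struct *mm;
    struct mm_struct *active_mm;
    bool locked;        /* models task_lock() being held */
    bool flush_pending; /* models arch work deferred by switch_mm() */
};

static void task_lock(struct task *t)   { t->locked = true; }
static void task_unlock(struct task *t) { t->locked = false; }

/* Model: switching address spaces leaves work to finish later. */
static void switch_mm(struct mm_struct *prev, struct mm_struct *next,
                      struct task *t)
{
    (void)prev;
    (void)next;
    t->flush_pending = true;
}

/* Model: must run only after all locks have been released. */
static void finish_arch_post_lock_switch(struct task *t)
{
    assert(!t->locked);
    t->flush_pending = false;
}

/* Sketch of a use_mm()-like path with the added call. */
static void use_mm_sketch(struct task *t, struct mm_struct *mm)
{
    struct mm_struct *prev;

    task_lock(t);
    prev = t->active_mm;
    t->active_mm = mm;
    t->mm = mm;
    switch_mm(prev, mm, t);
    task_unlock(t);
    finish_arch_post_lock_switch(t); /* paired with switch_mm() above */
}
```

Without the final call, the model would leave `flush_pending` set after the switch, which is the gap the commit closes for paths other than finish_task_switch.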

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

08 May, 2013

1 commit

  • Bunch of performance improvements and cleanups Zach Brown and I have
    been working on. The code should be pretty solid at this point, though
    it could of course use more review and testing.

    The results in my testing are pretty impressive, particularly when an
    ioctx is being shared between multiple threads. In my crappy synthetic
    benchmark, with 4 threads submitting and one thread reaping completions,
    I saw overhead in the aio code go from ~50% (mostly ioctx lock
    contention) to low single digits. Performance with ioctx per thread
    improved too, but I'd have to rerun those benchmarks.

    The reason I've been focused on performance when the ioctx is shared is
    that for a fair number of real world completions, userspace needs the
    completions aggregated somehow - in practice people just end up
    implementing this aggregation in userspace today, but if it's done right
    we can do it much more efficiently in the kernel.

    Performance wise, the end result of this patch series is that submitting
    a kiocb writes to _no_ shared cachelines - the penalty for sharing an
    ioctx is gone there. There's still going to be some cacheline
    contention when we deliver the completions to the aio ringbuffer (at
    least if you have interrupts being delivered on multiple cores, which
    for high end stuff you do) but I have a couple more patches not in this
    series that implement coalescing for that (by taking advantage of
    interrupt coalescing). With that, there's basically no bottlenecks or
    performance issues to speak of in the aio code.

    This patch:

    use_mm() is used in more places than just aio. There's no need to mention
    callers when describing the function.

    Signed-off-by: Zach Brown
    Signed-off-by: Kent Overstreet
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zach Brown
     

22 Mar, 2012

1 commit


31 Oct, 2011

1 commit


25 Mar, 2010

1 commit

  • In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss
    (called from do_exit) when workqueue is destroyed. This does not happen
    on net-next, or with vhost on top of 2.6.33.

    The issue seems to be introduced by
    34e55232e59f7b19050267a05ff1226e5cd122a5 ("mm: avoid false sharing of
    mm_counter"), which added sync_mm_rss() that is passed task->mm, and
    dereferences it without checking. If task is a kernel thread, mm might be
    NULL. I think this might also happen e.g. with aio.

    This patch fixes the oops by calling sync_mm_rss when task->mm is set to
    NULL. I also added BUG_ON to detect any other cases where counters get
    incremented while mm is NULL.
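
A rough userspace model of the fix described above is sketched below. The struct layouts, the `rss_delta` field, and the names `add_rss_fast`, `unuse_mm_sketch` and `exit_sketch` are assumptions made for illustration, and the NULL guard in `exit_sketch` is part of the sketch rather than something this message states. `assert` stands in for BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

struct mm_struct { long rss; };

struct task {
    struct mm_struct *mm;
    long rss_delta; /* per-task cached counter, flushed lazily */
};

/* Fast-path increment; the assert models the added BUG_ON. */
static void add_rss_fast(struct task *t, long n)
{
    assert(t->mm != NULL);
    t->rss_delta += n;
}

/* Flush the cached delta into mm. Dereferences mm, so mm must be valid. */
static void sync_mm_rss(struct task *t, struct mm_struct *mm)
{
    mm->rss += t->rss_delta;
    t->rss_delta = 0;
}

/* Sketch of the fix: flush while mm is still valid, then clear it. */
static void unuse_mm_sketch(struct task *t, struct mm_struct *mm)
{
    sync_mm_rss(t, mm);
    t->mm = NULL;
}

/* Exit path in this model: nothing is left to flush for a task
 * whose mm has already been cleared. */
static void exit_sketch(struct task *t)
{
    if (t->mm)
        sync_mm_rss(t, t->mm);
}
```

The point of flushing in the unuse path is that the exit path never has to touch a NULL mm to account for counters accumulated while the mm was borrowed.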

    The oops I observed looks like this:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
    IP: [] sync_mm_rss+0x33/0x6f
    PGD 0
    Oops: 0002 [#1] SMP
    last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
    CPU 2
    Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]

    Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
    RIP: 0010:[] [] sync_mm_rss+0x33/0x6f
    RSP: 0018:ffff8802379b7e60 EFLAGS: 00010202
    RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
    RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
    RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
    R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
    Stack:
    ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
    0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
    ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
    Call Trace:
    [] do_exit+0x147/0x6c4
    [] ? handle_rx_net+0x0/0x17 [vhost_net]
    [] ? autoremove_wake_function+0x0/0x39
    [] ? worker_thread+0x0/0x229
    [] kthreadd+0x0/0xf2
    [] kernel_thread_helper+0x4/0x10
    [] ? kthread+0x0/0x87
    [] ? kernel_thread_helper+0x0/0x10
    Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
    RIP [] sync_mm_rss+0x33/0x6f
    RSP
    CR2: 00000000000002a8
    ---[ end trace 41603ba922beddd2 ]---
    Fixing recursive fault but reboot is needed!

    (Note: handle_rx_net is a work item using the workqueue in question.)
    sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
    34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.

    The module in question calls use_mm and later unuse_mm from a kernel
    thread. It is when this kernel thread is destroyed that the crash
    happens.

    Signed-off-by: Michael S. Tsirkin
    Andrea Arcangeli
    Reviewed-by: Rik van Riel
    Reviewed-by: KAMEZAWA Hiroyuki
    Reviewed-by: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin
     

15 Jan, 2010

1 commit


22 Sep, 2009

2 commits

  • When the mm being switched to matches the active mm, we don't need to
    increment and then drop the mm count. In a simple benchmark this happens
    about 50% of the time. Making that conditional reduces contention on that
    cacheline on SMP systems.
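
The conditional reference counting described above can be modeled in userspace roughly as follows. The names `mm_get`, `mm_put`, `use_mm_cond` and the `refcount_ops` counter are hypothetical, introduced only to make the saved atomic operations visible. A real drop would also free the structure when the count reaches zero, which this sketch omits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct mm_struct { atomic_int mm_count; };

struct task {
    struct mm_struct *mm;
    struct mm_struct *active_mm;
};

/* Counts atomic operations on mm_count so the saving can be observed. */
static int refcount_ops;

static void mm_get(struct mm_struct *mm)
{
    atomic_fetch_add(&mm->mm_count, 1);
    refcount_ops++;
}

static void mm_put(struct mm_struct *mm)
{
    atomic_fetch_sub(&mm->mm_count, 1);
    refcount_ops++;
}

/* Touch the shared counter only when the mm actually changes. */
static void use_mm_cond(struct task *t, struct mm_struct *mm)
{
    struct mm_struct *active_mm = t->active_mm;

    if (active_mm != mm) {
        mm_get(mm);
        t->active_mm = mm;
    }
    t->mm = mm;
    if (active_mm != mm && active_mm != NULL)
        mm_put(active_mm);
}
```

When the requested mm is already the active one, the function performs no atomic operation on `mm_count`, so the cacheline holding it is not written.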

    Acked-by: Andrea Arcangeli
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin
     
  • Anyone who wants to copy to/from user space from a kernel thread needs
    use_mm (like what fs/aio has). Move that into mm/, to make reusing and
    exporting easier down the line, and make aio use it. Next intended user,
    besides aio, will be vhost-net.

    Acked-by: Andrea Arcangeli
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin