31 Oct, 2011

1 commit


25 Mar, 2010

1 commit

  • In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss
    (called from do_exit) when workqueue is destroyed. This does not happen
    on net-next, or with vhost on top of to 2.6.33.

    The issue seems to be introduced by
    34e55232e59f7b19050267a05ff1226e5cd122a5 ("mm: avoid false sharing of
    mm_counter) which added sync_mm_rss() that is passed task->mm, and
    dereferences it without checking. If task is a kernel thread, mm might be
    NULL. I think this might also happen e.g. with aio.

    This patch fixes the oops by calling sync_mm_rss when task->mm is set to
    NULL. I also added BUG_ON to detect any other cases where counters get
    incremented while mm is NULL.

    The oops I observed looks like this:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
    IP: [] sync_mm_rss+0x33/0x6f
    PGD 0
    Oops: 0002 [#1] SMP
    last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
    CPU 2
    Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]

    Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
    RIP: 0010:[] [] sync_mm_rss+0x33/0x6f
    RSP: 0018:ffff8802379b7e60 EFLAGS: 00010202
    RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
    RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
    RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
    R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
    Stack:
    ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
    0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
    ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
    Call Trace:
    [] do_exit+0x147/0x6c4
    [] ? handle_rx_net+0x0/0x17 [vhost_net]
    [] ? autoremove_wake_function+0x0/0x39
    [] ? worker_thread+0x0/0x229
    [] kthreadd+0x0/0xf2
    [] kernel_thread_helper+0x4/0x10
    [] ? kthread+0x0/0x87
    [] ? kernel_thread_helper+0x0/0x10
    Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
    RIP [] sync_mm_rss+0x33/0x6f
    RSP
    CR2: 00000000000002a8
    ---[ end trace 41603ba922beddd2 ]---
    Fixing recursive fault but reboot is needed!

    (note: handle_rx_net is a work item using workqueue in question).
    sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting
    34e55232e59f7b19050267a05ff1226e5cd122a5 and the oops goes away.

    The module in question calls use_mm and later unuse_mm from a kernel
    thread. It is when this kernel thread is destroyed that the crash
    happens.

    Signed-off-by: Michael S. Tsirkin
    Andrea Arcangeli
    Reviewed-by: Rik van Riel
    Reviewed-by: KAMEZAWA Hiroyuki
    Reviewed-by: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin
     

15 Jan, 2010

1 commit


22 Sep, 2009

2 commits

  • When the mm being switched to matches the active mm, we don't need to
    increment and then drop the mm count. In a simple benchmark this happens
    in about 50% of time. Making that conditional reduces contention on that
    cacheline on SMP systems.

    Acked-by: Andrea Arcangeli
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin
     
  • Anyone who wants to do copy to/from user from a kernel thread, needs
    use_mm (like what fs/aio has). Move that into mm/, to make reusing and
    exporting easier down the line, and make aio use it. Next intended user,
    besides aio, will be vhost-net.

    Acked-by: Andrea Arcangeli
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin