15 Mar, 2011

23 commits

  • To avoid forking a usermode thread when creating an idle task, move fork_idle
    to a work queue.

    If the kernel starts with a maxcpus= option that does not bring all available
    cpus online at boot time, idle tasks for the offline cpus are not created. If
    the offline cpus are later hotplugged through sysfs, __cpu_up is called in
    the context of the user task, and fork_idle copies its non-zero mm
    pointer. This causes a BUG() in per_cpu_trap_init.

    This also avoids issues with resource limits of the CPU writing to sysfs,
    containers, maybe others.

    Signed-off-by: Maksim Rayskiy
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2070/
    Signed-off-by: Ralf Baechle

    Maksim Rayskiy
     
  • Leverage the commit for ARM by Will Deacon:

    - 446a5a8b1eb91a6990e5c8fe29f14e7a95b69132
    ARM: 6205/1: perf: ensure counter delta is treated as unsigned

    Hardware performance counters on ARM are 32-bits wide but atomic64_t
    variables are used to represent counter data in the hw_perf_event structure.

    The armpmu_event_update function right-shifts a signed 64-bit delta variable
    and adds the result to the event count. This can lead to shifting in sign-bits
    if the MSB of the 32-bit counter value is set. This results in perf output
    such as:

    Performance counter stats for 'sleep 20':

    18446744073460670464 cycles

    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: wuzhangjin@gmail.com
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/2015/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
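
The sign-extension problem described above can be reproduced in user space. The following is a minimal sketch, not the kernel code: the function names and the COUNTER_BITS constant are hypothetical, and the signed variant relies on right-shifting a negative value, which ISO C leaves implementation-defined (GCC and Clang shift arithmetically).

```c
#include <stdint.h>

/* Assumed width of the hardware counter for this illustration. */
#define COUNTER_BITS 32
#define SHIFT (64 - COUNTER_BITS)

/*
 * Buggy pattern: the delta is held in a signed 64-bit variable, so the
 * right shift may copy the sign bit into the upper half of the result.
 */
uint64_t delta_signed(uint32_t prev, uint32_t now)
{
    int64_t delta = (int64_t)(((uint64_t)now << SHIFT) -
                              ((uint64_t)prev << SHIFT));
    delta >>= SHIFT;
    return (uint64_t)delta;
}

/* Fixed pattern: an unsigned delta shifts in zeros. */
uint64_t delta_unsigned(uint32_t prev, uint32_t now)
{
    uint64_t delta = ((uint64_t)now << SHIFT) - ((uint64_t)prev << SHIFT);
    delta >>= SHIFT;
    return delta;
}
```

With prev = 0 and now = 0x80000000, delta_unsigned() yields 0x80000000, while delta_signed() on GCC or Clang yields a value close to 2^64, which is the kind of huge count shown in the perf output above.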
     
  • This is the MIPS part of the following commits by Frederic Weisbecker:

    - f72c1a931e311bb7780fee19e41a89ac42cab50e
    perf: Factorize callchain context handling

    Store the kernel and user contexts from the generic layer instead
    of archs, this gathers some repetitive code.

    - 56962b4449af34070bb1994621ef4f0265eed4d8
    perf: Generalize some arch callchain code

    - Most archs use one callchain buffer per cpu, except x86 that needs
    to deal with NMIs. Provide a default perf_callchain_buffer()
    implementation that x86 overrides.

    - Centralize all the kernel/user regs handling and invoke new arch
    handlers from there: perf_callchain_user() / perf_callchain_kernel()
    This avoids all the user_mode(), current->mm checks and so on.

    - Invert some parameters in perf_callchain_*() helpers: entry to the
    left, regs to the right, following the traditional (dst, src).

    - 70791ce9ba68a5921c9905ef05d23f62a90bc10c
    perf: Generalize callchain_store()

    callchain_store() is the same on every arch, inline it in
    perf_event.h and rename it to perf_callchain_store() to avoid
    any collision.

    This removes repetitive code.

    - c1a65932fd7216fdc9a0db8bbffe1d47842f862c
    perf: Drop unappropriate tests on arch callchains

    Drop the TASK_RUNNING test on user tasks for callchains as
    this check doesn't seem to make any sense.

    Also remove the tests for !current that is not supposed to
    happen and current->pid as this should be handled at the
    generic level, with exclude_idle attribute.

    Reported-by: Wu Zhangjin
    Acked-by: Frederic Weisbecker
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: dengcheng.zhu@gmail.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/2014/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
     
  • Ignore events that are in off/error state or belong to a different PMU.

    This patch originates from the following commit for ARM by Will Deacon:

    - 65b4711ff513767341aa1915c822de6ec0de65cb
    ARM: 6352/1: perf: fix event validation

    The validate_event function in the ARM perf events backend has the
    following problems:

    1.) Events that are disabled count towards the cost.
    2.) Events associated with other PMUs [for example, software events or
    breakpoints] do not count towards the cost, but do fail validation,
    causing the group to fail.

    This patch changes validate_event so that it ignores events in the
    PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.

    Acked-by: Will Deacon
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: wuzhangjin@gmail.com
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: dengcheng.zhu@gmail.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Cc: ddaney@caviumnetworks.com
    Patchwork: http://patchwork.linux-mips.org/patch/2013/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
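
The validation rule described above can be modelled outside the kernel. This is a sketch under stated assumptions: the struct and function names are hypothetical, and the state values are only intended to mirror the ordering of the kernel's perf event states (error below off, off below inactive).

```c
#include <stddef.h>

/* Illustrative state values; error and off sort below inactive. */
enum event_state {
    STATE_ERROR = -2,
    STATE_OFF = -1,
    STATE_INACTIVE = 0,
    STATE_ACTIVE = 1
};

struct pmu { const char *name; };

struct event {
    const struct pmu *pmu;
    enum event_state state;
};

/* Returns 1 if the event should be skipped during group validation. */
int event_is_ignored(const struct event *ev, const struct pmu *this_pmu)
{
    return ev->pmu != this_pmu || ev->state <= STATE_OFF;
}

/* Counts the events that actually need a counter on this_pmu. */
int count_needed_counters(const struct event *evs, size_t n,
                          const struct pmu *this_pmu)
{
    int needed = 0;

    for (size_t i = 0; i < n; i++)
        if (!event_is_ignored(&evs[i], this_pmu))
            needed++;
    return needed;
}
```

Events in the off or error state and events that belong to another PMU neither consume a counter nor cause the group to fail in this model.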
     
  • This is the MIPS part of the following commits by Peter Zijlstra:

    - a4eaf7f14675cb512d69f0c928055e73d0c6d252
    perf: Rework the PMU methods

    Replace pmu::{enable,disable,start,stop,unthrottle} with
    pmu::{add,del,start,stop}, all of which take a flags argument.

    The new interface extends the capability to stop a counter while
    keeping it scheduled on the PMU. We replace the throttled state with
    the generic stopped state.

    This also allows us to efficiently stop/start counters over certain
    code paths (like IRQ handlers).

    It also allows scheduling a counter without it starting, allowing for
    a generic frozen state (useful for rotating stopped counters).

    The stopped state is implemented in two different ways, depending on
    how the architecture implemented the throttled state:

    1) We disable the counter:
    a) the pmu has per-counter enable bits, we flip that
    b) we program a NOP event, preserving the counter state

    2) We store the counter state and ignore all read/overflow events

    For MIPSXX, the stopped state is implemented in the way of 1.b as above.

    - 33696fc0d141bbbcb12f75b69608ea83282e3117
    perf: Per PMU disable

    Changes perf_disable() into perf_pmu_disable().

    - 24cd7f54a0d47e1d5b3de29e2456bfbd2d8447b7
    perf: Reduce perf_disable() usage

    Since the current perf_disable() usage is only an optimization,
    remove it for now. This eases the removal of the __weak
    hw_perf_enable() interface.

    - b0a873ebbf87bf38bf70b5e39a7cadc96099fa13
    perf: Register PMU implementations

    Simple registration interface for struct pmu, this provides the
    infrastructure for removing all the weak functions.

    - 51b0fe39549a04858001922919ab355dee9bdfcf
    perf: Deconstify struct pmu

    sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

    Reported-by: Wu Zhangjin
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: wuzhangjin@gmail.com
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: dengcheng.zhu@gmail.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Cc: ddaney@caviumnetworks.com
    Patchwork: http://patchwork.linux-mips.org/patch/2012/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
     
  • This is the MIPS part of the following commit by Peter Zijlstra:

    - e360adbe29241a0194e10e20595360dd7b98a2b3
    irq_work: Add generic hardirq context callbacks

    Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like wakeup a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this get to do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

    For MIPSXX, we need to call irq_work_run() at the tail of the perf IRQ
    handler as described above.

    Reported-by: Wu Zhangjin
    Acked-by: Peter Zijlstra
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/2011/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
     
  • Signed-off-by: Yoichi Yuasa
    Cc: linux-mips
    Patchwork: https://patchwork.linux-mips.org/patch/2055/
    Signed-off-by: Ralf Baechle

    Yoichi Yuasa
     
  • This error was reported by cppcheck:
    arch/mips/loongson/common/machtype.c:56: error: Dangerous usage of 'str' (strncpy doesn't always 0-terminate it)

    If strncpy copied MACHTYPE_LEN bytes, the destination string str
    was not terminated.

    The patch adds one more byte to str and makes sure that this byte is
    always 0.

    Signed-off-by: Stefan Weil
    Cc: Wu Zhangjin
    Cc: Arnaud Patard
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/2053/
    Signed-off-by: Ralf Baechle

    Stefan Weil
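
The pattern the patch describes (one extra byte that is always set to 0) can be sketched as follows. The value of MACHTYPE_LEN here and the helper name copy_machtype() are assumptions made for illustration only.

```c
#include <string.h>

/* Illustrative value only; not taken from the kernel source. */
#define MACHTYPE_LEN 8

/*
 * Copies at most MACHTYPE_LEN bytes from src. dst has one extra byte,
 * which is always set to 0, so the result is terminated even when
 * strncpy() fills the first MACHTYPE_LEN bytes completely.
 */
void copy_machtype(char dst[MACHTYPE_LEN + 1], const char *src)
{
    strncpy(dst, src, MACHTYPE_LEN);
    dst[MACHTYPE_LEN] = '\0';
}
```

Without the extra byte, a source string of MACHTYPE_LEN or more characters would leave dst unterminated, which is what cppcheck flagged.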
     
  • Under some combinations of CONFIG_*, lastpfn in page_is_ram is 'set
    but not used'. Mark it as __maybe_unused to quiet the warning/error.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2033/
    Signed-off-by: Ralf Baechle

    David Daney
     
  • GCC-4.6 can find more unused code than previous versions could.

    In the case of arch/mips/math-emu/ieee754int.h, the COMPXSP and
    COMPXDP macros are used in several places, but a couple of them leave
    xs unused. The easiest thing to do is mark it as __maybe_unused to
    quiet the warning.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2032/
    Signed-off-by: Ralf Baechle

    David Daney
     
  • The variable arg3 in _sys_sysmips() is unused. Remove it.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2034/
    Signed-off-by: Ralf Baechle

    David Daney
     
  • GCC-4.6 can find more unused code than previous versions could.

    In the case of protected_restore_fp_context{,32}, the variable tmp is
    really used. Its use is tricky in that we really care about the side
    effects of the __put_user() calls. So we must mark tmp with
    __maybe_unused to quiet the warning.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2035/
    Signed-off-by: Ralf Baechle

    David Daney
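
A user-space sketch of the situation described above: a variable is assigned only so that calls with side effects are evaluated, and the compiler then reports it as set but not used. MAYBE_UNUSED is a hypothetical stand-in for the kernel's __maybe_unused, and store_with_side_effect() is a made-up substitute for __put_user().

```c
/* Hypothetical stand-in for __maybe_unused (GCC/Clang attribute). */
#define MAYBE_UNUSED __attribute__((unused))

/* Made-up helper: called for its side effect, returns a status code. */
int store_with_side_effect(int *slot, int value)
{
    *slot = value;
    return 0;
}

void restore_context(int *ctx)
{
    /* tmp is written but never read; only the stores matter here. */
    int tmp MAYBE_UNUSED;

    tmp = store_with_side_effect(&ctx[0], 1);
    tmp = store_with_side_effect(&ctx[1], 2);
}
```

The attribute changes no behaviour. It only tells the compiler that leaving tmp unread is intentional.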
     
  • Signed-off-by: Anoop P A
    To: Ben Hutchings
    To: linux-mips@linux-mips.org
    To: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/1804/
    Signed-off-by: Ralf Baechle

    Anoop P A
     
  • Signed-off-by: Anoop P A
    To: linux-mips@linux-mips.org
    To: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/1803/
    Tested-by: Shane McDonald
    Signed-off-by: Ralf Baechle

    Anoop P A
     
  • Loongson builds have an ad-hoc cmdline default of "console=ttyS0,115200
    root=/dev/hda1". These settings come from a vendor; I remember builds
    from Lemote branch requiring a "console=tty" override in order to get a
    working console.

    At least on Yeeloong, they're particularly useless: there's no external
    serial port, and the IDE drive is now recognised as /dev/sda.

    Signed-off-by: Robert Millan
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/1759/
    Signed-off-by: Ralf Baechle

    Robert Millan
     
  • The sysmips(MIPS_FIXADE, ...) case contains an obvious copy-and-paste
    error in the handling of the TIF_LOGADE flag. Fix that.

    Patchwork: https://patchwork.linux-mips.org/patch/1997/
    Signed-off-by: Ralf Baechle

    Stefan Oberhumer
     
  • It was reported that GCC-4.3.3 (with CodeSourcery extensions) fails
    without this.

    Reported-by: Jonas Gorski
    Signed-off-by: David Daney
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2010/
    Signed-off-by: Ralf Baechle

    David Daney
     
  • trace.func should be set to the recorded ip of the mcount calling site
    in the __mcount_loc section, so that the function entries configured
    through the tracing/set_graph_function interface can be filtered. Until
    now it was set to self_ra (the return address of mcount), which kept
    set_graph_function from working as expected.

    Fix this by calculating the right recorded ip in the __mcount_loc
    section and assigning it to trace.func.

    Reported-by: Zhiping Zhong
    Signed-off-by: Wu Zhangjin
    Cc: Steven Rostedt
    Cc: Sergei Shtylyov
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2017/
    Signed-off-by: Ralf Baechle

    Wu Zhangjin
     
  • This moves the comments out of ftrace_make_nop() and cleans it up. At
    the same time, a macro MCOUNT_OFFSET_INSNS is defined for sharing with
    the next patch.

    Signed-off-by: Wu Zhangjin
    Cc: Steven Rostedt
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2008/
    Signed-off-by: Ralf Baechle

    Wu Zhangjin
     
  • The old prepare_ftrace_return() for MIPS is confusing and has introduced
    some problems. This patch cleans up the names of the arguments, variables
    and related functions.

    For MIPS, the 2nd argument of prepare_ftrace_return() is not really the
    'selfpc' described in ftrace-design.txt; it is the self return
    address. This breaks compatibility with the generic interface, but it
    saves one unneeded calculation: to get the current function name, the
    parent return address and the self return address are enough, so there
    is no need to transform the self return address into the self address.

    set_graph_function of the function graph tracer is an exception. It
    takes the 2nd argument of prepare_ftrace_return() as 'selfpc' and uses
    it to match the user's configuration of function graph entries. In
    reality, what it needs is not 'selfpc' but the recorded ip address of
    the mcount calling site in the __mcount_loc section. So the 2nd
    argument of prepare_ftrace_return() is not important; the real
    requirement is that the right recorded ip address be calculated and
    assigned to trace.func. This will be fixed in the next patches.

    Reported-by: Zhiping Zhong
    Signed-off-by: Wu Zhangjin
    Cc: Steven Rostedt
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2007/
    Signed-off-by: Ralf Baechle

    Wu Zhangjin
     
  • The old in_module() may not work in some situations (e.g. when module and
    kernel are in the same address space with CONFIG_MAPPED_KERNEL=y).
    in_kernel_space() is more generic and is easy to implement by cloning
    the existing core_kernel_text(), so replace in_module() with
    in_kernel_space().

    Signed-off-by: Wu Zhangjin
    Cc: Steven Rostedt
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2005/
    Signed-off-by: Ralf Baechle

    Wu Zhangjin
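
As a rough illustration of an address-range test in the spirit of the description above, the sketch below checks whether an address falls inside a core text range. The struct, the explicit range parameter and the inclusive upper bound are assumptions; the kernel function works on linker-provided section symbols and is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical description of the core kernel text range. */
struct text_range {
    uintptr_t start;
    uintptr_t end;
};

/* Returns 1 if addr lies within the core text range, 0 otherwise. */
int in_kernel_space(const struct text_range *core, uintptr_t addr)
{
    return addr >= core->start && addr <= core->end;
}
```

A test of this shape does not depend on modules living in a separate address region, which is the property the commit message is after.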
     
  • This simply moves the "ip-=4" statement down to the end of the do { ...
    } while (...); loop, which saves one unneeded subtraction and the
    subsequent memory load and comparison.

    Signed-off-by: Wu Zhangjin
    Cc: Steven Rostedt
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2006/
    Signed-off-by: Ralf Baechle

    Wu Zhangjin
     
  • SPIN_LOCK_UNLOCKED is deprecated. Use the lockdep-capable variant instead.

    Signed-off-by: Thomas Gleixner
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2025/
    Signed-off-by: Ralf Baechle

    Thomas Gleixner
     

14 Mar, 2011

11 commits

  • Fix for a dumb preadv()/pwritev() compat bug - unlike the native
    variants, the compat_... ones forget to check FMODE_P{READ,WRITE}, so
    e.g. on a pipe the native preadv() will fail with -ESPIPE and the compat
    one will act as readv() and succeed.

    Not critical, but it's a clear bug with trivial fix, so IMO it's OK for
    -final.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
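
The missing check can be modelled with a small user-space sketch. The flag values, struct file_model and check_pread() are hypothetical and only illustrate the idea that a positional read must be refused when the file does not allow it.

```c
#include <errno.h>

/* Illustrative flag bits; names follow the FMODE_* flags in the text. */
#define FMODE_READ  0x1u
#define FMODE_PREAD 0x8u

/* Hypothetical minimal model of an open file. */
struct file_model {
    unsigned int mode;
};

/* Returns 0 if a positional read may proceed, else a negative errno. */
int check_pread(const struct file_model *f)
{
    if (!(f->mode & FMODE_READ))
        return -EBADF;
    if (!(f->mode & FMODE_PREAD))
        return -ESPIPE;   /* e.g. a pipe: readable, but not seekable */
    return 0;
}
```

A compat path that skips the second test would fall through to the ordinary read and succeed on a pipe, which is the behaviour the commit message describes.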
     
  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
    hwmon/f71882fg: Set platform drvdata to NULL later
    hwmon/f71882fg: Fix a typo in a comment

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
    Btrfs: break out of shrink_delalloc earlier
    btrfs: fix not enough reserved space
    btrfs: fix dip leak
    Btrfs: make sure not to return overlapping extents to fiemap
    Btrfs: deal with short returns from copy_from_user
    Btrfs: fix regressions in copy_from_user handling

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
    [SCSI] target: Fix t_transport_aborted handling in LUN_RESET + active I/O shutdown

    Linus Torvalds
     
  • Recent change to fixdep:

    commit b7bd182176960fdd139486cadb9962b39f8a2b50
    Author: Michal Marek
    Date: Thu Feb 17 15:13:54 2011 +0100

    fixdep: Do not record dependency on the source file itself

    changed the format of the *.cmd files without realizing that it is also
    used by modpost. Put the path to the source file back into the file, in a
    special variable, so that modpost sees all source files when calculating
    srcversion for modules.

    Reported-and-tested-by: Henrik Rydberg
    Signed-off-by: Michal Marek
    Signed-off-by: Linus Torvalds

    Michal Marek
     
  • * git://git.infradead.org/users/dwmw2/mtd-2.6.38:
    mtd: add "platform:" prefix for platform modalias
    mtd: mtd_blkdevs: fix double free on error path
    mtd: amd76xrom: fix oops at boot when resources are not available
    mtd: fix race in cfi_cmdset_0001 driver
    mtd: jedec_probe: initialise make sector erase command variable
    mtd: jedec_probe: Change variable name from cfi_p to cfi

    Linus Torvalds
     
  • * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
    drm/radeon: fix page flipping hangs on r300/r400
    drm/radeon: add pageflip hooks for fusion

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: fix mis-synchronisation in blkdev_issue_zeroout()

    Linus Torvalds
     
  • * 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
    ASoC: Ensure WM8958 gets all WM8994 late revision widgets
    ASoC: Fix typo in late revision WM8994 DAC2R name
    ASoC: Use the correct DAPM context when cleaning up final widget set
    ASoC: Fix broken bitfield definitions in WM8978
    ASoC: AM3517: Update codec name after multi-component update

    Linus Torvalds
     
  • The device table is required to load modules based on modaliases.

    After adding MODULE_DEVICE_TABLE, the entries below will be added to
    modules.pcimap:

    pch_gpio 0x00008086 0x00008803 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
    ml_ioh_gpio 0x000010db 0x0000802e 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0

    Signed-off-by: Axel Lin
    Cc: Tomoya MORINAGA
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • When vmscan.c calls page_referenced(), if an anon page was created
    before a process forked, rmap will search for it in both of the
    processes, even though one of them might have since broken COW.

    If the child process mlocks the vma that the COWed page belongs to,
    page_referenced() running on the page mapped by the parent would lead to
    *vm_flags getting VM_LOCKED set erroneously (leading to the references
    on the parent page being ignored and evicting the parent page too
    early).

    *mapcount would also be decremented by page_referenced_one even if the
    page wasn't found by page_check_address.

    This also lets pmdp_clear_flush_young_notify() go ahead on a
    pmd_trans_splitting() pmd.

    We hold the page_table_lock so __split_huge_page_map() must wait for
    pmdp_clear_flush_young_notify() to complete before it can modify the
    pmd. The pmd is also still mapped in userland so the young bit may
    materialize through a tlb miss before split_huge_page_map runs.

    This will provide a more accurate page_referenced() behavior during
    split_huge_page().

    Signed-off-by: Andrea Arcangeli
    Reported-by: Michel Lespinasse
    Reviewed-by: Michel Lespinasse
    Reviewed-by: Minchan Kim
    Reviewed-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

13 Mar, 2011

3 commits


12 Mar, 2011

1 commit

  • Josef had changed shrink_delalloc to exit after three shrink
    attempts, which wasn't quite enough because new writers could
    race in and steal free space.

    But it also fixed deadlocks and stalls as we tried to recover
    delalloc reservations. The code was tweaked to loop 1024
    times, and would reset the counter any time a small amount
    of progress was made. This was too drastic, and with a
    lot of writers we can end up stuck in shrink_delalloc forever.

    The shrink_delalloc loop is fairly complex because the caller is looping
    too, and the caller will go ahead and force a transaction commit to make
    sure we reclaim space.

    This reworks things to exit shrink_delalloc when we've forced some
    writeback and the delalloc reservations have gone down. This means
    the writeback has not just started but has also finished at
    least some of the metadata changes required to reclaim delalloc
    space.

    If we've got this wrong, we're returning ENOSPC too early, which
    is a big improvement over the current behavior of hanging the machine.

    Test 224 in xfstests hammers on this nicely, and with 1000 writers
    trying to fill a 1GB drive we get our first ENOSPC at 93% full. The
    other writers are able to continue until we get 100%.

    This is a worst case test for btrfs because the 1000 writers are doing
    small IO, and the small FS size means we don't have a lot of room
    for metadata chunks.

    Signed-off-by: Chris Mason

    Chris Mason
     

11 Mar, 2011

2 commits

  • BZ29402
    https://bugzilla.kernel.org/show_bug.cgi?id=29402

    We can hit serious mis-synchronization in bio completion path of
    blkdev_issue_zeroout() leading to a panic.

    The problem is that before calling wait_for_completion() in
    blkdev_issue_zeroout() we check whether bb.done equals issued (the number
    of submitted bios). If it does, we can skip wait_for_completion() and
    just return from the function, since there is nothing to wait for.
    However, there is an ordering problem: bio_batch_end_io() calls
    atomic_inc(&bb->done) before complete(), so it might seem to
    blkdev_issue_zeroout() that all bios have completed, and it exits. By
    the time bio_batch_end_io() calls complete(bb->wait), bb and wait no
    longer exist, since they were allocated on the stack in
    blkdev_issue_zeroout() ==> panic!

    (thread 1)                      (thread 2)
    bio_batch_end_io()              blkdev_issue_zeroout()
      if(bb) {                      ...
        if (bb->end_io)             ...
          bb->end_io(bio, err);     ...
        atomic_inc(&bb->done);      ...
        ...                         while (issued != atomic_read(&bb.done))
        ...                         (let issued == bb.done)
        ...                         (do the rest of the function)
        ...                         return ret;
        complete(bb->wait);
        ^^^^^^^^
        panic

    We can fix this easily by simplifying bio_batch and completion counting.

    Also remove bio_end_io_t *end_io since it is not used.

    Signed-off-by: Lukas Czerner
    Reported-by: Eric Whitney
    Tested-by: Eric Whitney
    Reviewed-by: Jeff Moyer
    CC: Dmitry Monakhov
    Signed-off-by: Jens Axboe

    Lukas Czerner
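
The commit message does not spell out the new counting scheme, so the following is only one common way to avoid this kind of race, sketched with C11 atomics in user space. All names are hypothetical. The counter starts at 1 (a reference held by the submitter), each submitted I/O adds one, and whoever drops the count to zero signals completion. A completion handler therefore never touches the batch after the submitter has been allowed to return.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical batch descriptor living on the submitter's stack. */
struct batch {
    atomic_int pending;   /* in-flight I/Os plus one submitter reference */
    bool completed;       /* stand-in for a completion object */
};

void batch_init(struct batch *b)
{
    atomic_init(&b->pending, 1);
    b->completed = false;
}

/* Called once per submitted I/O, before the I/O can complete. */
void batch_submit(struct batch *b)
{
    atomic_fetch_add(&b->pending, 1);
}

/* Drops one reference; the caller that drops the last one signals. */
static bool batch_put(struct batch *b)
{
    if (atomic_fetch_sub(&b->pending, 1) == 1) {
        b->completed = true;
        return true;
    }
    return false;
}

/* Completion handler for one I/O. */
void batch_end_io(struct batch *b)
{
    batch_put(b);
}

/* Submitter side: returns true if it still has to wait for completion. */
bool batch_submitter_done(struct batch *b)
{
    return !batch_put(b);
}
```

If batch_submitter_done() returns false, every handler has already finished with the batch and the submitter may return at once. If it returns true, the submitter waits until the last handler sets completed, so the stack object stays alive for as long as any handler can reach it.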
     
  • Since 43cc71eed1250755986da4c0f9898f9a635cb3bf (platform: prefix MODALIAS
    with "platform:"), the platform modalias is prefixed with "platform:".

    Signed-off-by: Axel Lin
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: David Woodhouse
    Cc: stable@kernel.org

    Axel Lin