08 Dec, 2006

24 commits

  • Introduce pagefault_{disable,enable}() and use these where previously we did
    manual preempt increments/decrements to make the pagefault handler do the
    atomic thing.

    Currently they still rely on the increased preempt count, but do not rely on
    the disabled preemption; this might go away in the future.

    (NOTE: the extra barrier() in pagefault_disable might fix some holes on
    machines which have too many registers for their own good)
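
    For reference, a minimal sketch of what the pair looks like (assuming the
    then-current inc_preempt_count()/dec_preempt_count()/preempt_check_resched()
    primitives; not the verbatim kernel code):

        static inline void pagefault_disable(void)
        {
                inc_preempt_count();
                /* make sure the store to the preempt count is issued
                 * before a pagefault can hit */
                barrier();
        }

        static inline void pagefault_enable(void)
        {
                /* issue the outstanding loads/stores before the fault
                 * handler is allowed to sleep again */
                barrier();
                dec_preempt_count();
                barrier();
                preempt_check_resched();
        }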

    [heiko.carstens@de.ibm.com: s390 fix]
    Signed-off-by: Peter Zijlstra
    Acked-by: Nick Piggin
    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • In light of the recent pagefault and filemap_copy_from_user work I've gone
    through all the arch pagefault handlers to make sure the inc_preempt_count()
    'feature' works as expected.

    Several sections of code (including the new filemap_copy_from_user) rely on
    the fact that faults do not take locks under increased preempt count.

    arch/x86_64 - good
    arch/powerpc - good
    arch/cris - fixed
    arch/i386 - good
    arch/parisc - fixed
    arch/sh - good
    arch/sparc - good
    arch/s390 - good
    arch/m68k - fixed
    arch/ppc - good
    arch/alpha - fixed
    arch/mips - good
    arch/sparc64 - good
    arch/ia64 - good
    arch/arm - fixed
    arch/um - good
    arch/avr32 - good
    arch/h8300 - NA
    arch/m32r - good
    arch/v850 - good
    arch/frv - fixed
    arch/m68knommu - NA
    arch/arm26 - fixed
    arch/sh64 - fixed
    arch/xtensa - good
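
    The check each handler is expected to carry is roughly the following (the
    no_context label and the mm test are per-arch details; illustrative sketch
    only):

        /*
         * If we are in an interrupt, have no user context, or are running
         * with pagefaults disabled (raised preempt count), we must not
         * take the fault and must not take any locks.
         */
        if (in_atomic() || !mm)
                goto no_context;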

    Signed-off-by: Peter Zijlstra
    Acked-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • When using numa=fake on non-NUMA hardware there is no benefit to having the
    alien caches, and they consume much memory.

    Add a kernel boot option to disable them.

    Christoph sayeth "This is good to have even on large NUMA. The problem is
    that the alien caches grow by the square of the size of the system in terms of
    nodes."

    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Manfred Spraul
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • Here's an attempt towards doing away with lock_cpu_hotplug in the slab
    subsystem. This approach also fixes a bug which shows up when cpus are
    being offlined/onlined and slab caches are being tuned simultaneously.

    http://marc.theaimsgroup.com/?l=linux-kernel&m=116098888100481&w=2

    The patch has been stress tested overnight on a 2 socket 4 core AMD box with
    repeated cpu online and offline, while dbench and kernbench processes were
    running and slab caches were being tuned at the same time.
    There were no lockdep warnings either. (This test was on 2.6.18, as 2.6.19-rc
    crashes at __drain_pages:
    http://marc.theaimsgroup.com/?l=linux-kernel&m=116172164217678&w=2 )

    The approach here is to hold cache_chain_mutex from CPU_UP_PREPARE until
    CPU_ONLINE (similar in approach to workqueue_mutex). Slab code sensitive
    to cpu_online_map (kmem_cache_create, kmem_cache_destroy, slabinfo_write,
    __cache_shrink) is already serialized with cache_chain_mutex. (This patch
    lengthens the cache_chain_mutex hold time at kmem_cache_destroy to cover
    this.) This patch also takes cache_chain_mutex at kmem_cache_shrink to
    protect the sanity of cpu_online_map at __cache_shrink, as viewed by slab
    (kmem_cache_shrink->__cache_shrink->drain_cpu_caches). But, really,
    kmem_cache_shrink is used at just one place in the acpi subsystem! Do we
    really need to keep kmem_cache_shrink at all?

    Another note: it looks like a cpu hotplug event can send CPU_UP_CANCELED to
    a registered subsystem even if that subsystem did not receive CPU_UP_PREPARE.
    This can happen when a subsystem registered for notification earlier than the
    current one craps out with NOTIFY_BAD. Badness can occur within the
    CPU_UP_CANCELED code path at slab if this happens (the same would apply to
    workqueue.c as well). To overcome this, we might have to either
    a) use a per-subsystem flag and avoid handling of CPU_UP_CANCELED, or
    b) use special notifier events like LOCK_ACQUIRE/RELEASE, as Gautham was
    using in his experiments, or
    c) not send CPU_UP_CANCELED to a subsystem which did not receive
    CPU_UP_PREPARE.

    I would prefer c).
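
    A sketch of the cpu-up path of the notifier under this scheme (error
    handling and the per-cpu cache setup itself elided; illustrative only):

        static int cpuup_callback(struct notifier_block *nfb,
                                  unsigned long action, void *hcpu)
        {
                switch (action) {
                case CPU_UP_PREPARE:
                        mutex_lock(&cache_chain_mutex);
                        /* allocate per-cpu array caches for the new cpu */
                        break;
                case CPU_ONLINE:
                        /* start the per-cpu reap timer */
                        mutex_unlock(&cache_chain_mutex);
                        break;
                case CPU_UP_CANCELED:
                        /* free what CPU_UP_PREPARE allocated */
                        mutex_unlock(&cache_chain_mutex);
                        break;
                }
                return NOTIFY_OK;
        }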

    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
     
  • When CONFIG_SLAB_DEBUG is used in combination with ARCH_SLAB_MINALIGN, the
    debug flags that depend on BYTES_PER_WORD alignment should be disabled.

    The disabling of these debug flags is not properly handled when
    BYTES_PER_WORD < ARCH_SLAB_MINALIGN < cache_line_size().

    This patch fixes that and also adds an alignment check to
    cache_alloc_debugcheck_after() when ARCH_SLAB_MINALIGN is used.
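
    The two changes, roughly (flag and variable names as in mm/slab.c of that
    era; this is a sketch, not the literal hunk):

        /* kmem_cache_create(): any alignment beyond BYTES_PER_WORD --
         * whether from ARCH_SLAB_MINALIGN or a caller-supplied align --
         * is incompatible with red-zoning and user-store debugging */
        if (ralign > BYTES_PER_WORD)
                flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);

        /* cache_alloc_debugcheck_after(): complain if the object we are
         * about to return is not ARCH_SLAB_MINALIGN aligned */
        #if ARCH_SLAB_MINALIGN
        if ((unsigned long)objp & (ARCH_SLAB_MINALIGN - 1))
                printk(KERN_ERR "0x%p: not aligned to ARCH_SLAB_MINALIGN=%d\n",
                       objp, ARCH_SLAB_MINALIGN);
        #endif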

    Signed-off-by: Kevin Hilman
    Cc: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Manfred Spraul
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kevin Hilman
     
  • Imprecise RSS accounting is an irritating ill effect of pt sharing. After
    consulting several VM experts, I tried various methods to solve the
    problem: (1) iterate through all mm_structs that share the PT and increment
    the count; (2) keep an RSS count in the page table structure and sum it up
    at reporting time. Neither method yielded a satisfactory implementation.

    Since process RSS accounting is purely informational, I propose we don't
    count it at all for hugetlb pages. rlimit has such a field, though there is
    absolutely no enforcement on limiting that resource. One other method would
    be to account all RSS at hugetlb mmap time, regardless of whether the pages
    are faulted or not. I opt for the simplicity of no accounting at all.

    Hugetlb pages are special: they are reserved up front in a global
    reservation pool and are not reclaimable. From a physical memory resource
    point of view, they are already consumed regardless of whether anyone is
    using them.

    If the concern is that RSS can be used to control resource allocation, we
    can already specify a hugetlbfs size limit and the sysadmin can enforce it
    at mount time. Combined with the two points mentioned above, I fail to see
    anything that would be adversely affected by this patch.

    Signed-off-by: Ken Chen
    Acked-by: Hugh Dickins
    Cc: Dave McCracken
    Cc: William Lee Irwin III
    Cc: "Luck, Tony"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: David Gibson
    Cc: Adam Litke
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen, Kenneth W
     
  • Following up on the shared page table work done by Dave McCracken, this
    set of patches targets shared page tables for hugetlb memory only.

    Shared page tables are particularly useful when a large number of
    independent processes share large shared memory segments. In the normal
    page case, the amount of memory saved from the processes' page tables is
    quite significant. For hugetlb, the saving on page table memory is not the
    primary objective (as hugetlb itself already cuts down page table overhead
    significantly); instead, the purpose of using shared page tables for
    hugetlb is to allow faster TLB refills and less cache pollution on a TLB
    miss.

    With PT sharing, pte entries are shared among hundreds of processes, so the
    cache footprint of all the page tables is smaller and, in return, the
    application gets a much higher cache hit ratio. A further effect is that
    the hardware page walker is more likely to find the pte in the cache, which
    helps reduce TLB miss latency. These two effects contribute to higher
    application performance.

    Signed-off-by: Ken Chen
    Acked-by: Hugh Dickins
    Cc: Dave McCracken
    Cc: William Lee Irwin III
    Cc: "Luck, Tony"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: David Gibson
    Cc: Adam Litke
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen, Kenneth W
     
  • Despaghettify balance_pgdat() a bit.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Add an arch_alloc_page to match arch_free_page.
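
    Roughly, the new hook is a no-op that architectures can override, called
    when a page leaves the allocator (sketch; the exact call site in the page
    allocator is paraphrased):

        #ifndef HAVE_ARCH_ALLOC_PAGE
        static inline void arch_alloc_page(struct page *page, int order) { }
        #endif

        /* called from the page allocator while preparing a freshly
         * allocated page, mirroring the arch_free_page() call on free */
        arch_alloc_page(page, order);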

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • The new swap token patches replace the current token traversal algo. The old
    algo had a crude timeout parameter that was used to hand the token over from
    one task to another. The new algo transfers the token to the tasks that are
    in need of it. The urgency for the token is based on the number of times a
    task is required to swap in pages; accordingly, the priority of a task is
    incremented if it has been badly affected by swap-outs. To ensure that the
    token doesn't bounce around rapidly, token holders are given a priority
    boost. The priority of a task is also decremented if its rate of swap-ins
    keeps falling. This way, the decision whether to preempt the swap token is
    a matter of comparing two tasks' priority fields.
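
    A condensed sketch of the new grab_swap_token() logic (field names such as
    token_priority, faultstamp and last_interval follow the patch as described
    above; locking and details are simplified):

        void grab_swap_token(void)
        {
                struct mm_struct *mm = current->mm;
                int interval = global_faults - mm->faultstamp;

                global_faults++;

                if (swap_token_mm == NULL) {
                        /* first come, first served */
                        mm->token_priority += 2;
                        swap_token_mm = mm;
                } else if (mm == swap_token_mm) {
                        /* holder faulted again: boost it so the token
                         * does not bounce around rapidly */
                        mm->token_priority += 2;
                } else {
                        /* swapping in more often than before raises our
                         * urgency, less often lowers it */
                        if (interval < mm->last_interval)
                                mm->token_priority++;
                        else if (mm->token_priority > 0)
                                mm->token_priority--;

                        /* preempt the token on a simple priority compare */
                        if (mm->token_priority > swap_token_mm->token_priority)
                                swap_token_mm = mm;
                }

                mm->faultstamp = global_faults;
                mm->last_interval = interval;
        }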

    [akpm@osdl.org: cleanups]
    Signed-off-by: Ashwin Chaugule
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ashwin Chaugule
     
  • Make sure the contention for the token happens _before_ any read-in, and
    kick the swap-token algo only when the VM is under pressure.

    Signed-off-by: Ashwin Chaugule
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ashwin Chaugule
     
  • Some drivers are returning OOM when it is not in response to a memory
    shortage.

    Signed-off-by: Nick Piggin
    Cc: Dave Airlie
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Don't cause all threads in all other thread groups to gain TIF_MEMDIE,
    otherwise we'll get a thundering herd eating our memory reserve. This may
    not be the optimal scheme, but it fits our policy of allowing just one
    TIF_MEMDIE in the system at once.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Clean up the OOM killer messages to be more consistent.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Abort the kill if any of our threads have OOM_DISABLE set. Having this
    test here also prevents any OOM_DISABLE child of the "selected" process
    from being killed.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Rearrange the struct members in the 'struct zonelist_cache' structure, so
    as to put the readonly (once initialized) z_to_n[] array first, where it
    will come right after the zones[] array in struct zonelist.

    This pretty much eliminates the chance that the two frequently written
    elements of 'struct zonelist_cache', the fullzones bitmap and the
    last_full_zap time, will end up on the same cache line as the performance
    sensitive, frequently read, never (after init) written zones[] array.

    Keeping frequently written data off frequently read cache lines is good for
    performance.

    Thanks to Rohit Seth for the suggestion.
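
    The resulting layout, roughly (names as in the zonelist cache patch below;
    shown for illustration):

        struct zonelist_cache {
                /* read-only after init: zone index -> node id */
                unsigned short z_to_n[MAX_ZONES_PER_ZONELIST];
                /* frequently written: which zones were recently full */
                DECLARE_BITMAP(fullzones, MAX_ZONES_PER_ZONELIST);
                /* frequently written: jiffies of the last bitmap zap */
                unsigned long last_full_zap;
        };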

    Signed-off-by: Paul Jackson
    Cc: Rohit Seth
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Optimize the critical zonelist scanning for free pages in the kernel memory
    allocator by caching the zones that were found to be full recently, and
    skipping them.

    It remembers the zones in a zonelist that were short of free memory in the
    last second, and it stashes a zone-to-node table in the zonelist struct to
    optimize that conversion (and minimize its cache footprint).

    Recent changes:

    This differs in a significant way from a similar patch that I
    posted a week ago. Now, instead of having a nodemask_t of
    recently full nodes, I have a bitmask of recently full zones.
    This solves a problem that last week's patch had: on systems
    with multiple zones per node (such as a DMA zone), it would
    take seeing any one of those zones full as meaning that all
    zones on that node were full.

    Also I changed names - from "zonelist faster" to "zonelist cache",
    as that seemed to better convey what we're doing here - caching
    some of the key zonelist state (for faster access.)

    See below for some performance benchmark results. After all that
    discussion with David on why I didn't need them, I went and got
    some ;). I wanted to verify that I had not noticeably hurt the
    normal case of memory allocation. At least for my one little
    microbenchmark, I found that (1) the normal case wasn't affected,
    and (2) workloads that forced scanning across multiple nodes for
    memory improved, with up to 10% fewer system CPU cycles and lower
    elapsed clock time ('sys' and 'real'). Good. See details below.

    I didn't have the logic in get_page_from_freelist() for various
    full nodes and zone reclaim failures correct. That should be
    fixed up now - notice the new goto labels zonelist_scan,
    this_zone_full, and try_next_zone, in get_page_from_freelist().

    There are two reasons I pursued this alternative over some earlier
    proposals that would have focused on optimizing the fake numa
    emulation case by caching the last useful zone:

    1) Contrary to what I said before, we (SGI, on large ia64 sn2 systems)
    have seen real customer loads where the cost to scan the zonelist
    was a problem, due to many nodes being full of memory before
    we got to a node we could use. Or at least, I think we have.
    This was related to me by another engineer, based on experiences
    from some time past. So this is not guaranteed. Most likely, though.

    The following approach should help such real numa systems just as
    much as it helps fake numa systems, or any combination thereof.

    2) The effort to distinguish fake from real numa, using node_distance,
    so that we could cache a fake numa node and optimize choosing
    it over equivalent distance fake nodes, while continuing to
    properly scan all real nodes in distance order, was going to
    require a nasty blob of zonelist and node distance munging.

    The following approach has no new dependency on node distances or
    zone sorting.

    See comment in the patch below for a description of what it actually does.

    Technical details of note (or controversy):

    - See the use of "zlc_active" and "did_zlc_setup" below, to delay
    adding any work for this new mechanism until we've looked at the
    first zone in zonelist. I figured the odds of the first zone
    having the memory we needed were high enough that we should just
    look there, first, then get fancy only if we need to keep looking.

    - Some odd hackery was needed to add items to struct zonelist, while
    not tripping up the custom zonelists built by the mm/mempolicy.c
    code for MPOL_BIND. My usual wordy comments below explain this.
    Search for "MPOL_BIND".

    - Some per-node data in the struct zonelist is now modified frequently,
    with no locking. Multiple CPU cores on a node could hit and mangle
    this data. The theory is that this is just performance hint data,
    and the memory allocator will work just fine despite any such mangling.
    The fields at risk are the struct 'zonelist_cache' fields 'fullzones'
    (a bitmask) and 'last_full_zap' (unsigned long jiffies). It should
    all be self correcting after at most a one second delay.

    - This still does a linear scan of the same lengths as before. All
    I've optimized is making the scan faster, not algorithmically
    shorter. It is now able to scan a compact array of 'unsigned
    short' in the case of many full nodes, so one cache line should
    cover quite a few nodes, rather than each node hitting another
    one or two new and distinct cache lines.

    - If both Andi and Nick don't find this too complicated, I will be
    (pleasantly) flabbergasted.

    - I removed the comment claiming we only use one cache line's worth of
    zonelist. We seem, at least in the fake numa case, to have put the
    lie to that claim.

    - I pay no attention to the various watermarks and such in this performance
    hint. A node could be marked full for one watermark, and then skipped
    over when searching for a page using a different watermark. I think
    that's actually quite ok, as it will tend to slightly increase the
    spreading of memory over other nodes, away from a memory stressed node.
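
    A sketch of the two cache helpers used from get_page_from_freelist()
    (names follow the patch; the bodies are simplified, and the once-a-second
    zap of fullzones, done in zlc_setup(), is only noted in a comment):

        static int zlc_zone_worth_trying(struct zonelist *zonelist,
                                         struct zone **z,
                                         nodemask_t *allowednodes)
        {
                struct zonelist_cache *zlc = zonelist->zlcache_ptr;
                int i, n;

                if (!zlc)
                        return 1;               /* no cache built (UMA) */

                i = z - zonelist->zones;        /* index of this zone */
                n = zlc->z_to_n[i];             /* its node, no deref of *z */

                /* worth a look only if not recently seen full and the
                 * node is allowed by the cpuset/GFP constraints */
                return !test_bit(i, zlc->fullzones) &&
                        node_isset(n, *allowednodes);
        }

        static void zlc_mark_zone_full(struct zonelist *zonelist,
                                       struct zone **z)
        {
                struct zonelist_cache *zlc = zonelist->zlcache_ptr;

                if (!zlc)
                        return;
                /* self-correcting hint: zlc_setup() zeroes fullzones and
                 * resets last_full_zap about once a second */
                set_bit(z - zonelist->zones, zlc->fullzones);
        }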

    ===============

    Performance - some benchmark results and analysis:

    This benchmark runs a memory hog program that uses multiple
    threads to touch a lot of memory as quickly as it can.

    Multiple runs were made, touching 12, 38, 64 or 90 GBytes out of
    the total 96 GBytes on the system, and using 1, 19, 37, or 55
    threads (on a 56 CPU system.) System, user and real (elapsed)
    timings were recorded for each run, shown in units of seconds,
    in the table below.

    Two kernels were tested - 2.6.18-mm3 and the same kernel with
    this zonelist caching patch added. The table also shows the
    percentage improvement the zonelist caching sys time is over
    (lower than) the stock *-mm kernel.

     number          2.6.18-mm3      zonelist-cache     delta (< 0 good)  percent
    GBs       N   ------------       --------------     ----------------  systime
    mem threads     sys user  real     sys user  real     sys user real     better
     12       1     153   24   177     151   24   176      -2    0   -1        1%
     12      19      99   22     8      99   22     8       0    0    0        0%
     12      37     111   25     6     112   25     6       1    0    0       -0%
     12      55     115   25     5     110   23     5      -5   -2    0        4%
     38       1     502   74   576     497   73   570      -5   -1   -6        0%
     38      19     426   78    48     373   76    39     -53   -2   -9       12%
     38      37     544   83    36     547   82    36       3   -1    0       -0%
     38      55     501   77    23     511   80    24      10    3    1       -1%
     64       1     917  125  1042     890  124  1014     -27   -1  -28        2%
     64      19    1118  138   119     965  141   103    -153    3  -16       13%
     64      37    1202  151    94    1136  150    81     -66   -1  -13        5%
     64      55    1118  141    61    1072  140    58     -46   -1   -3        4%
     90       1    1342  177  1519    1275  174  1450     -67   -3  -69        4%
     90      19    2392  199   192    2116  189   176    -276  -10  -16       11%
     90      37    3313  238   175    2972  225   145    -341  -13  -30       10%
     90      55    1948  210   104    1843  213   100    -105    3   -4        5%

    Notes:
    1) This test ran a memory hog program that started a specified number N of
    threads, and had each thread allocate and touch 1/N'th of
    the total memory to be used in the test run in a single loop,
    writing a constant word to memory, one store every 4096 bytes.
    Watching this test during some earlier trial runs, I would see
    each of these threads sit down on one CPU and stay there, for
    the remainder of the pass, a different CPU for each thread.

    2) The 'real' column is not comparable to the 'sys' or 'user' columns.
    The 'real' column is seconds wall clock time elapsed, from beginning
    to end of that test pass. The 'sys' and 'user' columns are total
    CPU seconds spent on that test pass. For a 19 thread test run,
    for example, the sum of 'sys' and 'user' could be up to 19 times the
    number of 'real' elapsed wall clock seconds.

    3) Tests were run on a fresh, single-user boot, to minimize the amount
    of memory already in use at the start of the test, and to minimize
    the amount of background activity that might interfere.

    4) Tests were done on a 56 CPU, 28 Node system with 96 GBytes of RAM.

    5) Notice that the 'real' time gets large for the single thread runs, even
    though the measured 'sys' and 'user' times are modest. I'm not sure what
    that means - probably something to do with it being slow for one thread to
    be accessing memory a long way away. Perhaps the fake numa system, running
    ostensibly the same workload, would not show this substantial degradation
    of 'real' time for one thread on many nodes -- let's hope not.

    6) The high thread count passes (one thread per CPU - on 55 of 56 CPUs)
    ran quite efficiently, as one might expect. Each pair of threads needed
    to allocate and touch the memory on the node the two threads shared, a
    pleasantly parallelizable workload.

    7) The intermediate thread count passes, when asking for a lot of memory,
    forcing them to go to a few neighboring nodes, improved the most with this
    zonelist caching patch.

    Conclusions:
    * This zonelist cache patch probably makes little difference one way or the
    other for most workloads on real numa hardware, if those workloads avoid
    heavy off node allocations.
    * For memory intensive workloads requiring substantial off-node allocations
    on real numa hardware, this patch improves both kernel and elapsed timings
    by up to ten percent.
    * For fake numa systems, I'm optimistic, but will have to leave that up to
    Rohit Seth to actually test (once I get him a 2.6.18 backport.)

    Signed-off-by: Paul Jackson
    Cc: Rohit Seth
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • The zone table is mostly not needed. If we have a node in the page flags
    then we can get to the zone via NODE_DATA() which is much more likely to be
    already in the cpu cache.

    In the SMP and UP cases NODE_DATA() is a constant pointer, which allows us
    to access an exact replica of the zone table via the node_zones field. In
    all of the above cases there is no need at all for the zone table.

    The only remaining case is if, in a NUMA system, the node numbers do not
    fit into the page flags. In that case we make sparsemem generate a table
    that maps sections to nodes and use that table to figure out the node
    number. This table is sized to fit in a single cache line for the known
    32 bit NUMA platform, which makes it very likely that the information can
    be obtained without a cache miss.

    For sparsemem the zone table seems to have been fairly large, based on
    the maximum possible number of sections and the number of zones per node.
    There is some memory saving from removing zone_table. The main benefit is
    to reduce the cache footprint of the VM from the frequent lookups of zones.
    Plus it simplifies the page allocator.
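
    With the table gone, the lookup boils down to the following (sketch of the
    resulting helper):

        static inline struct zone *page_zone(struct page *page)
        {
                /* node id from the page flags (or the sparsemem
                 * section-to-node table), then index into node_zones */
                return &NODE_DATA(page_to_nid(page))->
                                node_zones[page_zonenum(page)];
        }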

    [akpm@osdl.org: build fix]
    Signed-off-by: Christoph Lameter
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Signed-off-by: Ken Chen
    Cc: David Gibson
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen, Kenneth W
     
  • - s/freeliest/freelist/ spelling fix

    - Check for NULL *z zone seems useless - even if it could happen, so
    what? Perhaps we should have a check later on if we are faced with an
    allocation request that is not allowed to fail - shouldn't that be a
    serious kernel error, passing an empty zonelist with a mandate to not
    fail?

    - Initializing 'z' to zonelist->zones can wait until after the first
    get_page_from_freelist() fails; we only use 'z' in the wakeup_kswapd()
    loop, so let's initialize 'z' there, in a 'for' loop. Seems clearer.

    - Remove superfluous braces around a break

    - Fix a couple errant spaces

    - Adjust indentation on the cpuset_zone_allowed() check, to match the
    lines just before it -- seems easier to read in this case.

    - Add another set of braces to the zone_watermark_ok logic

    From: Paul Jackson

    Back out one item from a previous "memory page_alloc minor cleanups" patch.
    Until and unless we are certain that no one can ever pass an empty zonelist
    to __alloc_pages(), this check for an empty zonelist (or some BUG
    equivalent) is essential. The code in get_page_from_freelist() blows up if
    passed an empty zonelist.
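
    The restored check in __alloc_pages(), roughly:

        struct zone **z = zonelist->zones;  /* suitable zones for gfp_mask */

        if (unlikely(*z == NULL)) {
                /* Should this ever happen?? */
                return NULL;
        }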

    Signed-off-by: Paul Jackson
    Acked-by: Christoph Lameter
    Cc: Nick Piggin
    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • arch/um/drivers/chan_kern.c:643: error: conflicting types for 'chan_interrupt'
    arch/um/include/chan_kern.h:31: error: previous declaration of 'chan_interrupt'

    Cc: David Howells
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • The OpenVZ Linux kernel team has found a problem with mounting in compat
    mode.

    The simple command "mount -t smbfs ..." on the Fedora Core 5 distro in
    32-bit mode leads to an oops:

    Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: compat_sys_mount+0xd6/0x290
    Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task ffff810034c86bc0)
    Call Trace: ia32_sysret+0x0/0xa

    The problem is that data_page pointer can be NULL, so we should skip data
    conversion in this case.
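
    The shape of the fix in compat_sys_mount() (the per-filesystem conversion
    helpers are those of fs/compat.c at the time and are shown loosely; treat
    this as a sketch):

        /* only convert filesystem-specific mount data if userspace
         * actually passed a data page */
        if (data_page) {
                if (!strcmp(kernel_type, SMBFS_NAME))
                        do_smb_super_data_conv((void *)data_page);
                else if (!strcmp(kernel_type, NCPFS_NAME))
                        do_ncp_super_data_conv((void *)data_page);
                /* ... */
        }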

    Signed-off-by: Andrey Mirkin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Mirkin
     
  • Fix http://bugzilla.kernel.org/show_bug.cgi?id=7606

    WARNING: "drm_sman_set_manager" [drivers/char/drm/sis.ko] undefined!

    Cc:
    Cc: Dave Airlie
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • With CONFIG_SMP=n:

    drivers/input/ff-memless.c:384: warning: implicit declaration of function 'local_bh_disable'
    drivers/input/ff-memless.c:393: warning: implicit declaration of function 'local_bh_enable'

    Really linux/spinlock.h should include linux/interrupt.h. But interrupt.h
    includes sched.h which will need spinlock.h.

    So the patch breaks the _bh declarations out into a separate header and
    includes it in both interrupt.h and spinlock.h.
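
    The new header boils down to just the declarations (sketch from memory;
    the exact set in include/linux/bottom_half.h may differ slightly):

        #ifndef _LINUX_BH_H
        #define _LINUX_BH_H

        extern void local_bh_disable(void);
        extern void __local_bh_enable(void);
        extern void _local_bh_enable(void);
        extern void local_bh_enable(void);
        extern void local_bh_enable_ip(unsigned long ip);

        #endif /* _LINUX_BH_H */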

    Cc: "Randy.Dunlap"
    Cc: Andi Kleen
    Cc:
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

07 Dec, 2006

16 commits