26 Mar, 2006

14 commits

  • Hugh is rightly concerned that the CONFIG_DEBUG_VM coverage has gone too
    far in vm_normal_page, considering that we expect production kernels to be
    shipped with the option turned off, and that the code has seen some large
    changes recently.

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
    BUG_ON() Conversion in drivers/video/
    BUG_ON() Conversion in drivers/parisc/
    BUG_ON() Conversion in drivers/block/
    BUG_ON() Conversion in sound/sparc/cs4231.c
    BUG_ON() Conversion in drivers/s390/block/dasd.c
    BUG_ON() Conversion in lib/swiotlb.c
    BUG_ON() Conversion in kernel/cpu.c
    BUG_ON() Conversion in ipc/msg.c
    BUG_ON() Conversion in block/elevator.c
    BUG_ON() Conversion in fs/coda/
    BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
    BUG_ON() Conversion in input/serio/hil_mlc.c
    BUG_ON() Conversion in md/dm-hw-handler.c
    BUG_ON() Conversion in md/bitmap.c
    The comment describing how MS_ASYNC works in msync.c is confusing
    rcu: undeclared variable used in documentation
    fix typos "wich" -> "which"
    typo patch for fs/ufs/super.c
    Fix simple typos
    tabify drivers/char/Makefile
    ...

    Linus Torvalds
     
  • The "rounded up to nearest power of 2 in size" algorithm in
    alloc_large_system_hash is not correct. As coded, it takes an otherwise
    acceptable power-of-2 value and doubles it. For example, we see the error
    if we boot with thash_entries=2097152, which produces a hash table with
    4194304 entries.

    Signed-off-by: John Hawkes
    Cc: Roland Dreier
    Cc: "Chen, Kenneth W"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Hawkes
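
    A minimal standalone illustration of the failure mode (not the kernel's
    exact code): shifting one past the top bit always doubles, even when the
    input is already a power of 2.

        /* Buggy: 1UL << (log2(n) + 1) turns 2097152 into 4194304. */
        static unsigned long round_up_buggy(unsigned long n)
        {
                unsigned long log = 0;

                while (n >> (log + 1))
                        log++;                  /* log = floor(log2(n)) */
                return 1UL << (log + 1);
        }

        /* Fixed: leave an exact power of 2 alone. */
        static unsigned long round_up_fixed(unsigned long n)
        {
                if ((n & (n - 1)) == 0)
                        return n;               /* already a power of 2 */
                return round_up_buggy(n);       /* genuine round-up */
        }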
     
  • A couple of places are forgetting to take it (tasklist_lock).

    The kswapd case is probably unimportant. keventd_create_kthread() was racy.

    The whole thing is a bit flaky: you start a kernel thread, get its pid from
    kernel_thread(), then look up its task_struct.

    a) It assumes that pid recycling takes a "long" time.

    b) We get a task_struct, but no reference is taken on it. The owner of the
    kswapd and kthread task_struct*'s must assume that the new thread won't
    exit unexpectedly. Because if it does, they're left holding dead memory
    and any attempt to control or stop that task will crash.

    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
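
    A hedged sketch of the safer pattern with 2.6-era primitives: hold
    tasklist_lock across the pid lookup, and pin the task before using it.

        pid = kernel_thread(fn, arg, CLONE_FS | CLONE_FILES);
        /* ... */
        read_lock(&tasklist_lock);      /* stop the pid->task mapping moving */
        task = find_task_by_pid(pid);
        if (task)
                get_task_struct(task);  /* b): take a real reference */
        read_unlock(&tasklist_lock);
        /* ... later, when done with it: put_task_struct(task); */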
     
  • In zone_pcp_init we print out all zones even if they are empty:

    On node 0 totalpages: 245760
    DMA zone: 245760 pages, LIFO batch:31
    DMA32 zone: 0 pages, LIFO batch:0
    Normal zone: 0 pages, LIFO batch:0
    HighMem zone: 0 pages, LIFO batch:0

    To conserve dmesg space, print only the non-empty zones.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
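
    The fix is a one-line guard in zone_pcp_init()'s loop; roughly:

        /* Only describe zones that actually contain pages. */
        if (zone->present_pages)
                printk(KERN_DEBUG "  %s zone: %lu pages, LIFO batch:%d\n",
                       zone->name, zone->present_pages, batch);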
     
  • The page migration code could function without NUMA, but we currently have
    no users for the non-NUMA case.

    Signed-off-by: Christoph Lameter
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • We have had this memory leak for a while now. The situation is complicated
    by the use of alloc_kmemlist() as a function to resize various caches by
    do_tune_cpucache().

    What we do here is first of all make sure that we deallocate properly in
    the loop over all the nodes.

    If we are just resizing caches then we can simply return with -ENOMEM if an
    allocation fails.

    If the cache is new then we need to rollback and remove all earlier
    allocations.

    We detect that a cache is new by checking whether the link to the global
    cache chain has been set up. This is a bit hackish ....

    (Also fix up some overlong lines that I added in the last patch...)

    Signed-off-by: Christoph Lameter
    Cc: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
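
    A hedged sketch of the resulting control flow (setup_node_lists() and
    free_node_lists() are hypothetical stand-ins for the per-node work):

        for_each_online_node(node) {
                if (setup_node_lists(cachep, node))    /* hypothetical */
                        goto fail;
        }
        return 0;

        fail:
        if (!cachep->next.next) {
                /* New cache, not yet linked onto the global cache chain:
                 * roll back whatever the earlier iterations allocated. */
                for_each_online_node(node)
                        free_node_lists(cachep, node); /* hypothetical */
        }
        return -ENOMEM;         /* plain resize: caller copes with failure */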
     
  • Inspired by Jesper Juhl's patch from today:

    1. Get rid of err.
    We never set it to anything other than zero.

    2. Drop the CONFIG_NUMA stuff.
    There are definitions for alloc_alien_cache() and free_alien_cache()
    that do the right thing for the non-NUMA case.

    3. Better naming of variables.

    4. Remove redundant cachep->nodelists[node] expressions.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • __drain_alien_cache() currently drains objects by freeing them to the
    (remote) freelists of the original node. However, each node also has a
    shared list containing objects to be used on any processor of that node.
    We can avoid a number of remote node accesses by copying the pointers to
    the free objects directly into the remote shared array.

    And while we are at it: Skip alien draining if the alien cache spinlock is
    already taken.

    Kiran reported that this is a performance benefit.

    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
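
    A sketch of both pieces (close to the mm/slab.c structures of the time):

        static void __drain_alien_cache(struct kmem_cache *cachep,
                                        struct array_cache *ac, int node)
        {
                struct kmem_list3 *l3 = cachep->nodelists[node];

                if (!ac->avail)
                        return;
                spin_lock(&l3->list_lock);
                /* Bulk-move pointers into the remote shared array ... */
                if (l3->shared)
                        transfer_objects(l3->shared, ac, ac->limit);
                /* ... and free whatever did not fit the old way. */
                free_block(cachep, ac->entry, ac->avail, node);
                ac->avail = 0;
                spin_unlock(&l3->list_lock);
        }

        /* Periodic reaping: skip the drain if the lock is contended. */
        if (ac && ac->avail && spin_trylock_irq(&ac->lock)) {
                __drain_alien_cache(cachep, ac, node);
                spin_unlock_irq(&ac->lock);
        }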
     
  • transfer_objects() can be used to transfer objects between the various
    object caches of the slab allocator. It is currently only used during
    __cache_alloc() to retrieve elements from the shared array. We will be
    using it soon to transfer elements from the alien caches to the remote
    shared array.

    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
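
    Its shape (a sketch, close to what mm/slab.c gained):

        /* Move up to `max' object pointers from one array_cache to another. */
        static int transfer_objects(struct array_cache *to,
                                    struct array_cache *from, unsigned int max)
        {
                int nr = min(min(from->avail, max), to->limit - to->avail);

                if (!nr)
                        return 0;
                memcpy(to->entry + to->avail, from->entry + from->avail - nr,
                       sizeof(void *) * nr);
                from->avail -= nr;
                to->avail += nr;
                to->touched = 1;
                return nr;
        }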
     
  • Convert mm/ to use the new kmem_cache_zalloc allocator.

    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     
  • As suggested by Eric Dumazet, optimize kzalloc() calls that pass a
    compile-time constant size. Please note that the patch increases kernel
    text slightly (~200 bytes for defconfig on x86).

    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka Enberg
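
    The shape of the change (a hedged sketch; find_sized_cache() is a
    hypothetical stand-in for the real compile-time cache-selection machinery
    in <linux/slab.h>):

        static inline void *kzalloc(size_t size, gfp_t flags)
        {
                if (__builtin_constant_p(size)) {
                        /* size is a literal: the matching sized cache can
                         * be chosen at compile time, no runtime lookup */
                        return kmem_cache_zalloc(find_sized_cache(size),
                                                 flags);
                }
                return __kzalloc(size, flags);  /* out-of-line fallback */
        }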
     
  • Introduce a memory-zeroing variant of kmem_cache_alloc. The allocator
    already exists in XFS, and there are potential users for it, so this patch
    makes the allocator available to the general public.

    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka Enberg
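
    Semantically it is just allocate-then-zero; a minimal sketch:

        void *kmem_cache_zalloc(struct kmem_cache *cache, gfp_t flags)
        {
                void *ret = kmem_cache_alloc(cache, flags);

                if (ret)
                        memset(ret, 0, kmem_cache_size(cache));
                return ret;
        }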
     
  • Implement /proc/slab_allocators. It produces output like:

    idr_layer_cache: 80 idr_pre_get+0x33/0x4e
    buffer_head: 2555 alloc_buffer_head+0x20/0x75
    mm_struct: 9 mm_alloc+0x1e/0x42
    mm_struct: 20 dup_mm+0x36/0x370
    vm_area_struct: 384 dup_mm+0x18f/0x370
    vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
    vm_area_struct: 1 split_vma+0x5a/0x10e
    vm_area_struct: 11 do_brk+0x206/0x2e2
    vm_area_struct: 2 copy_vma+0xda/0x142
    vm_area_struct: 9 setup_arg_pages+0x99/0x214
    fs_cache: 8 copy_fs_struct+0x21/0x133
    fs_cache: 29 copy_process+0xf38/0x10e3
    files_cache: 30 alloc_files+0x1b/0xcf
    signal_cache: 81 copy_process+0xbaa/0x10e3
    sighand_cache: 77 copy_process+0xe65/0x10e3
    sighand_cache: 1 de_thread+0x4d/0x5f8
    anon_vma: 241 anon_vma_prepare+0xd9/0xf3
    size-2048: 1 add_sect_attrs+0x5f/0x145
    size-2048: 2 journal_init_revoke+0x99/0x302
    size-2048: 2 journal_init_revoke+0x137/0x302
    size-2048: 2 journal_init_inode+0xf9/0x1c4

    Cc: Manfred Spraul
    Cc: Alexander Nyberg
    Cc: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Ravikiran Thirumalai
    Signed-off-by: Al Viro
    DESC
    slab-leaks3-locking-fix
    EDESC
    From: Andrew Morton

    Update for slab-remove-cachep-spinlock.patch

    Cc: Al Viro
    Cc: Manfred Spraul
    Cc: Alexander Nyberg
    Cc: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Ravikiran Thirumalai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     

25 Mar, 2006

1 commit


24 Mar, 2006

17 commits

  • This patch series creates a strndup_user() function to ease copying C
    strings from userspace. It also avoids common pitfalls, like userspace
    modifying the final \0 after the strlen_user().

    Signed-off-by: Davi Arnaut
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davi Arnaut
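
    A sketch of the helper's shape (close to what mm/util.c gained); the
    forced terminator is what closes the race on the final \0:

        char *strndup_user(const char __user *s, long n)
        {
                long len = strnlen_user(s, n);  /* counts the trailing '\0' */
                char *p;

                if (!len)
                        return ERR_PTR(-EFAULT);
                if (len > n)
                        return ERR_PTR(-EINVAL);
                p = kmalloc(len, GFP_KERNEL);
                if (!p)
                        return ERR_PTR(-ENOMEM);
                if (copy_from_user(p, s, len)) {
                        kfree(p);
                        return ERR_PTR(-EFAULT);
                }
                p[len - 1] = '\0';      /* don't trust userspace's one */
                return p;
        }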
     
  • No need to duplicate all that code.

    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • msync() does a strange thing. Essentially:

        vma = find_vma();
        for ( ; ; ) {
                if (!vma)
                        return -ENOMEM;
                ...
                vma = vma->vm_next;
        }

    so an msync() request which starts within or before a valid VMA and which ends
    within or beyond the final VMA will incorrectly return -ENOMEM.

    Fix.

    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
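
    A sketch of the corrected loop shape: holes are noted but only reported,
    and reaching `end' inside a vma is success:

        int unmapped_error = 0;

        vma = find_vma(mm, start);
        for (;;) {
                if (!vma)
                        return -ENOMEM;           /* tail of range unmapped */
                if (start < vma->vm_start) {
                        unmapped_error = -ENOMEM; /* hole: note it, go on */
                        start = vma->vm_start;
                }
                /* ... sync [start, min(end, vma->vm_end)) ... */
                if (end <= vma->vm_end)
                        return unmapped_error;    /* walked the whole range */
                start = vma->vm_end;
                vma = vma->vm_next;
        }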
     
  • It seems bad to hold mmap_sem while performing synchronous disk I/O. Alter
    the msync(MS_SYNC) code so that the lock is released while we sync the file.

    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
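
    The trick, sketched (do_fsync() standing in for the synchronous writeout
    path; the vma must be looked up again once the lock is retaken):

        if (file && (vma->vm_flags & VM_SHARED)) {
                get_file(file);             /* keep the file alive ...     */
                up_read(&mm->mmap_sem);     /* ... while we sleep unlocked */
                error = do_fsync(file, 0);
                fput(file);
                down_read(&mm->mmap_sem);
                vma = find_vma(mm, start);  /* the old vma may be gone */
        }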
     
  • It seems sensible to perform dirty page throttling in msync: as the application
    dirties pages we can kick off pdflush early, or even force the msync() caller
    to perform writeout, or even throttle the msync() caller.

    The main effect of this is to start disk writeback earlier if we've just
    discovered that a large amount of pagecache has been dirtied. (Otherwise it
    wouldn't happen for up to five seconds, next time pdflush wakes up).

    It will also cause the page-dirtying process to be penalised for dirtying
    those pages rather than whacking someone else with the problem.

    We should do this for munmap() and possibly even exit(), too.

    We drop the mmap_sem while performing the dirty page balancing. It doesn't
    seem right to hold mmap_sem for that long.

    Note that this patch only affects MS_ASYNC. MS_SYNC will be syncing all the
    dirty pages anyway.

    We note that msync(MS_SYNC) does a full-file-sync inside mmap_sem, and always
    has. We can fix that up...

    The patch also tightens up the mmap_sem coverage in sys_msync(): no point in
    taking it while we perform the incoming arg checking.

    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • We need set_page_dirty() to return true if it actually transitioned the page
    from a clean to dirty state. This wasn't right in a couple of places. Do a
    kernel-wide audit, fix things up.

    This leaves open the possibility of returning a negative errno from
    set_page_dirty() sometime in the future. But we don't do that at present.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Modify balance_dirty_pages_ratelimited() so that it can take a
    number-of-pages-which-I-just-dirtied argument. For msync().

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
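
    The resulting API shape, roughly (existing callers become the nr == 1
    case):

        void balance_dirty_pages_ratelimited_nr(struct address_space *mapping,
                                                unsigned long nr_pages_dirtied);

        static inline void
        balance_dirty_pages_ratelimited(struct address_space *mapping)
        {
                balance_dirty_pages_ratelimited_nr(mapping, 1);
        }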
     
  • Add two new Linux-specific fadvise() extensions:

    LINUX_FADV_ASYNC_WRITE: start async writeout of any dirty pages between file
    offsets `offset' and `offset+len'. Any pages which are currently under
    writeout are skipped, whether or not they are dirty.

    LINUX_FADV_WRITE_WAIT: wait upon writeout of any dirty pages between file
    offsets `offset' and `offset+len'.

    By combining these two operations the application may do several things:

    LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk.

    LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently dirty
    pages at the disk.

    LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push all
    of the currently dirty pages at the disk, wait until they have been written.

    It should be noted that none of these operations write out the file's
    metadata. So unless the application is strictly performing overwrites of
    already-instantiated disk blocks, there are no guarantees here that the data
    will be available after a crash.

    To complete this suite of operations I guess we should have a "sync file
    metadata only" operation. This gives applications access to all the building
    blocks needed for all sorts of sync operations. But sync-metadata doesn't fit
    well with the fadvise() interface. Probably it should be a new syscall:
    sys_fmetadatasync().

    The patch also diddles with the meaning of `endbyte' in sys_fadvise64_64().
    It is made to represent the last affected byte in the file (ie: it is
    inclusive). Generally, all these byterange and pagerange functions are
    inclusive so we can easily represent EOF with -1.

    As Ulrich notes, these two functions are somewhat abusive of the fadvise()
    concept, which appears to be "set the future policy for this fd".

    But these commands are a perfect fit with the fadvise() implementation, and
    several of the existing fadvise() commands are synchronous and don't affect
    future policy either. I think we can live with the slight incongruity.

    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
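
    For example, the "push all dirty pages and wait" combination from the list
    above, written out (a sketch in kernel-internal notation; the real entry
    point is the fadvise64_64 syscall):

        /* Wait for any writeout already in flight over the range ... */
        sys_fadvise64_64(fd, offset, len, LINUX_FADV_WRITE_WAIT);
        /* ... start writeout of everything that is dirty now ... */
        sys_fadvise64_64(fd, offset, len, LINUX_FADV_ASYNC_WRITE);
        /* ... and wait until that writeout has landed as well. */
        sys_fadvise64_64(fd, offset, len, LINUX_FADV_WRITE_WAIT);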
     
  • I had trouble working out whether filemap_fdatawrite_range()'s
    `end' parameter describes the last-byte-to-be-written or the last-plus-one.
    Clarify that in comments.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • The hook in the slab cache allocation path to handle cpuset memory
    spreading for tasks in cpusets with 'memory_spread_slab' enabled has a
    modest performance bug. The hook calls into the memory spreading handler
    alternate_node_alloc() if either of 'memory_spread_slab' or
    'memory_spread_page' is enabled, even though the handler does nothing
    (albeit harmlessly) for the page case.

    Fix: drop PF_SPREAD_PAGE from the set of flag bits that are used to
    trigger a call to alternate_node_alloc().

    The page case is handled by separate hooks -- see the calls conditioned on
    cpuset_do_page_mem_spread() in mm/filemap.c.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
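
    After the fix, the trigger in ____cache_alloc() tests only the bits the
    handler acts on; schematically:

        /* PF_SPREAD_PAGE no longer routes through the slab hook. */
        if (unlikely(current->flags & (PF_SPREAD_SLAB | PF_MEMPOLICY))) {
                objp = alternate_node_alloc(cachep, flags);
                if (objp)
                        return objp;
        }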
     
  • The hooks in the slab cache allocator code path for support of NUMA
    mempolicies and cpuset memory spreading are in an important code path. Many
    systems will use neither feature.

    This patch optimizes those hooks down to a single check of some bits in the
    current task's task_struct flags. For non-NUMA systems, this hook and
    related code is already ifdef'd out.

    The optimization is done by using another task flag, set if the task is
    using a non-default NUMA mempolicy. Taking this flag bit along with the
    PF_SPREAD_PAGE and PF_SPREAD_SLAB flag bits added earlier in this 'cpuset
    memory spreading' patch set, one can check for the combination of any of
    these special-case memory placement mechanisms with a single test of the
    current task's task_struct flags.

    This patch also tightens up the code, to save a few bytes of kernel text
    space, and moves some of it out of line. Due to the nested inlines called
    from multiple places, we were ending up with three copies of this code, which
    once we get off the main code path (for local node allocation) seems a bit
    wasteful of instruction memory.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Provide the slab cache infrastructure to support cpuset memory spreading.

    See the previous patches, cpuset_mem_spread, for an explanation of cpuset
    memory spreading.

    This patch provides a slab cache SLAB_MEM_SPREAD flag. If it is set in the
    kmem_cache_create() call defining a slab cache, then any task marked with
    the process state flag PF_SPREAD_SLAB will spread memory page allocations
    for that cache over all the allowed nodes, instead of preferring the local
    (faulting) node.

    On systems not configured with CONFIG_NUMA, this results in no change to the
    page allocation code path for slab caches.

    On systems with cpusets configured in the kernel, but the "memory_spread"
    cpuset option not enabled for the current task's cpuset, this adds a call
    to a cpuset routine and a failed bit test of the process state flag
    PF_SPREAD_SLAB.

    For tasks so marked, a second inline test is done for the slab cache flag
    SLAB_MEM_SPREAD, and if that is set and if the allocation is not
    in_interrupt(), this adds a call to a cpuset routine that computes which of
    the task's mems_allowed nodes should be preferred for this allocation.

    ==> This patch adds another hook into the performance-critical
    code path for allocating objects from the slab cache, in the
    ____cache_alloc() chunk, below. The next patch optimizes this
    hook, reducing the impact of the combined mempolicy plus memory
    spreading hooks on this critical code path to a single check
    against the task's task_struct flags word.

    This patch provides the generic slab flags and logic needed to apply memory
    spreading to a particular slab.

    A subsequent patch will mark a few specific slab caches for this placement
    policy.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
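
    Opting a cache in is then a one-flag change where the cache is created; an
    illustrative call (cache name and ctor hypothetical, 2.6-era six-argument
    signature):

        cachep = kmem_cache_create("somefs_inode_cache",
                                   sizeof(struct somefs_inode_info), 0,
                                   SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
                                   init_once, NULL);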
     
  • Change the page cache allocation calls to support cpuset memory spreading.

    See the previous patch, cpuset_mem_spread, for an explanation of cpuset memory
    spreading.

    On systems without cpusets configured in the kernel, this is no change.

    On systems with cpusets configured in the kernel, but the "memory_spread"
    cpuset option not enabled for the current task's cpuset, this adds a call
    to a cpuset routine and a failed bit test of the process state flag
    PF_SPREAD_PAGE.

    For tasks in cpusets with "memory_spread" enabled, this adds a call to a
    cpuset routine that computes which of the task's mems_allowed nodes should
    be preferred for this allocation.

    If memory spreading applies to a particular allocation, then any other NUMA
    mempolicy does not apply.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
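
    What the allocation call becomes (close to the include/linux/pagemap.h
    shape of the era):

        static inline struct page *page_cache_alloc(struct address_space *x)
        {
                if (cpuset_do_page_mem_spread()) {
                        /* rotates over the task's mems_allowed nodes */
                        int n = cpuset_mem_spread_node();
                        return alloc_pages_node(n, mapping_gfp_mask(x), 0);
                }
                return alloc_pages(mapping_gfp_mask(x), 0);
        }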
     
  • If we get under some memory pressure in a cpuset (we only scan zones that
    are in the cpuset for memory) then kswapd is woken up for all zones. This
    patch only wakes up kswapd in zones that are part of the current cpuset.

    Signed-off-by: Christoph Lameter
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
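
    The change boils down to an early-out in wakeup_kswapd(); a sketch:

        void wakeup_kswapd(struct zone *zone, int order)
        {
                pg_data_t *pgdat = zone->zone_pgdat;

                /* ... existing checks ... */
                if (!cpuset_zone_allowed(zone, __GFP_HARDWALL))
                        return;         /* zone outside our cpuset: leave it */
                if (!waitqueue_active(&pgdat->kswapd_wait))
                        return;
                wake_up_interruptible(&pgdat->kswapd_wait);
        }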
     
  • Store the internal value for /proc/sys/vm/laptop_mode as jiffies instead
    of seconds. Let the sysctl interface do the conversions, instead of doing
    on-the-fly conversions every time the value is used.

    Add a description of the fact that laptop_mode doubles as a flag and a
    timeout to the comment above the laptop_mode variable.

    Signed-off-by: Bart Samwel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bart Samwel
     
  • Store the internal values for:

    /proc/sys/vm/dirty_writeback_centisecs
    /proc/sys/vm/dirty_expire_centisecs

    as jiffies instead of centiseconds. Let the sysctl interface do
    the conversions with full precision using clock_t_to_jiffies, instead of
    doing overflow-sensitive on-the-fly conversions every time the values are
    used.

    Cons: apparent precision loss if HZ is not a multiple of 100, because of
    conversion back and forth. This is a common problem for all sysctl values
    that use proc_dointvec_userhz_jiffies. (There is only one other in-tree
    use, in net/core/neighbour.c.)

    Signed-off-by: Bart Samwel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bart Samwel
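
    The mechanics, sketched as an abridged ctl_table entry (field values
    illustrative):

        {
                .procname       = "dirty_writeback_centisecs",
                .data           = &dirty_writeback_interval, /* now jiffies */
                .maxlen         = sizeof(dirty_writeback_interval),
                .mode           = 0644,
                .proc_handler   = &proc_dointvec_userhz_jiffies,
        },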
     
  • Signed-off-by: Jens Axboe

    Jens Axboe
     

23 Mar, 2006

3 commits

  • Linus points out that ext3_readdir's readahead only cuts in when
    ext3_readdir() is operating at the very start of the directory. So for large
    directories we end up performing no readahead at all and we suck.

    So take it all out and use the core VM's page_cache_readahead(). This means
    that ext3 directory reads will use all of readahead's dynamic sizing goop.

    Note that we're using the directory's filp->f_ra to hold the readahead state,
    but readahead is actually being performed against the underlying blockdev's
    address_space. Fortunately the readahead code is all set up to handle this.

    Tested with printk. It works. I was struggling to find a real workload which
    actually cared.

    (The patch also exports page_cache_readahead() to GPL modules.)

    Cc: "Stephen C. Tweedie"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
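
    The replacement call, roughly (note the blockdev mapping paired with the
    directory file's own readahead state):

        /* blk: the directory block about to be read */
        pgoff_t index = blk >> (PAGE_CACHE_SHIFT - inode->i_blkbits);

        if (!ra_has_index(&filp->f_ra, index))
                page_cache_readahead(sb->s_bdev->bd_inode->i_mapping,
                                     &filp->f_ra, filp, index, 1);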
     
  • This patch introduces a user space interface for swsusp.

    The interface is based on a special character device, called the snapshot
    device, that allows user space processes to perform suspend and resume-related
    operations with the help of some ioctls and the read()/write() functions.
    Additionally, it allows these processes to allocate free swap pages from a
    selected swap partition, called the resume partition, so that they know
    which sectors of the resume partition are available to them.

    The interface uses the same low-level system memory snapshot-handling
    functions that are used by the built-in swap-writing/reading code of
    swsusp.

    The interface documentation is included in the patch.

    The patch assumes that the major and minor numbers of the snapshot device will
    be 10 (ie. misc device) and 231, the registration of which has already been
    requested.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Introduce the low-level interface for handling the snapshot of system
    memory, used by the in-kernel swap-writing/reading code of swsusp and by
    the userland interface code (to be introduced shortly).

    Also change the way in which swsusp records the allocated swap pages and,
    consequently, simplify the in-kernel swap-writing/reading code (this is
    necessary for the userland interface too). To this end, introduce two
    helper functions in mm/swapfile.c, so that the swsusp code does not refer
    directly to the swap internals.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

22 Mar, 2006

5 commits

  • Centralize the page migration functions in anticipation of additional
    tinkering. This creates a new file, mm/migrate.c.

    1. Extract buffer_migrate_page() from fs/buffer.c

    2. Extract central migration code from vmscan.c

    3. Extract some components from mempolicy.c

    4. Export pageout() and remove_from_swap() from vmscan.c

    5. Make it possible to configure NUMA systems without page migration
    and non-NUMA systems with page migration.

    I had to do some #ifdeffing in mempolicy.c that may need a cleanup.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • The alien cache rotor in mm/slab.c assumes that the first online node is
    node 0. Eventually for some archs, especially with hotplug, this will no
    longer be true.

    Fix the interleave rotor to handle the general case of node numbering.

    Signed-off-by: Paul Jackson
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
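
    The general-case rotor advance, sketched with the nodemask helpers:

        /* Advance to the next online node, wrapping without assuming
         * that node 0 (or any particular node) is online. */
        node = next_node(node, node_online_map);
        if (node == MAX_NUMNODES)
                node = first_node(node_online_map);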
     
  • Fix bogus node loop in hugetlb.c alloc_fresh_huge_page(), which was
    assuming that nodes are numbered contiguously from 0 to num_online_nodes().
    Once the hotplug folks get this far, that will be false.

    Signed-off-by: Paul Jackson
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • When we've allocated SWAPFILE_CLUSTER pages, ->cluster_next should be the
    first index of the swap cluster. But the current code sets it to the wrong
    offset.

    Signed-off-by: Akinobu Mita
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • 1. Only disable interrupts if there is actually something to free

    2. Only dirty the pcp cacheline if we actually freed something.

    3. Disable interrupts for each single pcp and not for cleaning
    all the pcps in all zones of a node.

    drain_node_pages is called every 2 seconds from cache_reap. This
    fix should avoid most disabling of interrupts.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
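
    Sketch of the reworked per-cpu drain (close to the mm/page_alloc.c shape
    of the time):

        struct per_cpu_pageset *pset = zone_pcp(zone, smp_processor_id());
        unsigned long flags;
        int i;

        for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
                struct per_cpu_pages *pcp = &pset->pcp[i];

                if (!pcp->count)        /* (1)+(2): nothing to free, so   */
                        continue;       /* leave irqs and the line alone  */
                local_irq_save(flags);  /* (3): per-pcp, not per-node     */
                free_pages_bulk(zone, pcp->count, &pcp->list, 0);
                pcp->count = 0;
                local_irq_restore(flags);
        }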