12 Oct, 2006

2 commits

  • If try_to_release_page() is called with a zero gfp mask, then the
    filesystem is effectively denied the possibility of sleeping while
    attempting to release the page. There doesn't appear to be any valid
    reason why this should be banned, given that we're not calling this from a
    memory allocation context.

    For this reason, change the gfp_mask argument of the call to GFP_KERNEL.

    Signed-off-by: Trond Myklebust
    Cc: Steve Dickson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • A failure in invalidate_inode_pages2_range() can result in unpleasant things
    happening in NFS (at least). Stick a WARN_ON_ONCE() in there so we can find
    out if it happens, and maybe why.

    (akpm: might be a -mm-only patch, we'll see..)

    Cc: Chuck Lever
    Cc: Trond Myklebust
    Cc: Steve Dickson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

01 Oct, 2006

3 commits

  • The recent fix to invalidate_inode_pages() (git commit 016eb4a) managed to
    unfix invalidate_inode_pages2().

    The problem is that various bits of code in the kernel can take transient refs
    on pages: the page scanner will do this when inspecting a batch of pages, and
    the lru_cache_add() batching pagevecs also hold a ref.

    Net result is transient failures in invalidate_inode_pages2(). This affects
    NFS directory invalidation (observed) and presumably also block-backed
    direct-io (not yet reported).

    Fix it by reverting invalidate_inode_pages2() back to the old version which
    ignores the page refcounts.

    We may come up with something more clever later, but for now we need a 2.6.18
    fix for NFS.

    Cc: Chuck Lever
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Make it possible to disable the block layer. Not all embedded devices require
    it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
    the block layer to be present.

    This patch does the following:

    (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
    support.

    (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
    an item that uses the block layer. This includes:

    (*) Block I/O tracing.

    (*) Disk partition code.

    (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

    (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
    block layer to do scheduling. Some drivers that use SCSI facilities -
    such as USB storage - end up disabled indirectly from this.

    (*) Various block-based device drivers, such as IDE and the old CDROM
    drivers.

    (*) MTD blockdev handling and FTL.

    (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
    taking a leaf out of JFFS2's book.

    (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
    linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
    however, still used in places, and so is still available.

    (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
    parts of linux/fs.h.

    (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

    (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

    (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
    is not enabled.

    (*) fs/no-block.c is created to hold out-of-line stubs and things that are
    required when CONFIG_BLOCK is not set:

    (*) Default blockdev file operations (to give error ENODEV on opening).

    (*) Makes some /proc changes:

    (*) /proc/devices does not list any blockdevs.

    (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

    (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

    (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
    given command other than Q_SYNC or if a special device is specified.

    (*) In init/do_mounts.c, no reference is made to the blockdev routines if
    CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.

    (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
    error ENOSYS by way of cond_syscall if so).

    (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
    CONFIG_BLOCK is not set, since they can't then happen.

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells
     
  • Move some functions out of the buffering code that aren't strictly buffering
    specific. This is a precursor to being able to disable the block layer.

    (*) Moved some stuff out of fs/buffer.c:

    (*) The file sync and general sync stuff moved to fs/sync.c.

    (*) The superblock sync stuff moved to fs/super.c.

    (*) do_invalidatepage() moved to mm/truncate.c.

    (*) try_to_release_page() moved to mm/filemap.c.

    (*) Moved some related declarations between header files:

    (*) declarations for do_invalidatepage() and try_to_release_page() moved
    to linux/mm.h.

    (*) __set_page_dirty_buffers() moved to linux/buffer_head.h.

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells
     

27 Sep, 2006

1 commit


09 Sep, 2006

1 commit

  • If a CPU faults this page into pagetables after invalidate_mapping_pages()
    checked page_mapped(), invalidate_complete_page() will still proceed to remove
    the page from pagecache. This leaves the page-faulting process with a
    detached page. If it was MAP_SHARED then file data loss will ensue.

    Fix that up by checking the page's refcount after taking tree_lock.

    Cc: Nick Piggin
    Cc: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

23 Jun, 2006

1 commit

  • If invalidate_mapping_pages is called to invalidate a very large mapping
    (e.g. a very large block device) and if the only active page in that
    device is near the end (or at least, at a very large index), such as, say,
    the superblock of an md array, and if that page happens to be locked when
    invalidate_mapping_pages is called, then

    pagevec_lookup will return this page and
    as it is locked, 'next' will be incremented and pagevec_lookup
    will be called again. and again. and again.
    while we count from 0 upto a very large number.

    We should really always set 'next' to 'page->index+1' before going around
    the loop again, not just if the page isn't locked.

    Cc: "Steinar H. Gunderson"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

10 Jan, 2006

1 commit


09 Jan, 2006

1 commit

  • Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to
    discard as much pagecache and/or reclaimable slab objects as it can. THis
    operation requires root permissions.

    It won't drop dirty data, so the user should run `sync' first.

    Caveats:

    a) Holds inode_lock for exorbitant amounts of time.

    b) Needs to be taught about NUMA nodes: propagate these all the way through
    so the discarding can be controlled on a per-node basis.

    This is a debugging feature: useful for getting consistent results between
    filesystem benchmarks. We could possibly put it under a config option, but
    it's less than 300 bytes.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

07 Jan, 2006

1 commit

  • This patch makes truncate_inode_pages_range from truncate_inode_pages.
    truncate_inode_pages became a one-liner call to truncate_inode_pages_range.

    Reiser4 needs truncate_inode_pages_ranges because it tries to keep
    correspondence between existences of metadata pointing to data pages and pages
    to which those metadata point to. So, when metadata of certain part of file
    is removed from filesystem tree, only pages of corresponding range are to be
    truncated.

    (Needed by the madvise(MADV_REMOVE) patch)

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hans Reiser
     

24 Nov, 2005

1 commit


31 Oct, 2005

1 commit

  • Fix the problem (BUG 4964) with unmapped buffers in transaction's
    t_sync_data list. The problem is we need to call filesystem's own
    invalidatepage() from block_write_full_page().

    block_write_full_page() must call filesystem's invalidatepage(). Otherwise
    following nasty race can happen:

    proc 1 proc 2
    ------ ------
    - write some new data to 'offset'
    => bh gets to the transactions data list
    - starts truncate
    => i_size set to new size
    - mpage_writepages()
    - ext3_ordered_writepage() to 'offset'
    - block_write_full_page()
    - page->index > end_index+1
    - block_invalidatepage()
    - discard_buffer()
    - clear_buffer_mapped()

    - commit triggers and finds unmapped buffer - BOOM!

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

01 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds