12 Jun, 2009

5 commits

  • Rename the function so that it better describes what it really does. Also
    remove the unnecessary include of buffer_head.h.

    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     
  • It is unnecessarily fragile to have two places (fsync_super() and do_sync())
    doing data integrity sync of the filesystem. Alter __fsync_super() to
    accommodate needs of both callers and use it. So after this patch
    __fsync_super() is the only place where we gather all the calls needed to
    properly send all data on a filesystem to disk.

    A nice bonus is that we get complete livelock avoidance and write_supers()
    is now only used for periodic writeback of superblocks.

    sync_blockdevs() introduced a couple of patches ago is gone now.

    [build fixes folded]

    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     
  • __fsync_super() does the same thing as fsync_super(). So change the only
    caller to use fsync_super() and make __fsync_super() static. This removes
    an unnecessarily duplicated call to sync_blockdev() and prepares the ground
    for the changes to __fsync_super() in the following patches.

    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     
  • * 'for-linus' of git://linux-arm.org/linux-2.6:
    kmemleak: Add the corresponding MAINTAINERS entry
    kmemleak: Simple testing module for kmemleak
    kmemleak: Enable the building of the memory leak detector
    kmemleak: Remove some of the kmemleak false positives
    kmemleak: Add modules support
    kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash
    kmemleak: Add the vmalloc memory allocation/freeing hooks
    kmemleak: Add the slub memory allocation/freeing hooks
    kmemleak: Add the slob memory allocation/freeing hooks
    kmemleak: Add the slab memory allocation/freeing hooks
    kmemleak: Add documentation on the memory leak detector
    kmemleak: Add the base support

    Manual conflict resolution (with the slab/earlyboot changes) in:
    drivers/char/vt.c
    init/main.c
    mm/slab.c

    Linus Torvalds
     
  • There are allocations for which the main pointer cannot be found but
    they are not memory leaks. This patch fixes some of them. For more
    information on false positives, see Documentation/kmemleak.txt.

    Signed-off-by: Catalin Marinas

    Catalin Marinas
     

05 Jun, 2009

1 commit

  • This reverts commit db2dbb12dc47a50c7a4c5678f526014063e486f6.

    It apparently causes problems with partition table read-ahead
    on archs with large page sizes. Until that problem is diagnosed
    further, just drop the readpages support on block devices.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

23 May, 2009

1 commit

  • Until now we have had a 1:1 mapping between storage device physical
    block size and the logical block size used when addressing the device.
    With SATA 4KB drives coming out that will no longer be the case. The
    sector size will be 4KB but the logical block size will remain
    512-bytes. Hence we need to distinguish between the physical block size
    and the logical ditto.

    This patch renames hardsect_size to logical_block_size.
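
    The distinction can be illustrated with a small userspace sketch. The
    struct and helper names below are hypothetical and are not part of the
    patch; the 512-byte and 4KB values are the ones quoted above.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical model of a drive's geometry; field names are illustrative. */
    struct blk_geometry {
        uint32_t logical_block_size;   /* addressing unit, e.g. 512 bytes */
        uint32_t physical_block_size;  /* media sector size, e.g. 4096 bytes */
    };

    /* Number of logical blocks that share one physical block. */
    static uint32_t logical_per_physical(const struct blk_geometry *g)
    {
        return g->physical_block_size / g->logical_block_size;
    }

    /* True if an I/O starting at logical block `lba` begins on a
     * physical block boundary. */
    static bool lba_is_physically_aligned(const struct blk_geometry *g,
                                          uint64_t lba)
    {
        return lba % logical_per_physical(g) == 0;
    }
    ```

    With a 512-byte logical and 4KB physical block size, eight logical
    blocks map onto each physical block, so only every eighth logical block
    address starts a physical block.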

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

28 Apr, 2009

1 commit


01 Apr, 2009

1 commit


28 Mar, 2009

1 commit


10 Jan, 2009

1 commit

  • The ioctls for the generic freeze feature are below.
    o Freeze the filesystem
    int ioctl(int fd, int FIFREEZE, arg)
    fd: The file descriptor of the mountpoint
    FIFREEZE: request code for the freeze
    arg: Ignored
    Return value: 0 if the operation succeeds. Otherwise, -1

    o Unfreeze the filesystem
    int ioctl(int fd, int FITHAW, arg)
    fd: The file descriptor of the mountpoint
    FITHAW: request code for unfreeze
    arg: Ignored
    Return value: 0 if the operation succeeds. Otherwise, -1
    Error number: If the filesystem has already been unfrozen,
    errno is set to EINVAL.
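
    A minimal userspace sketch of calling these ioctls follows. The helper
    names (fs_freeze_request, freeze_fs, thaw_fs) are illustrative and not
    part of the patch; it assumes FIFREEZE and FITHAW are exposed through
    <linux/fs.h> and that the caller has sufficient privileges.

    ```c
    #include <errno.h>
    #include <fcntl.h>
    #include <linux/fs.h>    /* FIFREEZE, FITHAW */
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Open the mountpoint and issue the given freeze/thaw request.
     * Returns 0 on success, -1 on failure with errno preserved. */
    static int fs_freeze_request(const char *mountpoint, unsigned long request)
    {
        int fd = open(mountpoint, O_RDONLY);
        if (fd < 0)
            return -1;

        int ret = ioctl(fd, request, 0);   /* arg is ignored */
        int saved_errno = errno;           /* close() may clobber errno */
        close(fd);
        errno = saved_errno;
        return ret;
    }

    static int freeze_fs(const char *mountpoint)
    {
        return fs_freeze_request(mountpoint, FIFREEZE);
    }

    static int thaw_fs(const char *mountpoint)
    {
        return fs_freeze_request(mountpoint, FITHAW);
    }
    ```

    A caller would typically pair freeze_fs() with thaw_fs() on the same
    mountpoint and check errno when either returns -1.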

    [akpm@linux-foundation.org: fix CONFIG_BLOCK=n]
    Signed-off-by: Takashi Sato
    Signed-off-by: Masayuki Hamaguchi
    Cc: Christoph Hellwig
    Cc: Dave Kleikamp
    Cc: Dave Chinner
    Cc: Alasdair G Kergon
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Takashi Sato
     

09 Jan, 2009

2 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits)
    jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs
    ext4: Remove "extents" mount option
    block: Add Kconfig help which notes that ext4 needs CONFIG_LBD
    ext4: Make printk's consistently prefixed with "EXT4-fs: "
    ext4: Add sanity checks for the superblock before mounting the filesystem
    ext4: Add mount option to set kjournald's I/O priority
    jbd2: Submit writes to the journal using WRITE_SYNC
    jbd2: Add pid and journal device name to the "kjournald2 starting" message
    ext4: Add markers for better debuggability
    ext4: Remove code to create the journal inode
    ext4: provide function to release metadata pages under memory pressure
    ext3: provide function to release metadata pages under memory pressure
    add releasepage hooks to block devices which can be used by file systems
    ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
    ext4: Init the complete page while building buddy cache
    ext4: Don't allow new groups to be added during block allocation
    ext4: mark the blocks/inode bitmap beyond end of group as used
    ext4: Use new buffer_head flag to check uninit group bitmaps initialization
    ext4: Fix the race between read_inode_bitmap() and ext4_new_inode()
    ext4: code cleanup
    ...

    Linus Torvalds
     
  • Currently md devices, once created, never disappear until the module
    is unloaded. This is essentially because the gendisk holds a
    reference to the mddev, and the mddev holds a reference to the
    gendisk, which is a circular reference.

    If we drop the reference from mddev to gendisk, then we need to ensure
    that the mddev is destroyed when the gendisk is destroyed. However it
    is not possible to hook into the gendisk destruction process to enable
    this.

    So we drop the reference from the gendisk to the mddev and destroy the
    gendisk when the mddev gets destroyed. However this has a
    complication.
    Between the call
    __blkdev_get->get_gendisk->kobj_lookup->md_probe
    and the call
    __blkdev_get->md_open

    there is no obvious way to hold a reference on the mddev any more, so
    unless something is done, it will disappear and gendisk will be
    destroyed prematurely.

    Also, once we decide to destroy the mddev, there will be an unlockable
    moment before the gendisk is unlinked (blk_unregister_region) during
    which a new reference to the gendisk can be created. We need to
    ensure that this reference can not be used. i.e. the ->open must
    fail.

    So:
    1/ in md_probe we set a flag in the mddev (hold_active) which
    indicates that the array should be treated as active, even
    though there are no references, and no appearance of activity.
    This is cleared by md_release when the device is closed if it
    is no longer needed.
    This ensures that the gendisk will survive between md_probe and
    md_open.

    2/ In md_open we check if the mddev we expect to open matches
    the gendisk that we did open.
    If there is a mismatch we return -ERESTARTSYS and modify
    __blkdev_get to retry from the top in that case.
    In the -ERESTARTSYS case we make sure to wait until
    the old gendisk (that we succeeded in opening) is really gone so
    we loop at most once.
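
    The retry in step 2 can be modelled in userspace roughly as follows.
    This is only a sketch of the control flow, not the kernel code: the
    callback type, the function names and MODEL_ERESTARTSYS are stand-ins
    introduced for illustration.

    ```c
    /* Stand-in for the kernel-internal ERESTARTSYS value. */
    #define MODEL_ERESTARTSYS 512

    /* Hypothetical "look up the gendisk and open it" step. */
    typedef int (*lookup_and_open_fn)(void *ctx);

    /* Repeat the whole lookup-and-open sequence for as long as the open
     * step reports that it raced with the destruction of the device. */
    static int open_with_restart(lookup_and_open_fn lookup_and_open,
                                 void *ctx, int *attempts)
    {
        int ret;

        *attempts = 0;
        do {
            (*attempts)++;
            ret = lookup_and_open(ctx);
        } while (ret == -MODEL_ERESTARTSYS);

        return ret;
    }

    /* Example callback: the first attempt sees a stale gendisk, and by the
     * second attempt the old gendisk is gone, so the open succeeds. */
    static int stale_once(void *ctx)
    {
        int *stale = ctx;

        if (*stale) {
            *stale = 0;   /* models waiting until the old gendisk is gone */
            return -MODEL_ERESTARTSYS;
        }
        return 0;
    }
    ```

    Because the stale state is cleared before the error is returned, the
    loop in this model runs at most twice, matching the "loop at most once"
    guarantee described above.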

    Some udev configurations will always open an md device when it first
    appears. If we allow an md device that was just created by an open
    to disappear on an immediate close, then this can race with such udev
    configurations and result in an infinite loop of the device being opened
    and closed, then re-opened due to the 'ADD' event from the first open,
    and then close and so on.
    So we make sure an md device, once created by an open, remains active
    at least until some md 'ioctl' has been made on it. This means that
    all normal usage of md devices will allow them to disappear promptly
    when not needed, but the worst that an incorrect usage will do is
    cause an inactive md device to be left in existence (it can easily be
    removed).

    As an array can be stopped by writing to a sysfs attribute
    echo clear > /sys/block/mdXXX/md/array_state
    we need to use scheduled work for deleting the gendisk and other
    kobjects. This allows us to wait for any pending gendisk deletion to
    complete by simply calling flush_scheduled_work().

    Signed-off-by: NeilBrown

    NeilBrown
     

07 Jan, 2009

1 commit

  • Fix function parameter name in kernel-doc:

    Warning(linux-2.6.28-git5//fs/block_dev.c:1272): No description found for parameter 'pathname'
    Warning(linux-2.6.28-git5//fs/block_dev.c:1272): Excess function parameter 'path' description in 'lookup_bdev'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

03 Jan, 2009

1 commit


01 Jan, 2009

1 commit

  • - iget5_locked in bdget really needs blockdev_superblock, instead of
    bd_mnt, so bd_mnt could be just a local variable;

    - blockdev_superblock really needs __read_mostly, while the local variable
    bd_mnt does not;

    - make use of sb_is_blkdev_sb in bd_forget, instead of direct reference
    to blockdev_superblock.

    Signed-off-by: Denis ChengRq
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Denis ChengRq
     

04 Dec, 2008

2 commits


06 Nov, 2008

1 commit

  • Commit 0762b8bde9729f10f8e6249809660ff2ec3ad735 moved disk_get_part()
    in front of recursive get on the whole disk, which caused removable
    devices to try disk_get_part() before rescanning after new media is
    inserted, which might fail legit open attempts or give the old
    partition.

    This patch fixes the problem by moving disk_get_part() after
    __blkdev_get() on the whole disk.

    This problem was spotted by Borislav Petkov.

    Signed-off-by: Tejun Heo
    Tested-by: Borislav Petkov
    Signed-off-by: Jens Axboe

    Tejun Heo
     

24 Oct, 2008

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits)
    [PATCH] kill the rest of struct file propagation in block ioctls
    [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET
    [PATCH] get rid of blkdev_locked_ioctl()
    [PATCH] get rid of blkdev_driver_ioctl()
    [PATCH] sanitize blkdev_get() and friends
    [PATCH] remember mode of reiserfs journal
    [PATCH] propagate mode through swsusp_close()
    [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
    [PATCH] pass fmode_t to blkdev_put()
    [PATCH] kill the unused bsize on the send side of /dev/loop
    [PATCH] trim file propagation in block/compat_ioctl.c
    [PATCH] end of methods switch: remove the old ones
    [PATCH] switch sr
    [PATCH] switch sd
    [PATCH] switch ide-scsi
    [PATCH] switch tape_block
    [PATCH] switch dcssblk
    [PATCH] switch dasd
    [PATCH] switch mtd_blkdevs
    [PATCH] switch mmc
    ...

    Linus Torvalds
     

23 Oct, 2008

1 commit


21 Oct, 2008

8 commits


17 Oct, 2008

1 commit

  • Fix block kernel-doc warnings:

    Warning(linux-2.6.27-git4//fs/block_dev.c:1272): No description found for parameter 'path'
    Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'cpu'
    Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'part'
    Warning(/var/linsrc/linux-2.6.27-git4//block/genhd.c:544): No description found for parameter 'partno'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Jens Axboe

    Randy Dunlap
     

09 Oct, 2008

10 commits

  • Fix kernel-doc in new functions:

    Error(mmotm-2008-1002-1617//fs/block_dev.c:895): duplicate section name 'Description'
    Error(mmotm-2008-1002-1617//fs/block_dev.c:924): duplicate section name 'Description'
    Warning(mmotm-2008-1002-1617//fs/block_dev.c:1282): No description found for parameter 'pathname'

    Signed-off-by: Randy Dunlap
    cc: Andrew Patterson
    Signed-off-by: Jens Axboe

    Randy Dunlap
     
  • We call flush_disk() to make sure the buffer cache for the disk is
    flushed after a disk resize. There are two resize cases, growing and
    shrinking. Given that users can shrink/then grow a disk before
    revalidate_disk() is called, we treat the grow case identically to
    shrinking. We need to flush the buffer cache after an online shrink
    because, as James Bottomley puts it,

    The two use cases for shrinking I can see are

    1. planned: the fs is already shrunk to within the new boundaries
    and all data is relocated, so invalidate is fine (any dirty
    buffers that might exist in the shrunk region are there only
    because they were relocated but not yet written to their
    original location).
    2. unplanned: In this case, the fs is probably toast, so whether
    we invalidate or not isn't going to make a whole lot of
    difference; it's still going to try to read or write from
    sectors beyond the new size and get I/O errors.

    Immediately invalidating shrunk disks will cause errors for outstanding
    reads/writes beyond the new end of the disk to be generated
    earlier than if we waited for the normal buffer cache operation. It also
    removes a potential security hole where we might keep old data around
    from beyond the end of the shrunk disk if the disk was not invalidated.

    Signed-off-by: Andrew Patterson
    Signed-off-by: Jens Axboe

    Andrew Patterson
     
  • We need to be able to flush the buffer cache for more than
    just when a disk is changed, so we factor out common cache flush code
    in check_disk_change() to an internal flush_disk() routine. This
    routine will then be used for both disk changes and disk resizes (in a
    later patch).

    Include the disk name in the text indicating that there are busy
    inodes on the device and increase the KERN severity of the message.

    Signed-off-by: Andrew Patterson
    Signed-off-by: Jens Axboe

    Andrew Patterson
     
  • The revalidate_disk routine now checks if a disk has been resized by
    comparing the gendisk capacity to the bdev inode size. If they are
    different (usually because the disk has been resized underneath the kernel)
    the bdev inode size is adjusted to match the capacity.
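
    The comparison can be sketched with a simplified userspace model. The
    structs and the function name below are stand-ins, not the kernel
    types, and the sketch assumes the gendisk capacity is counted in
    512-byte sectors while the inode size is in bytes.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Simplified stand-ins for the kernel objects (illustrative only). */
    struct resize_disk { uint64_t capacity_sectors; };  /* 512-byte sectors */
    struct resize_bdev { int64_t inode_size; };         /* bytes */

    /* Compare the disk capacity (converted to bytes) with the cached bdev
     * inode size. If they differ, adopt the new size and return true. */
    static bool sync_bdev_size(const struct resize_disk *disk,
                               struct resize_bdev *bdev)
    {
        int64_t disk_size = (int64_t)(disk->capacity_sectors << 9);

        if (disk_size == bdev->inode_size)
            return false;          /* nothing changed */

        bdev->inode_size = disk_size;
        return true;               /* size was adjusted */
    }
    ```

    A second call without a further capacity change returns false, since
    the two sizes then agree.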

    Signed-off-by: Andrew Patterson
    Signed-off-by: Jens Axboe

    Andrew Patterson
     
  • This is a wrapper for the lower-level revalidate_disk call-backs such
    as sd_revalidate_disk(). It allows us to perform pre and post
    operations when calling them.

    We will use this wrapper in a later patch to adjust block device sizes
    after an online resize (a _post_ operation).

    Signed-off-by: Andrew Patterson
    Signed-off-by: Jens Axboe

    Andrew Patterson
     
  • Till now, bdev->bd_part is set only if the bdev was for parts other
    than part0. This patch makes bdev->bd_part always set so that code
    paths don't have to differentiate common handling.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Move disk->holder_dir to part0->holder_dir. Kill now mostly
    superfluous bdev_get_holder().

    While at it, kill superfluous kobject_get/put() around holder_dir,
    slave_dir and cmd_filter creation and collapse
    disk_sysfs_add_subdirs() into register_disk(). These serve no purpose
    but obfuscating the code.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • genhd and partition code handled disk and partitions separately. All
    information about the whole disk was in struct gendisk and partitions in
    struct hd_struct. However, the whole disk (part0) and other
    partitions have a lot in common and the data structures end up having
    a good number of common fields and thus separate code paths doing the
    same thing. Also, the partition array was indexed by partno - 1 which
    gets pretty confusing at times.

    This patch introduces partition 0 and makes the partition array
    indexed by partno. Following patches will unify the handling of disk
    and parts piece-by-piece.

    This patch also implements disk_partitionable() which tests whether a
    disk is partitionable. With coming dynamic partition array change,
    the most common usage of disk_max_parts() will be testing whether a
    disk is partitionable and the number of max partitions will become
    much less important.
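
    The indexing change can be sketched as follows. This is a simplified
    userspace model with hypothetical type and function names. The
    partitionable test is written here as "more than one slot", which is
    an assumption about how such a check could look rather than a copy of
    the patch.

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    #define TABLE_MAX_PARTS 4   /* hypothetical: part0 plus three partitions */

    struct table_part {
        int partno;
        unsigned long long nr_sects;
    };

    struct part_table {
        int max_parts;                             /* slots, including part0 */
        struct table_part *part[TABLE_MAX_PARTS];  /* indexed by partno */
    };

    /* Index directly by partno; slot 0 is the whole disk (part0). */
    static struct table_part *table_get_part(const struct part_table *t,
                                             int partno)
    {
        if (partno < 0 || partno >= t->max_parts)
            return NULL;
        return t->part[partno];
    }

    /* A disk with only the part0 slot cannot hold further partitions. */
    static bool table_partitionable(const struct part_table *t)
    {
        return t->max_parts > 1;
    }
    ```

    With part0 stored in slot 0, the whole disk and its partitions share
    one lookup path and no "partno - 1" adjustment is needed.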

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Implement {disk|part}_to_dev() and use them to access the generic device
    instead of directly dereferencing {disk|part}->dev. To make sure no
    user is left behind, rename the generic device fields to __dev.

    This is in preparation for unifying partition 0 handling with other
    partitions.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • disk->part[] is protected by its matching bdev's lock. However,
    non-critical accesses like collecting stats and printing out sysfs and
    proc information used to be performed without any locking. As
    partitions can come and go dynamically, partitions can go away
    underneath those non-critical accesses. As some of those accesses are
    writes, this theoretically can lead to silent corruption.

    This patch fixes the race by using RCU for the partition array and dev
    reference counter to hold partitions.

    * Rename disk->part[] to disk->__part[] to make sure no one outside
    genhd layer proper accesses it directly.

    * Use RCU for disk->__part[] dereferencing.

    * Implement disk_{get|put}_part() which can be used to get and put
    partitions from gendisk respectively.

    * Iterators are implemented to help iterate through all partitions
    safely.

    * Functions which require RCU readlock are marked with _rcu suffix.

    * Use disk_put_part() in __blkdev_put() instead of directly putting
    the contained kobject.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo