04 Jan, 2012

1 commit

  • Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export
    kill_bdev as well, so brd doesn't have to open code it. Reduce
    buffer_head.h requirement accordingly.

    Removed a rather large comment from invalidate_bdev, as it looked a bit
    obsolete to bother moving. The small comment replacing it says enough.

    Signed-off-by: Nick Piggin
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Al Viro
     

02 Dec, 2011

1 commit


25 Nov, 2011

1 commit


16 Nov, 2011

2 commits

  • 1) Anyone who has read access to loopdev has permission to call set_status
    and may change important parameters such as lo_offset, lo_sizelimit and
    so on, which contradicts the read access pattern and is effectively
    equivalent to write access.
    2) Add lo_offset over i_size check to prevent blkdev_size overflow.
    ##Testcase_begin
    #dd if=/dev/zero of=./file bs=1k count=1
    #losetup /dev/loop0 ./file
    /* userspace_application */
    struct loop_info64 loinf;
    fd = open("/dev/loop0", O_RDONLY);
    ioctl(fd, LOOP_GET_STATUS64, &loinf);
    /* Set offset to any value which is bigger than i_size, and sizelimit
    * to nonzero value*/
    loinf.lo_offset = 4096*1024;
    loinf.lo_sizelimit = 1024;
    ioctl(fd, LOOP_SET_STATUS64, &loinf);
    /* After this loop device will have size similar to 0x7fffffffffxxxx */
    #blockdev --getsz /dev/loop0
    ##OUTPUT: 36028797018955968
    ##Testcase_end

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Dmitry Monakhov
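The offset check described in point 2 can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel's code: `loop_sectors` and its parameter names are hypothetical, and sizes are plain signed 64-bit byte counts.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's function): return the number of
 * 512-byte sectors a loop device would expose for a backing file of
 * file_size bytes, or -EINVAL when the offset lies beyond the file.
 */
int64_t loop_sectors(int64_t file_size, int64_t offset, int64_t sizelimit)
{
    int64_t bytes;

    /* Without this check, file_size - offset would go negative. */
    if (offset < 0 || sizelimit < 0 || offset > file_size)
        return -EINVAL;

    bytes = file_size - offset;

    /* A nonzero size limit can only shrink the visible device. */
    if (sizelimit > 0 && sizelimit < bytes)
        bytes = sizelimit;

    return bytes >> 9; /* bytes to 512-byte sectors */
}
```

With the values from the testcase (a 1 KiB file, offset 4096*1024, sizelimit 1024) the sketch rejects the request instead of producing a negative size that would later be read as a huge unsigned value.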
     
  • If read was not fully successful we have to fail whole bio to prevent
    information leak of old pages

    ##Testcase_begin
    dd if=/dev/zero of=./file bs=1M count=1
    losetup /dev/loop0 ./file -o 4096
    truncate -s 0 ./file
    # Oops, the loop offset is now beyond i_size, so the read will silently fail.
    # So the bio's pages would not be cleared, which may result in an information leak.
    hexdump -C /dev/loop0
    ##testcase_end

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Dmitry Monakhov
     

05 Nov, 2011

1 commit

  • * 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits)
    virtio-blk: use ida to allocate disk index
    hpsa: add small delay when using PCI Power Management to reset for kump
    cciss: add small delay when using PCI Power Management to reset for kump
    xen/blkback: Fix two races in the handling of barrier requests.
    xen/blkback: Check for proper operation.
    xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
    xen/blkback: Report VBD_WSECT (wr_sect) properly.
    xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
    xen-blkfront: plug device number leak in xlblk_init() error path
    xen-blkfront: If no barrier or flush is supported, use invalid operation.
    xen-blkback: use kzalloc() in favor of kmalloc()+memset()
    xen-blkback: fixed indentation and comments
    xen-blkfront: fix a deadlock while handling discard response
    xen-blkfront: Handle discard requests.
    xen-blkback: Implement discard requests ('feature-discard')
    xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
    drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd()
    drivers/block/loop.c: emit uevent on auto release
    drivers/block/cpqarray.c: use pci_dev->revision
    loop: always allow userspace partitions and optionally support automatic scanning
    ...

    Fix up trivial header file inclusion conflict in drivers/block/loop.c

    Linus Torvalds
     

24 Oct, 2011

1 commit


19 Oct, 2011

1 commit


17 Oct, 2011

1 commit

  • Currently the loop device tries to call directly into write_begin/write_end
    instead of going through ->write if it can. This is a fairly nasty shortcut
    as write_begin and write_end are only callbacks for the generic write code
    and expect to be called with filesystem specific locks held.

    This code currently causes various issues for clustered filesystems as it
    doesn't take the required cluster locks, and it also causes issues for XFS
    as it doesn't properly lock against the swapext ioctl as called by the
    defragmentation tools. This in turn causes data corruption if
    defragmentation hits a busy loop device in the wrong time window, as
    reported by RH QA.

    The reason why we have this shortcut is that it saves a data copy when
    doing a transformation on the loop device, which is the technical term
    for using cryptoloop (or an XOR transformation). Given that cryptoloop
    has been deprecated in favour of dm-crypt my opinion is that we should
    simply drop this shortcut instead of finding complicated ways to
    introduce a formal interface for this shortcut.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
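The "data copy" mentioned above can be pictured with a small user-space sketch: the transformed data is written into a separate bounce buffer, which could then be handed to the ordinary write path. The function name and signature are hypothetical and are not the driver's transfer interface.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a loop "transfer" function: XOR src with a
 * repeating key and store the result in dst (the bounce buffer).
 * An empty key leaves the data unchanged.
 */
void xor_transfer(unsigned char *dst, const unsigned char *src, size_t len,
                  const unsigned char *key, size_t keylen)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = keylen ? (unsigned char)(src[i] ^ key[i % keylen]) : src[i];
}
```

Because XOR is its own inverse, running the same function over the bounce buffer with the same key recovers the original bytes.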
     

21 Sep, 2011

2 commits


12 Sep, 2011

1 commit

  • There is very little benefit in letting a ->make_request
    instance update the bio's device and sector and loop around it in
    __generic_make_request when we can achieve the same through calling
    generic_make_request from the driver and letting the loop in
    generic_make_request handle it.

    Note that various drivers got the return value from ->make_request wrong
    and returned non-zero values for errors.

    Signed-off-by: Christoph Hellwig
    Acked-by: NeilBrown
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

24 Aug, 2011

1 commit

  • Automatic partition scanning can be requested individually per loop
    device during its setup by setting LO_FLAGS_PARTSCAN. By default, no
    partition tables are scanned.

    Userspace can now always add and remove partitions from all loop
    devices, regardless of whether the in-kernel partition scanner is
    enabled or not.

    The needed partition minor numbers are allocated from the extended
    minors space, the main loop device numbers will continue to match the
    loop minors, regardless of the number of partitions used.

    # grep . /sys/class/block/loop1/loop/*
    /sys/block/loop1/loop/autoclear:0
    /sys/block/loop1/loop/backing_file:/home/kay/data/stuff/part.img
    /sys/block/loop1/loop/offset:0
    /sys/block/loop1/loop/partscan:1
    /sys/block/loop1/loop/sizelimit:0

    # ls -l /dev/loop*
    brw-rw---- 1 root disk 7, 0 Aug 14 20:22 /dev/loop0
    brw-rw---- 1 root disk 7, 1 Aug 14 20:23 /dev/loop1
    brw-rw---- 1 root disk 259, 0 Aug 14 20:23 /dev/loop1p1
    brw-rw---- 1 root disk 259, 1 Aug 14 20:23 /dev/loop1p2
    brw-rw---- 1 root disk 7, 99 Aug 14 20:23 /dev/loop99
    brw-rw---- 1 root disk 259, 2 Aug 14 20:23 /dev/loop99p1
    brw-rw---- 1 root disk 259, 3 Aug 14 20:23 /dev/loop99p2
    crw------T 1 root root 10, 237 Aug 14 20:22 /dev/loop-control

    Cc: Karel Zak
    Cc: Davidlohr Bueso
    Acked-By: Tejun Heo
    Signed-off-by: Kay Sievers
    Signed-off-by: Jens Axboe

    Kay Sievers
     

19 Aug, 2011

1 commit

  • This commit adds discard support for loop devices. Discard is usually
    supported by SSD and thinly provisioned devices as a method for
    reclaiming unused space. This is no different than trying to reclaim
    back space which is not used by the file system on the image, but it
    still occupies space on the host file system.

    We can do the reclamation on file systems that support hole
    punching. So when a discard request gets to the loop driver we can
    translate it into punching a hole in the underlying file, hence
    reclaiming the free space.

    This is very useful for trimming down the size of the image to only what
    is really used by the file system on that image. Fstrim may be used for
    that purpose.

    It has been tested on ext4, xfs and btrfs with the image file systems
    ext4, ext3, xfs and btrfs. An ext4 or ext3 image on an ext4 file system
    has some problems, but it seems that the ext4 punch hole implementation
    is somewhat flawed and it is unrelated to this commit.

    Also this is a very good method of validating file systems punch hole
    implementation.

    Note that when encryption is used, discard support is disabled, because
    using it might leak some information useful to a possible attacker.

    Signed-off-by: Lukas Czerner
    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Lukas Czerner
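The translation from a discard request to a hole in the backing file comes down to converting a sector range into a byte range. The sketch below shows only that arithmetic; `struct byte_range` and `discard_to_hole` are hypothetical names. In user space, the resulting range is what one would pass to fallocate(2) with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE.

```c
#include <stdint.h>

/* Byte range inside the backing file (hypothetical type for this sketch). */
struct byte_range {
    int64_t offset;
    int64_t length;
};

/*
 * Map a discard of nr_sectors 512-byte sectors starting at first_sector
 * on the loop device to a byte range in the backing file, taking the
 * loop device's starting offset into account.
 */
struct byte_range discard_to_hole(uint64_t first_sector, uint32_t nr_sectors,
                                  int64_t lo_offset)
{
    struct byte_range r;

    r.offset = ((int64_t)first_sector << 9) + lo_offset;
    r.length = (int64_t)nr_sectors << 9;
    return r;
}
```

For a loop device set up with a 4096-byte offset, discarding 16 sectors starting at sector 8 maps to 8192 bytes starting at byte 8192 of the backing file.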
     

01 Aug, 2011

4 commits

  • LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
    files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
    waits for show() to finish to remove the sysfs file.

    cat /sys/class/block/loop0/loop/backing_file
    mutex_lock_nested+0x176/0x350
    ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    dev_attr_show+0x1b/0x60
    ? sysfs_read_file+0x86/0x1a0
    ? __get_free_pages+0x12/0x50
    sysfs_read_file+0xaf/0x1a0

    ioctl(LOOP_CLR_FD):
    wait_for_common+0x12c/0x180
    ? try_to_wake_up+0x2a0/0x2a0
    wait_for_completion+0x18/0x20
    sysfs_deactivate+0x178/0x180
    ? sysfs_addrm_finish+0x43/0x70
    ? sysfs_addrm_start+0x1d/0x20
    sysfs_addrm_finish+0x43/0x70
    sysfs_hash_and_remove+0x85/0xa0
    sysfs_remove_group+0x59/0x100
    loop_clr_fd+0x1dc/0x3f0 [loop]
    lo_ioctl+0x223/0x7a0 [loop]

    Instead of taking the lo_ctl_mutex from sysfs code, take the inner
    lo->lo_lock, to protect the access to the backing_file data.

    Thanks to Tejun for help debugging and finding a solution.

    Cc: Milan Broz
    Cc: Tejun Heo
    Signed-off-by: Kay Sievers
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Kay Sievers
     
  • Instead of unconditionally creating a fixed number of dead loop
    devices which need to be investigated by storage handling services,
    even when they are never used, we allow distros to start with 0
    loop devices and have losetup(8) and similar switch to the dynamic
    /dev/loop-control interface instead of searching /dev/loop%i for free
    devices.

    Signed-off-by: Kay Sievers
    Signed-off-by: Jens Axboe

    Kay Sievers
     
  • Loop devices today have a fixed pre-allocated number of usually 8.
    The number can only be changed at module init time. To find a free
    device to use, /dev/loop%i needs to be scanned, and all devices need
    to be opened until a free one is possibly found.

    This adds a new /dev/loop-control device node that allows user space to
    dynamically find or allocate a free device, and to add and remove loop
    devices from the running system:
    LOOP_CTL_ADD adds a specific device. Arg is the number
    of the device. It returns the device i or a negative
    error code.

    LOOP_CTL_REMOVE removes a specific device. Arg is the
    number of the device. It returns the device i or a negative
    error code.

    LOOP_CTL_GET_FREE finds the next unbound device or allocates
    a new one. No arg is given. It returns the device i or a
    negative error code.

    The loop kernel module gets automatically loaded when
    /dev/loop-control is accessed the first time. The alias
    specified in the module, instructs udev to create this
    'dead' device node, even when the module is not loaded.

    Example:
    cfd = open("/dev/loop-control", O_RDWR);

    /* add a new specific loop device */
    err = ioctl(cfd, LOOP_CTL_ADD, devnr);

    /* remove a specific loop device */
    err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);

    /* find or allocate a free loop device to use */
    devnr = ioctl(cfd, LOOP_CTL_GET_FREE);

    sprintf(loopname, "/dev/loop%i", devnr);
    ffd = open("backing-file", O_RDWR);
    lfd = open(loopname, O_RDWR);
    err = ioctl(lfd, LOOP_SET_FD, ffd);

    Cc: Tejun Heo
    Cc: Karel Zak
    Signed-off-by: Kay Sievers
    Signed-off-by: Jens Axboe

    Kay Sievers
     
  • Replace the linked list, that keeps track of allocated devices, with an
    idr index to allow a more efficient lookup of devices.

    Cc: Tejun Heo
    Signed-off-by: Kay Sievers
    Signed-off-by: Jens Axboe

    Kay Sievers
     

27 May, 2011

1 commit

  • Export 'max_loop' and 'max_part' parameters to sysfs so users can see
    how many devices are allowed and how many partitions are supported.

    If 'max_loop' is 0, there is no restriction on the number of loop devices.
    Users can create and use as many devices as there are minor numbers
    available. If 'max_part' is 0, the device simply doesn't support
    partitioning.

    Also note that 'max_part' can be adjusted to power of 2 minus 1 form if
    needed. Users should check this value after the module is loaded if they
    want to use that number correctly (i.e. fdisk, mknod, etc.).

    Signed-off-by: Namhyung Kim
    Cc: Laurent Vivier
    Signed-off-by: Jens Axboe

    Namhyung Kim
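The "power of 2 minus 1" adjustment mentioned above can be sketched as follows. This assumes the value is rounded up via the position of its highest set bit; the helper names are hypothetical and the code is an illustration rather than the driver's implementation.

```c
#include <limits.h>

/* 1-based position of the highest set bit; 0 for x == 0
 * (the same convention as the kernel's fls()). */
unsigned int highest_bit(unsigned int x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Round max_part up to the next value of the form 2^n - 1. */
unsigned int effective_max_part(unsigned int max_part)
{
    unsigned int shift = highest_bit(max_part);

    /* Avoid shifting by the full width of the type. */
    if (shift >= sizeof(unsigned int) * CHAR_BIT)
        return UINT_MAX;

    return (1U << shift) - 1;
}
```

Under this assumption a request for max_part=10 would be reported back as 15, which is why reading the exported value after loading the module is worthwhile.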
     

24 May, 2011

2 commits

  • When finding or allocating a loop device, loop_probe() did not take
    partition numbers into account, which could result in a different
    device being used. Consider the following example:

    $ sudo modprobe loop max_part=15
    $ ls -l /dev/loop*
    brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0
    brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1
    brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2
    brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3
    brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4
    brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5
    brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6
    brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
    $ sudo mknod /dev/loop8 b 7 128
    $ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img
    $ sudo losetup -a
    /dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img)
    $ ls -l /dev/loop*
    brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0
    brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1
    brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128
    brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1
    brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2
    brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3
    brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2
    brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3
    brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4
    brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5
    brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6
    brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
    brw-r--r-- 1 root root 7, 128 2011-05-24 22:17 /dev/loop8

    After this patch, /dev/loop8 - instead of /dev/loop128 - was
    accessed correctly.

    In addition, 'range' passed to blk_register_region() should
    include all range of dev_t that LOOP_MAJOR can address. It does
    not need to be limited by partition numbers unless 'max_loop'
    param was specified.

    Signed-off-by: Namhyung Kim
    Cc: Laurent Vivier
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Namhyung Kim
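The listing above is consistent with each loop device owning a block of 2^part_shift minors (16 for max_part=15). A minimal sketch of the mapping, with hypothetical helper names, shows why minor 128 should resolve to loop8 and not loop128:

```c
/* Hypothetical helper: loop device index that owns a given minor number. */
unsigned int loop_index_from_minor(unsigned int minor, unsigned int part_shift)
{
    return minor >> part_shift;
}

/* Hypothetical helper: first minor number of a given loop device. */
unsigned int loop_first_minor(unsigned int index, unsigned int part_shift)
{
    return index << part_shift;
}
```

With part_shift = 4, minor 128 maps to index 8, while index 128 starts at minor 2048, matching the /dev/loop128 node that appeared before the fix.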
     
  • The 'max_part' parameter controls the maximum number of partitions
    a loop block device can have. However, if a user specifies a very
    large value, it exceeds the limit of the device minor number
    and can cause a kernel panic (or, at least, produce invalid
    device nodes in some cases).

    On my desktop system, the following command kills the kernel. On qemu,
    it triggers a similar oops but the kernel stayed alive:

    $ sudo modprobe loop max_part=100000
    ------------[ cut here ]------------
    kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
    invalid opcode: 0000 [#1] SMP
    last sysfs file:
    CPU 0
    Modules linked in: loop(+)

    Pid: 43, comm: insmod Tainted: G W 2.6.39-qemu+ #155 Bochs Bochs
    RIP: 0010:[] [] internal_create_group+0x2a/0x170
    RSP: 0018:ffff880007b3fde8 EFLAGS: 00000246
    RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
    RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
    RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
    R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
    R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
    FS: 0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
    Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c0)
    Stack:
    ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
    0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
    0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
    Call Trace:
    [] ? device_add+0x4bc/0x5af
    [] ? dev_set_name+0x3c/0x3e
    [] sysfs_create_group+0xe/0x12
    [] blk_trace_init_sysfs+0x14/0x16
    [] blk_register_queue+0x47/0xf7
    [] add_disk+0xdf/0x290
    [] loop_init+0xeb/0x1b8 [loop]
    [] ? 0xffffffffa0005fff
    [] do_one_initcall+0x7a/0x12e
    [] sys_init_module+0x9c/0x1e0
    [] system_call_fastpath+0x16/0x1b
    Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 0b eb fe 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
    RIP [] internal_create_group+0x2a/0x170
    RSP
    ---[ end trace a123eb592043acad ]---

    Signed-off-by: Namhyung Kim
    Cc: Laurent Vivier
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Namhyung Kim
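A parameter check of the kind this commit calls for might look like the sketch below. The two constants mirror the 20-bit minor field of a Linux dev_t and a 256-partition ceiling per disk; both are restated here as assumptions, and `check_loop_params` is a hypothetical name.

```c
#include <errno.h>

#define SKETCH_MINORBITS      20   /* assumed width of the minor field */
#define SKETCH_DISK_MAX_PARTS 256  /* assumed ceiling on partitions per disk */

/* Return 0 if the module parameters fit into the minor space, else -EINVAL. */
int check_loop_params(unsigned int max_part, unsigned long max_loop)
{
    unsigned int part_shift = 0;
    unsigned int v;

    /* Number of minor bits needed to number the partitions. */
    for (v = max_part; v != 0; v >>= 1)
        part_shift++;

    if ((1ULL << part_shift) > SKETCH_DISK_MAX_PARTS)
        return -EINVAL;

    /* The remaining minor bits bound the number of loop devices. */
    if (max_loop > (1UL << (SKETCH_MINORBITS - part_shift)))
        return -EINVAL;

    return 0;
}
```

Under these assumptions an oversized value such as max_part=100000 is rejected before any disk is registered.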
     

10 Mar, 2011

2 commits


05 Mar, 2011

1 commit

  • This merge creates two sets of conflicts. One is simple context
    conflicts caused by removal of throtl_scheduled_delayed_work() in
    for-linus and removal of throtl_shutdown_timer_wq() in
    for-2.6.39/core.

    The other is caused by commit 255bb490c8 (block: blk-flush shouldn't
    call directly into q->request_fn() __blk_run_queue()) in for-linus
    clashing with FLUSH reimplementation in for-2.6.39/core. The conflict
    isn't trivial but the resolution is straight-forward.

    * __blk_run_queue() calls in flush_end_io() and flush_data_end_io()
    should be called with @force_kblockd set to %true.

    * elv_insert() in blk_kick_flush() should use
    %ELEVATOR_INSERT_REQUEUE.

    Both changes are to avoid invoking ->request_fn() directly from
    request completion path and closely match the changes in the commit
    255bb490c8.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

04 Mar, 2011

1 commit

  • The following steps lead to a deadlock in the kernel:

    dd if=/dev/zero of=img bs=512 count=1000
    losetup -f img
    mkfs.ext2 /dev/loop0
    mount -t ext2 -o loop /dev/loop0 mnt
    umount mnt/

    Stacktrace:
    [] irq_exit+0x36/0x59
    [] smp_apic_timer_interrupt+0x6b/0x75
    [] apic_timer_interrupt+0x31/0x38
    [] mutex_spin_on_owner+0x54/0x5b
    [] lo_release+0x12/0x67 [loop]
    [] __blkdev_put+0x7c/0x10c
    [] fput+0xd5/0x1aa
    [] loop_clr_fd+0x1a9/0x1b1 [loop]
    [] lo_release+0x39/0x67 [loop]
    [] __blkdev_put+0x7c/0x10c
    [] deactivate_locked_super+0x17/0x36
    [] sys_umount+0x27e/0x2a5
    [] sys_oldumount+0xb/0xe
    [] sysenter_do_call+0x12/0x26
    [] 0xffffffff

    Regression since 2a48fc0ab24241755dc9, which introduced the private
    loop_mutex as part of the BKL removal process.

    As per [1], the mutex can be safely removed.

    [1] http://www.gossamer-threads.com/lists/linux/kernel/1341930

    Addresses: https://bugzilla.novell.com/show_bug.cgi?id=669394
    Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=29172

    Signed-off-by: Petr Uzel
    Cc: stable@kernel.org
    Reviewed-by: Nikanth Karthikesan
    Acked-by: Arnd Bergmann
    Signed-off-by: Jens Axboe

    Petr Uzel
     

03 Mar, 2011

1 commit


19 Jan, 2011

1 commit

  • Performing
    $ sudo mount -o loop -o umask=0 /dev/sdb1 /mnt/
    mount: wrong fs type, bad option, bad superblock on /dev/loop0,
    missing codepage or helper program, or other error
    In some cases useful info is found in syslog - try
    dmesg | tail or so

    $ sudo modprobe -r loop

    results in oops:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
    IP: [] do_raw_spin_lock+0x14/0x122
    Process modprobe (pid: 6189, threadinfo ffff88009a898000, task ffff880154a88000)
    Call Trace:
    [] _raw_spin_lock_irq+0x4a/0x51
    [] ? blk_throtl_exit+0x3b/0xa0
    [] ? cancel_delayed_work_sync+0xd/0xf
    [] blk_throtl_exit+0x3b/0xa0
    [] blk_release_queue+0x21/0x65
    [] kobject_release+0x51/0x66
    [] ? kobject_release+0x0/0x66
    [] kref_put+0x43/0x4d
    [] kobject_put+0x47/0x4b
    [] blk_cleanup_queue+0x56/0x5b
    [] loop_exit+0x68/0x844 [loop]
    [] sys_delete_module+0x1e8/0x25b
    [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [] system_call_fastpath+0x16/0x1b

    because of an attempt to acquire NULL queue_lock.
    I added the same lines as in blk_queue_make_request -
    `fall back to embedded per-queue lock'.

    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Sergey Senozhatsky
     

20 Dec, 2010

1 commit

  • Commit a8adbe3 forgot to remove the return variable, kill it.

    drivers/block/loop.c: In function 'lo_splice_actor':
    drivers/block/loop.c:398: warning: unused variable 'ret'
    [...]
    fs/nfsd/vfs.c: In function 'nfsd_splice_actor':
    fs/nfsd/vfs.c:848: warning: unused variable 'ret'

    Reported-by: Stephen Rothwell
    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 Dec, 2010

1 commit

  • This patch pulls calls to buf->ops->confirm() from all actors passed
    (also indirectly) to splice_from_pipe_feed().

    Is avoiding the call to buf->ops->confirm() while splice()ing to
    /dev/null an intentional optimization? No other user does that
    and this will remove this special case.

    Against current linux.git 6313e3c21743cc88bb5bd8aa72948ee1e83937b6.

    Signed-off-by: Michał Mirosław
    Signed-off-by: Jens Axboe

    Michał Mirosław
     

10 Nov, 2010

1 commit

  • REQ_HARDBARRIER is dead now, so remove the leftovers. What's left
    at this point is:

    - various checks inside the block layer.
    - sanity checks in bio based drivers.
    - now unused bio_empty_barrier helper.
    - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it has been dead for a
    while, but Xen really needs to sort out its barrier situation.
    - setting of ordered tags in uas - dead code copied from old scsi
    drivers.
    - scsi different retry for barriers - it's dead and should have been
    removed when flushes were converted to FS requests.
    - blktrace handling of barriers - removed. Someone who knows blktrace
    better should add support for REQ_FLUSH and REQ_FUA, though.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

28 Oct, 2010

1 commit

  • In autoclear mode bdev is NULL but the sysfs
    entry should be destroyed otherwise this warning appears:

    WARNING: at fs/sysfs/dir.c:451 sysfs_add_one+0x82/0x95()
    sysfs: cannot create duplicate filename '/devices/virtual/block/loop0/loop'

    Fixes commit ee86273062cbb310665fe49e1f1937d2cf85b0b9

    Signed-off-by: Milan Broz
    Signed-off-by: Jens Axboe

    Milan Broz
     

27 Oct, 2010

1 commit

  • Ensure kmap_atomic() usage is strictly nested

    Signed-off-by: Peter Zijlstra
    Reviewed-by: Rik van Riel
    Acked-by: Chris Metcalf
    Cc: David Howells
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: David Miller
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

23 Oct, 2010

2 commits

  • * 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits)
    xen-blkfront: disable barrier/flush write support
    Added blk-lib.c and blk-barrier.c was renamed to blk-flush.c
    block: remove BLKDEV_IFL_WAIT
    aic7xxx_old: removed unused 'req' variable
    block: remove the BH_Eopnotsupp flag
    block: remove the BLKDEV_IFL_BARRIER flag
    block: remove the WRITE_BARRIER flag
    swap: do not send discards as barriers
    fat: do not send discards as barriers
    ext4: do not send discards as barriers
    jbd2: replace barriers with explicit flush / FUA usage
    jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier
    jbd: replace barriers with explicit flush / FUA usage
    nilfs2: replace barriers with explicit flush / FUA usage
    reiserfs: replace barriers with explicit flush / FUA usage
    gfs2: replace barriers with explicit flush / FUA usage
    btrfs: replace barriers with explicit flush / FUA usage
    xfs: replace barriers with explicit flush / FUA usage
    block: pass gfp_mask and flags to sb_issue_discard
    dm: convey that all flushes are processed as empty
    ...

    Linus Torvalds
     
  • * 'for-2.6.37/drivers' of git://git.kernel.dk/linux-2.6-block: (95 commits)
    cciss: fix PCI IDs for new Smart Array controllers
    drbd: add race-breaker to drbd_go_diskless
    drbd: use dynamic_dev_dbg to optionally log uuid changes
    dynamic_debug.h: Fix dynamic_dev_dbg() macro if CONFIG_DYNAMIC_DEBUG not set
    drbd: cleanup: change "s
    drbd: add explicit drbd_md_sync to drbd_resync_finished
    drbd: Do not log an ASSERT for P_OV_REQUEST packets while C_CONNECTED
    drbd: fix for possible deadlock on IO error during resync
    drbd: fix unlikely access after free and list corruption
    drbd: fix for spurious fullsync (uuids rotated too fast)
    drbd: allow for explicit resync-finished notifications
    drbd: preparation commit, using full state in receive_state()
    drbd: drbd_send_ack_dp must not rely on header information
    drbd: Fix regression in recv_bm_rle_bits (compressed bitmap)
    drbd: Fixed a stupid copy and paste error
    drbd: Allow larger values for c-fill-target.
    ...

    Fix up trivial conflict in drivers/block/ataflop.c due to BKL removal

    Linus Torvalds
     

05 Oct, 2010

1 commit

  • The block device drivers have all gained new lock_kernel
    calls from a recent pushdown, and some of the drivers
    were already using the BKL before.

    This turns the BKL into a set of per-driver mutexes.
    Still need to check whether this is safe to do.

    file=$1
    name=$2
    if grep -q lock_kernel ${file} ; then
        if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
        else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
        fi
        sed -i ${file} \
            -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                    /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

            } }" \
            -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
            -e '/[ ]*cycle_kernel_lock();/d'
    else
        sed -i -e '/include.*\<smp_lock.h\>/d' ${file} \
            -e '/cycle_kernel_lock()/d'
    fi

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

10 Sep, 2010

3 commits

  • Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead. Also,
    instead of checking file->f_op->fsync() directly, look at the return
    value of vfs_fsync() and ignore an -EINVAL return.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
    requests. Deprecate barrier. All REQ_HARDBARRIERs are failed with
    -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
    blk_queue_flush().

    blk_queue_flush() takes combinations of REQ_FLUSH and FUA. If a
    device has write cache and can flush it, it should set REQ_FLUSH. If
    the device can handle FUA writes, it should also set REQ_FUA.

    All blk_queue_ordered() users are converted.

    * ORDERED_DRAIN is mapped to 0 which is the default value.
    * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
    * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.

    Signed-off-by: Tejun Heo
    Acked-by: Boaz Harrosh
    Cc: Christoph Hellwig
    Cc: Nick Piggin
    Cc: Michael S. Tsirkin
    Cc: Jeremy Fitzhardinge
    Cc: Chris Wright
    Cc: FUJITA Tomonori
    Cc: Geert Uytterhoeven
    Cc: David S. Miller
    Cc: Alasdair G Kergon
    Cc: Pierre Ossman
    Cc: Stefan Weinhuber
    Signed-off-by: Jens Axboe

    Tejun Heo
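The conversion table in this commit can be written down as a small mapping function. The enum and flag values below are placeholders chosen for the sketch (the real kernel constants have different values), and `flush_flags_for` is a hypothetical name.

```c
/* Placeholder stand-ins for the old ordered modes (values are arbitrary). */
enum sketch_ordered_mode {
    SKETCH_ORDERED_DRAIN,
    SKETCH_ORDERED_DRAIN_FLUSH,
    SKETCH_ORDERED_DRAIN_FLUSH_FUA
};

/* Placeholder stand-ins for REQ_FLUSH and REQ_FUA (values are arbitrary). */
#define SKETCH_REQ_FLUSH (1u << 0)
#define SKETCH_REQ_FUA   (1u << 1)

/* Map an old ordered mode to the flag combination listed in the commit. */
unsigned int flush_flags_for(enum sketch_ordered_mode mode)
{
    switch (mode) {
    case SKETCH_ORDERED_DRAIN_FLUSH:
        return SKETCH_REQ_FLUSH;                  /* cache can be flushed */
    case SKETCH_ORDERED_DRAIN_FLUSH_FUA:
        return SKETCH_REQ_FLUSH | SKETCH_REQ_FUA; /* flush plus FUA writes */
    case SKETCH_ORDERED_DRAIN:
    default:
        return 0;                                 /* default: no flags */
    }
}
```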
     
  • loop implements FLUSH using fsync but was incorrectly setting its
    ordered mode to DRAIN. Change it to DRAIN_FLUSH. In practice, this
    doesn't change anything as loop doesn't make use of the block layer
    ordered implementation.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     

23 Aug, 2010

2 commits

  • Create /sys/block/loopX/loop directory and provide these attributes:
    - backing_file
    - autoclear
    - offset
    - sizelimit

    This loop directory is present only if loop device is configured.

    To be used in util-linux-ng (and possibly elsewhere like udev rules)
    where code needs to get loop attributes from the kernel (and not store
    duplicate info in userspace).

    Moreover loop ioctls are not even able to provide full backing
    file info because of buffer limits.

    Signed-off-by: Milan Broz
    Signed-off-by: Jens Axboe

    Milan Broz
     
  • Return of the bi_rw tests is no longer bool after commit 74450be1. But
    results of such tests are stored in bools. This doesn't fit in there
    for some compilers (gcc 4.5 here), so either use !! magic to get real
    bools or use ulong where the result is assigned somewhere.

    Signed-off-by: Jiri Slaby
    Cc: Christoph Hellwig
    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jiri Slaby
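The general hazard behind the "!! magic" can be shown with a narrow integer type standing in for the destination of a flag test. This is an illustration under that assumption, not the code touched by the commit; the flag value and function names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_FLAG_HIGH (1ULL << 33) /* hypothetical flag above bit 7 */

/* Storing the masked value directly truncates it: bit 33 does not fit. */
unsigned char narrow_store(uint64_t flags)
{
    return (unsigned char)(flags & SKETCH_FLAG_HIGH);
}

/* Normalizing with !! first yields exactly 0 or 1, which always fits. */
unsigned char normalized_store(uint64_t flags)
{
    return !!(flags & SKETCH_FLAG_HIGH);
}
```

The alternative named in the commit, keeping the result in an unsigned long, avoids the truncation by making the destination wide enough instead.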