03 Apr, 2009

2 commits

  • Trivial cleanups for nbd: only the return -EIO one really changes code,
    and I've verified all the callers (plus 0 == success, 1 == error
    convention is really ugly).

    Signed-off-by: Pavel Machek
    Acked-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • The code was written to rely on the big kernel lock to protect it from
    races. It mostly works as long as the interface is not abused.

    So this uses tx_lock to protect data structures from concurrent use
    between ioctl and worker threads.

    Next step will be moving from ioctl to unlocked_ioctl.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: add missing return]
    Signed-off-by: Pavel Machek
    Acked-by: Paul Clements
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
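
    The locking idea described above can be modelled in userspace. This is
    only a sketch under stated assumptions: the NbdDevice class, its fields
    and its method names are hypothetical stand-ins, not the driver's code.
    It shows the sender path and the "clear socket" ioctl path taking the
    same lock, so the sender never sees the socket vanish mid-operation.

```python
import threading


class NbdDevice:
    """Hypothetical userspace model of the per-device state."""

    def __init__(self):
        self.tx_lock = threading.Lock()  # plays the role of tx_lock
        self.sock = "connected"          # placeholder for the socket
        self.sent = []                   # requests that were transmitted

    def send_request(self, req):
        """Worker-thread path: transmit only while holding the lock."""
        with self.tx_lock:
            if self.sock is None:
                return False             # socket already cleared
            self.sent.append(req)
            return True

    def clear_sock(self):
        """ioctl path: drop the socket under the same lock."""
        with self.tx_lock:
            self.sock = None
```

    Because both paths serialise on one lock, the check of self.sock and the
    send that follows it cannot be interleaved with clear_sock().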
     

12 Feb, 2009

1 commit

  • Fix a problem that causes I/O to a disconnected (or partially initialized)
    nbd device to hang indefinitely. To reproduce:

    # ioctl NBD_SET_SIZE_BLOCKS /dev/nbd23 514048
    # dd if=/dev/nbd23 of=/dev/null bs=4096 count=1

    ...hangs...

    This can also occur when an nbd device loses its nbd-client/server
    connection. Although we clear the queue of any outstanding I/Os after the
    client/server connection fails, any additional I/Os that get queued later
    will hang.

    This bug may also be the problem reported in this bug report:
    http://bugzilla.kernel.org/show_bug.cgi?id=12277

    Testing would need to be performed to determine if the two issues are the
    same.

    This problem was introduced by the new request handling thread code ("NBD:
    allow nbd to be used locally", 3/2008), which entered into mainline around
    2.6.25.

    The fix, which is fairly simple, is to restore the check for lo->sock
    being NULL in do_nbd_request. This causes I/O to an uninitialized nbd to
    immediately fail with an I/O error, as it did prior to the introduction of
    this bug.

    Signed-off-by: Paul Clements
    Reported-by: Jon Nelson
    Acked-by: Pavel Machek
    Cc: [2.6.26.x, 2.6.27.x, 2.6.28.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
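
    A minimal sketch of the restored check, written as a userspace model
    rather than kernel code. The NbdDevice class, the do_request() name and
    the use of a deque are assumptions made for illustration; only the idea
    (fail with an I/O error when no socket is attached) comes from the text
    above.

```python
import errno
from collections import deque


class NbdDevice:
    """Hypothetical stand-in for the driver's per-device state."""

    def __init__(self):
        self.sock = None        # no nbd-client attached yet
        self.waiting = deque()  # requests handed to the sender thread


def do_request(dev, req):
    """Return 0 if the request was queued, or a negative errno on failure."""
    if dev.sock is None:
        # Without a socket nothing will ever service the request,
        # so fail it immediately instead of queueing it forever.
        return -errno.EIO
    dev.waiting.append(req)
    return 0
```

    With the check in place, a read from a disconnected device returns an
    error at once; without it, the request would sit in the queue with no
    one left to complete it.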
     

16 Jan, 2009

1 commit

  • Running two nbd-clients on the same device at the same time is a bad
    idea, and causes a WARN_ON from nbd in 2.6.28-rc7 (from sysfs_add_one).
    This simply prevents that from happening.

    To reproduce:

    cat /dev/zero | head -c 10000000 > /tmp/delme.fstest.fs
    nbd-server 9100 -l /anyone.can.connect > /tmp/delme.fstest.fs &
    sleep 1
    nbd-client localhost 9100 /dev/nd0 &
    nbd-client localhost 9100 /dev/nd0 &

    Signed-off-by: Pavel Machek
    Acked-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
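
    One way such a guard can look, as a hedged userspace sketch. The commit
    message does not say how the second client is rejected; the EBUSY return
    value, the NbdDevice class and the set_sock() name are assumptions made
    for this example.

```python
import errno


class NbdDevice:
    """Hypothetical model: remembers which socket file is attached."""

    def __init__(self):
        self.file = None


def set_sock(dev, sock_file):
    """Attach a socket unless one is attached already."""
    if dev.file is not None:
        # A second client must not replace the first one's socket.
        return -errno.EBUSY
    dev.file = sock_file
    return 0
```

    The first caller wins and keeps its socket; any later caller gets an
    error until the device has been disconnected again.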
     

29 Dec, 2008

2 commits


21 Oct, 2008

2 commits

  • NB: nbd_ioctl() appears to be racy; BKL is held, but doesn't really
    help, AFAICS. Left as-is for now, but it'll need fixing.

    Signed-off-by: Al Viro

    Al Viro
     
  • To keep the size of changesets sane we split the switch by drivers;
    to keep the damn thing bisectable we do the following:
    1) rename the affected methods, add ones with correct
    prototypes, make (few) callers handle both. That's this changeset.
    2) for each driver convert to new methods. *ALL* drivers
    are converted in this series.
    3) kill the old (renamed) methods.

    Note that it _is_ a flagday; all in-tree drivers are converted and by the
    end of this series no trace of old methods remains. The only reason why
    we do it this way is to keep the damn thing bisectable and allow per-driver
    debugging if anything goes wrong.

    New methods:
    open(bdev, mode)
    release(disk, mode)
    ioctl(bdev, mode, cmd, arg) /* Called without BKL */
    compat_ioctl(bdev, mode, cmd, arg)
    locked_ioctl(bdev, mode, cmd, arg) /* Called with BKL, legacy */

    Signed-off-by: Al Viro

    Al Viro
     

20 Oct, 2008

1 commit

  • Tejun's commit 7b595756ec1f49e0049a9e01a1298d53a7faaa15 made sysfs
    attribute->owner unnecessary. But the field was left in the structure to
    ease the merge. It's been over a year since that change and it is now
    time to start killing attribute->owner along with its users - one arch at
    a time!

    This patch is attempt #1 to get rid of attribute->owner only for
    CONFIG_X86_64 or CONFIG_X86_32 . We will deal with other arches later on
    as and when possible - avr32 will be the next since that is something I
    can test. Compile (make allyesconfig / make allmodconfig / custom config)
    and boot tested.

    akpm: the idea is that we put the declaration of attribute.owner inside
    `#ifndef CONFIG_X86'. But that proved to be too ambitious for now because
    new usages kept on turning up in subsystem trees.

    [akpm: remove the ifdef for now]
    Signed-off-by: Parag Warudkar
    Cc: Greg KH
    Cc: Ingo Molnar
    Cc: Tejun Heo
    Cc: Len Brown
    Cc: Jens Axboe
    Cc: Jean Delvare
    Cc: Roland Dreier
    Cc: David Brownell
    Cc: Alessandro Zummo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Parag Warudkar
     

09 Oct, 2008

1 commit

  • Implement {disk|part}_to_dev() and use them to access generic device
    instead of directly dereferencing {disk|part}->dev. To make sure no
    user is left behind, rename generic devices fields to __dev.

    This is in preparation of unifying partition 0 handling with other
    partitions.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     

21 Aug, 2008

1 commit

  • We leak the memory allocated for the nbd_dev array at multiple places.
    Fix them by either adding a kfree() or by rearranging code to return
    before we allocate the memory.

    Signed-off-by: Sven Wegener
    Cc: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Wegener
     

29 Apr, 2008

5 commits

  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: Skip I/O merges when disabled
    block: add large command support
    block: replace sizeof(rq->cmd) with BLK_MAX_CDB
    ide: use blk_rq_init() to initialize the request
    block: use blk_rq_init() to initialize the request
    block: rename and export rq_init()
    block: no need to initialize rq->cmd with blk_get_request
    block: no need to initialize rq->cmd in prepare_flush_fn hook
    block/blk-barrier.c:blk_ordered_cur_seq() mustn't be inline
    block/elevator.c:elv_rq_merge_ok() mustn't be inline
    block: make queue flags non-atomic
    block: add dma alignment and padding support to blk_rq_map_kern
    unexport blk_max_pfn
    ps3disk: Remove superfluous cast
    block: make rq_init() do a full memset()
    relay: fix splice problem

    Linus Torvalds
     
  • Some drivers have duplicated unlikely() macros. IS_ERR() already has
    unlikely() in itself.

    This patch cleans up such pointless code.

    Signed-off-by: Hirofumi Nakagawa
    Acked-by: David S. Miller
    Acked-by: Jeff Garzik
    Cc: Paul Clements
    Cc: Richard Purdie
    Cc: Alessandro Zummo
    Cc: David Brownell
    Cc: James Bottomley
    Cc: Michael Halcrow
    Cc: Anton Altaparmakov
    Cc: Al Viro
    Cc: Carsten Otte
    Cc: Patrick McHardy
    Cc: Paul Mundt
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hirofumi Nakagawa
     
  • Permit the use of partitions with network block devices (NBD).

    A new parameter is introduced to define how many partitions we want to be
    able to manage per network block device. This parameter is "max_part".

    For instance, to manage 63 partitions per nbd device, we will do:

    [on the server side]
    # nbd-server 1234 /dev/sdb
    [on the client side]
    # modprobe nbd max_part=63
    # ls -l /dev/nbd*
    brw-rw---- 1 root disk 43, 0 2008-03-25 11:14 /dev/nbd0
    brw-rw---- 1 root disk 43, 64 2008-03-25 11:11 /dev/nbd1
    brw-rw---- 1 root disk 43, 640 2008-03-25 11:11 /dev/nbd10
    brw-rw---- 1 root disk 43, 704 2008-03-25 11:11 /dev/nbd11
    brw-rw---- 1 root disk 43, 768 2008-03-25 11:11 /dev/nbd12
    brw-rw---- 1 root disk 43, 832 2008-03-25 11:11 /dev/nbd13
    brw-rw---- 1 root disk 43, 896 2008-03-25 11:11 /dev/nbd14
    brw-rw---- 1 root disk 43, 960 2008-03-25 11:11 /dev/nbd15
    brw-rw---- 1 root disk 43, 128 2008-03-25 11:11 /dev/nbd2
    brw-rw---- 1 root disk 43, 192 2008-03-25 11:11 /dev/nbd3
    brw-rw---- 1 root disk 43, 256 2008-03-25 11:11 /dev/nbd4
    brw-rw---- 1 root disk 43, 320 2008-03-25 11:11 /dev/nbd5
    brw-rw---- 1 root disk 43, 384 2008-03-25 11:11 /dev/nbd6
    brw-rw---- 1 root disk 43, 448 2008-03-25 11:11 /dev/nbd7
    brw-rw---- 1 root disk 43, 512 2008-03-25 11:11 /dev/nbd8
    brw-rw---- 1 root disk 43, 576 2008-03-25 11:11 /dev/nbd9
    # nbd-client localhost 1234 /dev/nbd0
    Negotiation: ..size = 80418240KB
    bs=1024, sz=80418240

    -------NOTE, RFC: partition table is not automatically read.
    The driver sets bdev->bd_invalidated to 1 to force the read of the partition
    table of the device, but this is done only on an open of the device.
    So we have to do a "touch /dev/nbdX" or something like that.
    It can't be done from the nbd-client or nbd driver because at this
    level we can't ask to read the partition table and to serve the request
    at the same time (-> deadlock)

    If someone has a better idea, I'm open to any suggestion.
    -------NOTE, RFC

    # fdisk -l /dev/nbd0

    Disk /dev/nbd0: 82.3 GB, 82348277760 bytes
    255 heads, 63 sectors/track, 10011 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Device Boot Start End Blocks Id System
    /dev/nbd0p1 * 1 9965 80043831 83 Linux
    /dev/nbd0p2 9966 10011 369495 5 Extended
    /dev/nbd0p5 9966 10011 369463+ 82 Linux swap / Solaris

    # ls -l /dev/nbd0*
    brw-rw---- 1 root disk 43, 0 2008-03-25 11:16 /dev/nbd0
    brw-rw---- 1 root disk 43, 1 2008-03-25 11:16 /dev/nbd0p1
    brw-rw---- 1 root disk 43, 2 2008-03-25 11:16 /dev/nbd0p2
    brw-rw---- 1 root disk 43, 5 2008-03-25 11:16 /dev/nbd0p5
    # mount /dev/nbd0p1 /mnt
    # ls /mnt
    bin dev initrd lost+found opt sbin sys var
    boot etc initrd.img media proc selinux tmp vmlinuz
    cdrom home lib mnt root srv usr
    # umount /mnt
    # nbd-client -d /dev/nbd0
    # ls -l /dev/nbd0*
    brw-rw---- 1 root disk 43, 0 2008-03-25 11:16 /dev/nbd0
    -------NOTE
    On "nbd-client -d", we can do an ioctl(BLKRRPART) to update partition table:
    as the size of the device is 0, we don't have to serve the partition manager
    request (-> no deadlock).
    -------NOTE

    Signed-off-by: Paul Clements
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laurent Vivier
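
    The minor numbers in the listing above (0, 64, 128, ..., 960) and the
    sizes printed by nbd-client and fdisk can be reproduced with a little
    arithmetic. This sketch assumes the driver reserves a power-of-two block
    of minors per device by shifting the device index left by the bit length
    of max_part; the function names are made up for the example.

```python
def part_shift(max_part):
    """Bits needed to hold max_part (63 needs 6 bits)."""
    return max_part.bit_length()


def first_minor(index, max_part):
    """First minor number of /dev/nbd<index> under the assumption above."""
    return index << part_shift(max_part)


def size_kb_to_bytes(size_kb):
    """nbd-client reports the size in KB; fdisk reports it in bytes."""
    return size_kb * 1024
```

    With max_part=63 each device owns 64 minors, so /dev/nbd1 starts at 64
    and /dev/nbd10 at 640, matching the ls output. Likewise 80418240 KB is
    82348277760 bytes, the figure fdisk prints for /dev/nbd0.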
     
  • This patch allows Network Block Device to be mounted locally (nbd-client to
    nbd-server over 127.0.0.1).

    It creates a kthread to avoid the deadlock described in NBD tools
    documentation. So, if nbd-client hangs waiting for pages, the kblockd thread
    can continue its work and free pages.

    I have tested the patch to verify that it avoids the hang that always occurs
    when writing to a localhost nbd connection. I have also tested to verify that
    no performance degradation results from the additional thread and queue.

    Patch originally from Laurent Vivier.

    Signed-off-by: Paul Clements
    Signed-off-by: Laurent Vivier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laurent Vivier
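
    The thread-plus-queue arrangement can be pictured with a small userspace
    analogue. It is a sketch only: start_sender(), the None stop marker and
    the send callback are hypothetical, and Python threads stand in for the
    kernel thread mentioned above.

```python
import queue
import threading


def start_sender(send):
    """Start a worker that transmits queued requests via send(req)."""
    pending = queue.Queue()

    def worker():
        while True:
            req = pending.get()
            if req is None:   # stop marker, an assumption of this sketch
                break
            send(req)         # may block; the submitter does not wait on it

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return pending, thread
```

    The submitting side only enqueues and returns, so it is never the one
    stuck inside a blocking network send.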
     
  • Any path needs to call it to initialize the request.

    This is a preparation for large command support, which needs to
    initialize the request in a proper way (that is, just doing a memset()
    will not work).

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

03 Apr, 2008

1 commit

  • NBD does not protect the nbd_device's socket from becoming NULL during
    receives.

    This closes a race with the NBD_CLEAR_SOCK ioctl (nbd-client -d) setting
    the nbd_device's socket to NULL right before NBD calls sock_xmit.

    Signed-off-by: Mike Snitzer
    Cc: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Snitzer
     

24 Feb, 2008

1 commit

  • NBD doesn't work well with CFQ (or AS) schedulers, so let's default to
    something else.

    The two problems I have experienced with nbd and cfq are:

    1) nbd hangs with cfq on RHEL 5 (2.6.18) -- this may well have been
    fixed

    There's a similar debian bug that has been filed as well:

    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=447638

    There have been posts to nbd-general mailing list about problems with
    cfq and nbd also.

    2) nbd performs about 10% better (the last time I tested) with deadline
    vs. cfq (the overhead of cfq doesn't provide much advantage to nbd [not
    being a real disk], and you end up going through the I/O scheduler on
    the nbd server anyway, so it makes sense that deadline is better with
    nbd)

    Signed-off-by: Paul Clements
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
     

09 Feb, 2008

1 commit


28 Jan, 2008

1 commit


25 Jan, 2008

1 commit

  • This moves the block devices to /sys/class/block. It will create a
    flat list of all block devices, with the disks and partitions in one
    directory. For compatibility /sys/block is created and contains symlinks
    to the disks.

    /sys/class/block
    |-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
    |-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1
    |-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10
    |-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5
    |-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6
    |-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7
    |-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8
    |-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9
    `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

    /sys/block/
    |-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
    `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

13 Nov, 2007

1 commit

  • ...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers.

    Looking at the sock->op->shutdown() handlers, it looks as if all of them
    take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the
    RCV_SHUTDOWN/SEND_SHUTDOWN arguments.
    Add a helper, and then define the SHUT_* enum to ensure that kernel users
    of shutdown() don't get confused.

    Signed-off-by: Trond Myklebust
    Acked-by: Mark Fasheh
    Acked-by: David Howells
    Signed-off-by: David S. Miller

    Trond Myklebust
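
    The confusion is easy to see once the numeric values are written out.
    The sketch below defines the constants locally (SHUT_* are 0, 1, 2 and
    the shutdown state bits are 1 and 2); the how_to_mask() helper is a
    hypothetical name used only to show how one set maps onto the other.

```python
# "how" values accepted by shutdown()
SHUT_RD, SHUT_WR, SHUT_RDWR = 0, 1, 2

# Bit flags describing which directions are already shut down
RCV_SHUTDOWN, SEND_SHUTDOWN = 1, 2


def how_to_mask(how):
    """Translate a SHUT_* argument into the RCV/SEND bit mask."""
    if how not in (SHUT_RD, SHUT_WR, SHUT_RDWR):
        raise ValueError("invalid shutdown argument: %r" % (how,))
    return how + 1  # 0 -> 1 (receive), 1 -> 2 (send), 2 -> 3 (both)
```

    A caller that passes SEND_SHUTDOWN (2) where a SHUT_* value is expected
    is really asking for SHUT_RDWR, and one that passes
    RCV_SHUTDOWN | SEND_SHUTDOWN (3) is passing a value that is not a valid
    argument at all.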
     

20 Oct, 2007

2 commits

  • Signed-off-by: Denis Cheng
    Signed-off-by: Adrian Bunk

    Denis Cheng
     
  • The task_struct->pid member is going to be deprecated, so start
    using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
    the kernel.

    The first thing to start with is the pid, printed to dmesg - in
    this case we may safely use task_pid_nr(). Besides, printks produce
    more (much more) than a half of all the explicit pid usage.

    [akpm@linux-foundation.org: git-drm went and changed lots of stuff]
    Signed-off-by: Pavel Emelyanov
    Cc: Dave Airlie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

17 Oct, 2007

4 commits

  • Allow NBD I/O to be cancelled when a network outage occurs. Previously, I/O
    would just hang, and if enough I/O was hung in nbd, the system (at least
    user-level) would completely hang until a TCP timeout (default, 15 minutes)
    occurred.

    The patch introduces a new ioctl NBD_SET_TIMEOUT that allows a transmit
    timeout value (in seconds) to be specified. Any network send that exceeds the
    timeout will be cancelled and the nbd connection will be shut down. I've
    tested with various timeout values and 6 seconds seems to be a good choice for
    the timeout. If the NBD_SET_TIMEOUT ioctl is not called, you get the old (I/O
    hang) behavior.

    Signed-off-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
     
  • This fixes errors with utilities (such as LVM's vgscan) that try to scan all
    devices. Previously this would generate read errors when uninitialized nbd
    devices were scanned:

    # vgscan
    Reading all physical volumes. This may take a while...
    /dev/nbd0: read failed after 0 of 1024 at 0: Input/output error
    /dev/nbd0: read failed after 0 of 1024 at 509804544: Input/output error
    /dev/nbd0: read failed after 0 of 2048 at 0: Input/output error
    /dev/nbd1: read failed after 0 of 1024 at 509804544: Input/output error
    /dev/nbd1: read failed after 0 of 2048 at 0: Input/output error

    From now on, uninitialized nbd devices will have size zero, which
    prevents these errors.

    Signed-off-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
     
  • This memcpy looks strange; it is in fact merely a pointer dereference. So
    I changed the parameter's type to refer to the value more directly, which
    makes the memcpy unnecessary.

    In nbd_read_stat(), the only place where nbd_find_request() is called, the
    argument passed is changed accordingly.

    Signed-off-by: Denis Cheng
    Cc: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denis Cheng
     
  • The loop may delete nodes while traversing the list, so use the safe
    version.

    Signed-off-by: Denis Cheng
    Cc: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denis Cheng
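
    The same hazard exists in any language where a container is modified
    while it is being walked. The sketch below is a Python analogue, not the
    kernel's list macros: in C the unsafe walk would follow a pointer out of
    a node that was just removed, while in Python it silently skips entries.
    Both function names are invented for the example.

```python
def clear_queue_unsafe(requests, fail):
    """Buggy: removes entries from the list it is iterating over."""
    for req in requests:
        requests.remove(req)
        fail(req)


def clear_queue_safe(requests, fail):
    """Walk a snapshot, so removals cannot disturb the traversal."""
    for req in list(requests):
        requests.remove(req)
        fail(req)
```

    Only the safe variant visits every request and leaves the queue empty.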
     

10 Oct, 2007

2 commits


24 Jul, 2007

1 commit

  • Some of the code has been gradually transitioned to using the proper
    struct request_queue, but there's lots left. So do a full sweep of
    the kernel and get rid of this typedef and replace its uses with
    the proper type.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 Jul, 2007

1 commit


10 Jul, 2007

1 commit

  • - I have unearthed very old bugs in stale drivers that still
    used request->cmd as a READ|WRITE int
    - This patch is maybe a proof that these drivers have not been
    used for a long time. Should they be removed completely?

    Drivers that currently do not work for sure:
    drivers/acorn/block/fd1772.c | 2 +-
    drivers/acorn/block/mfmhd.c | 8 ++++----
    drivers/cdrom/aztcd.c | 2 +-
    drivers/cdrom/cm206.c | 2 +-
    drivers/cdrom/gscd.c | 2 +-
    drivers/cdrom/mcdx.c | 2 +-
    drivers/cdrom/optcd.c | 2 +-
    drivers/cdrom/sjcd.c | 2 +-

    Drivers with cosmetic fixes only:
    b/drivers/block/amiflop.c
    b/drivers/block/nbd.c
    b/drivers/ide/legacy/hd.c

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Boaz Harrosh
     

10 May, 2007

1 commit


09 Dec, 2006

1 commit


08 Dec, 2006

1 commit

  • Allow nbd to expose the nbd-client daemon's PID in /sys/block/nbd/pid.

    This is helpful for tracking connection status of a device and for
    determining which nbd devices are currently in use.

    Signed-off-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
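
    A small helper shows how a script might use the attribute. It is a
    sketch under assumptions: the per-device path /sys/block/nbdN/pid and the
    idea that the file is absent while no client is attached are not stated
    in the message above, and nbd_client_pid() is a hypothetical name.

```python
from pathlib import Path


def nbd_client_pid(device="nbd0", sysfs_root="/sys/block"):
    """Return the nbd-client PID for a device, or None if there is none."""
    pid_file = Path(sysfs_root) / device / "pid"
    try:
        return int(pid_file.read_text().strip())
    except FileNotFoundError:
        # Assumed to mean: no client is currently serving this device.
        return None
```

    Looping over nbd0, nbd1, ... and keeping the devices for which this
    returns a PID gives the list of devices currently in use.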
     

01 Oct, 2006

1 commit

  • Right now ->flags is a bit of a mess: some are request types, and
    others are just modifiers. Clean this up by splitting it into
    ->cmd_type and ->cmd_flags. This allows introduction of generic
    Linux block message types, useful for sending generic Linux commands
    to block devices.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

01 Aug, 2006

2 commits

  • When reading from an nbd device, we need to receive all the data after
    receiving the reply packet from the server; otherwise the request will
    never be ended.

    If socket is closed right after accepting reply control packet and in the
    middle of waiting for read data, nbd_read_stat() returns NULL and
    nbd_end_request() is not called.

    This patch fixes it.

    Signed-off-by: Michal Feix
    Acked-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Feix
     
  • We should check the magic sequence in the reply packet before trying to
    find the request by its request handle. This also solves the problem with
    the "Unexpected reply" message being logged when a packet with invalid
    magic is received.

    Signed-off-by: Michal Feix
    Acked-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Feix
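
    The order of checks can be sketched with a userspace parser. The layout
    assumed here is the classic NBD reply header (32-bit magic 0x67446698,
    32-bit error, 8-byte handle, big-endian); parse_reply() and the pending
    dictionary are hypothetical names for the example.

```python
import struct

NBD_REPLY_MAGIC = 0x67446698
REPLY_FORMAT = ">II8s"  # magic, error code, request handle


def parse_reply(packet, pending):
    """Validate the magic first, then look the handle up in pending."""
    magic, error, handle = struct.unpack(REPLY_FORMAT, packet)
    if magic != NBD_REPLY_MAGIC:
        # Garbage on the wire: do not even try to match a request.
        raise ValueError("wrong reply magic 0x%08x" % magic)
    if handle not in pending:
        raise KeyError("unexpected reply for handle %r" % (handle,))
    return pending[handle], error
```

    Checking the magic first means a corrupted packet is reported as such,
    rather than as a reply to a request nobody sent.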
     

02 Jul, 2006

1 commit