04 Jan, 2012

1 commit


15 Sep, 2011

1 commit


07 May, 2011

2 commits


13 Nov, 2010

1 commit

  • Over time, the block layer has accumulated a set of APIs dealing with bdev
    open, close, claim and release.

    * blkdev_get/put() are the primary open and close functions.

    * bd_claim/release() deal with exclusive open.

    * open/close_bdev_exclusive() are combinations of open and claim and
    the other way around, respectively.

    * bd_link/unlink_disk_holder() to create and remove holder/slave
    symlinks.

    * open_by_devnum() wraps bdget() + blkdev_get().

    The interface is a bit confusing and the decoupling of open and claim
    makes it impossible to properly guarantee exclusive access as
    in-kernel open + claim sequence can disturb the existing exclusive
    open even before the block layer knows the current open is for another
    exclusive access. Reorganize the interface such that,

    * blkdev_get() is extended to include exclusive access management.
    @holder argument is added and, if @FMODE_EXCL is specified, it will
    gain exclusive access atomically w.r.t. other exclusive accesses.

    * blkdev_put() is similarly extended. It now takes @mode argument and
    if @FMODE_EXCL is set, it releases an exclusive access. Also, when
    the last exclusive claim is released, the holder/slave symlinks are
    removed automatically.

    * bd_claim/release() and close_bdev_exclusive() are no longer
    necessary and either made static or removed.

    * bd_link_disk_holder() remains the same but bd_unlink_disk_holder()
    is no longer necessary and removed.

    * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev()
    and blkdev_get(). It also has an unexpected extra bdev_read_only()
    test which probably should be moved into blkdev_get().

    * open_by_devnum() is modified to take @holder argument and pass it to
    blkdev_get().

    Most of bdev open/close operations are unified into blkdev_get/put()
    and most exclusive accesses are tested atomically at the open time (as
    it should). This cleans up code and removes some, both valid and
    invalid, but unnecessary all the same, corner cases.

    open_bdev_exclusive() and open_by_devnum() can use further cleanup -
    rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop
    special features. Well, let's leave them for another day.

    Most conversions are straight-forward. drbd conversion is a bit more
    involved as there was some reordering, but the logic should stay the
    same.

    Signed-off-by: Tejun Heo
    Acked-by: Neil Brown
    Acked-by: Ryusuke Konishi
    Acked-by: Mike Snitzer
    Acked-by: Philipp Reisner
    Cc: Peter Osterlund
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Jan Kara
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Cc: "Theodore Ts'o"
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Alex Elder
    Cc: Christoph Hellwig
    Cc: dm-devel@redhat.com
    Cc: drbd-dev@lists.linbit.com
    Cc: Leo Chen
    Cc: Scott Branden
    Cc: Chris Mason
    Cc: Steven Whitehouse
    Cc: Dave Kleikamp
    Cc: Joern Engel
    Cc: reiserfs-devel@vger.kernel.org
    Cc: Alexander Viro

    Tejun Heo
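
The atomic open-plus-claim behaviour described in this commit can be illustrated with a small userspace model. This is only a sketch: `struct model_bdev`, `model_blkdev_get()`, `model_blkdev_put()`, `MODEL_FMODE_EXCL` and the `-EBUSY`/`-EINVAL` return values are hypothetical stand-ins chosen for the example, not the kernel's definitions, and the model is single-threaded where the kernel would need locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; not the kernel's FMODE_EXCL value. */
#define MODEL_FMODE_EXCL 0x80u

/* Hypothetical userspace stand-in for a block device. */
struct model_bdev {
    int openers;         /* number of successful opens */
    int excl_claims;     /* exclusive opens held by the current holder */
    const void *holder;  /* current exclusive holder, NULL if none */
};

/* Open the device and, if MODEL_FMODE_EXCL is set, claim it in the same step. */
static int model_blkdev_get(struct model_bdev *bdev, unsigned int mode,
                            const void *holder)
{
    if (mode & MODEL_FMODE_EXCL) {
        if (holder == NULL)
            return -EINVAL;
        /* Another holder owns the device: fail before the open happens. */
        if (bdev->holder != NULL && bdev->holder != holder)
            return -EBUSY;
        bdev->holder = holder;
        bdev->excl_claims++;
    }
    bdev->openers++;
    return 0;
}

/* Close the device; with MODEL_FMODE_EXCL also drop one exclusive claim. */
static void model_blkdev_put(struct model_bdev *bdev, unsigned int mode)
{
    assert(bdev->openers > 0);
    bdev->openers--;
    if (mode & MODEL_FMODE_EXCL) {
        assert(bdev->excl_claims > 0);
        if (--bdev->excl_claims == 0)
            bdev->holder = NULL;  /* last exclusive claim released */
    }
}
```

Because the conflict check happens inside the open call, a second exclusive opener is rejected without ever disturbing the first one, which is the property the separate open and claim steps could not guarantee.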
     

23 Oct, 2010

1 commit

  • * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
    vfs: make no_llseek the default
    vfs: don't use BKL in default_llseek
    llseek: automatically add .llseek fop
    libfs: use generic_file_llseek for simple_attr
    mac80211: disallow seeks in minstrel debug code
    lirc: make chardev nonseekable
    viotape: use noop_llseek
    raw: use explicit llseek file operations
    ibmasmfs: use generic_file_llseek
    spufs: use llseek in all file operations
    arm/omap: use generic_file_llseek in iommu_debug
    lkdtm: use generic_file_llseek in debugfs
    net/wireless: use generic_file_llseek in debugfs
    drm: use noop_llseek

    Linus Torvalds
     

19 Oct, 2010

1 commit

  • RAW_SETBIND and RAW_GETBIND 32bit versions are fscked in interesting ways.

    1) fs/compat_ioctl.c has COMPATIBLE_IOCTL(RAW_SETBIND) followed by
    HANDLE_IOCTL(RAW_SETBIND, raw_ioctl). The latter is ignored.

    2) on amd64 (and itanic) the damn thing is broken - we have int + u64 + u64
    and layouts on i386 and amd64 are _not_ the same. raw_ioctl() would
    work there, but it's never called due to (1). As it is, i386 /sbin/raw
    definitely doesn't work on amd64 boxen.

    3) switching to raw_ioctl() as is would *not* work on e.g. sparc64 and ppc64,
    which would be rather sad, seeing that normal userland there is 32bit.
    The thing is, slapping __packed on the struct in question does not DTRT -
    it eliminates *all* padding. The real solution is to use compat_u64.

    4) of course, all that stuff has no business being outside of raw.c in the
    first place - there should be ->compat_ioctl() for /dev/rawctl instead of
    messing with compat_ioctl.c.

    [akpm@linux-foundation.org: coding-style fixes]
    [arnd@arndb.de: port to 2.6.36]
    Signed-off-by: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Arnd Bergmann

    Al Viro
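
The layout problem in points 2 and 3 can be reproduced in userspace. The sketch below assumes a 64-bit (LP64) host and a GCC or Clang compiler; the struct and type names are hypothetical models of the int + u64 + u64 request, and `model_compat_u64` only imitates the idea of a 64-bit type with 4-byte alignment rather than quoting the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout on an LP64 host: uint64_t is 8-byte aligned, so 4 bytes
 * of padding follow raw_minor. */
struct model_req_native {
    int      raw_minor;
    uint64_t block_major;
    uint64_t block_minor;
};

/* Model of a compat_u64-style type: 64 bits wide but only 4-byte aligned,
 * as on the i386 ABI (the attribute is a GCC/Clang extension). */
typedef uint64_t model_compat_u64 __attribute__((aligned(4)));

/* Layout an i386 binary would use for the same request. */
struct model_req_i386 {
    int              raw_minor;
    model_compat_u64 block_major;
    model_compat_u64 block_minor;
};

/* packed strips all padding, whatever the 32-bit ABI actually does. */
struct model_req_packed {
    int      raw_minor;
    uint64_t block_major;
    uint64_t block_minor;
} __attribute__((packed));

/* Sanity checks that hold on any host where int is narrower than 8 bytes. */
static void model_check_layouts(void)
{
    assert(sizeof(struct model_req_packed) ==
           sizeof(int) + 2 * sizeof(uint64_t));
    assert(sizeof(struct model_req_i386) <= sizeof(struct model_req_native));
}
```

On such a host the native struct is 24 bytes with `block_major` at offset 8, while the 4-byte-aligned model is 20 bytes with `block_major` at offset 4. The packed variant is also 20 bytes, which matches i386 but would be wrong for a 32-bit ABI that aligns 64-bit integers to 8 bytes, and that is why packing alone does not do the right thing everywhere.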
     

16 Sep, 2010

1 commit


17 May, 2010

1 commit

  • These are the last remaining device drivers using
    the ->ioctl file operation in the drivers directory
    (except for v4l drivers).

    [fweisbec: drop i8k pushdown as it has been done from
    procfs pushdown branch already]

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Frederic Weisbecker

    Arnd Bergmann
     

07 Apr, 2010

2 commits

  • Requested by hch, for consistency now that it is exported.

    Cc: Alexander Viro
    Cc: Anton Blanchard
    Cc: Christoph Hellwig
    Cc: Jan Kara
    Cc: Jeff Moyer
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Commit 148f948ba877f4d3cdef036b1ff6d9f68986706a (vfs: Introduce new
    helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke
    the raw driver.

    We now call through generic_file_aio_write -> generic_write_sync ->
    vfs_fsync_range. vfs_fsync_range has:

            if (!fop || !fop->fsync) {
                    ret = -EINVAL;
                    goto out;
            }

    But drivers/char/raw.c doesn't set an fsync method.

    We have two options: fix it or remove the raw driver completely. I'm
    happy to do either, the fact this has been broken for so long suggests it
    is rarely used.

    The patch below adds an fsync method to the raw driver. My knowledge of
    the block layer is pretty sketchy so this could do with a once over.

    If we instead decide to remove the raw driver, this patch might still be
    useful as a backport to 2.6.33 and 2.6.32.

    Signed-off-by: Anton Blanchard
    Reviewed-by: Jan Kara
    Cc: Christoph Hellwig
    Cc: Alexander Viro
    Cc: Jens Axboe
    Reviewed-by: Jeff Moyer
    Tested-by: Jeff Moyer
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
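
The failure mode in this commit can be modelled outside the kernel. In the sketch below, `struct model_fops`, `struct model_file`, `model_fsync_range()` and `model_raw_fsync()` are hypothetical names invented for the example; only the shape of the NULL check is taken from the snippet quoted in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_file;

/* Hypothetical method table with a single optional operation. */
struct model_fops {
    int (*fsync)(struct model_file *file);
};

struct model_file {
    const struct model_fops *fop;
    int synced;  /* set once the model "flushes" the file */
};

/* Same shape as the quoted check: a missing fsync method yields -EINVAL. */
static int model_fsync_range(struct model_file *file)
{
    const struct model_fops *fop;

    assert(file != NULL);
    fop = file->fop;
    if (!fop || !fop->fsync)
        return -EINVAL;
    return fop->fsync(file);
}

/* Stand-in for an fsync method that flushes the bound block device. */
static int model_raw_fsync(struct model_file *file)
{
    file->synced = 1;
    return 0;
}

/* Method table without fsync (the broken state) and with it (the fix). */
static const struct model_fops model_raw_fops_before = { .fsync = NULL };
static const struct model_fops model_raw_fops_after  = { .fsync = model_raw_fsync };
```

With the first table every synchronous write path that reaches the check fails with `-EINVAL`; supplying any fsync method, as the patch does for the raw driver, lets the call go through.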
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

20 Sep, 2009

1 commit

  • This allows subsystems to provide devtmpfs with non-default permissions
    for the device node. Instead of the default mode of 0600, null, zero,
    random, urandom, full, tty, ptmx now have a mode of 0666, which allows
    non-privileged processes to access standard device nodes in case no
    other userspace process applies the expected permissions.

    This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complaint.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
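
The permission rule described in this commit amounts to a per-device lookup with a restrictive default. The function below is a userspace sketch only: `model_devnode_mode()` is a hypothetical helper, and the real change supplies the mode through the driver core rather than through a name table like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the mode the commit message assigns to a device node name. */
static unsigned int model_devnode_mode(const char *name)
{
    /* Nodes the commit message lists as world-accessible. */
    static const char *const world_rw[] = {
        "null", "zero", "random", "urandom", "full", "tty", "ptmx",
    };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(world_rw) / sizeof(world_rw[0]); i++) {
        if (strcmp(name, world_rw[i]) == 0)
            return 0666;
    }
    return 0600;  /* default mode for every other node */
}
```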
     

14 Sep, 2009

1 commit

  • generic_file_aio_write_nolock() is now used only by block devices and the raw
    character device. Filesystems should use __generic_file_aio_write() in case
    generic_file_aio_write() doesn't suit them. So rename the function to
    blkdev_aio_write() and move it to fs/block_dev.c.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

16 Jun, 2009

1 commit


23 May, 2009

1 commit

  • Until now we have had a 1:1 mapping between storage device physical
    block size and the logical block size used when addressing the device.
    With SATA 4KB drives coming out that will no longer be the case. The
    sector size will be 4KB but the logical block size will remain
    512-bytes. Hence we need to distinguish between the physical block size
    and the logical ditto.

    This patch renames hardsect_size to logical_block_size.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
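
The distinction this commit introduces can be made concrete with two alignment checks. The sketch assumes a hypothetical `struct model_limits` carrying both sizes; the commit itself only renames hardsect_size to logical_block_size, so the physical-size field and both helper functions are illustrative additions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of block sizes for one device, in bytes. */
struct model_limits {
    unsigned int logical_block_size;   /* unit used to address the device */
    unsigned int physical_block_size;  /* unit the medium stores internally */
};

/* True if offset and length are whole multiples of the logical block size. */
static bool model_io_is_addressable(const struct model_limits *lim,
                                    uint64_t offset, uint64_t len)
{
    assert(lim->logical_block_size != 0);
    return offset % lim->logical_block_size == 0 &&
           len % lim->logical_block_size == 0;
}

/* True if the request also avoids touching partial physical blocks. */
static bool model_io_is_physically_aligned(const struct model_limits *lim,
                                           uint64_t offset, uint64_t len)
{
    assert(lim->physical_block_size != 0);
    return offset % lim->physical_block_size == 0 &&
           len % lim->physical_block_size == 0;
}
```

For a drive with 512-byte logical and 4KB physical blocks, a 512-byte request at offset 512 is addressable but not physically aligned, which is exactly the case a single block-size field could not express.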
     

28 Mar, 2009

1 commit


21 Oct, 2008

3 commits


17 Oct, 2008

1 commit


22 Jul, 2008

1 commit


21 Jun, 2008

1 commit


13 Oct, 2007

1 commit


12 Feb, 2007

1 commit

  • Minor number 0 (under the raw major) is reserved for the rawctl device
    file, which is used to query, set, and unset raw device bindings. However,
    the ioctl interface does not protect the user from specifying a raw device
    with minor number 0:

    $ sudo ./raw /dev/raw/raw0 /dev/VolGroup00/swap
    /dev/raw/raw0: bound to major 253, minor 2
    $ ls -l /dev/rawctl
    ls: /dev/rawctl: No such file or directory
    $ ls -l /dev/raw/raw0
    crw------- 1 root root 162, 0 Jan 12 10:51 /dev/raw/raw0
    $ sudo ./raw -qa
    Cannot open master raw device '/dev/rawctl' (No such file or directory)

    As you can see, this prevents any further raw operations from
    succeeding. The fix (from Steve Fernandez) is quite simple--do not
    allow the allocation of minor number 0.

    Signed-off-by: Jeff Moyer
    Cc: Steven Fernandez
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
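
The fix described in this commit is a range check on the requested minor. The sketch below is a userspace model: `MODEL_MAX_RAW_MINORS`, `model_check_raw_minor()` and `model_bind()` are hypothetical names, and the limit of 256 is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_MAX_RAW_MINORS 256  /* assumed table size for this sketch */

/* One flag per minor: nonzero once a binding exists. */
static int model_bound[MODEL_MAX_RAW_MINORS];

/* Minor 0 is the control node, so only 1..MODEL_MAX_RAW_MINORS-1 are valid. */
static int model_check_raw_minor(int raw_minor)
{
    if (raw_minor <= 0 || raw_minor >= MODEL_MAX_RAW_MINORS)
        return -EINVAL;
    return 0;
}

/* Record a binding only after the minor has passed the range check. */
static int model_bind(int raw_minor)
{
    int err = model_check_raw_minor(raw_minor);

    if (err)
        return err;
    assert(raw_minor > 0 && raw_minor < MODEL_MAX_RAW_MINORS);
    model_bound[raw_minor] = 1;
    return 0;
}
```

Rejecting 0 with the same test that rejects out-of-range values keeps the control node's slot from ever being treated as an ordinary binding.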
     

09 Dec, 2006

1 commit


02 Dec, 2006

1 commit


01 Oct, 2006

3 commits

  • This patch cleans up generic_file_*_read/write() interfaces. Christoph
    Hellwig gave me the idea for these cleanups.

    In a nutshell, all filesystems should set .aio_read/.aio_write methods and use
    do_sync_read/ do_sync_write() as their .read/.write methods. This allows us
    to clean up all variants of generic_file_* routines.

    Final available interfaces:

    generic_file_aio_read() - read handler
    generic_file_aio_write() - write handler
    generic_file_aio_write_nolock() - no lock write handler

    __generic_file_aio_write_nolock() - internal worker routine

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     
  • This patch removes readv() and writev() methods and replaces them with
    aio_read()/aio_write() methods.

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     
  • This patch vectorizes aio_read() and aio_write() methods to prepare for
    collapsing all aio & vectored operations into one interface - which is
    aio_read()/aio_write().

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Cc: Michael Holzheu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     

30 Sep, 2006

1 commit


04 Jul, 2006

1 commit

  • Mark the static struct file_operations in drivers/char as const. Making
    them const prevents accidental bugs, and moves them to the .rodata section
    so that they no longer do any false sharing; in addition with the proper
    debug option they are then protected against corruption.

    [akpm@osdl.org: build fix]
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

27 Jun, 2006

3 commits


23 Mar, 2006

1 commit


29 Oct, 2005

1 commit


21 Jun, 2005

1 commit


19 May, 2005

1 commit

  • Don't pass meaningless file handles to block device ioctls.

    The recent raw IO ioctl-passthrough fix started passing the raw file
    handle into the block device ioctl handler. That's unlikely to be
    useful, as the file handle is actually open on a character-mode raw
    device, not a block device, so dereferencing it is not going to yield
    useful results to a block device ioctl handler.

    Previously we just passed NULL; also not a value that can usefully
    be dereferenced, but at least if it does happen, we'll oops instead of
    silently pretending that the file is a block device, so NULL is the more
    defensive option here. This patch reverts to that behaviour.

    Noticed by Al Viro.

    Signed-off-by: Stephen Tweedie
    Acked-by: Al Viro
    Signed-off-by: Linus Torvalds

    Stephen Tweedie
     

17 May, 2005

1 commit

  • [Patch] Fix raw device ioctl pass-through

    Raw character devices are supposed to pass ioctls through to the block
    devices they are bound to. Unfortunately, they are using the wrong
    function for this: ioctl_by_bdev(), instead of blkdev_ioctl().

    ioctl_by_bdev() performs a set_fs(KERNEL_DS) before calling the ioctl,
    redirecting the user-space buffer access to the kernel address space.
    This is, needless to say, a bad thing.

    This was noticed first on s390, where raw IO was non-functioning. The
    s390 driver config does not actually allow raw IO to be enabled, which
    was the first part of the problem. Secondly, the s390 kernel address
    space is distinct from user, causing legal raw ioctls to fail. I've
    reproduced this on a kernel built with 4G:4G split on x86, which fails
    in the same way (-EFAULT if the address does not exist kernel-side;
    returns success without actually populating the user buffer if it does.)

    The patch below fixes both the config and address-space problems. It's
    based closely on a patch by Jan Glauber, which has
    been tested on s390 at IBM. I've tested it on x86 4G:4G (split address
    space) and x86_64 (common address space).

    Kernel-address-space access has been assigned CAN-2005-1264.

    Signed-off-by: Stephen Tweedie
    Signed-off-by: Dave Jones
    Signed-off-by: Greg Kroah-Hartman

    Stephen Tweedie