15 Jan, 2012

2 commits

  • Linux allows executing the SG_IO ioctl on a partition or LVM volume, and
    will pass the command to the underlying block device. This is
    well-known, but it is also a large security problem when (via Unix
    permissions, ACLs, SELinux or a combination thereof) a program or user
    needs to be granted access only to part of the disk.

    This patch lets partitions forward a small set of harmless ioctls;
    others are logged with printk so that we can see which ioctls are
    actually sent. In my tests only CDROM_GET_CAPABILITY actually occurred.
    Of course it was being sent to a (partition on a) hard disk, so it would
    have failed with ENOTTY and the patch isn't changing anything in
    practice. Still, I'm treating it specially to avoid spamming the logs.

    In principle, this restriction should include programs running with
    CAP_SYS_RAWIO. If for example I let a program access /dev/sda2 and
    /dev/sdb, it still should not be able to read/write outside the
    boundaries of /dev/sda2 independent of the capabilities. However, for
    now programs with CAP_SYS_RAWIO will still be allowed to send the
    ioctls. Their actions will still be logged.

    This patch does not affect the non-libata IDE driver. That driver
    however already tests for bd != bd->bd_contains before issuing some
    ioctl; it could be restricted further to forbid these ioctls even for
    programs running with CAP_SYS_ADMIN/CAP_SYS_RAWIO.

    Cc: linux-scsi@vger.kernel.org
    Cc: Jens Axboe
    Cc: James Bottomley
    Signed-off-by: Paolo Bonzini
    [ Make it also print the command name when warning - Linus ]
    Signed-off-by: Linus Torvalds

    Paolo Bonzini
     
  • Introduce a wrapper around scsi_cmd_ioctl that takes a block device.

    The function will then be enhanced to detect partition block devices
    and, in that case, subject the ioctls to whitelisting.

    Cc: linux-scsi@vger.kernel.org
    Cc: Jens Axboe
    Cc: James Bottomley
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Linus Torvalds

    Paolo Bonzini
     

19 Oct, 2011

1 commit

  • blk_get/put_queue() in scsi_cmd_ioctl() and throtl_get_tg() are
    completely bogus. The caller must have a reference to the queue on
    entry and taking an extra reference doesn't change anything.

    For scsi_cmd_ioctl(), the only effect is that it ends up checking
    QUEUE_FLAG_DEAD on entry; however, this is bogus as queue can die
    right after blk_get_queue(). Dead queue should be and is handled in
    request issue path (it's somewhat broken now but that's a separate
    problem and doesn't affect this one much).

    throtl_get_tg() incorrectly assumes that q is rcu freed. Also, it
    doesn't check return value of blk_get_queue(). If the queue is
    already dead, it ends up doing an extra put.

    Drop them.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     

10 Nov, 2010

1 commit


04 Nov, 2009

1 commit

  • Quiet sparse noise about symbol's not being declared.

    Symbol blk_default_cmd_filter is only used locally and should be static.

    The function blk_scsi_ioctl_init() is a fs_initcall and should also be
    static.

    Signed-off-by: H Hartley Sweeten
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    H Hartley Sweeten
     

11 Jul, 2009

1 commit

  • Currently, blk_scsi_ioctl_init() is not called since it lacks
    an initcall marking. This causes the command table to be
    unitialized, hence somce commands are block when they should
    not have been.

    This fixes a regression introduced by commit
    018e0446890661504783f92388ecce7138c1566d

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

01 Jul, 2009

1 commit

  • The initial patches to support this through sysfs export were broken
    and have been if 0'ed out in any release. So lets just kill the code
    and reclaim some space in struct request_queue, if anyone would later
    like to fixup the sysfs bits, the git history can easily restore
    the removed bits.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

23 May, 2009

1 commit


11 May, 2009

1 commit

  • rq->data_len served two purposes - the length of data buffer on issue
    and the residual count on completion. This duality creates some
    headaches.

    First of all, block layer and low level drivers can't really determine
    what rq->data_len contains while a request is executing. It could be
    the total request length or it coulde be anything else one of the
    lower layers is using to keep track of residual count. This
    complicates things because blk_rq_bytes() and thus
    [__]blk_end_request_all() relies on rq->data_len for PC commands.
    Drivers which want to report residual count should first cache the
    total request length, update rq->data_len and then complete the
    request with the cached data length.

    Secondly, it makes requests default to reporting full residual count,
    ie. reporting that no data transfer occurred. The residual count is
    an exception not the norm; however, the driver should clear
    rq->data_len to zero to signify the normal cases while leaving it
    alone means no data transfer occurred at all. This reverse default
    behavior complicates code unnecessarily and renders block PC on some
    drivers (ide-tape/floppy) unuseable.

    This patch adds rq->resid_len which is used only for residual count.

    While at it, remove now unnecessasry blk_rq_bytes() caching in
    ide_pc_intr() as rq->data_len is not changed anymore.

    Boaz : spotted missing conversion in osd
    Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape

    [ Impact: cleanup residual count handling, report 0 resid by default ]

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Mike Miller
    Cc: Eric Moore
    Cc: Alan Stern
    Cc: FUJITA Tomonori
    Cc: Doug Gilbert
    Cc: Mike Miller
    Cc: Eric Moore
    Cc: Darrick J. Wong
    Cc: Pete Zaitcev
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Tejun Heo
     

28 Apr, 2009

2 commits

  • blk_get_request() always returns properly zeroed requests. Don't set
    fields to zero/NULL unnecessarily.

    [ Impact: cleanup ]

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Now that all block request data transfer is done via bio, rq->data
    isn't used. Kill it.

    While at it, make the roles of rq->special and buffer clear.

    [ Impact: drop now unncessary field from struct request ]

    Signed-off-by: Tejun Heo
    Cc: Boaz Harrosh

    Tejun Heo
     

22 Apr, 2009

1 commit

  • Impact: fix SG_IO behavior such that it matches the documentation

    SG_IO howto says that if ->dxfer_len and sum of iovec disagress, the
    shorter one wins. However, the current implementation returns -EINVAL
    for such cases. Trim iovc if it's longer than ->dxfer_len.

    This patch uses iov_*() helpers which take struct iovec * by casting
    struct sg_iovec * to it. sg_iovec is always identical to iovec and
    this will be further cleaned up with later patches.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     

15 Apr, 2009

1 commit


26 Mar, 2009

1 commit

  • Put a WARN_ON in __blk_put_request if it is about to
    leak bio(s). This is a serious bug that can happen in error
    handling code paths.

    For this to work I have fixed a couple of places in block/ where
    request->bio != NULL ownership was not honored. And a small cleanup
    at sg_io() while at it.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Boaz Harrosh
     

29 Dec, 2008

1 commit


06 Dec, 2008

1 commit

  • There's no point in having too short SG_IO timeouts, since if the
    command does end up timing out, we'll end up through the reset sequence
    that is several seconds long in order to abort the command that timed
    out.

    As a result, shorter timeouts than a few seconds simply do not make
    sense, as the recovery would be longer than the timeout itself.

    Add a BLK_MIN_SG_TIMEOUT to match the existign BLK_DEFAULT_SG_TIMEOUT.

    Suggested-by: Alan Cox
    Acked-by: Tejun Heo
    Acked-by: Jens Axboe
    Cc: Jeff Garzik
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

21 Oct, 2008

4 commits


09 Oct, 2008

3 commits

  • This patch introduces struct rq_map_data to enable bio_copy_use_iov()
    use reserved pages.

    Currently, bio_copy_user_iov allocates bounce pages but
    drivers/scsi/sg.c wants to allocate pages by itself and use
    them. struct rq_map_data can be used to pass allocated pages to
    bio_copy_user_iov.

    The current users of bio_copy_user_iov simply passes NULL (they don't
    want to use pre-allocated pages).

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Cc: Douglas Gilbert
    Cc: Mike Christie
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • Currently, blk_rq_map_user and blk_rq_map_user_iov always do
    GFP_KERNEL allocation.

    This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
    so sg can use it (sg always does GFP_ATOMIC allocation).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Douglas Gilbert
    Cc: Mike Christie
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • I have another request for the block filter SG_IO command whitelist,
    specifically the MMC streaming command set SET READ AHEAD command.
    The command applies only to MMC CDROM/DVDROM drives with the streaming
    optional feature set. The command is useful to cdparanoia in that it
    allows explicit cache control side effects that are, on many drives,
    cdparanoia's most efficient way to flush/disable the media cache on
    cdrom drives. I am aware of no reason why it should not be accessible
    from usespace.

    Also note that the command is already fully accessible through the
    SCSI-native version of the SG_IO ioctl as well as the traditional SG
    interface. The command is only being refused on block devices. That
    means that on a typical stock distro, the command is available through
    /dev/sg* but not /dev/scd* although both are typically available and
    accessible. Filtering the command is not providing any protection,
    only a confusing inconsistency.

    Signed-off-by: Jens Axboe

    xiphmont@xiph.org
     

27 Aug, 2008

2 commits

  • Technically, the cmd_filter would be applied to other protocols though
    it's unlikely to happen. Putting SCSI stuff to request_queue is kinda
    layer violation. So let's rename it.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • cmd_filter works only for the block layer SG_IO with SCSI block
    devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
    character devices (such as st). We hit a kernel crash with them.

    The problem is that cmd_filter code accesses to gendisk (having struct
    blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. It works for only
    SCSI block device files. With character device files, inode->i_bdev
    leads you to struct cdev. inode->i_bdev->bd_disk->blk_scsi_cmd_filter
    isn't safe.

    SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be
    independent on any protocols. We shouldn't change ULDs to expose their
    gendisk.

    This patch moves struct blk_scsi_cmd_filter from gendisk to
    request_queue, a common object, which eveyone can access to.

    The user interface doesn't change; users can change the filters via
    /sys/block/. gendisk has a pointer to request_queue so the cmd_filter
    code accesses to struct blk_scsi_cmd_filter.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

30 Jul, 2008

1 commit

  • It seems cdrwtool in the udftools has been unusable on "modern" kernels
    for some time. A Google search reveals many people with the same issue
    but no solution (cdrwtool fails to format the disk). After spending some
    time tracking down the issue, it comes down to the following:

    The udftools still use the older CDROM_SEND_PACKET interface to send
    things like FORMAT_UNIT through to the drive. They should really be
    updated, but that's another story. Since most distros are using libata
    now, the cd or dvd burner appears as a SCSI device, and we wind up in
    block/scsi_ioctl.c. Here, the code tries to take the "struct
    cdrom_generic_command" and translate it and stuff it into a "struct
    sg_io_hdr" structure so it can pass it to the modern sg_io() routine
    instead. Unfortunately, there is one error, or rather an omission in the
    translation. The timeout that is passed in in the "struct
    cdrom_generic_command" is in HZ=100 units, and this is modified and
    correctly converted to jiffies by use of clock_t_to_jiffies(). However,
    a little further down, this cgc.timeout value in jiffies is simply
    copied into the sg_io_hdr timeout, which should be in milliseconds.
    Since most modern x86 kernels seems to be getting build with HZ=250, the
    timeout that is passed to sg_io and eventually converted to the
    timeout_per_command member of the scsi_cmnd structure is now four times
    too small. Since cdrwtool tries to set the timeout to one hour for the
    FORMAT_UNIT command, and it takes about 20 minutes to format a 4x CDRW,
    the SCSI error-handler kicks in after the FORMAT_UNIT completes because
    it took longer than the incorrectly-calculated timeout.

    [jejb: fix up whitespace]
    Signed-off-by: Tim Wright
    Cc: Stable Tree
    Signed-off-by: James Bottomley

    Tim Wright
     

03 Jul, 2008

1 commit


03 May, 2008

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
    [SCSI] aic94xx: fix section mismatch
    [SCSI] u14-34f: Fix 32bit only problem
    [SCSI] dpt_i2o: sysfs code
    [SCSI] dpt_i2o: 64 bit support
    [SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent
    [SCSI] dpt_i2o: use standard __init / __exit code
    [SCSI] megaraid_sas: fix suspend/resume sections
    [SCSI] aacraid: Add Power Management support
    [SCSI] aacraid: Fix jbod operations scan issues
    [SCSI] aacraid: Fix warning about macro side-effects
    [SCSI] add support for variable length extended commands
    [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer
    [SCSI] bsg: add large command support
    [SCSI] aacraid: Fix down_interruptible() to check the return value correctly
    [SCSI] megaraid_sas; Update the Version and Changelog
    [SCSI] ibmvscsi: Handle non SCSI error status
    [SCSI] bug fix for free list handling
    [SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions
    [SCSI] megaraid_mbox: fix Dell CERC firmware problem

    Linus Torvalds
     
  • Add support for variable-length, extended, and vendor specific
    CDBs to scsi-ml. It is now possible for initiators and ULD's
    to issue these types of commands. LLDs need not change much.
    All they need is to raise the .max_cmd_len to the longest command
    they support (see iscsi patch).

    - clean-up some code paths that did not expect commands to be
    larger than 16, and change cmd_len members' type to short as
    char is not enough.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Benny Halevy
    Signed-off-by: James Bottomley

    Boaz Harrosh
     

29 Apr, 2008

1 commit

  • blk_get_request initializes rq->cmd (rq_init does) so the users don't
    need to do that.

    The purpose of this patch is to remove sizeof(rq->cmd) and &rq->cmd,
    as a preparation for large command support, which changes rq->cmd from
    the static array to a pointer. sizeof(rq->cmd) will not make sense and
    &rq->cmd won't work.

    Signed-off-by: FUJITA Tomonori
    Cc: James Bottomley
    Cc: Alasdair G Kergon
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

04 Mar, 2008

1 commit

  • The meaning of rq->data_len was changed to the length of an allocated
    buffer from the true data length. It breaks SG_IO friends and
    bsg. This patch restores the meaning of rq->data_len to the true data
    length and adds rq->extra_len to store an extended length (due to
    drain buffer and padding).

    This patch also removes the code to update bio in blk_rq_map_user
    introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
    The commit adjusts bio according to memory alignment
    (queue_dma_alignment). However, memory alignment is NOT padding
    alignment. This adjustment also breaks SG_IO friends and bsg. Padding
    alignment needs to be fixed in a proper way (by a separate patch).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

19 Feb, 2008

1 commit

  • With padding and draining moved into it, block layer now may extend
    requests as directed by queue parameters, so now a request has two
    sizes - the original request size and the extended size which matches
    the size of area pointed to by bios and later by sgs. The latter size
    is what lower layers are primarily interested in when allocating,
    filling up DMA tables and setting up the controller.

    Both padding and draining extend the data area to accomodate
    controller characteristics. As any controller which speaks SCSI can
    handle underflows, feeding larger data area is safe.

    So, this patch makes the primary data length field, request->data_len,
    indicate the size of full data area and add a separate length field,
    request->raw_data_len, for the unmodified request size. The latter is
    used to report to higher layer (userland) and where the original
    request size should be fed to the controller or device.

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
     

18 Dec, 2007

1 commit


24 Jul, 2007

1 commit

  • Some of the code has been gradually transitioned to using the proper
    struct request_queue, but there's lots left. So do a full sweet of
    the kernel and get rid of this typedef and replace its uses with
    the proper type.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

23 Jul, 2007

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (60 commits)
    [SCSI] libsas: make ATA functions selectable by a config option
    [SCSI] bsg: unexport sg v3 helper functions
    [SCSI] bsg: fix bsg_unregister_queue
    [SCSI] bsg: make class backlinks
    [SCSI] 3w-9xxx: add support for 9690SA
    [SCSI] bsg: fix bsg_register_queue error path
    [SCSI] ESP: Increase ESP_BUS_TIMEOUT to 275.
    [SCSI] libsas: fix scr_read/write users and update the libata documentation
    [SCSI] mpt fusion: update Kconfig help
    [SCSI] scsi_transport_sas: add destructor for bsg
    [SCSI] iscsi_tcp: buggered kmalloc()
    [SCSI] qla2xxx: Update version number to 8.02.00-k2.
    [SCSI] qla2xxx: Add ISP25XX support.
    [SCSI] qla2xxx: Use pci_try_set_mwi().
    [SCSI] qla2xxx: Use PCI-X/PCI-Express read control interfaces.
    [SCSI] qla2xxx: Re-factor isp_operations to static structures.
    [SCSI] qla2xxx: Validate mid-layer 'underflow' during check-condition handling.
    [SCSI] qla2xxx: Correct setting of 'current' and 'supported' speeds during FDMI registration.
    [SCSI] qla2xxx: Generalize iIDMA support.
    [SCSI] qla2xxx: Generalize FW-Interface-2 support.
    ...

    Linus Torvalds
     

22 Jul, 2007

1 commit


20 Jul, 2007

1 commit

  • Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).

    Here is a short excerpt of the semantic patch performing
    this transformation:

    @@
    type T2;
    expression x;
    identifier f,fld;
    expression E;
    expression E1,E2;
    expression e1,e2,e3,y;
    statement S;
    @@

    x =
    - kmalloc
    + kzalloc
    (E1,E2)
    ... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
    - memset((T2)x,0,E1);

    @@
    expression E1,E2,E3;
    @@

    - kzalloc(E1 * E2,E3)
    + kcalloc(E1,E2,E3)

    [akpm@linux-foundation.org: get kcalloc args the right way around]
    Signed-off-by: Yoann Padioleau
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Acked-by: Russell King
    Cc: Bryan Wu
    Acked-by: Jiri Slaby
    Cc: Dave Airlie
    Acked-by: Roland Dreier
    Cc: Jiri Kosina
    Acked-by: Dmitry Torokhov
    Cc: Benjamin Herrenschmidt
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Pierre Ossman
    Cc: Jeff Garzik
    Cc: "David S. Miller"
    Acked-by: Greg KH
    Cc: James Bottomley
    Cc: "Antonino A. Daplas"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoann Padioleau
     

16 Jul, 2007

3 commits