08 Apr, 2013

1 commit

  • This reverts commit 8761a3dc1f07b163414e2215a2cadbb4cfe2a107.

    There are situations where the destruction path is called
    with the bdev->bd_mutex already held, which then deadlocks in
    loop_clr_fd(). The normal partition cleanup does a trylock()
    on the mutex, but it'd be nice to have a more bulletproof
    method in loop. So punt this more involved fix to the next
    merge window, and just back out this buggy fix for now.
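
The deadlock described above, and the trylock() approach mentioned for the normal partition cleanup, can be illustrated with a small userspace sketch. It is not the kernel code: `sketch_lock`, `sketch_trylock()` and `sketch_cleanup_partitions()` are hypothetical names, and a C11 `atomic_flag` stands in for a non-recursive lock such as bdev->bd_mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive lock such as bdev->bd_mutex. */
struct sketch_lock {
    atomic_flag held;
};

void sketch_lock_init(struct sketch_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Returns true if the lock was acquired, false if it is already held. */
bool sketch_trylock(struct sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void sketch_unlock(struct sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Cleanup that may be reached with the lock already held by the caller.
 * A blocking lock here would wait forever on itself; trylock lets the
 * cleanup back off instead.
 * Returns the number of partitions removed, or -1 if the lock was busy.
 */
int sketch_cleanup_partitions(struct sketch_lock *l, int *nr_parts)
{
    int removed;

    assert(l != NULL && nr_parts != NULL);
    if (!sketch_trylock(l))
        return -1;          /* lock already held: skip, do not deadlock */
    removed = *nr_parts;
    *nr_parts = 0;
    sketch_unlock(l);
    return removed;
}
```

The trade-off is visible in the return value: when the lock is busy the cleanup is skipped rather than performed, which is why the message asks for a more robust method in a later merge window.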

    Signed-off-by: Jens Axboe

    Jens Axboe
     

23 Mar, 2013

1 commit

  • Any partitions added by user space to the loop device were being
    left in place after detaching the loop device. This was because
    the detach path issued a BLKRRPART to clean up partitions if
    LO_FLAGS_PARTSCAN was set, meaning that the partitions were auto
    scanned on attach. Replace this BLKRRPART with code that
    unconditionally cleans up partitions on detach instead.

    Signed-off-by: Phillip Susi

    Modified by Jens to export delete_partition().

    Signed-off-by: Jens Axboe

    Phillip Susi
     

28 Feb, 2013

2 commits

  • Currently, sizeof(struct parsed_partitions) may be 64KB on 32-bit
    architectures, so check_partition can easily trigger a page allocation
    failure, especially when block devices are hotplugged (USB mass storage,
    MMC cards, ...). Felipe Balbi has observed the failure.

    This patch makes the following optimizations to the allocation of struct
    parsed_partitions to try to address the issue:

    - make parsed_partitions.parts a pointer so that the memory it points to
    can fit in a 32KB buffer, saving approximately 32KB

    - vmalloc the buffer pointed to by parsed_partitions.parts, because 32KB is
    still a bit big for kmalloc

    - since many devices have a partition count limit, allocate only
    disk_max_parts() partitions instead of always allocating 256
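
The allocation change listed above can be sketched in userspace C. This is an illustration only: the `_sketch` types and functions are hypothetical, the entry layout is made up, and calloc() stands in for the kernel allocators (kmalloc/vmalloc) named in the message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-partition entry; the real kernel layout differs. */
struct part_entry_sketch {
    unsigned long long from;
    unsigned long long size;
    int flags;
};

/*
 * Instead of embedding a fixed 256-entry array in the struct, keep a
 * pointer plus a limit, and size the array to what the device supports.
 */
struct parsed_partitions_sketch {
    int limit;                        /* number of entries in parts[] */
    struct part_entry_sketch *parts;  /* allocated separately */
};

struct parsed_partitions_sketch *alloc_parsed_sketch(int max_parts)
{
    struct parsed_partitions_sketch *state;

    assert(max_parts > 0);
    state = calloc(1, sizeof(*state));
    if (!state)
        return NULL;

    /* Only max_parts entries, zero-initialized. */
    state->parts = calloc((size_t)max_parts, sizeof(*state->parts));
    if (!state->parts) {
        free(state);
        return NULL;
    }
    state->limit = max_parts;
    return state;
}

void free_parsed_sketch(struct parsed_partitions_sketch *state)
{
    if (!state)
        return;
    free(state->parts);
    free(state);
}
```

A caller would pass the device's partition limit (the value disk_max_parts() reports in the kernel) as `max_parts`, so a device limited to 16 partitions allocates 16 entries rather than 256.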

    Signed-off-by: Ming Lei
    Reported-by: Felipe Balbi
    Cc: Jens Axboe
    Reviewed-by: Yasuaki Ishimatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ming Lei
     
  • While adding and removing a lot of disks and partitions, this
    sometimes shows up:

    WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
    Hardware name:
    sysfs: cannot create duplicate filename '/dev/block/259:751'
    Modules linked in: raid1 autofs4 bnx2fc cnic uio fcoe libfcoe libfc 8021q scsi_transport_fc scsi_tgt garp stp llc sunrpc cpufreq_ondemand powernow_k8 freq_table mperf ipv6 dm_mirror dm_region_hash dm_log power_meter microcode dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core k10temp bnx2 sg ixgbe dca mdio ext4 mbcache jbd2 dm_round_robin sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi pata_atiixp ahci mptsas mptscsih mptbase scsi_transport_sas dm_multipath dm_mod [last unloaded: scsi_wait_scan]
    Pid: 44103, comm: async/16 Not tainted 2.6.32-195.el6.x86_64 #1
    Call Trace:
    warn_slowpath_common+0x87/0xc0
    warn_slowpath_fmt+0x46/0x50
    sysfs_add_one+0xc9/0x130
    sysfs_do_create_link+0x12b/0x170
    sysfs_create_link+0x13/0x20
    device_add+0x317/0x650
    idr_get_new+0x13/0x50
    add_partition+0x21c/0x390
    rescan_partitions+0x32b/0x470
    sd_open+0x81/0x1f0 [sd_mod]
    __blkdev_get+0x1b6/0x3c0
    blkdev_get+0x10/0x20
    register_disk+0x155/0x170
    add_disk+0xa6/0x160
    sd_probe_async+0x13b/0x210 [sd_mod]
    add_wait_queue+0x46/0x60
    async_thread+0x102/0x250
    default_wake_function+0x0/0x20
    async_thread+0x0/0x250
    kthread+0x96/0xa0
    child_rip+0xa/0x20
    kthread+0x0/0xa0
    child_rip+0x0/0x20

    This most likely happens because dev_t is freed while the number is
    still used and idr_get_new() is not protected on every use. The fix
    adds a mutex where it wasn't before and moves the dev_t free function so
    it is called after device del.
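
The ordering problem described above can be shown with a toy model. Everything here is hypothetical (a tiny fixed-size ID table, a spinlock built from `atomic_flag`, and a "registered name" table standing in for the sysfs link); it only illustrates why the number must be freed after the device is deleted, and why the allocator is locked on every use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_MAX_IDS 8

static atomic_flag sketch_id_lock = ATOMIC_FLAG_INIT;
static bool sketch_id_used[SKETCH_MAX_IDS];
static bool sketch_name_registered[SKETCH_MAX_IDS];

static void sketch_lock_ids(void)
{
    while (atomic_flag_test_and_set_explicit(&sketch_id_lock,
                                             memory_order_acquire))
        ;   /* spin until the lock is free */
}

static void sketch_unlock_ids(void)
{
    atomic_flag_clear_explicit(&sketch_id_lock, memory_order_release);
}

/* Allocate the lowest free ID; every access to the table takes the lock. */
int sketch_id_alloc(void)
{
    int id = -1;

    sketch_lock_ids();
    for (int i = 0; i < SKETCH_MAX_IDS; i++) {
        if (!sketch_id_used[i]) {
            sketch_id_used[i] = true;
            id = i;
            break;
        }
    }
    sketch_unlock_ids();
    return id;
}

void sketch_id_free(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    sketch_lock_ids();
    sketch_id_used[id] = false;
    sketch_unlock_ids();
}

/* Fails with -1 if the name for this ID still exists (duplicate filename). */
int sketch_device_add(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    if (sketch_name_registered[id])
        return -1;
    sketch_name_registered[id] = true;
    return 0;
}

void sketch_device_del(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    sketch_name_registered[id] = false;
}

/* Ordering from the fix: delete the device first, then free its number. */
void sketch_remove(int id)
{
    sketch_device_del(id);
    sketch_id_free(id);
}
```

If the number is freed before the device is deleted, the next allocation can hand out the same ID while its name is still registered, and `sketch_device_add()` fails in the same way the sysfs warning above reports a duplicate filename.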

    Signed-off-by: Tomas Henzl
    Cc: Jens Axboe
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomas Henzl
     

01 Aug, 2012

1 commit

  • Add a new operation code (BLKPG_RESIZE_PARTITION) to the BLKPG ioctl that
    allows altering the size of an existing partition, even if it is currently
    in use.

    This patch puts hd_struct->nr_sects under a sequence counter, because
    one might extend a partition while I/O is happening to it, and the update of
    nr_sects can be non-atomic on 32-bit machines with a 64-bit sector_t. This
    can lead to issues such as reading an inconsistent partition size. A sequence
    counter is used so that readers don't have to take the bdev mutex,
    since sector_in_part() is called very frequently.
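
A sequence counter of the kind described above can be sketched with C11 atomics. This is an illustration under assumptions, not the kernel's seqcount implementation: the `_sketch` names are hypothetical, the 64-bit size is stored as two 32-bit halves to mimic a 32-bit machine, and writers are assumed to be serialized by some outer lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical partition with a 64-bit size stored as two 32-bit halves. */
struct part_sketch {
    atomic_uint seq;      /* odd while a write is in progress */
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

void part_sketch_init(struct part_sketch *p)
{
    assert(p != NULL);
    atomic_init(&p->seq, 0);
    atomic_init(&p->lo, 0);
    atomic_init(&p->hi, 0);
}

/* Writer side; assumes writers are serialized by an outer lock. */
void nr_sects_write_sketch(struct part_sketch *p, uint64_t v)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);  /* odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->lo, (uint32_t)v, memory_order_relaxed);
    atomic_store_explicit(&p->hi, (uint32_t)(v >> 32), memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);  /* even */
}

/* Reader side: lock-free, retries if a write overlapped the read. */
uint64_t nr_sects_read_sketch(struct part_sketch *p)
{
    unsigned s1, s2;
    uint32_t lo, hi;

    do {
        s1 = atomic_load_explicit(&p->seq, memory_order_acquire);
        lo = atomic_load_explicit(&p->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&p->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((s1 & 1u) || s1 != s2);

    return ((uint64_t)hi << 32) | lo;
}
```

Readers never block and never take a lock; they only repeat the read when the counter was odd or changed underneath them, which suits a value that is read very often and written rarely.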

    Now all accesses to hd_struct->nr_sects should go through the sequence
    counter read/update helpers part_nr_sects_read()/part_nr_sects_write().
    There is one exception though: set_capacity()/get_capacity(). I think
    theoretically the race should exist there too, but this patch does not
    modify set_capacity()/get_capacity() due to the sheer number of call sites,
    and I am afraid that the change might break something. I have left that as a
    TODO item. We can handle it later if need be. This patch does not introduce
    any new races w.r.t. set_capacity()/get_capacity().

    v2: Add CONFIG_LBDAF test to UP preempt case as suggested by Phillip.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Phillip Susi
    Signed-off-by: Jens Axboe

    Vivek Goyal
     

02 Mar, 2012

1 commit

  • Since 2.6.39 (1196f8b), when a driver returns -ENOMEDIUM for open(),
    __blkdev_get() calls rescan_partitions() to remove
    in-kernel partition structures and raise a KOBJ_CHANGE uevent.

    However, it ends up calling the driver's revalidate_disk without open
    and could cause an oops.

    In the case of SCSI:

    process A                  process B
    ----------------------------------------------
    sys_open
      __blkdev_get
        sd_open
          returns -ENOMEDIUM
                               scsi_remove_device

        rescan_partitions
          sd_revalidate_disk

    Oopses are reported here:
    http://marc.info/?l=linux-scsi&m=132388619710052

    This patch separates the partition invalidation from rescan_partitions()
    and uses it for the -ENOMEDIUM case.
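
The split described above can be modeled in a few lines. This is a hedged sketch, not the kernel change itself: `disk_sketch` and the `_sketch` functions are hypothetical, and a call counter stands in for the driver's revalidate_disk callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM 123   /* Linux value, fallback for non-Linux builds */
#endif

/* Hypothetical disk model. */
struct disk_sketch {
    int nr_parts;           /* in-kernel partition structures */
    int revalidate_calls;   /* how often the driver callback ran */
};

/* Drop the partition structures without touching the driver. */
void invalidate_partitions_sketch(struct disk_sketch *d)
{
    assert(d != NULL);
    d->nr_parts = 0;
}

/* Full rescan: invalidate, call into the driver, then re-add partitions. */
void rescan_partitions_sketch(struct disk_sketch *d, int found)
{
    invalidate_partitions_sketch(d);
    d->revalidate_calls++;   /* stands in for revalidate_disk */
    d->nr_parts = found;
}

/*
 * Open path: open_ret is what the driver's open() returned. On
 * -ENOMEDIUM only the invalidation runs, so the driver is never called
 * for a device that was not successfully opened.
 */
int blkdev_get_sketch(struct disk_sketch *d, int open_ret, int found)
{
    if (open_ret == -ENOMEDIUM) {
        invalidate_partitions_sketch(d);
        return open_ret;
    }
    if (open_ret)
        return open_ret;
    rescan_partitions_sketch(d, found);
    return 0;
}
```

In the -ENOMEDIUM path the stale partitions are still removed, but the driver callback count stays unchanged, which is the property the fix relies on.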

    Reported-by: Huajun Li
    Signed-off-by: Jun'ichi Nomura
    Acked-by: Tejun Heo
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Jun'ichi Nomura
     

04 Jan, 2012

1 commit