06 Dec, 2011

21 commits

  • Signed-off-by: Christoph Hellwig
    Signed-off-by: Nicholas Bellinger

    Christoph Hellwig
     
  • This patch changes fileio to use for_each_sg() when walking se_task->task_sg
    memory passed in from the loopback LLD's struct scsi_cmnd scatterlist.

    This addresses an issue where FILEIO backends with loopback were hitting the
    following OOPs with mkfs.ext2:

    |kernel BUG at include/linux/scatterlist.h:97!
    |invalid opcode: 0000 [#1] PREEMPT SMP
    |Modules linked in: sd_mod tcm_loop target_core_stgt scsi_tgt target_core_pscsi target_core_file target_core_iblock target_core_mod configfs scsi_mod
    |
    |Pid: 671, comm: LIO_fileio Not tainted 3.1.0-rc10+ #139 Bochs Bochs
    |EIP: 0060:[] EFLAGS: 00010202 CPU: 0
    |EIP is at fd_do_task+0x396/0x420 [target_core_file]
    | [] __transport_execute_tasks+0xd4/0x190 [target_core_mod]
    | [] transport_execute_tasks+0x3c/0xf0 [target_core_mod]
    |EIP: [] fd_do_task+0x396/0x420 [target_core_file] SS:ESP 0068:dea47e90

    Signed-off-by: Sebastian Andrzej Siewior
    Cc: Christoph Hellwig
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Sebastian Andrzej Siewior
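
Why the iterator matters can be sketched in userspace. The model below is a simplification and not the kernel's struct scatterlist: the `sg_sketch` type, its `chain` and `is_last` fields, and the `*_sketch` helpers are all hypothetical names. It assumes that a scatterlist may be split across several arrays joined by link entries, so plain `sglist[i]` indexing can land on a link entry, while an iterator that follows the links visits only data entries.

```c
#include <stddef.h>

/* Simplified userspace model of a chained scatterlist; not the kernel struct. */
struct sg_sketch {
	unsigned int length;     /* bytes in this segment */
	struct sg_sketch *chain; /* non-NULL: link to the next array, carries no data */
	int is_last;             /* marks the final data entry */
};

/* Model of sg_next(): step to the next data entry, following chain links. */
static struct sg_sketch *sg_next_sketch(struct sg_sketch *sg)
{
	if (sg->is_last)
		return NULL;
	sg++;
	if (sg->chain)
		sg = sg->chain;
	return sg;
}

/* Model of for_each_sg(): advance with the iterator, never by index. */
#define for_each_sg_sketch(sglist, sg, nr, i) \
	for (i = 0, sg = (sglist); i < (nr); i++, sg = sg_next_sketch(sg))

/* Sum segment lengths with the iterator rather than sglist[i]. */
static unsigned int total_length(struct sg_sketch *sglist, int nents)
{
	struct sg_sketch *sg;
	unsigned int total = 0;
	int i;

	for_each_sg_sketch(sglist, sg, nents, i)
		total += sg->length;
	return total;
}
```

With two arrays joined by one link entry, `total_length()` counts the four data segments and skips the link, which is the behaviour array indexing cannot provide.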
     
  • Some are never used, some are set but never read, dev_hoq_count is
    incremented and decremented, but never read.

    Signed-off-by: Joern Engel
    Signed-off-by: Nicholas Bellinger

    Jörn Engel
     
  • The LSB of the page length is at offset 3, not 2.

    Signed-off-by: Roland Dreier
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Roland Dreier
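
The off-by-one described above is easy to see in a small sketch. This assumes a page header in which the page length is a 16-bit big-endian field occupying bytes 2 and 3; the commit message does not name the page, and the helper `vpd_set_page_length` is a hypothetical name, not the target code.

```c
#include <stdint.h>

/* Hypothetical helper: store a 16-bit page length big-endian,
 * MSB at offset 2 and LSB at offset 3 of the page buffer. */
static void vpd_set_page_length(uint8_t *buf, uint16_t len)
{
	buf[2] = (uint8_t)(len >> 8);   /* most significant byte */
	buf[3] = (uint8_t)(len & 0xff); /* least significant byte */
}
```

For any length below 256, only byte 3 is non-zero, so writing the value to offset 2 instead would report a length 256 times too large.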
     
  • SBC-3 says:

    A TRANSFER LENGTH field set to zero specifies that 256 logical
    blocks shall be written. Any other value specifies the number
    of logical blocks that shall be written.

    The old code was always just returning the value in the TRANSFER LENGTH
    byte. Fix this to return 256 if the byte is 0.

    Signed-off-by: Roland Dreier
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Roland Dreier
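
The rule quoted from SBC-3 can be sketched as a one-line decode. This assumes the 6-byte CDB layout, where TRANSFER LENGTH is the single byte at offset 4; the helper name `sectors_from_cdb6` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: sector count for a 6-byte READ/WRITE CDB.
 * Assumes TRANSFER LENGTH is the single byte at offset 4. */
static uint32_t sectors_from_cdb6(const uint8_t *cdb)
{
	/* Zero means 256 blocks; any other value is taken literally. */
	return cdb[4] ? cdb[4] : 256;
}
```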
     
  • IO commands with a TRANSFER LENGTH of 0 are not an error; for example,
    for READ (10) and WRITE (10), SBC-3 says:

    A TRANSFER LENGTH field set to zero specifies that no logical blocks
    shall be read. This condition shall not be considered an error.

    In case we have nothing to do, just complete the command with good status.

    Signed-off-by: Roland Dreier
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Roland Dreier
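
A rough sketch of the zero-length case follows. It assumes the 10-byte CDB layout, with TRANSFER LENGTH as a big-endian 16-bit field in bytes 7 and 8; the `io_action` enum and `classify_cdb10` are hypothetical names for illustration and do not reflect how target-core structures the check.

```c
#include <stdint.h>

#define SAM_STAT_GOOD 0x00 /* SCSI GOOD status code */

/* Hypothetical result type for this sketch only. */
enum io_action { IO_COMPLETE_NOW, IO_SUBMIT };

/* Assumes a 10-byte CDB: TRANSFER LENGTH is big-endian in bytes 7-8. */
static enum io_action classify_cdb10(const uint8_t *cdb, uint8_t *status)
{
	uint32_t sectors = ((uint32_t)cdb[7] << 8) | cdb[8];

	if (sectors == 0) {
		*status = SAM_STAT_GOOD; /* nothing to transfer, not an error */
		return IO_COMPLETE_NOW;
	}
	return IO_SUBMIT;
}
```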
     
  • The semantic patch that makes this change is available
    in scripts/coccinelle/api/memdup.cocci.

    Signed-off-by: Thomas Meyer
    Signed-off-by: Nicholas Bellinger

    Thomas Meyer
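
The memdup semantic patch replaces an open-coded allocate-then-copy sequence with a single kmemdup() call. A userspace stand-in for that pattern, using malloc() in place of the kernel allocator and the hypothetical name `memdup_sketch`, might look like this:

```c
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the kernel's kmemdup(): allocate len bytes
 * and copy src into them. Returns NULL if the allocation fails. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Folding both steps into one helper removes the chance of the allocation size and the copy size drifting apart.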
     
  • This patch sets the missing ISCSI_FLAG_CMD_FINAL bit in
    iscsit_send_task_mgt_rsp() for a struct iscsi_tm_rsp PDU.

    This usage is hardcoded for all TM response PDUs in RFC-3720
    section 10.6.

    Reported-by: whucecil
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
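
As an illustration of the flag being set, the sketch below uses a reduced stand-in for the start of the response header. `ISCSI_FLAG_CMD_FINAL` is defined here with the value 0x80 (the top bit of the flags byte); the `tm_rsp_sketch` struct and `tm_rsp_set_final` helper are hypothetical and much smaller than the real struct iscsi_tm_rsp.

```c
#include <stdint.h>

#define ISCSI_FLAG_CMD_FINAL 0x80 /* F bit, top bit of the flags byte */

/* Reduced stand-in for the first two bytes of a TM response header. */
struct tm_rsp_sketch {
	uint8_t opcode;
	uint8_t flags;
};

/* Mark the PDU as final, leaving any other flag bits untouched. */
static void tm_rsp_set_final(struct tm_rsp_sketch *hdr)
{
	hdr->flags |= ISCSI_FLAG_CMD_FINAL;
}
```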
     
  • This patch fixes iscsi-target handling of underflow where residual data is
    causing an OOPs by using the incorrect iscsi_cmd_t->data_length initially
    assigned in iscsit_allocate_se_cmd(). It resets iscsi_cmd_t->data_length
    from se_cmd_t->data_length after transport_generic_allocate_tasks()
    has been invoked in iscsit_handle_scsi_cmd() RX context, and converts
    iscsi_cmd->residual_count usage to access iscsi_cmd->se_cmd.residual_count
    to get the proper residual count set by target-core.

    Reported-by:
    Cc: Christoph Hellwig
    Cc: Andy Grover
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • This patch changes transport_generic_map_mem_to_cmd() to reject SCSI data
    overflow and to send exception status with CHECK_CONDITION + TCM_INVALID_CDB_FIELD
    for fabrics that are passing a pre-populated struct scatterlist (eg: tcm_loop
    and iscsi-target) being mapped into se_cmd->t_data_sg and se_cmd->t_data_nents.

    This addresses an OOPs where transport_allocate_data_tasks() would walk
    the incorrect post OVERFLOW cmd->data_length value beyond the end of
    the passed scatterlist.

    Cc: Christoph Hellwig
    Cc: Andy Grover
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • Signed-off-by: Christoph Hellwig
    Signed-off-by: Nicholas Bellinger

    Christoph Hellwig
     
  • And use a SCF_BIDI flag instead.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Nicholas Bellinger

    Christoph Hellwig
     
  • And use a SCF_FUA flag instead.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Nicholas Bellinger

    Christoph Hellwig
     
  • We never walk ordered_cmd_list in the se_device, so remove all code related
    to supporting it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Nicholas Bellinger

    Christoph Hellwig
     
  • We already have a perfectly valid se_device pointer in the command, so
    remove the mostly useless duplicates.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Nicholas Bellinger

    Christoph Hellwig
     
  • This patch removes config_item_name() informational usage of
    TFO->free_wwn() treewide in loopback, tcm_fc, ib_srpt and
    tcm_vhost module code.

    Using v4 target_core_fabric_configfs.c logic, a fabric call for
    config_item_name() in TFO->drop_wwn() context returns NULL as
    target_fabric_drop_wwn() invoking config_item_put() ->
    config_group_put() will release fabric_port->port_wwn.wwn_group
    before the last config_item_put() -> TFO->drop_wwn() is
    invoked.

    Reported-by: Bart Van Assche
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • Signed-off-by: Roland Dreier
    Signed-off-by: Nicholas Bellinger

    Roland Dreier
     
  • Convert to unsigned bit fields for active I/O shutdown fields.

    Signed-off-by: Bart Van Assche
    Signed-off-by: Nicholas Bellinger

    Bart Van Assche
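
One likely motivation for such a conversion: a one-bit signed bit field can typically hold only 0 and -1, so a test such as `field == 1` never succeeds. A minimal sketch with unsigned fields, where the struct and field names are hypothetical and not those of the driver:

```c
/* Hypothetical struct; field names are illustrative only. */
struct shutdown_flags {
	unsigned int active:1; /* holds 0 or 1, never -1 */
	unsigned int stop:1;
};

static int is_active(const struct shutdown_flags *f)
{
	return f->active == 1;
}
```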
     
  • While testing ib_srpt I noticed that the target system became
    rather unresponsive during intensive I/O. The patch below made
    my target system responsive again during I/O without decreasing
    performance.

    Signed-off-by: Bart Van Assche
    Signed-off-by: Nicholas Bellinger

    Bart Van Assche
     
  • This patch adds missing kfree() for an allocation in iscsi_login_zero_tsih_s1()
    code, and makes transport_init_session() check for IS_ERR() returns.

    Reported-by: Dan Carpenter
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
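
The error-pointer convention behind IS_ERR() can be modeled in userspace. The three helpers below imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); `session_alloc_sketch` and `login_sketch` are hypothetical stand-ins, and which buffer the real patch frees is not stated in the commit message.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace models of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical constructor that reports failure as an error pointer. */
static void *session_alloc_sketch(int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	return malloc(16);
}

/* Caller pattern: check IS_ERR() and free the earlier allocation on failure. */
static int login_sketch(int fail)
{
	void *conn_ops = malloc(32); /* earlier allocation that must not leak */
	void *sess;

	if (!conn_ops)
		return -ENOMEM;
	sess = session_alloc_sketch(fail);
	if (IS_ERR(sess)) {
		free(conn_ops);
		return (int)PTR_ERR(sess);
	}
	free(sess);
	free(conn_ops);
	return 0;
}
```

A NULL check alone would miss the failure here, because an error pointer is non-NULL.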
     
  • This patch removes legacy usage of PYX_TRANSPORT_* return codes in a number
    of locations and addresses cases where transport_generic_request_failure()
    was returning the incorrect sense upon CHECK_CONDITION status after the
    v3.1 conversion to use errno return codes.

    This includes the conversion of transport_generic_request_failure() to
    process cmd->scsi_sense_reason and handle extra TCM_RESERVATION_CONFLICT
    before calling transport_send_check_condition_and_sense() to queue up
    response status. It also drops PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES legacy
    usage, and returns TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE w/ a response
    for these cases.

    transport_generic_allocate_tasks(), transport_generic_new_cmd(), backend
    SCF_SCSI_DATA_SG_IO_CDB ->do_task(), and emulated ->execute_task() have
    all been updated to set se_cmd->scsi_sense_reason and return errno codes
    universally upon failure. This includes cmd->scsi_sense_reason assignment
    in target_core_alua.c, target_core_pr.c and target_core_cdb.c emulation code.

    Finally it updates fabric modules to remove the legacy usage, and for
    TFO->new_cmd_map() callers forwards return values outside of fabric code.
    iscsi-target has also been updated to remove a handful of special cases
    related to the cleanup and signaling QUEUE_FULL handling w/ ft_write_pending()

    (v2: Drop extra SCF_SCSI_CDB_EXCEPTION check during failure from
    transport_generic_new_cmd, and re-add missing task->task_error_status
    assignment in transport_complete_task)

    Cc: Christoph Hellwig
    Cc: stable@kernel.org
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     

02 Dec, 2011

5 commits

  • Linus Torvalds
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
    ocfs2: avoid unaligned access to dqc_bitmap
    ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
    ocfs2: honor O_(D)SYNC flag in fallocate
    ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
    ocfs2: send correct UUID to cleancache initialization
    ocfs2: Commit transactions in error cases -v2
    ocfs2: make direntry invalid when deleting it
    fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
    ocfs2: Avoid livelock in ocfs2_readpage()
    ocfs2: serialize unaligned aio
    ocfs2: Implement llseek()
    ocfs2: Fix ocfs2_page_mkwrite()
    ocfs2: Add comment about orphan scanning
    ocfs2: Clean up messages in the fs
    ocfs2/cluster: Cluster up now includes network connections too
    ocfs2/cluster: Add new function o2net_fill_node_map()
    ocfs2/cluster: Fix output in file elapsed_time_in_ms
    ocfs2/dlm: dlmlock_remote() needs to account for remastery
    ocfs2/dlm: Take inflight reference count for remotely mastered resources too
    ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
    ...

    Linus Torvalds
     
  • The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
    but not 64-bit aligned. The dqc_bitmap is accessed by ocfs2_set_bit(),
    ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit(). These
    are wrapper macros for ext2_*_bit(), which need to take an unsigned long
    aligned address (though some architectures are able to handle unaligned
    addresses correctly).

    So some 64bit architectures may not be able to access the dqc_bitmap
    correctly.

    This avoids such unaligned access by using another set of wrapper functions
    for ext2_*_bit(). The code is taken from fs/ext4/mballoc.c, which also needs
    to handle unaligned bitmap access.

    Signed-off-by: Akinobu Mita
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Joel Becker

    Akinobu Mita
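
The correction step can be sketched as follows, modeled loosely on the approach the message attributes to fs/ext4/mballoc.c. The function names are hypothetical, and `le_set_bit` is a byte-wise userspace stand-in for a little-endian ext2-style bit operation rather than the kernel primitive.

```c
#include <stdint.h>

/*
 * Round addr down to an unsigned long boundary and add the skipped
 * bytes (times 8) to the bit number, so the same bit is reached
 * through an aligned pointer.
 */
static void *correct_addr_and_bit(int *bit, void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t mask = sizeof(unsigned long) - 1;

	*bit += (int)((a & mask) << 3);
	return (void *)(a & ~mask);
}

/* Byte-wise stand-in for a little-endian set_bit() on an aligned address. */
static void le_set_bit(int bit, void *addr)
{
	((uint8_t *)addr)[bit >> 3] |= (uint8_t)(1u << (bit & 7));
}

/* Wrapper in the spirit of the fix: correct first, then call the aligned op. */
static void set_bit_unaligned(int bit, void *addr)
{
	addr = correct_addr_and_bit(&bit, addr);
	le_set_bit(bit, addr);
}
```

Because the bitmap uses little-endian bit numbering, moving the base address back by k bytes and adding 8k to the bit index selects exactly the same bit.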
     
  • * 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
    ARM: 7182/1: ARM cpu topology: fix warning
    ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below
    ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction
    ARM: 7177/1: GIC: avoid skipping non-existent PPIs in irq_start calculation
    ARM: 7176/1: cpu_pm: register GIC PM notifier only once
    ARM: 7175/1: add subname parameter to mfp_set_groupg callers
    ARM: 7174/1: Fix build error in kprobes test code on Thumb2 kernels
    ARM: 7172/1: dma: Drop GFP_COMP for DMA memory allocations
    ARM: 7171/1: unwind: add unwind directives to bitops assembly macros
    ARM: 7170/2: fix compilation breakage in entry-armv.S
    ARM: 7168/1: use cache type functions for arch_get_unmapped_area
    ARM: perf: check that we have a platform device when reserving PMU
    ARM: 7166/1: Use PMD_SHIFT instead of PGDIR_SHIFT in dma-consistent.c
    ARM: 7165/2: PL330: Fix typo in _prepare_ccr()
    ARM: 7163/2: PL330: Only register usable channels
    ARM: 7162/1: errata: tidy up Kconfig options for PL310 errata workarounds
    ARM: 7161/1: errata: no automatic store buffer drain
    ARM: perf: initialise used_mask for fake PMU during validation
    ARM: PMU: remove pmu_init declaration
    ARM: PMU: re-export release_pmu symbol to modules

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: fix meta data raid-repair merge problem
    Btrfs: skip allocation attempt from empty cluster
    Btrfs: skip block groups without enough space for a cluster
    Btrfs: start search for new cluster at the beginning
    Btrfs: reset cluster's max_size when creating bitmap
    Btrfs: initialize new bitmaps' list
    Btrfs: fix oops when calling statfs on readonly device
    Btrfs: Don't error on resizing FS to same size
    Btrfs: fix deadlock on metadata reservation when evicting a inode
    Fix URL of btrfs-progs git repository in docs
    btrfs scrub: handle -ENOMEM from init_ipath()

    Linus Torvalds
     

01 Dec, 2011

14 commits

  • Commit 4a54c8c16 introduced raid-repair, killing the individual
    readpage_io_failed_hook entries from inode.c and disk-io.c. Commit
    4bb31e92 introduced new readahead code, adding a readpage_io_failed_hook to
    disk-io.c.

    The raid-repair commit had logic to disable raid-repair, if
    readpage_io_failed_hook is set. Thus, the readahead commit effectively
    disabled raid-repair for meta data.

    This commit changes the logic to always attempt raid-repair when needed and
    call the readpage_io_failed_hook in case raid-repair fails. This is much
    more straightforward and should have been like that from the beginning.

    Signed-off-by: Jan Schmidt
    Reported-by: Stefan Behrens
    Signed-off-by: Chris Mason

    Jan Schmidt
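
The new control flow can be sketched with plain function pointers. Everything here is hypothetical scaffolding: `handle_read_failure`, the hook type and the stub callbacks are not btrfs functions, and the return values are arbitrary.

```c
#include <stddef.h>

typedef int (*io_failed_hook_t)(void *page); /* hypothetical hook type */

/* Returns 0 if the read was repaired, otherwise the hook's result (or -1). */
static int handle_read_failure(int (*try_repair)(void *page),
			       io_failed_hook_t hook, void *page)
{
	/* Always attempt the repair first, whether or not a hook is set. */
	if (try_repair(page) == 0)
		return 0;
	/* Only when repair fails does the hook get a chance to run. */
	if (hook)
		return hook(page);
	return -1;
}

/* Illustrative stubs used to exercise the sketch. */
static int hook_calls;
static int repair_ok(void *page) { (void)page; return 0; }
static int repair_fail(void *page) { (void)page; return -1; }
static int counting_hook(void *page) { (void)page; hook_calls++; return -5; }
```

The presence of a hook no longer decides whether repair is tried; it only determines what happens after repair has failed.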
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    IB: Fix RCU lockdep splats
    IB/ipoib: Prevent hung task or softlockup processing multicast response
    IB/qib: Fix over-scheduling of QSFP work
    RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2
    RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic
    IB/qib: Don't use schedule_work()

    Linus Torvalds
     
  • * 'dt-for-linus' of git://sources.calxeda.com/kernel/linux:
    of: Add Silicon Image vendor prefix
    of/irq: of_irq_init: add check for parent equal to child node

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
    regulator: twl: fix twl4030 support for smps regulators
    regulator: fix use after free bug
    regulator: aat2870: Fix the logic of checking if no id is matched in aat2870_get_regulator

    Linus Torvalds
     
  • * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (45 commits)
    ARM: ux500: update defconfig
    ARM: u300: update defconfig
    ARM: at91: enable additional boards in existing soc defconfig files
    ARM: at91: refresh soc defconfig files for 3.2
    ARM: at91: rename defconfig files appropriately
    ARM: OMAP2+: Fix Compilation error when omap_l3_noc built as module
    ARM: OMAP2+: Remove empty io.h
    ARM: OMAP2: select ARM_AMBA if OMAP3_EMU is defined
    ARM: OMAP: smartreflex: fix IRQ handling bug
    ARM: OMAP: PM: only register TWL with voltage layer when device is present
    ARM: OMAP: hwmod: Fix the addr space, irq, dma count APIs
    arm: mx28: fix bit operation in clock setting
    ARM: imx: export imx_ioremap
    ARM: imx/mm-imx3: conditionally compile i.MX31 and i.MX35 code
    ARM: mx5: Fix checkpatch warnings in cpu-imx5.c
    MAINTAINERS: Add missing directory
    ARM: imx: drop 'ARCH_MX31' and 'ARCH_MX35'
    ARM: imx6q: move clock register map to machine_desc.map_io
    ARM: pxa168/gplugd: add the correct SSP device
    ARM: Update mach-types to fix mxs build breakage
    ...

    Linus Torvalds
     
  • kernel/sched.c:7354:2: warning: initialization from incompatible pointer type

    Align the cpu_coregroup_mask prototype with the sched_domain_mask_f typedef:
    use int cpu instead of unsigned int cpu.

    Cc:
    Signed-off-by: Vincent Guittot
    Signed-off-by: Russell King

    Vincent Guittot
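
The warning arises when a function is assigned to a function pointer whose parameter type differs. A userspace sketch of the matching case follows; `struct cpumask` here is a one-field stand-in, and `core_masks` and `mask_fn` are hypothetical names.

```c
/* One-field stand-in for the kernel's struct cpumask. */
struct cpumask { unsigned long bits; };

/* Modeled on the typedef the commit names: the callback takes int cpu. */
typedef const struct cpumask *(sched_domain_mask_f)(int cpu);

static const struct cpumask core_masks[2] = { { 0x1 }, { 0x2 } };

/* Matches the typedef exactly; declaring the parameter as unsigned int
 * would make the initialization below an incompatible pointer type. */
static const struct cpumask *cpu_coregroup_mask(int cpu)
{
	return &core_masks[cpu];
}

static sched_domain_mask_f *const mask_fn = cpu_coregroup_mask;
```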
     
  • The SWP instruction is deprecated on ARMv6 and with ARMv7 it will be
    UNDEFINED when CONFIG_SWP_EMULATE is selected. In this case, probing a
    SWP instruction will cause an oops when the kprobes emulation code
    executes an undefined instruction.

    As the SWP instruction should be rare or non-existent in kernels for
    ARMv6 and later, we can simply avoid these problems by not allowing
    probing of these.

    Reported-by: Leif Lindholm
    Tested-by: Leif Lindholm
    Acked-by: Nicolas Pitre
    Signed-off-by: Jon Medhurst
    Signed-off-by: Russell King

    Jon Medhurst (Tixy)
     
  • There is a kprobes testcase for the instruction "strd r2, [r3], r4".
    This has unpredictable behaviour as it uses r3 for register writeback
    addressing and also stores it to memory.

    On a cortex A9, this testcase would fail because the instruction writes
    the updated value of r3 to memory, whereas the kprobes emulation code
    writes the original value.

    Fix this by changing the testcase to use r5 instead of r3.

    Reported-by: Leif Lindholm
    Tested-by: Leif Lindholm
    Acked-by: Nicolas Pitre
    Signed-off-by: Jon Medhurst
    Signed-off-by: Russell King

    Jon Medhurst (Tixy)
     
  • If we don't have a cluster, don't bother trying to allocate from it,
    jumping right away to the attempt to allocate a new cluster.

    Signed-off-by: Alexandre Oliva
    Signed-off-by: Chris Mason

    Alexandre Oliva
     
  • We test whether a block group has enough free space to hold the
    requested block, but when we're doing clustered allocation, we can
    save some cycles by testing whether it has enough room for the cluster
    upfront, otherwise we end up attempting to set up a cluster and
    failing. Only in the NO_EMPTY_SIZE loop do we attempt an unclustered
    allocation, and by then we'll have zeroed the cluster size, so this
    patch won't stop us from using the block group as a last resort.

    Signed-off-by: Alexandre Oliva
    Signed-off-by: Chris Mason

    Alexandre Oliva
     
  • Instead of starting at zero (offset is always zero), request a cluster
    starting at search_start, that denotes the beginning of the current
    block group.

    Signed-off-by: Alexandre Oliva
    Signed-off-by: Chris Mason

    Alexandre Oliva
     
  • The field that indicates the size of the largest contiguous chunk of
    free space in the cluster is not initialized when setting up bitmaps,
    it's only increased when we find a larger contiguous chunk. We end up
    retaining a larger value than appropriate for highly-fragmented
    clusters, which may cause pointless searches for large contiguous
    groups, and even cause clusters that do not meet the density
    requirements to be set up.

    Signed-off-by: Alexandre Oliva
    Signed-off-by: Chris Mason

    Alexandre Oliva
     
  • We're failing to create clusters with bitmaps because
    setup_cluster_no_bitmap checks that the list is empty before inserting
    the bitmap entry in the list for setup_cluster_bitmap, but the list
    field is only initialized when it is restored from the on-disk free
    space cache, or when it is written out to disk.

    Besides a potential race condition due to the multiple use of the list
    field, filesystem performance severely degrades over time: as we use
    up all non-bitmap free extents, the try-to-set-up-cluster dance is
    done at every metadata block allocation. For every block group, we
    fail to set up a cluster, and after failing on them all up to twice,
    we fall back to the much slower unclustered allocation.

    To make matters worse, before the unclustered allocation, we try to
    create new block groups until we reach the 1% threshold, which
    introduces additional bitmaps and thus block groups that we'll iterate
    over at each metadata block request.

    Alexandre Oliva
     
  • To reproduce this bug:

    # dd if=/dev/zero of=img bs=1M count=256
    # mkfs.btrfs img
    # losetup -r /dev/loop1 img
    # mount /dev/loop1 /mnt
    OOPS!!

    It triggered BUG_ON(!nr_devices) in btrfs_calc_avail_data_space().

    To fix this, instead of checking write-only devices, we check all open
    devices:

    # df -h /dev/loop1
    Filesystem Size Used Avail Use% Mounted on
    /dev/loop1 250M 28K 238M 1% /mnt

    Signed-off-by: Li Zefan

    Li Zefan