26 Sep, 2014

5 commits

  • commit cbd5228199d8be45d895d9d0cc2b8ce53835fc21 upstream.

    Hidden away in the last 8 bytes of the buffer_list page is a solitary
    statistic. It needs to be byte swapped or else ethtool -S will
    produce numbers that terrify the user.

    Since we do this in multiple places, create a helper function with a
    comment explaining what is going on.

    Signed-off-by: Anton Blanchard
    Signed-off-by: David S. Miller
    Signed-off-by: Jiri Slaby

    Anton Blanchard
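
The helper this commit describes can be sketched in user-space C. This is a minimal illustration, not the driver's code: the 4096-byte page size, the big-endian storage of the counter, and the name read_tail_statistic are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the buffer_list page is 4096 bytes long. */
#define BUFFER_LIST_PAGE_SIZE 4096u

/*
 * Hypothetical helper: read the counter held in the last 8 bytes of the
 * page, assuming it is stored big-endian. Assembling the value byte by
 * byte gives the same result on little- and big-endian hosts.
 */
static uint64_t read_tail_statistic(const unsigned char *page)
{
    uint64_t value = 0;

    assert(page != NULL);
    for (size_t i = BUFFER_LIST_PAGE_SIZE - 8; i < BUFFER_LIST_PAGE_SIZE; i++)
        value = (value << 8) | page[i];
    return value;
}
```

Keeping the offset arithmetic and the byte order in one commented function means every caller reports the same number, whatever the host endianness.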
     
  • commit c5edfff9db6f4d2c35c802acb4abe0df178becee upstream.

    Keystone K2E EVM uses the Marvell 0x9182 controller. This requires
    support for the ID in the ahci driver.

    Signed-off-by: Murali Karicheri
    Signed-off-by: Tejun Heo
    Cc: Santosh Shilimkar
    Signed-off-by: Jiri Slaby

    Murali Karicheri
     
  • commit 1b071a0947dbce5c184c12262e02540fbc493457 upstream.

    This patch adds the AHCI mode SATA Device IDs for the Intel 9 Series PCH.

    Signed-off-by: James Ralston
    Signed-off-by: Tejun Heo
    Signed-off-by: Jiri Slaby

    James Ralston
     
  • commit 4dc7c76cd500fa78c64adfda4b070b870a2b993c upstream.

    scc_bus_softreset() does not necessarily return zero.
    Propagate the error code.

    Signed-off-by: Arjun Sreedharan
    Signed-off-by: Tejun Heo
    Signed-off-by: Jiri Slaby

    Arjun Sreedharan
     
  • commit 2a13772a144d2956a7fedd18685921d0a9b8b783 upstream.

    Crucial M550 may cause data corruption on queued trims and is
    blacklisted. The pattern used for it fails to match the 1TB model,
    as its capacity field is four characters instead of three. Widen
    the pattern.

    Signed-off-by: Tejun Heo
    Reported-by: Charles Reiss
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81071
    Signed-off-by: Jiri Slaby

    Tejun Heo
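
The effect of widening a model pattern can be sketched with POSIX fnmatch() standing in for the kernel's own glob matcher. The two patterns and the model strings below are illustrative assumptions, not values quoted from the commit.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative patterns: '?' matches exactly one character, '*' any run. */
#define M550_NARROW_PATTERN "Crucial_CT???M550SSD*"
#define M550_WIDE_PATTERN   "Crucial_CT*M550SSD*"

/* Return true when the model string matches the glob pattern. */
static bool model_matches(const char *pattern, const char *model)
{
    assert(pattern != NULL && model != NULL);
    return fnmatch(pattern, model, 0) == 0;
}
```

With these assumed strings, a three-digit capacity such as "Crucial_CT512M550SSD1" matches both patterns, while a four-digit one such as "Crucial_CT1024M550SSD1" matches only the widened pattern.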
     

18 Sep, 2014

20 commits

  • commit 730a336c33a3398d65896e8ee3ef9f5679fe30a9 upstream.

    There still seem to be stability problems with other systems.

    Bug:
    https://bugs.freedesktop.org/show_bug.cgi?id=72921

    Signed-off-by: Alex Deucher
    Signed-off-by: Jiri Slaby

    Alex Deucher
     
  • commit 0c78a44964db3d483b0c09a8236e0fe123aa9cfc upstream.

    bapm enables the GPU and CPU to share TDP headroom. It was
    disabled by default since some laptops hung when it was enabled
    in conjunction with dpm. It seems to be stable on desktop
    boards and fixes hangs on boot with dpm enabled on certain
    boards, so enable it by default on desktop boards.

    bug:
    https://bugs.freedesktop.org/show_bug.cgi?id=72921

    Signed-off-by: Alex Deucher
    Signed-off-by: Jiri Slaby

    Alex Deucher
     
  • commit ece4a17d237a79f63fbfaf3f724a12b6d500555c upstream.

    Without this, ring initialization fails reliably during resume with

    [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head ffffff8804 tail 00000000 start 000e4000

    This is not a complete fix, but it is verified to make the ring
    initialization failures during resume much less likely.

    We were not able to root-cause this bug (likely HW-specific to Gen4 chips)
    yet. This is therefore used as duct tape until the problem is fully
    understood and a proper fix is created, so that people don't suffer from
    completely unusable systems in the meantime.

    The discussion and debugging is happening at

    https://bugs.freedesktop.org/show_bug.cgi?id=76554

    Signed-off-by: Jiri Kosina
    Signed-off-by: Daniel Vetter
    Signed-off-by: Jiri Slaby

    Jiri Kosina
     
  • commit f1d2a26b506e9dc7bbe94fae40da0a0d8dcfacd0 upstream.

    Seems to make VM flushes more stable on SI and CIK.

    v2: only use the PFP on the GFX ring on CIK

    Signed-off-by: Christian König
    Signed-off-by: Alex Deucher
    Signed-off-by: Jiri Slaby

    Christian König
     
  • commit 5dc355325b648dc9b4cf3bea4d968de46fd59215 upstream.

    Looks like the lm63 driver supports the lm64 as well.

    Signed-off-by: Alex Deucher
    Signed-off-by: Jiri Slaby

    Alex Deucher
     
  • commit a91576d7916f6cce76d30303e60e1ac47cf4a76d upstream.

    Commit 7dc19d5a "drivers: convert shrinkers to new count/scan API" added
    deadlock warnings that ttm_page_pool_free() and ttm_dma_page_pool_free()
    are currently doing GFP_KERNEL allocation.

    But these functions did not get updated to receive gfp_t argument.
    This patch explicitly passes sc->gfp_mask or GFP_KERNEL to these functions,
    and removes the deadlock warning.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Tetsuo Handa
     
  • commit 71336e011d1d2312bcbcaa8fcec7365024f3a95d upstream.

    While ttm_dma_pool_shrink_scan() tries to take mutex before doing GFP_KERNEL
    allocation, ttm_pool_shrink_scan() does not do it. This can result in stack
    overflow if kmalloc() in ttm_page_pool_free() triggered recursion due to
    memory pressure.

    shrink_slab()
    => ttm_pool_shrink_scan()
    => ttm_page_pool_free()
    => kmalloc(GFP_KERNEL)
    => shrink_slab()
    => ttm_pool_shrink_scan()
    => ttm_page_pool_free()
    => kmalloc(GFP_KERNEL)

    Change ttm_pool_shrink_scan() to do like ttm_dma_pool_shrink_scan() does.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Tetsuo Handa
     
  • commit 22e71691fd54c637800d10816bbeba9cf132d218 upstream.

    I can observe that a RHEL7 environment stalls with 100% CPU usage when a
    certain type of memory pressure is applied. While the shrinker functions
    are called by shrink_slab() before the OOM killer is triggered, the stall
    lasts for many minutes.

    One of the reasons for this stall is that
    ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and
    are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with
    _manager->lock held causes someone (including kswapd) to deadlock when
    these functions are called due to memory pressure. This patch changes
    "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to
    avoid deadlock.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Tetsuo Handa
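
The "try the lock, otherwise back off" pattern from this commit can be sketched in portable C11. An atomic_flag stands in for _manager->lock; the pool counter, the SHRINK_SKIPPED return value and all function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical "nothing was done" return value. */
#define SHRINK_SKIPPED ((unsigned long)-1)

static atomic_flag manager_lock = ATOMIC_FLAG_INIT; /* stand-in for _manager->lock */
static unsigned long pooled_pages = 100;            /* hypothetical pool size */

/* Returns non-zero if the lock was acquired, zero if it was already held. */
static int manager_trylock(void)
{
    return !atomic_flag_test_and_set(&manager_lock);
}

static void manager_unlock(void)
{
    atomic_flag_clear(&manager_lock);
}

/* Free up to nr_to_scan pages, but never wait for the lock. */
static unsigned long pool_shrink_scan(unsigned long nr_to_scan)
{
    unsigned long freed;

    if (!manager_trylock())
        return SHRINK_SKIPPED; /* holder may itself be allocating: back off */

    freed = nr_to_scan < pooled_pages ? nr_to_scan : pooled_pages;
    pooled_pages -= freed;
    assert(freed <= nr_to_scan);
    manager_unlock();
    return freed;
}
```

Because the scan function never blocks, a caller that arrives under memory pressure while the lock is held returns immediately instead of waiting on a holder that may itself be waiting for memory.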
     
  • commit 46c2df68f03a236b30808bba361f10900c88d95e upstream.

    We can use "unsigned int" instead of "atomic_t" by updating the start_pool
    variable under _manager->lock. This patch makes it possible to avoid
    skipping when choosing a pool to shrink in round-robin style, after the next
    patch changes mutex_lock(_manager->lock) to !mutex_trylock(_manager->lock).

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Tetsuo Handa
     
  • commit 11e504cc705e8ccb06ac93a276e11b5e8fee4d40 upstream.

    list_empty(&_manager->pools) being false before taking _manager->lock
    does not guarantee that _manager->npools != 0 after taking _manager->lock
    because _manager->npools is updated under _manager->lock.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Tetsuo Handa
     
  • commit c9a3ad25eddfdb898114a9d73cdb4c3472d9dfca upstream.

    display_timings_release calls kfree on the display_timings object passed
    to it. Calling kfree on the same object again afterwards is wrong. SLUB
    debug showed the following warning:

    =============================================================================
    BUG kmalloc-64 (Tainted: G W ): Object already free
    -----------------------------------------------------------------------------

    Disabling lock debugging due to kernel taint
    INFO: Allocated in of_get_display_timings+0x2c/0x214 age=601 cpu=0
    pid=884
    __slab_alloc.constprop.79+0x2e0/0x33c
    kmem_cache_alloc+0xac/0xdc
    of_get_display_timings+0x2c/0x214
    panel_probe+0x7c/0x314 [tilcdc]
    platform_drv_probe+0x18/0x48
    [..snip..]
    INFO: Freed in panel_destroy+0x18/0x3c [tilcdc] age=0 cpu=0 pid=907
    __slab_free+0x34/0x330
    panel_destroy+0x18/0x3c [tilcdc]
    tilcdc_unload+0xd0/0x118 [tilcdc]
    drm_dev_unregister+0x24/0x98
    [..snip..]

    Signed-off-by: Guido Martínez
    Tested-by: Darren Etheridge
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Guido Martínez
     
  • commit eb565a2bbadc6a5030a6dbe58db1aa52453e7edf upstream.

    Unregister resources in the correct order on tilcdc_drm_fini, which is
    the reverse order they were registered during tilcdc_drm_init.

    This also means unregistering the driver before releasing its resources.

    Signed-off-by: Guido Martínez
    Tested-by: Darren Etheridge
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Guido Martínez
     
  • commit 3a49012224ca9016658a831a327ff6a7fe5bb4f9 upstream.

    The driver did not unregister the allocated framebuffer, which caused
    memory leaks (and memory manager WARNs) when unloading. Also, the
    framebuffer device under /dev still existed after unloading.

    Add a call to drm_fbdev_cma_fini when unloading the module to prevent
    both issues.

    Signed-off-by: Guido Martínez
    Tested-by: Darren Etheridge
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Guido Martínez
     
  • commit 16dcbdef404f4e87dab985494381939fe0a2d456 upstream.

    Add a drm_sysfs_connector_remove call when we destroy the panel to make
    sure the connector node in sysfs gets deleted.

    This is required for proper unload and re-load of this driver, otherwise
    we will get a warning about a duplicate filename in sysfs.

    Signed-off-by: Guido Martínez
    Tested-by: Darren Etheridge
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Guido Martínez
     
  • commit daa15b4cd1eee58eb1322062a3320b1dbe5dc96e upstream.

    Add a drm_sysfs_connector_remove call when we destroy the panel to make
    sure the connector node in sysfs gets deleted.

    This is required for proper unload and re-load of this driver as a
    module. Without this, we would get a warning at re-load time like so:

    tda998x 0-0070: found TDA19988
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 825 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x54/0x74()
    sysfs: cannot create duplicate filename '/class/drm/card0-HDMI-A-1'
    Modules linked in: [..]
    CPU: 0 PID: 825 Comm: modprobe Not tainted 3.15.0-rc4-00027-g9dcdef4 #82
    [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [] (show_stack) from [] (warn_slowpath_common+0x68/0x88)
    [] (warn_slowpath_common) from [] (warn_slowpath_fmt+0x30/0x40)
    [] (warn_slowpath_fmt) from [] (sysfs_warn_dup+0x54/0x74)
    [] (sysfs_warn_dup) from [] (sysfs_do_create_link_sd.isra.2+0xb0/0xb8)
    [] (sysfs_do_create_link_sd.isra.2) from [] (device_add+0x338/0x520)
    [] (device_add) from [] (device_create_groups_vargs+0xa0/0xc4)
    [] (device_create_groups_vargs) from [] (device_create+0x24/0x2c)
    [] (device_create) from [] (drm_sysfs_connector_add+0x64/0x204)
    [] (drm_sysfs_connector_add) from [] (slave_modeset_init+0x120/0x1bc [tilcdc])
    [] (slave_modeset_init [tilcdc]) from [] (tilcdc_load+0x214/0x4c0 [tilcdc])
    [] (tilcdc_load [tilcdc]) from [] (drm_dev_register+0xa4/0x104)
    [..snip..]
    ---[ end trace 4df8d614936ebdee ]---
    [drm:drm_sysfs_connector_add] *ERROR* failed to register connector device: -17

    Signed-off-by: Guido Martínez
    Tested-by: Darren Etheridge
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Guido Martínez
     
  • commit e396900e649b0af31161634d87fe37076f46c12b upstream.

    Add a drm_sysfs_connector_remove call when we destroy the panel to make
    sure the connector node in sysfs gets deleted.

    This is required for proper unload and re-load of this driver as a
    module. Without this, we would get a warning at re-load time like so:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 824 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x54/0x74()
    sysfs: cannot create duplicate filename '/class/drm/card0-LVDS-1'
    Modules linked in: [...]
    CPU: 0 PID: 824 Comm: modprobe Not tainted 3.15.0-rc4-00027-g6484f96-dirty #81
    [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [] (show_stack) from [] (warn_slowpath_common+0x68/0x88)
    [] (warn_slowpath_common) from [] (warn_slowpath_fmt+0x30/0x40)
    [] (warn_slowpath_fmt) from [] (sysfs_warn_dup+0x54/0x74)
    [] (sysfs_warn_dup) from [] (sysfs_do_create_link_sd.isra.2+0xb0/0xb8)
    [] (sysfs_do_create_link_sd.isra.2) from [] (device_add+0x338/0x520)
    [] (device_add) from [] (device_create_groups_vargs+0xa0/0xc4)
    [] (device_create_groups_vargs) from [] (device_create+0x24/0x2c)
    [] (device_create) from [] (drm_sysfs_connector_add+0x64/0x204)
    [] (drm_sysfs_connector_add) from [] (panel_modeset_init+0xb8/0x134 [tilcdc])
    [] (panel_modeset_init [tilcdc]) from [] (tilcdc_load+0x214/0x4c0 [tilcdc])
    [] (tilcdc_load [tilcdc]) from [] (drm_dev_register+0xa4/0x104)
    [ .. snip .. ]
    ---[ end trace b2d09cd9578b0497 ]---
    [drm:drm_sysfs_connector_add] *ERROR* failed to register connector device: -17

    Signed-off-by: Guido Martínez
    Tested-by: Darren Etheridge
    Signed-off-by: Dave Airlie
    Signed-off-by: Jiri Slaby

    Guido Martínez
     
  • commit 671796dd96b6cd85b75fba9d3007bcf7e5f7c309 upstream.

    The driver assumes that endpoint 4 is always an interrupt endpoint.
    Unfortunately the type differs between high-speed and full-speed
    configurations: in the former case it is indeed an interrupt
    endpoint, but in the latter case it is a bulk endpoint. When URBs
    are sent with the wrong type, the kernel generates a warning
    message including a backtrace. In this specific case there will be
    a huge number of warnings, which can freeze the system.

    To fix this we are now sending URBs to endpoint 4 using the type
    found in the endpoint descriptor.

    A side note: the carl9170 firmware currently specifies endpoint 4 as
    an interrupt endpoint even in the full-speed configuration, but this
    has no relevance. Before the firmware is loaded the endpoint type is
    as described above, and once the firmware is running the stick is
    not re-enumerated, so the old descriptor is still used.

    Signed-off-by: Ronald Wahl
    Signed-off-by: John W. Linville
    Signed-off-by: Jiri Slaby

    Ronald Wahl
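
Choosing the URB type from the endpoint descriptor, as this commit describes, can be sketched as follows. The transfer type occupies the two low bits of bmAttributes in a standard USB endpoint descriptor (2 is bulk, 3 is interrupt); the trimmed struct, the enum and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Transfer type is held in bits 1..0 of bmAttributes. */
#define EP_XFERTYPE_MASK 0x03u
#define EP_XFER_BULK     0x02u
#define EP_XFER_INT      0x03u

/* Trimmed, hypothetical view of an endpoint descriptor. */
struct endpoint_desc {
    uint8_t bEndpointAddress;
    uint8_t bmAttributes;
};

/* Hypothetical result type: which kind of URB to submit. */
enum urb_kind { URB_KIND_BULK, URB_KIND_INT, URB_KIND_UNSUPPORTED };

/* Pick the URB kind from what the descriptor reports, not from a fixed guess. */
static enum urb_kind urb_kind_for(const struct endpoint_desc *ep)
{
    assert(ep != NULL);
    switch (ep->bmAttributes & EP_XFERTYPE_MASK) {
    case EP_XFER_BULK:
        return URB_KIND_BULK;
    case EP_XFER_INT:
        return URB_KIND_INT;
    default:
        return URB_KIND_UNSUPPORTED;
    }
}
```

The same call then yields an interrupt URB for a high-speed descriptor and a bulk URB for a full-speed one, with no hard-coded assumption about endpoint 4.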
     
  • commit bcc05910359183b431da92713e98eed478edf83a upstream.

    If scsi_remove_host() is invoked after a SCSI device has been blocked,
    if the fast_io_fail_tmo or dev_loss_tmo work gets scheduled on the
    workqueue executing srp_remove_work() and if an I/O request is
    scheduled after the SCSI device had been blocked by e.g. multipathd
    then the following deadlock can occur:

    kworker/6:1 D ffff880831f3c460 0 195 2 0x00000000
    Call Trace:
    [] schedule+0x29/0x70
    [] schedule_timeout+0x10f/0x2a0
    [] msleep+0x2f/0x40
    [] __blk_drain_queue+0x4e/0x180
    [] blk_cleanup_queue+0x225/0x230
    [] __scsi_remove_device+0x62/0xe0 [scsi_mod]
    [] scsi_forget_host+0x6f/0x80 [scsi_mod]
    [] scsi_remove_host+0x7a/0x130 [scsi_mod]
    [] srp_remove_work+0x95/0x180 [ib_srp]
    [] process_one_work+0x1ea/0x6c0
    [] worker_thread+0x11b/0x3a0
    [] kthread+0xed/0x110
    [] ret_from_fork+0x7c/0xb0
    multipathd D ffff880096acc460 0 5340 1 0x00000000
    Call Trace:
    [] schedule+0x29/0x70
    [] schedule_timeout+0x10f/0x2a0
    [] io_schedule_timeout+0x9b/0xf0
    [] wait_for_completion_io_timeout+0xdc/0x110
    [] blk_execute_rq+0x9b/0x100
    [] sg_io+0x1a5/0x450
    [] scsi_cmd_ioctl+0x2a1/0x430
    [] scsi_cmd_blk_ioctl+0x42/0x50
    [] sd_ioctl+0xbe/0x140 [sd_mod]
    [] blkdev_ioctl+0x234/0x840
    [] block_ioctl+0x41/0x50
    [] do_vfs_ioctl+0x300/0x520
    [] SyS_ioctl+0x41/0x80
    [] tracesys+0xd0/0xd5

    Fix this by scheduling removal work on another workqueue than the
    transport layer timers.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Sagi Grimberg
    Reviewed-by: David Dillow
    Cc: Sebastian Parschauer
    Signed-off-by: Roland Dreier
    Signed-off-by: Jiri Slaby

    Bart Van Assche
     
  • commit 40ddbf5069bd4e11447c0088fc75318e0aac53f0 upstream.

    Commit 65b97cf6b8de, introduced in v3.7, caused a regression
    by using a reversed CS_MASK, thus causing omap_calculate_ecc to
    always fail. As the NAND base driver never checks .calculate()'s
    return value, the zeroed ECC values are used as is without showing
    any error to the user. However, this won't work and the NAND device
    won't be protected by any error correction code.

    Fix the issue by using the correct mask.

    Code was tested on omap3beagle using the following procedure
    - flash the primary bootloader (MLO) from the kernel to the first
    NAND partition using nandwrite.
    - boot the board from NAND. This utilizes OMAP ROM loader that
    relies on 1-bit Hamming code ECC.

    Fixes: 65b97cf6b8de (mtd: nand: omap2: handle nand on gpmc)

    Signed-off-by: Roger Quadros
    Signed-off-by: Tony Lindgren
    Signed-off-by: Jiri Slaby

    Roger Quadros
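
Why a reversed mask makes such a check fail can be shown with a small sketch. The field layout (a 3-bit chip-select field starting at bit 1) and every name other than CS_MASK are assumptions made for illustration; the commit message only states that the mask was reversed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: a 3-bit chip-select field starting at bit 1. */
#define ECC_CONFIG_CS_SHIFT 1
#define CS_MASK             0x7u

/* Correct extraction: shift the field down, then keep only its bits. */
static uint32_t ecc_config_cs(uint32_t ecc_config)
{
    return (ecc_config >> ECC_CONFIG_CS_SHIFT) & CS_MASK;
}

/* Reversed mask, kept for comparison: it discards exactly the field bits. */
static uint32_t ecc_config_cs_reversed(uint32_t ecc_config)
{
    return (ecc_config >> ECC_CONFIG_CS_SHIFT) & ~CS_MASK;
}

/* True when the register currently selects chip-select number cs. */
static bool ecc_config_selects(uint32_t ecc_config, uint32_t cs)
{
    assert(cs <= CS_MASK);
    return ecc_config_cs(ecc_config) == cs;
}
```

With the inverted mask the low three bits of the result are always zero, so a comparison against any non-zero chip-select number can never succeed.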
     
  • commit a152056c912db82860a8b4c23d0bd3a5aa89e363 upstream.

    I got the following panic on my fsl p5020ds board.

    Unable to handle kernel paging request for data at address 0x7375627379737465
    Faulting instruction address: 0xc000000000100778
    Oops: Kernel access of bad area, sig: 11 [#1]
    SMP NR_CPUS=24 CoreNet Generic
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-next-20140613 #145
    task: c0000000fe080000 ti: c0000000fe088000 task.ti: c0000000fe088000
    NIP: c000000000100778 LR: c00000000010073c CTR: 0000000000000000
    REGS: c0000000fe08aa00 TRAP: 0300 Not tainted (3.15.0-next-20140613)
    MSR: 0000000080029000 CR: 24ad2e24 XER: 00000000
    DEAR: 7375627379737465 ESR: 0000000000000000 SOFTE: 1
    GPR00: c0000000000c99b0 c0000000fe08ac80 c0000000009598e0 c0000000fe001d80
    GPR04: 00000000000000d0 0000000000000913 c000000007902b20 0000000000000000
    GPR08: c0000000feaae888 0000000000000000 0000000007091000 0000000000200200
    GPR12: 0000000028ad2e28 c00000000fff4000 c0000000007abe08 0000000000000000
    GPR16: c0000000007ab160 c0000000007aaf98 c00000000060ba68 c0000000007abda8
    GPR20: c0000000007abde8 c0000000feaea6f8 c0000000feaea708 c0000000007abd10
    GPR24: c000000000989370 c0000000008c6228 00000000000041ed c0000000fe00a400
    GPR28: c00000000017c1cc 00000000000000d0 7375627379737465 c0000000fe001d80
    NIP [c000000000100778] .__kmalloc_track_caller+0x70/0x168
    LR [c00000000010073c] .__kmalloc_track_caller+0x34/0x168
    Call Trace:
    [c0000000fe08ac80] [c00000000087e6b8] uevent_sock_list+0x0/0x10 (unreliable)
    [c0000000fe08ad20] [c0000000000c99b0] .kstrdup+0x44/0x90
    [c0000000fe08adc0] [c00000000017c1cc] .__kernfs_new_node+0x4c/0x130
    [c0000000fe08ae70] [c00000000017d7e4] .kernfs_new_node+0x2c/0x64
    [c0000000fe08aef0] [c00000000017db00] .kernfs_create_dir_ns+0x34/0xc8
    [c0000000fe08af80] [c00000000018067c] .sysfs_create_dir_ns+0x58/0xcc
    [c0000000fe08b010] [c0000000002c711c] .kobject_add_internal+0xc8/0x384
    [c0000000fe08b0b0] [c0000000002c7644] .kobject_add+0x64/0xc8
    [c0000000fe08b140] [c000000000355ebc] .device_add+0x11c/0x654
    [c0000000fe08b200] [c0000000002b5988] .add_disk+0x20c/0x4b4
    [c0000000fe08b2c0] [c0000000003a21d4] .add_mtd_blktrans_dev+0x340/0x514
    [c0000000fe08b350] [c0000000003a3410] .mtdblock_add_mtd+0x74/0xb4
    [c0000000fe08b3e0] [c0000000003a32cc] .blktrans_notify_add+0x64/0x94
    [c0000000fe08b470] [c00000000039b5b4] .add_mtd_device+0x1d4/0x368
    [c0000000fe08b520] [c00000000039b830] .mtd_device_parse_register+0xe8/0x104
    [c0000000fe08b5c0] [c0000000003b8408] .of_flash_probe+0x72c/0x734
    [c0000000fe08b750] [c00000000035ba40] .platform_drv_probe+0x38/0x84
    [c0000000fe08b7d0] [c0000000003599a4] .really_probe+0xa4/0x29c
    [c0000000fe08b870] [c000000000359d3c] .__driver_attach+0x100/0x104
    [c0000000fe08b900] [c00000000035746c] .bus_for_each_dev+0x84/0xe4
    [c0000000fe08b9a0] [c0000000003593c0] .driver_attach+0x24/0x38
    [c0000000fe08ba10] [c000000000358f24] .bus_add_driver+0x1c8/0x2ac
    [c0000000fe08bab0] [c00000000035a3a4] .driver_register+0x8c/0x158
    [c0000000fe08bb30] [c00000000035b9f4] .__platform_driver_register+0x6c/0x80
    [c0000000fe08bba0] [c00000000084e080] .of_flash_driver_init+0x1c/0x30
    [c0000000fe08bc10] [c000000000001864] .do_one_initcall+0xbc/0x238
    [c0000000fe08bd00] [c00000000082cdc0] .kernel_init_freeable+0x188/0x268
    [c0000000fe08bdb0] [c0000000000020a0] .kernel_init+0x1c/0xf7c
    [c0000000fe08be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
    Instruction dump:
    41bd0010 480000c8 4bf04eb5 60000000 e94d0028 e93f0000 7cc95214 e8a60008
    7fc9502a 2fbe0000 419e00c8 e93f0022 39200000 88ed06b2 992d06b2
    ---[ end trace b4c9a94804a42d40 ]---

    It seems that the corrupted partition header on my mtd device triggers
    a bug in the ftl. The function build_maps() allocates the buffers
    needed by the mtd partition, but if something goes wrong, such as a
    kmalloc failure, an mtd read error or an invalid partition header
    parameter, it frees all allocated buffers and then returns non-zero.
    In my case, it seems that the partition header parameter
    'NumTransferUnits' is invalid.

    ftl_freepart() is a function which frees all the partition
    buffers allocated by build_maps(). Given that build_maps() is a
    self-cleaning function, there is no need to invoke ftl_freepart()
    even if build_maps() returns with an error. Otherwise the buffers
    would be freed twice and weird things would happen.

    Signed-off-by: Kevin Hao
    Signed-off-by: Brian Norris
    Signed-off-by: Jiri Slaby

    Kevin Hao
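
The "self-cleaning builder" contract described in this commit can be sketched in user-space C. The struct, the function names and the zero-units failure condition are hypothetical. Resetting the pointers to NULL after freeing is an extra defensive step in this sketch, not something the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical partition state with two buffers owned by the builder. */
struct partition {
    int *map;
    int *cache;
};

/* Free both buffers and reset the pointers. */
static void partition_free(struct partition *part)
{
    free(part->map);
    free(part->cache);
    part->map = NULL;
    part->cache = NULL;
}

/* Self-cleaning builder: on failure nothing is left allocated. */
static int partition_build(struct partition *part, size_t units)
{
    part->map = NULL;
    part->cache = NULL;

    if (units == 0) /* stand-in for an invalid header parameter */
        return -1;

    part->map = calloc(units, sizeof(*part->map));
    part->cache = calloc(units, sizeof(*part->cache));
    if (part->map == NULL || part->cache == NULL) {
        partition_free(part);
        return -1;
    }
    return 0;
}

/* Caller: on error just report it; the builder already cleaned up. */
static int partition_scan(struct partition *part, size_t units)
{
    if (partition_build(part, units) != 0)
        return -1;
    assert(part->map != NULL && part->cache != NULL);
    return 0;
}
```

The error path in partition_scan() deliberately contains no call to partition_free(), which mirrors the change the commit makes for build_maps() and ftl_freepart().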
     

17 Sep, 2014

15 commits

  • commit 2f0304d21867476394cd51a54e97f7273d112261 upstream.

    If the user creates a listening cm_id with backlog of 0 the IWCM ends
    up not allowing any connection requests at all. The correct behavior
    is for the IWCM to pick a default value if the user backlog parameter
    is zero.

    Lustre from version 1.8.8 onward uses a backlog of 0, which breaks
    iwarp support without this fix.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier
    Signed-off-by: Jiri Slaby

    Steve Wise
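
The behavior this commit asks for amounts to substituting a default when the caller passes zero. A minimal sketch, with a hypothetical default value and function name:

```c
#include <assert.h>

/* Hypothetical default; the commit message does not state the value. */
#define DEFAULT_BACKLOG 256

/* Treat a backlog of 0 as "use the default" rather than "accept nothing". */
static int effective_backlog(int requested)
{
    assert(requested >= 0);
    return requested == 0 ? DEFAULT_BACKLOG : requested;
}
```

A caller that passes 0, as the commit says Lustre does, then ends up with a usable queue depth, while explicit non-zero requests are passed through unchanged.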
     
  • commit b39685526f46976bcd13aa08c82480092befa46c upstream.

    When a raid10 commences a resync/recovery/reshape it allocates
    some buffer space.
    When a resync/recovery completes the buffer space is freed, but
    not when a reshape completes.
    This can result in a small memory leak.

    There is a subtle side-effect of this bug. When a RAID10 is reshaped
    to a larger array (more devices), the reshape is immediately followed
    by a "resync" of the new space. This "resync" will use the buffer
    space which was allocated for "reshape". This can cause problems
    including a "BUG" in the SCSI layer. So this is suitable for -stable.

    Fixes: 3ea7daa5d7fde47cd41f4d56c2deb949114da9d6
    Signed-off-by: NeilBrown
    Signed-off-by: Jiri Slaby

    NeilBrown
     
  • commit ce0b0a46955d1bb389684a2605dbcaa990ba0154 upstream.

    raid10 reshape clears unwanted bits from a bio->bi_flags using
    a method which, while clumsy, worked until 3.10 when BIO_OWNS_VEC
    was added.
    Since then it clears that bit but shouldn't. This results in a
    memory leak.

    So change to use the approved method of clearing unwanted bits.

    As this causes a memory leak which can consume all of memory
    the fix is suitable for -stable.

    Fixes: a38352e0ac02dbbd4fa464dc22d1352b5fbd06fd
    Reported-by: mdraid.pkoch@dfgh.net (Peter Koch)
    Signed-off-by: NeilBrown
    Signed-off-by: Jiri Slaby

    NeilBrown
     
  • commit 9c4bdf697c39805078392d5ddbbba5ae5680e0dd upstream.

    During recovery of a double-degraded RAID6 it is possible for
    some blocks not to be recovered properly, leading to corruption.

    If a write happens to one block in a stripe that would be written to a
    missing device, and at the same time that stripe is recovering data
    to the other missing device, then that recovered data may not be written.

    This patch skips, in the double-degraded case, an optimisation that is
    only safe for single-degraded arrays.

    Bug was introduced in 2.6.32 and fix is suitable for any kernel since
    then. In an older kernel with separate handle_stripe5() and
    handle_stripe6() functions the patch must change handle_stripe6().

    Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8
    Cc: Yuri Tikhonov
    Cc: Dan Williams
    Reported-by: "Manibalan P"
    Tested-by: "Manibalan P"
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
    Signed-off-by: NeilBrown
    Acked-by: Dan Williams
    Signed-off-by: Jiri Slaby

    NeilBrown
     
  • commit 2446dba03f9dabe0b477a126cbeb377854785b47 upstream.

    Currently we don't abort recovery on a write error if the write error
    to the recovering device was triggered by normal IO (as opposed to
    recovery IO).

    This means that for one bitmap region, the recovery might write to the
    recovering device for a few sectors, then not bother for subsequent
    sectors (as it never writes to failed devices). In this case
    the bitmap bit will be cleared, but it really shouldn't.

    The result is that if the recovering device fails and is then re-added
    (after fixing whatever hardware problem triggered the failure),
    the second recovery won't redo the region it was in the middle of,
    so some of the device will not be recovered properly.

    If we abort the recovery, the region being processed will be cancelled
    (bit not cleared) and the whole region will be retried.

    As the bug can result in data corruption the patch is suitable for
    -stable. For kernels prior to 3.11 there is a conflict in raid10.c
    which will require care.

    Original-from: jiao hui
    Reported-and-tested-by: jiao hui
    Signed-off-by: NeilBrown
    Signed-off-by: Jiri Slaby

    NeilBrown
     
  • commit 6726655dfdd2dc60c035c690d9f10cb69d7ea075 upstream.

    There is the following AB-BA dependency between cpu_hotplug.lock and
    cpuidle_lock:

    1) cpu_hotplug.lock -> cpuidle_lock
    enable_nonboot_cpus()
     _cpu_up()
      cpu_hotplug_begin()
       LOCK(cpu_hotplug.lock)
      cpu_notify()
      ...
       acpi_processor_hotplug()
        cpuidle_pause_and_lock()
         LOCK(cpuidle_lock)

    2) cpuidle_lock -> cpu_hotplug.lock
    acpi_os_execute_deferred() workqueue
     ...
      acpi_processor_cst_has_changed()
       cpuidle_pause_and_lock()
        LOCK(cpuidle_lock)
       get_online_cpus()
        LOCK(cpu_hotplug.lock)

    Fix this by reversing the order in which acpi_processor_cst_has_changed()
    does things: let it first take the protection against CPU hotplug by
    calling get_online_cpus() and obtain the cpuidle lock only after that (and
    perform the symmetric change when allowing CPU hotplug again and
    dropping the cpuidle lock).

    Spotted by lockdep.

    Signed-off-by: Jiri Kosina
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Jiri Slaby

    Jiri Kosina
     
  • commit aca26364689e00e3b2052072424682231bdae6ae upstream.

    The SPI host controller is the same as used in Baytrail, only the ACPI ID
    is different so add this new ID to the list.

    Signed-off-by: Alan Cox
    Signed-off-by: Mika Westerberg
    Signed-off-by: Mark Brown
    Signed-off-by: Jiri Slaby

    Alan Cox
     
  • commit 8aa5e56eeb61a099ea6519eb30ee399e1bc043ce upstream.

    Adds return status check on copy routines to delete the allocated destination
    object if either copy fails. Reported by Colin Ian King on bugs.acpica.org,
    Bug 1087.
    The last applicable commit:
    Commit: 3371c19c294a4cb3649aa4e84606be8a1d999e61
    Subject: ACPICA: Remove ACPI_GET_OBJECT_TYPE macro

    Link: https://bugs.acpica.org/show_bug.cgi?id=1087
    Reported-by: Colin Ian King
    Signed-off-by: David E. Box
    Signed-off-by: Bob Moore
    Signed-off-by: Lv Zheng
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Jiri Slaby

    David E. Box
     
  • commit 03a6c3ff3282ee9fa893089304d951e0be93a144 upstream.

    bfa_swap_words() shifts its argument (assumed to be 64-bit) by 32 bits
    each way. In two places the argument type is dma_addr_t, which may be
    32-bit, in which case the effect of the bit shift is undefined:

    drivers/scsi/bfa/bfa_fcpim.c: In function 'bfa_ioim_send_ioreq':
    drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: left shift count >= width of type [enabled by default]
    addr = bfa_sgaddr_le(sg_dma_address(sg));
    ^
    drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: right shift count >= width of type [enabled by default]
    drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: left shift count >= width of type [enabled by default]
    addr = bfa_sgaddr_le(sg_dma_address(sg));
    ^
    drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: right shift count >= width of type [enabled by default]

    Avoid this by adding casts to u64 in bfa_swap_words().

    Compile-tested only.

    Signed-off-by: Ben Hutchings
    Reviewed-by: Martin K. Petersen
    Acked-by: Anil Gurumurthy
    Fixes: f16a17507b09 ('[SCSI] bfa: remove all OS wrappers')
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    Ben Hutchings
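
The effect of the added casts can be sketched outside the kernel. SWAP_WORDS below is a hypothetical stand-in for bfa_swap_words(), and uint64_t plays the role of the kernel's u64. Widening the argument before shifting keeps both shifts well defined even when the caller passes a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Swap the two 32-bit halves of a value. The argument is cast to 64 bits
 * before each shift, so a 32-bit argument is never shifted by its full
 * width.
 */
#define SWAP_WORDS(x) \
    ((((uint64_t)(x)) << 32) | (((uint64_t)(x)) >> 32))

/* Swapping twice must give back the original 64-bit value. */
static void swap_words_selfcheck(uint64_t x)
{
    assert(SWAP_WORDS(SWAP_WORDS(x)) == x);
}
```

For a 32-bit input the right shift contributes zero, so the value simply moves into the upper half of the 64-bit result.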
     
  • commit 0213436a2cc5e4a5ca2fabfaa4d3877097f3b13f upstream.

    Some devices don't like REPORT SUPPORTED OPERATION CODES and will
    simply time out, causing sd_mod init to take a very long time.
    Introduce BLIST_NO_RSOC scsi scan flag, that stops RSOC from being
    issued. Add it to Promise Vtrak E610f entry in scsi scan
    blacklist. Fixes bug #79901 reported at
    https://bugzilla.kernel.org/show_bug.cgi?id=79901

    Fixes: 98dcc2946adb ("SCSI: sd: Update WRITE SAME heuristics")

    Signed-off-by: Janusz Dziemidowicz
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    Janusz Dziemidowicz
     
  • commit c1d40a527e885a40bb9ea6c46a1b1145d42b66a0 upstream.

    Despite supporting modern SCSI features some storage devices continue to
    claim conformance to an older version of the SPC spec. This is done for
    compatibility with legacy operating systems.

    Linux by default will not attempt to read VPD pages on devices that
    claim SPC-2 or older. Introduce a blacklist flag that can be used to
    trigger VPD page inquiries on devices that are known to support them.

    Reported-by: KY Srinivasan
    Tested-by: KY Srinivasan
    Reviewed-by: KY Srinivasan
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    Martin K. Petersen
     
  • commit 22ffeb48b7584d6cd50f2a595ed6065d86a87459 upstream.

    Sequential scan for more than 256 LUNs is very fragile as
    LUNs might not be numbered sequentially after that point.

    SAM revisions later than SCSI-3 impose a structure on
    LUNs larger than 256, making LUN numbers between 256
    and 16384 illegal.
    SCSI-3, however, allows for plain 64-bit numbers with
    no internal structure.

    So restrict sequential LUN scan to 256 LUNs and add a
    new blacklist flag 'BLIST_SCSI3LUN' to scan up to
    max_lun devices.

    Signed-off-by: Hannes Reinecke
    Reviewed-by: Ewan Milne
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    Hannes Reinecke
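
The scan limit described in this commit can be sketched as a small function. The boolean parameter stands in for the BLIST_SCSI3LUN blacklist flag; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Sequential scanning stops here unless the device opts in. */
#define SEQUENTIAL_LUN_LIMIT 256u

/*
 * Return how many LUNs a sequential scan may probe. scsi3_luns stands in
 * for the BLIST_SCSI3LUN flag being set for the device.
 */
static unsigned int sequential_scan_limit(unsigned int max_lun, bool scsi3_luns)
{
    assert(max_lun > 0);
    if (!scsi3_luns && max_lun > SEQUENTIAL_LUN_LIMIT)
        return SEQUENTIAL_LUN_LIMIT;
    return max_lun;
}
```

Devices without the flag are capped at 256 LUNs, while flagged devices are still scanned up to max_lun.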
     
  • commit 3533f8603d28b77c62d75ec899449a99bc6b77a1 upstream.

    On some Windows hosts on FC SANs, TEST_UNIT_READY can return SRB_STATUS_ERROR.
    Correctly handle this. Note that there is sufficient sense information to
    support scsi error handling even in this case.

    Signed-off-by: K. Y. Srinivasan
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    K. Y. Srinivasan
     
  • commit f885fb73f64154690c2158e813de56363389ffec upstream.

    Correctly set SRB flags for all valid I/O directions. Some IHV drivers on the
    Windows host require this. The host validates the command and SRB flags
    prior to passing the command down to the native driver stack.

    Signed-off-by: K. Y. Srinivasan
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    K. Y. Srinivasan
     
  • commit adb6f9e1a8c6af1037232b59edb11277471537ea upstream.

    Based on the negotiated VMBUS protocol version, we adjust the size of the storage
    protocol messages. The two sizes we currently handle are pre-win8 and post-win8.
    In WS2012 R2, we are negotiating a higher VMBUS protocol version than the win8
    version. Make adjustments to correctly handle this.

    Signed-off-by: K. Y. Srinivasan
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jiri Slaby

    K. Y. Srinivasan
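
One way to keep such a size selection correct when newer protocol versions appear is to compare against a threshold instead of testing for one exact version. The sketch below illustrates that idea only. The version numbers, the message sizes and all names are hypothetical, and whether the original code used an exact-match test is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol version numbers; only their ordering matters here. */
enum proto_version {
    PROTO_WIN7 = 1,
    PROTO_WIN8 = 2,
    PROTO_WIN8_1 = 3 /* stand-in for the newer version seen on WS2012 R2 */
};

/* Hypothetical message sizes for the two layouts. */
#define MSG_SIZE_PRE_WIN8  48u
#define MSG_SIZE_POST_WIN8 64u

/* Range check, not equality: win8 and anything newer use the newer layout. */
static size_t storage_msg_size(enum proto_version negotiated)
{
    assert(negotiated >= PROTO_WIN7);
    return negotiated < PROTO_WIN8 ? MSG_SIZE_PRE_WIN8 : MSG_SIZE_POST_WIN8;
}
```

With a threshold comparison, a host that negotiates a version newer than win8 still gets the post-win8 message size without further code changes.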