23 Jan, 2020

1 commit

  • commit ad6bf88a6c19a39fb3b0045d78ea880325dfcf15 upstream.

    Logical block size has type unsigned short. That means that it can be at
    most 32768 (the largest power-of-two size that fits). However, there are
    architectures that can run with 64k pages
    (for example arm64) and on these architectures, it may be possible to
    create block devices with 64k block size.

    For example (run this on an architecture with 64k pages), create a
    dm-writecache device with a 64k block size and try to mount an ext4
    filesystem on it.

    Mount will fail with this error because it tries to read the superblock
    using a 2-sector access:
    device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536
    EXT4-fs (dm-0): unable to read superblock

    This patch changes the logical block size from unsigned short to unsigned
    int to avoid the overflow.
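
    A minimal user-space sketch of the truncation (illustrative only, not
    the kernel code): storing a 64k block size in a 16-bit field wraps to
    zero.

    #include <stdio.h>

    int main(void)
    {
            unsigned short lbs16 = 65536U;  /* wraps modulo 2^16 to 0 */
            unsigned int   lbs32 = 65536U;  /* fits */

            printf("unsigned short: %u\n", (unsigned)lbs16);  /* prints 0 */
            printf("unsigned int:   %u\n", lbs32);            /* prints 65536 */
            return 0;
    }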

    Cc: stable@vger.kernel.org
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Ming Lei
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
     

20 Sep, 2019

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - add dma-mapping and block layer helpers to take care of IOMMU merging
    for mmc plus subsequent fixups (Yoshihiro Shimoda)

    - rework handling of the pgprot bits for remapping (me)

    - take care of the dma direct infrastructure for swiotlb-xen (me)

    - improve the dma noncoherent remapping infrastructure (me)

    - better defaults for ->mmap, ->get_sgtable and ->get_required_mask
    (me)

    - clean up mmapping of coherent DMA allocations (me)

    - various misc cleanups (Andy Shevchenko, me)

    * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits)
    mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE
    mmc: queue: Fix bigger segments usage
    arm64: use asm-generic/dma-mapping.h
    swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
    swiotlb-xen: simplify cache maintainance
    swiotlb-xen: use the same foreign page check everywhere
    swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable
    xen: remove the exports for xen_{create,destroy}_contiguous_region
    xen/arm: remove xen_dma_ops
    xen/arm: simplify dma_cache_maint
    xen/arm: use dev_is_dma_coherent
    xen/arm: consolidate page-coherent.h
    xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance
    arm: remove wrappers for the generic dma remap helpers
    dma-mapping: introduce a dma_common_find_pages helper
    dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
    vmalloc: lift the arm flag for coherent mappings to common code
    dma-mapping: provide a better default ->get_required_mask
    dma-mapping: remove the dma_declare_coherent_memory export
    remoteproc: don't allow modular build
    ...

    Linus Torvalds
     

06 Sep, 2019

1 commit

  • Introduce the definition of elevator features through the
    elevator_features flags in the elevator_type structure. Each flag can
    represent a feature supported by an elevator. The first feature defined
    by this patch is support for zoned block device sequential write
    constraint with the flag ELEVATOR_F_ZBD_SEQ_WRITE, which is implemented
    by the mq-deadline elevator using zone write locking.

    Other possible features are IO priorities, write hints, latency targets
    or single-LUN dual-actuator disks (for which the elevator could maintain
    one LBA ordered list per actuator).

    The required_elevator_features field is also added to the request_queue
    structure to allow a device driver to specify elevator feature flags
    that an elevator must support for the correct operation of the device
    (e.g. device drivers for zoned block devices can have the
    ELEVATOR_F_ZBD_SEQ_WRITE flag as a required feature).
    The helper function blk_queue_required_elevator_features() is
    defined for setting this new field.

    With these two new fields in place, the elevator functions
    elevator_match() and elevator_find() are modified to allow a user to
    select only an elevator whose feature set satisfies the device's
    required features. Elevators not matching the device requirements are
    not shown in the device sysfs queue/scheduler file to prevent their use.

    The "none" elevator can always be selected as before.

    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Damien Le Moal
    Signed-off-by: Jens Axboe

    Damien Le Moal
     

03 Sep, 2019

1 commit


29 Aug, 2019

1 commit


27 Jul, 2019

1 commit

  • We should only set the max segment size to unlimited if we actually
    have a virt boundary. Otherwise we accidentally clear that limit
    when called from the SCSI midlayer, which always calls
    blk_queue_virt_boundary, even if that mask is 0.
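
    A sketch of the corrected behavior (simplified, not the verbatim
    patch):

    void blk_queue_virt_boundary(struct request_queue *q, unsigned long mask)
    {
            q->limits.virt_boundary_mask = mask;

            /*
             * Only lift the segment size limit when a boundary is
             * actually set; a zero mask from the SCSI midlayer must not
             * clear the limit.
             */
            if (mask)
                    q->limits.max_segment_size = UINT_MAX;
    }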

    Fixes: 7ad388d8e4c7 ("scsi: core: add a host / host template field for the virt boundary")
    Reported-by: Guenter Roeck
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

24 May, 2019

1 commit

    We currently fail to update the front/back segment size in the bio when
    deciding to allow an otherwise gappy segment to a device with a
    virt boundary. The reason why this did not cause problems is that
    devices with a virt boundary fundamentally don't use segments as we
    know them and thus don't care. Make that assumption formal by forcing
    an unlimited segment size in this case.

    Fixes: f6970f83ef79 ("block: don't check if adjacent bvecs in one bio can be mergeable")
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

01 May, 2019

2 commits


10 Feb, 2019

1 commit

  • We have various helpers for setting/clearing this flag, and also
    a helper to check if the queue supports queueable flushes or not.
    But nobody uses them anymore, kill it with fire.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

29 Dec, 2018

1 commit

  • Pull SCSI updates from James Bottomley:
    "This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
    megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.

    Additionally, we have a pile of annotation, unused variable and minor
    updates.

    The big API change is the updates for Christoph's DMA rework which
    include removing the DISABLE_CLUSTERING flag.

    And finally there are a couple of target tree updates"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
    scsi: isci: request: mark expected switch fall-through
    scsi: isci: remote_node_context: mark expected switch fall-throughs
    scsi: isci: remote_device: Mark expected switch fall-throughs
    scsi: isci: phy: Mark expected switch fall-through
    scsi: iscsi: Capture iscsi debug messages using tracepoints
    scsi: myrb: Mark expected switch fall-throughs
    scsi: megaraid: fix out-of-bound array accesses
    scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
    scsi: fcoe: remove set but not used variable 'port'
    scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
    scsi: smartpqi: fix build warnings
    scsi: smartpqi: update driver version
    scsi: smartpqi: add ofa support
    scsi: smartpqi: increase fw status register read timeout
    scsi: smartpqi: bump driver version
    scsi: smartpqi: add smp_utils support
    scsi: smartpqi: correct lun reset issues
    scsi: smartpqi: correct volume status
    scsi: smartpqi: do not offline disks for transient did no connect conditions
    scsi: smartpqi: allow for larger raid maps
    ...

    Linus Torvalds
     

19 Dec, 2018

1 commit

    Now that the SCSI layer has replaced the use of the cluster flag with
    segment size limits and the DMA boundary we can remove the cluster flag
    from the block layer.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jens Axboe
    Signed-off-by: Martin K. Petersen

    Christoph Hellwig
     

16 Nov, 2018

1 commit

  • ->queue_flags is generally not set or cleared in the fast path, and also
    generally set or cleared one flag at a time. Make use of the normal
    atomic bitops for it so that we don't need to take the queue_lock,
    which is otherwise mostly unused in the core block layer now.
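
    With this change the helpers reduce to plain bitops, roughly (a sketch,
    not the verbatim patch):

    void blk_queue_flag_set(unsigned int flag, struct request_queue *q)
    {
            /* Atomic bitop; no queue_lock required. */
            set_bit(flag, &q->queue_flags);
    }

    void blk_queue_flag_clear(unsigned int flag, struct request_queue *q)
    {
            clear_bit(flag, &q->queue_flags);
    }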

    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Nov, 2018

4 commits

  • With the legacy path gone, all we do is funnel it through the
    mq_ops->complete() operation.

    Tested-by: Ming Lei
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • The only user of legacy timing now is BSG, which is invoked
    from the mq timeout handler. Kill the legacy code, and rename
    the q->rq_timed_out_fn to q->bsg_job_timeout_fn.

    Reviewed-by: Hannes Reinecke
    Tested-by: Ming Lei
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This removes a bunch of core and elevator related code. On the core
    front, we remove anything related to queue running, draining,
    initialization, plugging, and congestions. We also kill anything
    related to request allocation, merging, retrieval, and completion.

    Remove any checking for single queue IO schedulers, as they no
    longer exist. This means we can also delete a bunch of code related
    to request issue, adding, completion, etc - and all the SQ related
    ops and helpers.

    Also kill load_default_modules(), as all that did was provide
    a way to load the default single queue elevator.

    Tested-by: Ming Lei
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Nobody is using the legacy path for blk_lld_busy() anymore, remove
    it.

    Reviewed-by: Hannes Reinecke
    Tested-by: Ming Lei
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     

31 Oct, 2018

1 commit

  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below and then
    semi-automated removal of duplicated '#include <linux/memblock.h>'.

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

25 Jul, 2018

1 commit

  • Set max_discard_segments to USHRT_MAX in blk_set_stacking_limits() so
    that blk_stack_limits() can stack up this limit for stacked devices.

    before:

    $ cat /sys/block/nvme0n1/queue/max_discard_segments
    256
    $ cat /sys/block/dm-0/queue/max_discard_segments
    1

    after:

    $ cat /sys/block/nvme0n1/queue/max_discard_segments
    256
    $ cat /sys/block/dm-0/queue/max_discard_segments
    256

    Fixes: 1e739730c5b9e ("block: optionally merge discontiguous discard bios into a single request")
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

09 Jul, 2018

1 commit


09 Mar, 2018

2 commits

  • Introduce functions that modify the queue flags and that protect
    these modifications with the request queue lock. Except for moving
    one wake_up_all() call from inside to outside a critical section,
    this patch does not change any functionality.
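
    A sketch of the pattern these helpers encapsulate (simplified; the
    later conversion to atomic bitops appears earlier in this log):

    void blk_queue_flag_set(unsigned int flag, struct request_queue *q)
    {
            unsigned long flags;

            spin_lock_irqsave(q->queue_lock, flags);
            queue_flag_set(flag, q);
            spin_unlock_irqrestore(q->queue_lock, flags);
    }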

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Except for changing the atomic queue flag manipulations that are
    protected by the queue lock into non-atomic manipulations, this
    patch does not change any functionality.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

11 Nov, 2017

1 commit

  • This helper doesn't buy us much over calling kmap_atomic directly.
    In fact in the only caller it does a bit of useless work as the
    caller already has the bvec at hand, and said caller would even
    be buggy for a multi-segment bio due to the use of this helper.

    So just remove it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

24 Aug, 2017

1 commit


28 Jun, 2017

1 commit


09 Apr, 2017

1 commit


09 Feb, 2017

1 commit

    Add a new merge strategy to the plug merging code that merges discard
    bios into a request until the maximum number of discard ranges (or the
    maximum discard size) is reached. I/O scheduler merging is not wired up yet
    but might also be useful, although not for fast devices like NVMe which
    are the only user for now.

    Note that for now we don't support limiting the size of each discard range,
    but if needed that can be added later.
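
    On the driver side, opting in amounts to raising the per-request
    discard segment limit, roughly like this (hypothetical driver function;
    the setter name assumes the usual queue-limit helper convention):

    static void mydrv_enable_discard_merging(struct request_queue *q)
    {
            /*
             * Allow up to 256 discard ranges per request; the default
             * of 1 effectively disables discard merging.
             */
            blk_queue_max_discard_segments(q, 256);
    }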

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

02 Feb, 2017

1 commit

  • We will want to have struct backing_dev_info allocated separately from
    struct request_queue. As the first step, add a pointer to backing_dev_info
    to request_queue and convert all users touching it. No functional
    changes in this patch.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
     

14 Dec, 2016

1 commit

  • Pull block layer updates from Jens Axboe:
    "This is the main block pull request this series. Contrary to previous
    releases, I've kept the core and driver changes in the same branch. We
    always ended up having dependencies between the two for obvious
    reasons, so it makes more sense to keep them together. That said, I'll
    probably try and keep more topical branches going forward, especially
    for cycles that end up being as busy as this one.

    The major parts of this pull request are:

    - Improved support for O_DIRECT on block devices, with a small
    private implementation instead of using the pig that is
    fs/direct-io.c. From Christoph.

    - Request completion tracking in a scalable fashion. This is utilized
    by two components in this pull, the new hybrid polling and the
    writeback queue throttling code.

    - Improved support for polling with O_DIRECT, adding a hybrid mode
    that combines pure polling with an initial sleep. From me.

    - Support for automatic throttling of writeback queues on the block
    side. This uses feedback from the device completion latencies to
    scale the queue on the block side up or down. From me.

    - Support for SMR drives in the block layer and for SD. From Hannes
    and Shaun.

    - Multi-connection support for nbd. From Josef.

    - Cleanup of request and bio flags, so we have a clear split between
    which are bio (or rq) private, and which ones are shared. From
    Christoph.

    - A set of patches from Bart, that improve how we handle queue
    stopping and starting in blk-mq.

    - Support for WRITE_ZEROES from Chaitanya.

    - Lightnvm updates from Javier/Matias.

    - Support for FC for the nvme-over-fabrics code. From James Smart.

    - A bunch of fixes from a whole slew of people, too many to name
    here"

    * 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits)
    blk-stat: fix a few cases of missing batch flushing
    blk-flush: run the queue when inserting blk-mq flush
    elevator: make the rqhash helpers exported
    blk-mq: abstract out blk_mq_dispatch_rq_list() helper
    blk-mq: add blk_mq_start_stopped_hw_queue()
    block: improve handling of the magic discard payload
    blk-wbt: don't throttle discard or write zeroes
    nbd: use dev_err_ratelimited in io path
    nbd: reset the setup task for NBD_CLEAR_SOCK
    nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME
    nvme-fabrics: Add target support for FC transport
    nvme-fabrics: Add host support for FC transport
    nvme-fabrics: Add FC transport LLDD api definitions
    nvme-fabrics: Add FC transport FC-NVME definitions
    nvme-fabrics: Add FC transport error codes to nvme.h
    Add type 0x28 NVME type code to scsi fc headers
    nvme-fabrics: patch target code in prep for FC transport support
    nvme-fabrics: set sqe.command_id in core not transports
    parser: add u64 number parser
    nvme-rdma: align to generic ib_event logging helper
    ...

    Linus Torvalds
     

13 Dec, 2016

1 commit

  • We ran into a funky issue, where someone doing 256K buffered reads saw
    128K requests at the device level. Turns out it is read-ahead capping
    the request size, since we use 128K as the default setting. This
    doesn't make a lot of sense - if someone is issuing 256K reads, they
    should see 256K reads, regardless of the read-ahead setting, if the
    underlying device can support a 256K read in a single command.

    This patch introduces a bdi hint, io_pages. This is the soft max IO
    size for the lower level, I've hooked it up to the bdev settings here.
    Read-ahead is modified to issue the maximum of the user request size,
    and the read-ahead max size, but capped to the max request size on the
    device side. The latter is done to avoid reading ahead too much, if the
    application asks for a huge read. With this patch, the kernel behaves
    like the application expects.
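
    A sketch of the sizing rule (illustrative names; not the exact kernel
    code):

    static unsigned long ra_request_size(struct backing_dev_info *bdi,
                                         unsigned long req_pages)
    {
            /* Honor large user reads, but cap at the device's soft limit. */
            unsigned long pages = max(req_pages, bdi->ra_pages);

            return min(pages, (unsigned long)bdi->io_pages);
    }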

    Link: http://lkml.kernel.org/r/1479498073-8657-1-git-send-email-axboe@fb.com
    Signed-off-by: Jens Axboe
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jens Axboe
     

01 Dec, 2016

1 commit

  • This adds a new block layer operation to zero out a range of
    LBAs. This allows zeroing to be implemented for devices that don't use
    either discard with a predictable zero pattern or WRITE SAME of zeroes.
    The prominent example of that is NVMe with the Write Zeroes command,
    but in the future, this should also help with improving the way
    zeroing discards work. For this operation, a suitable entry is exported
    in sysfs which indicates the maximum number of bytes the device allows
    in one write zeroes operation.
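
    A driver would advertise support roughly like this (hypothetical
    function; the setter name assumes the usual queue-limit helper
    convention):

    static void mydrv_setup_write_zeroes(struct request_queue *q,
                                         unsigned int max_sectors)
    {
            /* 0 (the default) means the device has no Write Zeroes support. */
            blk_queue_max_write_zeroes_sectors(q, max_sectors);
    }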

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Chaitanya Kulkarni
     

11 Nov, 2016

1 commit

    Enable throttling of buffered writeback to make it a lot
    smoother, with far less impact on other system activity.
    Background writeback should be, by definition, background
    activity. The fact that we flush huge bundles of it at the time
    means that it potentially has heavy impacts on foreground workloads,
    which isn't ideal. We can't easily limit the sizes of writes that
    we do, since that would impact file system layout in the presence
    of delayed allocation. So just throttle back buffered writeback,
    unless someone is waiting for it.

    The algorithm for when to throttle takes its inspiration in the
    CoDel networking scheduling algorithm. Like CoDel, blk-wb monitors
    the minimum latencies of requests over a window of time. In that
    window of time, if the minimum latency of any request exceeds a
    given target, then a scale count is incremented and the queue depth
    is shrunk. The next monitoring window is shrunk accordingly. Unlike
    CoDel, if we hit a window that exhibits good behavior, then we
    simply increment the scale count and re-calculate the limits for that
    scale value. This prevents us from oscillating between a
    close-to-ideal value and max all the time, instead remaining in the
    windows where we get good behavior.

    Unlike CoDel, blk-wb allows the scale count to go negative. This
    happens if we primarily have writes going on. Unlike positive
    scale counts, this doesn't change the size of the monitoring window.
    When the heavy writers finish, blk-wb quickly snaps back to its
    stable state of a zero scale count.
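
    As an illustrative sketch of the scale count's effect on the allowed
    depth (names and shifts are hypothetical simplifications, not the
    in-kernel code):

    struct wb_sim {
            int scale_step;          /* > 0 throttled, < 0 boosted */
            unsigned int max_depth;  /* base writeback queue depth */
    };

    static unsigned int wb_depth(const struct wb_sim *wb)
    {
            unsigned int depth = wb->max_depth;

            if (wb->scale_step > 0)
                    depth >>= wb->scale_step;       /* shrink after misses */
            else if (wb->scale_step < 0)
                    depth <<= -wb->scale_step;      /* deepen for pure writers */

            return depth ? depth : 1;
    }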

    The patch registers a sysfs entry, 'wb_lat_usec'. This sets the latency
    target to be met. It defaults to 2 msec for non-rotational storage, and
    75 msec for rotational storage. Setting this value to '0' disables
    blk-wb. Generally, a user would not have to touch this setting.

    We don't enable WBT on devices that are managed with CFQ, and have
    a non-root block cgroup attached. If we have a proportional share setup
    on this particular disk, then the wbt throttling will interfere with
    that. We don't have a strong need for wbt for that case, since we will
    rely on CFQ doing that for us.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

06 Nov, 2016

1 commit

  • For blk-mq, ->nr_requests does track queue depth, at least at init
    time. But for the older queue paths, it's simply a soft setting.
    On top of that, it's generally larger than the hardware setting
    on purpose, to allow backup of requests for merging.

    Fill a hole in struct request_queue with a 'queue_depth' member that
    drivers can set to more closely inform the block layer of the
    real queue depth.
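
    Usage is a single call at queue setup time (hypothetical driver
    function; the helper is the one this patch adds):

    static void mydrv_init_queue(struct request_queue *q, unsigned int hw_depth)
    {
            /* Report the device's real queue depth to the block layer. */
            blk_set_queue_depth(q, hw_depth);
    }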

    Signed-off-by: Jens Axboe
    Reviewed-by: Jan Kara

    Jens Axboe
     

19 Oct, 2016

2 commits

  • Signed-off-by: Hannes Reinecke
    Signed-off-by: Damien Le Moal
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Shaun Tancheff
    Tested-by: Shaun Tancheff
    Signed-off-by: Jens Axboe

    Hannes Reinecke
     
  • Add the zoned queue limit to indicate the zoning model of a block device.
    Defined values are 0 (BLK_ZONED_NONE) for regular block devices,
    1 (BLK_ZONED_HA) for host-aware zoned block devices and 2 (BLK_ZONED_HM)
    for host-managed zoned block devices. The standards-defined drive-managed
    model is not defined here since these block devices do not provide any
    command for accessing zone information. Drive-managed devices will
    be reported as BLK_ZONED_NONE.

    The helper functions blk_queue_zoned_model and bdev_zoned_model return
    the zoned limit and the functions blk_queue_is_zoned and bdev_is_zoned
    return a boolean for callers to test if a block device is zoned.

    The zoned attribute is also exported as a string to applications via
    sysfs. BLK_ZONED_NONE shows as "none", BLK_ZONED_HA as "host-aware" and
    BLK_ZONED_HM as "host-managed".
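
    A sketch of a caller mapping the model to the sysfs strings above
    (hypothetical function, using the helpers named earlier):

    static const char *zone_model_name(struct request_queue *q)
    {
            switch (blk_queue_zoned_model(q)) {
            case BLK_ZONED_HA:
                    return "host-aware";
            case BLK_ZONED_HM:
                    return "host-managed";
            default:
                    return "none";
            }
    }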

    Signed-off-by: Damien Le Moal
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Shaun Tancheff
    Tested-by: Shaun Tancheff
    Signed-off-by: Jens Axboe

    Damien Le Moal
     

14 Apr, 2016

1 commit


13 Apr, 2016

2 commits

  • We don't have any drivers left using it, so kill it off. Update
    documentation to use the newer blk_queue_write_cache().

    Signed-off-by: Jens Axboe
    Reviewed-by: Christoph Hellwig

    Jens Axboe
     
  • Add an internal helper and flag for setting whether a queue has
    write back caching, or write through (or none). Add a sysfs file
    to show this as well, and make it changeable from user space.

    This will replace the (awkward) blk_queue_flush() interface that
    drivers currently use to inform the block layer of write cache state
    and capabilities.
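
    A driver with a volatile write-back cache and FUA support would report
    it roughly like this (hypothetical driver function):

    static void mydrv_setup_cache(struct request_queue *q)
    {
            /* write-back cache present, FUA writes supported */
            blk_queue_write_cache(q, true, true);
    }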

    Signed-off-by: Jens Axboe
    Reviewed-by: Christoph Hellwig

    Jens Axboe
     

05 Apr, 2016

1 commit

    PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and likely never will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constants should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code where coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation also
    will be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

12 Feb, 2016

1 commit

  • The new queue limit is not used by the majority of block drivers, and
    should be initialized to 0 for the driver's requested settings to be used.

    Signed-off-by: Keith Busch
    Acked-by: Martin K. Petersen
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Keith Busch