24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and its variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings when it is the case.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

03 Jun, 2020

1 commit

  • The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Reviewed-by: Michael Kelley [hyperv]
    Acked-by: Gao Xiang [erofs]
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Wei Liu
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

09 May, 2020

4 commits

  • Before this patch, function gfs2_quota_unlock checked if quotas are
    turned off, and if so, it branched to label out, which called
    gfs2_quota_unhold. With the new system of gfs2_qa_get and put, we
    no longer want to call gfs2_quota_unhold or we won't balance our
    gets and puts.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     
  • Before this patch, function gfs2_quota_lock checked if it was called
    from a privileged user, and if so, it bypassed the quota check:
    superuser can operate outside the quotas.
    That's the wrong place for the check because the lock/unlock functions
    are separate from the lock_check function, and you can do lock and
    unlock without actually checking the quotas.

    This patch moves the check to gfs2_quota_lock_check.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     
  • This patch removes a check from gfs2_quota_check for whether quotas
    are enabled by the superblock. There is a test just prior for the
    GIF_QD_LOCKED bit in the inode, and that can only be set by functions
    that already check that quotas are enabled in the superblock.
    Therefore, the check is redundant.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     
  • Before this patch, gfs2_quota_change() would BUG_ON if the
    qa_ref counter was not a positive number. This patch changes it to
    be a withdraw instead. That way we can debug things more easily.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     

28 Mar, 2020

3 commits

  • Before this patch, multiple users called gfs2_qa_alloc which allocated
    a qadata structure to the inode, if quotas are turned on. Later, in
    file close or evict, the structure was deleted with gfs2_qa_delete.
    But there can be several competing processes who need access to the
    structure. There were races between file close (release) and the others.
    Thus, a release could delete the structure out from under a process
    that relied upon its existence. For example, chown.

    This patch changes the management of the qadata structures to be
    a get/put scheme. Function gfs2_qa_alloc has been changed to gfs2_qa_get
    and if the structure is allocated, the count essentially starts out at
    1. Function gfs2_qa_delete has been renamed to gfs2_qa_put, and the
    last guy to decrement the count to 0 frees the memory.

    Signed-off-by: Bob Peterson

    Bob Peterson
     
  • Before this patch, multiple callers called gfs2_rsqa_alloc to force
    the existence of a reservations structure and a quota data structure
    if needed. However, now the reservations are handled separately, so
    the quota data is only the quota data. So we eliminate the one in
    favor of just calling gfs2_qa_alloc directly.

    Signed-off-by: Bob Peterson

    Bob Peterson
     
  • Replace open-coded versions of list_first_entry and list_last_entry with those
    functions.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

27 Feb, 2020

1 commit

  • When a node withdraws from a file system, it often leaves its journal
    in an incomplete state. This is especially true when the withdraw is
    caused by io errors writing to the journal. Before this patch, a
    withdraw would try to write a "shutdown" record to the journal, tell
    dlm it's done with the file system, and none of the other nodes
    know about the problem. Later, when the problem is fixed and the
    withdrawn node is rebooted, it would then discover that its own
    journal was incomplete, and replay it. However, replaying it at this
    point is almost guaranteed to introduce corruption because the other
    nodes are likely to have used affected resource groups that appeared
    in the journal since the time of the withdraw. Replaying the journal
    later will overwrite any changes made, and not through any fault of
    dlm, which was instructed during the withdraw to release those
    resources.

    This patch makes file system withdraws seen by the entire cluster.
    Withdrawing nodes dequeue their journal glock to allow recovery.

    The remaining nodes check all the journals to see if they are
    clean or in need of replay. They try to replay dirty journals, but
    only the journals of withdrawn nodes will be "not busy" and
    therefore available for replay.

    Until the journal replay is complete, no i/o related glocks may be
    given out, to ensure that the replay does not cause the
    aforementioned corruption: We cannot allow any journal replay to
    overwrite blocks associated with a glock once it is held.

    The "live" glock which is now used to signal when a withdraw
    occurs. When a withdraw occurs, the node signals its withdraw by
    dequeueing the "live" glock and trying to enqueue it in EX mode,
    thus forcing the other nodes to all see a demote request, by way
    of a "1CB" (one callback) try lock. The "live" glock is not
    granted in EX; the callback is only just used to indicate a
    withdraw has occurred.

    Note that all nodes in the cluster must wait for the recovering
    node to finish replaying the withdrawing node's journal before
    continuing. To this end, it checks that the journals are clean
    multiple times in a retry loop.

    Also note that the withdraw function may be called from a wide
    variety of situations, and therefore, we need to take extra
    precautions to make sure pointers are valid before using them in
    many circumstances.

    We also need to take care when glocks decide to withdraw, since
    the withdraw code now uses glocks.

    Also, before this patch, if a process encountered an error and
    decided to withdraw, if another process was already withdrawing,
    the second withdraw would be silently ignored, which set it free
    to unlock its glocks. That's correct behavior if the original
    withdrawer encounters further errors down the road. But if
    secondary waiters don't wait for the journal replay, unlocking
    glocks will allow other nodes to use them, despite the fact that
    the journal containing those blocks is being replayed. The
    replay needs to finish before our glocks are released to other
    nodes. IOW, secondary withdraws need to wait for the first
    withdraw to finish.

    For example, if an rgrp glock is unlocked by a process that didn't
    wait for the first withdraw, a journal replay could introduce file
    system corruption by replaying a rgrp block that has already been
    granted to a different cluster node.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

10 Feb, 2020

2 commits

  • Before this patch, all io errors received by the quota daemon or the
    logd daemon would cause a complaint message to be issued, such as:

    gfs2: fsid=dm-13.0: Error 10 writing to journal, jid=0

    This patch changes it so that the error message is only issued the
    first time the error is encountered.

    Also, before this patch function gfs2_end_log_write did not set the
    sd_log_error value, so log errors would not cause the file system to
    be withdrawn. This patch sets the error code so the file system is
    properly withdrawn if an io error is encountered writing to the journal.

    WARNING: This change in function breaks check xfstests generic/441
    and causes it to fail: io errors writing to the log should cause a
    file system to be withdrawn, and no further operations are tolerated.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     
  • Before this patch, gfs2 kept track of journal io errors in two
    places sd_log_error and the SDF_AIL1_IO_ERROR flag in sd_flags.
    This patch consolidates the two into sd_log_error so that it
    reflects the first error encountered writing to the journal.
    In future patches, we will take advantage of this by checking
    this value rather than having to check both when reacting to
    io errors.

    In addition, this fixes a tight loop in unmount: If buffers
    get on the ail1 list and an io error occurs elsewhere, the
    ail1 list would never be cleared because they were always busy.
    So unmount would hang, waiting for the ail1 list to empty.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     

15 Nov, 2019

1 commit

  • Add function gfs2_withdrawn and replace all checks for the SDF_WITHDRAWN
    bit to call it. This does not change the logic or function of gfs2, and
    it facilitates later improvements to the withdraw sequence.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     

30 Oct, 2019

1 commit


05 Sep, 2019

1 commit


28 Jun, 2019

1 commit


05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this copyrighted material is made available to anyone wishing to use
    modify copy or redistribute it subject to the terms and conditions
    of the gnu general public license version 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 44 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

12 Oct, 2018

1 commit


13 Jun, 2018

1 commit

  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a * b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

04 Jun, 2018

1 commit

  • In journaled data mode, we need to add each buffer head to the current
    transaction. In ordered write mode, we only need to add the inode to
    the ordered inode list. So far, both cases are handled in
    gfs2_trans_add_data. This makes the code look misleading and is
    inefficient for small block sizes as well. Handle both cases separately
    instead.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

23 Jan, 2018

2 commits

  • This patch just adds the capability for GFS2 to track which function
    called gfs2_log_flush. This should make it easier to diagnose
    problems based on the sequence of events found in the journals.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     
  • This patch adds a new structure called gfs2_log_header_v2 which is used
    to store expanded fields into previously unused areas of the log headers
    (i.e., this change is backwards compatible). Some of these are used for
    debug purposes so we can backtrack when problems occur. Others are
    reserved for future expansion.

    This patch is based on a prototype from Steve Whitehouse.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     

15 Sep, 2017

1 commit

  • Pull mount flag updates from Al Viro:
    "Another chunk of fmount preparations from dhowells; only trivial
    conflicts for that part. It separates MS_... bits (very grotty
    mount(2) ABI) from the struct super_block ->s_flags (kernel-internal,
    only a small subset of MS_... stuff).

    This does *not* convert the filesystems to new constants; only the
    infrastructure is done here. The next step in that series is where the
    conflicts would be; that's the conversion of filesystems. It's purely
    mechanical and it's better done after the merge, so if you could run
    something like

    list=$(for i in MS_RDONLY MS_NOSUID MS_NODEV MS_NOEXEC MS_SYNCHRONOUS MS_MANDLOCK MS_DIRSYNC MS_NOATIME MS_NODIRATIME MS_SILENT MS_POSIXACL MS_KERNMOUNT MS_I_VERSION MS_LAZYTIME; do git grep -l $i fs drivers/staging/lustre drivers/mtd ipc mm include/linux; done|sort|uniq|grep -v '^fs/namespace.c$')

    sed -i -e 's/\/SB_RDONLY/g' \
    -e 's/\/SB_NOSUID/g' \
    -e 's/\/SB_NODEV/g' \
    -e 's/\/SB_NOEXEC/g' \
    -e 's/\/SB_SYNCHRONOUS/g' \
    -e 's/\/SB_MANDLOCK/g' \
    -e 's/\/SB_DIRSYNC/g' \
    -e 's/\/SB_NOATIME/g' \
    -e 's/\/SB_NODIRATIME/g' \
    -e 's/\/SB_SILENT/g' \
    -e 's/\/SB_POSIXACL/g' \
    -e 's/\/SB_KERNMOUNT/g' \
    -e 's/\/SB_I_VERSION/g' \
    -e 's/\/SB_LAZYTIME/g' \
    $list

    and commit it with something along the lines of 'convert filesystems
    away from use of MS_... constants' as commit message, it would save a
    quite a bit of headache next cycle"

    * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    VFS: Differentiate mount flags (MS_*) from internal superblock flags
    VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)
    vfs: Add sb_rdonly(sb) to query the MS_RDONLY flag on s_flags

    Linus Torvalds
     

25 Aug, 2017

1 commit

  • Before this patch, if GFS2 encountered IO errors while writing to
    the journal, it would not report the problem, so they would go
    unnoticed, sometimes for many hours. Sometimes this would only be
    noticed later, when recovery tried to do journal replay and failed
    due to invalid metadata at the blocks that resulted in IO errors.

    This patch makes GFS2's log daemon check for IO errors. If it
    encounters one, it withdraws from the file system and reports
    why in dmesg. A similar action is taken when IO errors occur when
    writing to the system statfs file.

    These errors are also reported back to any callers of fsync, since
    that requires the journal to be flushed. Therefore, any IO errors
    that would previously go unnoticed are now noticed and the file
    system is withdrawn as early as possible, thus preventing further
    file system damage.

    Also note that this reintroduces superblock variable sd_log_error,
    which Christoph removed with commit f729b66fca.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

21 Jul, 2017

1 commit

  • When gfs2 does metadata I/O, only REQ_META is used as a metadata hint of
    the bio. But flag REQ_META is just a hint for block trace, not for block
    layer code to handle a bio as metadata request.

    For some of metadata I/Os of gfs2, A REQ_PRIO flag on the metadata bio
    would be very informative to block layer code. For example, if bcache is
    used as a I/O cache for gfs2, it will be possible for bcache code to get
    the hint and cache the pre-fetched metadata blocks on cache device. This
    behavior may be helpful to improve metadata I/O performance if the
    following requests hit the cache.

    Here are the locations in gfs2 code where a REQ_PRIO flag should be added,
    - All places where REQ_READAHEAD is used, gfs2 code uses this flag for
    metadata read ahead.
    - In gfs2_meta_rq() where the first metadata block is read in.
    - In gfs2_write_buf_to_page(), read in quota metadata blocks to have them
    up to date.
    These metadata blocks are probably to be accessed again in future, adding
    a REQ_PRIO flag may have bcache to keep such metadata in fast cache
    device. For system without a cache layer, REQ_PRIO can still provide hint
    to block layer to handle metadata requests more properly.

    Signed-off-by: Coly Li
    Signed-off-by: Bob Peterson

    Coly Li
     

17 Jul, 2017

1 commit

  • Firstly by applying the following with coccinelle's spatch:

    @@ expression SB; @@
    -SB->s_flags & MS_RDONLY
    +sb_rdonly(SB)

    to effect the conversion to sb_rdonly(sb), then by applying:

    @@ expression A, SB; @@
    (
    -(!sb_rdonly(SB)) && A
    +!sb_rdonly(SB) && A
    |
    -A != (sb_rdonly(SB))
    +A != sb_rdonly(SB)
    |
    -A == (sb_rdonly(SB))
    +A == sb_rdonly(SB)
    |
    -!(sb_rdonly(SB))
    +!sb_rdonly(SB)
    |
    -A && (sb_rdonly(SB))
    +A && sb_rdonly(SB)
    |
    -A || (sb_rdonly(SB))
    +A || sb_rdonly(SB)
    |
    -(sb_rdonly(SB)) != A
    +sb_rdonly(SB) != A
    |
    -(sb_rdonly(SB)) == A
    +sb_rdonly(SB) == A
    |
    -(sb_rdonly(SB)) && A
    +sb_rdonly(SB) && A
    |
    -(sb_rdonly(SB)) || A
    +sb_rdonly(SB) || A
    )

    @@ expression A, B, SB; @@
    (
    -(sb_rdonly(SB)) ? 1 : 0
    +sb_rdonly(SB)
    |
    -(sb_rdonly(SB)) ? A : B
    +sb_rdonly(SB) ? A : B
    )

    to remove left over excess bracketage and finally by applying:

    @@ expression A, SB; @@
    (
    -(A & MS_RDONLY) != sb_rdonly(SB)
    +(bool)(A & MS_RDONLY) != sb_rdonly(SB)
    |
    -(A & MS_RDONLY) == sb_rdonly(SB)
    +(bool)(A & MS_RDONLY) == sb_rdonly(SB)
    )

    to make comparisons against the result of sb_rdonly() (which is a bool)
    work correctly.

    Signed-off-by: David Howells

    David Howells
     

11 Oct, 2016

1 commit

  • Pull more vfs updates from Al Viro:
    ">rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     

28 Sep, 2016

1 commit

  • CURRENT_TIME macro is not appropriate for filesystems as it
    doesn't use the right granularity for filesystem timestamps.
    Use current_time() instead.

    CURRENT_TIME is also not y2038 safe.

    This is also in preparation for the patch that transitions
    vfs timestamps to use 64 bit time and hence make them
    y2038 safe. As part of the effort current_time() will be
    extended to do range checks. Hence, it is necessary for all
    file system timestamps to use current_time(). Also,
    current_time() will be transitioned along with vfs to be
    y2038 safe.

    Note that whenever a single call to current_time() is used
    to change timestamps in different inodes, it is because they
    share the same time granularity.

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Acked-by: Felipe Balbi
    Acked-by: Steven Whitehouse
    Acked-by: Ryusuke Konishi
    Acked-by: David Sterba
    Signed-off-by: Al Viro

    Deepa Dinamani
     

03 Aug, 2016

1 commit

  • Replace 1 << value shift by more explicit BIT() macro

    Also fixes two bare unsigned definitions:

    WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
    + unsigned hsize = BIT(ip->i_depth);

    Signed-off-by: Fabian Frederick
    Signed-off-by: Bob Peterson

    Fabian Frederick
     

27 Jul, 2016

1 commit

  • Pull core block updates from Jens Axboe:

    - the big change is the cleanup from Mike Christie, cleaning up our
    uses of command types and modified flags. This is what will throw
    some merge conflicts

    - regression fix for the above for btrfs, from Vincent

    - following up to the above, better packing of struct request from
    Christoph

    - a 2038 fix for blktrace from Arnd

    - a few trivial/spelling fixes from Bart Van Assche

    - a front merge check fix from Damien, which could cause issues on
    SMR drives

    - Atari partition fix from Gabriel

    - convert cfq to highres timers, since jiffies isn't granular enough
    for some devices these days. From Jan and Jeff

    - CFQ priority boost fix idle classes, from me

    - cleanup series from Ming, improving our bio/bvec iteration

    - a direct issue fix for blk-mq from Omar

    - fix for plug merging not involving the IO scheduler, like we do for
    other types of merges. From Tahsin

    - expose DAX type internally and through sysfs. From Toshi and Yigal

    * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
    block: Fix front merge check
    block: do not merge requests without consulting with io scheduler
    block: Fix spelling in a source code comment
    block: expose QUEUE_FLAG_DAX in sysfs
    block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
    Btrfs: fix comparison in __btrfs_map_block()
    block: atari: Return early for unsupported sector size
    Doc: block: Fix a typo in queue-sysfs.txt
    cfq-iosched: Charge at least 1 jiffie instead of 1 ns
    cfq-iosched: Fix regression in bonnie++ rewrite performance
    cfq-iosched: Convert slice_resid from u64 to s64
    block: Convert fifo_time from ulong to u64
    blktrace: avoid using timespec
    block/blk-cgroup.c: Declare local symbols static
    block/bio-integrity.c: Add #include "blk.h"
    block/partition-generic.c: Remove a set-but-not-used variable
    block: bio: kill BIO_MAX_SIZE
    cfq-iosched: temporarily boost queue priority for idle classes
    block: drbd: avoid to use BIO_MAX_SIZE
    block: bio: remove BIO_MAX_SECTORS
    ...

    Linus Torvalds
     

27 Jun, 2016

1 commit

  • Make the code more readable by cleaning up the different ways of
    initializing lock holders and checking for initialized lock holders:
    mark lock holders as uninitialized by setting the holder's glock to NULL
    (gfs2_holder_mark_uninitialized) instead of zeroing out the entire
    object or using a separate flag. Recognize initialized holders by their
    non-NULL glock (gfs2_holder_initialized). Don't zero out holder objects
    which are immeditiately initialized via gfs2_holder_init or
    gfs2_glock_nq_init.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

08 Jun, 2016

1 commit


05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And unlikely will.

    We have many places where PAGE_CACHE_SIZE assumed to be equal to
    PAGE_SIZE. And it's constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;

    - >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is revert of changes to
    PAGE_CAHCE_ALIGN definition: we are going to drop it later.

    There are few places in the code where coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation also
    will be addressed with the separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

15 Dec, 2015

2 commits

  • This patch makes no functional changes. Its goal is to reduce the
    size of the gfs2 inode in memory by rearranging structures and
    changing the size of some variables within the structure.

    Signed-off-by: Bob Peterson

    Bob Peterson
     
  • Before this patch, multi-block reservation structures were allocated
    from a special slab. This patch folds the structure into the gfs2_inode
    structure. The disadvantage is that the gfs2_inode needs more memory,
    even when a file is opened read-only. The advantages are: (a) we don't
    need the special slab and the extra time it takes to allocate and
    deallocate from it. (b) we no longer need to worry that the structure
    exists for things like quota management. (c) This also allows us to
    remove the calls to get_write_access and put_write_access since we
    know the structure will exist.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

24 Nov, 2015

1 commit

  • This patch basically reverts the majority of patch 5407e24.
    That patch eliminated the gfs2_qadata structure in favor of just
    using the reservations structure. The problem with doing that is that
    it increases the size of the reservations structure. That is not an
    issue until it comes time to fold the reservations structure into the
    inode in memory so we know it's always there. By separating out the
    quota structure again, we aren't punishing the non-quota users by
    making all the inodes bigger, requiring more slab space. This patch
    creates a new slab area to allocate the quota stuff so it's managed
    a little more sanely.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

17 Nov, 2015

1 commit

  • When gfs2 allocates an inode and its extended attribute block next to
    each other at inode create time, the inode's directory entry indicates
    that in de_rahead. In that case, we can readahead the extended
    attribute block when we read in the inode.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

04 Sep, 2015

1 commit

  • What uniquely identifies a glock in the glock hash table is not
    gl_name, but gl_name and its superblock pointer. This patch makes
    the gl_name field correspond to a unique glock identifier. That will
    allow us to simplify hashing with a future patch, since the hash
    algorithm can then take the gl_name and hash its components in one
    operation.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher
    Acked-by: Steven Whitehouse

    Bob Peterson
     

28 Jun, 2015

1 commit

  • Pull GFS2 updates from Bob Peterson:
    "Here are the patches we've accumulated for GFS2 for the current
    upstream merge window. We have a good mixture this time. Here are
    some of the features:

    - Fix a problem with RO mounts writing to the journal.

    - Further improvements to quotas on GFS2.

    - Added support for rename2 and RENAME_EXCHANGE on GFS2.

    - Increase performance by making glock lru_list less of a bottleneck.

    - Increase performance by avoiding unnecessary buffer_head releases.

    - Increase performance by using average glock round trip time from all CPUs.

    - Fixes for some compiler warnings and minor white space issues.

    - Other misc bug fixes"

    * tag 'gfs2-merge-window' of git://git.kernel.org:/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
    GFS2: Don't brelse rgrp buffer_heads every allocation
    GFS2: Don't add all glocks to the lru
    gfs2: Don't support fallocate on jdata files
    gfs2: s64 cast for negative quota value
    gfs2: limit quota log messages
    gfs2: fix quota updates on block boundaries
    gfs2: fix shadow warning in gfs2_rbm_find()
    gfs2: kerneldoc warning fixes
    gfs2: convert simple_str to kstr
    GFS2: make sure S_NOSEC flag isn't overwritten
    GFS2: add support for rename2 and RENAME_EXCHANGE
    gfs2: handle NULL rgd in set_rgrp_preferences
    GFS2: inode.c: indent with TABs, not spaces
    GFS2: mark the journal idle to fix ro mounts
    GFS2: Average in only non-zero round-trip times for congestion stats
    GFS2: Use average srttb value in congestion calculations

    Linus Torvalds