04 Jun, 2020

1 commit

  • Pull networking updates from David Miller:

    1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

    2) Add GSO partial support to igc, from Sasha Neftin.

    3) Several cleanups and improvements to r8169 from Heiner Kallweit.

    4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

    5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

    6) Support GRO via gro_cells in DSA layer, from Alexander Lobakin.

    7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

    8) Add sriov and vf support to hinic, from Luo bin.

    9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

    10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

    11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

    12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

    13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

    14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

    15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

    16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

    17) Fall back to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

    18) Several RISCV bpf jit optimizations, from Luke Nelson.

    19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

    20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

    21) Add BPF iterators, from Yonghong Song.

    22) Add cable test infrastructure, including ethtool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

    23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

    24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

    25) Add CAP_BPF, from Alexei Starovoitov.

    26) Support terse dumps in the packet scheduler, from Vlad Buslov.

    27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

    28) Add devm_register_netdev(), from Bartosz Golaszewski.

    29) Minimize qdisc resets, from Cong Wang.

    30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
    selftests: net: ip_defrag: ignore EPERM
    net_failover: fixed rollback in net_failover_open()
    Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
    Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
    vmxnet3: allow rx flow hash ops only when rss is enabled
    hinic: add set_channels ethtool_ops support
    selftests/bpf: Add a default $(CXX) value
    tools/bpf: Don't use $(COMPILE.c)
    bpf, selftests: Use bpf_probe_read_kernel
    s390/bpf: Use bcr 0,%0 as tail call nop filler
    s390/bpf: Maintain 8-byte stack alignment
    selftests/bpf: Fix verifier test
    selftests/bpf: Fix sample_cnt shared between two threads
    bpf, selftests: Adapt cls_redirect to call csum_level helper
    bpf: Add csum_level helper for fixing up csum levels
    bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
    sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
    crypto/chtls: IPv6 support for inline TLS
    Crypto/chcr: Fixes a coccinile check error
    Crypto/chcr: Fixes compilations warnings
    ...

    Linus Torvalds
     

05 May, 2020

5 commits

  • - Add a SPDX header;
    - Adjust document and section titles;
    - Some whitespace fixes and new line breaks;
    - Mark literal blocks as such;
    - Add table markups;
    - Add it to filesystems/caching/index.rst.

    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/5d0a61abaa87bfe913b9e2f321e74ef7af0f3dfc.1588021877.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     
  • - Add a SPDX header;
    - Adjust document and section titles;
    - Comment out text ToC for html/pdf output;
    - Mark literal blocks as such;
    - Add it to filesystems/caching/index.rst.

    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/97e71cc598a4f61df484ebda3ec06b63530ceb62.1588021877.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     
  • - Add a SPDX header;
    - Adjust document and section titles;
    - Some whitespace fixes and new line breaks;
    - Mark literal blocks as such;
    - Add it to filesystems/caching/index.rst.

    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/cfe4cb1bf8e1f0093d44c30801ec42e74721e543.1588021877.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     
  • - Add a SPDX header;
    - Adjust document and section titles;
    - Comment out text ToC for html/pdf output;
    - Some whitespace fixes and new line breaks;
    - Add table markups;
    - Add it to filesystems/index.rst.

    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/e33ec382a53cf10ffcbd802f6de3f384159cddba.1588021877.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     
  • - Add a SPDX header;
    - Adjust document and section titles;
    - Comment out text ToC for html/pdf output;
    - Some whitespace fixes and new line breaks;
    - Adjust the events list to make them look better for html output;
    - Add it to filesystems/index.rst.

    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/49026a8ea7e714c2e0f003aa26b975b1025476b7.1588021877.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

27 Apr, 2020

1 commit

  • Instead of having all the sysctl handlers deal with user pointers, which
    is rather hairy in terms of the BPF interaction, copy the input to and
    from userspace in common code. This also means that the strings are
    always NUL-terminated by the common code, making the API a little bit
    safer.

    As most handlers just pass through the data to one of the common handlers,
    a lot of the changes are mechanical.
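
    The idea can be sketched in userspace C. This is a minimal illustration
    only: common_sysctl(), handler_fn and string_handler() are hypothetical
    names, and memcpy() stands in for the real user-copy helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler type: it only ever sees a kernel-side buffer. */
typedef int (*handler_fn)(int write, char *buf, size_t *lenp);

/*
 * Hypothetical common entry point. memcpy() stands in for the copy to
 * and from userspace that the common code now performs.
 */
static int common_sysctl(handler_fn handler, int write,
			 char *ubuf, size_t *lenp)
{
	size_t len = *lenp;
	char *kbuf = malloc(len + 1);
	int ret;

	if (!kbuf)
		return -1;
	if (write)
		memcpy(kbuf, ubuf, len);
	else
		memset(kbuf, 0, len);
	kbuf[len] = '\0';	/* the handler always gets a NUL-terminated string */

	ret = handler(write, kbuf, &len);
	if (ret == 0 && !write)
		memcpy(ubuf, kbuf, len);
	*lenp = len;
	free(kbuf);
	return ret;
}

/* Example handler: stores or returns a short string. */
static char stored[32];

static int string_handler(int write, char *buf, size_t *lenp)
{
	if (write) {
		strncpy(stored, buf, sizeof(stored) - 1);
		stored[sizeof(stored) - 1] = '\0';
	} else {
		size_t n = strlen(stored);

		if (n > *lenp)
			n = *lenp;
		memcpy(buf, stored, n);
		*lenp = n;
	}
	return 0;
}
```

    The handler never touches the caller's buffer directly, so it can rely
    on the terminator even when the input was not NUL-terminated.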

    Signed-off-by: Christoph Hellwig
    Acked-by: Andrey Ignatov
    Signed-off-by: Al Viro

    Christoph Hellwig
     

04 Feb, 2020

1 commit

  • The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
    seq_file.h.

    Conversion rule is:

    llseek => proc_lseek
    unlocked_ioctl => proc_ioctl

    xxx => proc_xxx

    delete ".owner = THIS_MODULE" line

    [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
    [sfr@canb.auug.org.au: fix kernel/sched/psi.c]
    Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
    Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

11 Jul, 2019

1 commit

  • …el/git/dhowells/linux-fs"

    This reverts merge 0f75ef6a9cff49ff612f7ce0578bced9d0b38325 (and thus
    effectively commits

    7a1ade847596 ("keys: Provide KEYCTL_GRANT_PERMISSION")
    2e12256b9a76 ("keys: Replace uid/gid/perm permissions checking with an ACL")

    that the merge brought in).

    It turns out that it breaks booting with an encrypted volume, and Eric
    Biggers reports that it also breaks the fscrypt tests [1] and loading of
    in-kernel X.509 certificates [2].

    The root cause of all the breakage is likely the same, but David Howells
    is off email so rather than try to work it out it's getting reverted in
    order to not impact the rest of the merge window.

    [1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/
    [2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/

    Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/
    Reported-by: Eric Biggers <ebiggers@kernel.org>
    Cc: David Howells <dhowells@redhat.com>
    Cc: James Morris <jmorris@namei.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Linus Torvalds
     

09 Jul, 2019

1 commit

  • Pull keyring ACL support from David Howells:
    "This changes the permissions model used by keys and keyrings to be
    based on an internal ACL by the following means:

    - Replace the permissions mask internally with an ACL that contains a
    list of ACEs, each with a specific subject with a permissions mask.
    Potted default ACLs are available for new keys and keyrings.

    ACE subjects can be macroised to indicate the UID and GID specified
    on the key (which remain). Future commits will be able to add
    additional subject types, such as specific UIDs or domain
    tags/namespaces.

    Also split a number of permissions to give finer control. Examples
    include splitting the revocation permit from the change-attributes
    permit, thereby allowing someone to be granted permission to revoke
    a key without allowing them to change the owner; also the ability
    to join a keyring is split from the ability to link to it, thereby
    stopping a process accessing a keyring by joining it and thus
    acquiring use of possessor permits.

    - Provide a keyctl to allow the granting or denial of one or more
    permits to a specific subject. Direct access to the ACL is not
    granted, and the ACL cannot be viewed"

    * tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Provide KEYCTL_GRANT_PERMISSION
    keys: Replace uid/gid/perm permissions checking with an ACL

    Linus Torvalds
     

28 Jun, 2019

1 commit

  • Replace the uid/gid/perm permissions checking on a key with an ACL to allow
    the SETATTR and SEARCH permissions to be split. This will also allow a
    greater range of subjects to be represented.

    ============
    WHY DO THIS?
    ============

    The problem is that SETATTR and SEARCH cover a slew of actions, not all of
    which should be grouped together.

    For SETATTR, this includes actions that are about controlling access to a
    key:

    (1) Changing a key's ownership.

    (2) Changing a key's security information.

    (3) Setting a keyring's restriction.

    And actions that are about managing a key's lifetime:

    (4) Setting an expiry time.

    (5) Revoking a key.

    and (proposed) managing a key as part of a cache:

    (6) Invalidating a key.

    Managing a key's lifetime doesn't really have anything to do with
    controlling access to that key.

    Expiry time is awkward since it's more about the lifetime of the content
    and so, in some ways goes better with WRITE permission. It can, however,
    be set unconditionally by a process with an appropriate authorisation token
    for instantiating a key, and can also be set by the key type driver when a
    key is instantiated, so lumping it with the access-controlling actions is
    probably okay.

    As for SEARCH permission, that currently covers:

    (1) Finding keys in a keyring tree during a search.

    (2) Permitting keyrings to be joined.

    (3) Invalidation.

    But these don't really belong together either, since these actions really
    need to be controlled separately.

    Finally, there are a number of special cases to do with granting the
    administrator special rights to invalidate or clear keys that I would like
    to handle with the ACL rather than key flags and special checks.

    ===============
    WHAT IS CHANGED
    ===============

    The SETATTR permission is split to create two new permissions:

    (1) SET_SECURITY - which allows the key's owner, group and ACL to be
    changed and a restriction to be placed on a keyring.

    (2) REVOKE - which allows a key to be revoked.

    The SEARCH permission is split to create:

    (1) SEARCH - which allows a keyring to be searched and a key to be found.

    (2) JOIN - which allows a keyring to be joined as a session keyring.

    (3) INVAL - which allows a key to be invalidated.

    The WRITE permission is also split to create:

    (1) WRITE - which allows a key's content to be altered and links to be
    added, removed and replaced in a keyring.

    (2) CLEAR - which allows a keyring to be cleared completely. This is
    split out to make it possible to give just this to an administrator.

    (3) REVOKE - see above.

    Keys acquire ACLs which consist of a series of ACEs, and all that apply are
    unioned together. An ACE specifies a subject, such as:

    (*) Possessor - permitted to anyone who 'possesses' a key
    (*) Owner - permitted to the key owner
    (*) Group - permitted to the key group
    (*) Everyone - permitted to everyone

    Note that 'Other' has been replaced with 'Everyone' on the assumption that
    you wouldn't grant a permit to 'Other' that you wouldn't also grant to
    everyone else.

    Further subjects may be made available by later patches.

    The ACE also specifies a permissions mask. The set of permissions is now:

    VIEW Can view the key metadata
    READ Can read the key content
    WRITE Can update/modify the key content
    SEARCH Can find the key by searching/requesting
    LINK Can make a link to the key
    SET_SECURITY Can change owner, ACL, expiry
    INVAL Can invalidate
    REVOKE Can revoke
    JOIN Can join this keyring
    CLEAR Can clear this keyring

    The KEYCTL_SETPERM function is then deprecated.

    The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set,
    or if the caller has a valid instantiation auth token.

    The KEYCTL_INVALIDATE function then requires INVAL.

    The KEYCTL_REVOKE function then requires REVOKE.

    The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an
    existing keyring.

    The JOIN permission is enabled by default for session keyrings and manually
    created keyrings only.

    ======================
    BACKWARD COMPATIBILITY
    ======================

    To maintain backward compatibility, KEYCTL_SETPERM will translate the
    permissions mask it is given into a new ACL for a key - unless
    KEYCTL_SET_ACL has been called on that key, in which case an error will be
    returned.

    It will convert possessor, owner, group and other permissions into separate
    ACEs, if each portion of the mask is non-zero.

    SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY. WRITE
    permission turns on WRITE, REVOKE and, if a keyring, CLEAR. JOIN is turned
    on if a keyring is being altered.

    The KEYCTL_DESCRIBE function translates the ACL back into a permissions
    mask to return depending on possessor, owner, group and everyone ACEs.

    It will make the following mappings:

    (1) INVAL, JOIN -> SEARCH

    (2) SET_SECURITY -> SETATTR

    (3) REVOKE -> WRITE if SETATTR isn't already set

    (4) CLEAR -> WRITE

    Note that the value subsequently returned by KEYCTL_DESCRIBE may not match
    the value set with KEYCTL_SETPERM.
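
    The two translations can be sketched as follows. The bit values and the
    function names old_to_ace() and ace_to_old() are illustrative
    assumptions. So is the one-to-one pass-through of VIEW, READ, SEARCH
    and LINK, and the reading of "a keyring is being altered" as "the key
    is a keyring and the mask portion is non-zero".

```c
#include <stdbool.h>

/* Legacy per-subject permission bits (illustrative values). */
enum {
	OLD_VIEW = 0x01, OLD_READ = 0x02, OLD_WRITE = 0x04,
	OLD_SEARCH = 0x08, OLD_LINK = 0x10, OLD_SETATTR = 0x20,
};

/* New ACE permission bits (illustrative values). */
enum {
	ACE_VIEW = 0x001, ACE_READ = 0x002, ACE_WRITE = 0x004,
	ACE_SEARCH = 0x008, ACE_LINK = 0x010, ACE_SET_SECURITY = 0x020,
	ACE_INVAL = 0x040, ACE_REVOKE = 0x080, ACE_JOIN = 0x100,
	ACE_CLEAR = 0x200,
};

/* Old mask portion -> ACE permissions. */
static unsigned int old_to_ace(unsigned int old, bool is_keyring)
{
	unsigned int ace = 0;

	if (old & OLD_VIEW)
		ace |= ACE_VIEW;
	if (old & OLD_READ)
		ace |= ACE_READ;
	if (old & OLD_SEARCH)
		ace |= ACE_SEARCH;
	if (old & OLD_LINK)
		ace |= ACE_LINK;
	if (old & OLD_WRITE) {
		ace |= ACE_WRITE | ACE_REVOKE;
		if (is_keyring)
			ace |= ACE_CLEAR;
	}
	if (old & OLD_SETATTR)
		ace |= ACE_INVAL | ACE_REVOKE | ACE_SET_SECURITY;
	if (is_keyring && old)	/* assumption, see the text above */
		ace |= ACE_JOIN;
	return ace;
}

/* ACE permissions -> old mask portion, using mappings (1) to (4). */
static unsigned int ace_to_old(unsigned int ace)
{
	unsigned int old = 0;

	if (ace & ACE_VIEW)
		old |= OLD_VIEW;
	if (ace & ACE_READ)
		old |= OLD_READ;
	if (ace & ACE_WRITE)
		old |= OLD_WRITE;
	if (ace & ACE_LINK)
		old |= OLD_LINK;
	if (ace & (ACE_SEARCH | ACE_INVAL | ACE_JOIN))
		old |= OLD_SEARCH;
	if (ace & ACE_SET_SECURITY)
		old |= OLD_SETATTR;
	if ((ace & ACE_REVOKE) && !(old & OLD_SETATTR))
		old |= OLD_WRITE;
	if (ace & ACE_CLEAR)
		old |= OLD_WRITE;
	return old;
}
```

    In this sketch a mask of just SETATTR comes back as SEARCH plus
    SETATTR, because INVAL maps to SEARCH on the way back. That is one way
    the round trip can fail to return the original value.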

    =======
    TESTING
    =======

    This passes the keyutils testsuite for all but a couple of tests:

    (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now
    returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed
    if the type doesn't have ->read(). You still can't actually read the
    key.

    (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't
    work as Other has been replaced with Everyone in the ACL.

    Signed-off-by: David Howells

    David Howells
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

24 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public licence as published by
    the free software foundation either version 2 of the licence or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 114 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit


30 Nov, 2018

1 commit

  • It was observed that a process blocked indefinitely in
    __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP
    to be cleared via fscache_wait_for_deferred_lookup().

    At this time, ->backing_objects was empty, which would normally prevent
    __fscache_read_or_alloc_page() from getting to the point of waiting.
    This implies that ->backing_objects was cleared *after*
    __fscache_read_or_alloc_page() was entered.

    When an object is "killed" and then "dropped",
    FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then
    KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is
    ->backing_objects cleared. This leaves a window where
    something else can set FSCACHE_COOKIE_LOOKING_UP and
    __fscache_read_or_alloc_page() can start waiting, before
    ->backing_objects is cleared.

    There is some uncertainty in this analysis, but it seems to fit the
    observations. Adding the wake in this patch will be handled correctly
    by __fscache_read_or_alloc_page(), as it checks if ->backing_objects
    is empty again, after waiting.

    The customer who reported the hang also reports that the hang cannot be
    reproduced with this fix.

    The backtrace for the blocked process looked like:

    PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh"
    #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
    #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
    #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
    #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
    #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
    #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
    #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
    #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
    #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
    #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
    #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
    #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
    #12 [ffff881ff43eff18] sys_read at ffffffff811fda62
    #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e

    Signed-off-by: NeilBrown
    Signed-off-by: David Howells

    NeilBrown
     

18 Oct, 2018

2 commits

  • fscache_set_key() can incur an out-of-bounds read, reported by KASAN:

    BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
    Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615

    and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236

    BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
    BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
    Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466

    This happens for any index_key_len which is not divisible by 4 and is
    larger than the size of the inline key, because the code allocates exactly
    index_key_len for the key buffer, but the hashing loop is stepping through
    it 4 bytes (u32) at a time in the buf[] array.

    Fix this by calculating how many u32 buffers we'll need by using
    DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
    buffer to hold the index_key, then using that same count as the hashing
    index limit.
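
    The shape of the fix can be sketched in userspace C. This is an
    illustration under assumptions: hash_index_key() is a hypothetical
    name, calloc() stands in for kcalloc(), and the mixing step is a
    placeholder rather than the hash fscache actually uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same rounding idiom as the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Copy the key into a zeroed buffer made of whole u32 words, then walk
 * that buffer one word at a time. The word count bounds both the
 * allocation and the loop, so the last partial word reads zero padding
 * instead of memory past the end of the allocation.
 */
static uint32_t hash_index_key(const void *index_key, size_t index_key_len)
{
	size_t bufs = DIV_ROUND_UP(index_key_len, sizeof(uint32_t));
	uint32_t *buf = calloc(bufs ? bufs : 1, sizeof(*buf));
	uint32_t h = 0;
	size_t i;

	if (!buf)
		return 0;
	memcpy(buf, index_key, index_key_len);
	for (i = 0; i < bufs; i++)
		h = h * 31 + buf[i];	/* placeholder mixing step */
	free(buf);
	return h;
}
```

    With a 13-byte key the buffer holds four words, and the three padding
    bytes in the last word are always zero, so bytes that follow the key
    in the caller's memory cannot influence the result.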

    Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
    Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
    Signed-off-by: Eric Sandeen
    Cc: stable
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    Eric Sandeen
     
  • The inline key in struct fscache_cookie is insufficiently initialized,
    zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
    bytes will end up hashing uninitialized memory because the memcpy only
    partially fills the last buf[] element.

    Fix this by clearing fscache_cookie objects on allocation rather than using
    the slab constructor to initialise them. We're going to pretty much fill
    in the entire struct anyway, so bringing it into our dcache writably
    shouldn't incur much overhead.

    This removes the need to do clearance in fscache_set_key() (where we aren't
    doing it correctly anyway).

    Also, we don't need to set cookie->key_len in fscache_set_key() as we
    already did it in the only caller, so remove that.

    Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
    Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
    Reported-by: Eric Sandeen
    Cc: stable
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     

25 Jul, 2018

2 commits

  • When a cookie is allocated that causes fscache_object structs to be
    allocated, those objects are initialised with the cookie pointer, but
    aren't blessed with a ref on that cookie unless the attachment is
    successfully completed in fscache_attach_object().

    If attachment fails because the parent object was dying or there was a
    collision, fscache_attach_object() returns without incrementing the cookie
    counter - but upon failure of this function, the object is released which
    then puts the cookie, whether or not a ref was taken on the cookie.

    Fix this by taking a ref on the cookie when it is assigned in
    fscache_object_init(), even when we're creating a root object.
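
    The ref-counting rule can be sketched with simplified stand-ins. The
    struct and function names here are hypothetical and the counter is a
    plain int rather than an atomic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the cookie and the object. */
struct cookie {
	int usage;
};

struct object {
	struct cookie *cookie;
};

static void cookie_get(struct cookie *c)
{
	c->usage++;
}

static void cookie_put(struct cookie *c)
{
	c->usage--;
}

/* Take the ref as soon as the pointer is stored in the object ... */
static void object_init(struct object *obj, struct cookie *c)
{
	obj->cookie = c;
	cookie_get(c);
}

/* ... so that destruction can always drop exactly one ref. */
static void object_destroy(struct object *obj)
{
	cookie_put(obj->cookie);
	obj->cookie = NULL;
}

/* Attachment may still fail, but it no longer decides who holds a ref. */
static bool object_attach(struct object *obj, bool parent_dying)
{
	(void)obj;
	return !parent_dying;
}
```

    A failed attach followed by object_destroy() leaves the cookie's count
    where it started, which is the balance the fix is after.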

    Analysis from Kiran Kumar:

    This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel

    BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277

    fscache cookie ref count updated incorrectly during fscache object
    allocation resulting in following Oops.

    kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
    kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

    [Cause]
    Two threads are trying to operate on a cookie and two objects.

    (1) One thread tries to unmount the filesystem and in process goes over a
    huge list of objects marking them dead and deleting the objects.
    cookie->usage is also decremented in following path:

    nfs_fscache_release_super_cookie
    -> __fscache_relinquish_cookie
    ->__fscache_cookie_put
    ->BUG_ON(atomic_read(&cookie->usage) …

    … fscache_object_init
    -> assign cookie, but usage not bumped.
    2) fscache_attach_object -> fails in cant_attach_object because the
    cookie's backing object or cookie's->parent object are going away
    3) fscache_put_object
    -> cachefiles_put_object
    ->fscache_object_destroy
    ->fscache_cookie_put
    ->BUG_ON(atomic_read(&cookie->usage) …

    Signed-off-by: David Howells

    Kiran Kumar Modukuri
     
  • Alter the state-check assertion in fscache_enqueue_operation() to allow
    cancelled operations to be given processing time so they can be cleaned up.

    Also fix a debugging statement that was requiring such operations to have
    an object assigned.

    Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
    Reported-by: Kiran Kumar Modukuri
    Signed-off-by: David Howells

    Kiran Kumar Modukuri
     

16 May, 2018

2 commits


12 Apr, 2018

1 commit

  • Don't open-code accesses to data structure internals.

    Link: http://lkml.kernel.org/r/20180313132639.17387-7-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Reviewed-by: Jeff Layton
    Cc: Darrick J. Wong
    Cc: Dave Chinner
    Cc: Ryusuke Konishi
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

06 Apr, 2018

2 commits

  • Maintain a catalogue of allocated cookies so that cookie collisions can be
    handled properly. For the moment, this just involves printing a warning
    and returning a NULL cookie to the caller of fscache_acquire_cookie(), but
    in future it might make sense to wait for the old cookie to finish being
    cleaned up.

    This requires the cookie key to be stored attached to the cookie so that we
    still have the key available if the netfs relinquishes the cookie. This is
    done by an earlier patch.

    The catalogue also renders redundant fscache_netfs_list (used for checking
    for duplicates), so that can be removed.

    Signed-off-by: David Howells
    Acked-by: Anna Schumaker
    Tested-by: Steve Dickson

    David Howells
     
  • Pass the object size in to fscache_acquire_cookie() and
    fscache_write_page() rather than the netfs providing a callback by which it
    can be received. This makes it easier to update the size of the object
    when a new page is written that extends the object.

    The current object size is also passed by fscache to the check_aux
    function, obviating the need to store it in the aux data.

    Signed-off-by: David Howells
    Acked-by: Anna Schumaker
    Tested-by: Steve Dickson

    David Howells
     

04 Apr, 2018

7 commits

  • Attach copies of the index key and auxiliary data to the fscache cookie so
    that:

    (1) The callbacks to the netfs for this stuff can be eliminated. This
    can simplify things in the cache as the information is still
    available, even after the cache has relinquished the cookie.

    (2) Simplifies the locking requirements of accessing the information as we
    don't have to worry about the netfs object going away on us.

    (3) The cache can do lazy updating of the coherency information on disk.
    As long as the cache is flushed before reboot/poweroff, there's no
    need to update the coherency info on disk every time it changes.

    (4) Cookies can be hashed or put in a tree as the index key is easily
    available. This allows:

    (a) Checks for duplicate cookies can be made at the top fscache layer
    rather than down in the bowels of the cache backend.

    (b) Caching can be added to a netfs object that has a cookie if the
    cache is brought online after the netfs object is allocated.

    A certain amount of space is made in the cookie for inline copies of the
    data, but if it won't fit there, extra memory will be allocated for it.

    The downside of this is that live cache operation requires more memory.
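
    The inline-or-allocated storage can be sketched as below. This is an
    illustration only: the 16-byte inline capacity, the struct layout and
    the function names are assumptions, and malloc() stands in for the
    kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

#define INLINE_KEY_SIZE 16	/* hypothetical inline capacity */

/* Hypothetical cookie holding its own copy of the index key. */
struct cookie {
	size_t key_len;
	union {
		void *key;	/* used when key_len > INLINE_KEY_SIZE */
		unsigned char inline_key[INLINE_KEY_SIZE];
	};
};

/* Copy the key into the inline space, or allocate if it will not fit. */
static int cookie_set_key(struct cookie *c, const void *key, size_t len)
{
	void *dst = c->inline_key;

	if (len > sizeof(c->inline_key)) {
		dst = malloc(len);
		if (!dst)
			return -1;
		c->key = dst;
	}
	memcpy(dst, key, len);
	c->key_len = len;
	return 0;
}

/* Return whichever copy is in use. */
static const void *cookie_key(const struct cookie *c)
{
	return c->key_len > sizeof(c->inline_key) ? c->key : c->inline_key;
}

/* Release the out-of-line copy, if there is one. */
static void cookie_free_key(struct cookie *c)
{
	if (c->key_len > sizeof(c->inline_key))
		free(c->key);
	c->key_len = 0;
}
```

    Because the cookie owns the copy, the key stays readable after the
    netfs object it was taken from has gone away.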

    Signed-off-by: David Howells
    Acked-by: Anna Schumaker
    Tested-by: Steve Dickson

    David Howells
     
  • Add more tracepoints to fscache, including:

    (*) fscache_page - Tracks netfs pages known to fscache.

    (*) fscache_check_page - Tracks the netfs querying whether a page is
    pending storage.

    (*) fscache_wake_cookie - Tracks cookies being woken up after a page
    completes/aborts storage in the cache.

    (*) fscache_op - Tracks operations being initialised.

    (*) fscache_wrote_page - Tracks return of the backend write_page op.

    (*) fscache_gang_lookup - Tracks lookup of pages to be stored in the write
    operation.

    Signed-off-by: David Howells

    David Howells
     
  • Add some tracepoints to fscache:

    (*) fscache_cookie - Tracks a cookie's usage count.

    (*) fscache_netfs - Logs registration of a network filesystem, including
    the pointer to the cookie allocated.

    (*) fscache_acquire - Logs cookie acquisition.

    (*) fscache_relinquish - Logs cookie relinquishment.

    (*) fscache_enable - Logs enablement of a cookie.

    (*) fscache_disable - Logs disablement of a cookie.

    (*) fscache_osm - Tracks execution of states in the object state machine.

    and cachefiles:

    (*) cachefiles_ref - Tracks a cachefiles object's usage count.

    (*) cachefiles_lookup - Logs result of lookup_one_len().

    (*) cachefiles_mkdir - Logs result of vfs_mkdir().

    (*) cachefiles_create - Logs result of vfs_create().

    (*) cachefiles_unlink - Logs calls to vfs_unlink().

    (*) cachefiles_rename - Logs calls to vfs_rename().

    (*) cachefiles_mark_active - Logs an object becoming active.

    (*) cachefiles_wait_active - Logs a wait for an old object to be
    destroyed.

    (*) cachefiles_mark_inactive - Logs an object becoming inactive.

    (*) cachefiles_mark_buried - Logs the burial of an object.

    Signed-off-by: David Howells

    David Howells
     
  • If the fscache asynchronous write operation elects to discard a page that's
    pending storage to the cache because the page would be over the store limit
    then it needs to wake the page as someone may be waiting on completion of
    the write.

    The problem is that the store limit may be updated by a different
    asynchronous operation - and so may miss the write - and that the store
    limit may not even get updated until later by the netfs.

    Fix the kernel hang by making fscache_write_op() mark as written any pages
    that are over the limit.

    Signed-off-by: David Howells

    David Howells
     
  • Report if an fscache cookie is relinquished multiple times by the netfs.

    Signed-off-by: David Howells

    David Howells
     
  • The last parameter to fscache_op_complete() is a bool indicating whether or
    not the operation was cancelled. A lot of the time the inverse value is
    given or no differentiation is made. Fix this.

    Signed-off-by: David Howells

    David Howells
     
  • Fix a couple of checker warnings in fscache and cachefiles:

    (1) fscache_n_op_requeue is never used, so get rid of it.

    (2) cachefiles_uncache_page() is passed in a lock that it releases, so
    this needs annotating.

    Signed-off-by: David Howells

    David Howells
     

20 Mar, 2018

1 commit


17 Nov, 2017

1 commit

  • Pull AFS updates from David Howells:
    "kAFS filesystem driver overhaul.

    The major points of the overhaul are:

    (1) Preliminary groundwork is laid for supporting network-namespacing
    of kAFS. The remainder of the namespacing work requires some way
    to pass namespace information to submounts triggered by an
    automount. This requires something like the mount overhaul that's
    in progress.

    (2) sockaddr_rxrpc is used in preference to in_addr for holding
    addresses internally and add support for talking to the YFS VL
    server. With this, kAFS can do everything over IPv6 as well as
    IPv4 if it's talking to servers that support it.

    (3) Callback handling is overhauled to be generally passive rather
    than active. 'Callbacks' are promises by the server to tell us
    about data and metadata changes. Callbacks are now checked when
    we next touch an inode rather than actively going and looking for
    it where possible.

    (4) File access permit caching is overhauled to store the caching
    information per-inode rather than per-directory, shared over
    subordinate files. Whilst older AFS servers only allow ACLs on
    directories (shared to the files in that directory), newer AFS
    servers break that restriction.

    To improve memory usage and to make it easier to do mass-key
    removal, permit combinations are cached and shared.

    (5) Cell database management is overhauled to allow lighter locks to
    be used and to make cell records autonomous state machines that
    look after getting their own DNS records and cleaning themselves
    up, in particular preventing races in acquiring and relinquishing
    the fscache token for the cell.

    (6) Volume caching is overhauled. The afs_vlocation record is got rid
    of to simplify things and the superblock is now keyed on the cell
    and the numeric volume ID only. The volume record is tied to a
    superblock and normal superblock management is used to mediate
    the lifetime of the volume fscache token.

    (7) File server record caching is overhauled to make server records
    independent of cells and volumes. A server can be in multiple
    cells (in such a case, the administrator must make sure that the
    VL services for all cells correctly reflect the volumes shared
    between those cells).

    Server records are now indexed using the UUID of the server
    rather than the address since a server can have multiple
    addresses.

    (8) File server rotation is overhauled to handle VMOVED, VBUSY (and
    similar), VOFFLINE and VNOVOL indications and to handle rotation
    both of servers and addresses of those servers. The rotation will
    also wait and retry if the server says it is busy.

    (9) Data writeback is overhauled. Each inode no longer stores a list
    of modified sections tagged with the key that authorised it in
    favour of noting the modified region of a page in page->private
    and storing a list of keys that made modifications in the inode.

    This simplifies things and allows other keys to be used to
    actually write to the server if a key that made a modification
    becomes useless.

    (10) Writable mmap() is implemented. This allows a kernel to be built
    entirely on AFS.

    Note that pre-AFS-3.4 servers are no longer supported, though support can
    be added back if necessary (AFS-3.4 was released in 1998)"

    * tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits)
    afs: Protect call->state changes against signals
    afs: Trace page dirty/clean
    afs: Implement shared-writeable mmap
    afs: Get rid of the afs_writeback record
    afs: Introduce a file-private data record
    afs: Use a dynamic port if 7001 is in use
    afs: Fix directory read/modify race
    afs: Trace the sending of pages
    afs: Trace the initiation and completion of client calls
    afs: Fix documentation on # vs % prefix in mount source specification
    afs: Fix total-length calculation for multiple-page send
    afs: Only progress call state at end of Tx phase from rxrpc callback
    afs: Make use of the YFS service upgrade to fully support IPv6
    afs: Overhaul volume and server record caching and fileserver rotation
    afs: Move server rotation code into its own file
    afs: Add an address list concept
    afs: Overhaul cell database management
    afs: Overhaul permit caching
    afs: Overhaul the callback handling
    afs: Rename struct afs_call server member to cm_server
    ...

    Linus Torvalds
     

16 Nov, 2017

1 commit

  • Every pagevec_init user claims the pages being released are hot even in
    cases where it is unlikely the pages are hot. As no one cares about the
    hotness of pages being released to the allocator, just ditch the
    parameter.

    No performance impact is expected as the overhead is marginal. The
    parameter is removed simply because it is a bit stupid to have a useless
    parameter copied everywhere.

    Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Dave Hansen
    Cc: Jan Kara
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

13 Nov, 2017

1 commit

  • Make wait_on_atomic_t() pass the TASK_* mode on to its action function as an
    extra argument and make it 'unsigned int' throughout.

    Also, consolidate a bunch of identical action functions into a default
    function that can do the appropriate thing for the mode.

    Also, change the argument name in the bit_wait*() function declarations to
    reflect the fact that it's the mode and not the bit number.
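
    A minimal userspace sketch of what a consolidated, mode-aware default
    action function might look like. Every name here (mock_task, the MOCK_*
    constants, mock_default_wait_action) is a hypothetical stand-in rather
    than a kernel identifier, and the real function would also sleep before
    checking for signals:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's TASK_* mode bits and -EINTR. */
#define MOCK_TASK_INTERRUPTIBLE   0x1u
#define MOCK_TASK_UNINTERRUPTIBLE 0x2u
#define MOCK_EINTR                4

/* Hypothetical stand-in for the state of the waiting task. */
struct mock_task {
	bool signal_pending;
};

/*
 * One default action shared by all waiters: the mode passed in by the
 * caller decides whether a pending signal aborts the wait, so separate
 * interruptible and uninterruptible copies are no longer needed.
 */
static int mock_default_wait_action(const struct mock_task *task,
				    unsigned int mode)
{
	assert(task != NULL);
	if ((mode & MOCK_TASK_INTERRUPTIBLE) && task->signal_pending)
		return -MOCK_EINTR;
	return 0;
}
```

    Because the mode arrives as an argument, a caller that waits
    uninterruptibly and one that waits interruptibly can share this one
    function.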

    [Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait
    should be done differently, though he's not immediately sure as to how]

    Signed-off-by: David Howells
    Acked-by: Peter Zijlstra
    cc: Ingo Molnar

    David Howells
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

13 Oct, 2017

1 commit

  • When the file /proc/fs/fscache/objects (available with
    CONFIG_FSCACHE_OBJECT_LIST=y) is opened, we request a user key with
    description "fscache:objlist", then access its payload. However, a
    revoked key has a NULL payload, and we failed to check for this.
    request_key() *does* skip revoked keys, but there is still a window
    where the key can be revoked before we access its payload.

    Fix it by checking for a NULL payload, treating it like a key which was
    already revoked at the time it was requested.
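
    The shape of the check can be sketched in userspace as follows. The
    mock_key type and mock_key_config() are hypothetical simplifications,
    not the kernel's key API, and falling back to a caller-supplied default
    is an assumption about what "treating it like a revoked key" amounts to:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a key; not the kernel's struct key. */
struct mock_key {
	const char *payload;	/* NULL once the key has been revoked */
};

/*
 * Return the configuration string to use. A key that was valid when it
 * was requested may have been revoked since, so the payload is checked
 * before use, and a revoked key is handled exactly like a missing one.
 */
static const char *mock_key_config(const struct mock_key *key,
				   const char *fallback)
{
	assert(fallback != NULL);
	if (key == NULL || key->payload == NULL)
		return fallback;
	return key->payload;
}
```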

    Fixes: 4fbf4291aa15 ("FS-Cache: Allow the current state of all objects to be dumped")
    Reviewed-by: James Morris
    Cc: [v2.6.32+]
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells

    Eric Biggers
     

14 Sep, 2017

1 commit

  • gcc points out a minor bug in the handling of unknown cookie types,
    which could result in a string overflow when the integer is copied into
    a 3-byte string:

    fs/fscache/object-list.c: In function 'fscache_objlist_show':
    fs/fscache/object-list.c:265:19: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
    sprintf(_type, "%02u", cookie->def->type);
    ^~~~~~
    fs/fscache/object-list.c:265:4: note: 'sprintf' output between 3 and 4 bytes into a destination of size 3

    This is currently harmless as no code sets a type other than 0 or 1, but
    it makes sense to use snprintf() here to avoid overflowing the array if
    that changes.
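
    As an illustration of the difference, a small standalone sketch (the
    helper format_cookie_type() is hypothetical and not the kernel code):
    snprintf() never writes more than the given size, and its return value
    reports how many characters the full output would have needed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a type number as at least two digits into a caller-supplied
 * buffer. Returns the length the full output would need (excluding the
 * terminating nul), so a result >= size means the output was truncated.
 */
static int format_cookie_type(char *buf, size_t size, unsigned int type)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%02u", type);
}
```

    With a 3-byte buffer, a type of 1 yields "01" and a return value of 2,
    whereas a type of 123 is truncated to "12" with a return value of 3
    instead of writing past the end of the array.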

    Link: http://lkml.kernel.org/r/20170714120720.906842-22-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     

07 Sep, 2017

2 commits

  • All users of pagevec_lookup() and pagevec_lookup_range() now pass
    PAGEVEC_SIZE as a desired number of pages.

    Just drop the argument.

    Link: http://lkml.kernel.org/r/20170726114704.7626-11-jack@suse.cz
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Make pagevec_lookup() (and underlying find_get_pages()) update index to
    the next page where iteration should continue. Most callers want this
    and also pagevec_lookup_tag() already does this.
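
    The calling convention can be sketched in userspace like this. The
    function mock_lookup() and the MOCK_BATCH size are hypothetical
    stand-ins working on a sorted array of indices, not the pagevec API:

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_BATCH 4	/* hypothetical batch size */

/*
 * Copy up to MOCK_BATCH entries that are >= *index from the sorted array
 * present[] into out[], then advance *index to one past the last entry
 * returned, so the next call continues where this one stopped.
 */
static size_t mock_lookup(const unsigned long *present, size_t n,
			  unsigned long *index, unsigned long *out)
{
	size_t found = 0;

	assert(index != NULL && out != NULL);
	for (size_t i = 0; i < n && found < MOCK_BATCH; i++) {
		if (present[i] >= *index)
			out[found++] = present[i];
	}
	if (found > 0)
		*index = out[found - 1] + 1;
	return found;
}
```

    A caller can then loop until the lookup returns 0 without computing the
    restart index by hand after each batch.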

    Link: http://lkml.kernel.org/r/20170726114704.7626-3-jack@suse.cz
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara