21 Oct, 2020

1 commit

  • It's possible that, when using AUTH_SYS with the mountd manage-gids
    option, a user may hit the 8k RPC channel buffer limit. This has been
    observed in the field, causing unanswered RPCs on clients after mountd
    fails to write to the channel:

    rpc.mountd[11231]: auth_unix_gid: error writing reply

    Userland nfs-utils uses a buffer size of 32k (RPC_CHAN_BUF_SIZE), so
    let's match those two.
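
A rough way to see why 8k can be too small is to estimate the length of a reply that lists many group IDs. This is only a sketch: the one-line "uid expiry ngids gid ..." layout, the helper names, and OLD_CHAN_BUF_SIZE are assumptions made for illustration, not taken from the commit; only the 8k and 32k sizes and the name RPC_CHAN_BUF_SIZE come from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Buffer sizes named in the commit message (OLD_CHAN_BUF_SIZE is a made-up name). */
#define OLD_CHAN_BUF_SIZE (8 * 1024)
#define RPC_CHAN_BUF_SIZE (32 * 1024)

/*
 * Estimate the length of a text reply that lists ngids group IDs.
 * Assumption: the reply is one line of space-separated decimal fields,
 * "uid expiry ngids gid gid ...\n".
 */
size_t reply_len(unsigned uid, long long expiry,
                 const unsigned *gids, size_t ngids)
{
    assert(gids != NULL || ngids == 0);

    /* snprintf(NULL, 0, ...) returns the number of characters needed. */
    size_t len = (size_t)snprintf(NULL, 0, "%u %lld %zu", uid, expiry, ngids);

    for (size_t i = 0; i < ngids; i++)
        len += (size_t)snprintf(NULL, 0, " %u", gids[i]);

    return len + 1; /* trailing newline */
}

/* Does a reply of this length fit in a buffer of buf_size bytes? */
int reply_fits(size_t len, size_t buf_size)
{
    return len <= buf_size;
}
```

Under that assumed layout, a user in about a thousand groups with ten-digit GIDs already needs more than 8k, while 32k leaves room to spare.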

    Signed-off-by: Roberto Bergantinos Corpas
    Signed-off-by: J. Bruce Fields

    Roberto Bergantinos Corpas
     

26 Sep, 2020

2 commits


13 Apr, 2020

1 commit

  • Deleting a list entry within hlist_for_each_entry_safe is not safe unless
    the next pointer (tmp) is protected too. It's not, because once hash_lock
    is released, cache_clean may delete the entry that tmp points to. Then
    cache_purge can walk to a deleted entry and try to double free it.

    Fix this bug by holding only the deleted entry's reference.

    Suggested-by: NeilBrown
    Signed-off-by: Yihao Wu
    Reviewed-by: NeilBrown
    [ cel: removed unused variable ]
    Signed-off-by: Chuck Lever

    Yihao Wu
     

17 Mar, 2020

4 commits


08 Feb, 2020

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Highlights:

    - Server-to-server copy code from Olga.

    To use it, client and both servers must have support, the target
    server must be able to access the source server over NFSv4.2, and
    the target server must have the inter_copy_offload_enable module
    parameter set.

    - Improvements and bugfixes for the new filehandle cache, especially
    in the container case, from Trond

    - Also from Trond, better reporting of write errors.

    - Y2038 work from Arnd"

    * tag 'nfsd-5.6' of git://linux-nfs.org/~bfields/linux: (55 commits)
    sunrpc: expiry_time should be seconds not timeval
    nfsd: make nfsd_filecache_wq variable static
    nfsd4: fix double free in nfsd4_do_async_copy()
    nfsd: convert file cache to use over/underflow safe refcount
    nfsd: Define the file access mode enum for tracing
    nfsd: Fix a perf warning
    nfsd: Ensure sampling of the write verifier is atomic with the write
    nfsd: Ensure sampling of the commit verifier is atomic with the commit
    sunrpc: clean up cache entry add/remove from hashtable
    sunrpc: Fix potential leaks in sunrpc_cache_unhash()
    nfsd: Ensure exclusion between CLONE and WRITE errors
    nfsd: Pass the nfsd_file as arguments to nfsd4_clone_file_range()
    nfsd: Update the boot verifier on stable writes too.
    nfsd: Fix stable writes
    nfsd: Allow nfsd_vfs_write() to take the nfsd_file as an argument
    nfsd: Fix a soft lockup race in nfsd_file_mark_find_or_create()
    nfsd: Reduce the number of calls to nfsd_file_gc()
    nfsd: Schedule the laundrette regularly irrespective of file errors
    nfsd: Remove unused constant NFSD_FILE_LRU_RESCAN
    nfsd: Containerise filecache laundrette
    ...

    Linus Torvalds
     

04 Feb, 2020

1 commit

  • The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
    seq_file.h.

    Conversion rule is:

    llseek => proc_lseek
    unlocked_ioctl => proc_ioctl

    xxx => proc_xxx

    delete ".owner = THIS_MODULE" line

    [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
    [sfr@canb.auug.org.au: fix kernel/sched/psi.c]
    Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
    Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

23 Jan, 2020

2 commits


19 Dec, 2019

1 commit

  • The timestamps for the cache are all in boottime seconds, so they
    don't overflow 32-bit values, but the use of time_t is deprecated
    because it generally does overflow when used with wall-clock time.

    There are multiple possible ways of avoiding it:

    - leave time_t, which is safe here, but forces others to look into
    this code over and over to determine that it is.

    - use a more generic type, like 'int' or 'long', which is known
    to be sufficient here but loses the documentation of referring
    to timestamps

    - use ktime_t everywhere, and convert into seconds in the few
    places where we want realtime-seconds. The conversion is
    sometimes expensive, but not more so than the conversion we
    do today.

    - use time64_t to clarify that this code is safe. Nothing would
    change for 64-bit architectures, but it is slightly less
    efficient on 32-bit architectures.

    Without a clear winner among the approaches above, this picks
    the last one, favouring readability over a small performance
    loss on 32-bit architectures.
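
The distinction between boottime seconds and wall-clock seconds can be checked with a little arithmetic. This is a minimal userspace sketch: time64_sketch_t stands in for the kernel's time64_t (assumed here to be a signed 64-bit count of seconds), and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for time64_t: assumed to be a signed 64-bit count of seconds. */
typedef int64_t time64_sketch_t;

/* Returns 1 if the value survives being stored in a signed 32-bit field. */
int fits_in_32bit_seconds(time64_sketch_t secs)
{
    return secs >= INT32_MIN && secs <= INT32_MAX;
}

/* Seconds of uptime after the given number of 365.25-day years. */
time64_sketch_t uptime_seconds(int years)
{
    assert(years >= 0);
    return (time64_sketch_t)years * 31557600; /* 365.25 * 86400 */
}
```

Fifty years of uptime still fits in 32 bits, whereas wall-clock seconds since 1970 stop fitting at 2147483648 (January 2038), which is why a type that documents 64-bit safety reads more clearly here.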

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

12 Oct, 2019

1 commit

  • I was investigating a crash in our Virtuozzo7 kernel which happened in
    svcauth_unix_set_client. I found out that we access the m_client field
    in the ip_map structure, which was received from sunrpc_cache_lookup (we
    have a slightly older kernel; the code is now in sunrpc_cache_add_entry),
    and this field looks uninitialized (m_client == 0x74 doesn't look like a
    pointer), but in the cache_head flags we see 0x1, which is CACHE_VALID.

    It looks like the problem appeared from our previous fix to sunrpc (1):
    commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
    request")

    And we've also found a patch already fixing our patch (2):
    commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")

    Though the crash is eliminated, I think the core of the problem is not
    completely fixed:

    In patch (2), Neil makes the cache_head CACHE_NEGATIVE before the
    cache_fresh_locked call that was added in (1), to fix the crash. This way
    cache_is_valid won't say the cache is valid anymore, and in
    svcauth_unix_set_client the function cache_check will return an error
    instead of 0, so we don't count the entry as initialized.

    But it looks like we need to remove cache_fresh_locked completely in
    sunrpc_cache_lookup:

    In (1) we only wanted to make cache_fresh_unlocked->cache_dequeue so
    that cache_requests with no readers also release the corresponding
    cache_head, to fix their leak. Vasily and I were not sure whether
    cache_fresh_locked and cache_fresh_unlocked should be used as a pair or
    not, so we guessed that they should.

    Now we see that we don't want the CACHE_VALID bit set here by
    cache_fresh_locked, as "valid" means "initialized" and there is no
    initialization in sunrpc_cache_add_entry. Both expiry_time and
    last_refresh are not used in cache_fresh_unlocked code-path and also not
    required for the initial fix.

    So, to conclude, cache_fresh_locked was called by mistake, and we can
    safely remove it instead of propping it up with CACHE_NEGATIVE. That
    looks conceptually better to me. I hope I'm not missing something here.

    Here is our crash backtrace:
    [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
    [13108726.326365] IP: [] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
    [13108726.326448] PGD 0
    [13108726.326468] Oops: 0002 [#1] SMP
    [13108726.326497] Modules linked in: nbd isofs xfs loop kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
    [13108726.327173] dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat ip_vznetstat
    [13108726.327817] ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_cumulative_82_0_r1]
    [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded Tainted: G W O ------------ 3.10.0-862.20.2.vz7.73.29 #1 73.29
    [13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018
    [13108726.328554] task: ffffa0a6a41b1160 ti: ffffa0c2a74bc000 task.ti: ffffa0c2a74bc000
    [13108726.328610] RIP: 0010:[] [] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
    [13108726.328706] RSP: 0018:ffffa0c2a74bfd80 EFLAGS: 00010246
    [13108726.328750] RAX: 0000000000000001 RBX: ffffa0a6183ae000 RCX: 0000000000000000
    [13108726.328811] RDX: 0000000000000074 RSI: 0000000000000286 RDI: ffffa0c2a74bfcf0
    [13108726.328864] RBP: ffffa0c2a74bfe00 R08: ffffa0bab8c22960 R09: 0000000000000001
    [13108726.328916] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa0a32aa7f000
    [13108726.328969] R13: ffffa0a6183afac0 R14: ffffa0c233d88d00 R15: ffffa0c2a74bfdb4
    [13108726.329022] FS: 0000000000000000(0000) GS:ffffa0e17f9c0000(0000) knlGS:0000000000000000
    [13108726.329081] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [13108726.332311] CR2: 0000000000000074 CR3: 00000026a1b28000 CR4: 00000000007607e0
    [13108726.334606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [13108726.336754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [13108726.338908] PKRU: 00000000
    [13108726.341047] Call Trace:
    [13108726.343074] [] ? groups_alloc+0x34/0x110
    [13108726.344837] [] svc_set_client+0x24/0x30 [sunrpc]
    [13108726.346631] [] svc_process_common+0x241/0x710 [sunrpc]
    [13108726.348332] [] svc_process+0x103/0x190 [sunrpc]
    [13108726.350016] [] nfsd+0xdf/0x150 [nfsd]
    [13108726.351735] [] ? nfsd_destroy+0x80/0x80 [nfsd]
    [13108726.353459] [] kthread+0xd1/0xe0
    [13108726.355195] [] ? create_kthread+0x60/0x60
    [13108726.356896] [] ret_from_fork_nospec_begin+0x7/0x21
    [13108726.358577] [] ? create_kthread+0x60/0x60
    [13108726.360240] Code: 4c 8b 45 98 0f 8e 2e 01 00 00 83 f8 fe 0f 84 76 fe ff ff 85 c0 0f 85 2b 01 00 00 49 8b 50 40 b8 01 00 00 00 48 89 93 d0 1a 00 00 0f c1 02 83 c0 01 83 f8 01 0f 8e 53 02 00 00 49 8b 44 24 38
    [13108726.363769] RIP [] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
    [13108726.365530] RSP
    [13108726.367179] CR2: 0000000000000074

    Fixes: d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
    Signed-off-by: Pavel Tikhomirov
    Acked-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    Pavel Tikhomirov
     

19 Aug, 2019

1 commit

  • When the exports table is changed, exportfs will usually write a new
    time to the "flush" file in the nfsd.export cache procfile. This tells
    the kernel to flush any entries that are older than that value.

    This gives us a mechanism to tell whether an unexport might have
    occurred. Add a new ->flush cache_detail operation that is called after
    flushing the cache whenever someone writes to a "flush" file.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust
    Signed-off-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

30 Jul, 2019

1 commit

  • The sunrpc cache interface is susceptible to being fooled by a rogue
    process just reading a 'channel' file. If this happens the kernel
    may think a valid daemon exists to service the cache when it does not.
    For example, the following may fool the kernel:
    cat /proc/net/rpc/auth.unix.gid/channel

    Change the tracking of readers to writers when considering whether a
    listener exists as all valid daemon processes either open a channel
    file O_RDWR or O_WRONLY. While this does not prevent a rogue process
    from "stealing" a message from the kernel, it does at least improve
    the kernel's perception of whether a valid process servicing the cache
    exists.

    Signed-off-by: Dave Wysochanski
    Signed-off-by: J. Bruce Fields

    Dave Wysochanski
     

11 Jul, 2019

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Highlights:

    - Add a new /proc/fs/nfsd/clients/ directory which exposes some
    long-requested information about NFSv4 clients (like open files)
    and allows forced revocation of client state.

    - Replace the global duplicate reply cache by a cache per network
    namespace; previously, a request in one network namespace could
    incorrectly match an entry from another, though we haven't seen
    this in production. This is the last remaining container bug that
    I'm aware of; at this point you should be able to run separate
    nfsd's in each network namespace, each with their own set of
    exports, and everything should work.

    - Cleanup and modify lock code to show the pid of lockd as the owner
    of NLM locks. This is the correct version of the bugfix originally
    attempted in b8eee0e90f97 ("lockd: Show pid of lockd for remote
    locks")"

    * tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits)
    nfsd: Make __get_nfsdfs_client() static
    nfsd: Make two functions static
    nfsd: Fix misuse of strlcpy
    sunrpc/cache: remove the exporting of cache_seq_next
    nfsd: decode implementation id
    nfsd: create xdr_netobj_dup helper
    nfsd: allow forced expiration of NFSv4 clients
    nfsd: create get_nfsdfs_clp helper
    nfsd4: show layout stateids
    nfsd: show lock and deleg stateids
    nfsd4: add file to display list of client's opens
    nfsd: add more information to client info file
    nfsd: escape high characters in binary data
    nfsd: copy client's address including port number to cl_addr
    nfsd4: add a client info file
    nfsd: make client/ directory names small ints
    nfsd: add nfsd/clients directory
    nfsd4: use reference count to free client
    nfsd: rename cl_refcount
    nfsd: persist nfsd filesystem across mounts
    ...

    Linus Torvalds
     

09 Jul, 2019

1 commit

  • The function cache_seq_next is declared static and marked
    EXPORT_SYMBOL_GPL, which is at best an odd combination. Because the
    function is not used outside of the net/sunrpc/cache.c file it is
    defined in, this commit removes the EXPORT_SYMBOL_GPL() marking.

    Fixes: d48cf356a130 ("SUNRPC: Remove non-RCU protected lookup")
    Signed-off-by: Denis Efremov
    Signed-off-by: J. Bruce Fields

    Denis Efremov
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    released under terms in gpl version 2 see copying

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 5 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Armijn Hemel
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531081035.689962394@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

24 Apr, 2019

1 commit

  • If no handler (such as rpc.mountd) has opened
    a cache 'channel', the sunrpc cache responds to
    all lookup requests with -ENOENT. This is particularly
    important for the auth.unix.gid cache which is
    optional.

    If the channel was open briefly and an upcall was written to it,
    this upcall remains pending even when the handler closes the
    channel. When an upcall is pending, the code currently
    doesn't check if there are still listeners, it only performs
    that check before sending an upcall.

    As the cache treats a recently closed channel (closed less than
    30 seconds ago) as "potentially still open", there is a
    reasonably sized window in which a request can become pending
    in a closed channel, and thereby block lookups indefinitely.

    This can easily be demonstrated by running
    cat /proc/net/rpc/auth.unix.gid/channel

    and then trying to mount an NFS filesystem from this host. It
    will block indefinitely (unless mountd is run with --manage-gids,
    or krb5 is used).

    When cache_check() finds that an upcall is pending, it should
    perform the "cache_listeners_exist()" test. If no
    listeners exist, the request should be negated.

    With this change in place, there can still be a 30-second wait on
    mount, until the cache gives up waiting for a handler to come
    back, but this is much better than an indefinite wait.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
     

06 Apr, 2019

1 commit

  • A recent commit added a call to cache_fresh_locked()
    when an expired item was found.
    The call sets the CACHE_VALID flag, so it is important
    that the item actually is valid.
    There are two ways it could be valid:
    1/ If ->update has been called to fill in relevant content
    2/ if CACHE_NEGATIVE is set, to say that content doesn't exist.

    An expired item that is waiting for an update will be neither.
    Setting CACHE_VALID will mean that a subsequent call to cache_put()
    will be likely to dereference uninitialised pointers.

    So we must make sure the item is valid, and we already have code to do
    that in try_to_negate_entry(). This takes the hash lock and so cannot
    be used directly, so take out the two lines that we need and use them.

    Now cache_fresh_locked() is certain to be called only on
    a valid item.
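
The invariant can be illustrated with a toy item type: the "valid" flag is only set once the item either carries content or is explicitly marked negative. The names and the bit values below are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits (assumed values, named after the kernel flags). */
#define ITEM_VALID    0x1u
#define ITEM_NEGATIVE 0x2u

struct item {
    unsigned    flags;
    const void *content; /* NULL until an update fills it in */
};

/* Mark an item fresh without ever leaving VALID set on empty content. */
void mark_fresh(struct item *it)
{
    assert(it != NULL);
    if (it->content == NULL)
        it->flags |= ITEM_NEGATIVE; /* "content doesn't exist" */
    it->flags |= ITEM_VALID;
}

/* Content may be dereferenced only for valid, non-negative items. */
int content_usable(const struct item *it)
{
    assert(it != NULL);
    return (it->flags & ITEM_VALID) && !(it->flags & ITEM_NEGATIVE);
}
```

An item that is still waiting for an update ends up valid and negative, so later code never follows its uninitialised pointers.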

    Cc: stable@kernel.org # 2.6.35
    Fixes: 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued request")
    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
     

05 Dec, 2018

1 commit

  • After commit d202cce8963d, an expired cache_head can be removed from the
    cache_detail's hash.

    However, the expired cache_head may be waiting for a reply from a
    previously submitted request. Such a cache_head has an increased
    refcounter and therefore it won't be freed after cache_put(freeme).

    Because the cache_head was removed from the hash it cannot be found
    during cache_clean() and can be leaked forever, together with stalled
    cache_request and other taken resources.

    In our case we noticed it because an entry in the export cache was
    holding a reference on a filesystem.

    Fixes: d202cce8963d ("sunrpc: never return expired entries in sunrpc_cache_lookup")
    Cc: Pavel Tikhomirov
    Cc: stable@kernel.org # 2.6.35
    Signed-off-by: Vasily Averin
    Reviewed-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    Vasily Averin
     

30 Oct, 2018

3 commits


03 Oct, 2018

1 commit


13 Jun, 2018

1 commit

  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:

    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.
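
The reason the two-factor form matters is that an open-coded product can wrap before the allocator ever sees it. The userspace sketch below mimics the idea with hypothetical helpers (mul_overflows, array3_size_sketch, zalloc_array); it is not the kernel's implementation, only an illustration of checked and saturating size arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Store the (possibly wrapped) product in *out; return 1 if it overflowed. */
int mul_overflows(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    *out = a * b; /* unsigned arithmetic wraps silently */
    return b != 0 && a > SIZE_MAX / b;
}

/* Three-factor size that saturates at SIZE_MAX instead of wrapping. */
size_t array3_size_sketch(size_t a, size_t b, size_t c)
{
    size_t ab, abc;

    if (mul_overflows(a, b, &ab) || mul_overflows(ab, c, &abc))
        return SIZE_MAX;
    return abc;
}

/* Zeroed array allocation that refuses requests whose size would wrap. */
void *zalloc_array(size_t n, size_t size)
{
    size_t bytes;

    if (mul_overflows(n, size, &bytes))
        return NULL;
    return calloc(1, bytes);
}
```

Saturating to SIZE_MAX means an overflowing request fails cleanly in the allocator rather than succeeding with a buffer that is far too small.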

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

06 Apr, 2018

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Chuck Lever did a bunch of work on nfsd tracepoints, on RDMA, and on
    server xdr decoding (with an eye towards eliminating a data copy in
    the RDMA case).

    I did some refactoring of the delegation code in preparation for
    eliminating some delegation self-conflicts and implementing write
    delegations"

    * tag 'nfsd-4.17' of git://linux-nfs.org/~bfields/linux: (40 commits)
    nfsd: fix incorrect umasks
    sunrpc: remove incorrect HMAC request initialization
    NFSD: Clean up legacy NFS SYMLINK argument XDR decoders
    NFSD: Clean up legacy NFS WRITE argument XDR decoders
    nfsd: Trace NFSv4 COMPOUND execution
    nfsd: Add I/O trace points in the NFSv4 read proc
    nfsd: Add I/O trace points in the NFSv4 write path
    nfsd: Add "nfsd_" to trace point names
    nfsd: Record request byte count, not count of vectors
    nfsd: Fix NFSD trace points
    svc: Report xprt dequeue latency
    sunrpc: Report per-RPC execution stats
    sunrpc: Re-purpose trace_svc_process
    sunrpc: Save remote presentation address in svc_xprt for trace events
    sunrpc: Simplify trace_svc_recv
    sunrpc: Simplify do_enqueue tracing
    sunrpc: Move trace_svc_xprt_dequeue()
    sunrpc: Update show_svc_xprt_flags() to include recently added flags
    svc: Simplify ->xpo_secure_port
    sunrpc: Remove unneeded pointer dereference
    ...

    Linus Torvalds
     

27 Mar, 2018

1 commit

  • Prefer the direct use of octal for permissions.

    Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
    and some typing.

    Miscellanea:

    o Whitespace neatening around these conversions.
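
As a quick illustration of what such a conversion amounts to, the symbolic permission bits can be compared with their octal spelling in userspace. SKETCH_S_IRUGO is a hypothetical stand-in for the kernel-only S_IRUGO macro, assumed here to mean read permission for user, group, and other.

```c
#include <assert.h>
#include <sys/stat.h>

/* Stand-in for the kernel's S_IRUGO: read for user, group, and other. */
#define SKETCH_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* The symbolic and octal spellings name the same bits. */
static_assert(SKETCH_S_IRUGO == 0444, "read-for-all is 0444");
static_assert((SKETCH_S_IRUGO | S_IWUSR) == 0644, "plus owner write is 0644");

/* Build a mode the symbolic way; callers can compare it with octal. */
unsigned symbolic_mode(int owner_can_write)
{
    unsigned mode = SKETCH_S_IRUGO;

    if (owner_can_write)
        mode |= S_IWUSR;
    return mode;
}
```

The octal form states the same permissions in four digits, which is the readability argument behind preferring it.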

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

20 Mar, 2018

1 commit

  • The interface for flushing the sunrpc auth cache was poorly
    designed and has caused problems a number of times.

    The design is that you write a timestamp, and all entries
    created before that time are discarded.
    The most obvious problem is that this is not what people
    actually want. They want to just flush the whole cache.
    The 1-second granularity can be a problem, as can the use
    of wall-clock time.

    A current problem is that code will write the current time to
    this file - expecting it to clear everything - and if the
    seconds number ticks over before this timestamp is checked,
    the test "then >= now" fails, and a full flush isn't forced.

    So let's just drop the subtleties and always flush the whole
    cache. The worst this could do is impose an extra cost
    refilling it, but that would require someone to be using
    non-standard tools.

    We still report an error if the string written is not a number,
    but we cause any valid number to flush the whole cache.
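
The resulting behaviour can be sketched as a write handler that validates its input and then empties everything. The struct and function below are hypothetical userspace stand-ins; the real handler operates on kernel cache structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy cache: only the number of live entries is tracked. */
struct sketch_cache {
    int entries;
};

/*
 * Handle a write to a "flush" file: reject anything that is not a
 * number, and let any valid number flush the whole cache.
 */
int write_flush(struct sketch_cache *c, const char *buf)
{
    assert(c != NULL && buf != NULL);
    char *end;

    errno = 0;
    (void)strtoll(buf, &end, 10); /* the value itself is ignored */
    if (end == buf || errno == ERANGE)
        return -EINVAL;
    if (*end != '\0' && *end != '\n')
        return -EINVAL;

    c->entries = 0; /* always a full flush */
    return 0;
}
```

Because the parsed value no longer matters, a tool that writes the current time cannot lose a race against the clock ticking over to the next second.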

    Reported-by: "Wang, Alan 1. (NSB - CN/Hangzhou)"
    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
     

12 Feb, 2018

1 commit

  • This is the mindless scripted replacement of kernel use of POLL*
    variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
    L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
    for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

    with de-mangling cleanups yet to come.

    NOTE! On almost all architectures, the EPOLL* constants have the same
    values as the POLL* constants do. But the keyword here is "almost".
    For various bad reasons they aren't the same, and epoll() doesn't
    actually work quite correctly in some cases due to this on Sparc et al.

    The next patch from Al will sort out the final differences, and we
    should be all done.
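
One way to stay independent of whether the two families share values is to translate between them by name. The sketch below does that with the userspace headers on Linux; the function names are hypothetical, and it covers only a handful of bits rather than the full list handled by the script above.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Translate poll(2) event bits into epoll bits by name, not by value. */
uint32_t poll_to_epoll(short events)
{
    uint32_t out = 0;

    if (events & POLLIN)  out |= EPOLLIN;
    if (events & POLLPRI) out |= EPOLLPRI;
    if (events & POLLOUT) out |= EPOLLOUT;
    if (events & POLLERR) out |= EPOLLERR;
    if (events & POLLHUP) out |= EPOLLHUP;
    return out;
}

/* Returns 1 if the bits above happen to share values on this build. */
int values_match_on_this_build(void)
{
    return POLLIN == EPOLLIN && POLLPRI == EPOLLPRI &&
           POLLOUT == EPOLLOUT && POLLERR == EPOLLERR &&
           POLLHUP == EPOLLHUP;
}
```

Code written this way gives the same answer whether or not an architecture defines the POLL* and EPOLL* constants identically.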

    Scripted-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

31 Jan, 2018

1 commit

  • Pull poll annotations from Al Viro:
    "This introduces a __bitwise type for POLL### bitmap, and propagates
    the annotations through the tree. Most of that stuff is as simple as
    'make ->poll() instances return __poll_t and do the same to local
    variables used to hold the future return value'.

    Some of the obvious brainos found in process are fixed (e.g. POLLIN
    misspelled as POLL_IN). At that point the amount of sparse warnings is
    low and most of them are for genuine bugs - e.g. ->poll() instance
    deciding to return -EINVAL instead of a bitmap. I hadn't touched those
    in this series - it's large enough as it is.

    Another problem it has caught was eventpoll() ABI mess; select.c and
    eventpoll.c assumed that corresponding POLL### and EPOLL### were
    equal. That's true for some, but not all of them - EPOLL### are
    arch-independent, but POLL### are not.

    The last commit in this series separates userland POLL### values from
    the (now arch-independent) kernel-side ones, converting between them
    in the few places where they are copied to/from userland. AFAICS, this
    is the least disruptive fix preserving poll(2) ABI and making epoll()
    work on all architectures.

    As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
    it will trigger only on what would've triggered EPOLLWRBAND on other
    architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
    at all on sparc. With this patch they should work consistently on all
    architectures"

    * 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
    make kernel-side POLL... arch-independent
    eventpoll: no need to mask the result of epi_item_poll() again
    eventpoll: constify struct epoll_event pointers
    debugging printk in sg_poll() uses %x to print POLL... bitmap
    annotate poll(2) guts
    9p: untangle ->poll() mess
    ->si_band gets POLL... bitmap stored into a user-visible long field
    ring_buffer_poll_wait() return value used as return value of ->poll()
    the rest of drivers/*: annotate ->poll() instances
    media: annotate ->poll() instances
    fs: annotate ->poll() instances
    ipc, kernel, mm: annotate ->poll() instances
    net: annotate ->poll() instances
    apparmor: annotate ->poll() instances
    tomoyo: annotate ->poll() instances
    sound: annotate ->poll() instances
    acpi: annotate ->poll() instances
    crypto: annotate ->poll() instances
    block: annotate ->poll() instances
    x86: annotate ->poll() instances
    ...

    Linus Torvalds
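
The conversion between userland POLL### values and arch-independent kernel-side values described above can be sketched as a bit-by-bit remapping. The bit values below are illustrative only and are not copied from any real header: the hypothetical arch layout is chosen so that its WRBAND bit collides with the arch-independent WRNORM bit, mirroring the kind of mismatch the pull message describes. The names `to_kernel` and `to_user` are also hypothetical.

```python
# Arch-independent, kernel-side bits (illustrative values).
EPOLL = {"IN": 0x001, "OUT": 0x004, "WRNORM": 0x100,
         "WRBAND": 0x200, "RDHUP": 0x2000}

# Hypothetical arch-specific userland layout that disagrees on some bits.
ARCH_POLL = {"IN": 0x001, "OUT": 0x004, "WRNORM": 0x004,
             "WRBAND": 0x100, "RDHUP": 0x800}


def to_kernel(user_mask):
    """Translate a userland POLL bitmap into kernel-side bits."""
    out = 0
    for name, ubit in ARCH_POLL.items():
        if user_mask & ubit:
            out |= EPOLL[name]
    return out


def to_user(kernel_mask):
    """Translate kernel-side bits back into the userland layout."""
    out = 0
    for name, kbit in EPOLL.items():
        if kernel_mask & kbit:
            out |= ARCH_POLL[name]
    return out


# Without conversion, the kernel-side WRNORM bit is read as the
# userland WRBAND bit on this hypothetical layout.
naive_collision = bool(EPOLL["WRNORM"] & ARCH_POLL["WRBAND"])
```

With an explicit translation at the userland boundary, each event name keeps its meaning on both sides even where the numeric values differ. Where the userland layout aliases two names to one bit (OUT and WRNORM here), the kernel-side result sets both.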
     

28 Nov, 2017

2 commits


02 Mar, 2017

1 commit

  • Pull NFS client updates from Anna Schumaker:
    "Highlights include:

    Stable bugfixes:
    - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
    - xprtrdma: Fix Read chunk padding
    - xprtrdma: Per-connection pad optimization
    - xprtrdma: Disable pad optimization by default
    - xprtrdma: Reduce required number of send SGEs
    - nlm: Ensure callback code also checks that the files match
    - pNFS/flexfiles: If the layout is invalid, it must be updated before
    retrying
    - NFSv4: Fix reboot recovery in copy offload
    - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION
    replies to OP_SEQUENCE"
    - NFSv4: fix getacl head length estimation
    - NFSv4: fix getacl ERANGE for some ACL buffer sizes

    Features:
    - Add and use dprintk_cont macros
    - Various cleanups to NFS v4.x to reduce code duplication and
    complexity
    - Remove unused cr_magic related code
    - Improvements to sunrpc "read from buffer" code
    - Clean up sunrpc timeout code and allow changing TCP timeout
    parameters
    - Remove duplicate mw_list management code in xprtrdma
    - Add generic functions for encoding and decoding xdr streams

    Bugfixes:
    - Clean up nfs_show_mountd_netid
    - Make layoutreturn_ops static and use NULL instead of 0 to fix
    sparse warnings
    - Properly handle -ERESTARTSYS in nfs_rename()
    - Check if register_shrinker() failed during rpcauth_init()
    - Properly clean up procfs/pipefs entries
    - Various NFS over RDMA related fixes
    - Silence uninitialized variable warning in sunrpc"

    * tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (64 commits)
    NFSv4: fix getacl ERANGE for some ACL buffer sizes
    NFSv4: fix getacl head length estimation
    Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
    NFSv4: Fix reboot recovery in copy offload
    pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
    NFSv4: Clean up owner/group attribute decode
    SUNRPC: Add a helper function xdr_stream_decode_string_dup()
    NFSv4: Remove bogus "struct nfs_client" argument from decode_ace()
    NFSv4: Fix the underestimation of delegation XDR space reservation
    NFSv4: Replace callback string decode function with a generic
    NFSv4: Replace the open coded decode_opaque_inline() with the new generic
    NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics
    SUNRPC: Add generic helpers for xdr_stream encode/decode
    sunrpc: silence uninitialized variable warning
    nlm: Ensure callback code also checks that the files match
    sunrpc: Allow xprt->ops->timer method to sleep
    xprtrdma: Refactor management of mw_list field
    xprtrdma: Handle stale connection rejection
    xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs
    xprtrdma: Shrink send SGEs array
    ...

    Linus Torvalds
     

01 Mar, 2017

1 commit

  • Pull nfsd updates from Bruce Fields:
    "The nfsd update this round is mainly a lot of miscellaneous cleanups
    and bugfixes.

    A couple changes could theoretically break working setups on upgrade.
    I don't expect complaints in practice, but they seem worth calling out
    just in case:

    - NFS security labels are now off by default; a new security_label
    export flag reenables it per export. But, having them on by default
    is a disaster, as it generally only makes sense if all your clients
    and servers have similar enough selinux policies. Thanks to Jason
    Tibbitts for pointing this out.

    - NFSv4/UDP support is off. It was never really supported, and the
    spec explicitly forbids it. We only ever left it on out of
    laziness; thanks to Jeff Layton for finally fixing that"

    * tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux: (34 commits)
    nfsd: Fix display of the version string
    nfsd: fix configuration of supported minor versions
    sunrpc: don't register UDP port with rpcbind when version needs congestion control
    nfs/nfsd/sunrpc: enforce transport requirements for NFSv4
    sunrpc: flag transports as having congestion control
    sunrpc: turn bitfield flags in svc_version into bools
    nfsd: remove superfluous KERN_INFO
    nfsd: special case truncates some more
    nfsd: minor nfsd_setattr cleanup
    NFSD: Reserve adequate space for LOCKT operation
    NFSD: Get response size before operation for all RPCs
    nfsd/callback: Drop a useless data copy when comparing sessionid
    nfsd/callback: skip the callback tag
    nfsd/callback: Cleanup callback cred on shutdown
    nfsd/idmap: return nfserr_inval for 0-length names
    SUNRPC/Cache: Always treat the invalid cache as unexpired
    SUNRPC: Drop all entries from cache_detail when cache_purge()
    svcrdma: Poll CQs in "workqueue" mode
    svcrdma: Combine list fields in struct svc_rdma_op_ctxt
    svcrdma: Remove unused sc_dto_q field
    ...

    Linus Torvalds
     

09 Feb, 2017

4 commits