04 Feb, 2018

1 commit

  • [ Upstream commit 6b18dd1c03e07262ea0866084856b2a3c5ba8d09 ]

    lockd_inet[6]addr_event use nlmsvc_rqst without taking nlmsvc_mutex, so
    nlmsvc_rqst can be changed during execution of notifiers and crash the host.

    Patch enables access to nlmsvc_rqst only when it was correctly initialized
    and delays its cleanup until notifiers are no longer in use.

    Note that nlmsvc_rqst can be temporarily set to ERR_PTR, so the "if
    (nlmsvc_rqst)" check in notifiers is insufficient on its own.

    Signed-off-by: Vasily Averin
    Tested-by: Scott Mayhew
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
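
The note about ERR_PTR in the commit above can be illustrated in userspace. This is a minimal sketch, not the upstream patch: `err_ptr()`, `is_err()`, `naive_usable()` and `checked_usable()` are hypothetical stand-ins written for this example, loosely modelled on the kernel's ERR_PTR()/IS_ERR() helpers. It only shows why a plain non-NULL test accepts an error-encoded pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: errors are encoded in the top 4095 addresses, as in the kernel. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (userspace stand-in). */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True if the pointer falls in the error-encoding range. */
static inline bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Insufficient: an error-encoded pointer is non-NULL, so it passes. */
static bool naive_usable(const void *rqst)
{
    return rqst != NULL;
}

/* Stricter: reject both NULL and error-encoded pointers. */
static bool checked_usable(const void *rqst)
{
    return rqst != NULL && !is_err(rqst);
}

/* Usage example: -12 stands for an ENOMEM-style failure. */
static void demo(void)
{
    int real_object = 0;
    void *failed = err_ptr(-12);

    assert(naive_usable(failed));        /* the naive check is fooled */
    assert(!checked_usable(failed));     /* the stricter check is not */
    assert(!checked_usable(NULL));
    assert(checked_usable(&real_object));
}
```

A stricter pointer check alone does not remove the race described in the commit; the commit message says the patch also delays cleanup until the notifiers are no longer in use.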
     

05 Dec, 2017

1 commit

  • commit 3a2b19d1ee5633f76ae8a88da7bc039a5d1732aa upstream.

    Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect:
    it removes lockd_manager and disarms grace_period_end for init_net only.

    If nfsd was started from another net namespace lockd_up_net() calls
    set_grace_period() that adds lockd_manager into per-netns list
    and queues grace_period_end delayed work.

    These actions should be reverted in lockd_down_net().
    Otherwise this can lead to a double list_add after nfsd is restarted in the netns,
    and to a use-after-free if the non-disarmed delayed work is executed after the netns is destroyed.

    Fixes: efda760fe95e ("lockd: fix lockd shutdown race")
    Signed-off-by: Vasily Averin
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
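
The double list_add mentioned in the commit above can be modelled in userspace. The sketch below is an assumption-laden illustration, not kernel code: `struct list_head`, `list_add()` and `list_del()` here are simplified re-implementations that follow the same pointer updates as the kernel helpers of the same names, and `list_is_well_formed()` is a hypothetical checker added for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert `entry` right after `head`. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    struct list_head *next = head->next;

    next->prev = entry;
    entry->next = next;
    entry->prev = head;
    head->next = entry;
}

/* Unlink `entry` from whatever list it is on. */
static void list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
}

/* Walk at most `limit` nodes; true if the walk gets back to `head`. */
static bool list_is_well_formed(const struct list_head *head, size_t limit)
{
    const struct list_head *pos = head->next;

    for (size_t i = 0; i < limit; i++, pos = pos->next)
        if (pos == head)
            return true;
    return false;
}

static void demo(void)
{
    struct list_head grace_list, manager;

    init_list_head(&grace_list);

    /* Balanced "up" then "down": the list stays consistent. */
    list_add(&manager, &grace_list);
    list_del(&manager);
    assert(list_is_well_formed(&grace_list, 8));

    /* Two "up" calls with no "down" in between: the node links to itself. */
    list_add(&manager, &grace_list);
    list_add(&manager, &grace_list);
    assert(manager.next == &manager);
    assert(!list_is_well_formed(&grace_list, 8));
}
```

After the second add, a traversal that starts at the list head never returns to it, which is one way an unbalanced up/down pair can turn into a hang.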
     

30 Nov, 2017

1 commit

  • commit dc3033e16c59a2c4e62b31341258a5786cbcee56 upstream.

    lockd_up() can call lockd_unregister_notifiers() twice:
    inside lockd_start_svc() when it calls lockd_svc_exit_thread(),
    and then again in the error path of lockd_up().

    This patch forces lockd_start_svc() to unregister notifiers in all error cases
    and removes the extra unregister in the error path of lockd_up().

    Fixes: cb7d224f82e4 "lockd: unregister notifier blocks if the service ..."
    Signed-off-by: Vasily Averin
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
     

25 Aug, 2017

1 commit


15 May, 2017

2 commits


09 May, 2017

1 commit

  • As reported by David Jeffery: "a signal was sent to lockd while lockd
    was shutting down from a request to stop nfs. The signal causes lockd
    to call restart_grace() which puts the lockd_net structure on the grace
    list. If this signal is received at the wrong time, it will occur after
    lockd_down_net() has called locks_end_grace() but before
    lockd_down_net() stops the lockd thread. This leads to lockd putting
    the lockd_net structure back on the grace list, then exiting without
    anything removing it from the list."

    So, perform the final locks_end_grace() from the lockd thread; this
    ensures it's serialized with respect to restart_grace().

    Reported-by: David Jeffery
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

02 Mar, 2017

1 commit


01 Feb, 2017

1 commit


18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into a zero-based array and
    thus is an unsigned entity. Using a negative value is an
    out-of-bounds access by definition.

    2)
    On x86_64 unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added or subtracted to pointers
    are preferred to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
        g(p[i]);
    }

    roughly translates to

    movsx rsi, esi
    mov rdi, [rsi+...]
    call g

    MOVSX is a 3-byte instruction which isn't necessary if the variable is
    unsigned because x86_64 zero-extends by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
        ...
        ptr = ng->ptr[id - 1];
        ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a seemingly random artefact of code generation, with the register
    allocator being used differently. gcc decides that some variable
    needs to live in the new r8+ registers and every access now requires a REX
    prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
    used, which is longer than [r8].

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function                       old     new   delta
    nfsd4_lock                    3886    3959     +73
    tipc_link_build_proto_msg     1096    1140     +44
    mac80211_hwsim_new_radio      2776    2808     +32
    tipc_mon_rcv                  1032    1058     +26
    svcauth_gss_legacy_init       1413    1429     +16
    tipc_bcbase_select_primary     379     392     +13
    nfsd4_exchange_id             1247    1260     +13
    nfsd4_setclientid_confirm      782     793     +11
    ...
    put_client_renew_locked        494     480     -14
    ip_set_sockfn_get              730     716     -14
    geneve_sock_add                829     813     -16
    nfsd4_sequence_done            721     703     -18
    nlmclnt_lookup_host            708     686     -22
    nfsd4_lockt                   1085    1063     -22
    nfs_get_client                1077    1050     -27
    tcf_bpf_init                  1106    1076     -30
    nfsd4_encode_fattr            5997    5930     -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
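
The signed-versus-unsigned index argument in the commit above can be tried on a small pair of functions. This is a sketch under stated assumptions: `struct generic_table`, `lookup_signed()` and `lookup_unsigned()` are hypothetical names invented for this example and are not the kernel's `net_generic()` implementation. Both functions return the same slot for valid ids; the difference the commit describes is only in the generated code, which can be inspected by compiling with something like `gcc -O2 -S`. The exact instructions depend on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SLOTS 4

/* Hypothetical stand-in for a per-namespace pointer table. */
struct generic_table {
    void *ptr[TABLE_SLOTS];
};

/*
 * Signed id: on x86_64 the 32-bit index generally has to be
 * sign-extended to 64 bits before it can be used in an address.
 */
static void *lookup_signed(const struct generic_table *t, int id)
{
    assert(id >= 1 && id <= TABLE_SLOTS);
    return t->ptr[id - 1];
}

/*
 * Unsigned id: 32-bit operations on x86_64 already zero the upper
 * half of the register, so no separate extension step is needed.
 */
static void *lookup_unsigned(const struct generic_table *t, unsigned int id)
{
    assert(id >= 1 && id <= TABLE_SLOTS);
    return t->ptr[id - 1];
}

/* Usage example: ids are 1-based, as in the `id - 1` indexing above. */
static void demo(void)
{
    int a = 1, b = 2;
    struct generic_table t = { .ptr = { &a, &b, NULL, NULL } };

    assert(lookup_signed(&t, 2) == &b);
    assert(lookup_unsigned(&t, 2) == &b);
}
```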
     

01 Jul, 2016

1 commit

  • If the lockd service fails to start up then we need to be sure that the
    notifier blocks are not registered, otherwise a subsequent start of the
    service could cause the same notifier to be registered twice, leading to
    soft lockups.

    Signed-off-by: Scott Mayhew
    Cc: stable@vger.kernel.org
    Fixes: 0751ddf77b6a "lockd: Register callbacks on the inetaddr_chain..."
    Signed-off-by: J. Bruce Fields

    Scott Mayhew
     

07 Jan, 2016

2 commits


23 Dec, 2015

1 commit


24 Oct, 2015

1 commit

    Currently we have a reference-counted per-net NSM RPC client
    which is created on the first monitor request and destroyed
    after the last unmonitor request. It's needed because the
    RPC client needs to know 'utsname()->nodename', but utsname()
    might be NULL when nsm_unmonitor() is called.

    So instead of holding the rpc client we could just save nodename
    in struct nlm_host and pass it to rpc_create().
    Thus there is no need to keep the rpc client until the last
    unmonitor request. We could create separate RPC clients
    for each monitor/unmonitor request.

    Signed-off-by: Andrey Ryabinin
    Signed-off-by: J. Bruce Fields

    Andrey Ryabinin
     

13 Oct, 2015

1 commit

  • Commit cb7323fffa85 ("lockd: create and use per-net NSM
    RPC clients on MON/UNMON requests") introduced per-net
    NSM RPC clients. Unfortunately this doesn't make any sense
    without per-net nsm_handle.

    E.g. the following scenario could happen:
    Two hosts (X and Y) in different namespaces (A and B) share
    the same nsm struct.

    1. nsm_monitor(host_X) called => NSM rpc client created,
       nsm->sm_monitored bit set.
    2. nsm_monitor(host_Y) called => nsm->sm_monitored already set,
       we just exit. Thus in namespace B ln->nsm_clnt == NULL.
    3. host X destroyed => nsm->sm_count decremented to 1
    4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr
       dereference of *ln->nsm_clnt

    So this could be fixed by making the nsm_handles list per-net
    instead of global. Thus different net namespaces will not be able
    to share the same nsm_handle.

    Signed-off-by: Andrey Ryabinin
    Signed-off-by: J. Bruce Fields

    Andrey Ryabinin
     

13 Aug, 2015

1 commit


11 Aug, 2015

2 commits


06 Jan, 2015

1 commit

  • This commit fixes a race whereby nlmclnt_init() first starts the lockd
    daemon, and then calls nlm_bind_host() with the expectation that
    nlmsvc_timeout has already been initialised. Unfortunately, there is
    no synchronisation between lockd() and lockd_up() to guarantee that this
    is the case.

    The fix is to move the initialisation of nlmsvc_timeout into lockd_create_svc.

    Fixes: 9a1b6bf818e74 ("LOCKD: Don't call utsname()->nodename...")
    Cc: Bruce Fields
    Cc: stable@vger.kernel.org # 3.10.x
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

10 Dec, 2014

1 commit


09 Oct, 2014

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Highlights:

    - support the NFSv4.2 SEEK operation (allowing clients to support
    SEEK_HOLE/SEEK_DATA), thanks to Anna.
    - end the grace period early in a number of cases, mitigating a
    long-standing annoyance, thanks to Jeff
    - improve SMP scalability, thanks to Trond"

    * 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits)
    nfsd: eliminate "to_delegation" define
    NFSD: Implement SEEK
    NFSD: Add generic v4.2 infrastructure
    svcrdma: advertise the correct max payload
    nfsd: introduce nfsd4_callback_ops
    nfsd: split nfsd4_callback initialization and use
    nfsd: introduce a generic nfsd4_cb
    nfsd: remove nfsd4_callback.cb_op
    nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence
    nfsd: fix nfsd4_cb_recall_done error handling
    nfsd4: clarify how grace period ends
    nfsd4: stop grace_time update at end of grace period
    nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients
    nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls
    nfsd: serialize nfsdcltrack upcalls for a particular client
    nfsd: pass extra info in env vars to upcalls to allow for early grace period end
    nfsd: add a v4_end_grace file to /proc/fs/nfsd
    lockd: add a /proc/fs/lockd/nlm_end_grace file
    nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE
    nfsd: remove redundant boot_time parm from grace_done client tracking op
    ...

    Linus Torvalds
     

18 Sep, 2014

2 commits

  • Add a new procfile that will allow a (privileged) userland process to
    end the NLM grace period early. The basic idea here will be to have
    sm-notify write to this file, if it sent out no NOTIFY requests when
    it runs. In that situation, we can generally expect that there will be
    no reclaim requests so the grace period can be lifted early.

    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • Currently, all of the grace period handling is part of lockd. Eventually
    though we'd like to be able to build v4-only servers, at which point
    we'll need to put all of this elsewhere.

    Move the code itself into fs/nfs_common and have it build a grace.ko
    module. Then, rejigger the Kconfig options so that both nfsd and lockd
    enable it automatically.

    Signed-off-by: Jeff Layton

    Jeff Layton
     

09 Sep, 2014

1 commit

    Nikita Yushchenko reported that booting a kernel with init=/bin/sh and
    then nfs mounting without portmap or rpcbind running using a busybox
    mount resulted in:

    # mount -t nfs 10.30.130.21:/opt /mnt
    svc: failed to register lockdv1 RPC service (errno 111).
    lockd_up: makesock failed, error=-111
    Unable to handle kernel paging request for data at address 0x00000030
    Faulting instruction address: 0xc055e65c
    Oops: Kernel access of bad area, sig: 11 [#1]
    MPC85xx CDS
    Modules linked in:
    CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
    task: cf29cea0 ti: cf35c000 task.ti: cf35c000
    NIP: c055e65c LR: c0566490 CTR: c055e648
    REGS: cf35dad0 TRAP: 0300 Not tainted (3.10.44.cge)
    MSR: 00029000 CR: 22442488 XER: 20000000
    DEAR: 00000030, ESR: 00000000

    GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
    00000000
    GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
    10090ae8
    GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
    bfa46ef0
    GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
    cf0ded80
    NIP [c055e65c] call_start+0x14/0x34
    LR [c0566490] __rpc_execute+0x70/0x250
    Call Trace:
    [cf35db80] [00000080] 0x80 (unreliable)
    [cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
    [cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
    [cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
    [cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
    [cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
    [cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
    [cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
    [cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
    [cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
    [cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
    [cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
    [cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
    [cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
    [cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
    [cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
    [cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
    [cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
    [cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c

    The addition of svc_shutdown_net() resulted in two calls to
    svc_rpcb_cleanup(); the second is no longer necessary and crashes when
    it calls rpcb_register_call with clnt=NULL.

    Reported-by: Nikita Yushchenko
    Fixes: 679b033df484 "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
    Cc: stable@vger.kernel.org
    Acked-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

03 Sep, 2014

1 commit


18 Aug, 2014

1 commit


11 Jun, 2014

1 commit

  • Pull nfsd updates from Bruce Fields:
    "The largest piece is a long-overdue rewrite of the xdr code to remove
    some annoying limitations: for example, there was no way to return
    ACLs larger than 4K, and readdir results were returned only in 4k
    chunks, limiting performance on large directories.

    Also:
    - part of Neil Brown's work to make NFS work reliably over the
    loopback interface (so client and server can run on the same
    machine without deadlocks). The rest of it is coming through
    other trees.
    - cleanup and bugfixes for some of the server RDMA code, from
    Steve Wise.
    - Various cleanup of NFSv4 state code in preparation for an
    overhaul of the locking, from Jeff, Trond, and Benny.
    - smaller bugfixes and cleanup from Christoph Hellwig and
    Kinglong Mee.

    Thanks to everyone!

    This summer looks likely to be busier than usual for knfsd. Hopefully
    we won't break it too badly; testing definitely welcomed"

    * 'for-3.16' of git://linux-nfs.org/~bfields/linux: (100 commits)
    nfsd4: fix FREE_STATEID lockowner leak
    svcrdma: Fence LOCAL_INV work requests
    svcrdma: refactor marshalling logic
    nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry
    nfs4: remove unused CHANGE_SECURITY_LABEL
    nfsd4: kill READ64
    nfsd4: kill READ32
    nfsd4: simplify server xdr->next_page use
    nfsd4: hash deleg stateid only on successful nfs4_set_delegation
    nfsd4: rename recall_lock to state_lock
    nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound
    nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open
    nfsd4: use recall_lock for delegation hashing
    nfsd: fix laundromat next-run-time calculation
    nfsd: make nfsd4_encode_fattr static
    SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING
    nfsd: remove unused function nfsd_read_file
    nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer
    NFSD: Error out when getting more than one fsloc/secinfo/uuid
    NFSD: Using type of uint32_t for ex_nflavors instead of int
    ...

    Linus Torvalds
     

07 Jun, 2014

1 commit


07 May, 2014

1 commit

  • When building without CONFIG_SYSCTL, the compiler saw an unused
    label. This moves the label into the #ifdef it is used under.

    fs/lockd/svc.c: In function ‘init_nlm’:
    fs/lockd/svc.c:626:1: warning: label ‘err_sysctl’ defined but not used [-Wunused-label]

    Signed-off-by: Kees Cook
    Signed-off-by: J. Bruce Fields

    Kees Cook
     

28 Mar, 2014

1 commit

  • We had a Fedora ABRT report with a stack trace like this:

    kernel BUG at net/sunrpc/svc.c:550!
    invalid opcode: 0000 [#1] SMP
    [...]
    CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
    Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
    task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
    RIP: 0010:[] [] svc_destroy+0x128/0x130 [sunrpc]
    RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206
    RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
    RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
    RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
    R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
    R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
    FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
    Stack:
    ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
    ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
    ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
    Call Trace:
    [] lockd_up+0xaa/0x330 [lockd]
    [] nfsd_svc+0x1b5/0x2f0 [nfsd]
    [] ? simple_strtoull+0x2c/0x50
    [] ? write_pool_threads+0x280/0x280 [nfsd]
    [] write_threads+0x8b/0xf0 [nfsd]
    [] ? __get_free_pages+0x14/0x50
    [] ? get_zeroed_page+0x16/0x20
    [] ? simple_transaction_get+0xb1/0xd0
    [] nfsctl_transaction_write+0x48/0x80 [nfsd]
    [] vfs_write+0xb4/0x1f0
    [] ? putname+0x29/0x40
    [] SyS_write+0x49/0xa0
    [] ? __audit_syscall_exit+0x1f6/0x2a0
    [] system_call_fastpath+0x16/0x1b
    Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
    RIP [] svc_destroy+0x128/0x130 [sunrpc]
    RSP

    Evidently, we created some lockd sockets and then failed to create
    others. make_socks then returned an error and we tried to tear down the
    svc, but svc->sv_permsocks was not empty so we ended up tripping over
    the BUG() in svc_destroy().

    Fix this by ensuring that we tear down any live sockets we created when
    socket creation is going to return an error.

    Fixes: 786185b5f8abefa (SUNRPC: move per-net operations from...)
    Reported-by: Raphos
    Signed-off-by: Jeff Layton
    Reviewed-by: Stanislav Kinsbursky
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Jeff Layton
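
The fix described in the commit above follows a common "all or nothing" error-path pattern: if creating one of several sockets fails, the ones already created are torn down before the error is returned. The sketch below is a userspace model under stated assumptions. `struct service`, `create_sock()`, `close_all_socks()` and the `fail_at` parameter are hypothetical and exist only for this example; `make_socks()` merely borrows the name used in the commit message and is not the kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SOCKS 4

/* Hypothetical service state; open_count models a non-empty socket list. */
struct service {
    bool open[MAX_SOCKS];
    size_t open_count;
};

/* Pretend to create socket `idx`; fails when idx equals `fail_at`. */
static int create_sock(struct service *svc, size_t idx, size_t fail_at)
{
    assert(idx < MAX_SOCKS);
    if (idx == fail_at)
        return -1;
    svc->open[idx] = true;
    svc->open_count++;
    return 0;
}

/* Close every socket that is currently open. */
static void close_all_socks(struct service *svc)
{
    for (size_t i = 0; i < MAX_SOCKS; i++) {
        if (svc->open[i]) {
            svc->open[i] = false;
            svc->open_count--;
        }
    }
}

/*
 * Create all sockets or none: on failure, undo the sockets created so
 * far, so the caller can destroy the service without live sockets.
 */
static int make_socks(struct service *svc, size_t fail_at)
{
    for (size_t i = 0; i < MAX_SOCKS; i++) {
        int err = create_sock(svc, i, fail_at);

        if (err) {
            close_all_socks(svc);
            return err;
        }
    }
    return 0;
}
```

With this shape, a caller that sees an error from `make_socks()` can rely on `open_count` being zero, which corresponds to the empty socket list that the BUG() in the report expected.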
     

04 Jul, 2013

1 commit


13 Oct, 2012

1 commit

  • Pull nfsd update from J Bruce Fields:
    "Another relatively quiet cycle. There was some progress on my
    remaining 4.1 todo's, but a couple of them were just of the form
    "check that we do X correctly", so didn't have much effect on the
    code.

    Other than that, a bunch of cleanup and some bugfixes (including an
    annoying NFSv4.0 state leak and a busy-loop in the server that could
    cause it to peg the CPU without making progress)."

    * 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits)
    UAPI: (Scripted) Disintegrate include/linux/sunrpc
    UAPI: (Scripted) Disintegrate include/linux/nfsd
    nfsd4: don't allow reclaims of expired clients
    nfsd4: remove redundant callback probe
    nfsd4: expire old client earlier
    nfsd4: separate session allocation and initialization
    nfsd4: clean up session allocation
    nfsd4: minor free_session cleanup
    nfsd4: new_conn_from_crses should only allocate
    nfsd4: separate connection allocation and initialization
    nfsd4: reject bad forechannel attrs earlier
    nfsd4: enforce per-client sessions/no-sessions distinction
    nfsd4: set cl_minorversion at create time
    nfsd4: don't pin clientids to pseudoflavors
    nfsd4: fix bind_conn_to_session xdr comment
    nfsd4: cast readlink() bug argument
    NFSD: pass null terminated buf to kstrtouint()
    nfsd: remove duplicate init in nfsd4_cb_recall
    nfsd4: eliminate redundant nfs4_free_stateid
    fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR
    ...

    Linus Torvalds
     

02 Oct, 2012

1 commit

    The NSM RPC client can be required on NFSv3 umount, when the child reaper is dying (and
    destroying its mount namespace). This means that the current nsproxy is set to
    NULL already, but creation of the RPC client requires the UTS namespace to obtain
    the hostname string.
    This patch introduces reference counted NFS RPC clients creation and
    destruction helpers (similar to RPCBIND RPC clients).

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     

22 Aug, 2012

1 commit


28 Jul, 2012

5 commits