17 Dec, 2016

1 commit

  • Pull nfsd updates from Bruce Fields:
    "The one new feature is support for a new NFSv4.2 mode_umask attribute
    that makes ACL inheritance a little more useful in environments that
    default to restrictive umasks. Requires client-side support, also on
    its way for 4.10.

    Other than that, miscellaneous smaller fixes and cleanup, especially
    to the server rdma code"

    [ The client side of the umask attribute was merged yesterday ]

    * tag 'nfsd-4.10' of git://linux-nfs.org/~bfields/linux:
    nfsd: add support for the umask attribute
    sunrpc: use DEFINE_SPINLOCK()
    svcrdma: Further clean-up of svc_rdma_get_inv_rkey()
    svcrdma: Break up dprintk format in svc_rdma_accept()
    svcrdma: Remove unused variable in rdma_copy_tail()
    svcrdma: Remove unused variables in xprt_rdma_bc_allocate()
    svcrdma: Remove svc_rdma_op_ctxt::wc_status
    svcrdma: Remove DMA map accounting
    svcrdma: Remove BH-disabled spin locking in svc_rdma_send()
    svcrdma: Renovate sendto chunk list parsing
    svcauth_gss: Close connection when dropping an incoming message
    svcrdma: Clear xpt_bc_xps in xprt_setup_rdma_bc() error exit arm
    nfsd: constify reply_cache_stats_operations structure
    nfsd: update workqueue creation
    sunrpc: GFP_KERNEL should be GFP_NOFS in crypto code
    nfsd: catch errors in decode_fattr earlier
    nfsd: clean up supported attribute handling
    nfsd: fix error handling for clients that fail to return the layout
    nfsd: more robust allocation failure handling in nfsd_reply_cache_init

    Linus Torvalds
     

18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into an zero based array and
    thus is unsigned entity. Using negative value is out-of-bound
    access by definition.

    2)
    On x86_64 unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added or subtracted to pointers
    are preffered to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
    g(p[i]);
    }

    roughly translates to

    movsx rsi, esi
    mov rdi, [rsi+...]
    call g

    MOVSX is 3 byte instruction which isn't necessary if the variable is
    unsigned because x86_64 is zero extending by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
    ...
    ptr = ng->ptr[id - 1];
    ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a semmingly random artefact of code generation with register
    allocator being used differently. gcc decides that some variable
    needs to live in new r8+ registers and every access now requires REX
    prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
    used which is longer than [r8]

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function old new delta
    nfsd4_lock 3886 3959 +73
    tipc_link_build_proto_msg 1096 1140 +44
    mac80211_hwsim_new_radio 2776 2808 +32
    tipc_mon_rcv 1032 1058 +26
    svcauth_gss_legacy_init 1413 1429 +16
    tipc_bcbase_select_primary 379 392 +13
    nfsd4_exchange_id 1247 1260 +13
    nfsd4_setclientid_confirm 782 793 +11
    ...
    put_client_renew_locked 494 480 -14
    ip_set_sockfn_get 730 716 -14
    geneve_sock_add 829 813 -16
    nfsd4_sequence_done 721 703 -18
    nlmclnt_lookup_host 708 686 -22
    nfsd4_lockt 1085 1063 -22
    nfs_get_client 1077 1050 -27
    tcf_bpf_init 1106 1076 -30
    nfsd4_encode_fattr 5997 5930 -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

15 Nov, 2016

1 commit


27 Sep, 2016

1 commit

  • NFSv4.1 has built-in trunking support that allows a client to determine
    whether two connections to two different IP addresses are actually to
    the same server. NFSv4.0 does not, but RFC 7931 attempts to provide
    clients a means to do this, basically by performing a SETCLIENTID to one
    address and confirming it with a SETCLIENTID_CONFIRM to the other.

    Linux clients since 05f4c350ee02 "NFS: Discover NFSv4 server trunking
    when mounting" implement a variation on this suggestion. It is possible
    that other clients do too.

    This depends on the clientid and verifier not being accepted by an
    unrelated server. Since both are 64-bit values, that would be very
    unlikely if they were random numbers. But they aren't:

    knfsd generates the 64-bit clientid by concatenating the 32-bit boot
    time (in seconds) and a counter. This makes collisions between
    clientids generated by the same server extremely unlikely. But
    collisions are very likely between clientids generated by servers that
    boot at the same time, and it's quite common for multiple servers to
    boot at the same time. The verifier is a concatenation of the
    SETCLIENTID time (in seconds) and a counter, so again collisions between
    different servers are likely if multiple SETCLIENTIDs are done at the
    same time, which is a common case.

    Therefore recent NFSv4.0 clients may decide two different servers are
    really the same, and mount a filesystem from the wrong server.

    Fortunately the Linux client, since 55b9df93ddd6 "nfsv4/v4.1: Verify the
    client owner id during trunking detection", only does this when given
    the non-default "migration" mount option.

    The fault is really with RFC 7931, and needs a client fix, but in the
    meantime we can mitigate the chance of these collisions by randomizing
    the starting value of the counters used to generate clientids and
    verifiers.

    Reported-by: Frank Sorenson
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

30 Jul, 2016

1 commit

  • Pull userns vfs updates from Eric Biederman:
    "This tree contains some very long awaited work on generalizing the
    user namespace support for mounting filesystems to include filesystems
    with a backing store. The real world target is fuse but the goal is
    to update the vfs to allow any filesystem to be supported. This
    patchset is based on a lot of code review and testing to approach that
    goal.

    While looking at what is needed to support the fuse filesystem it
    became clear that there were things like xattrs for security modules
    that needed special treatment. That the resolution of those concerns
    would not be fuse specific. That sorting out these general issues
    made most sense at the generic level, where the right people could be
    drawn into the conversation, and the issues could be solved for
    everyone.

    At a high level what this patchset does a couple of simple things:

    - Add a user namespace owner (s_user_ns) to struct super_block.

    - Teach the vfs to handle filesystem uids and gids not mapping into
    to kuids and kgids and being reported as INVALID_UID and
    INVALID_GID in vfs data structures.

    By assigning a user namespace owner filesystems that are mounted with
    only user namespace privilege can be detected. This allows security
    modules and the like to know which mounts may not be trusted. This
    also allows the set of uids and gids that are communicated to the
    filesystem to be capped at the set of kuids and kgids that are in the
    owning user namespace of the filesystem.

    One of the crazier corner casees this handles is the case of inodes
    whose i_uid or i_gid are not mapped into the vfs. Most of the code
    simply doesn't care but it is easy to confuse the inode writeback path
    so no operation that could cause an inode write-back is permitted for
    such inodes (aka only reads are allowed).

    This set of changes starts out by cleaning up the code paths involved
    in user namespace permirted mounts. Then when things are clean enough
    adds code that cleanly sets s_user_ns. Then additional restrictions
    are added that are possible now that the filesystem superblock
    contains owner information.

    These changes should not affect anyone in practice, but there are some
    parts of these restrictions that are changes in behavior.

    - Andy's restriction on suid executables that does not honor the
    suid bit when the path is from another mount namespace (think
    /proc/[pid]/fd/) or when the filesystem was mounted by a less
    privileged user.

    - The replacement of the user namespace implicit setting of MNT_NODEV
    with implicitly setting SB_I_NODEV on the filesystem superblock
    instead.

    Using SB_I_NODEV is a stronger form that happens to make this state
    user invisible. The user visibility can be managed but it caused
    problems when it was introduced from applications reasonably
    expecting mount flags to be what they were set to.

    There is a little bit of work remaining before it is safe to support
    mounting filesystems with backing store in user namespaces, beyond
    what is in this set of changes.

    - Verifying the mounter has permission to read/write the block device
    during mount.

    - Teaching the integrity modules IMA and EVM to handle filesystems
    mounted with only user namespace root and to reduce trust in their
    security xattrs accordingly.

    - Capturing the mounters credentials and using that for permission
    checks in d_automount and the like. (Given that overlayfs already
    does this, and we need the work in d_automount it make sense to
    generalize this case).

    Furthermore there are a few changes that are on the wishlist:

    - Get all filesystems supporting posix acls using the generic posix
    acls so that posix_acl_fix_xattr_from_user and
    posix_acl_fix_xattr_to_user may be removed. [Maintainability]

    - Reducing the permission checks in places such as remount to allow
    the superblock owner to perform them.

    - Allowing the superblock owner to chown files with unmapped uids and
    gids to something that is mapped so the files may be treated
    normally.

    I am not considering even obvious relaxations of permission checks
    until it is clear there are no more corner cases that need to be
    locked down and handled generically.

    Many thanks to Seth Forshee who kept this code alive, and putting up
    with me rewriting substantial portions of what he did to handle more
    corner cases, and for his diligent testing and reviewing of my
    changes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
    fs: Call d_automount with the filesystems creds
    fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
    evm: Translate user/group ids relative to s_user_ns when computing HMAC
    dquot: For now explicitly don't support filesystems outside of init_user_ns
    quota: Handle quota data stored in s_user_ns in quota_setxquota
    quota: Ensure qids map to the filesystem
    vfs: Don't create inodes with a uid or gid unknown to the vfs
    vfs: Don't modify inodes with a uid or gid unknown to the vfs
    cred: Reject inodes with invalid ids in set_create_file_as()
    fs: Check for invalid i_uid in may_follow_link()
    vfs: Verify acls are valid within superblock's s_user_ns.
    userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
    fs: Refuse uid/gid changes which don't map into s_user_ns
    selinux: Add support for unprivileged mounts from user namespaces
    Smack: Handle labels consistently in untrusted mounts
    Smack: Add support for unprivileged mounts from user namespaces
    fs: Treat foreign mounts as nosuid
    fs: Limit file caps to the user namespace of the super block
    userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
    userns: Remove implicit MNT_NODEV fragility.
    ...

    Linus Torvalds
     

24 Jun, 2016

1 commit

  • Today what is normally called data (the mount options) is not passed
    to fill_super through mount_ns.

    Pass the mount options and the namespace separately to mount_ns so
    that filesystems such as proc that have mount options, can use
    mount_ns.

    Pass the user namespace to mount_ns so that the standard permission
    check that verifies the mounter has permissions over the namespace can
    be performed in mount_ns instead of in each filesystems .mount method.
    Thus removing the duplication between mqueuefs and proc in terms of
    permission checks. The extra permission check does not currently
    affect the rpc_pipefs filesystem and the nfsd filesystem as those
    filesystems do not currently allow unprivileged mounts. Without
    unpvileged mounts it is guaranteed that the caller has already passed
    capable(CAP_SYS_ADMIN) which guarantees extra permission check will
    pass.

    Update rpc_pipefs and the nfsd filesystem to ensure that the network
    namespace reference is always taken in fill_super and always put in kill_sb
    so that the logic is simpler and so that errors originating inside of
    fill_super do not cause a network namespace leak.

    Acked-by: Seth Forshee
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

30 May, 2016

1 commit


22 Apr, 2015

1 commit

  • nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...)
    in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id.
    The following was observed on a MIPS 32-core processor:
    kernel: Call Trace:
    kernel: [] rpc_pipefs_event+0x7c/0x158 [nfsd]
    kernel: [] notifier_call_chain+0x70/0xb8
    kernel: [] __blocking_notifier_call_chain+0x4c/0x70
    kernel: [] rpc_fill_super+0xf8/0x1a0
    kernel: [] mount_ns+0xb4/0xf0
    kernel: [] mount_fs+0x50/0x1f8
    kernel: [] vfs_kern_mount+0x58/0xf0
    kernel: [] do_mount+0x27c/0xa28
    kernel: [] SyS_mount+0x98/0xe8
    kernel: [] handle_sys64+0x44/0x68
    kernel:
    kernel:
    Code: 0040f809 00000000 2e020001 3c12c00d
    3c02801a de100000 6442eb98 0040f809
    kernel: ---[ end trace 7471374335809536 ]---

    Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before
    registering rpc_pipefs_event(...) with the notifier chain.

    Signed-off-by: Giuseppe Cantavenera
    Signed-off-by: Lorenzo Restelli
    Reviewed-by: Kinlong Mee
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Giuseppe Cantavenera
     

03 Feb, 2015

1 commit

  • Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
    LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
    outstanding layouts and devices.

    Layout management is very straight forward, with a nfs4_layout_stateid
    structure that extends nfs4_stid to manage layout stateids as the
    top-level structure. It is linked into the nfs4_file and nfs4_client
    structures like the other stateids, and contains a linked list of
    layouts that hang of the stateid. The actual layout operations are
    implemented in layout drivers that are not part of this commit, but
    will be added later.

    The worst part of this commit is the management of the pNFS device IDs,
    which suffers from a specification that is not sanely implementable due
    to the fact that the device-IDs are global and not bound to an export,
    and have a small enough size so that we can't store the fsid portion of
    a file handle, and must never be reused. As we still do need perform all
    export authentication and validation checks on a device ID passed to
    GETDEVICEINFO we are caught between a rock and a hard place. To work
    around this issue we add a new hash that maps from a 64-bit integer to a
    fsid so that we can look up the export to authenticate against it,
    a 32-bit integer as a generation that we can bump when changing the device,
    and a currently unused 32-bit integer that could be used in the future
    to handle more than a single device per export. Entries in this hash
    table are never deleted as we can't reuse the ids anyway, and would have
    a severe lifetime problem anyway as Linux export structures are temporary
    structures that can go away under load.

    Parts of the XDR data, structures and marshaling/unmarshaling code, as
    well as many concepts are derived from the old pNFS server implementation
    from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
    Mike Sager, Ricardo Labiaga and many others.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

17 Dec, 2014

1 commit

  • Pull nfsd updates from Bruce Fields:
    "A comparatively quieter cycle for nfsd this time, but still with two
    larger changes:

    - RPC server scalability improvements from Jeff Layton (using RCU
    instead of a spinlock to find idle threads).

    - server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna
    Schumaker, enabling fallocate on new clients"

    * 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits)
    nfsd4: fix xdr4 count of server in fs_location4
    nfsd4: fix xdr4 inclusion of escaped char
    sunrpc/cache: convert to use string_escape_str()
    sunrpc: only call test_bit once in svc_xprt_received
    fs: nfsd: Fix signedness bug in compare_blob
    sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt
    sunrpc: convert to lockless lookup of queued server threads
    sunrpc: fix potential races in pool_stats collection
    sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it
    sunrpc: require svc_create callers to pass in meaningful shutdown routine
    sunrpc: have svc_wake_up only deal with pool 0
    sunrpc: convert sp_task_pending flag to use atomic bitops
    sunrpc: move rq_cachetype field to better optimize space
    sunrpc: move rq_splice_ok flag into rq_flags
    sunrpc: move rq_dropme flag into rq_flags
    sunrpc: move rq_usedeferral flag to rq_flags
    sunrpc: move rq_local field to rq_flags
    sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it
    nfsd: minor off by one checks in __write_versions()
    sunrpc: release svc_pool_map reference when serv allocation fails
    ...

    Linus Torvalds
     

02 Dec, 2014

1 commit

  • My static checker complains that if "len == remaining" then it means we
    have truncated the last character off the version string.

    The intent of the code is that we print as many versions as we can
    without truncating a version. Then we put a newline at the end. If the
    newline can't fit we return -EINVAL.

    Signed-off-by: Dan Carpenter
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Dan Carpenter
     

20 Nov, 2014

1 commit


18 Sep, 2014

1 commit

  • Allow a privileged userland process to end the v4 grace period early.
    Writing "Y", "y", or "1" to the file will cause the v4 grace period to
    be lifted. The basic idea with this will be to allow the userland
    client tracking program to lift the grace period once it knows that no
    more clients will be reclaiming state.

    Signed-off-by: Jeff Layton

    Jeff Layton
     

09 Jul, 2014

1 commit

  • Currently, the maximum number of connections that nfsd will allow
    is based on the number of threads spawned. While this is fine for a
    default, there really isn't a clear relationship between the two.

    The number of threads corresponds to the number of concurrent requests
    that we want to allow the server to process at any given time. The
    connection limit corresponds to the maximum number of clients that we
    want to allow the server to handle. These are two entirely different
    quantities.

    Break the dependency on increasing threads in order to allow for more
    connections, by adding a new per-net parameter that can be set to a
    non-zero value. The default is still to base it on the number of threads,
    so there should be no behavior change for anyone who doesn't use it.

    Cc: Trond Myklebust
    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

23 Jun, 2014

1 commit


09 May, 2014

1 commit


01 Apr, 2014

1 commit

  • There could be a case, when NFSd file system is mounted in network, different
    to socket's one, like below:

    "ip netns exec" creates new network and mount namespace, which duplicates NFSd
    mount point, created in init_net context. And thus NFS server stop in nested
    network context leads to RPCBIND client destruction in init_net.
    Then, on NFSd start in nested network context, rpc.nfsd process creates socket
    in nested net and passes it into "write_ports", which leads to RPCBIND sockets
    creation in init_net context because of the same reason (NFSd monut point was
    created in init_net context). An attempt to register passed socket in nested
    net leads to panic, because no RPCBIND client present in nexted network
    namespace.

    This patch add check that passed socket's net matches NFSd superblock's one.
    And returns -EINVAL error to user psace otherwise.

    v2: Put socket on exit.

    Reported-by: Weng Meiling
    Signed-off-by: Stanislav Kinsbursky
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     

04 May, 2013

1 commit

  • Pull nfsd changes from J Bruce Fields:
    "Highlights include:

    - Some more DRC cleanup and performance work from Jeff Layton

    - A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the
    server using credentials from Active Directory often fail due to
    limitations of the svcgssd upcall interface. This replacement
    lifts those limitations. The existing upcall is still supported
    for backwards compatibility.

    - More NFSv4.1 support: at this point, if a user with a current
    client who upgrades from 4.0 to 4.1 should see no regressions. In
    theory we do everything a 4.1 server is required to do. Patches
    for a couple minor exceptions are ready for 3.11, and with those
    and some more testing I'd like to turn 4.1 on by default in 3.11."

    Fix up semantic conflict as per Stephen Rothwell and linux-next:

    Commit 030d794bf498 ("SUNRPC: Use gssproxy upcall for server RPCGSS
    authentication") adds two new users of "PDE(inode)->data", but we're
    supposed to use "PDE_DATA(inode)" instead since commit d9dda78bad87
    ("procfs: new helper - PDE_DATA(inode)").

    The old PDE() macro is no longer available since commit c30480b92cf4
    ("proc: Make the PROC_I() and PDE() macros internal to procfs")

    * 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits)
    NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly
    NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()
    nfsd: make symbol nfsd_reply_cache_shrinker static
    svcauth_gss: fix error return code in rsc_parse()
    nfsd4: don't remap EISDIR errors in rename
    svcrpc: fix gss-proxy to respect user namespaces
    SUNRPC: gssp_procedures[] can be static
    SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
    nfsd4: better error return to indicate SSV non-support
    nfsd: fix EXDEV checking in rename
    SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
    SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
    SUNRPC: conditionally return endtime from import_sec_context
    SUNRPC: allow disabling idle timeout
    SUNRPC: attempt AF_LOCAL connect on setup
    nfsd: Decode and send 64bit time values
    nfsd4: put_client_renew_locked can be static
    nfsd4: remove unused macro
    nfsd4: remove some useless code
    nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
    ...

    Linus Torvalds
     

10 Apr, 2013

1 commit


04 Apr, 2013

1 commit


03 Apr, 2013

1 commit


04 Mar, 2013

1 commit

  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives. Allowing simple, safe,
    well understood work-arounds to known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as it's module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases. The most significant being autofs that lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regards to the users permissions. In general all a filesystem
    module does once loaded is call register_filesystem() and go to sleep.
    Which means there is not much attack surface exposed by loading a
    filesytem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

01 Mar, 2013

1 commit

  • Pull nfsd changes from J Bruce Fields:
    "Miscellaneous bugfixes, plus:

    - An overhaul of the DRC cache by Jeff Layton. The main effect is
    just to make it larger. This decreases the chances of intermittent
    errors especially in the UDP case. But we'll need to watch for any
    reports of performance regressions.

    - Containerized nfsd: with some limitations, we now support
    per-container nfs-service, thanks to extensive work from Stanislav
    Kinsbursky over the last year."

    Some notes about conflicts, since there were *two* non-data semantic
    conflicts here:

    - idr_remove_all() had been added by a memory leak fix, but has since
    become deprecated since idr_destroy() does it for us now.

    - xs_local_connect() had been added by this branch to make AF_LOCAL
    connections be synchronous, but in the meantime Trond had changed the
    calling convention in order to avoid a RCU dereference.

    There were a couple of more obvious actual source-level conflicts due to
    the hlist traversal changes and one just due to code changes next to
    each other, but those were trivial.

    * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
    SUNRPC: make AF_LOCAL connect synchronous
    nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
    svcrpc: fix rpc server shutdown races
    svcrpc: make svc_age_temp_xprts enqueue under sv_lock
    lockd: nlmclnt_reclaim(): avoid stack overflow
    nfsd: enable NFSv4 state in containers
    nfsd: disable usermode helper client tracker in container
    nfsd: use proper net while reading "exports" file
    nfsd: containerize NFSd filesystem
    nfsd: fix comments on nfsd_cache_lookup
    SUNRPC: move cache_detail->cache_request callback call to cache_read()
    SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
    SUNRPC: rework cache upcall logic
    SUNRPC: introduce cache_detail->cache_request callback
    NFS: simplify and clean cache library
    NFS: use SUNRPC cache creation and destruction helper for DNS cache
    nfsd4: free_stid can be static
    nfsd: keep a checksum of the first 256 bytes of request
    sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
    sunrpc: fix comment in struct xdr_buf definition
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


16 Feb, 2013

2 commits

  • Functuon "exports_open" is used for both "/proc/fs/nfs/exports" and
    "/proc/fs/nfsd/exports" files.
    Now NFSd filesystem is containerised, so proper net can be taken from
    superblock for "/proc/fs/nfsd/exports" reader.
    But for "/proc/fs/nfsd/exports" only current->nsproxy->net_ns can be used.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     
  • This patch makes NFSD file system superblock to be created per net.
    This makes possible to get proper network namespace from superblock instead of
    using hard-coded "init_net".

    Note: NFSd fs super-block holds network namespace. This garantees, that
    network namespace won't disappear from underneath of it.
    This, obviously, means, that in case of kill of a container's "init" (which is not a mount
    namespace, but network namespace creator) netowrk namespace won't be
    destroyed.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     

05 Feb, 2013

1 commit


24 Jan, 2013

1 commit


11 Dec, 2012

5 commits


29 Nov, 2012

1 commit

  • There were only a small number of functions in this file and since they
    all affect stored state I think it makes sense to put them in state.h
    instead. I also dropped most static inline declarations since there are
    no callers when fault injection is not enabled.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: J. Bruce Fields

    Bryan Schumaker
     

28 Nov, 2012

3 commits


10 Sep, 2012

1 commit

  • You can use nfsd/portlist to give nfsd additional sockets to listen on.
    In theory you can also remove listening sockets this way. But nobody's
    ever done that as far as I can tell.

    Also this was partially broken in 2.6.25, by
    a217813f9067b785241cb7f31956e51d2071703a "knfsd: Support adding
    transports by writing portlist file".

    (Note that we decide whether to take the "delfd" case by checking for a
    digit--but what's actually expected in that case is something made by
    svc_one_sock_name(), which won't begin with a digit.)

    So, let's just rip out this stuff.

    Acked-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

22 Aug, 2012

2 commits