Eric Lee / smarc-fsl-linux-kernel

17 Dec, 2016

1 commit

759b2656b Merge tag 'nfsd-4.10' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"The one new feature is support for a new NFSv4.2 mode_umask attribute
that makes ACL inheritance a little more useful in environments that
default to restrictive umasks. Requires client-side support, also on
its way for 4.10.

Other than that, miscellaneous smaller fixes and cleanup, especially
to the server rdma code"

[ The client side of the umask attribute was merged yesterday ]

* tag 'nfsd-4.10' of git://linux-nfs.org/~bfields/linux:
nfsd: add support for the umask attribute
sunrpc: use DEFINE_SPINLOCK()
svcrdma: Further clean-up of svc_rdma_get_inv_rkey()
svcrdma: Break up dprintk format in svc_rdma_accept()
svcrdma: Remove unused variable in rdma_copy_tail()
svcrdma: Remove unused variables in xprt_rdma_bc_allocate()
svcrdma: Remove svc_rdma_op_ctxt::wc_status
svcrdma: Remove DMA map accounting
svcrdma: Remove BH-disabled spin locking in svc_rdma_send()
svcrdma: Renovate sendto chunk list parsing
svcauth_gss: Close connection when dropping an incoming message
svcrdma: Clear xpt_bc_xps in xprt_setup_rdma_bc() error exit arm
nfsd: constify reply_cache_stats_operations structure
nfsd: update workqueue creation
sunrpc: GFP_KERNEL should be GFP_NOFS in crypto code
nfsd: catch errors in decode_fattr earlier
nfsd: clean up supported attribute handling
nfsd: fix error handling for clients that fail to return the layout
nfsd: more robust allocation failure handling in nfsd_reply_cache_init

Linus Torvalds
2016-12-17 02:48:28 +0800

18 Nov, 2016

1 commit

c7d03a00b netns: make struct pernet_operations::id unsigned int ... Browse Code »

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

void f(long *p, int i)
{
g(p[i]);
}

roughly translates to

movsx rsi, esi
mov rdi, [rsi+...]
call g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new delta
nfsd4_lock 3886 3959 +73
tipc_link_build_proto_msg 1096 1140 +44
mac80211_hwsim_new_radio 2776 2808 +32
tipc_mon_rcv 1032 1058 +26
svcauth_gss_legacy_init 1413 1429 +16
tipc_bcbase_select_primary 379 392 +13
nfsd4_exchange_id 1247 1260 +13
nfsd4_setclientid_confirm 782 793 +11
...
put_client_renew_locked 494 480 -14
ip_set_sockfn_get 730 716 -14
geneve_sock_add 829 813 -16
nfsd4_sequence_done 721 703 -18
nlmclnt_lookup_host 708 686 -22
nfsd4_lockt 1085 1063 -22
nfs_get_client 1077 1050 -27
tcf_bpf_init 1106 1076 -30
nfsd4_encode_fattr 5997 5930 -67
Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Alexey Dobriyan
2016-11-18 23:59:15 +0800

15 Nov, 2016

1 commit

7ba630f54 nfsd: constify reply_cache_stats_operations structure ... Browse Code »

reply_cache_stats_operations, of type struct file_operations, is never
modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Julia Lawall
2016-11-15 04:24:19 +0800

27 Sep, 2016

1 commit

ebd7c72c6 nfsd: randomize SETCLIENTID reply to help distinguish servers ... Browse Code »

NFSv4.1 has built-in trunking support that allows a client to determine
whether two connections to two different IP addresses are actually to
the same server. NFSv4.0 does not, but RFC 7931 attempts to provide
clients a means to do this, basically by performing a SETCLIENTID to one
address and confirming it with a SETCLIENTID_CONFIRM to the other.

Linux clients since 05f4c350ee02 "NFS: Discover NFSv4 server trunking
when mounting" implement a variation on this suggestion. It is possible
that other clients do too.

This depends on the clientid and verifier not being accepted by an
unrelated server. Since both are 64-bit values, that would be very
unlikely if they were random numbers. But they aren't:

knfsd generates the 64-bit clientid by concatenating the 32-bit boot
time (in seconds) and a counter. This makes collisions between
clientids generated by the same server extremely unlikely. But
collisions are very likely between clientids generated by servers that
boot at the same time, and it's quite common for multiple servers to
boot at the same time. The verifier is a concatenation of the
SETCLIENTID time (in seconds) and a counter, so again collisions between
different servers are likely if multiple SETCLIENTIDs are done at the
same time, which is a common case.

Therefore recent NFSv4.0 clients may decide two different servers are
really the same, and mount a filesystem from the wrong server.

Fortunately the Linux client, since 55b9df93ddd6 "nfsv4/v4.1: Verify the
client owner id during trunking detection", only does this when given
the non-default "migration" mount option.

The fault is really with RFC 7931, and needs a client fix, but in the
meantime we can mitigate the chance of these collisions by randomizing
the starting value of the counters used to generate clientids and
verifiers.

Reported-by: Frank Sorenson
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2016-09-27 03:20:38 +0800

30 Jul, 2016

1 commit

a867d7349 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull userns vfs updates from Eric Biederman:
"This tree contains some very long awaited work on generalizing the
user namespace support for mounting filesystems to include filesystems
with a backing store. The real world target is fuse but the goal is
to update the vfs to allow any filesystem to be supported. This
patchset is based on a lot of code review and testing to approach that
goal.

While looking at what is needed to support the fuse filesystem it
became clear that there were things like xattrs for security modules
that needed special treatment. That the resolution of those concerns
would not be fuse specific. That sorting out these general issues
made most sense at the generic level, where the right people could be
drawn into the conversation, and the issues could be solved for
everyone.

At a high level what this patchset does a couple of simple things:

- Add a user namespace owner (s_user_ns) to struct super_block.

- Teach the vfs to handle filesystem uids and gids not mapping into
to kuids and kgids and being reported as INVALID_UID and
INVALID_GID in vfs data structures.

By assigning a user namespace owner filesystems that are mounted with
only user namespace privilege can be detected. This allows security
modules and the like to know which mounts may not be trusted. This
also allows the set of uids and gids that are communicated to the
filesystem to be capped at the set of kuids and kgids that are in the
owning user namespace of the filesystem.

One of the crazier corner casees this handles is the case of inodes
whose i_uid or i_gid are not mapped into the vfs. Most of the code
simply doesn't care but it is easy to confuse the inode writeback path
so no operation that could cause an inode write-back is permitted for
such inodes (aka only reads are allowed).

This set of changes starts out by cleaning up the code paths involved
in user namespace permirted mounts. Then when things are clean enough
adds code that cleanly sets s_user_ns. Then additional restrictions
are added that are possible now that the filesystem superblock
contains owner information.

These changes should not affect anyone in practice, but there are some
parts of these restrictions that are changes in behavior.

- Andy's restriction on suid executables that does not honor the
suid bit when the path is from another mount namespace (think
/proc/[pid]/fd/) or when the filesystem was mounted by a less
privileged user.

- The replacement of the user namespace implicit setting of MNT_NODEV
with implicitly setting SB_I_NODEV on the filesystem superblock
instead.

Using SB_I_NODEV is a stronger form that happens to make this state
user invisible. The user visibility can be managed but it caused
problems when it was introduced from applications reasonably
expecting mount flags to be what they were set to.

There is a little bit of work remaining before it is safe to support
mounting filesystems with backing store in user namespaces, beyond
what is in this set of changes.

- Verifying the mounter has permission to read/write the block device
during mount.

- Teaching the integrity modules IMA and EVM to handle filesystems
mounted with only user namespace root and to reduce trust in their
security xattrs accordingly.

- Capturing the mounters credentials and using that for permission
checks in d_automount and the like. (Given that overlayfs already
does this, and we need the work in d_automount it make sense to
generalize this case).

Furthermore there are a few changes that are on the wishlist:

- Get all filesystems supporting posix acls using the generic posix
acls so that posix_acl_fix_xattr_from_user and
posix_acl_fix_xattr_to_user may be removed. [Maintainability]

- Reducing the permission checks in places such as remount to allow
the superblock owner to perform them.

- Allowing the superblock owner to chown files with unmapped uids and
gids to something that is mapped so the files may be treated
normally.

I am not considering even obvious relaxations of permission checks
until it is clear there are no more corner cases that need to be
locked down and handled generically.

Many thanks to Seth Forshee who kept this code alive, and putting up
with me rewriting substantial portions of what he did to handle more
corner cases, and for his diligent testing and reviewing of my
changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
fs: Call d_automount with the filesystems creds
fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
evm: Translate user/group ids relative to s_user_ns when computing HMAC
dquot: For now explicitly don't support filesystems outside of init_user_ns
quota: Handle quota data stored in s_user_ns in quota_setxquota
quota: Ensure qids map to the filesystem
vfs: Don't create inodes with a uid or gid unknown to the vfs
vfs: Don't modify inodes with a uid or gid unknown to the vfs
cred: Reject inodes with invalid ids in set_create_file_as()
fs: Check for invalid i_uid in may_follow_link()
vfs: Verify acls are valid within superblock's s_user_ns.
userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
fs: Refuse uid/gid changes which don't map into s_user_ns
selinux: Add support for unprivileged mounts from user namespaces
Smack: Handle labels consistently in untrusted mounts
Smack: Add support for unprivileged mounts from user namespaces
fs: Treat foreign mounts as nosuid
fs: Limit file caps to the user namespace of the super block
userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
userns: Remove implicit MNT_NODEV fragility.
...

Linus Torvalds
2016-07-30 06:54:19 +0800

24 Jun, 2016

1 commit

d91ee87d8 vfs: Pass data, ns, and ns->userns to mount_ns ... Browse Code »

Today what is normally called data (the mount options) is not passed
to fill_super through mount_ns.

Pass the mount options and the namespace separately to mount_ns so
that filesystems such as proc that have mount options, can use
mount_ns.

Pass the user namespace to mount_ns so that the standard permission
check that verifies the mounter has permissions over the namespace can
be performed in mount_ns instead of in each filesystems .mount method.
Thus removing the duplication between mqueuefs and proc in terms of
permission checks. The extra permission check does not currently
affect the rpc_pipefs filesystem and the nfsd filesystem as those
filesystems do not currently allow unprivileged mounts. Without
unpvileged mounts it is guaranteed that the caller has already passed
capable(CAP_SYS_ADMIN) which guarantees extra permission check will
pass.

Update rpc_pipefs and the nfsd filesystem to ensure that the network
namespace reference is always taken in fill_super and always put in kill_sb
so that the logic is simpler and so that errors originating inside of
fill_super do not cause a network namespace leak.

Acked-by: Seth Forshee
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2016-06-24 04:41:53 +0800

30 May, 2016

1 commit

84c60b138 drop redundant ->owner initializations ... Browse Code »

it's not needed for file_operations of inodes located on fs defined
in the hosting module and for file_operations that go into procfs.

Signed-off-by: Al Viro

Al Viro
2016-05-30 07:08:00 +0800

22 Apr, 2015

1 commit

bb7ffbf29 nfsd: fix nsfd startup race triggering BUG_ON ... Browse Code »

nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...)
in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id.
The following was observed on a MIPS 32-core processor:
kernel: Call Trace:
kernel: [] rpc_pipefs_event+0x7c/0x158 [nfsd]
kernel: [] notifier_call_chain+0x70/0xb8
kernel: [] __blocking_notifier_call_chain+0x4c/0x70
kernel: [] rpc_fill_super+0xf8/0x1a0
kernel: [] mount_ns+0xb4/0xf0
kernel: [] mount_fs+0x50/0x1f8
kernel: [] vfs_kern_mount+0x58/0xf0
kernel: [] do_mount+0x27c/0xa28
kernel: [] SyS_mount+0x98/0xe8
kernel: [] handle_sys64+0x44/0x68
kernel:
kernel:
Code: 0040f809 00000000 2e020001 3c12c00d
3c02801a de100000 6442eb98 0040f809
kernel: ---[ end trace 7471374335809536 ]---

Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before
registering rpc_pipefs_event(...) with the notifier chain.

Signed-off-by: Giuseppe Cantavenera
Signed-off-by: Lorenzo Restelli
Reviewed-by: Kinlong Mee
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

Giuseppe Cantavenera
2015-04-22 04:16:03 +0800

03 Feb, 2015

1 commit

9cf514ccf nfsd: implement pNFS operations ... Browse Code »

Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
outstanding layouts and devices.

Layout management is very straight forward, with a nfs4_layout_stateid
structure that extends nfs4_stid to manage layout stateids as the
top-level structure. It is linked into the nfs4_file and nfs4_client
structures like the other stateids, and contains a linked list of
layouts that hang of the stateid. The actual layout operations are
implemented in layout drivers that are not part of this commit, but
will be added later.

The worst part of this commit is the management of the pNFS device IDs,
which suffers from a specification that is not sanely implementable due
to the fact that the device-IDs are global and not bound to an export,
and have a small enough size so that we can't store the fsid portion of
a file handle, and must never be reused. As we still do need perform all
export authentication and validation checks on a device ID passed to
GETDEVICEINFO we are caught between a rock and a hard place. To work
around this issue we add a new hash that maps from a 64-bit integer to a
fsid so that we can look up the export to authenticate against it,
a 32-bit integer as a generation that we can bump when changing the device,
and a currently unused 32-bit integer that could be used in the future
to handle more than a single device per export. Entries in this hash
table are never deleted as we can't reuse the ids anyway, and would have
a severe lifetime problem anyway as Linux export structures are temporary
structures that can go away under load.

Parts of the XDR data, structures and marshaling/unmarshaling code, as
well as many concepts are derived from the old pNFS server implementation
from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
Mike Sager, Ricardo Labiaga and many others.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2015-02-03 01:09:42 +0800

17 Dec, 2014

1 commit

0b233b7c7 Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"A comparatively quieter cycle for nfsd this time, but still with two
larger changes:

- RPC server scalability improvements from Jeff Layton (using RCU
instead of a spinlock to find idle threads).

- server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna
Schumaker, enabling fallocate on new clients"

* 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits)
nfsd4: fix xdr4 count of server in fs_location4
nfsd4: fix xdr4 inclusion of escaped char
sunrpc/cache: convert to use string_escape_str()
sunrpc: only call test_bit once in svc_xprt_received
fs: nfsd: Fix signedness bug in compare_blob
sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt
sunrpc: convert to lockless lookup of queued server threads
sunrpc: fix potential races in pool_stats collection
sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it
sunrpc: require svc_create callers to pass in meaningful shutdown routine
sunrpc: have svc_wake_up only deal with pool 0
sunrpc: convert sp_task_pending flag to use atomic bitops
sunrpc: move rq_cachetype field to better optimize space
sunrpc: move rq_splice_ok flag into rq_flags
sunrpc: move rq_dropme flag into rq_flags
sunrpc: move rq_usedeferral flag to rq_flags
sunrpc: move rq_local field to rq_flags
sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it
nfsd: minor off by one checks in __write_versions()
sunrpc: release svc_pool_map reference when serv allocation fails
...

Linus Torvalds
2014-12-17 07:25:31 +0800

02 Dec, 2014

1 commit

818f2f57f nfsd: minor off by one checks in __write_versions() ... Browse Code »

My static checker complains that if "len == remaining" then it means we
have truncated the last character off the version string.

The intent of the code is that we print as many versions as we can
without truncating a version. Then we put a newline at the end. If the
newline can't fit we return -EINVAL.

Signed-off-by: Dan Carpenter
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Dan Carpenter
2014-12-02 03:45:28 +0800

20 Nov, 2014

1 commit

244c7d444 nfsd/nfsctl.c: new helper ... Browse Code »

... to get from opened file on nfsctl to relevant struct net *

Signed-off-by: Al Viro

Al Viro
2014-11-20 02:01:21 +0800

18 Sep, 2014

1 commit

7f5ef2e90 nfsd: add a v4_end_grace file to /proc/fs/nfsd ... Browse Code »

Allow a privileged userland process to end the v4 grace period early.
Writing "Y", "y", or "1" to the file will cause the v4 grace period to
be lifted. The basic idea with this will be to allow the userland
client tracking program to lift the grace period once it knows that no
more clients will be reclaiming state.

Signed-off-by: Jeff Layton

Jeff Layton
2014-09-18 04:33:14 +0800

09 Jul, 2014

1 commit

5b8db00ba nfsd: add a new /proc/fs/nfsd/max_connections file ... Browse Code »

Currently, the maximum number of connections that nfsd will allow
is based on the number of threads spawned. While this is fine for a
default, there really isn't a clear relationship between the two.

The number of threads corresponds to the number of concurrent requests
that we want to allow the server to process at any given time. The
connection limit corresponds to the maximum number of clients that we
want to allow the server to handle. These are two entirely different
quantities.

Break the dependency on increasing threads in order to allow for more
connections, by adding a new per-net parameter that can be set to a
non-zero value. The default is still to base it on the number of threads,
so there should be no behavior change for anyone who doesn't use it.

Cc: Trond Myklebust
Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2014-07-09 05:14:32 +0800

23 Jun, 2014

1 commit

3c7aa15d2 NFSD: Using min/max/min_t/max_t for calculate ... Browse Code »

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2014-06-23 23:31:36 +0800

09 May, 2014

1 commit

9fa1959e9 NFSD: Get rid of empty function nfs4_state_init ... Browse Code »

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2014-05-09 02:59:52 +0800

01 Apr, 2014

1 commit

306463942 nfsd: check passed socket's net matches NFSd superblock's one ... Browse Code »

There could be a case, when NFSd file system is mounted in network, different
to socket's one, like below:

"ip netns exec" creates new network and mount namespace, which duplicates NFSd
mount point, created in init_net context. And thus NFS server stop in nested
network context leads to RPCBIND client destruction in init_net.
Then, on NFSd start in nested network context, rpc.nfsd process creates socket
in nested net and passes it into "write_ports", which leads to RPCBIND sockets
creation in init_net context because of the same reason (NFSd monut point was
created in init_net context). An attempt to register passed socket in nested
net leads to panic, because no RPCBIND client present in nexted network
namespace.

This patch add check that passed socket's net matches NFSd superblock's one.
And returns -EINVAL error to user psace otherwise.

v2: Put socket on exit.

Reported-by: Weng Meiling
Signed-off-by: Stanislav Kinsbursky
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2014-04-01 04:58:16 +0800

04 May, 2013

1 commit

1db772216 Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd changes from J Bruce Fields:
"Highlights include:

- Some more DRC cleanup and performance work from Jeff Layton

- A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the
server using credentials from Active Directory often fail due to
limitations of the svcgssd upcall interface. This replacement
lifts those limitations. The existing upcall is still supported
for backwards compatibility.

- More NFSv4.1 support: at this point, if a user with a current
client who upgrades from 4.0 to 4.1 should see no regressions. In
theory we do everything a 4.1 server is required to do. Patches
for a couple minor exceptions are ready for 3.11, and with those
and some more testing I'd like to turn 4.1 on by default in 3.11."

Fix up semantic conflict as per Stephen Rothwell and linux-next:

Commit 030d794bf498 ("SUNRPC: Use gssproxy upcall for server RPCGSS
authentication") adds two new users of "PDE(inode)->data", but we're
supposed to use "PDE_DATA(inode)" instead since commit d9dda78bad87
("procfs: new helper - PDE_DATA(inode)").

The old PDE() macro is no longer available since commit c30480b92cf4
("proc: Make the PROC_I() and PDE() macros internal to procfs")

* 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits)
NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly
NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()
nfsd: make symbol nfsd_reply_cache_shrinker static
svcauth_gss: fix error return code in rsc_parse()
nfsd4: don't remap EISDIR errors in rename
svcrpc: fix gss-proxy to respect user namespaces
SUNRPC: gssp_procedures[] can be static
SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
nfsd4: better error return to indicate SSV non-support
nfsd: fix EXDEV checking in rename
SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
SUNRPC: conditionally return endtime from import_sec_context
SUNRPC: allow disabling idle timeout
SUNRPC: attempt AF_LOCAL connect on setup
nfsd: Decode and send 64bit time values
nfsd4: put_client_renew_locked can be static
nfsd4: remove unused macro
nfsd4: remove some useless code
nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
...

Linus Torvalds
2013-05-04 01:59:39 +0800

10 Apr, 2013

1 commit

75ef9de12 constify a bunch of struct file_operations instances ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-04-10 02:16:20 +0800

04 Apr, 2013

1 commit

ff7c4b369 nfsd: remove /proc/fs/nfs when create /proc/fs/nfs/exports error ... Browse Code »

when create /proc/fs/nfs/exports error, we should remove /proc/fs/nfs,
if don't do it, it maybe cause Memory leak.

Signed-off-by: fanchaoting
Reviewed-by: chendt.fnst

Signed-off-by: J. Bruce Fields

fanchaoting
2013-04-04 03:30:07 +0800

03 Apr, 2013

1 commit

a2f999a37 nfsd: add new reply_cache_stats file in nfsdfs ... Browse Code »

For presenting statistics relating to duplicate reply cache.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2013-04-03 23:47:24 +0800

04 Mar, 2013

1 commit

7f78e0351 fs: Limit sys_mount to only request filesystem modules. ... Browse Code »

Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.

A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.

Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems trivially
making things safer with no real cost.

Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives. Allowing simple, safe,
well understood work-arounds to known problematic software.

This also addresses a rare but unfortunate problem where the filesystem
name is not the same as it's module name and module auto-loading
would not work. While writing this patch I saw a handful of such
cases. The most significant being autofs that lives in the module
autofs4.

This is relevant to user namespaces because we can reach the request
module in get_fs_type() without having any special permissions, and
people get uncomfortable when a user specified string (in this case
the filesystem type) goes all of the way to request_module.

After having looked at this issue I don't think there is any
particular reason to perform any filtering or permission checks beyond
making it clear in the module request that we want a filesystem
module. The common pattern in the kernel is to call request_module()
without regards to the users permissions. In general all a filesystem
module does once loaded is call register_filesystem() and go to sleep.
Which means there is not much attack surface exposed by loading a
filesytem module unless the filesystem is mounted. In a user
namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
which most filesystems do not set today.

Acked-by: Serge Hallyn
Acked-by: Kees Cook
Reported-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-03-04 11:36:31 +0800

01 Mar, 2013

1 commit

b6669737d Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd changes from J Bruce Fields:
"Miscellaneous bugfixes, plus:

- An overhaul of the DRC cache by Jeff Layton. The main effect is
just to make it larger. This decreases the chances of intermittent
errors especially in the UDP case. But we'll need to watch for any
reports of performance regressions.

- Containerized nfsd: with some limitations, we now support
per-container nfs-service, thanks to extensive work from Stanislav
Kinsbursky over the last year."

Some notes about conflicts, since there were *two* non-data semantic
conflicts here:

- idr_remove_all() had been added by a memory leak fix, but has since
become deprecated since idr_destroy() does it for us now.

- xs_local_connect() had been added by this branch to make AF_LOCAL
connections be synchronous, but in the meantime Trond had changed the
calling convention in order to avoid a RCU dereference.

There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.

* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
SUNRPC: make AF_LOCAL connect synchronous
nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
svcrpc: fix rpc server shutdown races
svcrpc: make svc_age_temp_xprts enqueue under sv_lock
lockd: nlmclnt_reclaim(): avoid stack overflow
nfsd: enable NFSv4 state in containers
nfsd: disable usermode helper client tracker in container
nfsd: use proper net while reading "exports" file
nfsd: containerize NFSd filesystem
nfsd: fix comments on nfsd_cache_lookup
SUNRPC: move cache_detail->cache_request callback call to cache_read()
SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
SUNRPC: rework cache upcall logic
SUNRPC: introduce cache_detail->cache_request callback
NFS: simplify and clean cache library
NFS: use SUNRPC cache creation and destruction helper for DNS cache
nfsd4: free_stid can be static
nfsd: keep a checksum of the first 256 bytes of request
sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
sunrpc: fix comment in struct xdr_buf definition
...

Linus Torvalds
2013-03-01 10:02:55 +0800

23 Feb, 2013

1 commit

496ad9aa8 new helper: file_inode(file) ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-02-23 12:31:31 +0800

16 Feb, 2013

2 commits

96d851c4d nfsd: use proper net while reading "exports" file ... Browse Code »

Functuon "exports_open" is used for both "/proc/fs/nfs/exports" and
"/proc/fs/nfsd/exports" files.
Now NFSd filesystem is containerised, so proper net can be taken from
superblock for "/proc/fs/nfsd/exports" reader.
But for "/proc/fs/nfsd/exports" only current->nsproxy->net_ns can be used.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2013-02-16 00:21:01 +0800
11f779421 nfsd: containerize NFSd filesystem ... Browse Code »

This patch makes NFSD file system superblock to be created per net.
This makes possible to get proper network namespace from superblock instead of
using hard-coded "init_net".

Note: NFSd fs super-block holds network namespace. This garantees, that
network namespace won't disappear from underneath of it.
This, obviously, means, that in case of kill of a container's "init" (which is not a mount
namespace, but network namespace creator) netowrk namespace won't be
destroyed.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2013-02-16 00:21:00 +0800

05 Feb, 2013

1 commit

5976687a2 sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to addr.h ... Browse Code »

These routines are used by server and client code, so having them in a
separate header would be best.

Signed-off-by: Jeff Layton
Acked-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Jeff Layton
2013-02-05 22:41:14 +0800

24 Jan, 2013

1 commit

ff89be87c nfsd4: require version 4 when enabling or disabling minorversion ... Browse Code »

The current code will allow silly things like:

echo "+2 +3 +4 +7.1">/proc/fs/nfsd/versions

Reported-by: Fan Chaoting
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-01-24 07:25:01 +0800

11 Dec, 2012

5 commits

9dd9845f0 nfsd: make NFSd service structure allocated per net ... Browse Code »

This patch makes main step in NFSd containerisation.

There could be different approaches to how to make NFSd able to handle
incoming RPC request from different network namespaces. The two main
options are:

1) Share NFSd kthreads betwween all network namespaces.
2) Create separated pool of threads for each namespace.

While first approach looks more flexible, second one is simpler and
non-racy. This patch implements the second option.

To make it possible to allocate separate pools of threads, we have to
make it possible to allocate separate NFSd service structures per net.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-12-11 05:25:39 +0800
081603520 nfsd: pass net to __write_ports() and down ... Browse Code »

Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-12-11 05:25:36 +0800
3938a0d5e nfsd: pass net to nfsd_set_nrthreads() ... Browse Code »

Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-12-11 05:25:35 +0800
d41a9417c nfsd: pass net to nfsd_svc() ... Browse Code »

Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-12-11 05:25:34 +0800
6777436b0 nfsd: pass net to nfsd_create_serv() ... Browse Code »

Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-12-11 05:25:34 +0800

29 Nov, 2012

1 commit

f3c7521fe NFSD: Fold fault_inject.h into state.h ... Browse Code »

There were only a small number of functions in this file and since they
all affect stored state I think it makes sense to put them in state.h
instead. I also dropped most static inline declarations since there are
no callers when fault injection is not enabled.

Signed-off-by: Bryan Schumaker
Signed-off-by: J. Bruce Fields

Bryan Schumaker
2012-11-29 02:01:02 +0800

28 Nov, 2012

3 commits

5284b44e4 nfsd: make NFSv4 grace time per net ... Browse Code »

Grace time is a part of NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-28 23:39:47 +0800
3d7337115 nfsd: make NFSv4 lease time per net ... Browse Code »

Lease time is a part of NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-28 23:39:46 +0800
864aee5c6 nfsd: remove redundant declarations ... Browse Code »

This is a cleanup patch. Functions nfsd_pool_stats_open() and
nfsd_pool_stats_release() are declared in fs/nfsd/nfsd.h.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-28 23:13:55 +0800

10 Sep, 2012

1 commit

eccf50c12 nfsd: remove unused listener-removal interfaces ... Browse Code »

You can use nfsd/portlist to give nfsd additional sockets to listen on.
In theory you can also remove listening sockets this way. But nobody's
ever done that as far as I can tell.

Also this was partially broken in 2.6.25, by
a217813f9067b785241cb7f31956e51d2071703a "knfsd: Support adding
transports by writing portlist file".

(Note that we decide whether to take the "delfd" case by checking for a
digit--but what's actually expected in that case is something made by
svc_one_sock_name(), which won't begin with a digit.)

So, let's just rip out this stuff.

Acked-by: NeilBrown
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2012-09-10 22:55:19 +0800

22 Aug, 2012

2 commits

a10fded18 nfsd: allow configuring nfsd to listen on 5-digit ports ... Browse Code »

Note a 16-bit value can require up to 5 digits.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2012-08-22 05:07:50 +0800
38af2cabb nfsd: remove redundant "port" argument ... Browse Code »

"port" in all these functions is always NFS_PORT.

nfsd can already be run on a nonstandard port using the "nfsd/portlist"
interface.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2012-08-22 05:07:49 +0800