04 May, 2012

1 commit


30 Apr, 2012

1 commit


28 Apr, 2012

4 commits


18 Apr, 2012

1 commit

  • PipeFS superblock creation routine relies on the presence of SUNRPC pernet data,
    which is created by the register_pernet_subsys() call in the SUNRPC module init
    function. Registering the PipeFS filesystem before registering the per-net
    subsystem leads to races (a mount of PipeFS can dereference uninitialized data).

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     

30 Mar, 2012

1 commit

  • Pull nfsd changes from Bruce Fields:

    Highlights:
    - Benny Halevy and Tigran Mkrtchyan implemented some more 4.1 features,
    moving us closer to a complete 4.1 implementation.
    - Bernd Schubert fixed a long-standing problem with readdir cookies on
    ext2/3/4.
    - Jeff Layton performed a long-overdue overhaul of the server reboot
    recovery code which will allow us to deprecate the current code (a
    rather unusual user of the vfs), and give us some needed flexibility
    for further improvements.
    - Like the client, we now support numeric uid's and gid's in the
    auth_sys case, allowing easier upgrades from NFSv2/v3 to v4.x.

    Plus miscellaneous bugfixes and cleanup.

    Thanks to everyone!

    There are also some delegation fixes waiting on vfs review that I
    suppose will have to wait for 3.5. With that done I think we'll finally
    turn off the "EXPERIMENTAL" dependency for v4 (though that's mostly
    symbolic as it's been on by default in distro's for a while).

    And the list of 4.1 todo's should be achievable for 3.5 as well:

    http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues

    though we may still want a bit more experience with it before turning it
    on by default.

    * 'for-3.4' of git://linux-nfs.org/~bfields/linux: (55 commits)
    nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabled
    nfsd4: use auth_unix unconditionally on backchannel
    nfsd: fix NULL pointer dereference in cld_pipe_downcall
    nfsd4: memory corruption in numeric_name_to_id()
    sunrpc: skip portmap calls on sessions backchannel
    nfsd4: allow numeric idmapping
    nfsd: don't allow legacy client tracker init for anything but init_net
    nfsd: add notifier to handle mount/unmount of rpc_pipefs sb
    nfsd: add the infrastructure to handle the cld upcall
    nfsd: add a header describing upcall to nfsdcld
    nfsd: add a per-net-namespace struct for nfsd
    sunrpc: create nfsd dir in rpc_pipefs
    nfsd: add nfsd4_client_tracking_ops struct and a way to set it
    nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
    NFSD: Fix nfs4_verifier memory alignment
    NFSD: Fix warnings when NFSD_DEBUG is not defined
    nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)
    nfsd: rename 'int access' to 'int may_flags' in nfsd_open()
    ext4: return 32/64-bit dir name hash according to usage type
    fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash
    ...

    Linus Torvalds
     

29 Mar, 2012

3 commits

  • Pull NFS client bugfixes for Linux 3.4 from Trond Myklebust

    Highlights include:
    - Fix infinite loops in the mount code
    - Fix a userspace buffer overflow in __nfs4_get_acl_uncached
    - Fix a memory leak due to a double reference count in rpcb_getport_async()

    Signed-off-by: Trond Myklebust

    * tag 'nfs-for-3.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFSv4: Minor cleanups for nfs4_handle_exception and nfs4_async_handle_error
    NFSv4.1: Fix layoutcommit error handling
    NFSv4: Fix two infinite loops in the mount code
    SUNRPC: Use the already looked-up xprt in rpcb_getport_async()
    NFS4.1: remove duplicate variable declaration in filelayout_clear_request_commit
    Fix length of buffer copied in __nfs4_get_acl_uncached

    Linus Torvalds
     
  • …m/linux/kernel/git/dhowells/linux-asm_system

    Pull "Disintegrate and delete asm/system.h" from David Howells:
    "Here are a bunch of patches to disintegrate asm/system.h into a set of
    separate bits to relieve the problem of circular inclusion
    dependencies.

    I've built all the working defconfigs from all the arches that I can
    and made sure that they don't break.

    The reason for these patches is that I recently encountered a circular
    dependency problem that came about when I produced some patches to
    optimise get_order() by rewriting it to use ilog2().

    This uses bitops - and on the SH arch asm/bitops.h drags in
    asm-generic/get_order.h by a circuitous route involving asm/system.h.

    The main difficulty seems to be asm/system.h. It holds a number of
    low level bits with no/few dependencies that are commonly used (eg.
    memory barriers) and a number of bits with more dependencies that
    aren't used in many places (eg. switch_to()).

    These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

    Move memory barriers here. This is already done for MIPS and Alpha.

    (2) asm/switch_to.h

    Move switch_to() and related stuff here.

    (3) asm/exec.h

    Move arch_align_stack() here. Other process execution related bits
    could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

    Move xchg() and cmpxchg() here as they're full word atomic ops and
    frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

    Move die() and related bits.

    (6) asm/auxvec.h

    Move AT_VECTOR_SIZE_ARCH here.

    Other arch headers are created as needed on a per-arch basis."

    Fixed up some conflicts from other header file cleanups and moving code
    around that has happened in the meantime, so David's testing is somewhat
    weakened by that. We'll find out anything that got broken and fix it.

    * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
    Delete all instances of asm/system.h
    Remove all #inclusions of asm/system.h
    Add #includes needed to permit the removal of asm/system.h
    Move all declarations of free_initmem() to linux/mm.h
    Disintegrate asm/system.h for OpenRISC
    Split arch_align_stack() out from asm-generic/system.h
    Split the switch_to() wrapper out of asm-generic/system.h
    Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
    Create asm-generic/barrier.h
    Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
    Disintegrate asm/system.h for Xtensa
    Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
    Disintegrate asm/system.h for Tile
    Disintegrate asm/system.h for Sparc
    Disintegrate asm/system.h for SH
    Disintegrate asm/system.h for Score
    Disintegrate asm/system.h for S390
    Disintegrate asm/system.h for PowerPC
    Disintegrate asm/system.h for PA-RISC
    Disintegrate asm/system.h for MN10300
    ...

    Linus Torvalds
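
The pull message above mentions rewriting get_order() in terms of ilog2(). As a rough illustration of that idea, here is a minimal userspace sketch. It is not the kernel implementation: `toy_ilog2()`, `toy_get_order()` and `TOY_PAGE_SHIFT` are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* floor(log2(n)) for n > 0, computed with a plain shift loop. */
unsigned int toy_ilog2(unsigned long n)
{
    unsigned int log = 0;

    assert(n > 0);
    while (n >>= 1)
        log++;
    return log;
}

/*
 * Smallest order such that (page size << order) >= size.
 * Sizes up to one page need order 0; larger sizes are rounded up to
 * the next power-of-two number of pages via ilog2(size - 1).
 */
unsigned int toy_get_order(unsigned long size)
{
    assert(size > 0);
    if (size <= (1UL << TOY_PAGE_SHIFT))
        return 0;
    return toy_ilog2(size - 1) - TOY_PAGE_SHIFT + 1;
}
```

With the assumed page size, a request of 4097 bytes needs two pages (order 1) and a request of 8193 bytes needs four pages (order 2).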
     
  • Remove all #inclusions of asm/system.h preparatory to splitting and killing
    it. Performed with the following command:

    perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

    Signed-off-by: David Howells

    David Howells
     

28 Mar, 2012

1 commit

  • rpcb_getport_async() was looking up the rpc_xprt (reference++) and then
    later looking it up again (reference++) to pass through the
    rpcbind_args. The xprt would only be dereferenced once, when we were
    done with the rpcbind_args (reference--). This leaves an extra
    reference to the transport that would never go away.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
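
The reference imbalance described in this commit can be modeled in userspace. The sketch below is a toy, not the SUNRPC code: `toy_xprt`, `toy_xprt_get()`, `toy_xprt_put()`, `getport_leaky()` and `getport_fixed()` are hypothetical names chosen only to show two lookups paired with a single release, versus reusing the pointer from the first lookup.

```c
#include <assert.h>

/* Toy stand-in for a reference-counted transport (not struct rpc_xprt). */
struct toy_xprt {
    int refcount;
};

struct toy_xprt *toy_xprt_get(struct toy_xprt *x)
{
    x->refcount++;
    return x;
}

void toy_xprt_put(struct toy_xprt *x)
{
    assert(x->refcount > 0);
    x->refcount--;
}

/* Leaky pattern: two lookups take two references, only one is dropped. */
int getport_leaky(struct toy_xprt *x)
{
    struct toy_xprt *first = toy_xprt_get(x);   /* initial lookup */
    struct toy_xprt *second = toy_xprt_get(x);  /* second lookup for the args */

    (void)first;            /* this reference is never released */
    toy_xprt_put(second);   /* dropped when the args are freed */
    return x->refcount;
}

/* Fixed pattern: pass the already looked-up pointer through instead. */
int getport_fixed(struct toy_xprt *x)
{
    struct toy_xprt *xprt = toy_xprt_get(x);    /* single lookup */

    toy_xprt_put(xprt);     /* dropped when the args are freed */
    return x->refcount;
}
```

Starting from a count of 1, the leaky variant ends at 2 and the fixed variant returns to 1.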
     

26 Mar, 2012

3 commits


23 Mar, 2012

1 commit

  • Pull NFS client updates for Linux 3.4 from Trond Myklebust:
    "New features include:
    - Add NFS client support for containers.

    This should enable most of the necessary functionality, including
    lockd support, and support for rpc.statd, NFSv4 idmapper and
    RPCSEC_GSS upcalls into the correct network namespace from which
    the mount system call was issued.

    - NFSv4 idmapper scalability improvements

    Base the idmapper cache on the keyring interface to allow
    concurrent access to idmapper entries. Start the process of
    migrating users from the single-threaded daemon-based approach to
    the multi-threaded request-key based approach.

    - NFSv4.1 implementation id.

    Allows the NFSv4.1 client and server to mutually identify each
    other for logging and debugging purposes.

    - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
    having to use the more counterintuitive 'vers=4,minorversion=1'.

    - SUNRPC tracepoints.

    Start the process of adding tracepoints in order to improve
    debugging of the RPC layer.

    - pNFS object layout support for autologin.

    Important bugfixes include:

    - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
    fail to wake up all tasks when applied to priority waitqueues.

    - Ensure that we handle read delegations correctly, when we try to
    truncate a file.

    - A number of fixes for NFSv4 state manager loops (mostly to do with
    delegation recovery)."

    * tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
    NFS: fix sb->s_id in nfs debug prints
    xprtrdma: Remove assumption that each segment is ls_state in release_lockowner
    NFS: ncommit count is being double decremented
    SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
    Try using machine credentials for RENEW calls
    NFSv4.1: Fix a few issues in filelayout_commit_pagelist
    NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
    ...

    Linus Torvalds
     

22 Mar, 2012

1 commit

  • Pull vfs pile 1 from Al Viro:
    "This is _not_ all; in particular, Miklos' and Jan's stuff is not there
    yet."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
    ext4: initialization of ext4_li_mtx needs to be done earlier
    debugfs-related mode_t whack-a-mole
    hfsplus: add an ioctl to bless files
    hfsplus: change finder_info to u32
    hfsplus: initialise userflags
    qnx4: new helper - try_extent()
    qnx4: get rid of qnx4_bread/qnx4_getblk
    take removal of PF_FORKNOEXEC to flush_old_exec()
    trim includes in inode.c
    um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
    um: embed ->stub_pages[] into mmu_context
    gadgetfs: list_for_each_safe() misuse
    ocfs2: fix leaks on failure exits in module_init
    ecryptfs: make register_filesystem() the last potential failure exit
    ntfs: forgets to unregister sysctls on register_filesystem() failure
    logfs: missing cleanup on register_filesystem() failure
    jfs: mising cleanup on register_filesystem() failure
    make configfs_pin_fs() return root dentry on success
    configfs: configfs_create_dir() has parent dentry in dentry->d_parent
    configfs: sanitize configfs_create()
    ...

    Linus Torvalds
     

21 Mar, 2012

5 commits


20 Mar, 2012

2 commits


13 Mar, 2012

1 commit


12 Mar, 2012

1 commit

  • net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment
    (different address spaces)
    - svc_partial_recvfrom now takes a struct kvec, so the variable
    save_iovbase needs to be an ordinary (void *)

    Make a bunch of variables in net/sunrpc/xprtsock.c static

    Fix a couple of "warning: symbol 'foo' was not declared. Should it be
    static?" reports.

    Fix a couple of conflicting function declarations.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

03 Mar, 2012

3 commits

  • NFSv4.0 clients must send endpoint information for their callback
    service to NFSv4.0 servers during their first contact with a server.
    Traditionally on Linux, user space provides the callback endpoint IP
    address via the "clientaddr=" mount option.

    During an NFSv4 migration event, it is possible that an FSID may be
    migrated to a destination server that is accessible via a different
    source IP address than the source server was. The client must update
    callback endpoint information on the destination server so that it can
    maintain leases and allow delegation.

    Without a new "clientaddr=" option from user space, however, the
    kernel itself must construct an appropriate IP address for the
    callback update. Provide an API in the RPC client for upper layer
    RPC consumers to acquire a source address for a remote.

    The mechanism used by the mount.nfs command is copied: set up a
    connected UDP socket to the designated remote, then scrape the source
    address off the socket. We are careful to select the correct network
    namespace when setting up the temporary UDP socket.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
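
The technique this commit describes, connecting a UDP socket and reading back the local address, can be tried from userspace. The following is a minimal IPv4-only sketch under stated assumptions: `source_addr_for()` is a hypothetical helper, the port number is arbitrary, and nothing here reflects the in-kernel API or its network namespace handling.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write the local IPv4 source address the host would use to reach
 * `remote` into `out`. Returns 0 on success, -1 on failure.
 * connect() on a UDP socket only selects a route; no packet is sent.
 */
int source_addr_for(const char *remote, char *out, socklen_t outlen)
{
    struct sockaddr_in dst, src;
    socklen_t len = sizeof(src);
    int ret = -1;
    int fd;

    assert(remote != NULL && out != NULL);

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(111);  /* arbitrary choice for this sketch */
    if (inet_pton(AF_INET, remote, &dst.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Connect, then scrape the source address off the socket. */
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0 &&
        getsockname(fd, (struct sockaddr *)&src, &len) == 0 &&
        inet_ntop(AF_INET, &src.sin_addr, out, outlen) != NULL)
        ret = 0;

    close(fd);
    return ret;
}
```

A caller would pass a buffer of at least INET_ADDRSTRLEN bytes; the result depends on the host's routing table at the time of the call.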
     
  • When the cl_xprt field is updated, the cl_server field will also have
    to change. Since the contents of cl_server follow the remote endpoint
    of cl_xprt, just move that field to the rpc_xprt.

    Signed-off-by: Trond Myklebust
    [ cel: simplify check_gss_callback_principal(), whitespace changes ]
    [ cel: forward ported to 3.4 ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • A migration event will replace the rpc_xprt used by an rpc_clnt. To
    ensure this can be done safely, all references to cl_xprt must now use
    a form of rcu_dereference().

    Special care is taken with rpc_peeraddr2str(), which returns a pointer
    to memory whose lifetime is the same as the rpc_xprt.

    Signed-off-by: Trond Myklebust
    [ cel: fix lockdep splats and layering violations ]
    [ cel: forward ported to 3.4 ]
    [ cel: remove rpc_max_reqs(), add rpc_net_ns() ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

28 Feb, 2012

3 commits

  • Currently, the wait queue used for polling of RPC pipe changes from user-space
    is a part of the RPC pipe. But the pipe data itself can be released on NFS umount
    prior to the dentry-inode pair connected to it (in case this pair is held open by
    some process).
    This is not a problem for almost all pipe users, because all PipeFS file
    operations check the pipe reference prior to using it.
    Except eventfd. It registers itself with the "poll" file operation and thus
    has a reference to the pipe wait queue. This leads to oopses on destroying eventfd
    after NFS umount (as rpc.idmapd does), since no pipe data is left by that
    point.
    The solution is to move the wait queue from pipe data to internal RPC inode data.
    This looks more logical, because this wait queue is used only by user-space
    processes, which already hold an inode reference.

    Note: upcalls have to get pipe->dentry prior to dereferencing the wait queue to
    make sure that the mount point won't disappear from underneath us.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     
  • There are 2 tightly bound objects: pipe data (created for kernel needs, has a
    reference to the dentry, which depends on PipeFS mount/umount) and the PipeFS
    dentry/inode pair (created on mount for user-space needs). Each of them
    may or may not hold a valid reference to the other, independently.
    This means that we have to make sure that the pipe->dentry reference is valid on
    upcalls, and the dentry->pipe reference is valid on downcalls. The latter check is
    absent - my fault.
    IOW, a PipeFS dentry can be opened by some process (rpc.idmapd for example), but
    its pipe data can belong to an NFS mount which was already unmounted, and thus
    the pipe data was destroyed.
    To fix this, the pipe reference has to be set to NULL on rpc_unlink() and checked
    in PipeFS file operations instead of the pipe->dentry check.

    Note: the PipeFS "poll" file operation will be updated in the next patch, because
    its logic is more complicated.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     
  • v3:
    1) Lookup for client is performed from the beginning of the list on each PipeFS
    event handling operation.

    Lockdep is sad otherwise, because inode mutex is taken on PipeFS dentry
    creation, which can be called on mount notification, where this per-net client
    lock is taken on clients list walk.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     

18 Feb, 2012

2 commits


17 Feb, 2012

2 commits

  • With static RPC slots, the xprt backlog queue stats were useful in showing
    when the transport (TCP) was starved by lack of RPC slots. The new dynamic
    RPC slot code, commit d9ba131d8f58c0d2ff5029e7002ab43f913b36f9, always
    provides an RPC slot and so only uses the xprt backlog queue when the
    tcp_max_slot_table_entries value has been hit or when an allocation error
    occurs. All requests are now placed on the xprt sending or pending queues, which
    need to be monitored for debugging.

    The max_slot stat shows the maximum number of dynamic RPC slots reached which is
    useful when debugging performance issues.

    Add the new fields at the end of the mountstats xprt stanza so that mountstats
    outputs the previous correct values and ignores the new fields. Bump
    NFS_IOSTATS_VERS.

    Signed-off-by: Andy Adamson
    Signed-off-by: Trond Myklebust

    Andy Adamson
     
  • Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     

15 Feb, 2012

4 commits

  • The tracepoint code relies on the queue->name being defined in order to
    be able to display the name of the waitqueue on which an RPC task is
    sleeping.

    Reported-by: Randy Dunlap
    Reported-by: Steven Rostedt
    Signed-off-by: Trond Myklebust
    Acked-by: Steven Rostedt
    Acked-by: Randy Dunlap

    Trond Myklebust
     
  • This patch introduces per-net Lockd initialization and destruction routines.
    The logic is the same as in the global Lockd up and down routines. This is
    probably not the best solution, but at least it looks clear.
    The per-net "up" routine is called only if lockd is already running. If
    per-net resources are not allocated yet, the service is registered with the
    local portmapper and lockd sockets are created.
    The per-net "down" routine is called on every lockd_down() call if the global
    users counter is not zero.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     
  • This function is enough for releasing resources allocated for a network
    namespace context when a service is shared between namespaces.
    IOW, each service "user" (LockD, NFSd, etc.) that wants to share a service
    between network namespaces has to release related resources with the function
    introduced in this patch instead of performing a service shutdown (provided, of
    course, that the service is already shared at the moment of release).

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
     
  • v2: Added comment to BUG_ON's in svc_destroy() to make the code look clearer.

    This patch introduces a network namespace filter for the service destruction
    function.
    Nothing special here - just do exactly the same operations, but only for
    transports in the passed network namespace context.
    BTW, BUG_ON() checks for empty service transport lists were returned to the
    svc_destroy() function. This is because of switching from the generic
    svc_close_all() to the network-namespace-dependent svc_close_net().

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
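
The namespace filter described in this commit amounts to walking the transport list and acting only on entries that belong to the given namespace. Below is a toy userspace sketch of that filtering step. It is not the svc_close_net() implementation: `toy_net`, `toy_svc_xprt` and `toy_close_net()` are hypothetical names, and a singly linked list stands in for the kernel's list handling and locking.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a network namespace. */
struct toy_net {
    int id;
};

/* Toy stand-in for a service transport tagged with its namespace. */
struct toy_svc_xprt {
    const struct toy_net *net;
    int closed;
    struct toy_svc_xprt *next;
};

/*
 * Close only the transports that belong to `net` and are still open.
 * Returns the number of transports closed by this call.
 */
int toy_close_net(struct toy_svc_xprt *head, const struct toy_net *net)
{
    int n = 0;

    assert(net != NULL);
    for (struct toy_svc_xprt *x = head; x != NULL; x = x->next) {
        if (x->net != net || x->closed)
            continue;   /* other namespace, or already closed */
        x->closed = 1;
        n++;
    }
    return n;
}
```

Transports tagged with a different namespace are left untouched, so a service shared between namespaces keeps running for its remaining users.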