04 Mar, 2013

1 commit

  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems, trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives, allowing simple, safe,
    well-understood work-arounds to known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as its module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases, the most significant being autofs, which lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regard to the user's permissions. In general all a filesystem
    module does once loaded is call register_filesystem() and go to sleep,
    which means there is not much attack surface exposed by loading a
    filesystem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.
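
    The naming scheme described above can be sketched in user space as
    follows. This is only an illustration: fs_module_alias() is a
    hypothetical helper, and it assumes the requested alias is simply
    "fs-" followed by the filesystem type name, as the message describes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the module alias that would be requested for a filesystem type.
 * Hypothetical user-space helper, not kernel code.
 * Returns 0 on success, -1 if the alias does not fit in buf.
 */
int fs_module_alias(const char *fstype, size_t len, char *buf, size_t buflen)
{
    /* "%.*s" copies at most len bytes, so fstype may be a longer string. */
    int n = snprintf(buf, buflen, "fs-%.*s", (int)len, fstype);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

    With such a scheme, a filesystem whose module name differs from its
    type name (autofs in autofs4, for example) only needs an alias that
    maps the prefixed name to the real module.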

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

03 Mar, 2013

2 commits

  • Pull NFS client bugfixes from Trond Myklebust:
    "We've just concluded another Connectathon interoperability testing
    week, and so here are the fixes for the bugs that were discovered:

    - Don't allow NFS silly-renamed files to be deleted
    - Don't start the retransmission timer when out of socket space
    - Fix a couple of pnfs-related Oopses.
    - Fix one more NFSv4 state recovery deadlock
    - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"

    * tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    SUNRPC: One line comment fix
    NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
    SUNRPC: add call to get configured timeout
    PNFS: set the default DS timeout to 60 seconds
    NFSv4: Fix another open/open_recovery deadlock
    nfs: don't allow nfs_find_actor to match inodes of the wrong type
    NFSv4.1: Hold reference to layout hdr in layoutget
    pnfs: fix resend_to_mds for directio
    SUNRPC: Don't start the retransmission timer when out of socket space
    NFS: Don't allow NFS silly-renamed files to be deleted, no signal

    Linus Torvalds
     
  • Reported-by: Weston Andros Adamson
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

01 Mar, 2013

3 commits

  • Pull nfsd changes from J Bruce Fields:
    "Miscellaneous bugfixes, plus:

    - An overhaul of the DRC cache by Jeff Layton. The main effect is
    just to make it larger. This decreases the chances of intermittent
    errors especially in the UDP case. But we'll need to watch for any
    reports of performance regressions.

    - Containerized nfsd: with some limitations, we now support
    per-container nfs-service, thanks to extensive work from Stanislav
    Kinsbursky over the last year."

    Some notes about conflicts, since there were *two* non-data semantic
    conflicts here:

    - idr_remove_all() had been added by a memory leak fix, but has since
    become deprecated since idr_destroy() does it for us now.

    - xs_local_connect() had been added by this branch to make AF_LOCAL
    connections be synchronous, but in the meantime Trond had changed the
    calling convention in order to avoid a RCU dereference.

    There were a couple of more obvious actual source-level conflicts due to
    the hlist traversal changes and one just due to code changes next to
    each other, but those were trivial.

    * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
    SUNRPC: make AF_LOCAL connect synchronous
    nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
    svcrpc: fix rpc server shutdown races
    svcrpc: make svc_age_temp_xprts enqueue under sv_lock
    lockd: nlmclnt_reclaim(): avoid stack overflow
    nfsd: enable NFSv4 state in containers
    nfsd: disable usermode helper client tracker in container
    nfsd: use proper net while reading "exports" file
    nfsd: containerize NFSd filesystem
    nfsd: fix comments on nfsd_cache_lookup
    SUNRPC: move cache_detail->cache_request callback call to cache_read()
    SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
    SUNRPC: rework cache upcall logic
    SUNRPC: introduce cache_detail->cache_request callback
    NFS: simplify and clean cache library
    NFS: use SUNRPC cache creation and destruction helper for DNS cache
    nfsd4: free_stid can be static
    nfsd: keep a checksum of the first 256 bytes of request
    sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
    sunrpc: fix comment in struct xdr_buf definition
    ...

    Linus Torvalds
     
  • Returns the configured timeout for the xprt of the rpc client.

    Signed-off-by: Weston Andros Adamson
    Signed-off-by: Trond Myklebust

    Weston Andros Adamson
     
  • It doesn't appear that anyone actually needs to connect asynchronously.

    Also, using a workqueue for the connect means we lose the namespace
    information from the original process. This is a problem since there's
    no way to explicitly pass in a filesystem namespace for resolution of an
    AF_LOCAL address.

    Acked-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were conceived
    differently from the list ones, which take three arguments:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.
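
    A minimal user-space sketch of the three-argument form is shown
    here. It is modelled on the description in this message rather than
    copied from linux/list.h; struct item and sum_items() are
    hypothetical, and __typeof__ is a GCC/Clang extension.

```c
#include <stddef.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Push node n onto the front of list h. */
void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Map a node pointer to its containing entry, passing NULL through. */
#define hlist_entry_safe(ptr, type, member) \
    ((ptr) ? container_of(ptr, type, member) : NULL)

/* Three arguments: pos is the entry cursor, with no separate node cursor. */
#define hlist_for_each_entry(pos, head, member)                              \
    for (pos = hlist_entry_safe((head)->first, __typeof__(*(pos)), member);  \
         pos;                                                                \
         pos = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Hypothetical payload type embedding a list node. */
struct item {
    int value;
    struct hlist_node link;
};

/* Sum the values of every item on the list. */
int sum_items(struct hlist_head *head)
{
    struct item *it;
    int total = 0;

    hlist_for_each_entry(it, head, link)
        total += it->value;
    return total;
}
```

    The caller declares only the entry pointer, exactly as it would for
    list_for_each_entry().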

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch, which is mostly the work of Peter Senna Tschudin, is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foudnation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

27 Feb, 2013

2 commits

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     
  • Pull infiniband update from Roland Dreier:
    "Main batch of InfiniBand/RDMA changes for 3.9:

    - SRP error handling fixes from Bart Van Assche

    - Implementation of memory windows for mlx4 from Shani Michaeli

    - Lots of cxgb4 HW driver fixes from Vipul Pandya

    - Make iSER work for virtual functions, other fixes from Or Gerlitz

    - Fix for bug in qib HW driver from Mike Marciniszyn

    - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman

    - Various cleanups and warning fixes from Julia Lawall, Paul Bolle,
    Wei Yongjun"

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits)
    IB/mlx4: Advertise MW support
    IB/mlx4: Support memory window binding
    mlx4: Implement memory windows allocation and deallocation
    mlx4_core: Enable memory windows in {INIT, QUERY}_HCA
    mlx4_core: Disable memory windows for virtual functions
    IPoIB: Free ipoib neigh on path record failure so path rec queries are retried
    IB/srp: Fail I/O requests if the transport is offline
    IB/srp: Avoid endless SCSI error handling loop
    IB/srp: Avoid sending a task management function needlessly
    IB/srp: Track connection state properly
    IB/mlx4: Remove redundant NULL check before kfree
    IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable
    IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool
    IB/iser: Enable iser when FMRs are not supported
    IB/iser: Avoid error prints on EAGAIN registration failures
    IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML
    IB/uverbs: Implement memory windows support in uverbs
    IB/core: Add "type 2" memory windows support
    mlx4_core: Propagate MR deregistration failures to caller
    mlx4_core: Rename MPT-related functions to have mpt_ prefix
    ...

    Linus Torvalds
     

26 Feb, 2013

1 commit

  • Pull user namespace and namespace infrastructure changes from Eric W Biederman:
    "This set of changes starts with a few small enhancements to the user
    namespace: reboot support, allowing more arbitrary mappings, and
    support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
    user namespace root.

    I do my best to document that, if you care about limiting your
    unprivileged users, you will need to enable memory control groups
    when you have user namespace support enabled.

    There is a minor bug fix to prevent overflowing the stack if someone
    creates way too many user namespaces.

    The bulk of the changes are a continuation of the kuid/kgid push down
    work through the filesystems. These changes make using uids and gids
    typesafe which ensures that these filesystems are safe to use when
    multiple user namespaces are in use. The filesystems converted for
    3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The
    changes for these filesystems were a little more involved so I split
    the changes into smaller hopefully obviously correct changes.

    XFS is the only filesystem that remains. I was hoping I could get
    that in this release so that user namespace support would be enabled
    with an allyesconfig or an allmodconfig but it looks like the xfs
    changes need another couple of days before they are ready."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
    cifs: Enable building with user namespaces enabled.
    cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
    cifs: Convert struct cifs_sb_info to use kuids and kgids
    cifs: Modify struct smb_vol to use kuids and kgids
    cifs: Convert struct cifsFileInfo to use a kuid
    cifs: Convert struct cifs_fattr to use kuid and kgids
    cifs: Convert struct tcon_link to use a kuid.
    cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
    cifs: Convert from a kuid before printing current_fsuid
    cifs: Use kuids and kgids SID to uid/gid mapping
    cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
    cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
    cifs: Override unmappable incoming uids and gids
    nfsd: Enable building with user namespaces enabled.
    nfsd: Properly compare and initialize kuids and kgids
    nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
    nfsd: Modify nfsd4_cb_sec to use kuids and kgids
    nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
    nfsd: Convert nfsxdr to use kuids and kgids
    nfsd: Convert nfs3xdr to use kuids and kgids
    ...

    Linus Torvalds
     

23 Feb, 2013

2 commits


22 Feb, 2013

3 commits

  • Pull driver core patches from Greg Kroah-Hartman:
    "Here is the big driver core merge for 3.9-rc1

    There are two major series here, both of which touch lots of drivers
    all over the kernel, and will cause you some merge conflicts:

    - add a new function called devm_ioremap_resource() to properly be
    able to check return values.

    - remove CONFIG_EXPERIMENTAL

    Other than those patches, there's not much here, some minor fixes and
    updates"

    Fix up trivial conflicts

    * tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
    base: memory: fix soft/hard_offline_page permissions
    drivercore: Fix ordering between deferred_probe and exiting initcalls
    backlight: fix class_find_device() arguments
    TTY: mark tty_get_device call with the proper const values
    driver-core: constify data for class_find_device()
    firmware: Ignore abort check when no user-helper is used
    firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
    firmware: Make user-mode helper optional
    firmware: Refactoring for splitting user-mode helper code
    Driver core: treat unregistered bus_types as having no devices
    watchdog: Convert to devm_ioremap_resource()
    thermal: Convert to devm_ioremap_resource()
    spi: Convert to devm_ioremap_resource()
    power: Convert to devm_ioremap_resource()
    mtd: Convert to devm_ioremap_resource()
    mmc: Convert to devm_ioremap_resource()
    mfd: Convert to devm_ioremap_resource()
    media: Convert to devm_ioremap_resource()
    iommu: Convert to devm_ioremap_resource()
    drm: Convert to devm_ioremap_resource()
    ...

    Linus Torvalds
     
  • This patch enhances the IB core support for Memory Windows (MWs).

    MWs allow an application to have better, more flexible control over
    remote access to memory.

    Two types of MWs are supported, with the second type having two flavors:

    Type 1 - associated with PD only
    Type 2A - associated with QPN only
    Type 2B - associated with PD and QPN

    Applications can allocate a MW once, and then repeatedly bind the MW
    to different ranges in MRs that are associated with the same PD. Type 1
    windows are bound through a verb, while type 2 windows are bound by
    posting a work request.

    The 32-bit memory key is composed of a 24-bit index and an 8-bit
    key. The key is changed with each bind, thus allowing more control
    over the peer's use of the memory key.
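
    The key layout described above can be sketched as follows. The
    function names are hypothetical, and the sketch assumes the 8-bit
    key occupies the low byte of the 32-bit value, which the message
    itself does not state.

```c
#include <stdint.h>

/* Assumption: the low 8 bits hold the key, the upper 24 bits the index. */
#define RKEY_TAG_MASK 0x000000ffu

/* Advance the 8-bit key, wrapping at 256, and keep the 24-bit index. */
uint32_t rkey_inc_tag(uint32_t rkey)
{
    return ((rkey + 1) & RKEY_TAG_MASK) | (rkey & ~RKEY_TAG_MASK);
}

/* Extract the 24-bit index part. */
uint32_t rkey_index(uint32_t rkey)
{
    return rkey >> 8;
}

/* Extract the 8-bit key part. */
uint8_t rkey_tag(uint32_t rkey)
{
    return (uint8_t)(rkey & RKEY_TAG_MASK);
}
```

    Because only the key byte changes on each bind, a peer holding a
    stale memory key no longer matches the window after it is rebound.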

    The changes introduced are the following:

    * add memory window type enum and a corresponding parameter to ib_alloc_mw.
    * type 2 memory window bind work request support.
    * create a single struct that contains the common part of the bind verb
    struct ibv_mw_bind and the bind work request.
    * add the ib_inc_rkey helper function to advance the tag part of an rkey.

    Consumer interface details:

    * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
    IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
    for these features.

    A device can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
    IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
    memory windows. It can set neither to indicate that it doesn't support
    type 2 windows at all.

    * modify existing provider and consumer code to use the new parameter of
    ib_alloc_mw and the ib_mw_bind_info structure

    Signed-off-by: Haggai Eran
    Signed-off-by: Shani Michaeli
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Shani Michaeli
     
  • Pull NFS client bugfixes from Trond Myklebust:

    - Fix an Oops in the pNFS layoutget code

    - Fix a number of NFSv4 and v4.1 state recovery deadlocks and hangs due
    to the interaction of the session drain lock and state management
    locks.

    - Remove task->tk_xprt, which was hiding a lot of RCU dereferencing
    bugs

    - Fix a long standing NFSv3 posix lock recovery bug.

    - Revert commit 324d003b0cd8 ("NFS: add nfs_sb_deactive_async to avoid
    deadlock"). It turned out that the root cause of the deadlock was
    due to interactions with the workqueues that have now been resolved.

    * tag 'nfs-for-3.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (22 commits)
    NLM: Ensure that we resend all pending blocking locks after a reclaim
    umount oops when remove blocklayoutdriver first
    sunrpc: silence build warning in gss_fill_context
    nfs: remove kfree() redundant null checks
    NFSv4.1: Don't decode skipped layoutgets
    NFSv4.1: Fix bulk recall and destroy of layouts
    NFSv4.1: Fix an ABBA locking issue with session and state serialisation
    NFSv4: Fix a reboot recovery race when opening a file
    NFSv4: Ensure delegation recall and byte range lock removal don't conflict
    NFSv4: Fix up the return values of nfs4_open_delegation_recall
    NFSv4.1: Don't lose locks when a server reboots during delegation return
    NFSv4.1: Prevent deadlocks between state recovery and file locking
    NFSv4: Allow the state manager to mark an open_owner as being recovered
    SUNRPC: Add missing static declaration to _gss_mech_get_by_name
    Revert "NFS: add nfs_sb_deactive_async to avoid deadlock"
    SUNRPC: Nuke the tk_xprt macro
    SUNRPC: Avoid RCU dereferences in the transport bind and connect code
    SUNRPC: Fix an RCU dereference in xprt_reserve
    SUNRPC: Pass pointers to struct rpc_xprt to the congestion window
    SUNRPC: Fix an RCU dereference in xs_local_rpcbind
    ...

    Linus Torvalds
     

18 Feb, 2013

1 commit

  • Since commit 620038f6d23, gcc is throwing the following warning:

    CC [M] net/sunrpc/auth_gss/auth_gss.o
    In file included from include/linux/sunrpc/types.h:14:0,
    from include/linux/sunrpc/sched.h:14,
    from include/linux/sunrpc/clnt.h:18,
    from net/sunrpc/auth_gss/auth_gss.c:45:
    net/sunrpc/auth_gss/auth_gss.c: In function ‘gss_pipe_downcall’:
    include/linux/sunrpc/debug.h:45:10: warning: ‘timeout’ may be used
    uninitialized in this function [-Wmaybe-uninitialized]
    printk(KERN_DEFAULT args); \
    ^
    net/sunrpc/auth_gss/auth_gss.c:194:15: note: ‘timeout’ was declared here
    unsigned int timeout;
    ^
    If simple_get_bytes returns an error, then we'll end up calling printk
    with an uninitialized timeout value. Reasonably harmless, but fairly
    simple to fix by removing the printout of the uninitialised parameters.

    Cc: Andy Adamson
    Signed-off-by: Jeff Layton
    [Trond: just remove the parameters rather than initialising timeout]
    Signed-off-by: Trond Myklebust

    Jeff Layton
     

17 Feb, 2013

2 commits

  • Rewrite server shutdown to remove the assumption that there are no
    longer any threads running (no longer true, for example, when shutting
    down the service in one network namespace while it's still running in
    others).

    Do that by doing what we'd do in normal circumstances: just CLOSE each
    socket, then enqueue it.

    Since there may not be threads to handle the resulting queued xprts,
    also run a simplified version of the svc_recv() loop run by a server to
    clean up any closed xprts afterwards.

    Cc: stable@kernel.org
    Tested-by: Jason Tibbitts
    Tested-by: Paweł Sikora
    Acked-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • svc_age_temp_xprts expires xprts in a two-step process: first it takes
    the sv_lock and moves the xprts to expire off their server-wide list
    (sv_tempsocks or sv_permsocks) to a local list. Then it drops the
    sv_lock and enqueues and puts each one.

    I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the
    sv_lock and sp_lock are not otherwise nested anywhere (and documentation
    at the top of this file claims it's correct to nest these with sp_lock
    inside.)

    Cc: stable@kernel.org
    Tested-by: Jason Tibbitts
    Tested-by: Paweł Sikora
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

15 Feb, 2013

4 commits

  • The reason to move the cache_request() callback call from
    sunrpc_cache_pipe_upcall() to cache_read() is that this guarantees that cache
    access will be done in userspace process context (only a userspace process has
    the proper root context).
    This is required for NFSd support in containers: svc_export_request() (which is
    the cache_request callback) calls d_path(), which, in turn, traverses dentries up to
    current->fs->root. Kernel threads always have the global root, while a container
    may be in a "root jail", i.e. have its own nested root.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     
  • Passing this pointer is redundant since it's stored in the cache_detail structure,
    which is also passed to the sunrpc_cache_pipe_upcall() function.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     
  • For most SUNRPC caches (except the NFS DNS cache), cache_detail->cache_upcall is
    redundant, since all that its implementations do is call
    sunrpc_cache_pipe_upcall() with the proper function address argument.
    The cache request function address is now stored in the cache_detail structure, and
    thus all the code can be simplified.
    Now, for those cache details that don't have a cache_upcall callback (the
    only one that still has one is nfs_dns_resolve_template),
    sunrpc_cache_pipe_upcall() will be called instead.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     
  • This callback will allow upcalls to be simplified in further patches in this
    series.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     

13 Feb, 2013

12 commits

  • When reading kuids from the wire, map them into the initial user
    namespace, and validate that the mapping succeeded.

    When reading kgids from the wire, map them into the initial user
    namespace, and validate that the mapping succeeded.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • When a new rpc connection is established with an in-kernel server, the
    traffic passes through svc_process_common, and svc_set_client and down
    into svcauth_unix_set_client if it is of type RPC_AUTH_NULL or
    RPC_AUTH_UNIX.

    svcauth_unix_set_client then looks at the uid of the credential we
    have assigned to the incoming client and, if we don't have the groups
    already cached, makes an upcall to get a list of groups that the client
    can use.

    The upcall sends an rpc message to user space encoding the uid
    of the user whose groups we want to know. Encode the kuid of the user
    in the initial user namespace, as nfs mounts can only happen today in
    the initial user namespace.

    When a reply to an upcall comes in, interpret the uid and gid values
    from the rpc pipe as uids and gids in the initial user namespace and convert
    them into kuids and kgids before processing them further.

    When reading proc files listing the uid to gid list cache, convert the
    kuids and kgids into uids and gids in the initial user namespace. As we are
    displaying server internal details it makes sense to display these values
    from the server's perspective.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • When writing kuids onto the wire first map them into the initial user
    namespace.

    When writing kgids onto the wire first map them into the initial user
    namespace.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • In svcauth_unix introduce a helper unix_gid_hash, as otherwise the
    expression to generate the hash value is just too long.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • For each received uid call make_kuid and validate the result.
    For each received gid call make_kgid and validate the result.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • - Use from_kuid when generating the on the wire uid values.
    - Use make_kuid when reading on the wire values.

    In gss_encode_v0_msg, since the uid in gss_upcall_msg is now a kuid_t,
    generate the necessary uid_t value on the stack and copy it into
    gss_msg->databuf, where it can safely live until the message is no
    longer needed.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • In auth unix there are a couple of places where INVALID_GID is used as a
    sentinel to mark the end of the uc_gids array. Use gid_valid
    as a type safe way to verify we have not hit the end of
    valid data in the array.
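
    The sentinel check can be sketched in user space like this. The
    demo_ names are hypothetical stand-ins for the kernel's kgid_t,
    INVALID_GID and gid_valid, and the wrapper-struct representation is
    an assumption made for the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a type-safe gid: a struct wrapping the raw value. */
typedef struct { unsigned int val; } demo_kgid_t;

/* Stand-in for the "invalid" sentinel value, (gid_t)-1 in raw form. */
#define DEMO_INVALID_GID ((demo_kgid_t){ (unsigned int)-1 })

/* Returns nonzero when gid is not the sentinel. */
int demo_gid_valid(demo_kgid_t gid)
{
    return gid.val != (unsigned int)-1;
}

/* Count the entries before the sentinel, looking at no more than max slots. */
size_t count_gids(const demo_kgid_t *gids, size_t max)
{
    size_t n = 0;

    while (n < max && demo_gid_valid(gids[n]))
        n++;
    return n;
}
```

    The loop never compares raw integers, so it keeps working if the
    representation of the wrapped type changes.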

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • When printing kuids and kgids for debugging purposes convert them
    to ordinary integers so their values can be fed to the ordinary
    print functions.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • In unx_create_cred directly assign gids from acred->group_info
    to cred->uc_gids.

    In unx_match directly compare uc_gids with group_info.

    Now that both group_info and unx_cred gids are stored as kgids
    this is valid and the extra layer of translation can be removed.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • When comparing uids use uid_eq instead of ==.
    When comparing gids use gid_eq instead of ==.

    An unfortunate cost of type safety.
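
    Why == stops working can be seen in a small user-space sketch. The
    demo_ names are hypothetical; the sketch assumes the type-safe id is
    a struct wrapping the raw value, which is what makes a helper
    necessary.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a type-safe uid. */
typedef struct { unsigned int val; } demo_kuid_t;

/* Comparison helper: C defines no == operator for struct operands. */
bool demo_uid_eq(demo_kuid_t a, demo_kuid_t b)
{
    return a.val == b.val;
}

/* Example caller: does the caller own the file? */
bool same_owner(demo_kuid_t file_owner, demo_kuid_t caller)
{
    /* Writing file_owner == caller here would be a compile error. */
    return demo_uid_eq(file_owner, caller);
}
```

    The same property that forces the helper also stops a raw integer
    from being passed where a mapped id is expected.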

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Convert variables that store uids and gids to be of type
    kuid_t and kgid_t instead of type uid_t and gid_t.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Instead of (uid_t)0 use GLOBAL_ROOT_UID.
    Instead of (gid_t)0 use GLOBAL_ROOT_GID.
    Instead of (uid_t)-1 use INVALID_UID
    Instead of (gid_t)-1 use INVALID_GID.
    Instead of NOGROUP use INVALID_GID.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

09 Feb, 2013

2 commits

  • Pull networking fixes from David Miller:

    1) Revert iwlwifi reclaimed packet tracking, it causes problems for a
    bunch of folks. From Emmanuel Grumbach.

    2) Work limiting code in brcmsmac wifi driver can clear tx status
    without processing the event. From Arend van Spriel.

    3) rtlwifi USB driver processes wrong SKB, fix from Larry Finger.

    4) l2tp tunnel delete can race with close, fix from Tom Parkin.

    5) pktgen_add_device() failures are not checked at all, fix from Cong
    Wang.

    6) Fix unintentional removal of carrier off from tun_detach(),
    otherwise we confuse userspace, from Michael S. Tsirkin.

    7) Don't leak socket reference counts and ubufs in vhost-net driver,
    from Jason Wang.

    8) vmxnet3 driver gets its initial carrier state wrong, fix from Neil
    Horman.

    9) Protect against USB networking devices which spam the host with 0
    length frames, from Bjørn Mork.

    10) Prevent neighbour overflows in ipv6 for locally destined routes,
    from Marcelo Ricardo. This is the best short-term fix for this, a
    longer term fix has been implemented in net-next.

    11) L2TP uses ipv4 datagram routines in its ipv6 code, whoops. This
    mistake is largely because the ipv6 functions don't even have some
    kind of prefix in their names to suggest they are ipv6 specific.
    From Tom Parkin.

    12) Check SYN packet drops properly in tcp_rcv_fastopen_synack(), from
    Yuchung Cheng.

    13) Fix races and TX skb freeing bugs in via-rhine's NAPI support, from
    Francois Romieu and yours truly.

    14) Fix infinite loops and divides by zero in TCP congestion window
    handling, from Eric Dumazet, Neal Cardwell, and Ilpo Järvinen.

    15) AF_PACKET tx ring handling can leak kernel memory to userspace, fix
    from Phil Sutter.

    16) Fix error handling in ipv6 GRE tunnel transmit, from Tommi Rantala.

    17) Protect XEN netback driver against hostile frontend putting garbage
    into the rings, don't leak pages in TX GOP checking, and add proper
    resource releasing in error path of xen_netbk_get_requests(). From
    Ian Campbell.

    18) SCTP authentication keys should be cleared out and released with
    kzfree(), from Daniel Borkmann.

    19) L2TP is a bit too clever trying to maintain skb->truesize, and ends
    up corrupting socket memory accounting to the point where packet
    sending is halted indefinitely. Just remove the adjustments
    entirely, they aren't really needed. From Eric Dumazet.

    20) ATM Iphase driver uses a data type with the same name as the S390
    headers, rename to fix the build. From Heiko Carstens.

    21) Fix a typo in copying the inner network header offset from one SKB
    to another, from Pravin B Shelar.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
    net: sctp: sctp_endpoint_free: zero out secret key data
    net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
    atm/iphase: rename fregt_t -> ffreg_t
    net: usb: fix regression from FLAG_NOARP code
    l2tp: dont play with skb->truesize
    net: sctp: sctp_auth_key_put: use kzfree instead of kfree
    netback: correct netbk_tx_err to handle wrap around.
    xen/netback: free already allocated memory on failure in xen_netbk_get_requests
    xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
    xen/netback: shutdown the ring if it contains garbage.
    net: qmi_wwan: add more Huawei devices, including E320
    net: cdc_ncm: add another Huawei vendor specific device
    ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
    tcp: fix for zero packets_in_flight was too broad
    brcmsmac: rework of mac80211 .flush() callback operation
    ssb: unregister gpios before unloading ssb
    bcma: unregister gpios before unloading bcma
    rtlwifi: Fix scheduling while atomic bug
    net: usbnet: fix tx_dropped statistics
    tcp: ipv6: Update MIB counters for drops
    ...

    Linus Torvalds
     
  • When GSSAPI integrity signatures are in use, or when we're using GSSAPI
    privacy with the v2 token format, there is a trailing checksum on the
    xdr_buf that is returned.

    It's checked during the authentication stage, and afterward nothing
    cares about it. Ordinarily, it's not a problem since the XDR code
    generally ignores it, but it will be when we try to compute a checksum
    over the buffer to help prevent XID collisions in the duplicate reply
    cache.

    Fix the code to trim off the checksums after verifying them. Note that
    in unwrap_integ_data, we must avoid trying to reverify the checksum if
    the request was deferred since it will no longer be present when it's
    revisited.

    Signed-off-by: Jeff Layton

    Jeff Layton
     

05 Feb, 2013

1 commit


01 Feb, 2013

3 commits