17 Oct, 2012

3 commits


15 Oct, 2012

3 commits


13 Oct, 2012

1 commit

  • Pull nfsd update from J Bruce Fields:
    "Another relatively quiet cycle. There was some progress on my
    remaining 4.1 todo's, but a couple of them were just of the form
    "check that we do X correctly", so didn't have much effect on the
    code.

    Other than that, a bunch of cleanup and some bugfixes (including an
    annoying NFSv4.0 state leak and a busy-loop in the server that could
    cause it to peg the CPU without making progress)."

    * 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits)
    UAPI: (Scripted) Disintegrate include/linux/sunrpc
    UAPI: (Scripted) Disintegrate include/linux/nfsd
    nfsd4: don't allow reclaims of expired clients
    nfsd4: remove redundant callback probe
    nfsd4: expire old client earlier
    nfsd4: separate session allocation and initialization
    nfsd4: clean up session allocation
    nfsd4: minor free_session cleanup
    nfsd4: new_conn_from_crses should only allocate
    nfsd4: separate connection allocation and initialization
    nfsd4: reject bad forechannel attrs earlier
    nfsd4: enforce per-client sessions/no-sessions distinction
    nfsd4: set cl_minorversion at create time
    nfsd4: don't pin clientids to pseudoflavors
    nfsd4: fix bind_conn_to_session xdr comment
    nfsd4: cast readlink() bug argument
    NFSD: pass null terminated buf to kstrtouint()
    nfsd: remove duplicate init in nfsd4_cb_recall
    nfsd4: eliminate redundant nfs4_free_stateid
    fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR
    ...

    Linus Torvalds
     

12 Oct, 2012

1 commit

  • Merge branch 'bugfixes' of git://linux-nfs.org/~trondmy/nfs-2.6 into
    for-3.7-incoming. Mainly needed for Bryan's "SUNRPC: Set alloc_slot for
    backchannel tcp ops", without which the 4.1 server oopses.

    J. Bruce Fields
     

10 Oct, 2012

2 commits

  • Pull NFS client updates from Trond Myklebust:
    "Features include:

    - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1
    Aside from the issues discussed at the LKS, distros are shipping
    NFSv4.1 with all the trimmings.
    - Fix fdatasync()/fsync() for the corner case of a server reboot.
    - NFSv4 OPEN access fix: finally distinguish correctly between
    open-for-read and open-for-execute permissions in all situations.
    - Ensure that the TCP socket is closed when we're in CLOSE_WAIT
    - More idmapper bugfixes
    - Lots of pNFS bugfixes and cleanups to remove unnecessary state and
    make the code easier to read.
    - In cases where a pNFS read or write fails, allow the client to
    resume trying layoutgets after two minutes of read/write-
    through-mds.
    - More net namespace fixes to the NFSv4 callback code.
    - More net namespace fixes to the NFSv3 locking code.
    - More NFSv4 migration preparatory patches.
    Including patches to detect network trunking in both NFSv4 and
    NFSv4.1
    - pNFS block updates to optimise LAYOUTGET calls."

    * tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (113 commits)
    pnfsblock: cleanup nfs4_blkdev_get
    NFS41: send real read size in layoutget
    NFS41: send real write size in layoutget
    NFS: track direct IO left bytes
    NFSv4.1: Cleanup ugliness in pnfs_layoutgets_blocked()
    NFSv4.1: Ensure that the layout sequence id stays 'close' to the current
    NFSv4.1: Deal with seqid wraparound in the pNFS return-on-close code
    NFSv4 set open access operation call flag in nfs4_init_opendata_res
    NFSv4.1: Remove the dependency on CONFIG_EXPERIMENTAL
    NFSv4 reduce attribute requests for open reclaim
    NFSv4: nfs4_open_done first must check that GETATTR decoded a file type
    NFSv4.1: Deal with wraparound when updating the layout "barrier" seqid
    NFSv4.1: Deal with wraparound issues when updating the layout stateid
    NFSv4.1: Always set the layout stateid if this is the first layoutget
    NFSv4.1: Fix another refcount issue in pnfs_find_alloc_layout
    NFSv4: don't put ACCESS in OPEN compound if O_EXCL
    NFSv4: don't check MAY_WRITE access bit in OPEN
    NFS: Set key construction data for the legacy upcall
    NFSv4.1: don't do two EXCHANGE_IDs on mount
    NFS: nfs41_walk_client_list(): re-lock before iterating
    ...

    Linus Torvalds
     
  • This is to complete part of the Userspace API (UAPI) disintegration for which
    the preparatory patches were pulled recently. After these patches, userspace
    headers will be segregated into:

    include/uapi/linux/.../foo.h

    for the userspace interface stuff, and:

    include/linux/.../foo.h

    for the strictly kernel internal stuff.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

09 Oct, 2012

5 commits

  • Move actual pte filling for non-linear file mappings into the new special
    vma operation: ->remap_pages().

    Filesystems must implement this method to get non-linear mapping support;
    if a filesystem uses filemap_fault(), then generic_file_remap_pages() can be used.

    Now device drivers can implement this method and obtain nonlinear vma support.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf #arch/tile
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • It is not needed at all and it is messing with return values...

    Reported-by: Wei Yongjun
    Signed-off-by: Peng Tao
    Signed-off-by: Trond Myklebust

    Peng Tao
     
  • For buffer read, use offset-to-isize.

    For direct read, use dreq->bytes_left.

    Signed-off-by: Peng Tao
    Signed-off-by: Trond Myklebust

    Peng Tao
     
  • For buffer write, the block layout client scans the inode mapping to find
    the next hole and uses offset-to-hole as the layoutget length. The object
    layout client uses offset-to-isize as the layoutget length.

    For direct write, both block layout and object layout use dreq->bytes_left.

    Signed-off-by: Peng Tao
    Signed-off-by: Trond Myklebust

    Peng Tao
     
  • Signed-off-by: Peng Tao
    Signed-off-by: Trond Myklebust

    Peng Tao
     

06 Oct, 2012

1 commit


05 Oct, 2012

2 commits


04 Oct, 2012

2 commits


03 Oct, 2012

15 commits

  • Pull vfs update from Al Viro:

    "- big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c splitoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • There's no reason to call rcu_barrier() on every
    deactivate_locked_super(). We only need to make sure that all delayed rcu
    free inodes are flushed before we destroy related cache.

    Removing rcu_barrier() from deactivate_locked_super() affects some fast
    paths. E.g. on my machine exit_group() of a last process in IPC
    namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.

    Signed-off-by: Kirill A. Shutemov
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Kirill A. Shutemov
     
  • We currently make no distinction in attribute requests between normal OPENs
    and OPEN with CLAIM_PREVIOUS. This offers more possibility of failures in
    the GETATTR response which foils OPEN reclaim attempts.

    Reduce the requested attributes to the bare minimum needed to update the
    reclaim open stateid and split nfs4_opendata_to_nfs4_state processing
    accordingly.

    Signed-off-by: Andy Adamson
    Signed-off-by: Trond Myklebust

    Andy Adamson
     
  • ...before it can check the validity of that file type.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • ...and fix a bug in pnfs_set_layout_stateid.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • ...and add a helper function.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If the list of layout segments is empty, we must unconditionally set
    the layout stateid.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Don't put an ACCESS op in OPEN compound if O_EXCL, because ACCESS
    will return permission denied for all bits until close.

    Fixes a regression due to commit 6168f62c (NFSv4: Add ACCESS operation to
    OPEN compound)

    Signed-off-by: Weston Andros Adamson
    Signed-off-by: Trond Myklebust

    Weston Andros Adamson
     
  • Don't check MAY_WRITE as a newly created file may not have write mode bits,
    but POSIX allows the creating process to write regardless.
    This is ok because NFSv4 OPEN ops handle write permissions correctly -
    the ACCESS in the OPEN compound is to differentiate READ v EXEC permissions.

    Fixes a regression due to commit 6168f62c (NFSv4: Add ACCESS operation to
    OPEN compound)

    Signed-off-by: Weston Andros Adamson
    Signed-off-by: Trond Myklebust

    Weston Andros Adamson
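
The POSIX behaviour this commit message relies on can be observed from userspace. The following is a minimal sketch, assuming a POSIX system and a writable local directory; the function name `create_readonly_and_write` is illustrative and is not part of the NFS client code. It creates a file whose mode bits grant no write permission and then writes through the descriptor returned by the creating open().

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `path` exclusively with mode 0444 (no write bits) and write
 * five bytes through the descriptor returned by the creating open().
 * Returns the number of bytes written, or -1 if open() failed.
 * (Hypothetical helper for illustration only.)
 */
ssize_t create_readonly_and_write(const char *path)
{
    /* The mode argument sets the permissions of the new file; the
     * access mode of this descriptor comes from O_WRONLY. */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0444);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, "hello", 5);
    close(fd);
    return n;
}
```

A later open() of the same file for writing by an unprivileged process would be refused on the mode bits, which is why a MAY_WRITE check made at creation time can give a misleading answer.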
     
  • This prevents a null pointer dereference when
    nfs_idmap_complete_pipe_upcall_locked() calls complete_request_key().

    Fixes a regression caused by commit 0cac12023 (NFSv4: Ensure that
    idmap_pipe_downcall sanity-checks the downcall data).

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     
  • Since the addition of NFSv4 server trunking detection, the mount context
    calls nfs4_proc_exchange_id then schedules the state manager, which also
    calls nfs4_proc_exchange_id. Setting the NFS4CLNT_LEASE_CONFIRM bit
    makes the state manager skip the unneeded EXCHANGE_ID and continue on
    with session creation.

    Reported-by: Jorge Mora
    Signed-off-by: Weston Andros Adamson
    Signed-off-by: Trond Myklebust

    Weston Andros Adamson
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values as far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting type-incompatibility compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privilege checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user namespace to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/gid logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
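
The strategy described in the pull message above (a distinct kernel-internal id type, with conversion only at the boundary) can be sketched in userspace C. This is a simplified illustration under stated assumptions: `struct demo_user_ns` and its single contiguous mapping range are hypothetical, and the helpers below are stand-ins that borrow the names make_kuid, from_kuid, uid_valid and uid_eq rather than reproducing the kernel's definitions. Wrapping the value in a struct is what turns an accidental mix of raw and mapped ids into a compile error.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Wrapper type: a kuid_t cannot be assigned to or compared with a raw
 * uid_t without an explicit conversion, so misuse fails to compile. */
typedef struct { uid_t val; } kuid_t;

/* Hypothetical namespace with one contiguous mapping range:
 * ids [first, first + count) map to [lower_first, lower_first + count). */
struct demo_user_ns {
    uid_t first;
    uid_t lower_first;
    uid_t count;
};

#define DEMO_INVALID_UID ((kuid_t){ (uid_t)-1 })

static inline bool uid_valid(kuid_t k) { return k.val != (uid_t)-1; }
static inline bool uid_eq(kuid_t a, kuid_t b) { return a.val == b.val; }

/* Boundary conversion: namespace-relative uid -> internal kuid_t. */
kuid_t make_kuid(const struct demo_user_ns *ns, uid_t uid)
{
    if (uid >= ns->first && uid - ns->first < ns->count)
        return (kuid_t){ ns->lower_first + (uid - ns->first) };
    return DEMO_INVALID_UID;
}

/* Boundary conversion: internal kuid_t -> namespace-relative uid.
 * Returns (uid_t)-1 when the id has no mapping in this namespace. */
uid_t from_kuid(const struct demo_user_ns *ns, kuid_t k)
{
    if (k.val >= ns->lower_first && k.val - ns->lower_first < ns->count)
        return ns->first + (k.val - ns->lower_first);
    return (uid_t)-1;
}
```

With a namespace of `{ 0, 100000, 65536 }`, `make_kuid(&ns, 1000)` yields an internal value of 101000 and `from_kuid()` maps it back to 1000; an expression such as `k == 1000` does not compile, which is the kind of error the message says guided the conversion.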
     
  • Pull workqueue changes from Tejun Heo:
    "This is workqueue updates for v3.7-rc1. A lot of activities this
    round including considerable API and behavior cleanups.

    * delayed_work combines a timer and a work item. The handling of the
    timer part has always been a bit clunky leading to confusing
    cancelation API with weird corner-case behaviors. delayed_work is
    updated to use new IRQ safe timer and cancelation now works as
    expected.

    * Another deficiency of delayed_work was lack of the counterpart of
    mod_timer() which led to cancel+queue combinations or open-coded
    timer+work usages. mod_delayed_work[_on]() are added.

    These two delayed_work changes make delayed_work provide the interface
    of, and behave like, a timer that is executed in process context.

    * A work item could be executed concurrently on multiple CPUs, which
    is rather unintuitive and made flush_work() behavior confusing and
    half-broken under certain circumstances. This problem doesn't
    exist for non-reentrant workqueues. While non-reentrancy check
    isn't free, the overhead is incurred only when a work item bounces
    across different CPUs and even in simulated pathological scenario
    the overhead isn't too high.

    All workqueues are made non-reentrant. This removes the
    distinction between flush_[delayed_]work() and
    flush_[delayed_]work_sync(). The former is now as strong as the
    latter and the specified work item is guaranteed to have finished
    execution of any previous queueing on return.

    * In addition to the various bug fixes, Lai redid and simplified CPU
    hotplug handling significantly.

    * Joonsoo introduced system_highpri_wq and used it during CPU
    hotplug.

    There are two merge commits - one to pull in IRQ safe timer from
    tip/timers/core and the other to pull in CPU hotplug fixes from
    wq/for-3.6-fixes as Lai's hotplug restructuring depended on them."

    Fixed a number of trivial conflicts, but the more interesting conflicts
    were silent ones where the deprecated interfaces had been used by new
    code in the merge window, and thus didn't cause any real data conflicts.

    Tejun pointed out a few of them, I fixed a couple more.

    * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits)
    workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending()
    workqueue: use cwq_set_max_active() helper for workqueue_set_max_active()
    workqueue: introduce cwq_set_max_active() helper for thaw_workqueues()
    workqueue: remove @delayed from cwq_dec_nr_in_flight()
    workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
    workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback()
    workqueue: use __cpuinit instead of __devinit for cpu callbacks
    workqueue: rename manager_mutex to assoc_mutex
    workqueue: WORKER_REBIND is no longer necessary for idle rebinding
    workqueue: WORKER_REBIND is no longer necessary for busy rebinding
    workqueue: reimplement idle worker rebinding
    workqueue: deprecate __cancel_delayed_work()
    workqueue: reimplement cancel_delayed_work() using try_to_grab_pending()
    workqueue: use mod_delayed_work() instead of __cancel + queue
    workqueue: use irqsafe timer for delayed_work
    workqueue: clean up delayed_work initializers and add missing one
    workqueue: make deferrable delayed_work initializer names consistent
    workqueue: cosmetic whitespace updates for macro definitions
    workqueue: deprecate system_nrt[_freezable]_wq
    workqueue: deprecate flush[_delayed]_work_sync()
    ...

    Linus Torvalds
     
  • Sparse identified an execution path in nfs41_walk_client_list()
    where the nfs_client_lock is not re-acquired before taking the next
    loop iteration.

    fs/nfs/nfs4client.c:437:9: sparse: context imbalance in
    'nfs41_walk_client_list' - different lock contexts for basic block

    Signed-off-by: Chuck Lever
    Cc: Fengguang Wu
    Signed-off-by: Trond Myklebust

    Chuck Lever
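
The class of bug sparse reported here can be illustrated with a userspace analogue. The sketch below uses a pthread mutex in place of the kernel spinlock; `struct node`, `slow_check()` and `walk_list()` are hypothetical names and do not reproduce nfs41_walk_client_list(). The point it shows is that once the lock is dropped inside the loop body, every path back to the top of the loop must re-acquire it first, including the error path.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list element for illustration. */
struct node {
    int needs_wait;      /* nonzero: must be checked without the lock held */
    struct node *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for a blocking operation; fails when needs_wait > 1. */
static int slow_check(const struct node *n)
{
    return n->needs_wait > 1 ? -1 : 0;
}

/* Walk the list under list_lock and count the entries that pass. */
int walk_list(struct node *head)
{
    int passed = 0;

    pthread_mutex_lock(&list_lock);
    for (struct node *n = head; n != NULL; n = n->next) {
        if (n->needs_wait) {
            /* Drop the lock around the blocking call. */
            pthread_mutex_unlock(&list_lock);
            int err = slow_check(n);
            /* Re-lock before any branch that returns to the loop;
             * placing this after the error check would let the
             * "continue" path iterate without the lock. */
            pthread_mutex_lock(&list_lock);
            if (err)
                continue;
        }
        passed++;
    }
    pthread_mutex_unlock(&list_lock);
    return passed;
}
```

Real code that drops a lock mid-walk also has to keep the current element alive, for example with a reference count; the sketch leaves that out because the list here is never modified.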
     

02 Oct, 2012

5 commits