22 Dec, 2011

1 commit

  • * master: (848 commits)
    SELinux: Fix RCU deref check warning in sel_netport_insert()
    binary_sysctl(): fix memory leak
    mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
    ipmi_watchdog: restore settings when BMC reset
    oom: fix integer overflow of points in oom_badness
    memcg: keep root group unchanged if creation fails
    nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
    nilfs2: unbreak compat ioctl
    cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
    evm: prevent racing during tfm allocation
    evm: key must be set once during initialization
    mmc: vub300: fix type of firmware_rom_wait_states module parameter
    Revert "mmc: enable runtime PM by default"
    mmc: sdhci: remove "state" argument from sdhci_suspend_host
    x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
    IB/qib: Correct sense on freectxts increment and decrement
    RDMA/cma: Verify private data length
    cgroups: fix a css_set not found bug in cgroup_attach_proc
    oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
    Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
    ...

    Conflicts:
    kernel/cgroup_freezer.c

    Rafael J. Wysocki
     

07 Dec, 2011

1 commit

  • Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc
    layer. This should allow suspend and hibernate events to proceed, even
    when there are RPCs pending on the wire.

    Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count
    and freezer_count calls. This allows the freezer to skip tasks that are
    sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors.

    Signed-off-by: Jeff Layton
    Signed-off-by: Rafael J. Wysocki

    Jeff Layton
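
The wrapping described in the commit above can be pictured with a small userspace model. This is a sketch in Python, not kernel code: `Task`, `freezer_skipped` and `tasks_blocking_freeze` are hypothetical names, and the `skip` flag only stands in for whatever state freezer_do_not_count() and freezer_count() toggle in the kernel.

```python
from contextlib import contextmanager


class Task:
    """Toy stand-in for a kernel task (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.frozen = False
        self.skip = False  # set while the freezer is told not to count this task


@contextmanager
def freezer_skipped(task):
    """Model of a freezer_do_not_count() ... freezer_count() pair around a sleep."""
    task.skip = True       # freezer_do_not_count(): the freezer may ignore this task
    try:
        yield
    finally:
        task.skip = False  # freezer_count(): the task counts again


def tasks_blocking_freeze(tasks):
    """Names of tasks the model freezer would still have to wait for."""
    return [t.name for t in tasks if not t.frozen and not t.skip]


nfs_task = Task("nfs-retry-loop")
other = Task("editor")
other.frozen = True

with freezer_skipped(nfs_task):
    # The task is sleeping in a retry loop (e.g. on NFS4ERR_DELAY).
    during = tasks_blocking_freeze([nfs_task, other])
after = tasks_blocking_freeze([nfs_task, other])
print(during, after)
```

While the task sits inside the wrapped sleep, the model freezer has nothing left to wait for; once the sleep ends, the task is counted again.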
     

02 Dec, 2011

1 commit


18 Jul, 2011

1 commit


08 Jul, 2011

1 commit


15 Jun, 2011

1 commit

  • If the NLM daemon is killed on the NFS server, we can currently end up
    hanging forever on an 'unlock' request, instead of aborting. Basically,
    if the rpcbind request fails, or the server keeps returning garbage, we
    really want to quit instead of retrying.

    Tested-by: Vasily Averin
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
     

27 Mar, 2011

1 commit

  • BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
    Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
    Call Trace:
    [] ? put_rpccred+0x44/0x14e [sunrpc]
    [] ? rpc_ping+0x4e/0x58 [sunrpc]
    [] ? rpc_create+0x481/0x4fc [sunrpc]
    [] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
    [] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
    [] ? nfs4_set_client+0xc2/0x1f9 [nfs]
    [] ? nfs4_create_server+0xf2/0x2a6 [nfs]
    [] ? nfs4_remote_mount+0x4e/0x14a [nfs]
    [] ? vfs_kern_mount+0x6e/0x133
    [] ? nfs_do_root_mount+0x76/0x95 [nfs]
    [] ? nfs4_try_mount+0x56/0xaf [nfs]
    [] ? nfs_get_sb+0x435/0x73c [nfs]
    [] ? vfs_kern_mount+0x99/0x133
    [] ? do_kern_mount+0x48/0xd8
    [] ? do_mount+0x6da/0x741
    [] ? sys_mount+0x83/0xc0
    [] ? system_call_fastpath+0x16/0x1b

    Well, so, I think this is a real bug somewhere in the NFS code.
    Reviewing the code shows this call chain:

    rpc_call_sync()
        rpc_run_task
            rpc_execute()
                __rpc_execute()
                    rpc_release_task()
                        rpc_release_resources_task()
                            put_rpccred()

    Signed-off-by: Trond Myklebust

    OGAWA Hirofumi
     

18 Mar, 2011

2 commits

  • * 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits)
    RPC: killing RPC tasks races fixed
    xprt: remove redundant check
    SUNRPC: Convert struct rpc_xprt to use atomic_t counters
    SUNRPC: Ensure we always run the tk_callback before tk_action
    sunrpc: fix printk format warning
    xprt: remove redundant null check
    nfs: BKL is no longer needed, so remove the include
    NFS: Fix a warning in fs/nfs/idmap.c
    Cleanup: Factor out some cut-and-paste code.
    cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions.
    NFS: account direct-io into task io accounting
    gss:krb5 only include enctype numbers in gm_upcall_enctypes
    RPCRDMA: Fix FRMR registration/invalidate handling.
    RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
    NFSv4: Send unmapped uid/gids to the server when using auth_sys
    NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
    NFSv4: cleanup idmapper functions to take an nfs_server argument
    NFSv4: Send unmapped uid/gids to the server if the idmapper fails
    NFSv4: If the server sends us a numeric uid/gid then accept it
    NFSv4.1: reject zero layout with zeroed stripe unit
    ...

    Linus Torvalds
     
  • This fixes a race in which the task->tk_callback() puts the rpc_task
    to sleep, setting a new callback. Under certain circumstances, the current
    code may end up executing the task->tk_action before it gets round to the
    callback.

    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
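
The ordering rule in the commit above (a pending callback must run before the next action) can be sketched as a tiny scheduler loop. This is an illustrative Python model under stated assumptions, not the sunrpc implementation: `Task`, `step` and `run` are hypothetical, and "sleeping with a new callback" is reduced to assigning `task.callback` again.

```python
class Task:
    """Toy RPC task with a one-shot callback and a state-machine action."""

    def __init__(self):
        self.callback = None
        self.action = None
        self.log = []


def step(task):
    """Run one step; a pending callback always takes priority over the action."""
    if task.callback is not None:
        cb, task.callback = task.callback, None  # clear first: cb may set a new one
        cb(task)
        return True
    if task.action is not None:
        task.action(task)
        return True
    return False


def run(task, max_steps=10):
    for _ in range(max_steps):
        if not step(task):
            break


def second_callback(task):
    task.log.append("callback 2")


def first_callback(task):
    task.log.append("callback 1")
    task.callback = second_callback  # models "sleep again with a new callback"


def action(task):
    task.log.append("action")
    task.action = None  # end of this toy state machine


task = Task()
task.callback = first_callback
task.action = action
run(task)
print(task.log)
```

Because `step` re-checks the callback slot on every pass, the callback installed by `first_callback` runs before the action does.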
     

16 Mar, 2011

1 commit

  • * 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: fix build failure introduced by s/freezeable/freezable/
    workqueue: add system_freezeable_wq
    rds/ib: use system_wq instead of rds_ib_fmr_wq
    net/9p: replace p9_poll_task with a work
    net/9p: use system_wq instead of p9_mux_wq
    xfs: convert to alloc_workqueue()
    reiserfs: make commit_wq use the default concurrency level
    ocfs2: use system_wq instead of ocfs2_quota_wq
    ext4: convert to alloc_workqueue()
    scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
    scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
    misc/iwmc3200top: use system_wq instead of dedicated workqueues
    i2o: use alloc_workqueue() instead of create_workqueue()
    acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
    fs/aio: aio_wq isn't used in memory reclaim path
    input/tps6507x-ts: use system_wq instead of dedicated workqueue
    cpufreq: use system_wq instead of dedicated workqueues
    wireless/ipw2x00: use system_wq instead of dedicated workqueues
    arm/omap: use system_wq in mailbox
    workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER

    Linus Torvalds
     

12 Mar, 2011

2 commits


11 Mar, 2011

1 commit

  • Although they run as rpciod background tasks, under normal operation
    (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
    and nfs4_do_close() want to be fully synchronous. This means that when we
    exit, we want all references to the rpc_task to be gone, and we want
    any dentry references etc. held by that task to be released.

    For this reason these functions call __rpc_wait_for_completion_task(),
    followed by rpc_put_task() in the expectation that the latter will be
    releasing the last reference to the rpc_task, and thus ensuring that the
    callback_ops->rpc_release() has been called synchronously.

    This patch fixes a race which exists due to the fact that
    rpciod calls rpc_complete_task() (in order to wake up the callers of
    __rpc_wait_for_completion_task()) and then subsequently calls
    rpc_put_task() without ensuring that these two steps are done atomically.

    In order to avoid adding new spin locks, the patch uses the existing
    waitqueue spin lock to order the rpc_task reference count releases between
    the waiting process and rpciod.

    The common case where nobody is waiting for completion is optimised by
    checking whether the RPC_TASK_ASYNC flag is cleared and/or the rpc_task
    reference count is 1: in those cases we skip taking the spin lock and
    free the rpc_task immediately.

    Those few processes that need to put the rpc_task from inside an
    asynchronous context and that do not care about ordering are given a new
    helper: rpc_put_task_async().

    Signed-off-by: Trond Myklebust

    Trond Myklebust
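
The ordering idea in the commit above can be modelled in userspace. This is a sketch in Python under stated assumptions, not the kernel code: the lock inside `threading.Condition` stands in for the waitqueue spin lock, `worker_finish` and `wait_and_put` are hypothetical names, and the lock-free fast path described in the commit is left out.

```python
import threading


class Task:
    def __init__(self):
        self.count = 2                     # one reference each: waiter and worker
        self.done = False
        self.cond = threading.Condition()  # its lock plays the waitqueue spin lock
        self.released_by = None

    def release(self):
        # Models callback_ops->rpc_release(); records which thread ran it.
        self.released_by = threading.current_thread().name


def worker_finish(task):
    """Worker side: signal completion and drop its reference in one critical section."""
    with task.cond:
        task.done = True
        task.count -= 1
        last = task.count == 0
        task.cond.notify_all()
    if last:
        task.release()


def wait_and_put(task):
    """Waiter side: wait for completion, then drop the (expected) last reference."""
    with task.cond:
        while not task.done:
            task.cond.wait()
        task.count -= 1
        last = task.count == 0
    if last:
        task.release()
    return last


waiter_name = threading.current_thread().name
task = Task()
worker = threading.Thread(target=worker_finish, args=(task,), name="rpciod-model")
worker.start()
was_last = wait_and_put(task)
worker.join()
print(was_last, task.released_by == waiter_name)
```

Since completion and the worker's reference drop happen under the same lock, a waiter that sees `done` can rely on holding the last reference, so the release runs synchronously in the waiter.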
     

25 Jan, 2011

1 commit


26 Oct, 2010

1 commit

  • * 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (67 commits)
    SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred
    nfs: fix unchecked value
    Ask for time_delta during fsinfo probe
    Revalidate caches on lock
    SUNRPC: After calling xprt_release(), we must restart from call_reserve
    NFSv4: Fix up the 'dircount' hint in encode_readdir
    NFSv4: Clean up nfs4_decode_dirent
    NFSv4: nfs4_decode_dirent must clear entry->fattr->valid
    NFSv4: Fix a regression in decode_getfattr
    NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer
    NFS: Ensure we check all allocation return values in new readdir code
    NFS: Readdir plus in v4
    NFS: introduce generic decode_getattr function
    NFS: check xdr_decode for errors
    NFS: nfs_readdir_filler catch all errors
    NFS: readdir with vmapped pages
    NFS: remove page size checking code
    NFS: decode_dirent should use an xdr_stream
    SUNRPC: Add a helper function xdr_inline_peek
    NFS: remove readdir plus limit
    ...

    Linus Torvalds
     

24 Sep, 2010

1 commit


22 Sep, 2010

1 commit


04 Aug, 2010

4 commits


15 May, 2010

3 commits

  • It has not triggered in almost a decade. Time to get rid of it...

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Currently RPC performance metrics that tabulate elapsed time use
    jiffies time values. This is problematic on systems that use slow
    jiffies (for instance 100HZ systems built for paravirtualized
    environments). It is also a problem for computing precise latency
    statistics for advanced network transports, such as InfiniBand,
    that can have round-trip latencies significantly faster than a single
    clock tick.

    For the RPC client, adopt the high resolution time stamp mechanism
    already used by the network layer and blktrace: ktime.

    We use ktime format time stamps for all internal computations, and
    convert to milliseconds for presentation. As a result, we need only
    addition operations in the performance critical paths; multiply/divide
    is required only for presentation.

    We could report RTT metrics in microseconds. In fact the mountstats
    format is versioned to accommodate exactly this kind of interface
    improvement.

    For now, however, we'll stay with millisecond precision for
    presentation to maintain backwards compatibility with the handful of
    currently deployed user space tools. At a later point, we'll move to
    an API such as BDI_STATS where a finer timestamp precision can be
    reported.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
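
The arithmetic behind the commit above can be shown with a small model. This is a sketch in Python, not the kernel's ktime API: `RpcMetrics` and `jiffies` are hypothetical, timestamps are plain integer nanoseconds, and the 100HZ tick length is taken from the example in the commit text.

```python
NSEC_PER_MSEC = 1_000_000
TICK_NS = 10_000_000  # one jiffy at HZ=100


class RpcMetrics:
    """Accumulates round-trip time in nanoseconds; converts only when reporting."""

    def __init__(self):
        self.ops = 0
        self.rtt_ns = 0

    def record(self, start_ns, end_ns):
        # Hot path: integer subtract and add only.
        self.ops += 1
        self.rtt_ns += end_ns - start_ns

    def rtt_ms(self):
        # Presentation path: the only place a division happens.
        return self.rtt_ns // NSEC_PER_MSEC


def jiffies(t_ns):
    """Coarse clock model: whole ticks elapsed at time t_ns."""
    return t_ns // TICK_NS


metrics = RpcMetrics()
rtt_jiffies = 0
for i in range(1000):
    start = i * 1_000_000   # one request per millisecond
    end = start + 50_000    # each takes 50 microseconds
    metrics.record(start, end)
    rtt_jiffies += jiffies(end) - jiffies(start)

rtt_jiffies_ms = rtt_jiffies * TICK_NS // NSEC_PER_MSEC
print(metrics.rtt_ms(), rtt_jiffies_ms)
```

In this model the nanosecond accumulator reports 50 ms of total round-trip time, while the tick-based sum reports 0 because no request ever spans a tick boundary.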
     
  • Also have it return an ERR_PTR(-ENOMEM) instead of a null pointer.

    Reviewed-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

16 Dec, 2009

2 commits


11 Sep, 2009

1 commit


13 Jul, 2009

1 commit

  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
      It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

18 Jun, 2009

1 commit


11 Mar, 2009

1 commit

  • We should probably not be testing any flags after we've cleared the
    RPC_TASK_RUNNING flag, since rpc_make_runnable() is then free to assign the
    rpc_task to another workqueue, which may then destroy it.

    We can fix any races with rpc_make_runnable() by ensuring that we only
    clear the RPC_TASK_RUNNING flag while holding the rpc_wait_queue->lock that
    the task is supposed to be sleeping on (and then checking whether or not
    the task really is sleeping).

    Signed-off-by: Trond Myklebust

    Trond Myklebust
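
The locking rule in the commit above can be sketched as follows. This is a userspace model in Python, not the sunrpc code: `WaitQueue`, `sleep_on`, `wake_up` and `try_stop_running` are hypothetical names, with `wake_up` standing in for rpc_make_runnable() and `running` for the RPC_TASK_RUNNING flag.

```python
import threading


class WaitQueue:
    def __init__(self):
        self.lock = threading.Lock()
        self.sleepers = set()


class Task:
    def __init__(self):
        self.running = True


def sleep_on(queue, task):
    with queue.lock:
        queue.sleepers.add(task)


def wake_up(queue, task):
    """Models rpc_make_runnable(): dequeue the task and mark it running."""
    with queue.lock:
        if task in queue.sleepers:
            queue.sleepers.discard(task)
            task.running = True


def try_stop_running(queue, task):
    """Clear the running flag only under the queue lock, and only if still queued."""
    with queue.lock:
        if task not in queue.sleepers:
            return False   # already woken: the caller keeps executing the task
        task.running = False
        return True        # really asleep: the caller must not touch the task again


queue, task = WaitQueue(), Task()
sleep_on(queue, task)
stopped = try_stop_running(queue, task)        # still queued, so this succeeds
wake_up(queue, task)
stopped_again = try_stop_running(queue, task)  # no longer queued, so this fails
print(stopped, stopped_again, task.running)
```

Checking queue membership and clearing the flag under one lock means a concurrent wake-up either happens entirely before the check or entirely after the flag is cleared.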
     

16 Jul, 2008

1 commit


10 Jul, 2008

1 commit


15 Mar, 2008

2 commits


29 Feb, 2008

5 commits


26 Feb, 2008

1 commit

  • An audit of the current RPC timeout functions shows that they don't really
    ever need to run in the softirq context. As long as the softirq is
    able to signal that the wakeup is due to a timeout (which it can do by
    setting task->tk_status to -ETIMEDOUT) then the callback functions can just
    run as standard task->tk_callback functions (in the rpciod/process
    context).

    The only possible borderline case would be xprt_timer() for the case of
    UDP, when the callback is used to reduce the size of the transport
    congestion window. In testing, however, the effect of moving that update
    to a callback would appear to be minor.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
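
The split described in the commit above (the timer only records the reason for the wakeup, and the real work runs later as an ordinary callback) can be sketched like this. It is an illustrative Python model, not kernel code: `timer_fired`, `run_in_process_context` and `on_wakeup` are hypothetical names, and `status` stands in for task->tk_status.

```python
import errno


class Task:
    def __init__(self, callback):
        self.status = 0
        self.callback = callback
        self.runnable = False
        self.log = []


def timer_fired(task):
    """Timer (softirq) side: only record why the task is being woken."""
    task.status = -errno.ETIMEDOUT
    task.runnable = True


def run_in_process_context(task):
    """Scheduler side: run the pending callback outside the timer context."""
    if task.runnable and task.callback is not None:
        cb, task.callback = task.callback, None
        task.runnable = False
        cb(task)


def on_wakeup(task):
    # The timeout handling itself lives here, not in the timer.
    if task.status == -errno.ETIMEDOUT:
        task.log.append("timeout handled")
    else:
        task.log.append("normal wakeup")


task = Task(on_wakeup)
timer_fired(task)
log_before = list(task.log)   # nothing has run yet in the timer itself
run_in_process_context(task)
print(log_before, task.log)
```

The timer path stays trivial, and the callback can tell a timeout from a normal wakeup by inspecting the status it was handed.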