02 Jul, 2013

2 commits

  • commit d202cce8963d9268ff355a386e20243e8332b308
    sunrpc: never return expired entries in sunrpc_cache_lookup

    moved the 'entry is expired' test from cache_check to
    sunrpc_cache_lookup, so that it happened early and some races could
    safely be ignored.

    However the ip_map (in svcauth_unix.c) has a separate single-item
    cache which allows quick lookup without locking. An entry in this
    cache would not be subject to the expiry test and so could be used
    well after it has expired.

    This is not normally a big problem because the first time it is used
    after it is expired an up-call will be scheduled to refresh the entry
    (if it hasn't been scheduled already) and the old entry will then
    be invalidated. So on the second attempt to use it after it has
    expired, ip_map_cached_get will discard it.

    However that is subtle and not ideal, so replace the "!cache_valid"
    test with "cache_is_expired".
    In doing this we drop the test on the "CACHE_VALID" bit. This is
    unnecessary as the bit is never cleared, and an entry will only
    be cached if the bit is set.

    Reported-by: Bodo Stroesser
    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
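
The expiry test on a single-item cache can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `struct entry`, `struct one_slot`, `entry_is_expired()` and `slot_get()` are hypothetical names, not the kernel's `ip_map_cached_get()` or `cache_is_expired()`, and locking and reference counting are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a cache entry; not the kernel's struct cache_head. */
struct entry {
    time_t expiry_time;   /* the entry is stale once 'now' is past this time */
    int    value;
};

/* Hypothetical single-item cache sitting in front of the main cache. */
struct one_slot {
    struct entry *cached;
};

/* Simplified expiry test: true once 'now' has passed the expiry time. */
static bool entry_is_expired(const struct entry *e, time_t now)
{
    assert(e != NULL);
    return e->expiry_time < now;
}

/*
 * Return the cached entry only if it has not expired. An expired entry
 * is dropped from the slot so the caller falls back to the full lookup.
 */
static struct entry *slot_get(struct one_slot *slot, time_t now)
{
    struct entry *e;

    assert(slot != NULL);
    e = slot->cached;
    if (e == NULL)
        return NULL;
    if (entry_is_expired(e, now)) {
        slot->cached = NULL;
        return NULL;
    }
    return e;
}
```

With an entry whose `expiry_time` is 100, `slot_get()` returns the entry at time 50 and NULL at time 101, so the stale entry is rejected on the first attempt after expiry rather than the second.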
     
  • It is possible for a race to set CACHE_PENDING after cache_clean()
    has removed a cache entry from the cache.
    If CACHE_PENDING is still set when the entry is finally 'put',
    the cache_dequeue() will never happen and we can leak memory.

    So set a new flag 'CACHE_CLEANED' when we remove something from
    the cache, and don't queue any upcall if it is set.

    If CACHE_PENDING is set before CACHE_CLEANED, the call that
    cache_clean() makes to cache_fresh_unlocked() will free memory
    as needed. If CACHE_PENDING is set after CACHE_CLEANED, the
    test in sunrpc_cache_pipe_upcall will ensure that the memory
    is not allocated.

    Reported-by:
    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
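
The two orderings described above can be illustrated with a small single-threaded sketch. The names here (`FLAG_PENDING`, `FLAG_CLEANED`, `queue_upcall()`, `clean_entry()`) are hypothetical stand-ins, and the counter merely represents memory held by queued upcall requests; the real code uses atomic bit operations and locking that this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_PENDING 0x1u   /* an upcall has been queued for this entry */
#define FLAG_CLEANED 0x2u   /* the entry has been removed from the cache */

struct entry {
    unsigned int flags;
    int queued_upcalls;     /* stands in for memory held by queued requests */
};

/* Queue an upcall unless the entry was already cleaned. */
static bool queue_upcall(struct entry *e)
{
    assert(e != NULL);
    if (e->flags & FLAG_CLEANED)
        return false;       /* nothing is allocated, so nothing can leak */
    e->flags |= FLAG_PENDING;
    e->queued_upcalls++;
    return true;
}

/* Remove the entry from the cache and release anything already queued. */
static void clean_entry(struct entry *e)
{
    assert(e != NULL);
    e->flags |= FLAG_CLEANED;
    if (e->flags & FLAG_PENDING) {
        e->flags &= ~FLAG_PENDING;
        e->queued_upcalls = 0;
    }
}
```

If `queue_upcall()` runs first, `clean_entry()` releases the queued request; if `clean_entry()` runs first, `queue_upcall()` refuses to queue. Either way nothing stays queued after the entry has left the cache.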
     

15 Feb, 2013

2 commits


15 Nov, 2012

1 commit

  • Commit bbf43dc888833ac0539e437dbaeb28bfd4fbab9f "sunrpc/cache.h: replace
    simple_strtoul" introduced new range-checking which could cause get_int
    to fail on unsigned integers too large to be represented as an int.

    We could parse them as unsigned instead--but it turns out svcgssd is
    actually passing down "-1" in some cases. Which is perhaps stupid, but
    there's nothing we can do about it now.

    So just revert back to the previous "sloppy" behavior that accepts
    either representation.

    Cc: stable@vger.kernel.org
    Reported-by: Sven Geggus
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
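
A userspace sketch of the "sloppy" behavior, assuming a 32-bit result: both "-1" and "4294967295" are accepted and end up as the same value. `sloppy_get_int()` is a hypothetical helper written for illustration; it uses `strtoll()` instead of the kernel's parsing routines and returns -1 on error rather than a kernel error code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse 's' as a number and keep only the low 32 bits, so a value that
 * does not fit in a signed int wraps around instead of being rejected.
 * Returns 0 on success, -1 on empty input or trailing garbage.
 */
static int sloppy_get_int(const char *s, int32_t *out)
{
    char *end;
    long long v;
    uint32_t bits;

    assert(s != NULL && out != NULL);
    if (*s == '\0')
        return -1;
    v = strtoll(s, &end, 0);
    if (*end != '\0')
        return -1;
    bits = (uint32_t)v;                 /* well-defined: reduced modulo 2^32 */
    memcpy(out, &bits, sizeof(*out));   /* reinterpret the bits as signed */
    return 0;
}
```

Calling `sloppy_get_int("4294967295", &v)` and `sloppy_get_int("-1", &v)` both succeed and leave -1 in `v`, which is the kind of tolerance the revert restores.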
     

25 Jul, 2012

1 commit


12 Jul, 2012

2 commits

  • This patch replaces the usage of simple_strtoul with kstrtoint in
    get_int(), since the simple_str* family doesn't account for overflow
    and is deprecated.
    Also, in this specific case, the long from strtol is silently converted
    to an int by the caller.

    As Joe Perches suggested, this patch also removes
    the redundant temporary variable rv, since kstrtoint() will not write to
    anint unless it's successful.

    Cc: Joe Perches
    Signed-off-by: Eldad Zack
    Signed-off-by: J. Bruce Fields

    Eldad Zack
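
The property relied on above, that a strict parser leaves its output untouched on failure, can be sketched in userspace C. `strict_parse_int()` is a hypothetical helper built on `strtol()`; it is not `kstrtoint()` and does not claim to match its exact input rules.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse 's' as an int with overflow checking. '*out' is written only
 * when parsing succeeds, so the caller needs no temporary variable.
 * Returns 0 on success, -1 on any error.
 */
static int strict_parse_int(const char *s, int *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, &end, 0);
    if (end == s || *end != '\0')
        return -1;                      /* empty input or trailing garbage */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                      /* does not fit in an int */
    *out = (int)v;
    return 0;
}
```

For example, with `int v = 99;`, a call to `strict_parse_int("4294967295", &v)` fails and `v` still holds 99.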
     
  • Neaten code style in get_int().
    Also use sizeof() instead of hard coded number as suggested by
    Joe Perches.

    Cc: Joe Perches
    Signed-off-by: Eldad Zack
    Signed-off-by: J. Bruce Fields

    Eldad Zack
     

01 Feb, 2012

3 commits


04 Jan, 2012

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
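
For readers unfamiliar with the operation being consolidated, here is a sketch of increment-unless-zero semantics using C11 atomics. `inc_not_zero()` is a hypothetical userspace analogue, not the kernel's `atomic_inc_not_zero()` implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Increment '*v' unless it is zero. Returns true if the increment
 * happened. Typically used to take a reference on an object only
 * while at least one other reference is still held.
 */
static bool inc_not_zero(atomic_int *v)
{
    int old;

    assert(v != NULL);
    old = atomic_load(v);
    while (old != 0) {
        /* On failure 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}
```

A counter holding 1 becomes 2 and the call returns true; a counter holding 0 is left at 0 and the call returns false.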
     

16 Jul, 2011

1 commit

  • As promised in feature-removal-schedule.txt it is time to
    remove the nfsctl system call.

    Userspace has preferred not to use this call throughout 2.6 and it has been
    excluded in the default configuration since 2.6.36 (9 months ago).

    So this patch removes all the code that was being compiled out.

    There are still references to sys_nfsctl in various arch system call tables
    and related code. These should be cleaned out too, probably in the next
    merge window.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
     

31 Mar, 2011

1 commit


15 Jan, 2011

1 commit

  • * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
    nfsd4: fix callback restarting
    nfsd: break lease on unlink, link, and rename
    nfsd4: break lease on nfsd setattr
    nfsd: don't support msnfs export option
    nfsd4: initialize cb_per_client
    nfsd4: allow restarting callbacks
    nfsd4: simplify nfsd4_cb_prepare
    nfsd4: give out delegations more quickly in 4.1 case
    nfsd4: add helper function to run callbacks
    nfsd4: make sure sequence flags are set after destroy_session
    nfsd4: re-probe callback on connection loss
    nfsd4: set sequence flag when backchannel is down
    nfsd4: keep finer-grained callback status
    rpc: allow xprt_class->setup to return a preexisting xprt
    rpc: keep backchannel xprt as long as server connection
    rpc: move sk_bc_xprt to svc_xprt
    nfsd4: allow backchannel recovery
    nfsd4: support BIND_CONN_TO_SESSION
    nfsd4: modify session list under cl_lock
    Documentation: fl_mylease no longer exists
    ...

    Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The
    vfs-scale work touched some msnfs cases, and this merge removes support
    for that entirely, so the conflict was trivial to resolve.

    Linus Torvalds
     

11 Jan, 2011

1 commit


05 Jan, 2011

1 commit


27 Sep, 2010

1 commit


22 Sep, 2010

1 commit


08 Sep, 2010

3 commits

  • The current practice of waiting for cache updates by queueing the
    whole request to be retried has (at least) two problems.

    1/ With NFSv4, requests can be quite complex and re-trying a whole
    request when a later part fails should only be a last-resort, not a
    normal practice.

    2/ Large requests, and in particular any 'write' request, will not be
    queued by the current code and doing so would be undesirable.

    In many cases only a very short wait is needed before the cache gets
    valid data.

    So, providing the underlying transport permits it by setting
    ->thread_wait,
    arrange to wait briefly for an upcall to be completed (as reflected in
    the clearing of CACHE_PENDING).
    If the short wait was not long enough and CACHE_PENDING is still set,
    fall back on the old approach.

    The 'thread_wait' value is set to 5 seconds when there are spare
    threads, and 1 second when there are no spare threads.

    These values are probably much higher than needed, but will ensure
    some forward progress.

    Note that as we only request an update for a non-valid item, and as
    non-valid items are updated in place it is extremely unlikely that
    cache_check will return -ETIMEDOUT. Normally cache_defer_req will
    sleep for a short while and then find that the item is_valid.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
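
The wait-then-fall-back decision can be modelled with a short simulation. Everything here is a hypothetical sketch: `thread_wait_seconds()` uses the 5 and 1 second values quoted above, `struct upcall` simulates how long userspace takes to answer, and no real sleeping or CACHE_PENDING handling takes place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_outcome {
    USE_ENTRY,      /* the upcall completed within the short wait */
    DEFER_REQUEST   /* still pending: fall back to queueing the request */
};

/* Simulated upcall: how many seconds until userspace answers. */
struct upcall {
    int completes_after;
};

/* Wait budget in seconds, using the values from the commit message. */
static int thread_wait_seconds(bool spare_threads)
{
    return spare_threads ? 5 : 1;
}

/* Decide what a thread does after waiting for at most its budget. */
static enum wait_outcome wait_for_upcall(const struct upcall *u,
                                         bool spare_threads)
{
    assert(u != NULL);
    return (u->completes_after <= thread_wait_seconds(spare_threads))
               ? USE_ENTRY
               : DEFER_REQUEST;
}
```

An upcall that takes 3 simulated seconds is waited out when spare threads exist, but leads to the old deferral path when none are spare.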
     
  • This protects us from confusion when the wallclock time changes.

    We convert to and from wallclock when setting or reading expiry
    times.

    Also use seconds since boot for last_close time.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
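
A minimal sketch of the conversion, assuming the boot time is known as a wallclock value. `to_since_boot()` and `to_wallclock()` are hypothetical helper names chosen for illustration, not the functions added by the patch.

```c
#include <assert.h>
#include <time.h>

/* Convert a wallclock expiry time to seconds since boot for storage. */
static time_t to_since_boot(time_t wallclock, time_t boot_time)
{
    assert(wallclock >= boot_time);
    return wallclock - boot_time;
}

/* Convert a stored boot-relative time back to wallclock for reporting. */
static time_t to_wallclock(time_t since_boot, time_t boot_time)
{
    return boot_time + since_boot;
}
```

Suppose the system booted at wallclock 1000 and an entry expires at wallclock 1600: the stored value is 600. If the clock is later stepped forward by an hour, the boot time expressed in wallclock terms becomes 4600, the stored 600 is unaffected, and it reads back as 5200, so the entry's remaining lifetime does not change.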
     
  • Rather than duplicating this idiom twice, put it in an inline function.
    This reduces the usage of 'expiry_time' outside the sunrpc/cache.c
    code and thus the impact of a change that is about to be made to that
    field.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
     

07 Jul, 2010

1 commit

  • This patch makes the cache_cleaner workqueue deferrable, to prevent
    unnecessary system wake-ups, which is very important for embedded
    battery-powered devices.

    do_cache_clean() is called every 30 seconds at the moment, and often
    makes the system wake up from its power-save sleep state. With this
    change, when the workqueue uses a deferrable timer, the
    do_cache_clean() invocation will be delayed and combined with the
    closest "real" wake-up. This improves the power consumption situation.

    Note, I tried to create a DECLARE_DELAYED_WORK_DEFERRABLE() helper
    macro, similar to DECLARE_DELAYED_WORK(), but failed because of the
    way the timer wheel core stores the deferrable flag (it is the
    LSBit in the timer->base pointer). My attempt to define a static
    variable with this bit set ended up with the "initializer element is
    not constant" error.

    Thus, I have to use run-time initialization, so I created a new
    cache_initialize() function which is called once when sunrpc is
    being initialized.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: J. Bruce Fields

    Artem Bityutskiy
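
Storing a flag in the least significant bit of a pointer, as described above, can be sketched as follows. The names are hypothetical and the layout is only an assumption for illustration; the point is that the tagged value is computed from an address at run time, which is presumably why a static initializer could not express it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEFERRABLE_FLAG ((uintptr_t)1)

/* Hypothetical base object; it contains an int, so on common platforms
 * its address is at least 2-byte aligned and the low bit is free. */
struct timer_base {
    int dummy;
};

/* Tag a base pointer with the deferrable flag in its low bit. */
static uintptr_t base_set_deferrable(struct timer_base *base)
{
    uintptr_t bits = (uintptr_t)base;

    assert((bits & DEFERRABLE_FLAG) == 0);   /* low bit must be unused */
    return bits | DEFERRABLE_FLAG;
}

/* Test whether a tagged value carries the deferrable flag. */
static bool base_is_deferrable(uintptr_t tagged)
{
    return (tagged & DEFERRABLE_FLAG) != 0;
}

/* Recover the real pointer by masking the flag off. */
static struct timer_base *base_pointer(uintptr_t tagged)
{
    return (struct timer_base *)(tagged & ~DEFERRABLE_FLAG);
}
```

Tagging the address of a `struct timer_base` and then calling `base_pointer()` yields the original pointer, while `base_is_deferrable()` reports the flag.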
     

10 Aug, 2009

4 commits


24 Apr, 2008

1 commit


02 Feb, 2008

2 commits


10 Oct, 2007

1 commit


04 Oct, 2006

1 commit

  • Speed up high call-rate workloads by caching the struct ip_map for the peer on
    the connected struct svc_sock instead of looking it up in the ip_map cache
    hashtable on every call. This helps workloads using AUTH_SYS authentication
    over TCP.

    Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
    synthetic client threads simulating an rsync (i.e. recursive directory
    listing) workload reading from an i386 RH9 install image (161480 regular files
    in 10841 directories) on the server. That tree is small enough to fit in the
    server's RAM so no disk traffic was involved. This setup gives a sustained
    call rate in excess of 60000 calls/sec before being CPU-bound on the server.

    Profiling showed strcmp(), called from ip_map_match(), was taking 4.8% of each
    CPU, and ip_map_lookup() was taking 2.9%. This patch drops both contributions
    into the profile noise.

    Note that the above result overstates the value of this patch for most
    workloads. The synthetic clients are all using separate IP addresses, so
    there are 64 entries in the ip_map cache hash. Because the kernel measured
    contained the bug fixed in commit

    commit 1f1e030bf75774b6a283518e1534d598e14147d4

    and was running on a 64bit little-endian machine, probably all of those 64
    entries were on a single chain, thus increasing the cost of ip_map_lookup().

    With a modern kernel you would need more clients to see the same amount of
    performance improvement. This patch has helped to scale knfsd to handle a
    deployment with 2000 NFS clients.

    Signed-off-by: Greg Banks
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Banks
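
The idea of remembering the lookup result on the connection can be sketched with a toy table. All names (`struct ip_entry`, `table_lookup()`, `struct connection`, `conn_get_entry()`) are hypothetical, the linear `strcmp()` scan merely stands in for the hashed lookup, and expiry and reference counting are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from a peer address to an auth domain id. */
struct ip_entry {
    const char *addr;
    int domain_id;
};

static struct ip_entry table[] = {
    { "10.0.0.1", 1 },
    { "10.0.0.2", 2 },
    { "10.0.0.3", 3 },
};

static int slow_lookups;    /* counts trips through the slow path */

/* Slow path: string comparison against each candidate entry. */
static struct ip_entry *table_lookup(const char *addr)
{
    size_t i;

    slow_lookups++;
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(table[i].addr, addr) == 0)
            return &table[i];
    return NULL;
}

/* A connected socket remembers the entry found for its peer. */
struct connection {
    const char *peer_addr;
    struct ip_entry *cached;
};

/* Fast path: reuse the cached entry; look it up only the first time. */
static struct ip_entry *conn_get_entry(struct connection *c)
{
    assert(c != NULL);
    if (c->cached == NULL)
        c->cached = table_lookup(c->peer_addr);
    return c->cached;
}
```

Three calls to `conn_get_entry()` on the same connection take the slow path once; the other two return the cached pointer without any string comparison.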
     

28 Mar, 2006

6 commits


08 Sep, 2005

1 commit

  • When registering an RPC cache, cache_register() always sets the owner as the
    sunrpc module. However, there are RPC caches owned by other modules. With
    the incorrect owner setting, the real owning module can be removed potentially
    with an open reference to the cache from userspace.

    For example, if one were to stop the nfs server and unmount the nfsd
    filesystem, the nfsd module could be removed even though rpc.idmapd had
    references to the idtoname and nametoid caches (i.e.
    /proc/net/rpc/nfs4./channel is still open). This resulted in a
    system panic on one of our machines when attempting to restart the nfs
    services after reloading the nfsd module.

    The following patch adds a 'struct module *owner' field in struct
    cache_detail. The owner is further assigned to the struct proc_dir_entry
    in cache_register() so that the module cannot be unloaded while user-space
    daemons have an open reference on the associated file under /proc.

    Signed-off-by: Bruce Allan
    Cc: Trond Myklebust
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bruce Allan