13 Jul, 2009

1 commit

  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
    It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

03 Jul, 2009

1 commit

  • nfsd_open() gets an unrefcounted pointer to the current process's effective
    credentials at the top of the function, then calls nfsd_setuser() via
    fh_verify() - which may replace and destroy the current process's effective
    credentials - and then passes the unrefcounted pointer to dentry_open() - but
    the credentials may have been destroyed by this point.

    Instead, the value from current_cred() should be passed directly to
    dentry_open() as one of its arguments, rather than being cached in a variable.

    Possibly fh_verify() should return the creds to use.

    This is a regression introduced by
    745ca2475a6ac596e3d8d37c2759c0fbe2586227 "CRED: Pass credentials through
    dentry_open()".

    Signed-off-by: David Howells
    Tested-and-Verified-By: Steve Dickson
    Cc: stable@kernel.org
    Signed-off-by: J. Bruce Fields

    David Howells
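
The stale-pointer hazard described in this commit can be sketched in user space. This is only an illustration under stated assumptions: `struct cred` here is a toy type, and `fh_verify_sim()`, `dentry_open_sim()` and `nfsd_open_sim()` are hypothetical stand-ins, not the kernel functions. The point is that the credentials are looked up only after the step that may replace them, instead of being cached beforehand.

```c
#include <stdlib.h>

/* Hypothetical user-space stand-ins for the kernel objects involved. */
struct cred { int fsuid; };

static struct cred *task_cred;  /* plays the role of the task's current creds */

static const struct cred *current_cred(void) { return task_cred; }

/* Stand-in for fh_verify() -> nfsd_setuser(): installs new creds and
 * destroys the old ones, so any earlier unrefcounted pointer goes stale. */
static int fh_verify_sim(int new_fsuid)
{
    struct cred *new_cred = malloc(sizeof(*new_cred));

    if (!new_cred)
        return -1;
    new_cred->fsuid = new_fsuid;
    free(task_cred);            /* old creds are gone after this line */
    task_cred = new_cred;
    return 0;
}

/* Stand-in for dentry_open(): just reports which identity it was given. */
static int dentry_open_sim(const struct cred *cred) { return cred->fsuid; }

/* Fixed pattern: call current_cred() only after the verify step, and pass
 * the result straight through rather than caching it at function entry. */
static int nfsd_open_sim(int client_fsuid)
{
    if (fh_verify_sim(client_fsuid) != 0)
        return -1;
    return dentry_open_sim(current_cred());
}
```

Caching `current_cred()` in a local before `fh_verify_sim()` would leave that local pointing at freed memory, which is the pattern the commit removes.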
     

23 Jun, 2009

1 commit

  • * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
    SUNRPC: Fix the TCP server's send buffer accounting
    nfsd41: Backchannel: minorversion support for the back channel
    nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
    nfsd41: Remove ip address collision detection case
    nfsd: optimise the starting of zero threads when none are running.
    nfsd: don't take nfsd_mutex twice when setting number of threads.
    nfsd41: sanity check client drc maxreqs
    nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
    NFS: kill off complicated macro 'PROC'
    sunrpc: potential memory leak in function rdma_read_xdr
    nfsd: minor nfsd_vfs_write cleanup
    nfsd: Pull write-gathering code out of nfsd_vfs_write
    nfsd: track last inode only in use_wgather case
    sunrpc: align cache_clean work's timer
    nfsd: Use write gathering only with NFSv2
    NFSv4: kill off complicated macro 'PROC'
    NFSv4: do exact check about attribute specified
    knfsd: remove unreported filehandle stats counters
    knfsd: fix reply cache memory corruption
    knfsd: reply cache cleanups
    ...

    Linus Torvalds
     

19 Jun, 2009

5 commits

  • Prepare to share backchannel code with NFSv4.1.

    Signed-off-by: Andy Adamson
    Signed-off-by: Benny Halevy
    Signed-off-by: Ricardo Labiaga
    [nfsd41: use nfsd4_cb_sequence for callback minorversion]
    Signed-off-by: Benny Halevy
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • Mimic the client and prepare to share the back channel xdr with NFSv4.1.
    Bump the number of operations in each encode routine, then backfill the
    number of operations.

    Signed-off-by: Andy Adamson
    Signed-off-by: Benny Halevy
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    Signed-off-by: J. Bruce Fields

    Andy Adamson
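
The "bump, then backfill" approach to the operation count can be sketched as follows. This is a simplified illustration, not the nfsd callback encoder: `struct cb_hdr_sim` and the `*_sim` helpers are hypothetical, and the opcode values are those RFC 3530 assigns to CB_GETATTR (3) and CB_RECALL (4).

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>  /* htonl, ntohl */

#define OP_CB_GETATTR 3u
#define OP_CB_RECALL  4u

/* Hypothetical header state: where the op count lives, and its value. */
struct cb_hdr_sim {
    uint32_t *nops_p;   /* slot in the buffer reserved for the count */
    uint32_t  nops;     /* operations encoded so far */
};

/* Reserve the count slot with a placeholder and remember its address. */
static uint32_t *encode_hdr_sim(uint32_t *p, struct cb_hdr_sim *hdr)
{
    hdr->nops = 0;
    hdr->nops_p = p;
    *p++ = htonl(0);
    return p;
}

/* Each encode routine bumps the running count. */
static uint32_t *encode_op_sim(uint32_t *p, struct cb_hdr_sim *hdr,
                               uint32_t opcode)
{
    *p++ = htonl(opcode);
    hdr->nops++;
    return p;
}

/* Backfill: write the final count into the reserved slot. */
static void encode_nops_sim(struct cb_hdr_sim *hdr)
{
    *hdr->nops_p = htonl(hdr->nops);
}

/* Encodes a two-operation compound; returns the number of words written. */
static size_t encode_compound_sim(uint32_t *buf)
{
    struct cb_hdr_sim hdr;
    uint32_t *p = encode_hdr_sim(buf, &hdr);

    p = encode_op_sim(p, &hdr, OP_CB_GETATTR);
    p = encode_op_sim(p, &hdr, OP_CB_RECALL);
    encode_nops_sim(&hdr);
    return (size_t)(p - buf);
}
```

With this shape, a routine that adds an extra operation does not need to know the final count in advance.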
     
  • Verified that cthon and pynfs exchange id tests pass (except for the
    two expected fails: EID8 and EID50)

    Signed-off-by: Mike Sager
    Signed-off-by: Benny Halevy
    Signed-off-by: J. Bruce Fields

    Mike Sager
     
  • Currently, if we ask to set the number of nfsd threads to zero when
    there are none running, we set up all the sockets and register the
    service, and then tear it all down again.
    This is pointless.

    So detect that case and exit promptly.
    (Also remove an assignment to 'error' which was never used.)

    Signed-off-by: NeilBrown
    Acked-by: Jeff Layton

    NeilBrown
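
A minimal sketch of the early exit described above, assuming a toy `struct service_sim` in place of the real service state. The names and the setup counter are hypothetical and exist only to make the skipped work observable.

```c
/* Hypothetical service state; 'setups' counts expensive socket and
 * registration setups so that the shortcut can be observed. */
struct service_sim {
    int running;
    int nthreads;
    int setups;
};

static int set_threads_sim(struct service_sim *s, int requested)
{
    if (requested < 0)
        requested = 0;

    /* Zero threads requested and nothing running: nothing to set up
     * and nothing to tear down, so return right away. */
    if (requested == 0 && !s->running)
        return 0;

    if (!s->running) {
        s->setups++;            /* sockets, service registration, ... */
        s->running = 1;
    }
    s->nthreads = requested;
    if (requested == 0)
        s->running = 0;         /* last thread gone: service torn down */
    return 0;
}
```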
     
  • Currently when we write a number to 'threads' in nfsdfs,
    we take the nfsd_mutex, update the number of threads, then take the
    mutex again to read the number of threads.

    Mostly this isn't a big deal. However if we write '0', and
    portmap happens to be dead, then we can get unpredictable behaviour.
    If the nfsd threads all got killed quickly and the last thread is
    waiting for portmap to respond, then the second time we take the mutex
    we will block waiting for the last thread.
    However if the nfsd threads didn't die quite that fast, then there
    will be no contention when we try to take the mutex again.

    Unpredictability isn't fun, and waiting for the last thread to exit is
    pointless, so avoid taking the lock twice.
    To achieve this, make nfsd_svc return a non-negative number of active
    threads when not returning a negative error.

    Signed-off-by: NeilBrown

    NeilBrown
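
The return convention described above (a non-negative thread count on success, a negative errno on failure) can be sketched like this. The counted "mutex" and all `*_sim` names are hypothetical; the counter only shows that the lock is taken once per request.

```c
#include <errno.h>

/* Hypothetical stand-ins: a counted "mutex" and the value it guards. */
static int lock_acquisitions;
static int active_threads;

static void svc_lock_sim(void)   { lock_acquisitions++; }
static void svc_unlock_sim(void) { }

/* Returns the resulting thread count (>= 0) or a negative errno. */
static int svc_set_threads_sim(int requested)
{
    int ret;

    svc_lock_sim();
    if (requested < 0) {
        ret = -EINVAL;
    } else {
        active_threads = requested;
        ret = active_threads;   /* read the result while still locked */
    }
    svc_unlock_sim();
    return ret;
}

/* Caller: reports the count taken from the return value, so it never
 * needs to take the lock a second time just to read it back. */
static int write_threads_sim(int requested, int *reported)
{
    int rv = svc_set_threads_sim(requested);

    if (rv < 0)
        return rv;
    *reported = rv;
    return 0;
}
```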
     

17 Jun, 2009

2 commits


16 Jun, 2009

6 commits


12 Jun, 2009

6 commits


09 Jun, 2009

1 commit


02 Jun, 2009

2 commits

  • J. Bruce Fields wrote:
    ...
    > (This is extremely confusing code to track down: note that
    > proc->pc_decode is set to nfs4svc_decode_compoundargs() by the PROC()
    > macro at the end of fs/nfsd/nfs4proc.c. Which means, for example, that
    > grepping for nfs4svc_decode_compoundargs() gets you nowhere. Patches to
    > kill off that macro would be welcomed....)

    The macro 'PROC' is complicated and obscure; it is better to
    kill it off to make the code clearer.

    Signed-off-by: Yu Zhiguo
    Signed-off-by: J. Bruce Fields

    Yu Zhiguo
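
Why a token-pasting macro defeats grep, and what an explicit table looks like instead, can be sketched as below. All types and functions here are hypothetical simplifications, not the nfsd procedure table.

```c
#include <stddef.h>

typedef int (*xdr_fn_sim)(void *data);

/* Hypothetical, simplified procedure descriptor. */
struct procedure_sim {
    xdr_fn_sim pc_decode;
    xdr_fn_sim pc_encode;
    size_t     pc_argsize;
};

struct compound_args_sim { int opcnt; };

static int decode_void_sim(void *data) { (void)data; return 1; }
static int encode_void_sim(void *data) { (void)data; return 1; }

static int decode_compoundargs_sim(void *data)
{
    struct compound_args_sim *args = data;

    return args->opcnt >= 0;
}

static int encode_compoundres_sim(void *data) { (void)data; return 1; }

/*
 * Macro style, for contrast: the function names are assembled by token
 * pasting, so the full identifiers never appear where they are used.
 *
 *   #define PROC_SIM(argt, rest) \
 *       { decode_##argt##args_sim, encode_##rest##res_sim, ... }
 *
 * Explicit style: every function is named in full, so a grep for
 * decode_compoundargs_sim lands on this table.
 */
static const struct procedure_sim procedures_sim[] = {
    [0] = {
        .pc_decode  = decode_void_sim,
        .pc_encode  = encode_void_sim,
        .pc_argsize = 0,
    },
    [1] = {
        .pc_decode  = decode_compoundargs_sim,
        .pc_encode  = encode_compoundres_sim,
        .pc_argsize = sizeof(struct compound_args_sim),
    },
};
```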
     
  • Server should return NFS4ERR_ATTRNOTSUPP if an attribute specified is
    not supported in current environment.
    Operations CREATE, NVERIFY, OPEN, SETATTR and VERIFY should do this check.

    This bug was found while running the newpynfs tests. The following
    tests failed:
    CR12 NVF7a NVF7b NVF7c NVF7d NVF7f NVF7r NVF7s
    OPEN15 VF7a VF7b VF7c VF7d VF7f VF7r VF7s

    Add function do_check_fattr() to do an exact check:
    1. Check whether the specified attribute is supported by the NFSv4 server.
    2. Check whether FATTR4_WORD0_ACL & FATTR4_WORD0_FS_LOCATIONS are supported
       in the current environment.
    3. Check whether the specified attribute is writable.

    Steps 1 and 3 were previously done in nfsd4_decode_fattr() and have been
    moved to this function.

    Signed-off-by: Yu Zhiguo
    Signed-off-by: J. Bruce Fields

    Yu Zhiguo
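
The three checks listed in this commit can be sketched over a single bitmap word. This is a simplified illustration, not do_check_fattr() itself: the `*_SIM` masks, `struct export_env_sim` and `check_fattr_sim()` are hypothetical, bit positions follow the NFSv4 attribute numbers (size 4, acl 12, fs_locations 24), and returning NFS4ERR_INVAL for a non-writable attribute is an assumption, since the commit message does not name that error.

```c
#include <stddef.h>
#include <stdint.h>

#define NFS4_OK_SIM          0u
#define NFS4ERR_INVAL        22u
#define NFS4ERR_ATTRNOTSUPP  10032u

/* Bit positions follow the NFSv4 attribute numbers. */
#define ATTR_SIZE_SIM          (1u << 4)
#define ATTR_ACL_SIM           (1u << 12)
#define ATTR_FS_LOCATIONS_SIM  (1u << 24)

/* Hypothetical: the attributes this toy server implements at all. */
#define SERVER_SUPPORTED_SIM \
    (ATTR_SIZE_SIM | ATTR_ACL_SIM | ATTR_FS_LOCATIONS_SIM)

/* Hypothetical per-export environment. */
struct export_env_sim {
    int acl_supported;
    int fs_locations_supported;
};

/* 'writable' is NULL for operations that only compare attributes. */
static uint32_t check_fattr_sim(uint32_t bmval,
                                const struct export_env_sim *env,
                                const uint32_t *writable)
{
    /* 1. Attribute not implemented by the server. */
    if (bmval & ~SERVER_SUPPORTED_SIM)
        return NFS4ERR_ATTRNOTSUPP;

    /* 2. Implemented, but unavailable in the current environment. */
    if ((bmval & ATTR_ACL_SIM) && !env->acl_supported)
        return NFS4ERR_ATTRNOTSUPP;
    if ((bmval & ATTR_FS_LOCATIONS_SIM) && !env->fs_locations_supported)
        return NFS4ERR_ATTRNOTSUPP;

    /* 3. Supported but not settable (error code is an assumption). */
    if (writable && (bmval & ~*writable))
        return NFS4ERR_INVAL;

    return NFS4_OK_SIM;
}
```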
     

28 May, 2009

5 commits

  • An nfsd exported file is opened/closed by the kernel causing the
    integrity imbalance message.

    Before a file is opened, there normally is permission checking, which
    is done in inode_permission(). However, as integrity checking requires
    a dentry and mount point, which is not available in inode_permission(),
    the integrity (permission) checking must be called separately.

    In order to detect any missing integrity checking calls, we keep track
    of file open/closes. ima_path_check() increments these counts and
    does the integrity (permission) checking. As a result, the number of
    calls to ima_path_check()/ima_file_free() should be balanced. An extra
    call to fput() indicates the file could have been accessed without first
    calling ima_path_check().

    In nfsv3 permission checking is done once, followed by multiple reads,
    which do an open/close for each read. The integrity (permission) checking
    call should be in nfsd_permission() after the inode_permission() call, but
    as there is no correlation between the number of permission checking and
    open calls, the integrity checking call should not increment the counters,
    but defer it to when the file is actually opened.

    This patch adds:
    - integrity (permission) checking for nfsd exported files in nfsd_permission().
    - a call to increment counts for files opened by nfsd.

    This patch has been updated to return the nfs error types.

    Signed-off-by: Mimi Zohar
    Signed-off-by: James Morris

    Mimi Zohar
     
  • Commit 'Short write in nfsd becomes a full write to the client'
    (31dec2538e45e9fff2007ea1f4c6bae9f78db724) broke sync writes.
    The following commands reproduce the problem:

    $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt
    $ cd /mnt
    $ echo aaaa > temp.txt

    The NFS client then hangs.

    In SYNC mode the server always returns a write count of 0 to the
    client. This is because the value of host_err in nfsd_vfs_write()
    is overwritten in SYNC mode by 'host_err=nfsd_sync(file);',
    and we then return host_err (which is now 0) as the write count.

    This patch fixes the problem.

    Signed-off-by: Wei Yongjun
    Signed-off-by: J. Bruce Fields

    Wei Yongjun
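
The bug pattern described here, one variable doing double duty as byte count and error code, can be sketched as follows. This shows one way to avoid it and is not necessarily how the patch does it; `vfs_write_sim()`, `sync_sim()` and `write_sync_sim()` are hypothetical stand-ins.

```c
#include <stddef.h>

/* Stand-in for the write call: returns bytes written or a negative error. */
static long vfs_write_sim(size_t len) { return (long)len; }

/* Stand-in for the sync step: 0 on success, negative error on failure. */
static long sync_sim(int fail) { return fail ? -5 : 0; }

/* Save the byte count before host_err is reused for the sync result, so a
 * successful sync (which returns 0) cannot zero the reported count. */
static long write_sync_sim(size_t len, int stable, int sync_fails,
                           size_t *cnt)
{
    long host_err = vfs_write_sim(len);

    if (host_err < 0)
        return host_err;
    *cnt = (size_t)host_err;    /* the count is safe from here on */

    host_err = 0;
    if (stable)
        host_err = sync_sim(sync_fails);
    return host_err;
}
```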
     
  • The file nfsfh.c contains two static variables nfsd_nr_verified and
    nfsd_nr_put. These are counters which are incremented as a side
    effect of the fh_verify(), fh_compose() and fh_put() operations,
    i.e. at least twice per NFS call for any non-trivial workload.
    Needless to say this makes the cacheline that contains them (and any
    other innocent victims) a very hot contention point indeed under high
    call-rate workloads on a multiprocessor NFS server. It also turns out
    that these counters are not used anywhere. They're not reported to
    userspace, they're not used in logic, they're not even exported from
    the object file (let alone the module). All they do is waste CPU time.

    So this patch removes them.

    Tests on a 16 CPU Altix A4700 with 2 10gige Myricom cards, configured
    separately (no bonding). Workload is 640 client threads doing directory
    traversals with random small reads, from server RAM.

    Before
    ======

    Kernel profile:

    %       cumulative  self                self    total
    time    samples     samples   calls     1/call  1/call  name
    6.05    2716.00     2716.00   30406     0.09    1.02    svc_process
    4.44    4706.00     1990.00   1975      1.01    1.01    spin_unlock_irqrestore
    3.72    6376.00     1670.00   1666      1.00    1.00    svc_export_put
    3.41    7907.00     1531.00   1786      0.86    1.02    nfsd_ofcache_lookup
    3.25    9363.00     1456.00   10965     0.13    1.01    nfsd_dispatch
    3.10    10752.00    1389.00   1376      1.01    1.01    nfsd_cache_lookup
    2.57    11907.00    1155.00   4517      0.26    1.03    svc_tcp_recvfrom
    ...
    2.21    15352.00    1003.00   1081      0.93    1.00    nfsd_choose_ofc
    Reviewed-by: David Chinner
    Signed-off-by: J. Bruce Fields

    Greg Banks
     
  • Fix a regression in the reply cache introduced when the code was
    converted to use proper Linux lists. When a new entry needs to be
    inserted, the case where all the entries are currently being used
    by threads is not correctly detected. This can result in memory
    corruption and a crash. In the current code this is an extremely
    unlikely corner case; it would require the machine to have 1024
    nfsd threads and all of them to be busy at the same time. However,
    upcoming reply cache changes make this more likely; a crash due to
    this problem was actually observed in the field.

    Signed-off-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Greg Banks
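
The corner case described above, a search that finds every entry busy, can be sketched with a plain array standing in for the LRU list. The state names mirror the usual reply cache states, but the types and functions are hypothetical; the sketch only shows that "no candidate found" has to be detected explicitly before an entry is reused.

```c
#include <stddef.h>

enum rc_state_sim { RC_UNUSED_SIM, RC_INPROG_SIM, RC_DONE_SIM };

/* Hypothetical, simplified cache entry. */
struct cache_entry_sim {
    enum rc_state_sim state;
    unsigned int      xid;
};

/* Returns the first entry not owned by a thread, or NULL when every
 * entry is in progress. */
static struct cache_entry_sim *
find_reusable_sim(struct cache_entry_sim *lru, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (lru[i].state != RC_INPROG_SIM)
            return &lru[i];
    return NULL;
}

/* Returns 1 if the request was cached, 0 if the cache is fully busy.
 * Without the NULL check, the code would go on to write through a
 * pointer that does not refer to a valid entry. */
static int cache_insert_sim(struct cache_entry_sim *lru, size_t n,
                            unsigned int xid)
{
    struct cache_entry_sim *e = find_reusable_sim(lru, n);

    if (!e)
        return 0;
    e->state = RC_INPROG_SIM;
    e->xid = xid;
    return 1;
}
```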
     
  • Make REQHASH() an inline function. Rename hash_list to cache_hash.
    Fix an obsolete comment.

    Signed-off-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Greg Banks
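
Turning a hash macro into an inline function looks roughly like this. The hash shown (fold the top byte of the XID into the low bits, then mask to a power-of-two table size) and the table size of 64 are assumptions for illustration, and `request_hash_sim()` is a hypothetical name.

```c
#include <stdint.h>

#define HASHSIZE_SIM 64u    /* assumed table size, must be a power of two */

/*
 * Macro form, for contrast. The argument is expanded textually and has
 * no declared type:
 *
 *   #define REQHASH_SIM(xid) ((((xid) >> 24) ^ (xid)) & (HASHSIZE_SIM - 1))
 *
 * Inline form: the argument is evaluated once, has a declared type, and
 * the compiler can still inline the call.
 */
static inline unsigned int request_hash_sim(uint32_t xid)
{
    uint32_t h = xid;

    h ^= (xid >> 24);
    return h & (HASHSIZE_SIM - 1);
}
```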
     

12 May, 2009

1 commit


07 May, 2009

2 commits


04 May, 2009

3 commits


02 May, 2009

4 commits

  • Move this out of a local variable into the nfs4_delegation object in
    preparation for making this an async rpc call (at which point we'll need
    any state like this in a common object that's preserved across function
    calls).

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • There's no point in keeping this field around--it's always zero.

    (Background: the protocol allows you to tell the client that the file is
    about to be truncated, as an optimization to save the client from
    writing back dirty pages that will just be discarded. We don't
    implement this hint. If we do some day, adding this field back in will
    be the least of the work involved.)

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • The nfs4_cb_recall struct is used only in nfs4_delegation, so its
    pointer to the containing delegation is unnecessary--we could just use
    container_of().

    But there's no real reason to have this a separate struct at all--just
    move these fields to nfs4_delegation.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
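
The container_of() remark can be illustrated with a small user-space sketch. The macro below is a simplified version without the kernel's type checking, and both structs are hypothetical: when one struct is only ever embedded in another, the outer object can be recovered from the inner pointer, so a back pointer is redundant.

```c
#include <stddef.h>

/* Simplified user-space container_of (no type checking). */
#define container_of_sim(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical embedded struct, with no back pointer to its owner. */
struct recall_sim {
    int ident;
};

/* Hypothetical owner embedding it by value. */
struct delegation_sim {
    int               dl_id;
    struct recall_sim dl_recall;
};

/* Recover the owning delegation from a pointer to its embedded member. */
static struct delegation_sim *delegation_of_sim(struct recall_sim *r)
{
    return container_of_sim(r, struct delegation_sim, dl_recall);
}
```

Moving the fields of the inner struct directly into the outer one, as this commit does, removes the need for even this lookup.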
     
  • I want to use the name for a struct that actually does represent a
    single callback.

    (Actually, I've never been sure it helps to have a separate struct for the
    callback information. Some day maybe those fields could just be dumped
    into struct nfs4_client. I don't know.)

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields