05 Oct, 2006

2 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/davej/configh:
    Remove all inclusions of

    Manually resolved trivial path conflicts due to removed files in
    the sound/oss/ subdirectory.

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6: (292 commits)
    [GFS2] Fix endian bug for de_type
    [GFS2] Initialize SELinux extended attributes at inode creation time.
    [GFS2] Move logging code into log.c (mostly)
    [GFS2] Mark nlink cleared so VFS sees it happen
    [GFS2] Two redundant casts removed
    [GFS2] Remove unneeded endian conversion
    [GFS2] Remove duplicate sb reading code
    [GFS2] Mark metadata reads for blktrace
    [GFS2] Remove iflags.h, use FS_
    [GFS2] Fix code style/indent in ops_file.c
    [GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix
    [GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits)
    [GFS2] inode-diet: Eliminate i_blksize from the inode structure
    [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs)
    [GFS2] Fix typo in last patch
    [GFS2] Fix direct i/o logic in filemap.c
    [GFS2] Fix bug in Makefiles for lock modules
    [GFS2] Remove (extra) fs_subsys declaration
    [GFS2/DLM] Fix trailing whitespace
    [GFS2] Tidy up meta_io code
    ...

    Linus Torvalds
     

04 Oct, 2006

38 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/willy/parisc-2.6: (41 commits)
    [PARISC] Kill wall_jiffies use
    [PARISC] Honour "panic_on_oops" sysctl
    [PARISC] Fix fs/binfmt_som.c
    [PARISC] Export clear_user_page to modules
    [PARISC] Make DMA routines more stubby
    [PARISC] Define pci_get_legacy_ide_irq
    [PARISC] Fix CONFIG_DEBUG_SPINLOCK
    [PARISC] Fix HPUX compat compile with current GCC
    [PARISC] Fix iounmap compile warning
    [PARISC] Add support for Quicksilver AGPGART
    [PARISC] Move LBA and SBA register defines to the common ropes.h
    [PARISC] Create shared header
    [PARISC] Stash the lba_device in its struct device drvdata
    [PARISC] Generalize IS_ASTRO et al to take a parisc_device like
    [PARISC] Pretty print the name of the lba type on kernel boot
    [PARISC] Remove some obsolete comments and I checked that Reo is similar to Ike
    [PARISC] Add hardware found in the rp8400
    [PARISC] Allow nested interrupts
    [PARISC] Further updates to timer_interrupt()
    [PARISC] remove halftick and copy clocktick to local var (gcc can optimize usage)
    ...

    Linus Torvalds
     
  • eCryptfs is a stacked cryptographic filesystem for Linux. It is derived from
    Erez Zadok's Cryptfs, implemented through the FiST framework for generating
    stacked filesystems. eCryptfs extends Cryptfs to provide advanced key
    management and policy features. eCryptfs stores cryptographic metadata in the
    header of each file written, so that encrypted files can be copied between
    hosts; the file will be decryptable with the proper key, and there is no need
    to keep track of any additional information aside from what is already in the
    encrypted file itself.

    [akpm@osdl.org: updates for ongoing API changes]
    [bunk@stusta.de: cleanups]
    [akpm@osdl.org: alpha build fix]
    [akpm@osdl.org: cleanups]
    [tytso@mit.edu: inode-diet updates]
    [pbadari@us.ibm.com: generic_file_*_read/write() interface updates]
    [rdunlap@xenotime.net: printk format fixes]
    [akpm@osdl.org: make slab creation and teardown table-driven]
    Signed-off-by: Phillip Hellewell
    Signed-off-by: Michael Halcrow
    Signed-off-by: Erez Zadok
    Signed-off-by: Adrian Bunk
    Signed-off-by: Stephan Mueller
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Badari Pulavarty
    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Halcrow
     
  • Use all the pieces set up so far to implement referral support, allowing
    return of NFS4ERR_MOVED and fs_locations attribute.

    Signed-off-by: Manoj Naik
    Signed-off-by: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • Encode fs_locations attribute.

    Signed-off-by: Manoj Naik
    Signed-off-by: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • Define FS locations structures, some functions to manipulate them, and add
    code to parse FS locations in downcall and add to the exports structure.

    [bfields@fieldses.org: bunch of fixes and cleanups]
    Signed-off-by: Manoj Naik
    Signed-off-by: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Manoj Naik
     
  • Store the export path in the svc_export structure instead of storing only the
    dentry. This will prevent the need for additional d_path calls to provide
    NFSv4 fs_locations support.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • There is a possible race in d_splice_alias. Though __d_find_alias(inode, 1)
    will only return a dentry with DCACHE_DISCONNECTED set, it is possible for it
    to get cleared before the BUG_ON, and it is not possible to lock against
    that.

    There are a couple of problems here. Firstly, the code doesn't match the
    comment. The comment describes a 'disconnected' dentry as being IS_ROOT as
    well as DCACHE_DISCONNECTED, however there is no testing of IS_ROOT anywhere.

    A dentry is marked DCACHE_DISCONNECTED when allocated with d_alloc_anon, and
    remains DCACHE_DISCONNECTED while a path is built up towards the root. So a
    dentry can have a valid name and a valid parent and even grandparent, but will
    still be DCACHE_DISCONNECTED until a path to the root is created. Once the
    path to the root is complete, everything in the path gets DCACHE_DISCONNECTED
    cleared. So the fact that DCACHE_DISCONNECTED is set isn't enough to say that a
    dentry is free to be spliced in with a given name. This can only be allowed
    if the dentry does not yet have a name, so the IS_ROOT test is needed too.

    However even adding that test to __d_find_alias isn't enough. As
    d_splice_alias drops dcache_lock before calling d_move to perform the splice,
    it could race with another thread calling d_splice_alias to splice the inode
    in with a different name in a different part of the tree (in the case where a
    file has hard links). So that splicing code is only really safe for
    directories (as we know that directories only have one link). For
    directories, the caller of d_splice_alias will be holding i_mutex on the
    (unique) parent so there is no room for a race.

    A consequence of this is that a non-directory will never benefit from being
    spliced into a pre-existing dentry, but that isn't a problem. It is
    perfectly OK for a non-directory to have multiple dentries, some anonymous,
    some not. And the comment for d_splice_alias says that it only happens for
    directories anyway.

    Signed-off-by: Neil Brown
    Cc: Christoph Hellwig
    Cc: Al Viro
    Cc: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
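
The splice condition described in the entry above can be sketched as a small predicate. This is only an illustration under stated assumptions: struct toy_dentry and the toy_* helpers are hypothetical stand-ins, not the kernel's dentry API, and the "is its own parent" test merely models what IS_ROOT checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a dentry; not the kernel's struct. */
struct toy_dentry {
    struct toy_dentry *parent;  /* points to itself while the dentry has no name */
    bool disconnected;          /* models the DCACHE_DISCONNECTED flag */
    bool is_dir;                /* true if the inode is a directory */
};

/* Models IS_ROOT(): a dentry that is its own parent has not been named yet. */
static bool toy_is_root(const struct toy_dentry *d)
{
    return d->parent == d;
}

/*
 * An alias may be reused for splicing only if it is a directory,
 * is still flagged disconnected, and has no name yet.
 */
static bool toy_can_splice(const struct toy_dentry *d)
{
    return d->is_dir && d->disconnected && toy_is_root(d);
}
```

A dentry that already has a parent and a name fails the check even while its disconnected flag is still set, which is the case the message says the flag alone does not cover.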
     
  • totalram is measured in pages, not bytes, so PAGE_SHIFT must be used when
    trying to find 1/4096 of RAM.

    Cc: "J. Bruce Fields"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
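
The arithmetic behind the entry above, as a minimal sketch. The function name and parameters are hypothetical; the only facts taken from the message are that RAM is counted in pages and that the target is 1/4096 of RAM.

```c
#include <stdint.h>

/*
 * Returns 1/4096 of RAM in bytes, given RAM measured in pages.
 * bytes = pages << page_shift, and bytes / 4096 = bytes >> 12,
 * so the result is pages << (page_shift - 12).
 * Assumes page_shift >= 12 (pages of at least 4 KiB).
 */
static uint64_t ram_fraction_bytes(uint64_t total_pages, unsigned page_shift)
{
    return total_pages << (page_shift - 12);
}
```

With 4 KiB pages the shift is zero, so the page count and the byte result happen to coincide; with larger pages, treating the page count as bytes would underestimate the result.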
     
  • If nlm_lookup_host finds what it is looking for it exits with an extra
    reference on the matching 'nsm' structure.

    So don't actually count the reference until we are (fairly) sure it is going
    to be used.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • It is legal to have zero-length NFSv4 acls; they just deny everything.

    Also, nfs4_acl_nfsv4_to_posix will always return with pacl and dpacl set on
    success, so the caller doesn't need to check this.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • There's no need to handle the case where the caller passes in null for pacl or
    dpacl; no caller does that, because it would be a dumb thing to do.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • We can be a little more flexible about the flags allowed for inheritance (in
    particular, we can deal with either the presence or the absence of
    INHERIT_ONLY), but we should probably reject other combinations that we don't
    understand.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • Use a different nfsv4->(draft posix) acl mapping which is
    1. completely backwards compatible,
    2. accepts any nfsv4 acl, and
    3. errs on the side of restricting permissions.

    In detail:

    1. completely backwards compatible: The new mapping produces the
    same result on any acl produced by the existing (draft
    posix)->nfsv4 mapping; the one exception is that we no longer
    attempt to guess the value of the mask by assuming certain denies
    represent the mask. Since the server still keeps track of the mask
    locally, sequences of chmod's will still be handled fine; the only
    thing this will change is sequences of chmod's with intervening
    read-modify-writes of the acl. That last case just isn't worth the
    trouble and the possible misrepresentations of the user's intent
    (if we guess that a certain deny indicates masking is in effect
    when it really isn't).

    2. accepts any nfsv4 acl: That's not quite true: we still reject
    acls that use combinations of inheritance flags that we don't
    support. We also reject acls that attempt to explicitly deny
    read_acl or read_attributes permissions, or that attempt to deny
    write_acl or write_attributes permissions to the owner of the file.

    3. errs on the side of restricting permissions: one exception to
    this last rule: we totally ignore some bits (write_owner,
    synchronize, read_named_attributes, etc.) that are completely alien
    to our filesystem semantics, in some cases even if that would mean
    ignoring an explicit deny that we have no intention of enforcing.
    Excepting that, the posix acl produced should be the most
    permissive acl that is not more permissive than the given nfsv4
    acl.

    And the new code's shorter, too. Neato.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • The previous patch enables some minor simplification here.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • We could be using more common code in exp_pseudoroot(). This will also
    simplify some changes we need to make later.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • Both the (recently introduced) nsm_sema and the older f_sema are converted
    over.

    Cc: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     
  • The NFSACL patches introduced support for multiple RPC services listening on
    the same transport. However, only the first of these services was registered
    with portmapper. This was perfectly fine for nfsacl, as you traditionally do
    not want these to show up in a portmapper listing.

    The patch below changes the default behavior to always register all services
    listening on a given transport, but retains the old behavior for nfsacl
    services.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • nlmclnt_recovery would try to force a portmap rebind by setting
    host->h_nextrebind to 0. The right thing to do here is to set it to the
    current time.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • Every NLM call includes the client's NSM state. Currently, the Linux client
    always reports 0 - which seems not to cause any problems, but is not what the
    protocol says.

    This patch exposes the kernel's internal variable to user space via a sysctl,
    which can be set at system boot time by statd.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • When we send a GRANTED_MSG call, we currently copy the NLM cookie provided in
    the original LOCK call - because in 1996, some broken clients seemed to rely
    on this bug. However, this means the cookies are not unique, so that when the
    client's GRANTED_RES message comes back, we cannot simply match it based on
    the cookie, but have to use the client's IP address in addition. Which breaks
    when you have a multi-homed NFS client.

    The X/Open spec explicitly mentions that clients should not expect the same
    cookie; so one may hope that any clients that were broken in 1996 have either
    been fixed or rendered obsolete.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • The way we incremented the NLM cookie in nlmclnt_next_cookie was not thread
    safe. This patch changes the counter to an atomic_t.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
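
A user-space sketch of the same idea, using C11 atomics in place of the kernel's atomic_t. The names toy_cookie and toy_next_cookie are hypothetical and only illustrate an increment that cannot be torn between threads.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter standing in for the NLM cookie source. */
static atomic_uint toy_cookie;

/* Returns a fresh cookie value; safe to call from several threads at once. */
static uint32_t toy_next_cookie(void)
{
    /* atomic_fetch_add returns the previous value; add 1 to get the new one. */
    return (uint32_t)atomic_fetch_add(&toy_cookie, 1u) + 1u;
}
```

Because the read, add, and write happen as one atomic operation, two concurrent callers cannot receive the same value.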
     
  • This patch adds the nsm_use_hostnames sysctl and module param. If set, lockd
    will use the client's name (as given in the NLM arguments) to find the NSM
    handle. This makes recovery work when the NFS peer is multi-homed, and the
    reboot notification arrives from a different IP than the original lock calls.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • As a result of previous patches, the loop in nlmsvc_invalidate_all just sets
    h_expires for all client/hosts to 0 (though does it in a very complicated
    way).

    This was possibly meant to trigger early garbage collection but half the time
    '0' is in the future and so it in fact delays garbage collection.

    Pre-aging the 'hosts' is not really needed at this point anyway so we throw
    out the loop and nlm_find_client which is no longer needed.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
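
Why '0' can be "in the future": tick counters wrap, so timestamps are compared modulo the counter width. The sketch below follows the style of the kernel's time_after() but uses a fixed 32-bit counter for clarity; tick_after is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wrap-safe "a is later than b" for a 32-bit tick counter, in the style of
 * the kernel's time_after(). The conversion to int32_t relies on the usual
 * two's-complement behaviour.
 */
static bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}
```

When the current tick count lies in the upper half of the range (for example 0xC0000000), tick_after(0, now) is true, so a deadline of 0 has not yet expired. Setting a deadline to the current time avoids the ambiguity.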
     
  • This patch moves the host destruction code out of nlm_host_gc into a function
    of its own.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch makes nlm_traverse{locks,blocks,shares} and friends use a function
    pointer rather than an "action" enum.

    This function pointer is given two nlm_hosts (one given by the caller, the
    other taken from the lock/block/share currently visited), and is free to do
    with them as it wants. If it returns a non-zero value, the lock/block/share
    is released.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
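
The callback-driven traversal described above might look roughly like this. All toy_* names and struct layouts are hypothetical; only the shape (a callback given two hosts, with a non-zero return meaning "release") comes from the message.

```c
#include <stddef.h>

struct toy_host { int id; };    /* hypothetical stand-in for nlm_host */

struct toy_lock {
    struct toy_host *owner;     /* host taken from the visited item */
    int released;
};

/* Callback: receives the caller's host and the host of the visited item. */
typedef int (*toy_match_fn)(struct toy_host *given, struct toy_host *visited);

/* Example callback: match items owned by exactly the given host. */
static int toy_same_host(struct toy_host *given, struct toy_host *visited)
{
    return given == visited;
}

/* Visits every lock and releases it when the callback returns non-zero. */
static size_t toy_traverse_locks(struct toy_lock *locks, size_t n,
                                 struct toy_host *given, toy_match_fn match)
{
    size_t released = 0;

    for (size_t i = 0; i < n; i++) {
        if (!locks[i].released && match(given, locks[i].owner)) {
            locks[i].released = 1;
            released++;
        }
    }
    return released;
}
```

A new policy then only needs a new callback, rather than a new enum value plus another branch in the traversal code.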
     
  • This changes struct nlm_file and the nlm_files hash table to use a hlist
    instead of the home-grown lists.

    This allows us to remove f_hash which was only used to find the right hash
    chain to delete an entry from.

    It also increases the size of the nlm_files hash table from 32 to 128.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
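
Why an hlist makes a field like f_hash unnecessary: each node stores the address of the pointer that points at it, so it can be unlinked without knowing its bucket. The sketch below imitates the kernel's hlist layout with hypothetical toy_* names; it is not the kernel's list.h.

```c
#include <stddef.h>

#define TOY_HASH_SIZE 128   /* bucket count named in the text */

struct toy_hnode {
    struct toy_hnode *next;
    struct toy_hnode **pprev;   /* address of the pointer that points at this node */
};

struct toy_hhead { struct toy_hnode *first; };

/* Inserts n at the front of bucket h. */
static void toy_hlist_add_head(struct toy_hnode *n, struct toy_hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlinks n without needing to know which bucket it lives in. */
static void toy_hlist_del(struct toy_hnode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}
```

Deletion touches only the node and its neighbours, so no per-entry record of the hash chain is required.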
     
  • This patch changes the nlm_blocked list to use a list_node instead of
    homegrown linked list handling.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • Get rid of the home-grown singly linked lists for the nlm_host hash table.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This converts the statd upcalls to use the nsm_handle.

    This means that we only register each host once with statd, rather than
    registering each host/vers/protocol triple.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch makes the SM_NOTIFY handling understand and use the nsm_handle.

    To make it a bit clearer what is happening:

    nlmclnt_prepare_reclaim and nlmclnt_finish_reclaim
    get open-coded into 'reclaimer'

    The result is tidied up.

    Then some of that functionality is moved out into nlm_host_rebooted (which
    calls nlmclnt_recovery which starts a thread which runs reclaimer).

    Also host_rebooted now finds an nsm_handle rather than a host, then
    iterates over all hosts and deals with each host that shares that nsm_handle.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • Cleans up some code in lockd/host.c, fixes an error printk and makes it a
    fatal BUG if nlmsvc_free_host_resources fails.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch introduces the nsm_handle, which is shared by all nlm_host objects
    referring to the same client.

    With this patch applied, all nlm_hosts from the same address will share the
    same nsm_handle. A future patch will add sharing by name.

    Note: this patch changes h_name so that it is no longer guaranteed to be an IP
    address of the host. When the host represents an NFS server, h_name will be
    the name passed in the mount call. When the host represents a client, h_name
    will be the name presented in the lock request received from the client. As
    h_name is only used for printing informational messages, this change should
    not be significant.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch adds the peer's hostname (and name length) to all calls to
    nlm*_lookup_host functions. A subsequent patch will make use of these (if
    requested by a sysctl).

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • Common code from nlm4svc_proc_sm_notify and nlmsvc_proc_sm_notify is moved
    into a new nlm_host_rebooted.

    This is in preparation of a patch that will change the reboot notification
    handling entirely.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch moves all checks of the h_monitored flag into the
    nsm_monitor/unmonitor functions. A subsequent patch will replace the
    mechanism by which we mark a host as being monitored.

    There is still one occurrence of h_monitored outside of mon.c and that is in
    clntlock.c where we respond to a reboot. The subsequent patch will modify
    this too.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • Make the nfsd read-ahead params cache more SMP-friendly by changing the single
    global list and lock into a fixed 16-bucket hashtable with per-bucket locks.
    This reduces spinlock contention in nfsd_read() on read-heavy workloads on
    multiprocessor servers.

    Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients each doing 1K
    streaming reads at full line rate. The server had 128 nfsd threads, which
    sizes the RA cache at 256 entries, of which only a handful were used. Flat
    profiling shows nfsd_read(), including the inlined nfsd_get_raparms(), taking
    10.4% of each CPU. This patch drops the contribution from nfsd() to 1.71% for
    each CPU.

    Signed-off-by: Greg Banks
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Banks
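
The data-structure change described above, sketched in user space. This is an assumption-laden illustration: the toy_ra_* names, the key field, and the modulo hash are hypothetical, and a C11 atomic_flag spinlock stands in for the kernel spinlock. Only the fixed 16 buckets with one lock each come from the message.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RA_BUCKETS 16   /* bucket count named in the text */

struct toy_ra_entry {
    uint32_t key;               /* hypothetical lookup key */
    struct toy_ra_entry *next;
};

struct toy_ra_bucket {
    atomic_flag lock;           /* tiny spinlock protecting this bucket only */
    struct toy_ra_entry *head;
};

static struct toy_ra_bucket toy_ra_table[TOY_RA_BUCKETS];

/* Puts every bucket into a known unlocked, empty state. */
static void toy_ra_init(void)
{
    for (size_t i = 0; i < TOY_RA_BUCKETS; i++) {
        atomic_flag_clear(&toy_ra_table[i].lock);
        toy_ra_table[i].head = NULL;
    }
}

static struct toy_ra_bucket *toy_ra_bucket_for(uint32_t key)
{
    return &toy_ra_table[key % TOY_RA_BUCKETS];
}

static void toy_ra_insert(struct toy_ra_entry *e)
{
    struct toy_ra_bucket *b = toy_ra_bucket_for(e->key);

    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;   /* spin until this bucket's lock is free */
    e->next = b->head;
    b->head = e;
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

static struct toy_ra_entry *toy_ra_find(uint32_t key)
{
    struct toy_ra_bucket *b = toy_ra_bucket_for(key);
    struct toy_ra_entry *e;

    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;   /* only lookups hashing to this bucket contend here */
    for (e = b->head; e != NULL; e = e->next)
        if (e->key == key)
            break;
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
    return e;
}
```

Threads whose keys hash to different buckets never touch the same lock, which is where the reduction in contention comes from.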
     
  • The max possible is the maximum RPC payload. The default depends on amount of
    total memory.

    The value can be set within reason as long as no nfsd threads are currently
    running. The value can also be read, allowing the default to be determined
    after nfsd has started.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • The limit over UDP remains at 32K. Also, make some of the apparently
    arbitrary sizing constants clearer.

    The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of
    the rqstp. This allows it to be different for different protocols (udp/tcp)
    and also allows it to depend on the server's declared sv_bufsz.

    Note that we don't actually increase sv_bufsz for nfs yet. That comes next.

    Signed-off-by: Greg Banks
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Banks
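
A rough sketch of a per-request size limit of the kind described above. The function name, its parameters, and the tcp_cap argument are hypothetical; the message states only that the UDP limit stays at 32K and that the result depends on the protocol and on sv_bufsz, so the exact rule in the real patch may differ.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_UDP (32 * 1024)   /* the 32K UDP limit stated in the text */

/*
 * Picks the protocol-specific cap, then clamps it to the server's buffer size.
 * tcp_cap is a caller-supplied assumption; the text does not give its value.
 */
static size_t toy_max_blksize(bool is_udp, size_t sv_bufsz, size_t tcp_cap)
{
    size_t max = is_udp ? TOY_MAX_UDP : tcp_cap;

    return sv_bufsz < max ? sv_bufsz : max;
}
```

Making the limit a function of the request lets UDP and TCP requests on the same server see different maximum block sizes.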