16 Jul, 2008

11 commits

  • Trond Myklebust
     
  • Conflicts:

    fs/nfs/file.c

    Fix up the conflict with Jon Corbet's bkl-removal tree

    Trond Myklebust
     
  • Push it into those callback functions that actually need it.

    Note that all the NFS operations use their own locking, so don't need the
    BKL. Ditto for the rpcbind client.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Introduce a new API to register RPC services on IPv6 interfaces to allow
    the NFS server and lockd to advertise on IPv6 networks.

    Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind
    protocol version 4 to contact the local rpcbind daemon. The version 4
    SET/UNSET procedures allow services to register address families besides
    AF_INET, register at specific network interfaces, and register transport
    protocols besides UDP and TCP. All of this functionality is exposed via
    the new rpcb_v4_register() kernel API.

    A user-space rpcbind daemon implementation that supports version 4 of the
    rpcbind protocol is required in order to make use of this new API.

    Note that rpcbind version 3 is sufficient to support the new rpcbind
    facilities listed above, but most extant implementations use version 4.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • rpcbind version 4 registration will reuse part of rpcb_register, so just
    split it out into a separate function now.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: Callers that required a privileged source port now use
    rpcb_create_local(), so we can remove the @privileged argument from
    rpcb_create().

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Add rpcb_create_local() for use by rpcb_register() and upcoming IPv6
    registration functions.

    Ensure any errors encountered by rpcb_create_local() are properly
    reported.

    We can also use a statically allocated constant loopback socket address
    instead of one allocated on the stack and initialized every time the
    function is called.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The rpcbind versions 3 and 4 SET and UNSET procedures use the same
    arguments as the GETADDR procedure.

    While definitely a bug, this hasn't been a problem so far since the
    kernel hasn't used version 3 or 4 SET and UNSET. But this will change
    in just a moment.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • …l/git/tip/linux-2.6-tip

    * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    generic-ipi: more merge fallout
    generic-ipi: merge fix
    x86, visws: use mach-default/entry_arch.h
    x86, visws: fix generic-ipi build
    generic-ipi: fixlet
    generic-ipi: fix s390 build bug
    generic-ipi: fix linux-next tree build failure
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix "smp_call_function: get rid of the unused nonatomic/retry argument"
    on_each_cpu(): kill unused 'retry' parameter
    smp_call_function: get rid of the unused nonatomic/retry argument
    sh: convert to generic helpers for IPI function calls
    parisc: convert to generic helpers for IPI function calls
    mips: convert to generic helpers for IPI function calls
    m32r: convert to generic helpers for IPI function calls
    arm: convert to generic helpers for IPI function calls
    alpha: convert to generic helpers for IPI function calls
    ia64: convert to generic helpers for IPI function calls
    powerpc: convert to generic helpers for IPI function calls
    ...

    Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually

    Linus Torvalds
     
  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Ingo Molnar
     

15 Jul, 2008

3 commits


14 Jul, 2008

1 commit


11 Jul, 2008

7 commits

  • Conflicts:

    include/linux/rculist.h
    kernel/rcupreempt.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
    tun: Persistent devices can get stuck in xoff state
    xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_info
    ipv6: missed namespace context in ipv6_rthdr_rcv
    netlabel: netlink_unicast calls kfree_skb on error path by itself
    ipv4: fib_trie: Fix lookup error return
    tcp: correct kcalloc usage
    ip: sysctl documentation cleanup
    Documentation: clarify tcp_{r,w}mem sysctl docs
    netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP
    netfilter: nf_conntrack_tcp: fix endless loop
    libertas: fix memory alignment problems on the blackfin
    zd1211rw: stop beacons on remove_interface
    rt2x00: Disable synchronization during initialization
    rc80211_pid: Fix fast_start parameter handling
    sctp: Add documentation for sctp sysctl variable
    ipv6: fix race between ipv6_del_addr and DAD timer
    irda: Fix netlink error path return value
    irda: New device ID for nsc-ircc
    irda: via-ircc proper dma freeing
    sctp: Mark the tsn as received after all allocations finish
    ...

    Linus Torvalds
     
  • Add a XFRM_STATE_AF_UNSPEC flag to handle the AF_UNSPEC behavior for
    the selector family. Userspace applications can set this flag to leave
    the selector family of the xfrm_state unspecified. This can be used
    to handle inter-family tunnels if the selector is not set from
    userspace.

    Signed-off-by: Steffen Klassert
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • So, no need to kfree_skb here on the error path. In this case we can
    simply return.

    Signed-off-by: Denis V. Lunev
    Acked-by: Paul Moore
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • In commit a07f5f508a4d9728c8e57d7f66294bf5b254ff7f "[IPV4] fib_trie: style
    cleanup", the changes to check_leaf() and fn_trie_lookup() were wrong - where
    fn_trie_lookup() would previously return a negative error value from
    check_leaf(), it now returns 0.

    Now fn_trie_lookup() doesn't appear to care about plen, so we can revert
    check_leaf() to returning the error value.

    Signed-off-by: Ben Hutchings
    Tested-by: William Boughton
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • kcalloc is supposed to be called with the count as its first argument and
    the element size as the second.

    Signed-off-by: Milton Miller
    Signed-off-by: David S. Miller

    Milton Miller
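The argument order described above can be illustrated with a minimal user-space sketch. It uses the standard `calloc()` as a stand-in for `kcalloc()`; the names `zalloc_array`, `sample_entry` and `alloc_table` are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-in for a kcalloc()-style allocator:
 * element count first, element size second, zeroed memory on success.
 */
void *zalloc_array(size_t n, size_t size)
{
    /* Reject requests where n * size would overflow size_t. */
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    return calloc(n, size);
}

/* Hypothetical element type, used only for illustration. */
struct sample_entry {
    int key;
    long value;
};

/* Allocate a zeroed table of n entries: count first, then sizeof. */
struct sample_entry *alloc_table(size_t n)
{
    return zalloc_array(n, sizeof(struct sample_entry));
}
```

Swapping the two arguments usually yields the same byte count, so the mistake tends to go unnoticed; keeping the conventional order makes the call easier to read and audit.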
     

10 Jul, 2008

18 commits

  • David S. Miller
     
  • Fix a range check in netfilter IP NAT for SNMP to always use a big enough size
    variable that the compiler won't moan about comparing it to ULONG_MAX/8 on a
    64-bit platform.

    Signed-off-by: David Howells
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    David Howells
     
  • When a conntrack entry is destroyed in process context and destruction
    is interrupted by packet processing and the packet is an attempt to
    reopen a closed connection, TCP conntrack tries to kill the old entry
    itself and returns NF_REPEAT to pass the packet through the hook
    again. This may lead to an endless loop: TCP conntrack repeatedly
    finds the old entry, but can not kill it itself since destruction
    is already in progress, but destruction in process context can not
    complete since TCP conntrack is keeping the CPU busy.

    Drop the packet in TCP conntrack if we can't kill the connection
    ourselves to avoid this.

    Reported by: hemao77@gmail.com [ Kernel bugzilla #11058 ]
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • This removes the fast_start parameter from the rc_pid parameters
    information and instead uses the parameter macro when initializing
    the rc_pid state. Since the parameter is only used on initialization,
    there is no point in exporting it via debugfs. This also fixes
    uninitialized memory references to the fast_start and norm_offset
    parameters detected by the kmemcheck utility. Thanks to Vegard Nossum
    for reporting the bug.

    Signed-off-by: Mattias Nissler
    Signed-off-by: John W. Linville

    Mattias Nissler
     
  • If another task is busy in rpcb_getport_async(), it is more efficient
    to have it wake us up when it has finished instead of arbitrarily sleeping
    for 5 seconds.

    Also ensure that rpcb_wake_rpcbind_waiters() is called regardless of
    whether or not rpcb_getport_done() gets called.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Some server vendors support the higher versions of rpcbind only for
    AF_INET6. The kernel doesn't need to use v3 or v4 for AF_INET anyway,
    so change the kernel's rpcbind client to query AF_INET servers over
    rpcbind v2 only.

    This has a few interesting benefits:

    1. If the rpcbind request is going over TCP, and the server doesn't
    support rpcbind versions 3 or 4, the client reduces by two the number
    of ephemeral ports left in TIME_WAIT for each rpcbind request. This
    will help during NFS mount storms.

    2. The rpcbind interaction with servers that don't support rpcbind
    versions 3 or 4 will use less network traffic. Also helpful
    during mount storms.

    3. We can eliminate the kernel build option that controls whether the
    kernel's rpcbind client uses rpcbind version 3 and 4 for AF_INET
    servers. Less complicated kernel configuration...

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
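As a rough sketch of the policy described above, the choice of rpcbind versions can be keyed on the address family. The function and table names below are hypothetical, and the order shown for AF_INET6 (version 4, then version 3) is an assumption; the commit message only states that AF_INET servers are queried over version 2.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Illustrative rpcbind protocol version numbers. */
enum { SKETCH_RPCBVERS_2 = 2, SKETCH_RPCBVERS_3 = 3, SKETCH_RPCBVERS_4 = 4 };

/* Versions to try, in order, terminated by 0. */
static const unsigned int sketch_versions_inet[] = {
    SKETCH_RPCBVERS_2, 0
};

/* Assumed fallback order for IPv6 servers: v4 first, then v3. */
static const unsigned int sketch_versions_inet6[] = {
    SKETCH_RPCBVERS_4, SKETCH_RPCBVERS_3, 0
};

/* Hypothetical helper: pick the version list for an address family. */
const unsigned int *sketch_rpcb_versions_for(int family)
{
    switch (family) {
    case AF_INET:
        return sketch_versions_inet;   /* v2 only: no failed v4/v3 probes */
    case AF_INET6:
        return sketch_versions_inet6;
    default:
        return NULL;                   /* unsupported address family */
    }
}
```

With a single-entry list for AF_INET, a server that lacks versions 3 and 4 is never probed with them, which is where the savings in connections and traffic come from.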
     
  • Some rpcbind servers that do support rpcbind version 4 do not support
    the GETVERSADDR procedure. Use GETADDR for querying rpcbind servers
    via rpcbind version 4 instead of GETVERSADDR.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: Change the version 2 procedure name to GETPORT. It's the same
    procedure number as GETADDR, but version 2 implementations usually refer
    to it as GETPORT.

    This also now matches the procedure name used in the version 2 procedure
    entry in the rpcb_next_version[] array, making it slightly less confusing.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: Replace naked integers that represent rpcbind protocol versions.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up dprintk's in rpcb client's XDR decoder functions.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The RPC client uses the rq_xtime field in each RPC request to determine the
    round-trip time of the request. Currently, the rq_xtime field is
    initialized by each transport just before it starts enqueuing a request to
    be sent. However, transports do not handle initializing this value
    consistently; sometimes they don't initialize it at all.

    To make the measurement of request round-trip time consistent for all
    RPC client transport capabilities, pull rq_xtime initialization into the
    RPC client's generic transport logic. Now all transports will get a
    standardized RTT measure automatically, from:

    xprt_transmit()

    to

    xprt_complete_rqst()

    This makes round-trip time calculation more accurate for the TCP transport.
    The socket ->sendmsg() method can return "-EAGAIN" if the socket's output
    buffer is full, so the TCP transport's ->send_request() method may call
    the ->sendmsg() method repeatedly until it gets all of the request's bytes
    queued in the socket's buffer.

    Currently, the TCP transport sets the rq_xtime field every time through
    that loop so the final value is the timestamp just before the *last* call
    to the underlying socket's ->sendmsg() method. After this patch, the
    rq_xtime field contains a timestamp that reflects the time just before the
    *first* call to ->sendmsg().

    This is consequential under heavy workloads because large requests often
    take multiple ->sendmsg() calls to get all the bytes of a request queued.
    The TCP transport causes the request to sleep until the remote end of the
    socket has received enough bytes to clear space in the socket's local
    output buffer. This delay can be quite significant.

    The method introduced by this patch is a more accurate measure of RTT
    for stream transports, since the server can cause enough back pressure
    to delay (i.e., increase the latency of) requests from the client.

    Additionally, this patch corrects the behavior of the RDMA transport, which
    entirely neglected to initialize the rq_xtime field. RPC performance
    metrics for RDMA transports now display correct RPC request round trip
    times.

    Signed-off-by: Chuck Lever
    Acked-by: Tom Talpey
    Signed-off-by: Trond Myklebust

    Chuck Lever
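The timing change described above can be modelled with a small user-space simulation. Everything here is hypothetical (`fake_rqst`, `fake_clock`, `fake_send`, `fake_transmit`, `fake_complete`); it is not the kernel's transport code, only an illustration of stamping the request once in the generic path before the first send attempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request record; only the fields needed for the sketch. */
struct fake_rqst {
    unsigned long xtime;       /* time at which the transmit started */
    unsigned int bytes_left;   /* bytes not yet queued in the socket */
};

/* Stand-in for a monotonic tick counter. */
static unsigned long fake_clock;

/*
 * Pretend socket send: queues at most `room` bytes per call and advances
 * the clock, modelling the delay caused by a full output buffer.
 * Returns true once the whole request has been queued.
 */
bool fake_send(struct fake_rqst *req, unsigned int room)
{
    fake_clock += 10;
    req->bytes_left -= (req->bytes_left < room) ? req->bytes_left : room;
    return req->bytes_left == 0;
}

/*
 * Generic transmit path: take the timestamp once, before the first send
 * attempt, rather than on every retry. `room` must be nonzero.
 */
void fake_transmit(struct fake_rqst *req, unsigned int room)
{
    req->xtime = fake_clock;
    while (!fake_send(req, room))
        ;   /* retry until all bytes are queued */
}

/* Round-trip time measured from the first send attempt to completion. */
unsigned long fake_complete(const struct fake_rqst *req)
{
    return fake_clock - req->xtime;
}
```

In this model, a request that needs three send attempts is charged for all three, whereas stamping inside the retry loop would only count the time after the last attempt.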
     
  • Try to make the comment here a little more clear and concise.

    Also, this macro definition seems unnecessary.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • There used to be a print_hexl() function that used isprint(), now gone.
    I don't know why NFS_NGROUPS and CA_RUN_AS_MACHINE were here.

    I also don't know why another #define that's actually used was marked
    "unused".

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Also, a minor comment grammar fix in the same file.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • The cl_chatty flag allows us to control whether a given rpc client leaves

    "server X not responding, timed out"

    messages in the syslog. Such messages make sense for ordinary nfs
    clients (where an unresponsive server means applications on the
    mountpoint are probably hanging), but not for the callback client (which
    can fail more commonly, and whose failure only disables some
    optimizations).

    Previously cl_chatty was removed, due to lack of users; reinstate it, and
    use it for the nfsd's callback client.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Olga Kornievskaia
     
  • Recent changes to the RPC client's transport connect logic make connect
    status values ECONNREFUSED and ECONNRESET impossible.

    Clean up xprt_connect_status() to account for these changes.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • In rpc_show_tasks(), display the program name, version number, procedure
    name and tk_action as human-readable variable-length text fields rather
    than columnar numbers.

    Doing the symbol lookup here helps in cases where we have actual
    debugging output from a kernel log, but don't have access to the kernel
    image or RPC module that generated the output.

    Sample output:

    -pid- flgs status -client- --rqstp- -timeout ---ops--
    5608 0001 -11 eeb42690 f6d93710 0 f8fa1764 nfsv3 WRITE a:call_transmit_status q:none
    5609 0001 -11 eeb42690 f6d937e0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5610 0001 -11 eeb42690 f6d93230 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5611 0001 -11 eeb42690 f6d93300 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5612 0001 -11 eeb42690 f6d93090 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5613 0001 -11 eeb42690 f6d933d0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5614 0001 -11 eeb42690 f6d93cc0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5615 0001 -11 eeb42690 f6d93a50 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5616 0001 -11 eeb42690 f6d93640 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5617 0001 -11 eeb42690 f6d93b20 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5618 0001 -11 eeb42690 f6d93160 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
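The human-readable fields shown in the sample output above could be produced along the following lines. This is a sketch only: `task_view` and `format_task` are hypothetical names, the columns are reduced to the pid plus the text fields, and the real rpc_show_tasks() is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, already-resolved view of one RPC task. */
struct task_view {
    unsigned int pid;
    const char *program;     /* e.g. "nfs" */
    unsigned int version;    /* e.g. 3 */
    const char *procedure;   /* e.g. "WRITE"; NULL if unknown */
    const char *action;      /* name of the next action; NULL if unknown */
    const char *queue;       /* wait queue name; NULL if not queued */
};

/*
 * Format one task as variable-length text fields. Missing names are
 * printed as "none" so that every line keeps the same set of fields.
 * Returns the value returned by snprintf().
 */
int format_task(char *buf, size_t len, const struct task_view *t)
{
    return snprintf(buf, len, "%u %sv%u %s a:%s q:%s",
                    t->pid, t->program, t->version,
                    t->procedure ? t->procedure : "none",
                    t->action ? t->action : "none",
                    t->queue ? t->queue : "none");
}
```

Resolving names at print time keeps the log useful on its own, which is the point made above about not having the kernel image at hand.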
     
  • Clean up: move the logic that displays each task to its own function.
    This removes indentation and makes future changes easier.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever