29 Jul, 2009

10 commits

  • This function is only used for SEQUENCE replay.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • Instead of trying to share the generic 4.1 reply cache code for the
    CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION
    separately.

    The nfs41 single slot clientid DRC holds the results of create session
    processing. CREATE_SESSION can be preceded by a SEQUENCE operation
    (an embedded CREATE_SESSION) and the create session single slot cache must be
    maintained. nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not
    implement the replay of an embedded CREATE_SESSION.

    The clientid DRC slot does not need the inuse, cachethis or other fields that
    the multiple slot session cache uses. Replace the clientid DRC cache struct
    nfs4_slot cache with a new nfsd4_clid_slot cache. Save the xdr struct
    nfsd4_create_session into the cache at the end of processing, and on a replay,
    replace the struct for the replay request with the cached version, all while
    under the state lock.

    nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case
    via the normal use of encode_operation.

    Errors that do not change the create session cache:
    A create session NFS4ERR_STALE_CLIENTID error means that a client record
    (and associated create session slot) could not be found and therefore can't
    be changed. NFS4ERR_SEQ_MISORDERED errors do not change the slot cache.

    All other errors get cached.

    Remove the clientid DRC specific check in nfs4svc_encode_compoundres to
    put the session only if cstate.session is set, which will now always be true.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
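
The single-slot replay logic described in this commit message can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct clid_slot`, `struct cs_result`, `check_slot_seqid()` and `create_session_cached()` are hypothetical names, and the result struct is a stand-in for the real xdr struct. A request whose sequence ID equals the cached one is answered from the cache, the next ID in order is processed and cached, and anything else is misordered and leaves the slot untouched.

```c
#include <stdint.h>

enum slot_verdict { SLOT_NEW, SLOT_REPLAY, SLOT_MISORDERED };

/* Hypothetical stand-in for the cached CREATE_SESSION result. */
struct cs_result {
    uint32_t status;
    uint32_t sessionid;
};

/* Hypothetical single-slot cache: no inuse or cachethis fields needed. */
struct clid_slot {
    uint32_t seqid;
    struct cs_result cached;
};

/* Classify an incoming sequence ID against the slot. */
enum slot_verdict check_slot_seqid(uint32_t seqid, const struct clid_slot *slot)
{
    if (seqid == slot->seqid)
        return SLOT_REPLAY;
    if (seqid == slot->seqid + 1)
        return SLOT_NEW;
    return SLOT_MISORDERED;
}

/*
 * On entry *res holds the freshly computed result for this request.
 * On a replay it is overwritten with the cached copy; on a new request
 * it is saved into the slot; a misordered request changes nothing.
 */
enum slot_verdict create_session_cached(struct clid_slot *slot, uint32_t seqid,
                                        struct cs_result *res)
{
    enum slot_verdict v = check_slot_seqid(seqid, slot);

    if (v == SLOT_REPLAY) {
        *res = slot->cached;
    } else if (v == SLOT_NEW) {
        slot->seqid = seqid;
        slot->cached = *res;
    }
    return v;
}
```

In the real server both the lookup and the store happen under the state lock; the sketch leaves locking out to keep the replay decision visible.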
     
  • For separation of session slot and clientid slot processing.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • This check is done in set_forechannel_maxreqs.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • NFSD_SLOT_CACHE_SIZE is the size of all encoded operation responses
    (excluding the sequence operation) that we want to cache.

    For now, keep NFSD_SLOT_CACHE_SIZE at PAGE_SIZE. It will be reduced
    when the DRC is changed from page based to memory based.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • Also remove a slightly misleading comment.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • This fixes a leak which would eventually lock out new clients.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields

    Andy Adamson
     
  • Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • kmemleak produces the following warning

    unreferenced object 0xc9ec02a0 (size 8):
    comm "cat", pid 19048, jiffies 730243
    backtrace:
    [] create_object+0x100/0x240
    [] kmemleak_alloc+0x2b/0x60
    [] __kmalloc+0x14b/0x270
    [] write_pool_threads+0x87/0x1d0
    [] nfsctl_transaction_write+0x58/0x70
    [] nfsctl_transaction_read+0x4f/0x60
    [] vfs_read+0x94/0x150
    [] sys_read+0x3d/0x70
    [] sysenter_do_call+0x12/0x32
    [] 0xffffffff

    write_pool_threads() only frees nthreads on error paths; in the success
    case it is leaked.

    Signed-off-by: Eric Sesterhenn
    Reviewed-by: Catalin Marinas
    Signed-off-by: J. Bruce Fields

    Eric Sesterhenn
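
The shape of this kind of fix can be sketched in user-space C: route the success path through the same cleanup label as the error paths so the temporary array is always released. `sum_pool_threads()` is a hypothetical stand-in for illustration, not the real write_pool_threads().

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in: parse up to npools non-negative integers from buf
 * into a temporary array, store their sum in *total, and free the array on
 * every exit path. Returns 0 on success, -1 on error.
 */
int sum_pool_threads(const char *buf, int npools, int *total)
{
    int *nthreads;
    int i, j, rv = 0;
    char *end;

    nthreads = calloc((size_t)npools, sizeof(*nthreads));
    if (nthreads == NULL)
        return -1;

    for (i = 0; i < npools; i++) {
        long v = strtol(buf, &end, 10);

        if (end == buf)
            break;              /* no more numbers in the input */
        if (v < 0) {
            rv = -1;
            goto out_free;      /* error path */
        }
        nthreads[i] = (int)v;
        buf = end;
    }

    *total = 0;
    for (j = 0; j < i; j++)
        *total += nthreads[j];

out_free:
    free(nthreads);             /* reached on success as well as on error */
    return rv;
}
```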
     

15 Jul, 2009

3 commits

  • The version 4.1 DRC memory limit and tracking variables are server wide and
    session specific. Replace struct svc_serv fields with globals.
    Stop using the svc_serv sv_lock.

    Add a spinlock to serialize access to the DRC limit management variables which
    change on session creation and deletion (usage counter) or (future)
    administrative action to adjust the total DRC memory limit.

    Signed-off-by: Andy Adamson
    Signed-off-by: Benny Halevy

    Andy Adamson
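
A minimal user-space sketch of the locking pattern described here, using a pthread mutex in place of the kernel spinlock. The names `drc_lock`, `drc_max_mem`, `drc_mem_used`, `drc_reserve()` and `drc_release()` and the limit value are assumptions made for illustration, not the identifiers or numbers used in the patch.

```c
#include <pthread.h>
#include <stddef.h>

/* Server-wide limit and usage counter, guarded by one lock (hypothetical). */
static pthread_mutex_t drc_lock = PTHREAD_MUTEX_INITIALIZER;
static size_t drc_max_mem = 8 * 4096;
static size_t drc_mem_used;

/* Session creation: reserve up to `want` bytes and return what was granted. */
size_t drc_reserve(size_t want)
{
    size_t avail, got;

    pthread_mutex_lock(&drc_lock);
    avail = drc_max_mem - drc_mem_used;
    got = want < avail ? want : avail;
    drc_mem_used += got;
    pthread_mutex_unlock(&drc_lock);
    return got;
}

/* Session deletion: hand the reservation back. */
void drc_release(size_t bytes)
{
    pthread_mutex_lock(&drc_lock);
    drc_mem_used -= bytes;
    pthread_mutex_unlock(&drc_lock);
}
```

Reading the remaining budget and updating the counter happen under the same lock, so two sessions created concurrently cannot both be granted the last of the memory.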
     
  • PKTINFO is needed to scrape the caller's IP address off the socket so
    RPC datagram replies are routed correctly. Fill in missing pieces in
    the kernel RPC server's UDP receive path to request IPv6 PKTINFO and
    correctly parse the IPv6 cmsg header.

    Without this patch, kernel RPC services drop all incoming requests on
    UDP on IPv6.

    Related commit: 7a37f5787e76bf1765c1add3a9a7163f841a28bb

    Signed-off-by: Chuck Lever
    Cc: Neil Brown
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The ACL in the 'open' and 'create' operations is decoded but never used.
    It should be set as the initial ACL for the object, according to RFC 3530.
    If an error occurs when setting the ACL, just clear the ACL bit in the
    returned attr bitmap.

    Signed-off-by: Yu Zhiguo
    Signed-off-by: J. Bruce Fields

    Yu Zhiguo
     

11 Jul, 2009

1 commit


03 Jul, 2009

1 commit


25 Jun, 2009

25 commits

  • Linus Torvalds
     
  • This reverts commit 9e9f46c44e487af0a82eb61b624553e2f7118f5b.

    Quoting from the commit message:

    "At this point, it seems to solve more problems than it causes, so let's
    try using it by default. It's an easy revert if it ends up causing
    trouble."

    And guess what? The _CRS code causes trouble.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * git://git.infradead.org/battery-2.6:
    da9030_battery: Fix race between event handler and monitor
    Add MAX17040 Fuel Gauge driver
    w1: ds2760_battery: add support for sleep mode feature
    w1: ds2760: add support for EEPROM read and write
    ds2760_battery: cleanups in ds2760_battery_probe()

    Linus Torvalds
     
  • …/{vfs-2.6,audit-current}

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    another race fix in jfs_check_acl()
    Get "no acls for this inode" right, fix shmem breakage
    inline functions left without protection of ifdef (acl)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
    audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL

    Linus Torvalds
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Even though one cannot make use of the audit watch code without
    CONFIG_AUDIT_SYSCALL the spaghetti nature of the audit code means that
    the audit rule filtering requires that it at least be compiled.

    Thus build the audit_watch code when we build auditfilter like it was
    before cfcad62c74abfef83762dc05a556d21bdf3980a2

    Clearly this is a point of potential future cleanup.

    Reported-by: Frans Pop
    Signed-off-by: Eric Paris
    Signed-off-by: Al Viro

    Eric Paris
     
  • Signed-off-by: Al Viro

    Markus Trippelsdorf
     
  • * 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    futex: Fix the write access fault problem for real

    Linus Torvalds
     
  • commit 64d1304a64 (futex: setup writeable mapping for futex ops which
    modify user space data) did address only half of the problem of write
    access faults.

    The patch was made on two wrong assumptions:

    1) access_ok(VERIFY_WRITE,...) would actually check write access.

    On x86 it does _NOT_. It's a pure address range check.

    2) a RW mapped region can not go away under us.

    That's wrong as well. Nothing prevents another thread from calling
    mprotect(PROT_READ) on the region where the futex resides. If that
    call hits between the get_user_pages_fast() verification and the
    actual write access in the atomic region we are toast again.

    The solution is to not rely on access_ok and get_user() for any write
    access related fault on private and shared futexes. Instead we need to
    fault it in with verification of write access.

    There is no generic non-destructive write mechanism which would fault
    the user page in through a #PF, but as we already know that we will
    fault we can as well call get_user_pages() directly and avoid the #PF
    overhead.

    If get_user_pages() returns -EFAULT we know that we can not fix it
    anymore and need to bail out to user space.

    Remove a bunch of confusing comments on this issue as well.

    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    Thomas Gleixner
     
  • SLUB uses higher order allocations by default but falls back to small
    orders under memory pressure. Make sure the GFP mask used in the initial
    allocation doesn't include __GFP_NOFAIL.

    Signed-off-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Pekka Enberg
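
A rough user-space sketch of the idea: the opportunistic high-order attempt strips the "must not fail" bit, and only the fallback to the minimum order uses the caller's flags unchanged. The `SK_GFP_*` values, `high_order_gfp()`, `alloc_slab_pages()` and the toy allocator are all hypothetical and exist only for illustration; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical flag bits, not the kernel's values. */
#define SK_GFP_NOFAIL   0x1u
#define SK_GFP_NOWARN   0x2u
#define SK_GFP_NORETRY  0x4u

typedef void *(*page_alloc_fn)(unsigned int gfp, int order);

/* Mask for the first, higher-order attempt: quiet, no retries, may fail. */
unsigned int high_order_gfp(unsigned int flags)
{
    return (flags | SK_GFP_NOWARN | SK_GFP_NORETRY) & ~SK_GFP_NOFAIL;
}

/* Try the preferred order first, then fall back to the minimum order. */
void *alloc_slab_pages(page_alloc_fn alloc, unsigned int flags,
                       int pref_order, int min_order)
{
    void *page = alloc(high_order_gfp(flags), pref_order);

    if (page == NULL)
        page = alloc(flags, min_order);   /* caller's flags, NOFAIL intact */
    return page;
}

/* Toy allocator simulating memory pressure: only order-0 requests succeed. */
static char toy_page[1];

void *toy_alloc(unsigned int gfp, int order)
{
    (void)gfp;
    return order == 0 ? toy_page : NULL;
}
```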
     
  • Traditionally, we never failed small orders (even regardless of any
    __GFP_NOFAIL flags), and slab will allocate order-1 allocations even for
    small allocations that could fit in a single page (in order to avoid
    excessive fragmentation).

    Maybe we should remove this warning entirely, but before making that
    judgement, at least limit it to bigger allocations.

    Acked-by: Pekka Enberg
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
    Staging: octeon-ethernet: Fix race freeing transmit buffers.
    Staging: octeon-ethernet: Convert to use net_device_ops.
    MIPS: Cavium: Add CPU hotplugging code.
    MIPS: SMP: Allow suspend and hibernation if CPU hotplug is available
    MIPS: Add arch generic CPU hotplug
    DMA: txx9dmac: use dma_unmap_single if DMA_COMPL_{SRC,DEST}_UNMAP_SINGLE set
    MIPS: Sibyte: Fix build error if CONFIG_SERIAL_SB1250_DUART is undefined.
    MIPS: MIPSsim: Fix build error if MSC01E_INT_BASE is undefined.
    MIPS: Hibernation: Remove SMP TLB and cacheflushing code.
    MIPS: Build fix - include into all smp_processor_id() users.
    MIPS: bug.h Build fix - include .

    Linus Torvalds
     
  • The existing code had the following race:

    Thread-1                       Thread-2

    inc/read in_use
                                   inc/read in_use
    inc tx_free_list[qos].len
                                   inc tx_free_list[qos].len

    The actual in_use value was incremented twice, but thread-1 is going
    to free memory based on its stale value, and will free one too many
    times. The result is that memory is freed back to the kernel while
    its packet is still in the transmit buffer. If the memory is
    overwritten before it is transmitted, the hardware will put a valid
    checksum on it and send it out (just like it does with good packets).
    If by chance the TCP flags are clobbered but not the addresses or
    ports, the result can be a broken TCP stream.

    The fix is to track the number of freed packets in a single location
    (a Fetch-and-Add Unit register). That way it can never get out of sync
    with itself.

    We try to free up to MAX_SKB_TO_FREE (currently 10) buffers at a time.
    If fewer are available we adjust the free count with the difference.
    The action of claiming buffers to free is atomic so two threads cannot
    claim the same buffers.

    Signed-off-by: David Daney
    Signed-off-by: Ralf Baechle

    David Daney
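
The claim-then-adjust scheme from this commit message can be sketched with C11 atomics standing in for the Fetch-and-Add Unit register. This is an illustration under assumptions, not the driver code: `freeable` and `claim_buffers_to_free()` are hypothetical names, and the counter here counts upward, which may differ from the hardware convention.

```c
#include <stdatomic.h>

#define MAX_SKB_TO_FREE 10

/* Single source of truth for how many buffers may be freed (hypothetical). */
static atomic_int freeable;

/*
 * Atomically claim up to MAX_SKB_TO_FREE buffers. The full batch is
 * subtracted in one step; if fewer were available, the difference is
 * added back. Two threads can never claim the same buffers.
 */
int claim_buffers_to_free(void)
{
    int before = atomic_fetch_sub(&freeable, MAX_SKB_TO_FREE);
    int claimed = before;

    if (claimed > MAX_SKB_TO_FREE)
        claimed = MAX_SKB_TO_FREE;
    if (claimed < 0)
        claimed = 0;            /* another thread already over-claimed */

    if (claimed < MAX_SKB_TO_FREE)
        atomic_fetch_add(&freeable, MAX_SKB_TO_FREE - claimed);
    return claimed;
}
```

The counter may dip below zero for a moment while a thread is between the subtract and the add, but each thread decides how much to free from the value it fetched atomically, so the total claimed never exceeds what was available.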
     
  • Convert the driver to use net_device_ops as it is now mandatory.

    Also compensate for the removal of struct sk_buff's dst field.

    The changes are mostly mechanical; the content of ethernet-common.c
    was moved to ethernet.c and ethernet-common.{c,h} are removed.

    Signed-off-by: David Daney
    Signed-off-by: Ralf Baechle

    David Daney
     
  • Thanks to Cavium Inc. for the code contribution and help.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • The SMP implementation of suspend and hibernate depends on CPU hotplugging.
    In the past we didn't have CPU hotplug so suspend and hibernation were not
    possible on SMP systems.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • Each platform has to add support for CPU hotplugging itself by providing
    suitable definitions for the cpu_disable and cpu_die methods of smp_ops
    and setting SYS_SUPPORTS_HOTPLUG_CPU. A platform should only set
    SYS_SUPPORTS_HOTPLUG_CPU once all its smp_ops definitions have the
    necessary changes. This patch contains the changes to the dummy smp_ops
    definition for uni-processor systems.

    Parts of the code contributed by Cavium Inc.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • This patch does not change actual behaviour since dma_unmap_page is just
    an alias of dma_unmap_single on MIPS.

    Signed-off-by: Atsushi Nemoto
    Cc: Ralf Baechle
    Acked-by: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Ralf Baechle

    Atsushi Nemoto
     
  • This fixes kernel.org bugzilla 13596, see
    http://bugzilla.kernel.org/show_bug.cgi?id=13596

    Reported-by: dvice_null@yahoo.com

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • This fixes kernel.org bugzilla 13595, see
    http://bugzilla.kernel.org/show_bug.cgi?id=13595

    Reported-by: dvice_null@yahoo.com

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • We can't perform any flushes on SMP from swsusp_arch_resume because
    interrupts are disabled. A cross-CPU flush is unnecessary anyway
    because all but the local CPU have already been disabled. A local
    flush is not needed either because we didn't change any mappings. So
    just delete the code.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • Some of them were relying on smp.h being dragged in by another header,
    which of course is fragile. cpu-info.h uses smp_processor_id()
    only in macros, and including smp.h there leads to an include loop, so
    don't change cpu-info.h.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • In the past this file somehow used to be dragged in.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (48 commits)
    dm mpath: change to be request based
    dm: disable interrupt when taking map_lock
    dm: do not set QUEUE_ORDERED_DRAIN if request based
    dm: enable request based option
    dm: prepare for request based option
    dm raid1: add userspace log
    dm: calculate queue limits during resume not load
    dm log: fix create_log_context to use logical_block_size of log device
    dm target:s introduce iterate devices fn
    dm table: establish queue limits by copying table limits
    dm table: replace struct io_restrictions with struct queue_limits
    dm table: validate device logical_block_size
    dm table: ensure targets are aligned to logical_block_size
    dm ioctl: support cookies for udev
    dm: sysfs add suspended attribute
    dm table: improve warning message when devices not freed before destruction
    dm mpath: add service time load balancer
    dm mpath: add queue length load balancer
    dm mpath: add start_io and nr_bytes to path selectors
    dm snapshot: use barrier when writing exception store
    ...

    Linus Torvalds