27 May, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    RDMA/cma: Save PID of ID's owner
    RDMA/cma: Add support for netlink statistics export
    RDMA/cma: Pass QP type into rdma_create_id()
    RDMA: Update exported headers list
    RDMA/cma: Export enum cma_state in
    RDMA/nes: Add a check for strict_strtoul()
    RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
    RDMA/cxgb4: Use completion objects for event blocking
    IB/srp: Fix integer -> pointer cast warnings
    IB: Add devnode methods to cm_class and umad_class
    IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
    IB/uverbs: Add devnode method to set path/mode
    RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
    RDMA: Add netlink infrastructure
    RDMA: Add error handling to ib_core_init()

    Linus Torvalds
     

26 May, 2011

1 commit

  • The RDMA CM currently infers the QP type from the port space selected
    by the user. In the future (eg with RDMA_PS_IB or XRC), there may not
    be a 1-1 correspondence between port space and QP type. For netlink
    export of RDMA CM state, we want to export the QP type to userspace,
    so it is cleaner to explicitly associate a QP type to an ID.

    Modify rdma_create_id() to allow the user to specify the QP type, and
    use it to make our selections of datagram versus connected mode.

    Signed-off-by: Sean Hefty
    Signed-off-by: Roland Dreier

    Sean Hefty
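
The idea in this commit can be sketched in userspace C, assuming stand-in types: the ID stores an explicit QP type at creation, and the datagram-versus-connected decision reads that field instead of inferring it from the port space. Every name below (`create_id`, `is_datagram`, the enums) is hypothetical and only illustrates the shape of the change; it is not the kernel's definition.

```c
#include <stdbool.h>

/* Stand-ins for the kernel enums; the values are illustrative only. */
enum port_space { PS_TCP, PS_UDP, PS_IPOIB };
enum qp_type { QPT_RC, QPT_UD };

struct cm_id {
    enum port_space ps;
    enum qp_type qp_type;   /* stored explicitly at creation time */
};

/* Hypothetical analogue of rdma_create_id(): the caller passes the QP type. */
static struct cm_id create_id(enum port_space ps, enum qp_type qp_type)
{
    struct cm_id id = { .ps = ps, .qp_type = qp_type };
    return id;
}

/* Datagram vs. connected mode is decided from the QP type, not the port space. */
static bool is_datagram(const struct cm_id *id)
{
    return id->qp_type == QPT_UD;
}
```

With this layout a future port space that supports more than one QP type needs no change to `is_datagram()`.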
     

25 May, 2011

1 commit

  • Change each shrinker's API by consolidating the existing parameters into
    a shrink_control struct. This will simplify adding further features
    without touching every file that implements a shrinker.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: fix warning]
    [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
    [akpm@linux-foundation.org: fix xfs warning]
    [akpm@linux-foundation.org: update gfs2]
    Signed-off-by: Ying Han
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Acked-by: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Steven Whitehouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
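
A minimal userspace mock of the pattern this commit describes: the callback takes one control struct rather than a list of scalar parameters, so a new field can be added later without changing every callback's signature. The field names follow the commit's intent, but the types and the `demo_shrink` behaviour are assumptions made for illustration.

```c
/* Illustrative mock; the kernel's definitions differ in detail. */
struct shrink_control {
    unsigned int gfp_mask;      /* allocation context of the caller */
    unsigned long nr_to_scan;   /* 0 means "only report the object count" */
};

struct shrinker {
    int (*shrink)(struct shrinker *s, struct shrink_control *sc);
    unsigned long nr_objects;   /* hypothetical cache size for this demo */
};

/* Frees up to nr_to_scan objects and returns how many remain. */
static int demo_shrink(struct shrinker *s, struct shrink_control *sc)
{
    unsigned long n = sc->nr_to_scan < s->nr_objects ? sc->nr_to_scan
                                                     : s->nr_objects;
    s->nr_objects -= n;
    return (int)s->nr_objects;
}
```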
     

24 May, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    b43: fix comment typo reqest -> request
    Haavard Skinnemoen has left Atmel
    cris: typo in mach-fs Makefile
    Kconfig: fix copy/paste-ism for dell-wmi-aio driver
    doc: timers-howto: fix a typo ("unsgined")
    perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
    md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
    treewide: fix a few typos in comments
    regulator: change debug statement be consistent with the style of the rest
    Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
    audit: acquire creds selectively to reduce atomic op overhead
    rtlwifi: don't touch with treewide double semicolon removal
    treewide: cleanup continuations and remove logging message whitespace
    ath9k_hw: don't touch with treewide double semicolon removal
    include/linux/leds-regulator.h: fix syntax in example code
    tty: fix typo in descripton of tty_termios_encode_baud_rate
    xtensa: remove obsolete BKL kernel option from defconfig
    m68k: fix comment typo 'occcured'
    arch:Kconfig.locks Remove unused config option.
    treewide: remove extra semicolons
    ...

    Linus Torvalds
     

10 May, 2011

1 commit


26 Apr, 2011

1 commit


25 Apr, 2011

1 commit

  • On occasion, it is useful for the NFS layer to distinguish between
    soft timeouts and other EIO errors due to (say) encoding errors,
    or authentication errors.

    The following patch ensures that the default behaviour of the RPC
    layer remains to return EIO on soft timeouts (until we have
    audited all the callers).

    Signed-off-by: Trond Myklebust

    Trond Myklebust
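
The commit text only states that the default stays EIO. One way such a default could look is sketched below, assuming an opt-in flag from the caller; the helper name and the flag are hypothetical, not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: choose the status reported for a soft timeout.
 * Callers that have not opted in keep seeing -EIO, as before.
 */
static int soft_timeout_status(bool caller_wants_timeout)
{
    return caller_wants_timeout ? -ETIMEDOUT : -EIO;
}
```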
     

19 Apr, 2011

1 commit


16 Apr, 2011

1 commit


14 Apr, 2011

1 commit

  • There can be an infinite loop if gss_create_upcall() is called without
    the userspace program running. To prevent this, we return -EACCES if
    we notice that pipe_version hasn't changed (indicating that the pipe
    has not been opened).

    Signed-off-by: Bryan Schumaker

    Signed-off-by: Trond Myklebust

    Bryan Schumaker
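
The loop-breaking rule can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel code: `PIPE_CLOSED`, `try_upcall`, and the array of observed versions are all assumptions standing in for the real wait on the pipe.

```c
#include <errno.h>

#define PIPE_CLOSED (-1)   /* assumed sentinel: no daemon has opened the pipe */

/* Stand-in for the upcall attempt: fails with -EAGAIN while the pipe is closed. */
static int try_upcall(int pipe_version)
{
    return pipe_version == PIPE_CLOSED ? -EAGAIN : 0;
}

/* versions[i] is the pipe version observed after the i-th wait. */
static int create_upcall(const int *versions, int n)
{
    int current = PIPE_CLOSED;

    for (int i = 0; i < n; i++) {
        int err = try_upcall(current);
        if (err != -EAGAIN)
            return err;
        int seen = versions[i];     /* value seen after waiting */
        if (seen == current)
            return -EACCES;         /* unchanged: stop instead of retrying */
        current = seen;
    }
    return -EAGAIN;
}
```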
     

10 Apr, 2011

1 commit


09 Apr, 2011

1 commit


07 Apr, 2011

1 commit

  • This reverts commit 411b5e05617593efebc06241dbc56f42150f2abe.

    Olga Kornievskaia reports:

    Problem: linux client mounting linux server using rc4-hmac-md5
    enctype. gssd fails to create a context after receiving a reply from
    the server.

    Diagnose: putting printout statements in the server kernel and
    kerberos libraries revealed that client and server derived different
    integrity keys.

    Server kernel code was at fault due to the commit

    [aglo@skydive linux-pnfs]$ git show 411b5e05617593efebc06241dbc56f42150f2abe

    Trond: The problem is that since it relies on virt_to_page(), you cannot
    call sg_set_buf() for data in the const section.

    Reported-by: Olga Kornievskaia
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org [2.6.36+]

    Trond Myklebust
     

31 Mar, 2011

1 commit


27 Mar, 2011

1 commit

  • BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
    Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
    Call Trace:
    [] ? put_rpccred+0x44/0x14e [sunrpc]
    [] ? rpc_ping+0x4e/0x58 [sunrpc]
    [] ? rpc_create+0x481/0x4fc [sunrpc]
    [] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
    [] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
    [] ? nfs4_set_client+0xc2/0x1f9 [nfs]
    [] ? nfs4_create_server+0xf2/0x2a6 [nfs]
    [] ? nfs4_remote_mount+0x4e/0x14a [nfs]
    [] ? vfs_kern_mount+0x6e/0x133
    [] ? nfs_do_root_mount+0x76/0x95 [nfs]
    [] ? nfs4_try_mount+0x56/0xaf [nfs]
    [] ? nfs_get_sb+0x435/0x73c [nfs]
    [] ? vfs_kern_mount+0x99/0x133
    [] ? do_kern_mount+0x48/0xd8
    [] ? do_mount+0x6da/0x741
    [] ? sys_mount+0x83/0xc0
    [] ? system_call_fastpath+0x16/0x1b

    Well, so, I think this is a real bug somewhere in the NFS code. With
    some review, the code

    rpc_call_sync()
    rpc_run_task
    rpc_execute()
    __rpc_execute()
    rpc_release_task()
    rpc_release_resources_task()
    put_rpccred()
    Signed-off-by: Trond Myklebust

    OGAWA Hirofumi
     

26 Mar, 2011

1 commit

  • * 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (28 commits)
    Cleanup XDR parsing for LAYOUTGET, GETDEVICEINFO
    NFSv4.1 convert layoutcommit sync to boolean
    NFSv4.1 pnfs_layoutcommit_inode fixes
    NFS: Determine initial mount security
    NFS: use secinfo when crossing mountpoints
    NFS: Add secinfo procedure
    NFS: lookup supports alternate client
    NFS: convert call_sync() to a function
    NFSv4.1 remove temp code that prevented ds commits
    NFSv4.1: layoutcommit
    NFSv4.1: filelayout driver specific code for COMMIT
    NFSv4.1: remove GETATTR from ds commits
    NFSv4.1: add generic layer hooks for pnfs COMMIT
    NFSv4.1: alloc and free commit_buckets
    NFSv4.1: shift filelayout_free_lseg
    NFSv4.1: pull out code from nfs_commit_release
    NFSv4.1: pull error handling out of nfs_commit_list
    NFSv4.1: add callback to nfs4_commit_done
    NFSv4.1: rearrange nfs_commit_rpcsetup
    NFSv4.1: don't send COMMIT to ds for data sync writes
    ...

    Linus Torvalds
     

25 Mar, 2011

3 commits


24 Mar, 2011

1 commit

  • * 'for-2.6.39' of git://linux-nfs.org/~bfields/linux:
    SUNRPC: Remove resource leak in svc_rdma_send_error()
    nfsd: wrong index used in inner loop
    nfsd4: fix comment and remove unused nfsd4_file fields
    nfs41: make sure nfs server return right ca_maxresponsesize_cached
    nfsd: fix compile error
    svcrpc: fix bad argument in unix_domain_find
    nfsd4: fix struct file leak
    nfsd4: minor nfs4state.c reshuffling
    svcrpc: fix rare race on unix_domain creation
    nfsd41: modify the members value of nfsd4_op_flags
    nfsd: add proc file listing kernel's gss_krb5 enctypes
    gss:krb5 only include enctype numbers in gm_upcall_enctypes
    NFSD, VFS: Remove dead code in nfsd_rename()
    nfsd: kill unused macro definition
    locks: use assign_type()

    Linus Torvalds
     

23 Mar, 2011

1 commit


18 Mar, 2011

6 commits

  • * 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits)
    RPC: killing RPC tasks races fixed
    xprt: remove redundant check
    SUNRPC: Convert struct rpc_xprt to use atomic_t counters
    SUNRPC: Ensure we always run the tk_callback before tk_action
    sunrpc: fix printk format warning
    xprt: remove redundant null check
    nfs: BKL is no longer needed, so remove the include
    NFS: Fix a warning in fs/nfs/idmap.c
    Cleanup: Factor out some cut-and-paste code.
    cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions.
    NFS: account direct-io into task io accounting
    gss:krb5 only include enctype numbers in gm_upcall_enctypes
    RPCRDMA: Fix FRMR registration/invalidate handling.
    RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
    NFSv4: Send unmapped uid/gids to the server when using auth_sys
    NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
    NFSv4: cleanup idmapper functions to take an nfs_server argument
    NFSv4: Send unmapped uid/gids to the server if the idmapper fails
    NFSv4: If the server sends us a numeric uid/gid then accept it
    NFSv4.1: reject zero layout with zeroed stripe unit
    ...

    Linus Torvalds
     
  • We leak the memory allocated to 'ctxt' when we return after
    'ib_dma_mapping_error()' returns !=0.

    Signed-off-by: Jesper Juhl
    Signed-off-by: J. Bruce Fields

    Jesper Juhl
     
  • The RPC_TASK_QUEUED bit of an RPC task must be checked before trying to
    wake up the task in rpc_killall_tasks(), because task->tk_waitqueue may
    not be set (i.e. it may be NULL).
    Also, as Trond Myklebust mentioned, this approach (instead of comparing
    tk_waitqueue with NULL) allows us to "optimise away the call to
    rpc_wake_up_queued_task() altogether for those
    tasks that aren't queued".

    Here is an example of dereferencing of tk_waitqueue equal to NULL:

    CPU 0                   CPU 1                   CPU 2
    --------------------    ---------------------   --------------------------
    nfs4_run_open_task
    rpc_run_task
    rpc_execute
    rpc_set_active
    rpc_make_runnable
    (waiting)
                            rpc_async_schedule
                            nfs4_open_prepare
                            nfs_wait_on_sequence
                                                    nfs_umount_begin
                                                    rpc_killall_tasks
                                                    rpc_wake_up_task
                                                    rpc_wake_up_queued_task
                                                    spin_lock(tk_waitqueue == NULL)
                                                    BUG()
                            rpc_sleep_on
                            spin_lock(&q->lock)
                            __rpc_sleep_on
                            task->tk_waitqueue = q

    Signed-off-by: Stanislav Kinsbursky
    Cc: stable@kernel.org
    Signed-off-by: Trond Myklebust

    Stanislav Kinsbursky
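
The guard described in this commit can be shown with a single-threaded userspace model. It ignores locking and memory ordering entirely, and `TASK_QUEUED`, `struct task`, and `wake_up_if_queued` are stand-ins chosen for illustration, not the kernel's names.

```c
#include <stdbool.h>
#include <stddef.h>

#define TASK_QUEUED 0x1u   /* stand-in for the RPC_TASK_QUEUED bit */

struct wait_queue { int wakeups; };

struct task {
    unsigned int runstate;
    struct wait_queue *waitqueue;   /* NULL until the task actually sleeps */
};

/* Returns true if a wake-up was issued; never dereferences a NULL queue. */
static bool wake_up_if_queued(struct task *t)
{
    if (!(t->runstate & TASK_QUEUED))
        return false;               /* not queued: skip the wake-up entirely */
    t->waitqueue->wakeups++;
    return true;
}
```

Testing the queued bit rather than the pointer also covers the window in which the task is runnable but has not yet been put on any queue.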
     
  • remove redundant check.

    Signed-off-by: Jinqiu Yang
    Signed-off-by: Trond Myklebust

    j223yang@asset.uwaterloo.ca
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • This fixes a race in which the task->tk_callback() puts the rpc_task
    to sleep, setting a new callback. Under certain circumstances, the current
    code may end up executing the task->tk_action before it gets round to the
    callback.

    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
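
A simplified model of the ordering rule: each pass of the execute loop runs either the pending callback or the action, never both, so a callback that installs a new callback is always followed by that callback before the action runs. The struct layout and the logging helpers below are assumptions for the sketch and do not mirror the kernel's rpc_task.

```c
#include <stddef.h>

struct task;
typedef void (*task_fn)(struct task *);

struct task {
    task_fn callback;   /* one-shot, set when the task is put to sleep */
    task_fn action;     /* next state-machine step */
    char log[8];        /* records the order in which functions ran */
    int n;
};

static void mark_cb(struct task *t)  { t->log[t->n++] = 'c'; }
static void mark_act(struct task *t) { t->log[t->n++] = 'a'; }

/* A callback that "sleeps again" by installing a fresh callback. */
static void sleep_cb(struct task *t)
{
    t->log[t->n++] = 's';
    t->callback = mark_cb;
}

/* One iteration of a simplified execute loop. */
static void run_once(struct task *t)
{
    task_fn fn = t->callback;

    t->callback = NULL;         /* clear first: fn may install a new one */
    if (fn == NULL)
        fn = t->action;         /* no callback pending: run the action */
    if (fn != NULL)
        fn(t);
}
```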
     

17 Mar, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
    bonding: enable netpoll without checking link status
    xfrm: Refcount destination entry on xfrm_lookup
    net: introduce rx_handler results and logic around that
    bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
    bonding: wrap slave state work
    net: get rid of multiple bond-related netdevice->priv_flags
    bonding: register slave pointer for rx_handler
    be2net: Bump up the version number
    be2net: Copyright notice change. Update to Emulex instead of ServerEngines
    e1000e: fix kconfig for crc32 dependency
    netfilter ebtables: fix xt_AUDIT to work with ebtables
    xen network backend driver
    bonding: Improve syslog message at device creation time
    bonding: Call netif_carrier_off after register_netdevice
    bonding: Incorrect TX queue offset
    net_sched: fix ip_tos2prio
    xfrm: fix __xfrm_route_forward()
    be2net: Fix UDP packet detected status in RX compl
    Phonet: fix aligned-mode pipe socket buffer header reserve
    netxen: support for GbE port settings
    ...

    Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
    with the staging updates.

    Linus Torvalds
     

16 Mar, 2011

3 commits

  • * 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: fix build failure introduced by s/freezeable/freezable/
    workqueue: add system_freezeable_wq
    rds/ib: use system_wq instead of rds_ib_fmr_wq
    net/9p: replace p9_poll_task with a work
    net/9p: use system_wq instead of p9_mux_wq
    xfs: convert to alloc_workqueue()
    reiserfs: make commit_wq use the default concurrency level
    ocfs2: use system_wq instead of ocfs2_quota_wq
    ext4: convert to alloc_workqueue()
    scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
    scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
    misc/iwmc3200top: use system_wq instead of dedicated workqueues
    i2o: use alloc_workqueue() instead of create_workqueue()
    acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
    fs/aio: aio_wq isn't used in memory reclaim path
    input/tps6507x-ts: use system_wq instead of dedicated workqueue
    cpufreq: use system_wq instead of dedicated workqueues
    wireless/ipw2x00: use system_wq instead of dedicated workqueues
    arm/omap: use system_wq in mailbox
    workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER

    Linus Torvalds
     
  • Fix printk format build warning:

    net/sunrpc/xprtrdma/verbs.c:1463: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'dma_addr_t'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Trond Myklebust

    Randy Dunlap
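
One common way to silence this class of warning is an explicit cast to the type the format expects; the commit text does not say which fix was applied, so the snippet below is only an illustration. `demo_dma_addr_t` is a stand-in typedef, chosen to be 32 bits wide to mimic a configuration where the mismatch appears.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t demo_dma_addr_t;  /* in the kernel the width varies by config */

/* Formats addr as hex into buf and returns the number of characters written. */
static int format_addr(char *buf, size_t len, demo_dma_addr_t addr)
{
    /* The cast makes the argument match %llx regardless of the typedef's width. */
    return snprintf(buf, len, "%llx", (unsigned long long)addr);
}
```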
     
  • 'req' is dereferenced before checked for NULL.
    The patch simply removes the check.

    Signed-off-by: Jinqiu Yang
    Signed-off-by: Trond Myklebust

    j223yang@asset.uwaterloo.ca
     

12 Mar, 2011

6 commits

  • Make the value in gm_upcall_enctypes just the enctype values.
    This allows the values to be used more easily elsewhere.

    Signed-off-by: Kevin Coffman
    Signed-off-by: Trond Myklebust

    Kevin Coffman
     
  • When the rpc_memreg_strategy is 5, FRMR are used to map RPC data.
    This mode uses an FRMR to map the RPC data, then invalidates
    (i.e. unregisters) the data in xprt_rdma_free. These FRMR are used
    across connections on the same mount, i.e. if the connection goes
    away on an idle timeout and reconnects later, the FRMR are not
    destroyed and recreated.

    This creates a problem for transport errors because the WR that
    invalidates an FRMR may be flushed (i.e. fail) leaving the
    FRMR valid. When the FRMR is later used to map an RPC it will fail,
    tearing down the transport and starting over. Over time, more and
    more of the FRMR pool end up in the wrong state resulting in
    seemingly random disconnects.

    This fix keeps track of the FRMR state explicitly by setting its
    state based on the successful completion of a reg/inv WR. If the FRMR
    is ever used and found to be in the wrong state, an invalidate WR
    is prepended, re-syncing the FRMR state and avoiding the connection loss.

    Signed-off-by: Tom Tucker
    Signed-off-by: Trond Myklebust

    Tom Tucker
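
The explicit state tracking can be sketched as a two-state machine. The names below are hypothetical, and the model reduces "post a work request" to a counter: the state changes only when a registration or invalidation completes successfully, and a mapping attempt on a still-valid FRMR prepends an invalidate.

```c
enum frmr_state { FRMR_INVALID, FRMR_VALID };

struct frmr { enum frmr_state state; };

/* Completion handlers: the state changes only when the WR succeeded. */
static void on_reg_done(struct frmr *f, int ok)
{
    if (ok)
        f->state = FRMR_VALID;
}

static void on_inv_done(struct frmr *f, int ok)
{
    if (ok)
        f->state = FRMR_INVALID;
}

/* WRs to post for a new mapping: 1 (register) or 2 (invalidate, then register). */
static int wrs_needed_to_map(const struct frmr *f)
{
    return f->state == FRMR_VALID ? 2 : 1;
}
```

A flushed invalidate (`ok == 0`) leaves the state at `FRMR_VALID`, so the next mapping re-syncs it instead of failing.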
     
  • The RPCRDMA marshalling logic assumed that xdr->page_base was an
    offset into the first page of xdr->page_list. It is in fact an
    offset into the xdr->page_list itself, that is, it selects the
    first page in the page_list and the offset into that page.

    The symptom depended in part on the rpc_memreg_strategy. If it was
    FRMR, or some other one-shot mapping mode, the connection would get
    torn down on a base and bounds error. When the badly marshalled RPC
    was retransmitted it would reconnect, get the error, and tear down the
    connection again in a loop forever. This resulted in a hung mount. For
    the other modes, it would result in silent data corruption. This bug is
    most easily reproduced by writing more data than the filesystem
    has space for.

    This fix corrects the page_base assumption and otherwise simplifies
    the iov mapping logic.

    Signed-off-by: Tom Tucker
    Signed-off-by: Trond Myklebust

    Tom Tucker
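
The corrected interpretation of page_base amounts to a shift and a mask. The sketch below assumes 4 KiB pages and uses made-up names (`DEMO_PAGE_SHIFT`, `locate`); it shows only the arithmetic, not the marshalling code.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12u                          /* assumes 4 KiB pages */
#define DEMO_PAGE_SIZE  ((size_t)1 << DEMO_PAGE_SHIFT)

struct page_pos {
    size_t index;    /* which entry of the page list */
    size_t offset;   /* byte offset inside that page */
};

/* Split an offset into the page list into (page index, offset in page). */
static struct page_pos locate(size_t page_base)
{
    struct page_pos p = {
        .index  = page_base >> DEMO_PAGE_SHIFT,
        .offset = page_base & (DEMO_PAGE_SIZE - 1),
    };
    return p;
}
```

For example, a page_base of 4196 selects page 1 at offset 100, whereas the old assumption would have treated it as offset 4196 into page 0.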
     
  • Use our own async error handler.
    Mark the layout as failed and retry i/o through the MDS on specified errors.

    Update the mds_offset in nfs_readpage_retry so that a failed short-read retry
    to a DS gets correctly resent through the MDS.

    Signed-off-by: Andy Adamson
    Signed-off-by: Trond Myklebust

    Andy Adamson
     
  • rpc_run_task can only fail if it is not passed a preallocated task.
    However, that is not at all clear from the current code. So
    remove several checks for failures that cannot occur.

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • queue_work() only returns 0 or 1, never a negative value.

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

    Fred Isaman
     

11 Mar, 2011

3 commits

  • xs_create_sock() is supposed to return a pointer or an ERR_PTR-encoded
    error, but it currently returns 0 if xs_bind() fails.

    Signed-off-by: Ben Hutchings
    Cc: stable@kernel.org [v2.6.37]
    Signed-off-by: Trond Myklebust

    Ben Hutchings
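
The pointer-or-encoded-error convention can be mimicked in userspace as follows. The `demo_*` helpers imitate the kernel's ERR_PTR family but are written here from scratch, and `create_sock` is a hypothetical stand-in that takes the bind result as a parameter.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *demo_err_ptr(long err)     { return (void *)(intptr_t)err; }
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

static int the_socket;  /* dummy object standing in for a real socket */

/* Returns a valid pointer or an encoded error, never a bare 0 on failure. */
static void *create_sock(int bind_result)
{
    if (bind_result < 0)
        return demo_err_ptr(bind_result);   /* propagate the bind error */
    return &the_socket;
}
```

Returning 0 on failure is dangerous under this convention because `demo_is_err(NULL)` is false, so a caller that checks only for encoded errors would go on to use a null pointer.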
     
  • We leak the memory allocated to 'ctxt' when we return after
    'ib_dma_mapping_error()' returns !=0.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Trond Myklebust

    Jesper Juhl
     
  • Although they run as rpciod background tasks, under normal operation
    (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
    and nfs4_do_close() want to be fully synchronous. This means that when we
    exit, we want all references to the rpc_task to be gone, and we want
    any dentry references etc. held by that task to be released.

    For this reason these functions call __rpc_wait_for_completion_task(),
    followed by rpc_put_task() in the expectation that the latter will be
    releasing the last reference to the rpc_task, and thus ensuring that the
    callback_ops->rpc_release() has been called synchronously.

    This patch fixes a race which exists due to the fact that
    rpciod calls rpc_complete_task() (in order to wake up the callers of
    __rpc_wait_for_completion_task()) and then subsequently calls
    rpc_put_task() without ensuring that these two steps are done atomically.

    In order to avoid adding new spin locks, the patch uses the existing
    waitqueue spin lock to order the rpc_task reference count releases between
    the waiting process and rpciod.
    The common case where nobody is waiting for completion is optimised for by
    checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
    reference count is 1: in those cases we drop trying to grab the spin lock,
    and immediately free up the rpc_task.

    Those few processes that need to put the rpc_task from inside an
    asynchronous context and that do not care about ordering are given a new
    helper: rpc_put_task_async().

    Signed-off-by: Trond Myklebust

    Trond Myklebust