31 Jan, 2013

1 commit

  • This fixes a livelock in the xprt->sending queue where we end up never
    making progress on lower priority tasks because sleep_on_priority()
    keeps adding new tasks with the same owner to the head of the queue,
    and priority bumps mean that we keep resetting the queue->owner to
    whatever task is at the head of the queue.

    Regression introduced by commit c05eecf636101dd4347b2d8fa457626bf0088e0a
    (SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones).

    Reported-by: Andy Adamson
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

09 Jan, 2013

1 commit

  • If the rpc_task exits while holding the socket write lock before it has
    allocated an rpc slot, then the usual mechanism for releasing the write
    lock in xprt_release() is defeated.

    The problem occurs if the call to xprt_lock_write() initially fails, so
    that the rpc_task is put on the xprt->sending wait queue. If the task
    exits after being assigned the lock by __xprt_lock_write_func, but
    before it has retried the call to xprt_lock_and_alloc_slot(), then
    it calls xprt_release() while holding the write lock, but will
    immediately exit due to the test for task->tk_rqstp != NULL.

    Reported-by: Chris Perl
    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org [>= 3.1]

    Trond Myklebust
     

05 Jan, 2013

1 commit


06 Dec, 2012

2 commits


05 Nov, 2012

4 commits


29 Sep, 2012

1 commit


01 Aug, 2012

2 commits

  • Merge Andrew's second set of patches:
    - MM
    - a few random fixes
    - a couple of RTC leftovers

    * emailed patches from Andrew Morton : (120 commits)
    rtc/rtc-88pm80x: remove unneed devm_kfree
    rtc/rtc-88pm80x: assign ret only when rtc_register_driver fails
    mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables
    tmpfs: distribute interleave better across nodes
    mm: remove redundant initialization
    mm: warn if pg_data_t isn't initialized with zero
    mips: zero out pg_data_t when it's allocated
    memcg: gix memory accounting scalability in shrink_page_list
    mm/sparse: remove index_init_lock
    mm/sparse: more checks on mem_section number
    mm/sparse: optimize sparse_index_alloc
    memcg: add mem_cgroup_from_css() helper
    memcg: further prevent OOM with too many dirty pages
    memcg: prevent OOM with too many dirty pages
    mm: mmu_notifier: fix freed page still mapped in secondary MMU
    mm: memcg: only check anon swapin page charges for swap cache
    mm: memcg: only check swap cache pages for repeated charging
    mm: memcg: split swapin charge function into private and public part
    mm: memcg: remove needless !mm fixup to init_mm when charging
    mm: memcg: remove unneeded shmem charge type
    ...

    Linus Torvalds
     
  • Implement the new swapfile a_ops for NFS and hook up ->direct_IO. This
    will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
    PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
    ->connect() method.

    PF_MEMALLOC should allow the allocation of struct socket and related
    objects and the early (re)setting of SOCK_MEMALLOC should allow us to
    receive the packets required for the TCP connection buildup.

    [jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
    [dfeng@redhat.com: Fix handling of multiple swap files]
    [a.p.zijlstra@chello.nl: Original patch]
    Signed-off-by: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Christoph Hellwig
    Cc: David S. Miller
    Cc: Eric B Munson
    Cc: Eric Paris
    Cc: James Morris
    Cc: Mel Gorman
    Cc: Mike Christie
    Cc: Neil Brown
    Cc: Sebastian Andrzej Siewior
    Cc: Trond Myklebust
    Cc: Xiaotian Feng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

31 Jul, 2012

2 commits

  • We've had some reports of a deadlock where rpciod ends up with a stack
    trace like this:

    PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14"
    #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
    #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
    #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
    #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
    #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
    #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
    #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
    #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
    #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
    #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

    rpciod is trying to allocate memory for a new socket to talk to the
    server. The VM ends up calling ->releasepage to get more memory, and it
    tries to do a blocking commit. That commit can't succeed however without
    a connected socket, so we deadlock.

    Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
    socket allocation, and having nfs_release_page check for that flag when
    deciding whether to do a commit call. Also, set PF_FSTRANS
    unconditionally in rpc_async_schedule since that function can also do
    allocations sometimes.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org

    Jeff Layton
     
  • rpc_make_runnable is not generally called with the queue lock held, unless
    it's waking up a task that has been sitting on a waitqueue. This is safe
    when the task has not entered the FSM yet, but the comments don't really
    spell this out.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     

20 Mar, 2012

1 commit

  • The problem is that for the case of priority queues, we
    have to assume that __rpc_remove_wait_queue_priority will move new
    elements from the tk_wait.links lists into the queue->tasks[] list.
    We therefore cannot use list_for_each_entry_safe() on queue->tasks[],
    since that will skip these new tasks that __rpc_remove_wait_queue_priority
    is adding.

    Without this fix, rpc_wake_up and rpc_wake_up_status will both fail
    to wake up all functions on priority wait queues, which can result
    in some nasty hangs.

    Reported-by: Andy Adamson
    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org

    Trond Myklebust
     

15 Feb, 2012

1 commit


01 Feb, 2012

2 commits


22 Dec, 2011

1 commit

  • * master: (848 commits)
    SELinux: Fix RCU deref check warning in sel_netport_insert()
    binary_sysctl(): fix memory leak
    mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
    ipmi_watchdog: restore settings when BMC reset
    oom: fix integer overflow of points in oom_badness
    memcg: keep root group unchanged if creation fails
    nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
    nilfs2: unbreak compat ioctl
    cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
    evm: prevent racing during tfm allocation
    evm: key must be set once during initialization
    mmc: vub300: fix type of firmware_rom_wait_states module parameter
    Revert "mmc: enable runtime PM by default"
    mmc: sdhci: remove "state" argument from sdhci_suspend_host
    x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
    IB/qib: Correct sense on freectxts increment and decrement
    RDMA/cma: Verify private data length
    cgroups: fix a css_set not found bug in cgroup_attach_proc
    oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
    Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
    ...

    Conflicts:
    kernel/cgroup_freezer.c

    Rafael J. Wysocki
     

07 Dec, 2011

1 commit

  • Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc
    layer. This should allow suspend and hibernate events to proceed, even
    when there are RPC's pending on the wire.

    Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count
    and freezer_count calls. This allows the freezer to skip tasks that are
    sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors.

    Signed-off-by: Jeff Layton
    Signed-off-by: Rafael J. Wysocki

    Jeff Layton
     

02 Dec, 2011

1 commit


18 Jul, 2011

1 commit


08 Jul, 2011

1 commit


15 Jun, 2011

1 commit

  • If the NLM daemon is killed on the NFS server, we can currently end up
    hanging forever on an 'unlock' request, instead of aborting. Basically,
    if the rpcbind request fails, or the server keeps returning garbage, we
    really want to quit instead of retrying.

    Tested-by: Vasily Averin
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
     

27 Mar, 2011

1 commit

  • BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
    Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
    Call Trace:
    [] ? put_rpccred+0x44/0x14e [sunrpc]
    [] ? rpc_ping+0x4e/0x58 [sunrpc]
    [] ? rpc_create+0x481/0x4fc [sunrpc]
    [] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
    [] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
    [] ? nfs4_set_client+0xc2/0x1f9 [nfs]
    [] ? nfs4_create_server+0xf2/0x2a6 [nfs]
    [] ? nfs4_remote_mount+0x4e/0x14a [nfs]
    [] ? vfs_kern_mount+0x6e/0x133
    [] ? nfs_do_root_mount+0x76/0x95 [nfs]
    [] ? nfs4_try_mount+0x56/0xaf [nfs]
    [] ? nfs_get_sb+0x435/0x73c [nfs]
    [] ? vfs_kern_mount+0x99/0x133
    [] ? do_kern_mount+0x48/0xd8
    [] ? do_mount+0x6da/0x741
    [] ? sys_mount+0x83/0xc0
    [] ? system_call_fastpath+0x16/0x1b

    Well, so, I think this is real bug of nfs codes somewhere. With some
    review, the code

    rpc_call_sync()
    rpc_run_task
    rpc_execute()
    __rpc_execute()
    rpc_release_task()
    rpc_release_resources_task()
    put_rpccred()
    Signed-off-by: Trond Myklebust

    OGAWA Hirofumi
     

18 Mar, 2011

2 commits

  • * 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits)
    RPC: killing RPC tasks races fixed
    xprt: remove redundant check
    SUNRPC: Convert struct rpc_xprt to use atomic_t counters
    SUNRPC: Ensure we always run the tk_callback before tk_action
    sunrpc: fix printk format warning
    xprt: remove redundant null check
    nfs: BKL is no longer needed, so remove the include
    NFS: Fix a warning in fs/nfs/idmap.c
    Cleanup: Factor out some cut-and-paste code.
    cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions.
    NFS: account direct-io into task io accounting
    gss:krb5 only include enctype numbers in gm_upcall_enctypes
    RPCRDMA: Fix FRMR registration/invalidate handling.
    RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
    NFSv4: Send unmapped uid/gids to the server when using auth_sys
    NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
    NFSv4: cleanup idmapper functions to take an nfs_server argument
    NFSv4: Send unmapped uid/gids to the server if the idmapper fails
    NFSv4: If the server sends us a numeric uid/gid then accept it
    NFSv4.1: reject zero layout with zeroed stripe unit
    ...

    Linus Torvalds
     
  • This fixes a race in which the task->tk_callback() puts the rpc_task
    to sleep, setting a new callback. Under certain circumstances, the current
    code may end up executing the task->tk_action before it gets round to the
    callback.

    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
     

16 Mar, 2011

1 commit

  • * 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: fix build failure introduced by s/freezeable/freezable/
    workqueue: add system_freezeable_wq
    rds/ib: use system_wq instead of rds_ib_fmr_wq
    net/9p: replace p9_poll_task with a work
    net/9p: use system_wq instead of p9_mux_wq
    xfs: convert to alloc_workqueue()
    reiserfs: make commit_wq use the default concurrency level
    ocfs2: use system_wq instead of ocfs2_quota_wq
    ext4: convert to alloc_workqueue()
    scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
    scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
    misc/iwmc3200top: use system_wq instead of dedicated workqueues
    i2o: use alloc_workqueue() instead of create_workqueue()
    acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
    fs/aio: aio_wq isn't used in memory reclaim path
    input/tps6507x-ts: use system_wq instead of dedicated workqueue
    cpufreq: use system_wq instead of dedicated workqueues
    wireless/ipw2x00: use system_wq instead of dedicated workqueues
    arm/omap: use system_wq in mailbox
    workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER

    Linus Torvalds
     

12 Mar, 2011

2 commits


11 Mar, 2011

1 commit

  • Although they run as rpciod background tasks, under normal operation
    (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
    and nfs4_do_close() want to be fully synchronous. This means that when we
    exit, we want all references to the rpc_task to be gone, and we want
    any dentry references etc. held by that task to be released.

    For this reason these functions call __rpc_wait_for_completion_task(),
    followed by rpc_put_task() in the expectation that the latter will be
    releasing the last reference to the rpc_task, and thus ensuring that the
    callback_ops->rpc_release() has been called synchronously.

    This patch fixes a race which exists due to the fact that
    rpciod calls rpc_complete_task() (in order to wake up the callers of
    __rpc_wait_for_completion_task()) and then subsequently calls
    rpc_put_task() without ensuring that these two steps are done atomically.

    In order to avoid adding new spin locks, the patch uses the existing
    waitqueue spin lock to order the rpc_task reference count releases between
    the waiting process and rpciod.
    The common case where nobody is waiting for completion is optimised for by
    checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
    reference count is 1: in those cases we drop trying to grab the spin lock,
    and immediately free up the rpc_task.

    Those few processes that need to put the rpc_task from inside an
    asynchronous context and that do not care about ordering are given a new
    helper: rpc_put_task_async().

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

25 Jan, 2011

1 commit


26 Oct, 2010

1 commit

  • * 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (67 commits)
    SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred
    nfs: fix unchecked value
    Ask for time_delta during fsinfo probe
    Revalidate caches on lock
    SUNRPC: After calling xprt_release(), we must restart from call_reserve
    NFSv4: Fix up the 'dircount' hint in encode_readdir
    NFSv4: Clean up nfs4_decode_dirent
    NFSv4: nfs4_decode_dirent must clear entry->fattr->valid
    NFSv4: Fix a regression in decode_getfattr
    NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer
    NFS: Ensure we check all allocation return values in new readdir code
    NFS: Readdir plus in v4
    NFS: introduce generic decode_getattr function
    NFS: check xdr_decode for errors
    NFS: nfs_readdir_filler catch all errors
    NFS: readdir with vmapped pages
    NFS: remove page size checking code
    NFS: decode_dirent should use an xdr_stream
    SUNRPC: Add a helper function xdr_inline_peek
    NFS: remove readdir plus limit
    ...

    Linus Torvalds
     

24 Sep, 2010

1 commit


22 Sep, 2010

1 commit


04 Aug, 2010

4 commits


15 May, 2010

1 commit