07 Mar, 2010

2 commits

  • A frequent question from users about memory management is how many
    swap entries are used by each process. This information can also give
    some hints to the oom-killer.

    Although we can count the number of swapents per process by scanning
    /proc/<pid>/smaps, this is very slow and not suitable for ordinary
    process-information tools such as 'ps' or 'top' (ps and top are already
    slow enough).

    This patch adds a swapents counter to mm_counter and updates it at each
    swap event. The information is exported via the /proc/<pid>/status file
    as the VmSwap field:

    [kamezawa@bluextal memory]$ cat /proc/self/status
    Name: cat
    State: R (running)
    Tgid: 2910
    Pid: 2910
    PPid: 2823
    TracerPid: 0
    Uid: 500 500 500 500
    Gid: 500 500 500 500
    FDSize: 256
    Groups: 500
    VmPeak: 82696 kB
    VmSize: 82696 kB
    VmLck: 0 kB
    VmHWM: 432 kB
    VmRSS: 432 kB
    VmData: 172 kB
    VmStk: 84 kB
    VmExe: 48 kB
    VmLib: 1568 kB
    VmPTE: 40 kB
    VmSwap: 0 kB

    Reviewed-by: Minchan Kim
    Reviewed-by: Christoph Lameter
    Cc: Lee Schermerhorn
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
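
As a rough illustration of how a 'ps'- or 'top'-like tool might consume the new field, the sketch below extracts VmSwap from the text of a status file. The helper names `parse_vm_swap_kb` and `read_vm_swap_kb` are hypothetical and not part of the patch; the sample input is abbreviated from the output quoted in the commit message.

```python
import re
from typing import Optional


def parse_vm_swap_kb(status_text: str) -> Optional[int]:
    """Return the VmSwap value in kB, or None if the field is absent.

    Hypothetical helper: kernels without this patch do not export VmSwap.
    """
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_swap_kb(pid: str = "self") -> Optional[int]:
    """Read /proc/<pid>/status (Linux only) and extract VmSwap."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_swap_kb(status_file.read())


# Abbreviated sample taken from the status output shown above.
sample = "VmPTE: 40 kB\nVmSwap: 0 kB\n"
print(parse_vm_swap_kb(sample))
```

Reading one small status file per process is the point of the change: it avoids walking every mapping in smaps just to obtain a single number.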
     
  • Per-mm statistics are, by nature, an object shared among threads, so
    they can be a source of cache misses in the page fault path.

    This patch adds a per-thread cache for mm_counter. RSS values are
    counted in a struct in task_struct and synchronized with the mm's
    counters at certain events.

    In this patch, the event is a call to handle_mm_fault: the per-thread
    value is added to the mm's counter once every 64 calls.

    A rough estimate from a small benchmark with parallel threads (2 threads)
    shows:
    [before]
    4.5 cache-miss/faults
    [after]
    4.0 cache-miss/faults
    In any case, mmap_sem becomes the most contended object as the number of
    threads grows.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Minchan Kim
    Cc: Christoph Lameter
    Cc: Lee Schermerhorn
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
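
The batching idea in this commit can be modeled in a few lines. This is a single-threaded sketch under stated assumptions, not the kernel code: the class names `MM` and `Task`, the method names, and the constant `SYNC_THRESHOLD` are hypothetical, and only the value 64 comes from the commit message.

```python
SYNC_THRESHOLD = 64  # flush interval taken from the commit message


class MM:
    """Stands in for the shared per-mm counter (hypothetical model)."""

    def __init__(self) -> None:
        self.rss = 0


class Task:
    """Stands in for a thread with a private, cached RSS delta."""

    def __init__(self, mm: MM) -> None:
        self.mm = mm
        self.cached_rss = 0
        self.events = 0

    def add_rss(self, pages: int) -> None:
        # Cheap path: touches only per-thread state.
        self.cached_rss += pages

    def handle_fault(self) -> None:
        # Models "the event is a call to handle_mm_fault".
        self.events += 1
        if self.events >= SYNC_THRESHOLD:
            self.sync()

    def sync(self) -> None:
        # Expensive path: one write to the shared counter per 64 events.
        self.mm.rss += self.cached_rss
        self.cached_rss = 0
        self.events = 0


mm = MM()
task = Task(mm)
for _ in range(130):
    task.add_rss(1)
    task.handle_fault()
print(mm.rss, task.cached_rss)  # prints "128 2"
```

The trade-off is visible in the output: the shared counter lags behind by whatever is still cached in each thread, in exchange for far fewer writes to the shared object.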
     

06 Mar, 2010

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
    quota: stop using QUOTA_OK / NO_QUOTA
    dquot: cleanup dquot initialize routine
    dquot: move dquot initialization responsibility into the filesystem
    dquot: cleanup dquot drop routine
    dquot: move dquot drop responsibility into the filesystem
    dquot: cleanup dquot transfer routine
    dquot: move dquot transfer responsibility into the filesystem
    dquot: cleanup inode allocation / freeing routines
    dquot: cleanup space allocation / freeing routines
    ext3: add writepage sanity checks
    ext3: Truncate allocated blocks if direct IO write fails to update i_size
    quota: Properly invalidate caches even for filesystems with blocksize < pagesize
    quota: generalize quota transfer interface
    quota: sb_quota state flags cleanup
    jbd: Delay discarding buffers in journal_unmap_buffer
    ext3: quota_write cross block boundary behaviour
    quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
    quota: split out compat_sys_quotactl support from quota.c
    quota: split out netlink notification support from quota.c
    quota: remove invalid optimization from quota_sync_all
    ...

    Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

    Linus Torvalds
     

05 Mar, 2010

6 commits

  • Get rid of the initialize dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_initialize helper to __dquot_initialize
    and vfs_dq_init to dquot_initialize to have a consistent namespace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the drop dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_drop helper to __dquot_drop
    and vfs_dq_drop to dquot_drop to have a consistent namespace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the transfer dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_transfer helper to __dquot_transfer
    and vfs_dq_transfer to dquot_transfer to have a consistent namespace,
    and make the new dquot_transfer return a normal negative errno value
    which all callers expect.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the alloc_inode and free_inode dquot operations - they are
    always called from the filesystem and if a filesystem really needs
    its own (which none currently does) it can just call into its
    own routine directly.

    Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
    call the low-level dquot_alloc_inode / dquot_free_inode routines
    directly, which now lose the number argument, which was always 1.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the alloc_space, free_space, reserve_space, claim_space and
    release_rsv dquot operations - they are always called from the filesystem
    and if a filesystem really needs its own (which none currently does)
    it can just call into its own routine directly.

    Move shared logic into the common __dquot_alloc_space,
    dquot_claim_space_nodirty and __dquot_free_space low-level methods,
    and rationalize the wrappers around them to move as much code as
    possible into the common block for CONFIG_QUOTA vs not. Also rename
    all these helpers to be named dquot_* instead of vfs_dq_*.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
    init: Open /dev/console from rootfs
    mqueue: fix typo "failues" -> "failures"
    mqueue: only set error codes if they are really necessary
    mqueue: simplify do_open() error handling
    mqueue: apply mathematics distributivity on mq_bytes calculation
    mqueue: remove unneeded info->messages initialization
    mqueue: fix mq_open() file descriptor leak on user-space processes
    fix race in d_splice_alias()
    set S_DEAD on unlink() and non-directory rename() victims
    vfs: add NOFOLLOW flag to umount(2)
    get rid of ->mnt_parent in tomoyo/realpath
    hppfs can use existing proc_mnt, no need for do_kern_mount() in there
    Mirror MS_KERNMOUNT in ->mnt_flags
    get rid of useless vfsmount_lock use in put_mnt_ns()
    Take vfsmount_lock to fs/internal.h
    get rid of insanity with namespace roots in tomoyo
    take check for new events in namespace (guts of mounts_poll()) to namespace.c
    Don't mess with generic_permission() under ->d_lock in hpfs
    sanitize const/signedness for udf
    nilfs: sanitize const/signedness in dealing with ->d_name.name
    ...

    Fix up fairly trivial (famous last words...) conflicts in
    drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c

    Linus Torvalds
     

04 Mar, 2010

2 commits

  • * document locking
    * add the missing part of data structure invariants (relationship
    between mnt_share and mnt_slave lists in case of a peer group
    among slaves).

    Signed-off-by: Al Viro

    Al Viro
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
    nilfs2: add reader's lock for cno in nilfs_ioctl_sync
    nilfs2: delete unnecessary condition in load_segment_summary
    nilfs2: move iterator to write log into segment buffer
    nilfs2: get rid of s_dirt flag use
    nilfs2: get rid of nilfs_segctor_req struct
    nilfs2: delete unnecessary condition in nilfs_dat_translate
    nilfs2: fix potential hang in nilfs_error on errors=remount-ro
    nilfs2: use mnt_want_write in ioctls where write access is needed
    nilfs2: issue discard request after cleaning segments

    Linus Torvalds
     

13 Feb, 2010

1 commit


16 Jan, 2010

1 commit

  • Add expedited functions. Review documentation and update
    obsolete verbiage. Also fix the advice for the RCU CPU-stall
    kernel configuration parameter, and document RCU CPU-stall
    warnings.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

12 Jan, 2010

1 commit

  • Commit d899bf7b (procfs: provide stack information for threads) added
    stack information to /proc/{pid}/status, but it causes a large
    performance regression, because both taking mmap_sem and walking the
    page tables are heavy operations. Unfortunately, /proc/{pid}/status is
    also used by the ps command, and ps is one of the most important tools.

    With many processes running, ps performance is:

    [before d899bf7b]

    % perf stat ps >/dev/null

    Performance counter stats for 'ps':

    4090.435806 task-clock-msecs # 0.032 CPUs
    229 context-switches # 0.000 M/sec
    0 CPU-migrations # 0.000 M/sec
    234 page-faults # 0.000 M/sec
    8587565207 cycles # 2099.425 M/sec
    9866662403 instructions # 1.149 IPC
    3789415411 cache-references # 926.409 M/sec
    30419509 cache-misses # 7.437 M/sec

    128.859521955 seconds time elapsed

    [after d899bf7b]

    % perf stat ps > /dev/null

    Performance counter stats for 'ps':

    4305.081146 task-clock-msecs # 0.028 CPUs
    480 context-switches # 0.000 M/sec
    2 CPU-migrations # 0.000 M/sec
    237 page-faults # 0.000 M/sec
    9021211334 cycles # 2095.480 M/sec
    10605887536 instructions # 1.176 IPC
    3612650999 cache-references # 839.160 M/sec
    23917502 cache-misses # 5.556 M/sec

    152.277819582 seconds time elapsed

    Thus, this patch reverts it. Fortunately, /proc/{pid}/task/{tid}/smaps
    provides almost the same information, so we can use that instead.

    Commit d899bf7b introduced two features:

    1) Add the annotation of [thread stack: xxxx] mark to
    /proc/{pid}/task/{tid}/maps.
    2) Add StackUsage field to /proc/{pid}/status.

    I only revert (2), because I haven't seen (1) cause a regression.

    Signed-off-by: KOSAKI Motohiro
    Cc: Stefani Seibold
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Alexey Dobriyan
    Cc: "Eric W. Biederman"
    Cc: Randy Dunlap
    Cc: Andrew Morton
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
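
To put the two `perf stat` runs side by side, the following sketch works out the ratios from the figures quoted in the commit message. The variable names are illustrative, and the closing interpretation is an inference from the numbers, not a claim made by the commit author.

```python
# Figures copied from the two `perf stat ps` runs quoted above.
before = {"elapsed_s": 128.859521955, "task_clock_ms": 4090.435806,
          "context_switches": 229}
after = {"elapsed_s": 152.277819582, "task_clock_ms": 4305.081146,
         "context_switches": 480}

slowdown = after["elapsed_s"] / before["elapsed_s"]
extra_seconds = after["elapsed_s"] - before["elapsed_s"]
cpu_ratio = after["task_clock_ms"] / before["task_clock_ms"]
switch_ratio = after["context_switches"] / before["context_switches"]

print(f"elapsed time ratio:    {slowdown:.2f}x")        # 1.18x
print(f"extra wall-clock time: {extra_seconds:.1f} s")  # 23.4 s
print(f"task-clock ratio:      {cpu_ratio:.2f}x")       # 1.05x
print(f"context-switch ratio:  {switch_ratio:.2f}x")    # 2.10x
```

Wall-clock time grows by about 18% while CPU time grows by only about 5% and context switches roughly double, which would be consistent with ps spending the extra time waiting (for example on mmap_sem) and not computing.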
     

05 Jan, 2010

1 commit


02 Jan, 2010

1 commit


25 Dec, 2009

1 commit


24 Dec, 2009

2 commits


17 Dec, 2009

1 commit

  • * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: (42 commits)
    nfsd: remove pointless paths in file headers
    nfsd: move most of nfsfh.h to fs/nfsd
    nfsd: remove unused field rq_reffh
    nfsd: enable V4ROOT exports
    nfsd: make V4ROOT exports read-only
    nfsd: restrict filehandles accepted in V4ROOT case
    nfsd: allow exports of symlinks
    nfsd: filter readdir results in V4ROOT case
    nfsd: filter lookup results in V4ROOT case
    nfsd4: don't continue "under" mounts in V4ROOT case
    nfsd: introduce export flag for v4 pseudoroot
    nfsd: let "insecure" flag vary by pseudoflavor
    nfsd: new interface to advertise export features
    nfsd: Move private headers to source directory
    vfs: nfsctl.c un-used nfsd #includes
    lockd: Remove un-used nfsd headers #includes
    s390: remove un-used nfsd #includes
    sparc: remove un-used nfsd #includes
    parsic: remove un-used nfsd #includes
    compat.c: Remove dependence on nfsd private headers
    ...

    Linus Torvalds
     

16 Dec, 2009

2 commits

  • Using create_proc_entry() + ->proc_fops assignment is racy because
    ->proc_fops will be NULL for some time. Use proc_create() to avoid the
    race.

    Signed-off-by: Alexey Dobriyan
    Cc: Randy Dunlap
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
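
The window described here can be shown with a deterministic toy model. Everything in this sketch is hypothetical (the `Entry` class, the `registry` dict, and both helper functions); it imitates only the ordering problem, with a callback standing in for a concurrent lookup, and does not reproduce the procfs API.

```python
from typing import Callable, Dict, List, Optional


class Entry:
    def __init__(self, name: str, fops: Optional[str] = None) -> None:
        self.name = name
        self.fops = fops


registry: Dict[str, Entry] = {}


def create_then_assign(name: str, fops: str,
                       concurrent_open: Callable[[], None]) -> None:
    """Two-step pattern: the entry is visible before fops is set."""
    entry = Entry(name)
    registry[name] = entry   # published with fops still None
    concurrent_open()        # another thread may look the entry up here
    entry.fops = fops


def create_atomically(name: str, fops: str,
                      concurrent_open: Callable[[], None]) -> None:
    """One-step pattern: the entry is fully initialised before publication."""
    entry = Entry(name, fops)
    concurrent_open()        # nothing is visible yet, so nothing to misuse
    registry[name] = entry


observed: List[str] = []


def snoop() -> None:
    """Record what a concurrent lookup would see at this instant."""
    entry = registry.get("demo")
    observed.append("absent" if entry is None
                    else f"present, fops={entry.fops}")


create_then_assign("demo", "demo_fops", snoop)
registry.clear()
create_atomically("demo", "demo_fops", snoop)
print(observed)  # ['present, fops=None', 'absent']
```

In the first case the lookup finds an entry it cannot safely use; in the second it either finds nothing or finds a complete entry, which is the property the commit relies on.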
     
  • Setting a thread's comm to something unique is a very useful ability
    and is helpful for debugging complicated threaded applications. However,
    currently the only way to set a thread name is for the thread to name
    itself via the PR_SET_NAME prctl.

    There may be situations where it would be advantageous for a thread
    dispatcher to name the threads it is managing, rather than having the
    threads describe themselves. This sort of behavior is available on
    other systems via the pthread_setname_np() interface.

    This patch exports a task's comm via the /proc/pid/comm and
    /proc/pid/task/tid/comm interfaces, and allows thread siblings to write
    to these values.

    [akpm@linux-foundation.org: cleanups]
    Signed-off-by: John Stultz
    Cc: Andi Kleen
    Cc: Arjan van de Ven
    Cc: Mike Fulton
    Cc: Sean Foley
    Cc: Darren Hart
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
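
A dispatcher could use the new files roughly as follows. This is a minimal sketch assuming a Linux system with this patch applied; the helper names `comm_path`, `set_comm` and `expected_comm` are hypothetical, and the truncation helper assumes the kernel's 16-byte comm buffer (TASK_COMM_LEN, including the trailing NUL).

```python
from typing import Optional

TASK_COMM_LEN = 16  # kernel comm buffer size, including the trailing NUL


def comm_path(pid: int, tid: Optional[int] = None) -> str:
    """Build the path of the comm file for a process or one of its threads."""
    if tid is None:
        return f"/proc/{pid}/comm"
    return f"/proc/{pid}/task/{tid}/comm"


def set_comm(pid: int, tid: int, name: str) -> None:
    """Write a new thread name; needs Linux and write access to the file."""
    with open(comm_path(pid, tid), "w") as comm_file:
        comm_file.write(name)


def expected_comm(name: str) -> str:
    """Predict the stored name: at most TASK_COMM_LEN - 1 bytes are kept."""
    return name.encode()[: TASK_COMM_LEN - 1].decode(errors="ignore")


print(comm_path(1234, 1240))                     # /proc/1234/task/1240/comm
print(expected_comm("a-very-long-worker-name"))  # a-very-long-wor
```

Only `comm_path` and `expected_comm` are exercised here, since `set_comm` depends on a live thread ID and on the permissions described in the commit message.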
     

12 Dec, 2009

2 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (21 commits)
    ext3: PTR_ERR return of wrong pointer in setup_new_group_blocks()
    ext3: Fix data / filesystem corruption when write fails to copy data
    ext4: Support for 64-bit quota format
    ext3: Support for vfsv1 quota format
    quota: Implement quota format with 64-bit space and inode limits
    quota: Move definition of QFMT_OCFS2 to linux/quota.h
    ext2: fix comment in ext2_find_entry about return values
    ext3: Unify log messages in ext3
    ext2: clear uptodate flag on super block I/O error
    ext2: Unify log messages in ext2
    ext3: make "norecovery" an alias for "noload"
    ext3: Don't update the superblock in ext3_statfs()
    ext3: journal all modifications in ext3_xattr_set_handle
    ext2: Explicitly assign values to on-disk enum of filetypes
    quota: Fix WARN_ON in lookup_one_len
    const: struct quota_format_ops
    ubifs: remove manual O_SYNC handling
    afs: remove manual O_SYNC handling
    kill wait_on_page_writeback_range
    vfs: Implement proper O_SYNC semantics
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (49 commits)
    nilfs2: separate wait function from nilfs_segctor_write
    nilfs2: add iterator for segment buffers
    nilfs2: hide nilfs_write_info struct in segment buffer code
    nilfs2: relocate io status variables to segment buffer
    nilfs2: do not return io error for bio allocation failure
    nilfs2: use list_splice_tail or list_splice_tail_init
    nilfs2: replace mark_inode_dirty as nilfs_mark_inode_dirty
    nilfs2: delete mark_inode_dirty in nilfs_delete_entry
    nilfs2: delete mark_inode_dirty in nilfs_commit_chunk
    nilfs2: change return type of nilfs_commit_chunk
    nilfs2: split nilfs_unlink as nilfs_do_unlink and nilfs_unlink
    nilfs2: delete redundant mark_inode_dirty
    nilfs2: expand inode_inc_link_count and inode_dec_link_count
    nilfs2: delete mark_inode_dirty from nilfs_set_link
    nilfs2: delete mark_inode_dirty in nilfs_new_inode
    nilfs2: add norecovery mount option
    nilfs2: add helper to get if volume is in a valid state
    nilfs2: move recovery completion into load_nilfs function
    nilfs2: apply readahead for recovery on mount
    nilfs2: clean up get/put function of a segment usage
    ...

    Linus Torvalds
     

11 Dec, 2009

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (47 commits)
    ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)
    ext4: Do not override ext2 or ext3 if built they are built as modules
    jbd2: Export jbd2_log_start_commit to fix ext4 build
    ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT
    ext4: Wait for proper transaction commit on fsync
    ext4: fix incorrect block reservation on quota transfer.
    ext4: quota macros cleanup
    ext4: ext4_get_reserved_space() must return bytes instead of blocks
    ext4: remove blocks from inode prealloc list on failure
    ext4: wait for log to commit when umounting
    ext4: Avoid data / filesystem corruption when write fails to copy data
    ext4: Use ext4 file system driver for ext2/ext3 file system mounts
    ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()
    jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()
    ext4: remove unused parameter wbc from __ext4_journalled_writepage()
    ext4: remove encountered_congestion trace
    ext4: move_extent_per_page() cleanup
    ext4: initialize moved_len before calling ext4_move_extents()
    ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT
    ext4: use ext4_data_block_valid() in ext4_free_blocks()
    ...

    Linus Torvalds
     

10 Dec, 2009

4 commits


06 Dec, 2009

1 commit

  • …/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
    sched, cputime: Introduce thread_group_times()
    sched, cputime: Cleanups related to task_times()
    Revert "sched, x86: Optimize branch hint in __switch_to()"
    sched: Fix isolcpus boot option
    sched: Revert 498657a478c60be092208422fefa9c7b248729c2
    sched, time: Define nsecs_to_jiffies()
    sched: Remove task_{u,s,g}time()
    sched: Introduce task_times() to replace task_{u,s}time() pair
    sched: Limit the number of scheduler debug messages
    sched.c: Call debug_show_all_locks() when dumping all tasks
    sched, x86: Optimize branch hint in __switch_to()
    sched: Optimize branch hint in context_switch()
    sched: Optimize branch hint in pick_next_task_fair()
    sched_feat_write(): Update ppos instead of file->f_pos
    sched: Sched_rt_periodic_timer vs cpu hotplug
    sched, kvm: Fix race condition involving sched_in_preempt_notifers
    sched: More generic WAKE_AFFINE vs select_idle_sibling()
    sched: Cleanup select_task_rq_fair()
    sched: Fix granularity of task_u/stime()
    sched: Fix/add missing update_rq_clock() calls
    ...

    Linus Torvalds
     

01 Dec, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (31 commits)
    FS-Cache: Provide nop fscache_stat_d() if CONFIG_FSCACHE_STATS=n
    SLOW_WORK: Fix GFS2 to #include before using THIS_MODULE
    SLOW_WORK: Fix CIFS to pass THIS_MODULE to slow_work_register_user()
    CacheFiles: Don't log lookup/create failing with ENOBUFS
    CacheFiles: Catch an overly long wait for an old active object
    CacheFiles: Better showing of debugging information in active object problems
    CacheFiles: Mark parent directory locks as I_MUTEX_PARENT to keep lockdep happy
    CacheFiles: Handle truncate unlocking the page we're reading
    CacheFiles: Don't write a full page if there's only a partial page to cache
    FS-Cache: Actually requeue an object when requested
    FS-Cache: Start processing an object's operations on that object's death
    FS-Cache: Make sure FSCACHE_COOKIE_LOOKING_UP cleared on lookup failure
    FS-Cache: Add a retirement stat counter
    FS-Cache: Handle pages pending storage that get evicted under OOM conditions
    FS-Cache: Handle read request vs lookup, creation or other cache failure
    FS-Cache: Don't delete pending pages from the page-store tracking tree
    FS-Cache: Fix lock misorder in fscache_write_op()
    FS-Cache: The object-available state can't rely on the cookie to be available
    FS-Cache: Permit cache retrieval ops to be interrupted in the initial wait phase
    FS-Cache: Use radix tree preload correctly in tracking of pages to be stored
    ...

    Linus Torvalds
     

26 Nov, 2009

1 commit


24 Nov, 2009

1 commit


20 Nov, 2009

7 commits

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    ocfs2: Trivial cleanup of jbd compatibility layer removal
    ocfs2: Refresh documentation
    ocfs2: return f_fsid info in ocfs2_statfs()
    ocfs2: duplicate inline data properly during reflink.
    ocfs2: Move ocfs2_complete_reflink to the right place.
    ocfs2: Return -EINVAL when a device is not ocfs2.

    Linus Torvalds
     
  • This adds a "norecovery" mount option which disables temporary write
    access to read-only mounts or snapshots during mount/recovery.
    Without this option, write access is performed even for those
    types of mounts; the temporary write access is needed to mount the root
    file system read-only after an unclean shutdown.

    This option is helpful when the user wants to prevent any write
    access to the device.

    Signed-off-by: Ryusuke Konishi
    Cc: Eric Sandeen

    Ryusuke Konishi
     
  • Since most filesystems use nofoobar-style options,
    change the barrier=off option to nobarrier.

    Signed-off-by: Jiro SEKIBA
    Signed-off-by: Ryusuke Konishi

    Jiro SEKIBA
     
  • Users on the linux-ext4 list recently complained about differences
    across filesystems w.r.t. how to mount without a journal replay.

    In the discussion it was noted that xfs's "norecovery" option is
    perhaps more descriptively accurate than "noload," so let's make
    that an alias for ext4.

    Also show this status in /proc/mounts

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • It is anticipated that when sb_issue_discard starts doing
    real work on trim-capable devices, we may see issues. Make
    this mount-time optional, and default it to off until we know
    that things are working out OK.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • Catch an overly long wait for an old, dying active object when we want to
    replace it with a new one. The probability is that all the slow-work threads
    are hogged, and the delete can't get a look in.

    What we do instead is:

    (1) if there's nothing in the slow work queue, we sleep until either the dying
    object has finished dying or there is something in the slow work queue
    behind which we can queue our object.

    (2) if there is something in the slow work queue, we return ETIMEDOUT to
    fscache_lookup_object(), which then puts us back on the slow work queue,
    presumably behind the deletion that we're blocked by. We are then
    deferred for a while until we work our way back through the queue -
    without blocking a slow-work thread unnecessarily.

    A backtrace similar to the following may appear in the log without this patch:

    INFO: task kslowd004:5711 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kslowd004 D 0000000000000000 0 5711 2 0x00000080
    ffff88000340bb80 0000000000000046 ffff88002550d000 0000000000000000
    ffff88002550d000 0000000000000007 ffff88000340bfd8 ffff88002550d2a8
    000000000000ddf0 00000000000118c0 00000000000118c0 ffff88002550d2a8
    Call Trace:
    [] ? trace_hardirqs_on+0xd/0xf
    [] ? cachefiles_wait_bit+0x0/0xd [cachefiles]
    [] cachefiles_wait_bit+0x9/0xd [cachefiles]
    [] __wait_on_bit+0x43/0x76
    [] ? ext3_xattr_get+0x1ec/0x270
    [] out_of_line_wait_on_bit+0x69/0x74
    [] ? cachefiles_wait_bit+0x0/0xd [cachefiles]
    [] ? wake_bit_function+0x0/0x2e
    [] cachefiles_mark_object_active+0x203/0x23b [cachefiles]
    [] cachefiles_walk_to_object+0x558/0x827 [cachefiles]
    [] cachefiles_lookup_object+0xac/0x12a [cachefiles]
    [] fscache_lookup_object+0x1c7/0x214 [fscache]
    [] fscache_object_state_machine+0xa5/0x52d [fscache]
    [] fscache_object_slow_work_execute+0x5f/0xa0 [fscache]
    [] slow_work_execute+0x18f/0x2d1
    [] slow_work_thread+0x1c5/0x308
    [] ? autoremove_wake_function+0x0/0x34
    [] ? slow_work_thread+0x0/0x308
    [] kthread+0x7a/0x82
    [] child_rip+0xa/0x20
    [] ? restore_args+0x0/0x30
    [] ? kthread+0x0/0x82
    [] ? child_rip+0x0/0x20
    1 lock held by kslowd004/5711:
    #0: (&sb->s_type->i_mutex_key#7/1){+.+.+.}, at: [] cachefiles_walk_to_object+0x1b3/0x827 [cachefiles]

    Signed-off-by: David Howells

    David Howells
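
The two cases listed in this commit message amount to a small decision rule, sketched below. This is a hypothetical model and not the cachefiles code: the `Action` enum and the function name are invented for illustration, and returning the error as a negative errno is an assumption based on the usual kernel convention.

```python
import errno
from enum import Enum
from typing import Tuple


class Action(Enum):
    PROCEED = "old object is gone; mark the new one active"
    SLEEP = "wait for the old object to die or for the queue to fill"
    REQUEUE = "give up the thread and go back on the slow-work queue"


def wait_for_old_object(old_object_dying: bool,
                        queue_length: int) -> Tuple[Action, int]:
    """Return (action, error code) for one pass of the wait logic."""
    if not old_object_dying:
        return Action.PROCEED, 0
    if queue_length == 0:
        # Case (1): nothing to queue behind, so sleeping blocks nobody.
        return Action.SLEEP, 0
    # Case (2): free the thread so the pending deletion can get a turn.
    return Action.REQUEUE, -errno.ETIMEDOUT


print(wait_for_old_object(True, 0)[0].name)  # SLEEP
print(wait_for_old_object(True, 3)[0].name)  # REQUEUE
```

The point of case (2) is that a slow-work thread is never held while the deletion it is waiting for sits in the queue behind it.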
     
  • Start processing an object's operations when that object moves into the DYING
    state as the object cannot be destroyed until all its outstanding operations
    have completed.

    Furthermore, make sure that read and allocation operations handle being woken
    up on a dead object. Such events are recorded in the Allocs.abt and
    Retrvls.abt statistics as viewable through /proc/fs/fscache/stats.

    The code for waiting for object activation for the read and allocation
    operations is also extracted into its own function as it is much the same in
    all cases, differing only in the stats incremented.

    Signed-off-by: David Howells

    David Howells