10 Jan, 2012

1 commit


09 Jan, 2012

3 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     
  • * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
    PM / Hibernate: Implement compat_ioctl for /dev/snapshot
    PM / Freezer: fix return value of freezable_schedule_timeout_killable()
    PM / shmobile: Allow the A4R domain to be turned off at run time
    PM / input / touchscreen: Make st1232 use device PM QoS constraints
    PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
    PM / shmobile: Remove the stay_on flag from SH7372's PM domains
    PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
    PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
    PM: Drop generic_subsys_pm_ops
    PM / Sleep: Remove forward-only callbacks from AMBA bus type
    PM / Sleep: Remove forward-only callbacks from platform bus type
    PM: Run the driver callback directly if the subsystem one is not there
    PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
    PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
    PM / Sleep: Merge internal functions in generic_ops.c
    PM / Sleep: Simplify generic system suspend callbacks
    PM / Hibernate: Remove deprecated hibernation snapshot ioctls
    PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
    ARM: S3C64XX: Implement basic power domain support
    PM / shmobile: Use common always on power domain governor
    ...

    Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused
    XBT_FORCE_SLEEP bit

    Linus Torvalds
     
  • * 'for-linus' of git://oss.sgi.com/xfs/xfs: (22 commits)
    xfs: mark the xfssyncd workqueue as non-reentrant
    xfs: simplify xfs_qm_detach_gdquots
    xfs: fix acl count validation in xfs_acl_from_disk()
    xfs: remove unused XBT_FORCE_SLEEP bit
    xfs: remove XFS_QMOPT_DQSUSER
    xfs: kill xfs_qm_idtodq
    xfs: merge xfs_qm_dqinit_core into the only caller
    xfs: add a xfs_dqhold helper
    xfs: simplify xfs_qm_dqattach_grouphint
    xfs: nest qm_dqfrlist_lock inside the dquot qlock
    xfs: flatten the dquot lock ordering
    xfs: implement lazy removal for the dquot freelist
    xfs: remove XFS_DQ_INACTIVE
    xfs: cleanup xfs_qm_dqlookup
    xfs: cleanup dquot locking helpers
    xfs: remove the sync_mode argument to xfs_qm_dqflush_all
    xfs: remove xfs_qm_sync
    xfs: make sure to really flush all dquots in xfs_qm_quotacheck
    xfs: untangle SYNC_WAIT and SYNC_TRYLOCK meanings for xfs_qm_dqflush
    xfs: remove the lid_size field in struct log_item_desc
    ...

    Fix up trivial conflict in fs/xfs/xfs_sync.c

    Linus Torvalds
     

07 Jan, 2012

1 commit


04 Jan, 2012

8 commits


24 Dec, 2011

2 commits

  • Since Linux 2.6.36 the writeback code has introduced various measures for
    live lock prevention during sync(). Unfortunately some of these are
    actively harmful for the XFS model, where the inode gets marked dirty for
    metadata from the data I/O handler.

    The older_than_this checks are now more strictly enforced since

    writeback: avoid livelocking WB_SYNC_ALL writeback

    which only calls into __writeback_inodes_sb and thus only samples the
    current cut-off time once. But on slow enough devices the previous
    asynchronous sync pass might not have fully completed yet, and thus XFS
    might mark metadata dirty only after that sampling of the cut-off time for
    the blocking pass already happened. I have not reproduced this
    myself on a real system, but by introducing artificial delay into the
    XFS I/O completion workqueues it can be reproduced easily.

    Fix this by iterating over all XFS inodes in ->sync_fs and logging all that
    are dirty. This might log inodes that only got redirtied after the
    previous pass, but given how cheap delayed logging of inodes is it
    isn't a major concern for performance.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Tested-by: Mark Tinguely
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • If the writeback code writes back an inode because it has expired we currently
    use the non-blocking ->write_inode path. This means any inode that is pinned
    is skipped. With delayed logging and a workload that has very little log
    traffic otherwise it is very likely that an inode that gets constantly
    written to is always pinned, and thus we keep refusing to write it. The VM
    writeback code at that point redirties it and doesn't try to write it again
    for another 30 seconds. This means under certain scenarios time based
    metadata writeback never happens.

    Fix this by calling into xfs_log_inode for kupdate in addition to data
    integrity syncs, and thus transfer the inode to the log ASAP.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Tested-by: Mark Tinguely
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Christoph Hellwig
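
The change in the commit above can be pictured with a small decision model. This is only a sketch under stated assumptions, not the kernel code: the function names, parameters and return strings are hypothetical, and the real ->write_inode path has more cases than shown here.

```python
# Toy decision model of the write_inode behaviour described above.
# Not kernel code: function names, parameters and return strings are hypothetical.


def write_inode_old(pinned: bool, data_integrity_sync: bool) -> str:
    """Behaviour before the fix: only data integrity syncs log the inode."""
    if data_integrity_sync:
        return "logged"
    if pinned:
        return "skipped"  # non-blocking path gives up; the VM redirties the inode
    return "written"


def write_inode_new(pinned: bool, data_integrity_sync: bool, kupdate: bool) -> str:
    """Behaviour after the fix: expiry-driven (kupdate) writeback logs the inode too."""
    if data_integrity_sync or kupdate:
        return "logged"
    if pinned:
        return "skipped"
    return "written"
```

In this model a constantly pinned inode is "skipped" on every expiry pass before the fix, while after the fix the same pass returns "logged", which is the outcome the commit message asks for.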
     

22 Dec, 2011

1 commit

  • * master: (848 commits)
    SELinux: Fix RCU deref check warning in sel_netport_insert()
    binary_sysctl(): fix memory leak
    mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
    ipmi_watchdog: restore settings when BMC reset
    oom: fix integer overflow of points in oom_badness
    memcg: keep root group unchanged if creation fails
    nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
    nilfs2: unbreak compat ioctl
    cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
    evm: prevent racing during tfm allocation
    evm: key must be set once during initialization
    mmc: vub300: fix type of firmware_rom_wait_states module parameter
    Revert "mmc: enable runtime PM by default"
    mmc: sdhci: remove "state" argument from sdhci_suspend_host
    x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
    IB/qib: Correct sense on freectxts increment and decrement
    RDMA/cma: Verify private data length
    cgroups: fix a css_set not found bug in cgroup_attach_proc
    oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
    Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
    ...

    Conflicts:
    kernel/cgroup_freezer.c

    Rafael J. Wysocki
     

20 Dec, 2011

1 commit

  • On a system with lots of memory pressure that is stuck on synchronous inode
    reclaim the workqueue code will run one instance of the inode reclaim work
    item on every CPU, which is not what we want. Make sure to mark the
    xfssyncd workqueue as non-reentrant so that there is only one instance
    of each running globally. Also stop using special parameters for the
    workqueue; now that we guarantee each fs has only one of each work item running
    at a time there is no need to artificially lower max_active and compensate
    for that by setting the WQ_CPU_INTENSIVE flag.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

17 Dec, 2011

3 commits


16 Dec, 2011

5 commits


15 Dec, 2011

2 commits

  • Allow xfs_qm_dqput to work without trylock loops by nesting the freelist lock
    inside the dquot qlock. In turn that requires trylocks in the reclaim path
    instead, but that is a classic tradeoff between the fast and slow paths, and
    it follows the model of the inode and dentry caches.

    Document our new lock order now that it has settled.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Introduce a new XFS_DQ_FREEING flag that tells lookup and mplist walks
    to skip a dquot that is being freed, and use this to avoid the trylock
    on the hash and mplist locks in xfs_qm_dqreclaim_one. Also simplify
    xfs_dqpurge by moving the dquots to a dispose list after marking them
    XFS_DQ_FREEING and avoid the lock ordering constraints.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

14 Dec, 2011

3 commits

  • Do not remove dquots from the freelist when we grab a reference to them in
    xfs_qm_dqlookup, but leave them on the freelist until scanning notices that
    they have a reference. This speeds up the lookup fastpath, and greatly
    simplifies the lock ordering constraints. Note that the same scheme is
    used by the VFS inode and dentry caches.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
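
The lazy removal scheme in the commit above can be sketched as a small model. This is an illustration under assumptions, not the XFS implementation: the `Dquot` class, the `nrefs` field name and both functions are hypothetical, and all locking is left out.

```python
from dataclasses import dataclass


@dataclass
class Dquot:
    ident: int
    nrefs: int = 0  # reference count; 0 means unused and reclaimable


def lookup(cache, ident):
    """Fast path: take a reference and leave the freelist alone."""
    dq = cache[ident]
    dq.nrefs += 1
    return dq


def scan_freelist(freelist):
    """Slow path: drop referenced entries from the freelist, reclaim the rest.

    Returns the reclaimed dquots; the freelist is emptied in the process.
    """
    reclaimed = []
    while freelist:
        dq = freelist.pop(0)
        if dq.nrefs > 0:
            continue  # in use again: lazily taken off the freelist only now
        reclaimed.append(dq)
    return reclaimed
```

The point of the design is that `lookup` never touches the freelist, so the common path needs no freelist lock; the cost of noticing stale entries moves to the scan.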
     
  • Free dquots when purging them during umount instead of keeping them around
    on the freelist in a degraded state. The out of order locking in
    xfs_qm_dqpurge will be removed again later in this series.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Rearrange the code to avoid the conditional locking around the flist_locked
    variable. This means we lose a (rather pointless) assert, and hold the
    freelist lock a bit longer for one corner case.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

13 Dec, 2011

5 commits


09 Dec, 2011

3 commits

  • Outside the now removed nodelaylog code this field is only used for
    asserts and can be safely removed now.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Now that the nodelaylog mode is gone we can simplify the transaction commit
    path a bit by removing the xfs_trans_commit_cil routine. Restoring the
    process flags is merged into xfs_trans_commit which already does it for
    the error path, and allocating the log vectors is merged into
    xlog_cil_format_items, which already fills them with data, thus avoiding
    one loop over all log items.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • The delaylog mode has been the default for a long time, and the nodelaylog
    option has been scheduled for removal in Linux 3.3. Remove it and code
    only used by it now that we have opened the 3.3 window.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

07 Dec, 2011

2 commits

  • Apply the scheme used in log_regrant_write_log_space to wake up any other
    threads waiting for log space before the newly added one to
    log_regrant_write_log_space as well, and factor the code into readable
    helpers. For each of the queues we add two helpers:

    - one to try to wake up all waiting threads. This helper will also be
    usable by xfs_log_move_tail once we remove the current opportunistic
    wakeups in it.
    - one to sleep on t_wait until enough log space is available, loosely
    modelled after Linux waitqueues.

    And use them to reimplement the guts of log_regrant_write_log_space and
    log_regrant_write_log_space. These two functions now use one and the same
    algorithm for waiting on log space instead of the subtly different ones used before,
    with an option to completely unify them in the near future.

    Also move the filesystem shutdown handling to the common caller given
    that we had to touch it anyway.

    Based on hard debugging and an earlier patch from
    Chandra Seetharaman.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Chandra Seetharaman
    Tested-by: Chandra Seetharaman
    Signed-off-by: Ben Myers

    Christoph Hellwig
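
The waiting scheme in the commit above, where earlier waiters get first claim on log space before a newly arriving thread, can be sketched roughly as follows. This is a simplified model and not the kernel helpers: `Ticket`, `wake_waiters` and `request_space` are hypothetical names, strict FIFO ordering is an assumption, and sleeping is represented by simply leaving the ticket on the queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Ticket:
    name: str
    need_bytes: int
    woken: bool = False


def wake_waiters(queue, free_bytes):
    """Wake waiters in FIFO order while the free space suffices.

    Stops at the first waiter that does not fit, so later and smaller
    requests cannot jump ahead. Returns the space left after the wakeups.
    """
    while queue and queue[0].need_bytes <= free_bytes:
        ticket = queue.popleft()
        free_bytes -= ticket.need_bytes
        ticket.woken = True
    return free_bytes


def request_space(queue, free_bytes, ticket):
    """Serve earlier waiters first, then admit or queue the new ticket.

    Returns (granted, free_bytes_after).
    """
    free_bytes = wake_waiters(queue, free_bytes)
    if not queue and ticket.need_bytes <= free_bytes:
        return True, free_bytes - ticket.need_bytes
    queue.append(ticket)  # the real code would sleep here until woken
    return False, free_bytes
```

With 200 bytes free and waiters needing 100 and 300 bytes, a new 50-byte request wakes the first waiter, then queues behind the second rather than overtaking it, even though 50 bytes would fit.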
     
  • The i_ino field in the VFS inode is of type unsigned long and thus can't
    hold the full 64-bit inode number on 32-bit kernels. We have the full
    inode number in the XFS inode, so use that one for nfs exports. Note
    that I've also switched the 32-bit file handle types to it, just to make
    the code more consistent and copy & paste errors less likely to happen.

    Reported-by: Guoquan Yang
    Reported-by: Hank Peng
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Ben Myers

    Christoph Hellwig
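
The truncation described in the commit above can be illustrated with a small model. It is a sketch, not kernel code: a 32-bit unsigned long is modelled by masking to 32 bits, and the function names and the sample inode number are hypothetical.

```python
# Toy model of storing a 64-bit inode number in a 32-bit unsigned long.
# Not kernel code: names and the sample value are illustrative assumptions.

ULONG_BITS_32 = 32  # width of unsigned long on a 32-bit kernel


def store_in_unsigned_long(ino: int, bits: int = ULONG_BITS_32) -> int:
    """Return what survives when ino is stored in an unsigned long of `bits` width."""
    return ino & ((1 << bits) - 1)


def is_truncated(ino: int, bits: int = ULONG_BITS_32) -> bool:
    """True if the stored value no longer equals the original inode number."""
    return store_in_unsigned_long(ino, bits) != ino


full_ino = (1 << 32) + 128                  # needs more than 32 bits
vfs_ino = store_in_unsigned_long(full_ino)  # what a 32-bit i_ino would hold
```

Here `vfs_ino` ends up as 128, so two distinct inodes could present the same number to an NFS client; reading the number from a field that is always 64 bits wide avoids that.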