09 Jan, 2012

1 commit

  • * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
    PM / Hibernate: Implement compat_ioctl for /dev/snapshot
    PM / Freezer: fix return value of freezable_schedule_timeout_killable()
    PM / shmobile: Allow the A4R domain to be turned off at run time
    PM / input / touchscreen: Make st1232 use device PM QoS constraints
    PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
    PM / shmobile: Remove the stay_on flag from SH7372's PM domains
    PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
    PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
    PM: Drop generic_subsys_pm_ops
    PM / Sleep: Remove forward-only callbacks from AMBA bus type
    PM / Sleep: Remove forward-only callbacks from platform bus type
    PM: Run the driver callback directly if the subsystem one is not there
    PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
    PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
    PM / Sleep: Merge internal functions in generic_ops.c
    PM / Sleep: Simplify generic system suspend callbacks
    PM / Hibernate: Remove deprecated hibernation snapshot ioctls
    PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
    ARM: S3C64XX: Implement basic power domain support
    PM / shmobile: Use common always on power domain governor
    ...

    Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused
    XBT_FORCE_SLEEP bit

    Linus Torvalds
     

22 Nov, 2011

1 commit

  • There is no reason to export two functions for entering the
    refrigerator. Calling refrigerator() instead of try_to_freeze()
    doesn't save anything noticeable or remove any race condition.

    * Rename refrigerator() to __refrigerator() and make it return bool
    indicating whether it scheduled out for freezing.

    * Update try_to_freeze() to return bool and relay the return value of
    __refrigerator() if freezing().

    * Convert all refrigerator() users to try_to_freeze().

    * Update documentation accordingly.

    * While at it, add might_sleep() to try_to_freeze().

    Signed-off-by: Tejun Heo
    Cc: Samuel Ortiz
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Steven Whitehouse
    Cc: Andrew Morton
    Cc: Jan Kara
    Cc: KONISHI Ryusuke
    Cc: Christoph Hellwig

    Tejun Heo
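
The calling convention described above can be sketched in plain user-space C. This is a minimal illustration, not the kernel implementation: every `sketch_*` name is hypothetical, and the freezer state is reduced to a single flag. It only shows that try_to_freeze() returns false when no freeze is pending and otherwise relays what the refrigerator returned.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-task freezer state; not kernel code. */
static bool sketch_freeze_requested;

/* Simulate the PM core asking this task to freeze. */
void sketch_request_freeze(void)
{
	sketch_freeze_requested = true;
}

/* Stand-in for freezing(): is a freeze pending for the current task? */
bool sketch_freezing(void)
{
	return sketch_freeze_requested;
}

/* Stand-in for __refrigerator(): true if it actually "scheduled out". */
bool sketch_refrigerator(void)
{
	if (!sketch_freeze_requested)
		return false;
	/* A real task would sleep here until thawed; we just clear the flag. */
	sketch_freeze_requested = false;
	return true;
}

/* Stand-in for try_to_freeze(): relay the refrigerator's result if freezing. */
bool sketch_try_to_freeze(void)
{
	if (!sketch_freezing())
		return false;
	return sketch_refrigerator();
}
```

A daemon loop built on this shape can test the return value once per iteration and restart its work after a thaw, which is presumably why a single entry point is enough.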
     

08 Nov, 2011

1 commit

  • Christoph has split up REQ_PRIO from REQ_META. That means that
    we can drop REQ_PRIO from places where it is not needed. I'm
    not at all sure that the combination WRITE_FLUSH_FUA | REQ_PRIO
    makes any kind of sense, anyway.

    In addition, I've added REQ_META to one place in the code where
    it was missing. REQ_PRIO has been left for read/writes triggered
    by glock acquisition and writeback only. We can adjust it again
    if required, but these are the most important points from a
    performance perspective.

    Signed-off-by: Steven Whitehouse
    Cc: Christoph Hellwig

    Steven Whitehouse
     

23 Aug, 2011

1 commit

  • Add a new REQ_PRIO to let requests preempt others in the cfq I/O scheduler,
    and leave REQ_META purely for marking requests as metadata in blktrace.

    All existing callers of REQ_META except for XFS are updated to also
    set REQ_PRIO for now.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Namhyung Kim
    Signed-off-by: Jens Axboe

    Christoph Hellwig
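
As a rough illustration of the split described above, the two flags can be modelled as independent bits. The bit positions and the `sketch_*` names below are assumptions made for this example only; they are not the kernel's values or helpers.

```c
#include <stdbool.h>

/* Illustrative bit positions only; the kernel's real values differ. */
enum {
	SKETCH_REQ_META = 1u << 0,	/* marks a request as metadata for tracing */
	SKETCH_REQ_PRIO = 1u << 1,	/* asks the I/O scheduler to prioritise it */
};

/* After the split, tracing looks only at the META bit. */
bool sketch_traced_as_metadata(unsigned int flags)
{
	return (flags & SKETCH_REQ_META) != 0;
}

/* After the split, scheduler preemption looks only at the PRIO bit. */
bool sketch_may_preempt(unsigned int flags)
{
	return (flags & SKETCH_REQ_PRIO) != 0;
}
```

A caller that previously relied on REQ_META alone for both effects would pass `SKETCH_REQ_META | SKETCH_REQ_PRIO` in this model to keep its old behaviour.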
     

14 Jul, 2011

1 commit

  • This patch contains a few misc fixes which resolve a recently
    reported issue. This patch has been a real team effort and has
    received a lot of testing.

    The first issue is that the ail lock needs to be held over a few
    more operations. The lock that's added into gfs2_releasepage() may
    possibly be a candidate for replacing with RCU at some future
    point, but at this stage we've gone for the obvious fix.

    The second issue is that gfs2_write_inode() can end up calling
    a glock recursively when called from gfs2_evict_inode() via the
    syncing code, so it needs a guard added.

    The third issue is that we either need to not truncate the metadata
    pages of inodes which have zero link count, but which we cannot
    deallocate due to them still being in use by other nodes, or we need
    to ensure that those pages have all made it through the journal and
    ail lists first. This patch takes the former approach, but the
    latter has also been tested and there is nothing to choose between
    them performance-wise. So again, we could revise that decision
    in the future.

    Also, the inode eviction process is now better documented.

    Signed-off-by: Steven Whitehouse
    Tested-by: Bob Peterson
    Tested-by: Abhijith Das
    Reported-by: Barry J. Marson
    Reported-by: David Teigland

    Steven Whitehouse
     

22 May, 2011

1 commit

  • The ail flush code has always relied upon log flushing to prevent
    it from spinning needlessly. This fixes it to wait on the last
    I/O request submitted (we don't need to wait for all of it)
    instead of either spinning with io_schedule or sleeping.

    As a result cpu usage of gfs2_logd is much reduced with certain
    workloads.

    Reported-by: Abhijith Das
    Tested-by: Abhijith Das
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

03 May, 2011

1 commit

  • In the recent patches to update the AIL list code, I managed to
    forget that the ail list lock got dropped, even though I
    added a comment specifically to remind myself :(

    Reported-by: Barry Marson
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

20 Apr, 2011

3 commits


25 Mar, 2011

1 commit

  • * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
    Documentation/iostats.txt: bit-size reference etc.
    cfq-iosched: removing unnecessary think time checking
    cfq-iosched: Don't clear queue stats when preempt.
    blk-throttle: Reset group slice when limits are changed
    blk-cgroup: Only give unaccounted_time under debug
    cfq-iosched: Don't set active queue in preempt
    block: fix non-atomic access to genhd inflight structures
    block: attempt to merge with existing requests on plug flush
    block: NULL dereference on error path in __blkdev_get()
    cfq-iosched: Don't update group weights when on service tree
    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
    block: Require subsystems to explicitly allocate bio_set integrity mempool
    jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    fs: make fsync_buffers_list() plug
    mm: make generic_writepages() use plugging
    blk-cgroup: Add unaccounted time to timeslice_used.
    block: fixup plugging stubs for !CONFIG_BLOCK
    block: remove obsolete comments for blkdev_issue_zeroout.
    blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
    ...

    Fix up conflicts in fs/{aio.c,super.c}

    Linus Torvalds
     

14 Mar, 2011

1 commit


11 Mar, 2011

1 commit

  • The log lock is currently used to protect the AIL lists and
    the movements of buffers into and out of them. The lists
    are self contained and no log specific items outside the
    lists are accessed when starting or emptying the AIL lists.

    Hence the operation of the AIL does not require the protection
    of the log lock so split them out into a new AIL specific lock
    to reduce the amount of traffic on the log lock. This will
    also reduce the amount of serialisation that occurs when
    the gfs2_logd pushes on the AIL to move it forward.

    This reduces the impact of log pushing on sequential write
    throughput.

    Signed-off-by: Dave Chinner
    Signed-off-by: Steven Whitehouse

    Dave Chinner
     

10 Mar, 2011

1 commit

  • With the plugging now being explicitly controlled by the
    submitter, callers need not pass down unplugging hints
    to the block layer. If they want to unplug, it's because they
    manually plugged on their own - in which case, they should just
    unplug at will.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

19 Oct, 2010

1 commit


17 Sep, 2010

1 commit


10 Sep, 2010

1 commit


08 Aug, 2010

2 commits

  • Remove the current bio flags and reuse the request flags for the bio, too.
    This makes it easier to trace the type of I/O from the filesystem
    down to the block driver. There were two flags in the bio that were
    missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
    renamed two request flags that had a superfluous RW in them.

    Note that the flags are in bio.h despite having the REQ_ name - as
    blkdev.h includes bio.h that is the only way to go for now.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • A barrier request should by definition have priority in get_request
    and let the queue be unplugged immediately as it's blocking all forward
    progress due to the queue draining.

    Most filesystems already get this implicitly by the way submit_bh
    treats the buffer_ordered flag, and gfs2 sets it explicitly. But btrfs
    and XFS are still forgetting to set the flag, as is blkdev_issue_flush
    and some places in DM/MD.

    For XFS on metadata heavy workloads this gives a consistent speedup
    in the 2-3% range.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

21 May, 2010

1 commit

  • The previous patch I wrote for reclaiming unlinked dinodes
    had some shortcomings and did not prevent all hangs.
    This version is much cleaner and more logical, and has
    passed very difficult testing. Sorry for the churn.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

06 May, 2010

1 commit

  • The following patch adds a message to indicate when barriers have been
    disabled due to a block device which doesn't support them. You could
    already tell this via the mount options in /proc/mounts, but all the
    other filesystems also log a message at the same time.

    Also, the same mechanisms are used to indicate when the lock
    demote interface has been used (only ever used for debugging)
    which is a request from our support team.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

05 May, 2010

1 commit

  • This patch contains various tweaks to how log flushes and active item writeback
    work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reserve() now waits
    for gfs2_logd to do the log flushing. Multiple functions were rewritten to
    remove the need to call gfs2_log_lock(). Instead of using one test to see if
    gfs2_logd had work to do, there are now separate tests to check if there
    are too many buffers in the incore log or if there are too many items on the
    active items list.

    This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
    some minor changes. Since gfs2_ail1_start always submits all the active items,
    it no longer needs to keep track of the first ai submitted, so this has been
    removed. In gfs2_log_reserve(), the order of the calls to
    prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
    been switched. If it called wake_up first there was a small window for a race,
    where logd could run and return before gfs2_log_reserve was ready to get woken
    up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
    would be left waiting for gfs2_logd to eventually run because it timed out.
    Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
    out, and flushes the log, can now be set on mount with ar_commit.

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
     

11 Mar, 2010

1 commit

  • GFS2 tracks the number of revokes and unrevokes that are part of committed
    transactions via sd_log_commited_revoke. It is possible for one process to add
    revokes during its transaction, while another process unrevokes them during its
    transaction. If the second process finishes its transaction first,
    sd_log_commited_revoke will be decremented by the number of unrevokes that the
    second process did, without first being incremented by the number of revokes
    the first process did. This is fine, since all started transactions must be
    completed before the journal can be flushed. However, sd_log_commited_revoke
    is an unsigned integer, and log_refund() causes an assertion failure if it
    would go negative at the end of a transaction. This patch makes
    sd_log_commited_revoke a signed integer and allows it to go negative.
    __gfs2_log_flush() still checks that it matches the actual number of revokes.

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
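
The ordering problem described above is easy to reproduce with plain integers. The sketch below is only an illustration under simplified assumptions: the `sketch_*` functions are hypothetical and model one transaction commit as "add the revokes, subtract the unrevokes". It shows why a signed counter tolerates the second process committing first, while an unsigned one would trip a "must not go negative" check.

```c
#include <stdbool.h>

/* Signed counter: apply one transaction's revokes and unrevokes. */
int sketch_commit_signed(int count, int revokes, int unrevokes)
{
	return count + revokes - unrevokes;
}

/*
 * Unsigned counter: report whether applying the same transaction would
 * push the value below zero, which is the case the old assertion caught.
 */
bool sketch_unsigned_would_underflow(unsigned int count,
				     unsigned int revokes,
				     unsigned int unrevokes)
{
	return unrevokes > count + revokes;
}
```

If the unrevoking transaction (0 revokes, 2 unrevokes) commits before the revoking one (2 revokes, 0 unrevokes), the signed counter passes through -2 and ends at 0, whereas the unsigned check fails on the first commit.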
     

03 Dec, 2009

1 commit

  • There are two spare fields in the header common to all GFS2
    metadata. One is just the right size to fit a journal id
    in it, and this patch updates the journal code so that each
    time a metadata block is modified, we tag it with the journal
    id of the node which is performing the modification.

    The reason for this is that it should make it much easier to
    debug issues which arise if we can tell which node was the
    last to modify a particular metadata block.

    Since the field is updated before the block is written into
    the journal, each journal should only contain metadata which
    is tagged with its own journal id. The one exception to this
    is the journal header block, which might have a different node's
    id in it, if that journal was recovered by another node in the
    cluster.

    Thus each journal will contain a record of which nodes recovered
    it, via the journal header.

    The other field in the metadata header could potentially be
    used to hold information about what kind of operation was
    performed, but for the time being we just zero it on each
    transaction so that if we use it for that in future, we'll
    know that the information (where it exists) is reliable.

    I did consider using the other field to hold the journal
    sequence number, however since in GFS2's journaling we write
    the modified data into the journal and not the original
    data, this gives no information as to what action caused the
    modification, so I think we can probably come up with a better
    use for those 64 bits in the future.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
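
The tagging step described above might look roughly like the following. This is a sketch under stated assumptions, not the on-disk format: the struct name, field names and field order are hypothetical, host-endian integers are used for simplicity, and only the two facts from the message are modelled (one spare field holds the journal id, the other 64-bit spare field is zeroed).

```c
#include <stdint.h>

/* Hypothetical header layout; names and ordering are assumptions. */
struct sketch_meta_header {
	uint32_t magic;
	uint32_t type;
	uint64_t spare64;	/* unused for now, zeroed on each transaction */
	uint32_t format;
	uint32_t jid;		/* journal id of the node that last modified the block */
};

/* Tag a metadata block just before it is written into the journal. */
void sketch_tag_block(struct sketch_meta_header *mh, uint32_t journal_id)
{
	mh->spare64 = 0;	/* keep the field reliable for a future use */
	mh->jid = journal_id;
}
```

When debugging, reading `jid` from a block then answers "which node touched this last", provided every writer goes through the same tagging step.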
     

12 Jun, 2009

2 commits


11 May, 2009

1 commit

  • After Jens' recent updates:
    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e
    et al. this is a patch to bring gfs2 up to date with the core
    code. Also I've managed to squash another call to ll_rw_block()
    along the way.

    There is still one part of the GFS2 I/O paths which is not correctly
    annotated and that is due to the sharing of the writeback code between
    the data and metadata address spaces. I would like to change that too,
    but this patch is still worth doing on its own, I think.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Mar, 2009

1 commit

  • This is the big patch that I've been working on for some time
    now. There are many reasons for wanting to make this change
    such as:
    o Reducing overhead by eliminating duplicated fields between structures
    o Simplification of the code (reduces the code size by a fair bit)
    o The locking interface is now the DLM interface itself as proposed
    some time ago.
    o Fewer lookups of glocks when processing replies from the DLM
    o Fewer memory allocations/deallocations for each glock
    o Scope to do further optimisations in the future (but this patch is
    more than big enough for now!)

    Please note that (a) this patch relates to the lock_dlm module and
    not the DLM itself, that is still a separate module; and (b) that
    we retain the ability to build GFS2 as a standalone single node
    filesystem without requiring the DLM.

    This patch needs a lot of testing, hence my keeping it in my -git tree
    since I restarted it after the last merge window. That way, this has the
    maximum exposure before it's merged. This is (modulo a few minor bug fixes)
    the same patch that I've been posting on and off for the last three months
    and it's passed a number of different tests so far.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

26 Sep, 2008

1 commit

  • This patch adds barrier support to GFS2. There is not a lot of change
    really... we just add the barrier flag when we write journal header
    blocks. If the underlying device refuses to support them, we fall back
    to the previous way of doing things (wait for the I/O and hope) since
    there is nothing else we can do. There is no user configuration,
    barriers will always be on unless the device refuses to support them.
    This seems a reasonable solution to me since this is a correctness
    issue.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

27 Jun, 2008

1 commit


18 Apr, 2008

1 commit


31 Mar, 2008

1 commit

  • This patch is performance related. When we're doing a log flush,
    I noticed we were calling buf_lo_incore_commit twice: once for
    data bufs and once for metadata bufs. Since this is the same
    function and does the same thing in both cases, there should be
    no reason to call it twice. Since we only need to call it once,
    we can also make it faster by removing it from the generic "lops"
    code and making it a stand-alone static function.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

25 Jan, 2008

8 commits

  • Although the values were all being calculated correctly, there was a
    race in the assert due to the way it was using atomic variables. This
    changes the value we assert on so that we get the same effect by testing
    a different variable. This prevents the assert triggering when it shouldn't.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • A missing offset in the calculation.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch saves a little time when gfs2 writes to the journals by
    keeping a mapping between logical and physical blocks on disk.
    That's better than constantly looking up indirect pointers in
    buffers, when the journals have several levels of indirection
    (which they typically are).

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
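
One way to picture the mapping described above is a small table of extents, each covering a run of consecutive journal blocks. The sketch below is an assumption about the general technique, not the patch's data structure: `struct sketch_extent`, its fields and `sketch_map_block()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One contiguous run: logical journal blocks mapped to disk blocks. */
struct sketch_extent {
	uint64_t lblock;	/* first logical block of the run */
	uint64_t dblock;	/* disk block that lblock maps to */
	uint64_t blocks;	/* number of blocks in the run */
};

/*
 * Translate a logical journal block using the cached extents.
 * Returns the disk block, or 0 if the block is not covered
 * (0 is used as "unmapped" in this sketch only).
 */
uint64_t sketch_map_block(const struct sketch_extent *ext, size_t n,
			  uint64_t lblock)
{
	for (size_t i = 0; i < n; i++) {
		if (lblock >= ext[i].lblock &&
		    lblock - ext[i].lblock < ext[i].blocks)
			return ext[i].dblock + (lblock - ext[i].lblock);
	}
	return 0;
}
```

Because a journal is usually laid out in a few long runs, a lookup like this touches a handful of table entries instead of walking several levels of indirect blocks on every write.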
     
  • This patch is just a cleanup. Function gfs2_get_block() just calls
    function gfs2_block_map reversing the last two parameters. By
    reversing the parameters, gfs2_block_map() may be called directly
    and function gfs2_get_block may be eliminated altogether.
    Since this function is done for every block operation,
    this streamlines the code and makes it a little bit more efficient.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • The issue is indeed UP vs SMP and it is totally random.

    spin_is_locked() is a bad assertion because there is no correct answer on UP.
    On UP, spin_is_locked() has to return either one value or another, always.

    This means that in my setup I am lucky enough to trigger the issue and you
    are lucky enough not to.

    The attached patch removes the bogus calls to BUG_ON and, according to David
    (in CC, and thanks for the long explanation of the problem), we can rely upon
    things like lockdep to find the problems that they might be trying to catch.

    Signed-off-by: Fabio M. Di Nitto
    Cc: David S. Miller
    Signed-off-by: Steven Whitehouse

    Fabio Massimo Di Nitto
     
  • We only care about the content of the jindex in two cases,
    one is when we mount the fs and the other is when we need
    to recover another journal. In both cases we have to update
    the jindex anyway, so there is no point in updating it
    periodically between times, so this removes it to simplify
    gfs2_logd.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This means that we can mark gfs2_ail1_empty static and prepares
    the way for further changes.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch changes the counter which keeps track of the free
    blocks in the journal to an atomic_t in preparation for the
    following patch which will update the log reservation code.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
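
To illustrate why an atomic counter helps a reservation scheme, here is a user-space sketch using C11 atomics rather than the kernel's atomic_t. Everything in it is an assumption for illustration: the `sketch_*` names, the starting value of 16 blocks, and the compare-and-swap reservation loop, which is one possible design and not necessarily what the follow-up patch does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Free journal blocks; 16 is an arbitrary starting value for the sketch. */
static atomic_int sketch_free_blocks = 16;

/* Try to reserve blks blocks without taking a lock. */
bool sketch_try_reserve(int blks)
{
	int free_now = atomic_load(&sketch_free_blocks);

	while (free_now >= blks) {
		/* On failure, free_now is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&sketch_free_blocks,
						 &free_now, free_now - blks))
			return true;
	}
	return false;	/* not enough space: the caller would have to wait */
}

/* Give blocks back once they are no longer needed. */
void sketch_release(int blks)
{
	atomic_fetch_add(&sketch_free_blocks, blks);
}
```

The point of the loop is that the check and the subtraction happen as one step, so two concurrent reservers cannot both claim the same free blocks.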