19 Mar, 2014

3 commits

  • Uninlined nested functions can cause crashes when using ftrace, as they don't
    follow the normal calling convention and confuse the ftrace function graph
    tracer as it examines the stack.

    Also, nested functions are supported as a gcc extension, but may fail on other
    compilers (e.g. llvm).

    Signed-off-by: John Sheu

    John Sheu
     
  • This changes the bucket allocation reserves to use _real_ reserves - separate
    freelists - instead of watermarks, which if nothing else makes the current code
    saner to reason about and is going to be important in the future when we add
    support for multiple btrees.

    It also adds btree_check_reserve(), which checks (and locks) the reserves for
    both bucket allocation and memory allocation for btree nodes. The old code just
    kinda sorta assumed that since (e.g. for btree node splits) it had the root
    locked, no other threads could try to make use of the same reserve. This
    technically should have been ok for memory allocation (we should always have a
    reserve for memory allocation, since the btree node cache is used as a reserve
    and we preallocate it), but multiple btrees will mean that locking the root
    won't be sufficient anymore. For the bucket allocation reserve, it was
    technically possible for the old code to deadlock.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Break down data into clean data/dirty data/metadata.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
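
The first commit in this group is about getting rid of nested functions. One common way to do that, sketched here with hypothetical names (`demo_ctx`, `demo_for_each`, `demo_count` are illustrations, not bcache code), is to move the helper to file scope and pass the state it would have captured through an explicit context pointer. The helper then follows the normal calling convention and no longer depends on the gcc extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context: holds what a nested function would have captured. */
struct demo_ctx {
    int    threshold;
    size_t matches;
};

/* File-scope callback: ordinary calling convention, no gcc extension. */
static void demo_count_above(int value, void *data)
{
    struct demo_ctx *ctx = data;

    if (value > ctx->threshold)
        ctx->matches++;
}

/* Generic iterator that calls fn(value, data) for every element. */
static void demo_for_each(const int *vals, size_t n,
                          void (*fn)(int, void *), void *data)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        fn(vals[i], data);
}

/* Counts values above threshold without defining a function inside a function. */
static size_t demo_count(const int *vals, size_t n, int threshold)
{
    struct demo_ctx ctx = { .threshold = threshold, .matches = 0 };

    demo_for_each(vals, n, demo_count_above, &ctx);
    return ctx.matches;
}
```

With a nested function, `demo_count_above` would have been defined inside `demo_count` and read `threshold` directly from the enclosing frame; the context struct makes that dependency explicit.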
     

30 Jan, 2014

1 commit


09 Jan, 2014

5 commits


17 Dec, 2013

1 commit

  • The old writeback PD controller could get into states where it had throttled all
    the way down and then took way too long to recover - it was too complicated to
    really understand what it was doing.

    This rewrites a good chunk of it to hopefully be simpler and make more sense,
    and it also pays more attention to units which should make the behaviour a bit
    easier to understand.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
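
For readers unfamiliar with the term, a PD (proportional-derivative) controller can be sketched as follows. This is a generic illustration and not the bcache implementation: every name, gain and clamp value below is an assumption chosen for the example. The comments track units explicitly, in the spirit of the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* All names and constants are hypothetical; units are noted per field. */
struct demo_pd {
    int64_t target;      /* desired dirty data, in sectors */
    int64_t prev_error;  /* error at the previous update, in sectors */
    int64_t p_seconds;   /* P term: time to work off the error, in seconds */
    int64_t d_gain;      /* D term: dimensionless multiplier */
    int64_t rate_min;    /* lower clamp, in sectors per second */
    int64_t rate_max;    /* upper clamp, in sectors per second */
};

static void demo_pd_init(struct demo_pd *pd, int64_t target, int64_t dirty)
{
    pd->target = target;
    pd->prev_error = dirty - target; /* no derivative spike on first update */
    pd->p_seconds = 10;
    pd->d_gain = 1;
    pd->rate_min = 1;
    pd->rate_max = 10000;
}

/* Returns the new writeback rate in sectors per second. */
static int64_t demo_pd_update(struct demo_pd *pd, int64_t dirty,
                              int64_t dt_seconds)
{
    int64_t error, proportional, derivative, rate;

    assert(dt_seconds > 0);

    error = dirty - pd->target;                         /* sectors */
    proportional = error / pd->p_seconds;               /* sectors/second */
    derivative = (error - pd->prev_error) / dt_seconds; /* sectors/second */
    pd->prev_error = error;

    rate = proportional + pd->d_gain * derivative;

    if (rate < pd->rate_min)
        rate = pd->rate_min;
    if (rate > pd->rate_max)
        rate = pd->rate_max;
    return rate;
}
```

The nonzero lower clamp is one way to keep such a controller from stalling completely; the commit message does not say whether the actual rewrite does this.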
     

11 Nov, 2013

8 commits

  • More testing ftw! Also, now verify mode doesn't break if you read dirty
    data.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Whoops.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • It never really made sense to expose this, so just kill it.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Big garbage collection rewrite; now, garbage collection uses the same
    mechanisms as used elsewhere for inserting/updating btree node pointers,
    instead of rewriting interior btree nodes in place.

    This makes the code significantly cleaner and less fragile, and means we
    can now make garbage collection incremental - it doesn't have to hold a
    write lock on the root of the btree for the entire duration of garbage
    collection.

    This means that there's less of a latency hit for doing garbage
    collection, which means we can gc more frequently (and do a better job
    of reclaiming from the cache), and we can coalesce across more btree
    nodes (improving our space efficiency).

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Couple changes:
    * Consolidate bch_check_keys() and bch_check_key_order(), and move the
    checks that only check_key_order() could do to bch_btree_iter_next().

    * Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in
    when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to
    flip on the EDEBUG checks at runtime.

    * Dropped an old not terribly useful check in rw_unlock(), and
    refactored/improved some of the other debug code.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • We needed a dedicated rescuer workqueue for gc anyways... and gc was
    conceptually a dedicated thread, just one that wasn't running all the
    time. Switch it to a dedicated thread to make the code a bit more
    straightforward.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Originally I got this right... except that the divides didn't use
    do_div(), which broke 32 bit kernels. When I went to fix that, I forgot
    that the raid stripe size usually isn't a power of two... doh

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Works kind of like the ext4 setting, to panic or remount read only on
    errors.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
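
The do_div() commit in this group touches two pitfalls at once. A plain 64-bit division generally does not link on 32-bit kernels, which is why the kernel provides do_div(); and a shift or mask can replace the division only when the divisor is a power of two. The sketch below is userspace C with a hypothetical stand-in (`demo_do_div`) for the kernel macro, and the helper names and the 384-sector stripe size are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's do_div(): divides *n in place by a
 * 32-bit base and returns the remainder.
 */
static uint32_t demo_do_div(uint64_t *n, uint32_t base)
{
    uint32_t rem;

    assert(base != 0);
    rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* Index of the stripe containing sector; valid for any stripe size. */
static uint64_t demo_stripe_of(uint64_t sector, uint32_t stripe_size)
{
    demo_do_div(&sector, stripe_size);
    return sector;
}

/* Offset of sector within its stripe; valid for any stripe size. */
static uint32_t demo_offset_in_stripe(uint64_t sector, uint32_t stripe_size)
{
    return demo_do_div(&sector, stripe_size);
}

/* Mask shortcut: only correct when stripe_size is a power of two. */
static uint32_t demo_offset_mask(uint64_t sector, uint32_t stripe_size)
{
    return (uint32_t)(sector & (stripe_size - 1));
}
```

With a stripe size of 384 sectors, sector 1000 lies in stripe 2 at offset 232, while the mask shortcut yields a different, wrong offset. With a stripe size of 256 both approaches agree.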
     

25 Sep, 2013

1 commit


11 Sep, 2013

1 commit

  • Convert the driver shrinkers to the new API. Most changes are compile
    tested only because I either don't have the hardware or it's staging
    stuff.

    FWIW, the md and android code is pretty good, but the rest of it makes me
    want to claw my eyes out. The amount of broken code I just encountered is
    mind boggling. I've added comments explaining what is broken, but I fear
    that some of the code would be best dealt with by being dragged behind the
    bike shed, buried in mud up to its neck and then run over repeatedly
    with a blunt lawn mower.

    Special mention goes to the zcache/zcache2 drivers. They can't co-exist
    in the build at the same time, they are under different menu options in
    menuconfig, they only show up when you've got the right set of mm
    subsystem options configured and so even compile testing is an exercise in
    pulling teeth. And that doesn't even take into account the horrible,
    broken code...

    [glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
    Signed-off-by: Dave Chinner
    Signed-off-by: Glauber Costa
    Acked-by: Mel Gorman
    Cc: Daniel Vetter
    Cc: Kent Overstreet
    Cc: John Stultz
    Cc: David Rientjes
    Cc: Jerome Glisse
    Cc: Thomas Hellstrom
    Cc: "Theodore Ts'o"
    Cc: Adrian Hunter
    Cc: Al Viro
    Cc: Artem Bityutskiy
    Cc: Arve Hjønnevåg
    Cc: Carlos Maiolino
    Cc: Christoph Hellwig
    Cc: Chuck Lever
    Cc: Daniel Vetter
    Cc: David Rientjes
    Cc: Gleb Natapov
    Cc: Greg Thelen
    Cc: J. Bruce Fields
    Cc: Jan Kara
    Cc: Jerome Glisse
    Cc: John Stultz
    Cc: KAMEZAWA Hiroyuki
    Cc: Kent Overstreet
    Cc: Kirill A. Shutemov
    Cc: Marcelo Tosatti
    Cc: Mel Gorman
    Cc: Steven Whitehouse
    Cc: Thomas Hellstrom
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton

    Signed-off-by: Al Viro

    Dave Chinner
     

12 Jul, 2013

1 commit


27 Jun, 2013

5 commits

  • Signed-off-by: Gabriel de Perthuis
    Signed-off-by: Kent Overstreet

    Gabriel de Perthuis
     
  • Now that we're tracking dirty data per stripe, we can add two
    optimizations for raid5/6:

    * If a stripe is already dirty, force writes to that stripe to
    writeback mode - to help build up full stripes of dirty data

    * When flushing dirty data, preferentially write out full stripes first
    if there are any.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • To make background writeback aware of raid5/6 stripes, we first need to
    track the amount of dirty data within each stripe - we do this by
    breaking up the existing sectors_dirty into per stripe atomic_ts.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • Reworked the tracepoints to be more sensible, and fixed a null pointer
    deref in one of them.

    Converted some of the pr_debug()s to tracepoints - this is partly a
    performance optimization; it used to be that without DEBUG or
    CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro, but at some point it
    was changed to an empty inline function.

    Some of the pr_debug() statements had rather expensive function calls as
    part of the arguments, so this code was getting run unnecessarily even
    on non debug kernels - in some fast paths, too.

    Signed-off-by: Kent Overstreet

    Kent Overstreet
     
  • An old version of gcc was complaining about using a const int as the
    size of a stack allocated array. Which should be fine - but using
    ARRAY_SIZE() is better, anyways.

    Also, refactor the code to use scnprintf().

    Signed-off-by: Kent Overstreet

    Kent Overstreet
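
The last commit in this group mentions ARRAY_SIZE() and scnprintf(). A minimal userspace sketch of both ideas follows. `demo_scnprintf` is a hypothetical stand-in for the kernel function (which is not available outside the kernel), and the table contents and `demo_join_names` are placeholders that do not come from the commit.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Element count of a true array (not a pointer), known at compile time. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative table; the strings are placeholders. */
static const char * const demo_names[] = {
    "writethrough", "writeback", "writearound", "none",
};

/*
 * Stand-in for scnprintf(): like snprintf(), but returns the number of
 * characters actually stored (excluding the NUL), so the result can be
 * added to an offset without running past the end of the buffer.
 */
static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;
    return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Joins the table into buf, space separated; returns the length stored. */
static int demo_join_names(char *buf, size_t size)
{
    int len = 0;

    assert(size > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < ARRAY_SIZE(demo_names); i++)
        len += demo_scnprintf(buf + len, size - (size_t)len, "%s%s",
                              i ? " " : "", demo_names[i]);
    return len;
}
```

Because the helper never reports more than it stored, `size - len` stays at least 1 and a too-small buffer is truncated rather than overrun. An array declared with `ARRAY_SIZE(demo_names)` elements is an ordinary fixed-size array, which sidesteps the const int complaint described in the commit.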
     
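The tracepoint commit in this group observes that arguments to an empty inline function are still evaluated, whereas an empty macro discards them before compilation. The following sketch shows the difference in plain C; `demo_debug_macro`, `demo_debug_inline` and `demo_expensive` are hypothetical names and not the kernel's definitions.

```c
#include <assert.h>

static int demo_calls;

/* Stands in for an expensive function used as a debug-print argument. */
static int demo_expensive(void)
{
    demo_calls++;
    return 42;
}

/* Empty macro: the arguments disappear during preprocessing. */
#define demo_debug_macro(fmt, ...) do { } while (0)

/* Empty inline function: the caller still evaluates every argument. */
static inline void demo_debug_inline(const char *fmt, ...)
{
    (void)fmt;
}

/* Number of demo_expensive() calls made through the macro variant. */
static int demo_calls_with_macro(void)
{
    demo_calls = 0;
    demo_debug_macro("value %d\n", demo_expensive());
    return demo_calls;
}

/* Number of demo_expensive() calls made through the inline variant. */
static int demo_calls_with_inline(void)
{
    demo_calls = 0;
    demo_debug_inline("value %d\n", demo_expensive());
    return demo_calls;
}

/* Sanity check of the claim above; aborts if either count is unexpected. */
static void demo_check(void)
{
    assert(demo_calls_with_macro() == 0);
    assert(demo_calls_with_inline() == 1);
}
```

A compiler may drop an argument expression only when it can prove the expression has no side effects, which is rarely possible for a call into another translation unit.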

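The two raid5/6 commits in this group describe per-stripe dirty counters and a preference for flushing full stripes. A rough userspace sketch using C11 atomics follows. The struct layout, the fixed stripe count and all `demo_` names are assumptions for illustration, and the sketch does not deduplicate overlapping writes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_STRIPES 4

struct demo_dev {
    uint32_t   stripe_size;                    /* sectors per stripe */
    atomic_int stripe_dirty[DEMO_NR_STRIPES];  /* dirty sectors per stripe */
};

static void demo_dev_init(struct demo_dev *d, uint32_t stripe_size)
{
    assert(stripe_size > 0);
    d->stripe_size = stripe_size;
    for (int i = 0; i < DEMO_NR_STRIPES; i++)
        atomic_init(&d->stripe_dirty[i], 0);
}

/* Marks nr_sectors starting at offset dirty, splitting at stripe boundaries. */
static void demo_dirty_add(struct demo_dev *d, uint64_t offset, int nr_sectors)
{
    uint64_t stripe = offset / d->stripe_size;
    uint32_t in_stripe = (uint32_t)(offset % d->stripe_size);

    while (nr_sectors > 0) {
        int room = (int)(d->stripe_size - in_stripe);
        int s = nr_sectors < room ? nr_sectors : room;

        assert(stripe < DEMO_NR_STRIPES);
        atomic_fetch_add(&d->stripe_dirty[stripe], s);

        nr_sectors -= s;
        in_stripe = 0;
        stripe++;
    }
}

/* A write to an already dirty stripe could be forced to writeback mode. */
static bool demo_stripe_dirty(struct demo_dev *d, int stripe)
{
    return atomic_load(&d->stripe_dirty[stripe]) > 0;
}

static bool demo_stripe_full(struct demo_dev *d, int stripe)
{
    return atomic_load(&d->stripe_dirty[stripe]) == (int)d->stripe_size;
}

/* Picks a full stripe first, then any dirty stripe; -1 when all are clean. */
static int demo_next_flush(struct demo_dev *d)
{
    for (int i = 0; i < DEMO_NR_STRIPES; i++)
        if (demo_stripe_full(d, i))
            return i;
    for (int i = 0; i < DEMO_NR_STRIPES; i++)
        if (demo_stripe_dirty(d, i))
            return i;
    return -1;
}
```

With 8-sector stripes, dirtying 12 sectors from offset 6 puts 2 sectors in stripe 0, fills stripe 1 and puts 2 sectors in stripe 2, so stripe 1 is chosen for flushing first.
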
29 Mar, 2013

1 commit


24 Mar, 2013

1 commit

  • Does writethrough and writeback caching, handles unclean shutdown, and
    has a bunch of other nifty features motivated by real world usage.

    See the wiki at http://bcache.evilpiepirate.org for more.

    Signed-off-by: Kent Overstreet

    Kent Overstreet