04 Jan, 2012

1 commit


23 Jul, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
    vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
    isofs: Remove global fs lock
    jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
    fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
    mm/truncate.c: fix build for CONFIG_BLOCK not enabled
    fs:update the NOTE of the file_operations structure
    Remove dead code in dget_parent()
    AFS: Fix silly characters in a comment
    switch d_add_ci() to d_splice_alias() in "found negative" case as well
    simplify gfs2_lookup()
    jfs_lookup(): don't bother with . or ..
    get rid of useless dget_parent() in btrfs rename() and link()
    get rid of useless dget_parent() in fs/btrfs/ioctl.c
    fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
    drivers: fix up various ->llseek() implementations
    fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
    Ext4: handle SEEK_HOLE/SEEK_DATA generically
    Btrfs: implement our own ->llseek
    fs: add SEEK_HOLE and SEEK_DATA flags
    reiserfs: make reiserfs default to barrier=flush
    ...

    Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
    shrinker callout for the inode cache, that clashed with the xfs code to
    start the periodic workers later.

    Linus Torvalds
     

21 Jul, 2011

1 commit

  • Btrfs needs to be able to control how filemap_write_and_wait_range() is called
    in fsync to make it less of a painful operation, so push down taking i_mutex and
    the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
    file systems can drop taking the i_mutex altogether it seems, like ext3 and
    ocfs2. For correctness' sake I just pushed everything down in all cases to make
    sure that we keep the current behavior the same for everybody, and then each
    individual fs maintainer can make up their mind about what to do from there.
    Thanks,

    Acked-by: Jan Kara
    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     

04 Jul, 2011

2 commits

  • Introduce the following I/O helper functions: 'ubifs_leb_read()',
    'ubifs_leb_write()', 'ubifs_leb_change()', 'ubifs_leb_unmap()',
    'ubifs_leb_map()', 'ubifs_is_mapped()'.

    The idea is to wrap all UBI I/O functions in order to encapsulate various
    assertions and error path handling (error message, stack dump, switching to R/O
    mode). And there are some other benefits of this which will be used in the
    following patches.

    This patch does not switch whole UBIFS to use these functions yet.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     
  • The UBIFS LPT tree is in many aspects similar to the TNC tree, and we have
    similar flags for these trees. By mistake, we use the COW_ZNODE flag for
    LPT in some places, instead of the right flag, COW_CNODE. This works
    only because these two constants have the same value.

    This patch makes all the LPT code use COW_CNODE and also changes the COW_CNODE
    constant value to make sure we do not misuse the flags any more.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
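
A minimal C sketch of why equal values hid the mix-up and distinct values expose it. The bit numbers below are assumptions for illustration, not copied from the UBIFS headers, and `test_flag()` and `lpt_mark_cow()` are hypothetical helpers.

```c
#include <assert.h>

/* Assumed bit numbers, for illustration only. */
enum { DIRTY_ZNODE = 0, COW_ZNODE = 1, OBSOLETE_ZNODE = 2 }; /* TNC tree */
enum { DIRTY_CNODE = 0, OBSOLETE_CNODE = 1, COW_CNODE = 2 }; /* LPT tree */

/* Hypothetical helper: return 1 if bit number 'bit' is set in 'flags'. */
static int test_flag(unsigned long flags, int bit)
{
    return (int)((flags >> bit) & 1UL);
}

/* Mark an LPT node as copy-on-write using the LPT flag. */
static unsigned long lpt_mark_cow(unsigned long flags)
{
    return flags | (1UL << COW_CNODE);
}
```

With distinct values, code that wrongly tests COW_ZNODE on an LPT node no longer sees the bit set, so the mistake becomes visible instead of working by accident.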
     

01 Jun, 2011

2 commits

  • Instead of passing "grouped" parameter to 'ubifs_recover_leb()' which tells
    whether the nodes are grouped in the LEB to recover, pass the journal head
    number and let 'ubifs_recover_leb()' look at the journal head's 'grouped' flag.

    This patch is a preparation for a further fix where we'll need to know the
    journal head number for other purposes.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     
  • Journal heads differ in how UBIFS writes nodes to them. All normal
    journal heads receive grouped nodes, while the GC journal head receives
    ungrouped nodes. This patch adds a 'grouped' flag to 'struct ubifs_jhead' which
    describes this property.

    This patch is a preparation for a further recovery fix.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
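
A rough C sketch of a per-head 'grouped' flag that a recovery routine can look up from the journal head number. The head numbers, `struct jhead_sketch`, and both functions are assumptions made for illustration, not the UBIFS definitions.

```c
#include <assert.h>

/* Assumed journal head numbers, for illustration only. */
enum { GCHD = 0, BASEHD = 1, DATAHD = 2, JHEAD_CNT = 3 };

struct jhead_sketch {
    unsigned int grouped : 1; /* 1 if nodes written to this head are grouped */
};

/* Every head receives grouped nodes except the GC head. */
static void init_jheads(struct jhead_sketch *jheads)
{
    for (int i = 0; i < JHEAD_CNT; i++)
        jheads[i].grouped = (i != GCHD);
}

/*
 * Hypothetical stand-in for the recovery entry point: the caller passes
 * the head number and the flag is looked up here.
 */
static int recover_expects_groups(const struct jhead_sketch *jheads, int jhead)
{
    assert(jhead >= 0 && jhead < JHEAD_CNT);
    return jheads[jhead].grouped;
}
```

Passing the head number rather than a boolean keeps the door open for other per-head decisions later, which is the motivation the commit message gives.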
     

30 May, 2011

1 commit


16 May, 2011

3 commits

  • This patch adds the 'ubifs_fixup_free_space()' function which scans all
    LEBs in the filesystem for those that are in-use but have one or more
    empty pages, then re-maps the LEBs in order to erase the empty portions.
    Afterward it removes the "space_fixup" flag from the UBIFS superblock.

    Artem: massaged the patch

    Signed-off-by: Matthew L. Creech
    Signed-off-by: Artem Bityutskiy

    Matthew L. Creech
     
  • The 'space_fixup' flag can be set in the superblock of a new filesystem by
    mkfs.ubifs to indicate that any eraseblocks with free space remaining should be
    fixed-up the first time it's mounted (after which the flag is un-set). This
    means that the UBIFS image has been flashed by a "dumb" flasher and the free
    space has been actually programmed (writing all 0xFFs), so this free space
    cannot be used. UBIFS fixes the free space up by re-writing the contents of all
    LEBs with free space using the atomic LEB change UBI operation.

    Artem: improved the commit message, added some more comments to the code.

    Signed-off-by: Matthew L. Creech
    Signed-off-by: Artem Bityutskiy

    Matthew L. Creech
     
  • This patch simplifies replay even further - it removes the replay tree and
    adds the replay list instead. Indeed, we just do not need to use a tree here -
    all we need to do is to add all nodes to the list and then sort it. Using an
    RB-tree is overkill - more code and slower. And since we replay buds in
    order, we expect the nodes to follow in _mostly_ sorted order, so the merge
    sort becomes much cheaper on average than an RB-tree.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
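
The "collect into a list, then sort" idea can be sketched in userspace C as below. This is a plain top-down merge sort on a singly linked list, not the kernel's list sorting helper; `struct replay_entry` and its fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replay entry: a sequence number and a list link. */
struct replay_entry {
    unsigned long long sqnum;
    struct replay_entry *next;
};

/* Merge two lists that are already sorted by ascending sqnum (stable). */
static struct replay_entry *merge(struct replay_entry *a, struct replay_entry *b)
{
    struct replay_entry head = { 0, NULL };
    struct replay_entry *tail = &head;

    while (a && b) {
        if (a->sqnum <= b->sqnum) {
            tail->next = a;
            a = a->next;
        } else {
            tail->next = b;
            b = b->next;
        }
        tail = tail->next;
    }
    tail->next = a ? a : b;
    return head.next;
}

/* Sort the list by sqnum and return the new head. */
static struct replay_entry *replay_sort(struct replay_entry *list)
{
    struct replay_entry *slow, *fast, *second;

    if (!list || !list->next)
        return list;

    /* Find the middle with slow/fast pointers and split the list there. */
    slow = list;
    fast = list->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;
    slow->next = NULL;

    return merge(replay_sort(list), replay_sort(second));
}
```

Entries are appended in scan order and sorted once at the end, so no per-insert tree balancing is needed.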
     

14 May, 2011

2 commits

  • This patch separates out all the budgeting-related information
    from 'struct ubifs_info' to 'struct ubifs_budg_info'. This way the
    code looks a bit cleaner. However, the main driver for this is
    that we want to save budgeting information and print it later,
    so a separate data structure for this is helpful.

    This patch is a preparation for the further debugging output
    improvements.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     
  • Fix several minor stylistic issues:
    * lines longer than 80 characters
    * space before closing parenthesis ')'
    * spaces in the indentations

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     

11 Mar, 2011

1 commit

  • Running kernel 2.6.37, my PPC-based device occasionally gets an
    order-2 allocation failure in UBIFS, which causes the root FS to
    become unwritable:

    kswapd0: page allocation failure. order:2, mode:0x4050
    Call Trace:
    [c787dc30] [c00085b8] show_stack+0x7c/0x194 (unreliable)
    [c787dc70] [c0061aec] __alloc_pages_nodemask+0x4f0/0x57c
    [c787dd00] [c0061b98] __get_free_pages+0x20/0x50
    [c787dd10] [c00e4f88] ubifs_jnl_write_data+0x54/0x200
    [c787dd50] [c00e82d4] do_writepage+0x94/0x198
    [c787dd90] [c00675e4] shrink_page_list+0x40c/0x77c
    [c787de40] [c0067de0] shrink_inactive_list+0x1e0/0x370
    [c787de90] [c0068224] shrink_zone+0x2b4/0x2b8
    [c787df00] [c0068854] kswapd+0x408/0x5d4
    [c787dfb0] [c0037bcc] kthread+0x80/0x84
    [c787dff0] [c000ef44] kernel_thread+0x4c/0x68

    Similar problems were encountered last April by Tomasz Stanislawski:

    http://patchwork.ozlabs.org/patch/50965/

    This patch implements Artem's suggested fix: fall back to a
    mutex-protected static buffer, allocated at mount time. I tested it
    by forcing execution down the failure path, and didn't see any ill
    effects.

    Artem: massaged the patch a little, improved it so that we'd not
    allocate the write reserve buffer when we are in R/O mode.

    Signed-off-by: Matthew L. Creech
    Signed-off-by: Artem Bityutskiy

    Matthew L. Creech
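
A userspace C sketch of the fallback described above: try a normal allocation first and, if it fails, use a buffer reserved at mount time and protected by a mutex. All names, the buffer size, and the `force_fail` test hook are assumptions for illustration; the kernel code uses its own allocator and locking primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define RESERVE_SIZE 4096 /* hypothetical size of the reserve buffer */

struct fs_ctx {
    pthread_mutex_t reserve_mutex; /* serializes users of reserve_buf */
    void *reserve_buf;             /* allocated once, at mount time */
};

/* Mount: allocate the reserve buffer unless mounting read-only. */
static int fs_mount(struct fs_ctx *c, int read_only)
{
    pthread_mutex_init(&c->reserve_mutex, NULL);
    c->reserve_buf = read_only ? NULL : malloc(RESERVE_SIZE);
    return (read_only || c->reserve_buf) ? 0 : -1;
}

/*
 * Get a write buffer of 'len' bytes. 'force_fail' simulates an allocation
 * failure. On fallback, *reserved is set to 1 and the mutex stays held
 * until put_write_buf() is called.
 */
static void *get_write_buf(struct fs_ctx *c, size_t len, int force_fail,
                           int *reserved)
{
    void *buf = force_fail ? NULL : malloc(len);

    *reserved = 0;
    if (buf)
        return buf;
    if (!c->reserve_buf || len > RESERVE_SIZE)
        return NULL;
    pthread_mutex_lock(&c->reserve_mutex);
    *reserved = 1;
    return c->reserve_buf;
}

/* Release a buffer obtained from get_write_buf(). */
static void put_write_buf(struct fs_ctx *c, void *buf, int reserved)
{
    if (reserved)
        pthread_mutex_unlock(&c->reserve_mutex);
    else
        free(buf);
}
```

The mutex means only one writer at a time can use the reserve buffer, which trades throughput for forward progress under memory pressure.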
     

08 Mar, 2011

3 commits

  • Currently we assume write-buffer size is always min_io_size. But
    this is about to change and write-buffers may be of variable size.
    Namely, they will be of max_write_size at the beginning, but will
    get smaller when we are approaching the end of LEB.

    This is a preparation patch which introduces 'size' field in
    the write-buffer structure which carries the current write-buffer
    size.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
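
A simplified C sketch of a write-buffer size that starts at max_write_size and shrinks near the end of the LEB. `wbuf_size()` is a hypothetical helper; the real calculation may also account for alignment, which is ignored here.

```c
#include <assert.h>

/* Size of the write-buffer when it starts at offset 'offs' in the LEB. */
static int wbuf_size(int leb_size, int offs, int max_write_size)
{
    int left = leb_size - offs; /* bytes remaining in the LEB */

    assert(offs >= 0 && offs <= leb_size);
    return left < max_write_size ? left : max_write_size;
}
```

For example, with an 8192-byte LEB and a 2048-byte max_write_size, a buffer starting at offset 7168 would be only 1024 bytes.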
     
  • Incorporate the LEB offset information into UBIFS. We'll use this
    information in one of the next patches to figure out where the
    max. write size offsets are relative to the PEB. So this patch is just
    a preparation.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     
  • Incorporate maximum write size into the UBIFS description data
    structure. This patch just introduces new 'c->max_write_size'
    and 'c->max_write_shift' fields as a preparation for the following
    patches.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     

18 Jan, 2011

2 commits

  • This is a preparatory patch which removes the 'c->always_chk_crc' flag, which was
    set during mounting and remounting to R/W mode, and introduces the 'c->mounting'
    flag, which is set when mounting. Now the 'c->always_chk_crc' flag is the
    same as 'c->remounting_rw || c->mounting'.

    This patch is a preparation for the next one, which will need to know when we
    are mounting and remounting to R/W mode, which is exactly what
    'c->always_chk_crc' effectively is, but its name does not suit the
    next patch. The other possibility would be to just re-name it, but then
    we'd end up with less logical flags coverage.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     
  • This is a cosmetic patch which re-arranges variables in 'struct ubifs_info'
    so that all boolean-like variables which are only changed during mounting or
    re-mounting to R/W mode are placed together. Then they are turned into
    bit-fields, which makes the structure a little bit smaller.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
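
A small C sketch of the size effect of grouping boolean-like members into bit-fields. The member names are illustrative and the structures are not the real 'struct ubifs_info'; exact sizes depend on the compiler and ABI.

```c
#include <assert.h>

/* Each boolean-like flag stored as a full int. */
struct flags_plain {
    int big_lpt;
    int no_chk_data_crc;
    int bulk_read;
    int rw_incompat;
    int remounting_rw;
    int default_compr;
};

/* The same flags placed together as bit-fields sharing one storage unit. */
struct flags_packed {
    unsigned int big_lpt : 1;
    unsigned int no_chk_data_crc : 1;
    unsigned int bulk_read : 1;
    unsigned int rw_incompat : 1;
    unsigned int remounting_rw : 1;
    unsigned int default_compr : 2;
};
```

On common ABIs with a 4-byte int, the first structure takes 24 bytes and the second takes 4. One trade-off is that adjacent bit-fields share a memory location, so concurrent updates to neighbouring fields are not independent, which fits flags that only change during mounting or re-mounting.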
     

20 Sep, 2010

1 commit

  • Commit 2fde99cb55fb9d9b88180512a5e8a5d939d27fec "UBIFS: mark VFS SB RO too"
    introduced a regression. This commit made UBIFS set the 'MS_RDONLY' flag in the
    VFS superblock when it switches to R/O mode due to an error. This was done
    to make VFS show the R/O UBIFS flag in /proc/mounts.

    However, several places in UBIFS rely on the 'MS_RDONLY' flag and assume this
    flag can only change when we re-mount. For example, 'ubifs_put_super()'.

    This patch introduces a new UBIFS flag - 'c->ro_mount' - which changes only when
    we re-mount, and preserves the way UBIFS was originally mounted (R/W or R/O).
    This allows us to de-initialize UBIFS cleanly in 'ubifs_put_super()'.

    This patch also changes all 'ubifs_assert(!c->ro_media)' assertions to
    'ubifs_assert(!c->ro_media && !c->ro_mount)', because we never should write
    anything if the FS was mounted R/O.

    All the places where we test for 'MS_RDONLY' flag in the VFS SB were changed
    and now we test the 'c->ro_mount' flag instead, because it preserves the
    original UBIFS mount type, unlike the 'MS_RDONLY' flag.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
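
A minimal C sketch of keeping "how the FS was mounted" separate from the flag the VFS shows. `struct sb_sketch` and the three functions are hypothetical; `vfs_rdonly` merely stands in for 'MS_RDONLY' in the VFS superblock.

```c
#include <assert.h>

struct sb_sketch {
    unsigned int vfs_rdonly : 1; /* stands in for MS_RDONLY in the VFS SB */
    unsigned int ro_mount : 1;   /* how the FS was mounted; changes only on re-mount */
    unsigned int ro_error : 1;   /* set when an error forced R/O mode */
};

/* Mount R/W or R/O: both the VFS flag and ro_mount reflect the request. */
static void mount_fs(struct sb_sketch *c, int read_only)
{
    c->vfs_rdonly = (read_only != 0);
    c->ro_mount = (read_only != 0);
    c->ro_error = 0;
}

/* Switch to R/O mode after an error; ro_mount is deliberately untouched. */
static void ro_mode_on_error(struct sb_sketch *c)
{
    c->ro_error = 1;
    c->vfs_rdonly = 1; /* so that the R/O state is visible to the VFS */
}

/* R/W-only state must be torn down if and only if the FS was mounted R/W. */
static int put_super_needs_rw_cleanup(const struct sb_sketch *c)
{
    return !c->ro_mount;
}
```

After an error on an R/W mount, `vfs_rdonly` is set but `put_super_needs_rw_cleanup()` still reports that the R/W state needs cleaning up, which is the case the regression broke.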
     

17 Sep, 2010

1 commit

  • The R/O state may have various reasons:

    1. The UBI volume is R/O
    2. The FS is mounted R/O
    3. The FS switched to R/O mode because of an error

    However, in UBIFS we have only one variable which represents cases
    1 and 3 - 'c->ro_media'. Indeed, we set this to 1 if we switch to
    R/O mode due to an error, and then we test it in many places to
    make sure that we stop writing as soon as the error happens.

    But this is very unclean. One consequence of this, for example, is
    that in 'ubifs_remount_fs()' we use 'c->ro_media' to check whether
    we are in R/O mode because of an error, and we print a message
    in this case. However, if we are in R/O mode because the media
    is R/O, our message is bogus.

    This patch introduces a new flag - 'c->ro_error' - which is set when
    we switch to R/O mode because of an error. It also changes all
    "if (c->ro_media)" checks to "if (c->ro_error)" checks, because
    this is what the checks actually mean. We do not need to check
    for 'c->ro_media' because if the UBI volume is in R/O mode, we
    do not allow R/W mounting, and no writes can happen. This is
    guaranteed by VFS. But it is good to double-check this, so this
    patch also adds many "ubifs_assert(!c->ro_media)" checks.

    In the 'ubifs_remount_fs()' function this patch makes a bit more
    changes - it fixes the error messages as well.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
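
A short C sketch of how separate flags let a re-mount to R/W report the right reason for refusing. The enum, `struct ro_state`, and `remount_rw()` are hypothetical and only illustrate the distinction between the two causes.

```c
#include <assert.h>

/* Hypothetical outcomes of a re-mount to R/W. */
enum remount_result { REMOUNT_OK, REFUSED_PRIOR_ERROR, REFUSED_RO_VOLUME };

struct ro_state {
    unsigned int ro_media : 1; /* the UBI volume itself is R/O */
    unsigned int ro_error : 1; /* an error forced the FS into R/O mode */
};

/* Decide whether a re-mount to R/W may proceed, and if not, why. */
static enum remount_result remount_rw(const struct ro_state *c)
{
    if (c->ro_error)
        return REFUSED_PRIOR_ERROR;
    if (c->ro_media)
        return REFUSED_RO_VOLUME;
    return REMOUNT_OK;
}
```

With a single shared flag, the two refusal cases could not be told apart, so one of the messages was bound to be wrong.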
     

30 Aug, 2010

1 commit

  • When scanning the flash, UBIFS builds a list of flash nodes of type
    'struct ubifs_scan_node'. Each scanned node has a 'snod->key' field. This field
    is valid for most of the nodes, but invalid for some node types, e.g., truncation
    nodes. It is safer to explicitly initialize such keys to something invalid,
    rather than leaving them initialized to all zeros, which has key type of
    UBIFS_INO_KEY.

    This patch introduces a new "fake" key type UBIFS_INVALID_KEY and initializes
    unused 'snod->key' objects to this type. It also adds debugging assertions in
    the TNC code to make sure no one ever tries to look these nodes up in the TNC.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
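
A small C sketch of why an all-zero key is risky and how an explicit invalid type helps. The enum values, `struct key_sketch`, and the helpers are assumptions for illustration; only the idea that type 0 is the inode key type comes from the text above.

```c
#include <assert.h>
#include <string.h>

/* Assumed values: the first real type is 0, the invalid type lies outside the valid range. */
enum key_type {
    INO_KEY = 0,
    DATA_KEY,
    DENT_KEY,
    XENT_KEY,
    KEY_TYPES_CNT,
    INVALID_KEY = KEY_TYPES_CNT
};

struct key_sketch {
    unsigned int inum;
    enum key_type type;
};

/* Explicitly mark a key as unused. */
static void key_init_invalid(struct key_sketch *k)
{
    k->inum = 0;
    k->type = INVALID_KEY;
}

/* A lookup routine could assert this before using the key. */
static int key_is_valid(const struct key_sketch *k)
{
    return k->type < KEY_TYPES_CNT;
}
```

A zero-filled key passes `key_is_valid()` because it looks like an inode key, while a key set up with `key_init_invalid()` does not, so an accidental lookup can be caught by an assertion.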
     

10 Aug, 2010

1 commit

  • Make sure we check the truncate constraints early on in ->setattr by adding
    those checks to inode_change_ok. Also clean up and document inode_change_ok
    to make this obvious.

    As a fallout we don't have to call inode_newsize_ok from simple_setsize and
    simplify it down to a truncate_setsize which doesn't return an error. This
    simplifies a lot of setattr implementations and means we use truncate_setsize
    almost everywhere. Get rid of fat_setsize now that it's trivial and mark
    ext2_setsize static to make the calling convention obvious.

    Keep the inode_newsize_ok in vmtruncate for now as all callers need an
    audit for its removal anyway.

    Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
    needs a deeper audit, but that is left for later.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

19 Jul, 2010

1 commit

  • The current shrinker implementation requires the registered callback
    to have global state to work from. This makes it difficult to shrink
    caches that are not global (e.g. per-filesystem caches). Pass the shrinker
    structure to the callback so that users can embed the shrinker structure
    in the context the shrinker needs to operate on and get back to it in the
    callback via container_of().

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
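
A userspace C sketch of the embedding pattern described above: the callback receives the shrinker and recovers its enclosing per-filesystem context with container_of(). The `struct shrinker` here is a simplified stand-in with a hypothetical callback signature, and `struct fs_cache` is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for the shrinker structure. */
struct shrinker {
    int (*shrink)(struct shrinker *s, int nr_to_scan);
};

/* Hypothetical per-filesystem cache that embeds its own shrinker. */
struct fs_cache {
    int nr_objects;
    struct shrinker shrinker;
};

/* Callback: free up to nr_to_scan objects and return how many remain. */
static int fs_cache_shrink(struct shrinker *s, int nr_to_scan)
{
    struct fs_cache *cache = container_of(s, struct fs_cache, shrinker);
    int freed = nr_to_scan < cache->nr_objects ? nr_to_scan : cache->nr_objects;

    cache->nr_objects -= freed;
    return cache->nr_objects;
}
```

Because the context is reached through the pointer that is passed in, each filesystem instance can register its own shrinker without any global state.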
     

28 May, 2010

2 commits


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

15 Sep, 2009

2 commits


10 Sep, 2009

1 commit

  • At the moment UBIFS prints large and scary error messages and
    flash dumps in case of nearly any corruption, even if it is
    a recoverable corruption. For example, if the master node is
    corrupted, ubifs_scan() prints error dumps, then UBIFS recovers
    just fine and goes on.

    This patch makes UBIFS print scary error messages only in
    real cases, which are not recoverable. It adds a 'quiet' argument
    to the 'ubifs_scan()' function, so the caller may ask 'ubifs_scan()'
    not to print error messages if the caller is able to do recovery.

    Signed-off-by: Artem Bityutskiy
    Reviewed-by: Adrian Hunter

    Artem Bityutskiy
     

05 Jul, 2009

3 commits


08 Jun, 2009

1 commit

  • UBIFS uses timers for write-buffer write-back. It is not
    crucial for us to write-back exactly on time. We are fine
    to write-back a little earlier or later. And this means
    we may optimize the UBIFS timer so that it can be grouped
    with a close timer event, so that the CPU would not be
    woken up just to do the write-back. This is an optimization
    to lessen power consumption, which is important in
    the embedded devices UBIFS is used for.

    hrtimers have a nice feature: they are effectively range
    timers, and we may define soft and hard limits for
    them. Standard timers do not have this feature. They may
    only be made deferrable, but this means there is effectively
    no hard limit. So, we had better use hrtimers.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     

26 Mar, 2009

1 commit

  • Now UBIFS is supported by u-boot. If we ever decide to change the
    media format, then people will have to upgrade their u-boots to
    mount new format images. However, very often it is possible to
    preserve R/O forward-compatibility, even though the write
    forward-compatibility is not preserved.

    This patch introduces a new super-block field which stores the
    R/O compatibility version.

    Signed-off-by: Artem Bityutskiy
    Acked-by: Adrian Hunter

    Artem Bityutskiy
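
A hedged C sketch of how a reader of the media format might use a separate R/O compatibility version. The version constants, the enum, and `check_versions()` are all hypothetical; the text above only says that such a superblock field is introduced.

```c
#include <assert.h>

#define SUPPORTED_FMT_VERSION 4       /* hypothetical: newest format this code can write */
#define SUPPORTED_RO_COMPAT_VERSION 0 /* hypothetical: newest format this code can read */

enum mount_decision { MOUNT_OK, MOUNT_RO_ONLY, MOUNT_REFUSED };

/* Decide what kind of mount is allowed for an image with these versions. */
static enum mount_decision check_versions(int fmt_version, int ro_compat_version)
{
    if (fmt_version <= SUPPORTED_FMT_VERSION)
        return MOUNT_OK;
    if (ro_compat_version <= SUPPORTED_RO_COMPAT_VERSION)
        return MOUNT_RO_ONLY; /* newer format, but still readable */
    return MOUNT_REFUSED;
}
```

Under these assumptions, an older reader such as a boot loader could keep mounting newer images R/O as long as the R/O compatibility version has not been raised.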
     

16 Mar, 2009

1 commit


08 Mar, 2009

1 commit


29 Jan, 2009

2 commits

  • This UBIFS feature has never worked properly, and it was a mistake
    to add it because we simply have no use-cases. So, let's still accept
    the fast_unmount mount option, but ignore it. This does not change
    much, because UBIFS commits in sync_fs anyway, and sync_fs is called
    while unmounting.

    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
     
  • - preserve the idx_gc list - it will be needed in the same
    state, should UBIFS be remounted rw again
    - prevent remounting ro if we have switched to read only
    mode (due to a fatal error)

    Signed-off-by: Adrian Hunter
    Signed-off-by: Artem Bityutskiy

    Adrian Hunter
     

27 Jan, 2009

1 commit

  • When data CRC checking is disabled, UBIFS returns an incorrect return
    code from the 'try_read_node()' function (0, which means CRC error,
    instead of 1), which makes the caller re-read the data node, but using
    a different code path, so the second read is fine. Thus, we read the
    same node twice. And the result of this is that UBIFS is slower
    with the no_chk_data_crc option than it is with the chk_data_crc option.
    This patch fixes the problem.

    Reported-by: Reuben Dowle
    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
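
A tiny C sketch of the return convention involved, assuming 1 means "node read and accepted" and 0 means "not accepted". `try_read_node_sketch()` is a hypothetical reduction that models only the CRC decision, not the actual read.

```c
#include <assert.h>

/*
 * Return 1 if the node is accepted, 0 otherwise.
 * 'chk_data_crc' is non-zero when data CRC checking is enabled.
 */
static int try_read_node_sketch(int crc_matches, int chk_data_crc)
{
    if (!chk_data_crc)
        return 1; /* check skipped: this is still a successful read */
    return crc_matches ? 1 : 0;
}
```

Returning 0 in the skipped-check branch would make the caller treat a good read as a failure and read the node a second time, which is the slowdown described above.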