15 Feb, 2006

2 commits

  • If 2 threads attached to the same process are blocking on different locks on
    different files (maybe even on different servers) but have the same lock
    arguments (i.e. same offset+length - actually quite common, since most
    processes try to lock the entire file) then the first GRANTED call that wakes
    one up will also wake the other.

    Currently when the NLM_GRANTED callback comes in, lockd walks the list of
    blocked locks in search of a match to the lock that the NLM server has
    granted. Although it checks the lock pid, start and end, it fails to check
    the filehandle and the server address.

    By checking the filehandle and server IP address, we ensure that this only
    happens if the locks truly are referencing the same file.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
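
    The stricter match can be sketched in plain userspace C. The struct and
    function names below are invented for illustration; the real lockd code
    compares fields of its own block/lock structures.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for the fields lockd compares when a
 * GRANTED callback arrives; the real structures are more involved. */
struct grant_key {
	unsigned int pid;
	unsigned long start, end;	/* byte range of the lock */
	unsigned char fh[16];		/* NFS file handle */
	unsigned long server_addr;	/* NLM server IPv4 address */
};

/* Old behaviour: pid + range only, so identical arguments against
 * different files (or different servers) wrongly match. */
static int grant_matches_loose(const struct grant_key *a,
			       const struct grant_key *b)
{
	return a->pid == b->pid && a->start == b->start && a->end == b->end;
}

/* Fixed behaviour: also require the same file handle and server address,
 * so only a lock on the same file on the same server matches. */
static int grant_matches(const struct grant_key *a, const struct grant_key *b)
{
	return grant_matches_loose(a, b) &&
	       a->server_addr == b->server_addr &&
	       memcmp(a->fh, b->fh, sizeof(a->fh)) == 0;
}
```

    With the extra two comparisons, two blocked locks that share pid and
    range but refer to different files no longer both wake up.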
     
  • This patch reverts commit f93ea411b73594f7d144855fd34278bcf34a9afc:
    [PATCH] jbd: split checkpoint lists

    This broke journal_flush() for OCFS2, which relies on it to ensure
    that metadata has reached disk for another node.

    And two related commits 8d3c7fce2d20ecc3264c8d8c91ae3beacdeaed1b and
    43c3e6f5abdf6acac9b90c86bf03f995bf7d3d92 with the subjects:
    [PATCH] jbd: log_do_checkpoint fix
    [PATCH] jbd: remove_transaction fix

    These seem to be incremental bugfixes on the original patch and as such are
    no longer needed.

    Signed-off-by: Mark Fasheh
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Fasheh
     

14 Feb, 2006

1 commit


13 Feb, 2006

1 commit

  • Unfortunately, the reiserfs_attrs_cleared bit in the superblock can
    lie. File systems have been observed with the bit set yet still containing
    garbage in the stat data field, causing unpredictable results.

    This patch backs out the enable-by-default behavior.

    It eliminates the changes from: d50a5cd860ce721dbeac6a4f3c6e42abcde68cd8,
    and ef5e5414e7a83eb9b4295bbaba5464410b11e030.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     

12 Feb, 2006

2 commits

  • With David Woodhouse

    select() presently has a habit of increasing the value of the user's
    `timeout' argument on return.

    We were writing back a timeout larger than the original because we
    _deliberately_ round up: we know we must wait at _least_ as long as the
    caller asked us to.

    The patch adds a couple of helper functions for magnitude comparison of
    timespecs and of timevals, and uses them to prevent the various poll and
    select functions from returning a timeout which is larger than the one which
    was passed in.

    The patch also fixes a bug in compat_sys_pselect7(): it was adding the new
    timeout value to the old one and was returning that. It should just return
    the new timeout value.

    (We have various handy timespec/timeval-to-from-nsec conversion functions in
    time.h. But this code open-codes it all).

    Cc: "David S. Miller"
    Cc: Andi Kleen
    Cc: Ulrich Drepper
    Cc: Thomas Gleixner
    Cc: george anzinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
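
    The clamping idea reduces to a magnitude comparison. A minimal userspace
    sketch, with invented type and function names (the helpers the patch
    actually adds live in the kernel's time headers):

```c
/* Compare two timespec-like values by magnitude, and never report a
 * remaining timeout larger than what the caller originally passed in. */
struct ts { long sec; long nsec; };

static int ts_ge(struct ts a, struct ts b)	/* a >= b ? */
{
	return a.sec > b.sec || (a.sec == b.sec && a.nsec >= b.nsec);
}

static struct ts clamp_timeout(struct ts remaining, struct ts original)
{
	/* Rounding up may leave `remaining' above `original'; clamp it. */
	return ts_ge(remaining, original) ? original : remaining;
}
```

    poll/select would apply such a clamp just before writing the remaining
    timeout back to userspace.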
     
  • The *at patches introduced fstatat and, due to insufficient research, I
    used the newfstat functions generally as the guideline. The result is that
    on 32-bit platforms we don't have all the information needed to implement
    fstatat64.

    This patch modifies the code to pass up 64-bit information if
    __ARCH_WANT_STAT64 is defined. I renamed the syscall entry point to make
    this clear. Other archs will continue to use the existing code. On x86-64
    the compat code is implemented using a new sys32_ function. This is what
    is done for the other stat syscalls as well.

    This patch might break some other archs (those which define
    __ARCH_WANT_STAT64 and which already wired up the syscall). Yet others
    might need changes to accommodate the compatibility mode. I really don't
    want to do that work because all this stat handling is a mess (more so in
    glibc, but the kernel is also affected). It should be done by the arch
    maintainers. I'll provide a stand-alone test shortly. Those who are
    eager could compile glibc and run 'make check' (no installation needed).

    The patch below has been tested on x86 and x86-64.

    Signed-off-by: Ulrich Drepper
    Cc: Christoph Hellwig
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
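
    The underlying problem can be seen with a toy example (the structures
    below are invented for illustration; real struct stat/stat64 layouts are
    arch-specific): squeezing a 64-bit stat field through a 32-bit result
    type silently loses information, which is why the 64-bit data must be
    passed up when __ARCH_WANT_STAT64 is defined.

```c
#include <stdint.h>

/* Toy stand-ins for the 32-bit and 64-bit stat results. */
struct toy_stat32 { uint32_t st_size; };
struct toy_stat64 { uint64_t st_size; };

static struct toy_stat32 fill_stat32(uint64_t size)
{
	struct toy_stat32 s = { (uint32_t)size };	/* silently truncates */
	return s;
}

static struct toy_stat64 fill_stat64(uint64_t size)
{
	struct toy_stat64 s = { size };			/* preserved */
	return s;
}
```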
     

08 Feb, 2006

12 commits


07 Feb, 2006

3 commits


06 Feb, 2006

5 commits

  • Signed-off-by: Ulrich Drepper
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • When walking a path, the LOOKUP_CONTINUE flag is used by some filesystems
    (for instance NFS) in order to determine whether or not it is looking up
    the last component of the path. If this is the case, it may have to look
    at the intent information in order to perform various tasks such as atomic
    open.

    A problem currently occurs when link_path_walk() hits a symlink. In this
    case LOOKUP_CONTINUE may be cleared prematurely when we hit the end of the
    path passed by __vfs_follow_link() (i.e. the end of the symlink path)
    rather than when we hit the end of the path passed by the user.

    The solution is to have link_path_walk() clear LOOKUP_CONTINUE if and only
    if that flag was unset when we entered the function.

    Signed-off-by: Trond Myklebust
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
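
    The fix can be sketched as follows; the flag value and function shape
    here are invented for the sketch, not the kernel's actual fs/namei.c
    code.

```c
#define LOOKUP_CONTINUE_SKETCH 0x01

/* Model of link_path_walk()'s flag handling after the fix: remember
 * whether LOOKUP_CONTINUE was already set on entry (as it is when we
 * recurse into a symlink mid-path), and clear it at the end of the
 * walk only if we set it ourselves. */
static unsigned int walk_flags(unsigned int flags)
{
	unsigned int was_set = flags & LOOKUP_CONTINUE_SKETCH;

	flags |= LOOKUP_CONTINUE_SKETCH;	/* intermediate components */
	/* ... walk each component of the path ... */
	if (!was_set)
		flags &= ~LOOKUP_CONTINUE_SKETCH;
	return flags;
}
```

    A walk started from the top level leaves with the flag clear, while a
    walk entered from inside a symlink leaves it set for the outer caller.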
     
  • Ben points out that:

    When writing files out using O_SYNC, jbd's 1 jiffy delay results in a
    significant drop in throughput as the disk sits idle. The patch below
    results in a 4-5x performance improvement (from 6.5MB/s to ~24-30MB/s on my
    IDE test box) when writing out files using O_SYNC.

    So optimise the batching code by omitting it entirely if the process which is
    doing a sync write is the same as the one which did the most recent sync
    write. If that's true, we're unlikely to get any other processes joining the
    transaction.

    (Has been in -mm for ages - it took me a long time to get on to performance
    testing it)

    Numbers, on write-cache-disabled IDE:

    /usr/bin/time -p synctest -n 10 -uf -t 1 -p 1 dir-name

    Unpatched:
    40 seconds
    Patched:
    35 seconds
    Batching disabled:
    35 seconds

    This is the problematic single-process-doing-fsync case. With multiple
    fsyncing processes the numbers are AFAICT unaltered by the patch.

    Aside: performance testing and instrumentation shows that the transaction
    batching almost doesn't help (testing with synctest -n 1 -uf -t 100 -p 10
    dir-name on non-writeback-caching IDE). This is because by the time one
    process is running a synchronous commit, a bunch of other processes already
    have a transaction handle open, so they're all going to batch into the same
    transaction anyway.

    The batching seems to offer maybe 5-10% speedup with this workload, but I'm
    pretty sure it was more important than that when it was first developed 4-odd
    years ago...

    Cc: "Stephen C. Tweedie"
    Cc: Benjamin LaHaise
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
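
    The heuristic reduces to a pid comparison. A userspace sketch, with
    field and function names invented for illustration:

```c
/* Skip the one-jiffy batching sleep when the process committing a sync
 * write is the same one that did the previous sync write: no other
 * writer is likely to join the transaction. */
struct journal_sketch { int last_sync_pid; };

static int should_sleep_for_batching(struct journal_sketch *j, int pid)
{
	int sleep = (j->last_sync_pid != pid);

	j->last_sync_pid = pid;
	return sleep;
}
```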
     
  • The last fix for this function in fact opened up a much more often
    triggering race.

    It was uncommented, tricky code that was buggy. Add a comment, make it
    less tricky, and fix the bug.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
    cpudata, instead of allocating memory only for possible cpus.

    As a preparation for changing that, we need to convert various 0 -> NR_CPUS
    loops to use for_each_cpu().

    (The above only applies to users of asm-generic/percpu.h. powerpc has gone it
    alone and is presently only allocating memory for present CPUs, so it's
    currently corrupting memory).

    Signed-off-by: Eric Dumazet
    Cc: "David S. Miller"
    Cc: James Bottomley
    Acked-by: Ingo Molnar
    Cc: Jens Axboe
    Cc: Anton Blanchard
    Acked-by: William Irwin
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
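
    The shape of the conversion, modelled in userspace with a bitmask
    standing in for the possible-cpu map (the NR_CPUS value and all names
    below are illustrative):

```c
#define NR_CPUS_SKETCH 8

/* The old style touched all NR_CPUS slots:
 *	for (cpu = 0; cpu < NR_CPUS; cpu++) init_cpudata(cpu);
 * The converted style visits only CPUs marked possible, which is what
 * for_each_cpu() does with a real cpumask. */
static int init_possible_cpus(unsigned int possible_mask)
{
	int cpu, initialized = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(possible_mask & (1u << cpu)))
			continue;	/* not a possible CPU: skip it */
		initialized++;		/* allocate/init per-cpu data here */
	}
	return initialized;
}
```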
     

04 Feb, 2006

14 commits