28 Mar, 2007

7 commits

  • The user can generate console output if they cause do_mmap() to fail
    during sys_io_setup(). This was seen in a regression test that does
    exactly that by calling mmap() in a loop until it gets -ENOMEM before
    calling io_setup().

    We don't need this printk at all, just remove it.

    Signed-off-by: Zach Brown
    Signed-off-by: Linus Torvalds

    Zach Brown
     
  • * 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
    Export __splice_from_pipe()
    2/2 splice: dont readpage
    1/2 splice: dont steal
    make elv_register() output atomic
    block: blk_max_pfn is somtimes wrong

    Linus Torvalds
     
  • Without the attached patch against current -git I get the following with
    !PROC_SYSCTL (with EMBEDDED and PROC_FS set):

    CC init/version.o
    LD init/built-in.o
    LD vmlinux
    fs/built-in.o: In function `do_proc_sys_lookup':
    proc_sysctl.c:(.text+0x26583): undefined reference to `sysctl_head_next'
    fs/built-in.o: In function `proc_sys_revalidate':
    proc_sysctl.c:(.text+0x265bb): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_readdir':
    proc_sysctl.c:(.text+0x26720): undefined reference to `sysctl_head_next'
    proc_sysctl.c:(.text+0x267d8): undefined reference to `sysctl_head_finish'
    proc_sysctl.c:(.text+0x268e7): undefined reference to `sysctl_head_next'
    proc_sysctl.c:(.text+0x26910): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_write':
    proc_sysctl.c:(.text+0x2695d): undefined reference to `sysctl_perm'
    proc_sysctl.c:(.text+0x2699c): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_read':
    proc_sysctl.c:(.text+0x269e9): undefined reference to `sysctl_perm'
    proc_sysctl.c:(.text+0x26a25): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_permission':
    proc_sysctl.c:(.text+0x26ad1): undefined reference to `sysctl_perm'
    proc_sysctl.c:(.text+0x26adb): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_lookup':
    proc_sysctl.c:(.text+0x26b39): undefined reference to `sysctl_head_finish'
    make: *** [vmlinux] Error 1

    All those functions are in fs/proc/proc_sysctl.c, which has no CONFIG_
    #define's in it, so the patch makes the compilation of that file depend
    on CONFIG_PROC_SYSCTL (the simplest choice).

    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mika Kukkonen
     
  • This cancel_delayed_work call is called from a function that is only called
    from a piece of code that immediately follows a cancel and destruction of the
    workqueue, so it's clearly a mistake.

    Cc: Oleg Nesterov
    Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     
  • The reused clientid here is more of a problem for the client than the
    server, and the client can report the problem itself if it's serious.

    Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bruce Fields
     
  • A regression introduced in the last set of acl patches removed the
    INHERIT_ONLY flag from aces derived from the posix acl. Fix.

    Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bruce Fields
     
  • ->readdir passes loff_t offsets (used as nfs cookies) to
    nfs3svc_encode_entry{,_plus}, but when they pass them on to encode_entry
    they become an 'off_t', which isn't good.

    So filesystems that returned 64bit offsets would lose.
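
    A minimal sketch of the truncation described above, not the nfsd code
    itself. It assumes a build where the narrow offset type is 32 bits wide;
    narrow_off_t and both helper names are hypothetical, and an unsigned
    stand-in is used so the conversion is fully defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit offset type. Unsigned, so that the
 * narrowing conversion below is fully defined by the C standard. */
typedef uint32_t narrow_off_t;

/* Mimics handing a 64-bit cookie to a helper whose parameter is narrow:
 * only the low 32 bits come out the other side. */
static int64_t pass_through_narrow(int64_t cookie)
{
    narrow_off_t narrowed;

    assert(cookie >= 0);              /* this sketch only handles non-negative cookies */
    narrowed = (narrow_off_t)cookie;  /* upper 32 bits are dropped here */
    return (int64_t)narrowed;
}

/* Returns 1 if the cookie survives the round trip unchanged, 0 otherwise. */
static int cookie_survives(int64_t cookie)
{
    return pass_through_narrow(cookie) == cookie;
}
```

    A cookie that fits in 32 bits round-trips unchanged, while one with any
    of the upper bits set comes back as a different value, so the client
    would resume the directory listing at the wrong place.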

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

27 Mar, 2007

4 commits

  • Ocfs2 wants to implement its own splice write actor so that it can better
    manage cluster / page locks. This lets us re-use the rest of splice write
    while only providing our own code where it's actually important.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Jens Axboe

    Mark Fasheh
     
  • Splice does not need to readpage to bring the page uptodate before writing
    to it, because prepare_write will take care of that for us.

    Splice is also wrong to SetPageUptodate before the page is actually uptodate.
    This results in the old uninitialised memory leak. This gets fixed as a
    matter of course when removing the readpage logic.

    Signed-off-by: Nick Piggin
    Signed-off-by: Jens Axboe

    Nick Piggin
     
  • Stealing pages with splice is problematic because we cannot just insert
    an uptodate page into the pagecache and hope the filesystem can take care
    of it later.

    We also cannot just ClearPageUptodate, then hope prepare_write does not
    write anything into the page, because I don't think prepare_write gives
    that guarantee.

    Remove support for SPLICE_F_MOVE for now. If we really want to bring it
    back, we might be able to do so with the new filesystem buffered write
    aops APIs I'm working on. If we really don't want to bring it back, then
    we should decide that sooner rather than later, and remove the flag and
    all the stealing infrastructure before anybody starts using it.

    Signed-off-by: Nick Piggin
    Signed-off-by: Jens Axboe

    Nick Piggin
     
  • This patch makes the needlessly global struct v9fs_cached_file_operations
    static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Eric Van Hensbergen

    Adrian Bunk
     

24 Mar, 2007

2 commits

  • A little mistake in 8a2bfdcbfa441d8b0e5cb9c9a7f45f77f80da465 is making all
    transactions synchronous, which reduces ext3 performance to comical levels.

    Cc: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Fix the /proc/pid/stat representation of executable boundaries. It should
    show the bounds of the executable, but instead shows the bounds of the
    loader.

    Before the patch is applied, the bug can be seen by examining, say, inetd:

    # ps | grep inetd
    610 root 0 S /usr/sbin/inetd -i
    # cat /proc/610/maps
    c0bb0000-c0bba788 r-xs 00000000 00:0b 14582157 /lib/ld-uClibc-0.9.28.so
    c3180000-c31dede4 r-xs 00000000 00:0b 14582179 /lib/libuClibc-0.9.28.so
    c328c000-c328ea00 rw-p 00008000 00:0b 14582157 /lib/ld-uClibc-0.9.28.so
    c3290000-c329b6c0 rw-p 00000000 00:00 0
    c32a0000-c32c0000 rwxp 00000000 00:00 0
    c32d4000-c32d8000 rw-p 00000000 00:00 0
    c3394000-c3398000 rw-p 00000000 00:00 0
    c3458000-c345f464 r-xs 00000000 00:0b 16384612 /usr/sbin/inetd
    c3470000-c34748f8 rw-p 00004000 00:0b 16384612 /usr/sbin/inetd
    c34cc000-c34d0000 rw-p 00000000 00:00 0
    c34d4000-c34d8000 rw-p 00000000 00:00 0
    c34d8000-c34dc000 rw-p 00000000 00:00 0
    # cat /proc/610/stat
    610 (inetd) S 1 610 610 0 -1 256 0 0 0 0 0 8 0 0 19 0 1 0 94392000718
    950272 0 4294967295 3233480704 3233523592 3274440352 3274439976
    3273467584 0 0 4096 90115 3221712796 0 0 17 0 0 0 0

    The code boundaries are 3233480704 to 3233523592, which are:

    (gdb) p/x 3233480704
    $1 = 0xc0bb0000
    (gdb) p/x 3233523592
    $2 = 0xc0bba788

    Which corresponds to this line in the maps file:

    c0bb0000-c0bba788 r-xs 00000000 00:0b 14582157 /lib/ld-uClibc-0.9.28.so

    Which is wrong. After the patch is applied, the maps file is pretty much
    identical (there's some minor shuffling of the location of some of the
    anonymous VMAs), but the stat file is now:

    # cat /proc/610/stat
    610 (inetd) S 1 610 610 0 -1 256 0 0 0 0 0 7 0 0 18 0 1 0 94392000722
    950272 0 4294967295 3276111872 3276141668 3274440352 3274439976
    3273467584 0 0 4096 90115 3221712796 0 0 17 0 0 0 0

    The code boundaries are then 3276111872 to 3276141668, which are:

    (gdb) p/x 3276111872
    $1 = 0xc3458000
    (gdb) p/x 3276141668
    $2 = 0xc345f464

    And these correspond to this line in the maps file instead:

    c3458000-c345f464 r-xs 00000000 00:0b 16384612 /usr/sbin/inetd

    Which is now correct.
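
    The check done by hand above can be sketched in C. This is an
    illustration only: it assumes the code boundaries are fields 26 and 27
    of the stat line (startcode and endcode in proc(5) numbering), and
    parse_code_bounds is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extracts the code boundaries from one /proc/<pid>/stat line.
 * Assumption: they are fields 26 and 27, counting pid as field 1.
 * Returns 0 on success, -1 if the line cannot be parsed. */
static int parse_code_bounds(const char *stat_line,
                             unsigned long long *start_code,
                             unsigned long long *end_code)
{
    const char *p;
    int field;

    assert(stat_line != NULL);

    /* The command name (field 2) may contain spaces, so resume parsing
     * after the last closing parenthesis. */
    p = strrchr(stat_line, ')');
    if (p == NULL)
        return -1;
    p++;

    /* Skip fields 3 through 25. */
    for (field = 3; field <= 25; field++) {
        while (*p == ' ' || *p == '\n')
            p++;
        if (*p == '\0')
            return -1;
        while (*p != '\0' && *p != ' ' && *p != '\n')
            p++;
    }

    /* Fields 26 and 27 follow. */
    if (sscanf(p, "%llu %llu", start_code, end_code) != 2)
        return -1;
    return 0;
}
```

    Printing the two values with "%#llx" gives the same hexadecimal
    addresses as the gdb session, ready to compare against the maps file.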

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

23 Mar, 2007

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    [CIFS] Allow reset of file to ATTR_NORMAL when archive bit not set
    [CIFS] Do not negotiate new POSIX_PATH_OPERATIONS_CAP yet
    [CIFS] reset mode when client notices that ATTR_READONLY is no longer set

    Linus Torvalds
     
  • Since freezable workqueues are broken in 2.6.21-rc
    (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
    http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
    it's better to change the only user of them, which is XFS, to use "normal"
    nonfreezable workqueues.

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Cc: David Chinner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • When a file had a dos attribute of 0x1 (readonly - but dos attribute
    of archive was not set) - doing chmod 0777 or equivalent would
    try to set a dos attribute of 0 (which some servers ignore)
    rather than ATTR_NORMAL (0x80) which most servers accept.
    Does not affect servers which support the CIFS Unix Extensions.

    Acked-by: Prasad Potluri
    Acked-by: Shirish Pargaonkar
    Signed-off-by: Steve French

    Steve French
     

17 Mar, 2007

11 commits

  • This bug was seen on ppc64, but it could have occurred on any
    architecture with a page size of 64k or above. The problem is that
    fs/binfmt_elf.c:randomize_stack_top() randomizes the stack to within
    0x7ff pages. On 4k page machines, this is 8MB; on 64k page boxes, this
    is 128MB.

    This matters because the new binary layout (selected in
    arch_pick_mmap_layout) places the mapping segment 128MB or the stack
    rlimit away from the top of the process memory, whichever is larger. If
    you choose an rlimit of less than 128MB (most defaults are in the 8MB
    range) then you can end up having your entire stack randomized away.

    The fix is to make randomize_stack_top() only steal at most 8MB, which this
    patch does. However, I have to point out that even with this, your stack
    rlimit might not be exactly what you get if it's > 128MB, because you're
    still losing the random offset of up to 8MB.

    The true fix should be to leave an explicit gap for the randomization plus
    a buffer when determining mmap_base, but that would involve fixing all the
    architectures.
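
    The arithmetic behind those figures can be sketched as follows. This is
    not the code from the patch: the 0x7ff page count comes from the text
    above, while the function names and the explicit 8MB clamp are
    assumptions made for illustration.

```c
#include <assert.h>

#define STACK_RND_PAGES 0x7ffUL   /* page count quoted in the commit message */

/* Largest possible stack offset, in bytes, when the random page count is
 * simply scaled by the page size (the behaviour before the fix). */
static unsigned long max_rnd_unbounded(unsigned long page_size)
{
    assert(page_size != 0);
    return STACK_RND_PAGES * page_size;
}

/* Largest possible offset when it is clamped to 8MB regardless of the
 * page size (one hypothetical way to express the fix). */
static unsigned long max_rnd_capped(unsigned long page_size)
{
    const unsigned long cap = 8UL * 1024 * 1024;
    unsigned long span = max_rnd_unbounded(page_size);

    return span < cap ? span : cap;
}
```

    With 4k pages the unbounded span is 8384512 bytes, just under 8MB; with
    64k pages it is 134152192 bytes, just under 128MB, which is enough to
    consume a default-sized stack rlimit entirely.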

    Cc: Arjan van de Ven
    Cc: Ingo Molnar
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     
  • Remove the misleading "Presently only useful on the IA-64 platform" text
    from the EFI partition Kconfig.

    EFI partitions are also used by Apple on their Intel-based machines and
    thus you need EFI partition support if you (for example) want to attach
    such a machine in target disk mode.

    Signed-off-by: Johannes Berg
    Acked-by: Matt Domsch
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     
  • Looks like we need a check in nfs_getattr() for a regular file. It makes
    no sense to call nfs_sync_mapping_range() on anything else. I think that
    should fix your problem: it will stop the NFS client from interfering
    with dirty pages on that inode's mapping.

    Signed-off-by: Trond Myklebust
    Acked-by: Olof Johansson
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • The current NFS client congestion logic is severely broken: it marks the
    backing device congested during each nfs_writepages() call but doesn't
    mirror this in nfs_writepage(), which makes for deadlocks. It also
    implements its own waitqueue.

    Replace this by a more regular congestion implementation that puts a cap on
    the number of active writeback pages and uses the bdi congestion waitqueue.

    Also always use an interruptible wait since it makes sense to be able to
    SIGKILL the process even for mounts without 'intr'.

    Signed-off-by: Peter Zijlstra
    Acked-by: Trond Myklebust
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • The only error code which comes from the partition checkers is -1, when
    they find an EIO. As per the discussion, ENOMEM values were ignored,
    as they might scare the users.

    So, with the current code, we end up returning -1 and not EIO for the
    ioctl() calls, which doesn't give the user any clue about what went
    wrong.
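
    To see why a bare -1 is unhelpful: on Linux a negative return value
    reaches user space as errno, and errno 1 is EPERM ("Operation not
    permitted"), which has nothing to do with an I/O failure. A minimal
    sketch, with both helper names being hypothetical and the mapping shown
    being an assumption about the intent rather than the patch itself:

```c
#include <assert.h>
#include <errno.h>

/* How a negative in-kernel return value appears to user space: the
 * syscall layer reports its magnitude through errno. */
static int errno_seen_by_user(int kernel_ret)
{
    assert(kernel_ret < 0);
    return -kernel_ret;
}

/* Hypothetical translation of a partition checker result: a bare -1 is
 * turned into -EIO, anything else is passed through unchanged. */
static int checker_result_to_error(int res)
{
    return res == -1 ? -EIO : res;
}
```

    Without the translation the caller sees EPERM; with it the caller sees
    EIO, which matches what actually went wrong.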

    Signed-off-by: Suzuki K P
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    suzuki
     
  • smbfs allocates rq_trans2buffer to handle a server's multi-part transaction2
    response messages. As struct smb_request may be reused, rq_trans2buffer is
    freed before each new request. However, if the last server response is a
    single trans2 message rather than a multi-part one, a new rq_trans2buffer
    is not allocated, but the last smb_rput still tries to free it again.

    To prevent this, the rq_trans2buffer pointer should be set to NULL after
    kfree.
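
    The pattern can be sketched in user-space C, with free() standing in
    for kfree(). The struct and function names below are hypothetical
    stand-ins, not the smbfs definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct smb_request. */
struct request {
    void *trans2buffer;   /* stand-in for rq_trans2buffer */
};

/* Frees the buffer and clears the pointer, so that a later call finds
 * NULL and free(NULL) is a harmless no-op. */
static void request_drop_buffer(struct request *req)
{
    free(req->trans2buffer);
    req->trans2buffer = NULL;
}

/* Prepares a reused request. A buffer is only allocated for multi-part
 * responses (multi_len > 0); single responses leave the pointer NULL. */
static int request_prepare(struct request *req, size_t multi_len)
{
    request_drop_buffer(req);
    if (multi_len == 0)
        return 0;
    req->trans2buffer = malloc(multi_len);
    return req->trans2buffer != NULL ? 0 : -1;
}

/* Final release of the request, the analogue of the last put. */
static void request_put(struct request *req)
{
    request_drop_buffer(req);
}
```

    If request_drop_buffer() left the stale pointer in place, a multi-part
    request followed by a single one and then request_put() would free the
    same buffer twice.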

    Signed-off-by: Vasily Averin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasily Averin
     
  • ecryptfs_d_release() first dereferences a pointer (via
    ecryptfs_dentry_to_lower()) and then afterwards checks to see if the
    pointer it just dereferenced is NULL (via ecryptfs_dentry_to_private()).

    This patch moves all of the work done on the dereferenced pointer inside a
    block governed by the condition that the pointer is non-NULL.
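
    A minimal sketch of the corrected ordering, using hypothetical types in
    place of the eCryptfs dentry structures: the pointer is tested first,
    and everything that dereferences it sits inside the guarded block.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the lower object and the private data. */
struct lower_ref {
    int refcount;
};

struct private_data {
    struct lower_ref *lower;
};

struct dentry_like {
    struct private_data *priv;
};

/* Releases the private data. Returns 1 if there was something to
 * release, 0 if the pointer was already NULL. */
static int release_private(struct dentry_like *d)
{
    struct private_data *priv = d->priv;

    if (priv != NULL) {
        /* All work on the pointer happens only after the NULL check. */
        if (priv->lower != NULL)
            priv->lower->refcount--;
        free(priv);
        d->priv = NULL;
        return 1;
    }
    return 0;
}
```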

    Signed-off-by: Michael Halcrow
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Halcrow
     
  • During modification of the code to support UFS2 writing, the case with
    "three indirect" blocks in the truncate path was missed. This patch fixes
    that.

    Signed-off-by: Evgeniy Dushistov
    Acked-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Dushistov
     
  • This patch fixes the behaviour in the following test scenario:

    lseek(fd, BIG_OFFSET)
    write(fd, buf, sizeof(buf))
    truncate(BIG_OFFSET)
    truncate(BIG_OFFSET + sizeof(buf))
    read(fd, buf...)

    If the file is big enough (BIG_OFFSET), we start allocating space by
    block, and the block size is ordinarily larger than the page size. So in
    truncate we should zero the rest of the block (except the last fragment,
    which the VFS should take care of), so that we do not get garbage when we
    extend the file.
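
    The scenario can be turned into a small user-space check. This is a
    sketch rather than the author's test: the function name, the 0xAA fill
    pattern and the use of a temporary file in a caller-supplied directory
    are all assumptions, and pread() is used so the read starts at
    BIG_OFFSET.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUF_LEN 512

/* Writes a pattern at big_offset, truncates it away, extends the file
 * again and reads the region back. Returns 1 if the region is all
 * zeroes, 0 if stale data came back, -1 on any other error. */
static int extended_region_is_zero(const char *dir, off_t big_offset)
{
    char path[4096];
    char buf[BUF_LEN];
    int fd, n, result = -1;
    size_t i;

    assert(big_offset >= 0);

    n = snprintf(path, sizeof(path), "%s/trunc-sketch-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the file goes away once the descriptor is closed */

    memset(buf, 0xAA, sizeof(buf));
    if (lseek(fd, big_offset, SEEK_SET) == (off_t)-1)
        goto out;
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
        goto out;
    if (ftruncate(fd, big_offset) != 0)
        goto out;
    if (ftruncate(fd, big_offset + (off_t)sizeof(buf)) != 0)
        goto out;
    if (pread(fd, buf, sizeof(buf), big_offset) != (ssize_t)sizeof(buf))
        goto out;

    result = 1;
    for (i = 0; i < sizeof(buf); i++) {
        if (buf[i] != 0) {
            result = 0;   /* garbage survived the truncate */
            break;
        }
    }
out:
    close(fd);
    return result;
}
```

    Passing a directory on the filesystem under test exercises its truncate
    path; a correct filesystem returns 1.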

    The patch also corrects the conversion from a pointer to a block to a
    physical block number, which helps with less commonly used UFS types.

    It also adds the inode number to the debug output.

    Signed-off-by: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Dushistov
     
  • This fixes "change blocks numbers on the fly" in the case when "prepare
    write page" is in the call chain. In this case some buffers may be not
    uptodate and not mapped, so we should take care to map them and load them
    from disk.

    This patch was tested with:
    - ufs regressions simple tests
    - fsx-linux
    - ltp(20060306)
    - untar and build kernel

    Signed-off-by: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Dushistov
     
  • This patch corrects the handling of time in the UFS2 case.

    1) According to the UFS2 disk layout, each modification/access/etc. "time"
    should be held in two variables: one 64bit for seconds and another 32bit
    for nanoseconds.

    At present, for some unknown reason, we assume that the "inode time" is
    held in three variables: 32bit for seconds, 32bit for milliseconds and
    32bit for nanoseconds.

    2) We set the number of nanoseconds in the "VFS inode" to 0 during read,
    instead of getting the value from the "on disk inode" (this should close
    http://bugzilla.kernel.org/show_bug.cgi?id=7991).

    Signed-off-by: Evgeniy Dushistov
    Cc: Bjoern Jacke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Dushistov
     

16 Mar, 2007

3 commits

  • The Samba server now expects that clients which send the new
    POSIX_PATH_OPERATIONS_CAP send all opens with this new
    SMB, and treats clients that could send the new
    posix open/create but don't as indicating that they really
    want Windows semantics on that handle (which allows Samba
    to support clients which want to support both types of
    behaviors on different handles on the same mount).

    We will put this capability back in the SetFSInfo
    negotiation with servers like Samba when the
    new POSIXCreate (create/open/mkdir) code is finished.

    Signed-off-by: Steve French

    Steve French
     
  • This patch (as869) reinstates the mutual exclusion between sysfs
    attribute method calls and attribute unregistration. The
    previously-reported deadlocks have been fixed, and this exclusion is
    by far the simplest way to avoid races during driver unbinding.

    The check for orphaned read-buffers has been moved down slightly, so
    that the remainder of a partially-read buffer will still be available
    to userspace even after the attribute has been unregistered.

    Signed-off-by: Alan Stern
    Cc: Hugh Dickins
    Cc: Cornelia Huck
    Cc: Oliver Neukum
    Signed-off-by: Linus Torvalds

    Alan Stern
     
  • This patch (as868) adds a helper routine for device drivers that need
    to set up a callback to perform some action in a different process's
    context. This is intended for use by attribute methods that want to
    unregister themselves or their parent device. Attribute method calls
    are mutually exclusive with unregistration, so such actions cannot be
    taken directly.

    Two attribute methods are converted to use the new helper routine: one
    for SCSI device deletion and one for System/390 ccwgroup devices.

    Signed-off-by: Alan Stern
    Cc: Hugh Dickins
    Cc: Cornelia Huck
    Cc: Oliver Neukum
    Signed-off-by: Linus Torvalds

    Alan Stern
     

15 Mar, 2007

10 commits