21 Jul, 2012

2 commits

  • Pull pnfs/ore fixes from Boaz Harrosh:
    "These are catastrophic fixes to the pnfs objects-layout that were just
    discovered. They are also destined for @stable.

    I found these and worked on them around RC1 time, but
    unfortunately went to the hospital for kidney stones and had a very
    slow recovery. I refrained from sending them as is, before proper
    testing, and surely enough I found a bug just yesterday.

    So now they are all well tested, and have my sign-off. Other than
    fixing the problem at hand, and assuming there are no bugs in the new
    code, there is low risk to any surrounding code. And in any case they
    affect only the paths that are now broken, that is, RAID5 in the pnfs
    objects-layout code. They do also affect exofs (which was not broken),
    but I have tested exofs, and it is lower priority than objects-layout
    because no one is using exofs, while objects-layout has lots of users."

    * 'for-linus' of git://git.open-osd.org/linux-open-osd:
    pnfs-obj: Fix __r4w_get_page when offset is beyond i_size
    pnfs-obj: don't leak objio_state if ore_write/read fails
    ore: Unlock r4w pages in exact reverse order of locking
    ore: Remove support of partial IO request (NFS crash)
    ore: Fix NFS crash by supporting any unaligned RAID IO

    Linus Torvalds
     
  • Pull UBIFS free space fix-up bugfix from Artem Bityutskiy:
    "It's been reported already twice recently:

    http://lists.infradead.org/pipermail/linux-mtd/2012-May/041408.html
    http://lists.infradead.org/pipermail/linux-mtd/2012-June/042422.html

    and we finally have the fix. I am quite confident the fix is correct
    because I could reproduce the problem with nandsim and verify the fix.
    It was also verified by Iwo (the reporter).

    I am also confident that this is OK to merge the fix so late because
    this patch affects only the fixup functionality, which is not used by
    most users."

    * tag 'upstream-3.5-rc8' of git://git.infradead.org/linux-ubifs:
    UBIFS: fix a bug in empty space fix-up

    Linus Torvalds
     

20 Jul, 2012

6 commits

  • It is very common for the end of the file to be unaligned on
    stripe size. But since we know it's beyond the file's end,
    the XOR should be performed with all zeros.

    Old code used to just read zeros out of the OSD devices, which is a great
    waste. But what scares me more about this situation is that we now have
    pages attached to the file's mapping that are beyond i_size. I don't
    like the kind of bugs this calls for.

    Fix both problems by returning a global zero_page if the offset is beyond
    i_size.
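
    A minimal user-space sketch of the idea, not the kernel code. The
    names sketch_file, r4w_get_page() and xor_into() are hypothetical, and
    the backing buffer is assumed to be padded to whole pages. Any
    page-aligned offset at or beyond i_size gets one shared page of zeros,
    so nothing is read from the device and no page beyond i_size is
    attached to the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* One shared page of zeros; static storage is zero-initialised. */
static const uint8_t zero_page[SKETCH_PAGE_SIZE];

/* Hypothetical stand-in for an inode plus its cached pages. */
struct sketch_file {
    const uint8_t *data;  /* backing buffer, assumed padded to whole pages */
    uint64_t i_size;      /* logical file size in bytes */
};

/*
 * Return the page to feed into the parity XOR for a page-aligned offset.
 * Offsets at or beyond i_size get the shared zero page instead of a
 * freshly read page.
 */
static const uint8_t *r4w_get_page(const struct sketch_file *f, uint64_t offset)
{
    assert(offset % SKETCH_PAGE_SIZE == 0);
    if (offset >= f->i_size)
        return zero_page;
    return f->data + offset;
}

/* XOR one source page into the parity buffer. */
static void xor_into(uint8_t *parity, const uint8_t *src)
{
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
        parity[i] ^= src[i];
}
```

    XOR-ing the zero page into the parity leaves it unchanged, which is
    why a page beyond the end of the file can be treated as all zeros.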

    TODO:
    Change the API to ->__r4w_get_page() so a NULL can be
    returned without being considered as error, since XOR API
    treats NULL entries as zero_pages.

    [Bug since 3.2. Should apply the same way to all Kernels since]
    CC: Stable Tree
    Signed-off-by: Boaz Harrosh

    Boaz Harrosh
     
  • [Bug since 3.2 Kernel]
    CC: Stable Tree
    Signed-off-by: Boaz Harrosh

    Boaz Harrosh
     
  • The read-4-write pages are locked in ascending address order,
    but were unlocked in whatever order was easiest to code. Fix that:
    locks should be released in the opposite order of locking, i.e.
    descending address order.

    I have not hit this deadlock. It was found by inspecting the
    debug print-outs. I suspect there is a higher-level lock at the caller that
    protects us, but fix it regardless.
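
    The ordering rule can be illustrated in user space with POSIX
    mutexes standing in for page locks. This is only a sketch; lock_all()
    and unlock_all_reverse() are hypothetical names and have nothing to
    do with the ORE page-locking code itself.

```c
#include <pthread.h>
#include <stddef.h>

/* Release locks[n-1] down to locks[0], the reverse of acquisition order. */
static int unlock_all_reverse(pthread_mutex_t *locks, size_t n)
{
    int first_err = 0;

    for (size_t i = n; i-- > 0; ) {
        int err = pthread_mutex_unlock(&locks[i]);
        if (err && !first_err)
            first_err = err;
    }
    return first_err;
}

/* Acquire locks[0] up to locks[n-1] in ascending order. */
static int lock_all(pthread_mutex_t *locks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int err = pthread_mutex_lock(&locks[i]);
        if (err) {
            /* Unwind what was taken so far, again in reverse order. */
            unlock_all_reverse(locks, i);
            return err;
        }
    }
    return 0;
}
```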

    Signed-off-by: Boaz Harrosh

    Boaz Harrosh
     
  • Due to OOM situations the ore might fail to allocate all resources
    needed for IO of the full request. If some progress was possible
    it would proceed with a partial/short request, for the sake of
    forward progress.

    Since this crashes NFS-core, and exofs is just fine without it, just
    remove this contraption and fail.

    TODO:
    Support real forward progress with some reserved allocations
    of resources, such as mem pools and/or bio_sets

    [Bug since 3.2 Kernel]
    CC: Stable Tree
    CC: Benny Halevy
    Signed-off-by: Boaz Harrosh

    Boaz Harrosh
     
  • In RAID_5/6 we used to not permit an IO whose end
    byte is not stripe_size aligned and which spans more than one stripe,
    i.e. the caller had to check after submission whether the actual
    number of transferred bytes was shorter, and would need to resubmit
    a new IO with the remainder.

    Exofs supports this, and NFS was supposed to support this
    as well with its short write mechanism. But late testing has
    exposed a CRASH when this is used with non-RPC layout-drivers.

    The change at NFS is deep and risky; in its place, the fix
    at ORE to lift the limitation is actually clean and simple.
    So here it is below.

    The principle here is that in the case of unaligned IO on
    both ends, beginning and end, we will send two read requests:
    one like the old code, before the calculation of the first stripe,
    and also one at a new site, before the calculation of the last stripe.
    If any "boundary" is aligned or the complete IO is within a single
    stripe, we do a single read like before.

    The code is clean and simple by splitting the old _read_4_write
    into 3 even parts:
    1. _read_4_write_first_stripe
    2. _read_4_write_last_stripe
    3. _read_4_write_execute

    And calling 1+3 at the same place as before, 2+3 before the last
    stripe, and in the case of all in a single stripe then 1+2+3
    is performed additively.
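
    The case analysis above can be sketched as a small helper that counts
    how many read-4-write submissions an IO needs. This is an
    illustration only: r4w_read_requests() is a hypothetical name, and
    treating an IO that is aligned on both ends as needing no read at all
    is an assumption of the sketch, not something stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of read-4-write submissions for a write of `length` bytes at
 * `offset`, with full stripes of `stripe_size` bytes.
 */
static int r4w_read_requests(uint64_t offset, uint64_t length,
                             uint64_t stripe_size)
{
    assert(length > 0 && stripe_size > 0);

    uint64_t end = offset + length;
    bool head_unaligned = offset % stripe_size != 0;
    bool tail_unaligned = end % stripe_size != 0;
    bool single_stripe = offset / stripe_size == (end - 1) / stripe_size;

    /* Assumption: whole stripes need no old data to compute parity. */
    if (!head_unaligned && !tail_unaligned)
        return 0;

    /* Everything in one stripe: parts 1+2+3 run once, as one read. */
    if (single_stripe)
        return 1;

    /* Otherwise one read per unaligned boundary (1+3 and/or 2+3). */
    return (head_unaligned ? 1 : 0) + (tail_unaligned ? 1 : 0);
}
```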

    Why did I not think of it before? Well, I had a stroke of
    genius, because I have stared at this code for 2 years and did
    not find this simple solution until today. Not that I did not try.

    This solution is much better for NFS than the previous supposed
    solution, because the short write was dealt with out-of-band after
    IO_done, which would cause a seeky IO pattern, whereas here
    we execute in order. In both solutions we do 2 separate reads, only
    here we do it within a single IO request. (And actually combine two
    writes into a single submission.)

    NFS/exofs code need not change since the ORE API communicates the new
    shorter length on return; what will happen is that this case will not
    occur anymore.

    hurray!!

    [Stable this is an NFS bug since 3.2 Kernel should apply cleanly]
    CC: Stable Tree
    Signed-off-by: Boaz Harrosh

    Boaz Harrosh
     
  • UBIFS has a feature called "empty space fix-up" which is a quirk to work around
    limitations of dumb flasher programs, namely those flashers that are unable
    to skip NAND pages full of 0xFFs while flashing, which leaves the empty space at
    the end of half-filled eraseblocks unusable for UBIFS. This feature is
    relatively new (introduced in v3.0).

    The fix-up routine (fixup_free_space()) is executed only once at the very first
    mount if the superblock has the 'space_fixup' flag set (can be done with -F
    option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and
    writes it back to the same LEB. The routine assumes the image is pristine and
    does not have anything in the journal.

    There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly.
    All but one LEB of the log of a pristine file-system are empty. And one
    contains just a commit start node. And 'fixup_free_space()' just unmapped this
    LEB, which resulted in wiping the commit start node. As a result, some users
    were unable to mount the file-system next time with the following symptom:

    UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node
    UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0

    The root cause of this bug was that 'fixup_free_space()' wrongly assumed
    that the beginning of empty space in the log head (c->lhead_offs) was known
    on mount. However, that is not the case: it was always 0. UBIFS does not store
    it in the master node and finds it out by scanning the log on every mount.

    The fix is simple - just pass commit start node size instead of 0 to
    'fixup_leb()'.

    Signed-off-by: Artem Bityutskiy
    Cc: stable@vger.kernel.org [v3.0+]
    Reported-by: Iwo Mergler
    Tested-by: Iwo Mergler
    Reported-by: James Nute

    Artem Bityutskiy
     

19 Jul, 2012

1 commit

  • Pull CIFS fixes from Steve French.

    * git://git.samba.org/sfrench/cifs-2.6:
    cifs: always update the inode cache with the results from a FIND_*
    cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
    cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap space
    Initialise mid_q_entry before putting it on the pending queue

    Linus Torvalds
     

18 Jul, 2012

3 commits

  • Caused, AFAICS, by mismerge in commit ff9cb1c4eead ("Merge branch
    'for_linus' into for_linus_merged")

    Signed-off-by: Al Viro
    Cc: Theodore Ts'o
    Cc: stable@vger.kernel.org # 3.3+
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • Pull a last-minute PM update from Rafael J. Wysocki:
    "This renames CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND to encourage future
    reuse of the capability in question in related cases."

    * tag 'pm-post-3.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND

    Linus Torvalds
     
  • As discussed in
    http://thread.gmane.org/gmane.linux.kernel/1249726/focus=1288990,
    the capability introduced in 4d7e30d98939a0340022ccd49325a3d70f7e0238
    to govern EPOLLWAKEUP seems misnamed: this capability is about governing
    the ability to suspend the system, not using a particular API flag
    (EPOLLWAKEUP). We should make the name of the capability more general
    to encourage reuse in related cases. (Whether or not this capability
    should also be used to govern the use of /sys/power/wake_lock is a
    question that needs to be separately resolved.)

    This patch renames the capability to CAP_BLOCK_SUSPEND. In order to ensure
    that the old capability name doesn't make it out into the wild, could you
    please apply and push up the tree to ensure that it is incorporated
    for the 3.5 release.

    Signed-off-by: Michael Kerrisk
    Acked-by: Serge Hallyn
    Signed-off-by: Rafael J. Wysocki

    Michael Kerrisk
     

17 Jul, 2012

5 commits

  • When we get back a FIND_FIRST/NEXT result, we have some info about the
    dentry that we use to instantiate a new inode. We were ignoring and
    discarding that info when we had an existing dentry in the cache.

    Fix this by updating the inode in place when we find an existing dentry
    and the uniqueid is the same.

    Cc: # .31.x
    Reported-and-Tested-by: Andrew Bartlett
    Reported-by: Bill Robertson
    Reported-by: Dion Edwards
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • Jian found that when he ran fsx on a 32 bit arch with a large wsize the
    process and one of the bdi writeback kthreads would sometimes deadlock
    with a stack trace like this:

    crash> bt
    PID: 2789 TASK: f02edaa0 CPU: 3 COMMAND: "fsx"
    #0 [eed63cbc] schedule at c083c5b3
    #1 [eed63d80] kmap_high at c0500ec8
    #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
    #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
    #4 [eed63e50] do_writepages at c04f3e32
    #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
    #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
    #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
    #8 [eed63ecc] do_sync_write at c052d202
    #9 [eed63f74] vfs_write at c052d4ee
    #10 [eed63f94] sys_write at c052df4c
    #11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004 EBX: 00000003 ECX: abd73b73 EDX: 012a65c6
    DS: 007b ESI: 012a65c6 ES: 007b EDI: 00000000
    SS: 007b ESP: bf8db178 EBP: bf8db1f8 GS: 0033
    CS: 0073 EIP: 40000424 ERR: 00000004 EFLAGS: 00000246

    Each task would kmap part of its address array before getting stuck, but
    not enough to actually issue the write.

    This patch fixes this by serializing the marshal_iov operations for
    async reads and writes. The idea here is to ensure that cifs
    aggressively tries to populate a request before attempting to fulfill
    another one. As soon as all of the pages are kmapped for a request, then
    we can unlock and allow another one to proceed.

    There's no need to do this serialization on non-CONFIG_HIGHMEM arches
    however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

    Cc:
    Reported-by: Jian Li
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • We currently rely on being able to kmap all of the pages in an async
    read or write request. If you're on a machine that has CONFIG_HIGHMEM
    set then that kmap space is limited, sometimes to as low as 512 slots.

    With 512 slots, we can only support up to a 2M r/wsize, and that's
    assuming that we can get our greedy little hands on all of them. There
    are other users however, so it's possible we'll end up stuck with a
    size that large.

    Since we can't handle a rsize or wsize larger than that currently, cap
    those options at the number of kmap slots we have. We could consider
    capping it even lower, but we currently default to a max of 1M. Might as
    well allow those luddites on 32 bit arches enough rope to hang
    themselves.
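
    The arithmetic behind the 2M figure is just slots times page size:
    512 * 4096 bytes = 2 MiB. A minimal sketch of such a cap follows;
    cap_io_size() and its parameters are hypothetical names, not the
    cifs implementation.

```c
#include <stddef.h>

/*
 * Clamp a requested rsize/wsize to what can be kmapped at once.
 * kmap_slots and page_size are passed in so the sketch does not depend
 * on any kernel constant.
 */
static size_t cap_io_size(size_t requested, size_t kmap_slots,
                          size_t page_size)
{
    size_t limit = kmap_slots * page_size;

    return requested > limit ? limit : requested;
}
```

    With 512 slots and 4 KiB pages, a requested 4 MiB wsize is clamped to
    2 MiB, while the 1 MiB default passes through unchanged.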

    A more robust fix would be to teach the send and receive routines how
    to contend with an array of pages so we don't need to marshal up a kvec
    array at all. That's a fairly significant overhaul though, so we'll need
    this limit in place until that's ready.

    Cc:
    Reported-by: Jian Li
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • A user reported a crash in cifs_demultiplex_thread() caused by an
    incorrectly set mid_q_entry->callback() function. It appears that the
    callback assignment made in cifs_call_async() was not flushed back to
    memory suggesting that a memory barrier was required here. Changing the
    code to make sure that the mid_q_entry structure was completely
    initialised before it was added to the pending queue fixes the problem.
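
    The pattern behind the fix, sketched in user space: fill in every
    field of the entry first, and only then publish it on the shared
    queue under the queue lock, so a consumer can never observe a
    half-initialised entry. The types and names below (mid_entry,
    pending_queue, queue_mid(), mark_done()) are hypothetical and only
    loosely modelled on the description above.

```c
#include <pthread.h>
#include <stddef.h>

struct mid_entry {
    int mid;                               /* request identifier */
    int done;                              /* set by the example callback */
    void (*callback)(struct mid_entry *);  /* run by the consumer thread */
    struct mid_entry *next;
};

struct pending_queue {
    pthread_mutex_t lock;
    struct mid_entry *head;
};

/* Example callback: just records that it ran. */
static void mark_done(struct mid_entry *m)
{
    m->done = 1;
}

/* Initialise the entry completely, then make it visible to consumers. */
static void queue_mid(struct pending_queue *q, struct mid_entry *m,
                      int mid, void (*cb)(struct mid_entry *))
{
    /* Step 1: all fields are set while the entry is still private. */
    m->mid = mid;
    m->done = 0;
    m->callback = cb;

    /* Step 2: publish; the lock orders the stores above before this. */
    pthread_mutex_lock(&q->lock);
    m->next = q->head;
    q->head = m;
    pthread_mutex_unlock(&q->lock);
}
```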

    Signed-off-by: Sachin Prabhu
    Reviewed-by: Jeff Layton
    Reviewed-by: Shirish Pargaonkar
    Signed-off-by: Steve French

    Sachin Prabhu
     
  • Pull xfs regression fixes from Ben Myers:
    - Really fix a cursor leak in xfs_alloc_ag_vextent_near
    - Fix a performance regression related to doing allocation in
    workqueues
    - Prevent recursion in xfs_buf_iorequest which is causing stack
    overflows
    - Don't call xfs_bdstrat_cb in xfs_buf_iodone callbacks

    * tag 'for-linus-v3.5-rc7' of git://oss.sgi.com/xfs/xfs:
    xfs: do not call xfs_bdstrat_cb in xfs_buf_iodone_callbacks
    xfs: prevent recursion in xfs_buf_iorequest
    xfs: don't defer metadata allocation to the workqueue
    xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near

    Linus Torvalds
     

16 Jul, 2012

1 commit

  • If a parent and child process open the two ends of a fifo, and the
    child immediately exits, the parent may receive a SIGCHLD before its
    open() returns. In that case, we need to make sure that open() will
    return successfully after the SIGCHLD handler returns, instead of
    throwing EINTR or being restarted. Otherwise, the restarted open()
    would incorrectly wait for a second partner on the other end.

    The following test demonstrates the EINTR that was wrongly thrown from
    the parent’s open(). Change .sa_flags = 0 to .sa_flags = SA_RESTART
    to see a deadlock instead, in which the restarted open() waits for a
    second reader that will never come. (On my systems, this happens
    pretty reliably within about 5 to 500 iterations. Others report that
    it manages to loop ~forever sometimes; YMMV.)

    #include <sys/stat.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0)

    void handler(int signum) {}

    int main()
    {
        struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
        CHECK(sigaction(SIGCHLD, &act, NULL));
        CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
        for (;;) {
            int fd;
            pid_t pid;
            putc('.', stderr);
            CHECK(pid = fork());
            if (pid == 0) {
                CHECK(fd = open("fifo", O_RDONLY));
                _exit(0);
            }
            CHECK(fd = open("fifo", O_WRONLY));
            CHECK(close(fd));
            CHECK(waitpid(pid, NULL, 0));
        }
    }

    This is what I suspect was causing the Git test suite to fail in
    t9010-svn-fe.sh:

    http://bugs.debian.org/678852

    Signed-off-by: Anders Kaseorg
    Reviewed-by: Jonathan Nieder
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Anders Kaseorg
     

14 Jul, 2012

6 commits

  • xfs_bdstrat_cb only adds a check for a shutdown filesystem over
    xfs_buf_iorequest, but xfs_buf_iodone_callbacks just checked for a shut down
    filesystem a little earlier. In addition the shutdown handling in
    xfs_bdstrat_cb is not very suitable for this caller.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • If the b_iodone handler is run in calling context in xfs_buf_iorequest we
    can run into a recursion where xfs_buf_iodone_callbacks keeps calling back
    into xfs_buf_iorequest because an I/O error happened, which in turn runs
    the b_iodone handler again. This chain will usually not take long
    because the filesystem gets shut down because of log I/O errors, but even
    over a short time it can cause stack overflows if run in the same context.

    As a short term workaround make sure we always call the iodone handler in
    workqueue context.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Almost all metadata allocations come from shallow stack usage
    situations. Avoid the overhead of switching the allocation to a
    workqueue as we are not in danger of running out of stack when
    making these allocations. Metadata allocations are already marked
    through the args that are passed down, so this is trivial to do.

    Signed-off-by: Dave Chinner
    Reported-by: Mel Gorman
    Tested-by: Mel Gorman
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • The current cursor is reallocated when retrying the allocation, so
    the existing cursor needs to be destroyed in both the restart and
    the failure cases.

    Signed-off-by: Dave Chinner
    Tested-by: Mike Snitzer
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • Pull NFS client bugfixes from Trond Myklebust:
    - Fix an NFSv4 mount regression
    - Fix O_DIRECT list manipulation snafus

    * tag 'nfs-for-3.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFSv4: Fix an NFSv4 mount regression
    NFS: Fix list manipulation snafus in fs/nfs/direct.c

    Linus Torvalds
     
  • This can be trivially triggered from userspace by passing in something unexpected.

    kernel BUG at fs/locks.c:1468!
    invalid opcode: 0000 [#1] SMP
    RIP: 0010:generic_setlease+0xc2/0x100
    Call Trace:
    __vfs_setlease+0x35/0x40
    fcntl_setlease+0x76/0x150
    sys_fcntl+0x1c6/0x810
    system_call_fastpath+0x1a/0x1f

    Signed-off-by: Dave Jones
    Cc: stable@kernel.org # 3.2+
    Signed-off-by: Linus Torvalds

    Dave Jones
     

13 Jul, 2012

1 commit

  • Commit 080399aaaf35 ("block: don't mark buffers beyond end of disk as
    mapped") exposed a bug in __getblk_slow that causes mount to hang as it
    loops infinitely waiting for a buffer that lies beyond the end of the
    disk to become uptodate.

    The problem was initially reported by Torsten Hilbrich here:

    https://lkml.org/lkml/2012/6/18/54

    and also reported independently here:

    http://www.sysresccd.org/forums/viewtopic.php?f=13&t=4511

    and then Richard W.M. Jones and Marcos Mello noted a few separate
    bugzillas also associated with the same issue. This patch has been
    confirmed to fix:

    https://bugzilla.redhat.com/show_bug.cgi?id=835019

    The main problem is here, in __getblk_slow:

    for (;;) {
        struct buffer_head * bh;
        int ret;

        bh = __find_get_block(bdev, block, size);
        if (bh)
            return bh;

        ret = grow_buffers(bdev, block, size);
        if (ret < 0)
            return NULL;
        if (ret == 0)
            free_more_memory();
    }

    __find_get_block does not find the block, since it will not be marked as
    mapped, and so grow_buffers is called to fill in the buffers for the
    associated page. I believe the for (;;) loop is there primarily to
    retry in the case of memory pressure keeping grow_buffers from
    succeeding. However, we also continue to loop for other cases, like the
    block lying beyond the end of the disk. So, the fix I came up with is to
    only loop when grow_buffers fails due to memory allocation issues
    (return value of 0).
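
    A toy user-space model of the resulting control flow, as a sketch
    only. toy_dev, toy_find(), toy_grow() and toy_getblk() are
    hypothetical, and the use of -ENXIO for a block beyond the end of the
    device is an assumption of the sketch. The point is that a request
    that can never succeed must surface as a negative return, so that the
    loop retries only on the "out of memory" value 0.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_dev {
    uint64_t nr_blocks;  /* device size in blocks */
    bool grown;          /* have buffers been set up yet? */
    int block_buf;       /* stand-in for a buffer head */
};

/* Lookup succeeds only for in-range blocks whose buffers exist. */
static int *toy_find(struct toy_dev *d, uint64_t block)
{
    return (d->grown && block < d->nr_blocks) ? &d->block_buf : NULL;
}

/* Returns <0 on hard error, 0 for "retry after reclaim", >0 on success. */
static int toy_grow(struct toy_dev *d, uint64_t block)
{
    if (block >= d->nr_blocks)
        return -ENXIO;  /* retrying cannot help: report an error */
    d->grown = true;
    return 1;
}

static int *toy_getblk(struct toy_dev *d, uint64_t block)
{
    for (;;) {
        int *bh = toy_find(d, block);
        if (bh)
            return bh;

        int ret = toy_grow(d, block);
        if (ret < 0)
            return NULL;  /* hard failure ends the loop */
        /* ret == 0 would mean memory pressure; the toy never returns it. */
    }
}
```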

    The attached patch was tested by myself, Torsten, and Rich, and was
    found to resolve the problem in all cases.

    Signed-off-by: Jeff Moyer
    Reported-and-Tested-by: Torsten Hilbrich
    Tested-by: Richard W.M. Jones
    Reviewed-by: Josh Boyer
    Cc: Stable # 3.0+
    [ Jens is on vacation, taking this directly - Linus ]
    --
    Stable Notes: this patch requires backport to 3.0, 3.2 and 3.3.
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     

12 Jul, 2012

3 commits

  • fat_encode_fh() can fetch an invalid i_pos value on systems where 64-bit
    accesses are not atomic. Make it use the same accessor as the rest of the
    FAT code.
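
    The hazard is a torn read: on a machine where a 64-bit load is done
    as two 32-bit halves, a reader can see half of an old value and half
    of a new one unless readers and writers share a lock. A user-space
    sketch of a locked accessor pair follows; toy_inode and the two
    helper names are hypothetical, and the real FAT accessor may be
    implemented differently.

```c
#include <pthread.h>
#include <stdint.h>

struct toy_inode {
    pthread_mutex_t lock;  /* protects i_pos against torn access */
    int64_t i_pos;
};

/* Read i_pos as one consistent 64-bit value. */
static int64_t toy_i_pos_read(struct toy_inode *inode)
{
    pthread_mutex_lock(&inode->lock);
    int64_t pos = inode->i_pos;
    pthread_mutex_unlock(&inode->lock);
    return pos;
}

/* Update i_pos under the same lock the readers take. */
static void toy_i_pos_write(struct toy_inode *inode, int64_t pos)
{
    pthread_mutex_lock(&inode->lock);
    inode->i_pos = pos;
    pthread_mutex_unlock(&inode->lock);
}
```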

    Signed-off-by: Steven J. Magnani
    Acked-by: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven J. Magnani
     
  • There is a bug in the below scenario for !CONFIG_MMU:

    1. create a new file
    2. mmap the file and write to it
    3. read the file; the read does not return the value written

    Because

    sys_read() -> generic_file_aio_read() -> simple_readpage() -> clear_page()

    which causes the page to be zeroed.

    Add SetPageUptodate() to ramfs_nommu_expand_for_mapping() so that
    generic_file_aio_read() does not call simple_readpage().

    Signed-off-by: Bob Liu
    Cc: Hugh Dickins
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Greg Ungerer
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     
  • As ocfs2_fallocate() will invoke __ocfs2_change_file_space() with a NULL
    as the first parameter (file), it may trigger a NULL pointer dereference
    due to a missing check.

    Addresses http://bugs.launchpad.net/bugs/1006012

    Signed-off-by: Luis Henriques
    Reported-by: Bret Towe
    Tested-by: Bret Towe
    Cc: Sunil Mushran
    Acked-by: Joel Becker
    Acked-by: Mark Fasheh
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis Henriques
     

11 Jul, 2012

1 commit

  • The helper nfs_fs_mount() will always call nfs4_try_mount with the
    mount_info->fill_super argument pointing to nfs_fill_super, which is
    NFSv2/v3 only.
    Fix is to have nfs4_try_mount replace it with nfs4_fill_super.

    The regression was introduced by commit c40f8d1d (NFS: Create a common
    fs_mount() function)

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

08 Jul, 2012

2 commits

  • Fix 2 bugs in nfs_direct_write_reschedule:

    - The request needs to be removed from the 'reqs' list before it can
    be added to 'failed'.
    - Fix an infinite loop if the 'failed' list is non-empty.
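
    Both points can be illustrated with a small circular doubly linked
    list in user space. This is a sketch with hypothetical sketch_*
    helpers, not the code in fs/nfs/direct.c: an entry is unlinked from
    its current list before being linked into another, and a drain loop
    removes the entry it just looked at so that it terminates.

```c
#include <stddef.h>

struct node {
    struct node *prev, *next;
};

static void sketch_list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void sketch_list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void sketch_list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Unlink from the old list first, then link into the new one. */
static void sketch_list_move_tail(struct node *n, struct node *head)
{
    sketch_list_del(n);
    sketch_list_add_tail(n, head);
}

static size_t sketch_list_len(const struct node *head)
{
    size_t len = 0;

    for (const struct node *p = head->next; p != head; p = p->next)
        len++;
    return len;
}

/* Drain a list; each pass must remove an entry or the loop never ends. */
static size_t sketch_list_drain(struct node *head)
{
    size_t drained = 0;

    while (head->next != head) {
        sketch_list_del(head->next);
        drained++;
    }
    return drained;
}
```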

    Reported-by: Julia Lawall
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • We already use them for openat() and friends, but fchdir() also wants to
    be able to use O_PATH file descriptors. This should make it comparable
    to the O_SEARCH of Solaris. In particular, O_PATH allows you to access
    (not-quite-open) a directory you don't have read permission to, only
    execute permission.
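
    From user space, the behaviour this enables looks roughly like the
    sketch below. chdir_via_o_path() is a hypothetical helper; it needs
    a Linux kernel that accepts O_PATH descriptors in fchdir(), and the
    fallback O_PATH value is the one used on x86 and ARM, included only
    as an assumption in case the libc headers do not expose the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000  /* assumption: x86/ARM value of the flag */
#endif

/*
 * Change the working directory through an O_PATH descriptor, which
 * needs only execute (search) permission on the directory.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int chdir_via_o_path(const char *dir)
{
    int fd = open(dir, O_PATH | O_DIRECTORY);
    if (fd == -1)
        return -1;

    int ret = fchdir(fd);
    int saved_errno = errno;  /* close() must not clobber the error */

    close(fd);
    errno = saved_errno;
    return ret;
}
```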

    Noticed during development of multithread support for ksh93.

    Reported-by: ольга крыжановская
    Cc: Al Viro
    Cc: stable@kernel.org # O_PATH introduced in 3.0+
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

07 Jul, 2012

4 commits

  • Pull eCryptfs fixes from Tyler Hicks:
    "Fixes an incorrect access mode check when preparing to open a file in
    the lower filesystem. This isn't an urgent fix, but it is simple and
    the check was obviously incorrect.

    Also fixes a couple important bugs in the eCryptfs miscdev interface.
    These changes are low risk due to the small number of users that use
    the miscdev interface. I was able to keep the changes minimal and I
    have some cleaner, more complete changes queued up for the next merge
    window that will build on these patches."

    * tag 'ecryptfs-3.5-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
    eCryptfs: Gracefully refuse miscdev file ops on inherited/passed files
    eCryptfs: Fix lockdep warning in miscdev operations
    eCryptfs: Properly check for O_RDONLY flag before doing privileged open

    Linus Torvalds
     
  • File operations on /dev/ecryptfs would BUG() when the operations were
    performed by processes other than the process that originally opened the
    file. This could happen with open files inherited after fork() or file
    descriptors passed through IPC mechanisms. Rather than calling BUG(), an
    error code can be safely returned in most situations.

    In ecryptfs_miscdev_release(), eCryptfs still needs to handle the
    release even if the last file reference is being held by a process that
    didn't originally open the file. ecryptfs_find_daemon_by_euid() will not
    be successful, so a pointer to the daemon is stored in the file's
    private_data. The private_data pointer is initialized when the miscdev
    file is opened and only used when the file is released.

    https://launchpad.net/bugs/994247

    Signed-off-by: Tyler Hicks
    Reported-by: Sasha Levin
    Tested-by: Sasha Levin

    Tyler Hicks
     
  • Pull ocfs2 fixes from Joel Becker.

    * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    aio: make kiocb->private NUll in init_sync_kiocb()
    ocfs2: Fix bogus error message from ocfs2_global_read_info
    ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.
    ocfs2: use spinlock irqsave for downconvert lock.patch
    ocfs2: Misplaced parens in unlikley
    ocfs2: clear unaligned io flag when dio fails

    Linus Torvalds
     
  • Pull cifs fixes from Steve French.

    * git://git.samba.org/sfrench/cifs-2.6:
    cifs: when server doesn't set CAP_LARGE_READ_X, cap default rsize at MaxBufferSize
    cifs: fix parsing of password mount option

    Linus Torvalds
     

06 Jul, 2012

1 commit

  • Pull btrfs updates from Chris Mason:
    "I held off on my rc5 pull because I hit an oops during log recovery
    after a crash. I wanted to make sure it wasn't a regression because
    we have some logging fixes in here.

    It turns out that a commit during the merge window just made it much
    more likely to trigger directory logging instead of full commits,
    which exposed an old bug.

    The new backref walking code got some additional fixes. This should
    be the final set of them.

    Josef fixed up a corner where our O_DIRECT writes and buffered reads
    could expose old file contents (not stale, just not the most recent).
    He and Liu Bo fixed crashes during tree log recovery as well.

    Ilya fixed errors while we resume disk balancing operations on
    readonly mounts."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: run delayed directory updates during log replay
    Btrfs: hold a ref on the inode during writepages
    Btrfs: fix tree log remove space corner case
    Btrfs: fix wrong check during log recovery
    Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
    Btrfs: resume balance on rw (re)mounts properly
    Btrfs: restore restriper state on all mounts
    Btrfs: fix dio write vs buffered read race
    Btrfs: don't count I/O statistic read errors for missing devices
    Btrfs: resolve tree mod log locking issue in btrfs_next_leaf
    Btrfs: fix tree mod log rewind of ADD operations
    Btrfs: leave critical region in btrfs_find_all_roots as soon as possible
    Btrfs: always put insert_ptr modifications into the tree mod log
    Btrfs: fix tree mod log for root replacements at leaf level
    Btrfs: support root level changes in __resolve_indirect_ref
    Btrfs: avoid waiting for delayed refs when we must not

    Linus Torvalds
     

04 Jul, 2012

4 commits

  • 'status' variable in ocfs2_global_read_info() is always != 0 when leaving the
    function because it happens to contain number of read bytes. Thus we always log
    error message although everything is OK. Since all error cases properly call
    mlog_errno() before jumping to out_err, there's no reason to call mlog_errno()
    on exit at all. This is a fallout of c1e8d35e (conversion of mlog_exit()
    calls).

    Signed-off-by: Jan Kara
    Signed-off-by: Joel Becker

    Jan Kara
     
  • …sters_nocache() or ocfs2_inode_lock() call failed.

    Hello,

    Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE,
    we should return the internal error unchanged if ocfs2_inode_lock() or
    ocfs2_get_clusters_nocache() fails, rather than ENXIO.
    Otherwise, it will confuse user applications when they try to understand the root cause.

    Thanks Dave for pointing this out.

    Thanks,
    -Jeff

    Cc: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Jie Liu <jeff.liu@oracle.com>
    Signed-off-by: Joel Becker <jlbec@evilplan.org>

    Jeff Liu
     
  • When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ, it
    deadlocks itself trying to take the same spinlock in ocfs2_wake_downconvert_thread.
    Below is the stack snippet.

    The patch disables interrupts when acquiring dc_task_lock spinlock.

    ocfs2_wake_downconvert_thread
    ocfs2_rw_unlock
    ocfs2_dio_end_io
    dio_complete
    .....
    bio_endio
    req_bio_endio
    ....
    scsi_io_completion
    blk_done_softirq
    __do_softirq
    do_softirq
    irq_exit
    do_IRQ
    ocfs2_downconvert_thread
    [kthread]

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Joel Becker

    Srinivas Eeda
     
  • Fix misplaced parentheses

    Signed-off-by: Roel Kluin
    Signed-off-by: Joel Becker

    roel