20 Sep, 2010

1 commit

  • Coda's REQ_* defines were renamed to avoid clashes with the block layer
    (commit 4aeefdc69f7b: "coda: fixup clash with block layer REQ_*
    defines").

    However, one was missed, so response messages are no longer matched with
    requests and waiting threads are no longer woken up. This patch fixes
    this.

    Signed-off-by: Jan Harkes
    [ Also fixed up whitespace while at it -Linus ]
    Signed-off-by: Linus Torvalds

    Jan Harkes
     

17 Sep, 2010

2 commits


15 Sep, 2010

3 commits

  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
    statfs() gives ESTALE error
    NFS: Fix a typo in nfs_sockaddr_match_ipaddr6
    sunrpc: increase MAX_HASHTABLE_BITS to 14
    gss:spkm3 miss returning error to caller when import security context
    gss:krb5 miss returning error to caller when import security context
    Remove incorrect do_vfs_lock message
    SUNRPC: cleanup state-machine ordering
    SUNRPC: Fix a race in rpc_info_open
    SUNRPC: Fix race corrupting rpc upcall
    Fix null dereference in call_allocate

    Linus Torvalds
     
  • Tavis Ormandy pointed out that do_io_submit does not do proper bounds
    checking on the passed-in iocb array:

           if (unlikely(nr < 0))
                   return -EINVAL;

           if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
                   return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

    The attached patch checks for overflow, and if it is detected, the
    number of iocbs submitted is scaled down to a number that will fit in
    the long.  This is an ok thing to do, as sys_io_submit is documented as
    returning the number of iocbs submitted, so callers should handle a
    return value of less than the 'nr' argument passed in.

    Reported-by: Tavis Ormandy
    Signed-off-by: Jeff Moyer
    Signed-off-by: Linus Torvalds

    Jeff Moyer
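
The clamping described above can be illustrated with a small userspace sketch. This is not the kernel patch itself: `clamp_iocb_count` and its parameters are hypothetical names, and the sketch only shows the idea of scaling `nr` down so that `nr * elem_size` cannot overflow a `long`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): return a count that is
 * safe to multiply by elem_size without overflowing a long.
 * Returns -1 for a negative count, mirroring the -EINVAL check above.
 */
long clamp_iocb_count(long nr, size_t elem_size)
{
    long max_nr;

    assert(elem_size > 0);          /* division below needs a nonzero size */

    if (nr < 0)
        return -1;

    /* Largest count whose byte size still fits in a long. */
    max_nr = (long)((unsigned long)LONG_MAX / elem_size);
    if (nr > max_nr)
        nr = max_nr;                /* scale down instead of failing */

    return nr;
}
```

A caller would then process at most the returned number of entries and report that number back, which is why a return value smaller than the requested `nr` has to be handled.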
     
  • cifs_get_smb_ses must be called on a server pointer on which it holds an
    active reference. It first does a search for an existing SMB session. If
    it finds one, it'll put the server reference and then try to ensure that
    the negprot is done, etc.

    If it encounters an error at that point then it'll return an error.
    There's a potential problem here though. When cifs_get_smb_ses returns
    an error, the caller will also put the TCP server reference leading to a
    double-put.

    Fix this by having cifs_get_smb_ses only put the server reference if
    it found an existing session that it could use and isn't returning an
    error.

    Cc: stable@kernel.org
    Reviewed-by: Suresh Jayaraman
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
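
The reference-counting rule in this fix can be sketched in plain C. The types and names below (`struct server`, `struct session`, `get_session`, `setup_rc`) are hypothetical stand-ins, not the CIFS structures; the sketch only models the "found an existing session" path and assumes the caller drops its own server reference whenever an error is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the server and session objects. */
struct server  { int refcount; };
struct session { struct server *srv; };

void server_put(struct server *srv)
{
    assert(srv->refcount > 0);      /* catches a double-put in this model */
    srv->refcount--;
}

/*
 * existing: a session that already holds its own server reference, or NULL.
 * setup_rc: result of the follow-up setup work (0 on success).
 * The caller's reference is dropped only when an existing session is
 * reused successfully; on error the caller still owns it and puts it.
 */
int get_session(struct server *srv, struct session *existing, int setup_rc)
{
    if (existing != NULL) {
        if (setup_rc != 0)
            return setup_rc;        /* no put here: the caller will do it */
        server_put(srv);            /* existing session already pins srv */
        return 0;
    }
    /* New session: it takes over the caller's reference (not modeled). */
    return setup_rc;
}
```

With the put moved after the error check, each reference is dropped exactly once on either path.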
     

14 Sep, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    cifs: prevent possible memory corruption in cifs_demultiplex_thread
    cifs: eliminate some more premature cifsd exits
    cifs: prevent cifsd from exiting prematurely
    [CIFS] ntlmv2/ntlmssp remove-unused-function CalcNTLMv2_partial_mac_key
    cifs: eliminate redundant xdev check in cifs_rename
    Revert "[CIFS] Fix ntlmv2 auth with ntlmssp"
    Revert "missing changes during ntlmv2/ntlmssp auth and sign"
    Revert "Eliminate sparse warning - bad constant expression"
    Revert "[CIFS] Eliminate unused variable warning"

    Linus Torvalds
     

13 Sep, 2010

9 commits

  • We should not use dotlversion for the dotu inode operations

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Eric Van Hensbergen

    Aneesh Kumar K.V
     
  • We should use the cached dentry operation only if caching mode is enabled

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Eric Van Hensbergen

    Aneesh Kumar K.V
     
  • NULL fid should be handled in cases where we end up calling v9fs_dir_release()
    before we even instantiate the fid in filp.

    Signed-off-by: Venkateswararao Jujjuri
    Signed-off-by: Eric Van Hensbergen

    jvrao
     
  • This was introduced by 7cadb63d58a932041afa3f957d5cbb6ce69dcee5

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Eric Van Hensbergen

    Aneesh Kumar K.V
     
  • Four memory leak fixes in the 9P code.

    Signed-off-by: Latchesar Ionkov
    Signed-off-by: Eric Van Hensbergen

    Latchesar Ionkov
     
  • The NFSv4 client's callback server calls svc_gss_principal(), which
    is defined in auth_rpcgss.ko.

    The NFSv4 server has the same dependency, and in addition calls
    svcauth_gss_flavor(), gss_mech_get_by_pseudoflavor(),
    gss_pseudoflavor_to_service() and gss_mech_put() from the same module.

    The module auth_rpcgss itself has no dependencies aside from sunrpc,
    so we only need to select RPCSEC_GSS.

    Reported-by: Uwe Kleine-König
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Hi,

    An NFS client executes a statfs("file", &buff) call.
    "file" exists / existed, the client has read / written it,
    but it has already closed it.

    user_path(pathname, &path) looks up "file" successfully in the
    directory-cache and restarts the aging timer of the directory-entry.
    This happens even if "file" has already been removed from the server,
    because the lookupcache=positive option I use keeps the entries valid
    for a while.

    nfs_statfs() returns ESTALE if "file" has already been removed from the
    server.

    If the user application repeats the statfs("file", &buff) call, we
    are stuck: "file" remains young forever in the directory-cache.

    Signed-off-by: Zoltan Menyhart
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Menyhart Zoltan
     
  • Reported-by: Ben Greear
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
     
  • The do_vfs_lock function in fs/nfs/file.c is only called if NLM is
    not being used, via the -onolock mount option. Therefore it cannot
    really be "out of sync with lock manager" when the local locking
    function called returns an error, as there will be no corresponding
    call to the NLM. For details, simply check the if/else in do_setlk
    and do_unlk in fs/nfs/file.c.

    Signed-Off-By: Fabio Olive Leite
    Reviewed-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Fabio Olive Leite
     

11 Sep, 2010

1 commit


10 Sep, 2010

12 commits

  • The workqueue implementation in 2.6.36-rcX has changed, resulting
    in the workqueues no longer having dedicated threads for work
    processing. This has caused severe livelocks under heavy parallel
    create workloads because the log IO completions have been getting
    held up behind metadata IO completions. Hence log commits would
    stall, memory allocation would stall because pages could not be
    cleaned, and lock contention on the AIL during inode IO completion
    processing was being seen to slow everything down even further.

    By making the log IO completion workqueue a high priority workqueue,
    log IO completions are queued ahead of all data/metadata IO completions
    and processed before the data/metadata completions. Hence the log never
    gets stalled, and operations needed to clean memory can continue as
    quickly as possible. This avoids the livelock conditions and allows
    the system to keep running under heavy load as per normal.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • An execve with a very large total of argument/environment strings
    can take a really long time in the execve system call. It runs
    uninterruptibly to count and copy all the strings. This change
    makes it abort the exec quickly if sent a SIGKILL.

    Note that this is the conservative change, to interrupt only for
    SIGKILL, by using fatal_signal_pending(). It would be perfectly
    correct semantics to let any signal interrupt the string-copying in
    execve, i.e. use signal_pending() instead of fatal_signal_pending().
    We'll save that change for later, since it could have user-visible
    consequences, such as having a timer set too quickly make it so that
    an execve can never complete, though it always happened to work before.

    Signed-off-by: Roland McGrath
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • This adds a preemption point during the copying of the argument and
    environment strings for execve, in copy_strings(). There is already
    a preemption point in the count() loop, so this doesn't add any new
    points in the abstract sense.

    When the total argument+environment strings are very large, the time
    spent copying them can be much more than a normal user time slice.
    So this change improves the interactivity of the rest of the system
    when one process is doing an execve with very large arguments.

    Signed-off-by: Roland McGrath
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • The CONFIG_STACK_GROWSDOWN variant of setup_arg_pages() does not
    check the size of the argument/environment area on the stack.
    When it is unworkably large, shift_arg_pages() hits its BUG_ON.
    This is exploitable with a very large RLIMIT_STACK limit, to
    create a crash pretty easily.

    Check that the initial stack is not too large to make it possible
    to map in any executable. We're not checking that the actual
    executable (or interpreter, for binfmt_elf) will fit. So those
    mappings might clobber part of the initial stack mapping. But
    that is just userland lossage that userland made happen, not a
    kernel problem.

    Signed-off-by: Roland McGrath
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: Range check cpu in blk_cpu_to_group
    scatterlist: prevent invalid free when alloc fails
    writeback: Fix lost wake-up shutting down writeback thread
    writeback: do not lose wakeup events when forking bdi threads
    cciss: fix reporting of max queue depth since init
    block: switch s390 tape_block and mg_disk to elevator_change()
    block: add function call to switch the IO scheduler from a driver
    fs/bio-integrity.c: return -ENOMEM on kmalloc failure
    bio-integrity.c: remove dependency on __GFP_NOFAIL
    BLOCK: fix bio.bi_rw handling
    block: put dev->kobj in blk_register_queue fail path
    cciss: handle allocation failure
    cfq-iosched: Documentation help for new tunables
    cfq-iosched: blktrace print per slice sector stats
    cfq-iosched: Implement tunable group_idle
    cfq-iosched: Do group share accounting in IOPS when slice_idle=0
    cfq-iosched: Do not idle if slice_idle=0
    cciss: disable doorbell reset on reset_devices
    blkio: Fix return code for mkdir calls

    Linus Torvalds
     
  • The XFS_IOC_FSGETXATTR ioctl allows unprivileged users to read 12
    bytes of uninitialized stack memory, because the fsxattr struct
    declared on the stack in xfs_ioc_fsgetxattr() does not alter (or zero)
    the 12-byte fsx_pad member before copying it back to the user. This
    patch takes care of it.

    Signed-off-by: Dan Rosenberg
    Reviewed-by: Eric Sandeen
    Signed-off-by: Alex Elder

    Dan Rosenberg
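
The class of leak described above, where padding bytes of a stack struct reach the user uninitialized, is commonly avoided by zeroing the whole struct before filling it in. The sketch below shows that pattern in userspace C; `struct fsx_like` and `fill_fsx` are hypothetical and only mimic a struct with a 12-byte pad member, they are not the XFS definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct with a trailing pad, loosely modeled on the description. */
struct fsx_like {
    unsigned int  flags;
    unsigned char pad[12];
};

/*
 * Zero the entire struct first so that pad (and any compiler-inserted
 * padding) carries no stale stack contents, then set the real fields.
 */
void fill_fsx(struct fsx_like *out, unsigned int flags)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    out->flags = flags;
}
```

Zeroing the full struct rather than only the named pad member also covers any implicit padding the compiler may add between fields.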
     
  • Commit 9eed1fb721c ("minix: replace inode uid,gid,mode init with helper")
    broke directory creation on minix filesystems.

    Fix it by passing the needed mode flag to inode init helper.

    Signed-off-by: Jorge Boncompte [DTI2]
    Cc: Dmitry Monakhov
    Cc: Al Viro
    Cc: [2.6.35.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jorge Boncompte [DTI2]
     
  • O_NONBLOCK on parisc has a dual value:

    #define O_NONBLOCK 000200004 /* HPUX has separate NDELAY & NONBLOCK */

    It is caught by the O_* bits uniqueness check and leads to a parisc
    compile error. The fix would be to take O_NONBLOCK out.

    Signed-off-by: Wu Fengguang
    Signed-off-by: James Bottomley
    Cc: Jamie Lokier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
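
One way to picture a bit-uniqueness check like the one mentioned above is as a population count: if every flag is a distinct single bit, the OR of all flags has exactly as many set bits as there are flags. The sketch below is an assumption about the general technique, with hypothetical helper names; it is not the kernel's check.

```c
#include <assert.h>
#include <stddef.h>

/* Count set bits by clearing the lowest set bit until none remain. */
int popcount32(unsigned int v)
{
    int n = 0;

    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/*
 * Return 1 if the n flags are n distinct single bits, 0 otherwise.
 * A flag with two bits set (such as 000200004) or two flags sharing
 * a bit makes the set-bit count differ from n.
 */
int flags_are_unique_bits(const unsigned int *flags, size_t n)
{
    unsigned int all = 0;
    size_t i;

    assert(flags != NULL || n == 0);
    for (i = 0; i < n; i++)
        all |= flags[i];

    return popcount32(all) == (int)n;
}
```

A dual-valued flag contributes two bits while being counted once, so the totals no longer match. That is why such a flag has to be left out of a check of this kind.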
     
  • Commit 74641f584da ("alpha: binfmt_aout fix") (May 2009) introduced a
    regression - binfmt_misc is now consulted after binfmt_elf, which will
    unfortunately break ia32el. ia32 ELF binaries on ia64 used to be matched
    using binfmt_misc and executed using a wrapper. As 32bit binaries are now
    matched by binfmt_elf before binfmt_misc kicks in, the wrapper is ignored.

    The fix increases precedence of binfmt_misc to the original state.

    Signed-off-by: Jan Sembera
    Cc: Ivan Kokshaysky
    Cc: Al Viro
    Cc: Richard Henderson [2.6.everything.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Sembera
     
  • Fix the left-over old ifdef for PG_uncached in /proc/kpageflags. Now it's
    used by x86, too.

    Signed-off-by: Takashi Iwai
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Takashi Iwai
     
  • commit c2c6ca4 (direct-io: do not merge logically non-contiguous requests)
    introduced a bug whereby all O_DIRECT I/Os were submitted a page at a time
    to the block layer. The problem is that the code expected
    dio->block_in_file to correspond to the current page in the dio. In fact,
    it corresponds to the previous page submitted via submit_page_section.
    This was purely an oversight, as the dio->cur_page_fs_offset field was
    introduced for just this purpose. This patch simply uses the correct
    variable when calculating whether there is a mismatch between contiguous
    logical blocks and contiguous physical blocks (as described in the
    comments).

    I also switched the if conditional following this check to an else if, to
    ensure that we never call dio_bio_submit twice for the same dio (in
    theory, this should not happen, anyway).

    I've tested this by running blktrace and verifying that a 64KB I/O was
    submitted as a single I/O. I also ran the patched kernel through
    xfstests' aio tests using xfs, ext4 (with 1k and 4k block sizes) and btrfs
    and verified that there were no regressions as compared to an unpatched
    kernel.

    Signed-off-by: Jeff Moyer
    Acked-by: Josef Bacik
    Cc: Christoph Hellwig
    Cc: Chris Mason
    Cc: [2.6.35.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     
  • So it can be used by all that need to check for that.

    Signed-off-by: Stefan Bader
    Signed-off-by: Linus Torvalds

    Stefan Bader
     

09 Sep, 2010

11 commits

  • * 'fixes' of git://oss.oracle.com/git/tma/linux-2.6:
    ocfs2: Fix orphan add in ocfs2_create_inode_in_orphan
    ocfs2: split out ocfs2_prepare_orphan_dir() into locking and prep functions
    ocfs2: allow return of new inode block location before allocation of the inode
    ocfs2: use ocfs2_alloc_dinode_update_counts() instead of open coding
    ocfs2: split out inode alloc code from ocfs2_mknod_locked
    Ocfs2: Fix a regression bug from mainline commit(6b933c8e6f1a2f3118082c455eef25f9b1ac7b45).
    ocfs2: Fix deadlock when allocating page
    ocfs2: properly set and use inode group alloc hint
    ocfs2: Use the right group in nfs sync check.
    ocfs2: Flush drive's caches on fdatasync
    ocfs2: make __ocfs2_page_mkwrite handle file end properly.
    ocfs2: Fix incorrect checksum validation error
    ocfs2: Fix metaecc error messages

    Linus Torvalds
     
  • cifs_demultiplex_thread sets the addr.sockAddr.sin_port without any
    regard for the socket family. While it may be that the error in question
    here never occurs on an IPv6 socket, it's probably best to be safe and
    set the port properly if it ever does.

    Break the port setting code out of cifs_fill_sockaddr and into a new
    function, and call that from cifs_demultiplex_thread.

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
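
Setting a port according to the socket family, as described above, might look like the following userspace sketch. `set_port_for_family` is a hypothetical name and the return convention (1 on success, 0 for an unsupported family) is an assumption; the sketch uses the standard `sockaddr_in` and `sockaddr_in6` layouts rather than any CIFS code.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Store the port (given in host byte order) in the field that matches
 * the address family. Writing sin_port unconditionally would be wrong
 * for an AF_INET6 address, whose port field is sin6_port.
 */
int set_port_for_family(struct sockaddr *addr, unsigned short port)
{
    assert(addr != NULL);

    switch (addr->sa_family) {
    case AF_INET:
        ((struct sockaddr_in *)addr)->sin_port = htons(port);
        return 1;
    case AF_INET6:
        ((struct sockaddr_in6 *)addr)->sin6_port = htons(port);
        return 1;
    default:
        return 0;   /* unknown family: leave the address untouched */
    }
}
```

Keeping the family switch in one helper lets every caller set the port the same way instead of repeating the IPv4-only assignment.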
     
  • If the tcpStatus is still CifsNew, the main cifs_demultiplex_loop can
    break out prematurely in some cases. This is wrong as we will almost
    always have other structures with pointers to the TCP_Server_Info. If
    the main loop breaks under any condition other than tcpStatus ==
    CifsExiting, then it'll face a use-after-free situation.

    I don't see any reason to treat a CifsNew tcpStatus differently than
    CifsGood. I believe we'll still want to attempt to reconnect in either
    case. What should happen in those situations is that the MIDs get marked
    as MID_RETRY_NEEDED. This will make CIFSSMBNegotiate return -EAGAIN, and
    then the caller can retry the whole thing on a newly reconnected socket.
    If that fails again in the same way, the caller of cifs_get_smb_ses
    should tear down the TCP_Server_Info struct.

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • When cifs_demultiplex_thread exits, it does a number of cleanup tasks
    including freeing the TCP_Server_Info struct. Much of the existing code
    in cifs assumes that when there is a cifsSesInfo struct, it holds a
    reference to a valid TCP_Server_Info struct.

    We can never allow cifsd to exit when a cifsSesInfo struct is still
    holding a reference to the server. The server pointers will then point
    to freed memory.

    This patch eliminates a couple of questionable conditions where it does
    this. The idea here is to make an -EINTR return from kernel_recvmsg
    behave the same way as -ERESTARTSYS or -EAGAIN. If the task was
    signalled from cifs_put_tcp_session, then tcpStatus will be CifsExiting,
    and the kernel_recvmsg call will return quickly.

    There's another condition where this can occur -- if the
    tcpStatus is still in CifsNew, then it will also exit if the server
    closes the socket prematurely. I think we'll probably also need to fix
    that situation, but that requires a bit more consideration.

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • This function is not used, so remove the definition and declaration.

    Reviewed-by: Jeff Layton
    Signed-off-by: Shirish Pargaonkar
    Signed-off-by: Steve French

    Steve French
     
  • The VFS always checks that the source and target of a rename are on the
    same vfsmount, and hence have the same superblock. So, this check is
    redundant. Remove it and simplify the error handling.

    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • This reverts commit 9fbc590860e75785bdaf8b83e48fabfe4d4f7d58.

    The change to kernel crypto and fixes to ntlmv2 and ntlmssp
    series introduced a regression. Deferring this patch series
    to 2.6.37 after Shirish fixes it.

    Signed-off-by: Steve French
    Acked-by: Jeff Layton
    CC: Shirish Pargaonkar

    Steve French
     
  • This reverts commit 3ec6bbcdb4e85403f2c5958876ca9492afdf4031.

    The change to kernel crypto and fixes to ntlmv2 and ntlmssp
    series introduced a regression. Deferring this patch series
    to 2.6.37 after Shirish fixes it.

    Signed-off-by: Steve French
    Acked-by: Jeff Layton
    CC: Shirish Pargaonkar

    Steve French
     
  • This reverts commit 2d20ca835867d93ead6ce61780d883a4b128106d.

    The change to kernel crypto and fixes to ntlmv2 and ntlmssp
    series introduced a regression. Deferring this patch series
    to 2.6.37 after Shirish fixes it.

    Signed-off-by: Steve French
    Acked-by: Jeff Layton
    CC: Shirish Pargaonkar

    Steve French
     
  • The change to kernel crypto and fixes to ntlmv2 and ntlmssp
    series introduced a regression. Deferring this patch series
    to 2.6.37 after Shirish fixes it.

    This reverts commit c89e5198b26a869ce2842bad8519264f3394dee9.

    Signed-off-by: Steve French
    Acked-by: Jeff Layton
    CC: Shirish Pargaonkar

    Steve French
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
    fuse: fix lock annotations
    fuse: flush background queue on connection close

    Linus Torvalds