10 Nov, 2010

1 commit


09 Nov, 2010

9 commits

  • Andrew Hendry reported a kmemleak warning in 2.6.37-rc1 while editing a
    text file with gedit over cifs.

    unreferenced object 0xffff88022ee08b40 (size 32):
    comm "gedit", pid 2524, jiffies 4300160388 (age 2633.655s)
    hex dump (first 32 bytes):
    5c 2e 67 6f 75 74 70 75 74 73 74 72 65 61 6d 2d \.goutputstream-
    35 42 41 53 4c 56 00 de 09 00 00 00 2c 26 78 ee 5BASLV......,&x.
    backtrace:
    [] kmemleak_alloc+0x2d/0x60
    [] __kmalloc+0xe3/0x1d0
    [] build_path_from_dentry+0xf0/0x230 [cifs]
    [] cifs_setattr+0x9e/0x770 [cifs]
    [] notify_change+0x170/0x2e0
    [] sys_fchmod+0x10b/0x140
    [] system_call_fastpath+0x16/0x1b
    [] 0xffffffffffffffff

    Commit 1025774c, which removed inode_setattr(), seems to have introduced
    this memleak by returning early without freeing 'full_path'.

    Reported-by: Andrew Hendry
    Cc: Christoph Hellwig
    Reviewed-by: Jeff Layton
    Signed-off-by: Suresh Jayaraman
    Signed-off-by: Steve French

    Suresh Jayaraman
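
The leak described above (an early return that skips freeing 'full_path') is usually avoided by routing every exit through a single cleanup label. The following is a minimal userspace sketch of that pattern, not the actual cifs_setattr() code. The names build_path(), free_path(), set_attr() and the live_allocations counter are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter: lets a caller observe that nothing leaked. */
static int live_allocations;

/* Hypothetical stand-in for a path builder that allocates its result. */
static char *build_path(const char *name)
{
    char *p = malloc(strlen(name) + 2);

    if (!p)
        return NULL;
    p[0] = '\\';
    strcpy(p + 1, name);
    live_allocations++;
    return p;
}

static void free_path(char *p)
{
    if (p) {
        live_allocations--;
        free(p);
    }
}

/* Returns 0 on success or a negative errno-style value on failure. */
static int set_attr(const char *name, int size_change_fails)
{
    int rc = 0;
    char *full_path = build_path(name);

    if (!full_path)
        return -ENOMEM;

    if (size_change_fails) {
        rc = -EIO;
        goto out;       /* a bare "return rc;" here would leak full_path */
    }

    /* ... remaining attribute changes would use full_path here ... */

out:
    free_path(full_path);
    return rc;
}
```

After any call to set_attr(), successful or not, live_allocations is back to zero because both paths pass through the out label.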
     
  • Fix openpromfs compilation by adding a missing semicolon in
    fs/openpromfs/inode.c openprom_mount().

    Signed-off-by: Meelis Roos
    Signed-off-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Meelis Roos
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: Add new ext4 inode tracepoints
    ext4: Don't call sb_issue_discard() in ext4_free_blocks()
    ext4: do not try to grab the s_umount semaphore in ext4_quota_off
    ext4: fix potential race when freeing ext4_io_page structures
    ext4: handle writeback of inodes which are being freed
    ext4: initialize the percpu counters before replaying the journal
    ext4: "ret" may be used uninitialized in ext4_lazyinit_thread()
    ext4: fix lazyinit hang after removing request

    Linus Torvalds
     
  • Commit 13cfb7334e made cifs_ioctl use the tlink attached to the
    cifsFileInfo for a filp. This ignores the case of an open directory
    however, which in CIFS can have a NULL private_data until a readdir
    is done on it.

    This patch re-adds the NULL pointer checks that were removed in commit
    50ae28f01 and moves the setting of tcon and "caps" variables lower.

    Long term, a better fix would be to establish a f_op->open routine for
    directories that populates that field at open time, but that requires
    some other changes to how readdir calls are handled.

    Reported-by: Kjell Rune Skaaraas
    Reviewed-and-Tested-by: Suresh Jayaraman
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
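
As a rough illustration of the re-added check, the sketch below tests private_data for NULL before anything is read through it, and only then picks up the dependent value. The struct layouts, the get_caps() helper and the -EBADF return value are assumptions made for this example and are not taken from the cifs sources.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for cifsFileInfo and struct file. */
struct file_info {
    unsigned int caps;
};

struct file {
    struct file_info *private_data;
};

/* Returns 0 and fills *caps, or a negative errno-style value. */
static int get_caps(const struct file *filp, unsigned int *caps)
{
    const struct file_info *info = filp->private_data;

    /* An open directory may have no private_data until readdir runs. */
    if (info == NULL)
        return -EBADF;  /* assumed error code, for illustration only */

    *caps = info->caps; /* dereference only after the NULL check */
    return 0;
}
```

Reading the dependent fields after the check, rather than at the top of the function, is what "moves the setting of tcon and caps variables lower" amounts to in this simplified form.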
     
  • Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and
    ext4_begin_ordered_truncate()

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Commit 5c521830cf (ext4: Support discard requests when running in
    no-journal mode) attempts to add sb_issue_discard() for data blocks
    (in data=writeback mode) and in no-journal mode. Unfortunately, this
    no longer works, because in commit dd3932eddf (block: remove
    BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
    interface, and there are times when we call ext4_free_blocks() when we
    are holding a spinlock, or are otherwise in an atomic context.

    For now, I've removed the call to sb_issue_discard() to prevent a
    deadlock or (if spinlock debugging is enabled) failures like this:

    BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
    Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
    Call Trace:
    [] __schedule_bug+0x5e/0x70
    [] schedule+0x950/0xa70
    [] ? insert_work+0x7d/0x90
    [] ? queue_work_on+0x1d/0x30
    [] ? queue_work+0x37/0x60
    [] schedule_timeout+0x21d/0x360
    [] ? generic_make_request+0x2c3/0x540
    [] wait_for_common+0xc0/0x150
    [] ? default_wake_function+0x0/0x10
    [] ? submit_bio+0x7c/0x100
    [] ? wake_bit_function+0x0/0x40
    [] wait_for_completion+0x18/0x20
    [] blkdev_issue_discard+0x1b9/0x210
    [] ext4_free_blocks+0x68e/0xb60
    [] ? __ext4_handle_dirty_metadata+0x110/0x120
    [] ext4_ext_truncate+0x8cc/0xa70
    [] ? pagevec_lookup+0x1e/0x30
    [] ext4_truncate+0x178/0x5d0
    [] ? unmap_mapping_range+0xab/0x280
    [] vmtruncate+0x56/0x70
    [] ext4_setattr+0x14b/0x460
    [] notify_change+0x194/0x380
    [] do_truncate+0x60/0x90
    [] ? security_inode_permission+0x1a/0x20
    [] ? tomoyo_path_truncate+0x11/0x20
    [] do_last+0x5d9/0x770
    [] do_filp_open+0x1ed/0x680
    [] ? page_fault+0x1f/0x30
    [] ? alloc_fd+0xec/0x140
    [] do_sys_open+0x61/0x120
    [] sys_open+0x1b/0x20
    [] system_call_fastpath+0x16/0x1b

    https://bugzilla.kernel.org/show_bug.cgi?id=22302

    Reported-by: Mathias Burén
    Signed-off-by: "Theodore Ts'o"
    Cc: jiayingz@google.com

    Theodore Ts'o
     
  • It's not needed to sync the filesystem, and it fixes a lock_dep complaint.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Dmitry Monakhov
     
  • Use an atomic_t and make sure we don't free the structure while we
    might still be submitting I/O for that page.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
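
The idea of an atomic count that keeps a structure alive while I/O may still be submitted can be sketched in userspace with C11 atomics. This is only an illustration of the general reference-counting pattern. The struct io_page layout, the helper names and the freed_flag field are hypothetical and do not mirror the ext4 code, which uses the kernel's atomic_t.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical structure; freed_flag exists only so a caller can observe the free. */
struct io_page {
    atomic_int count;
    int *freed_flag;
};

static struct io_page *io_page_new(int *freed_flag)
{
    struct io_page *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    atomic_init(&p->count, 1);  /* the creator holds the first reference */
    p->freed_flag = freed_flag;
    return p;
}

/* Take an extra reference before handing the page to an I/O submitter. */
static void io_page_get(struct io_page *p)
{
    atomic_fetch_add(&p->count, 1);
}

/* Drop a reference; whoever drops the last one frees the structure. */
static void io_page_put(struct io_page *p)
{
    /* atomic_fetch_sub returns the value held before the subtraction. */
    if (atomic_fetch_sub(&p->count, 1) == 1) {
        *p->freed_flag = 1;
        free(p);
    }
}
```

With one reference held by the creator and one by a submitter, the first io_page_put() leaves the structure intact and only the second one frees it.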
     
  • The following BUG can occur when an inode which is being freed still
    has dirty pages outstanding, and it gets deleted (in this case
    because it was the target of a rename). In ordered mode, we need to
    make sure the data pages are written just in case we crash before the
    rename (or unlink) is committed. If the inode is being freed then
    when we try to igrab the inode, we end up tripping the BUG_ON at
    fs/ext4/page-io.c:146.

    To solve this problem, we need to keep track of the number of io
    callbacks which are pending, and avoid destroying the inode until they
    have all been completed. That way we don't have to bump the inode
    count to keep the inode from being destroyed; an approach which
    doesn't work because the count could have already been dropped down to
    zero before the inode writeback has started (at which point we're not
    allowed to bump the count back up to 1, since it's already started
    getting freed).

    Thanks to Dave Chinner for suggesting this approach, which is also
    used by XFS.

    kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
    Call Trace:
    [] ext4_bio_write_page+0x172/0x307
    [] mpage_da_submit_io+0x2f9/0x37b
    [] mpage_da_map_and_submit+0x2cc/0x2e2
    [] mpage_add_bh_to_extent+0xc6/0xd5
    [] write_cache_pages_da+0x2a4/0x3ac
    [] ext4_da_writepages+0x2d6/0x44d
    [] do_writepages+0x1c/0x25
    [] __filemap_fdatawrite_range+0x4b/0x4d
    [] filemap_fdatawrite_range+0xe/0x10
    [] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
    [] ext4_evict_inode+0x57/0x24c
    [] evict+0x22/0x92
    [] iput+0x212/0x249
    [] dentry_iput+0xa1/0xb9
    [] d_kill+0x3d/0x5d
    [] dput+0x13a/0x147
    [] sys_renameat+0x1b5/0x258
    [] ? _atomic_dec_and_lock+0x2d/0x4c
    [] ? cp_new_stat+0xde/0xea
    [] ? sys_newlstat+0x2d/0x38
    [] sys_rename+0x16/0x18
    [] system_call_fastpath+0x16/0x1b

    Reported-by: Nick Bowler
    Signed-off-by: "Theodore Ts'o"
    Tested-by: Nick Bowler

    Theodore Ts'o
     

06 Nov, 2010

2 commits


05 Nov, 2010

2 commits

  • This patch is based on Dan's original patch. His original description is
    below:

    Smatch complained about a couple of "checking for NULL after
    dereferencing" bugs. I'm not super familiar with the code so I did the
    conservative thing and moved the dereferences after the checks.

    The dereferences in cifs_lock() and cifs_fsync() were added in
    ba00ba64cf0 "cifs: make various routines use the cifsFileInfo->tcon
    pointer". The dereference in find_writable_file() was added in
    6508d904e6f "cifs: have find_readable/writable_file filter by fsuid".
    The comments there say it's possible to trigger the NULL dereference
    under stress.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     
  • Noticed while reviewing (late) the rbtree conversion patchset (which has been merged
    already).

    Cc: Jeff Layton
    Signed-off-by: Suresh Jayaraman
    Signed-off-by: Steve French

    Suresh Jayaraman
     

04 Nov, 2010

1 commit


03 Nov, 2010

7 commits


02 Nov, 2010

4 commits

  • Linus noted, and complained to me, that while doing lots of "git diff"'s
    of kernel sources, these spinlocks were responsible for 27% of the
    spinlock cost on his two-processor system as reported by perf.

    Git was doing lots of parallel stats, and this was putting a lot of
    pressure on ext4_getattr(). A spinlock to protect a single
    memory-to-memory copy is pointless, so remove it.

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     
  • Steve French
     
  • Stanse found that pSMBFile in cifs_ioctl and file->f_path.dentry in
    cifs_user_write are dereferenced before they are tested for NULL.

    The alternative is not to dereference them before the tests. This patch
    is meant to point out the problem; you have to decide.

    While at it, we cache the inode in cifs_user_write in a local variable
    and use it throughout the function.

    Signed-off-by: Jiri Slaby
    Cc: Steve French
    Cc: linux-cifs@vger.kernel.org
    Cc: Jeff Layton
    Cc: Christoph Hellwig
    Signed-off-by: Steve French

    Jiri Slaby
     
  • Commit 7d945a3aa760 ("logfs get_sb, part 3") broke the logfs build when
    CONFIG_MTD is set due to a mangled logfs_get_sb_mtd() definition.

    Signed-off-by: Paul Mundt
    Signed-off-by: Linus Torvalds

    Paul Mundt
     

01 Nov, 2010

1 commit

  • …rnel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    genirq: Fix up irq_node() for irq_data changes.
    genirq: Add single IRQ reservation helper
    genirq: Warn if enable_irq is called before irq is set up

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    semaphore: Remove mutex emulation
    staging: Final semaphore cleanup
    jbd2: Convert jbd2_slab_create_sem to mutex
    hpfs: Convert sbi->hpfs_creation_de to mutex

    Fix up trivial change/delete conflicts with deleted 'dream' drivers
    (drivers/staging/dream/camera/{mt9d112.c,mt9p012_fox.c,mt9t013.c,s5k3e2fx.c})

    Linus Torvalds
     

31 Oct, 2010

9 commits

  • This one was only used for a nasty hack in nfsd, which has recently
    been removed.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • The caller allocated it, the caller should free it.

    The only issue so far is that we could change the flp pointer even on an
    error return if the fl_change callback failed. But we can simply move
    the flp assignment after the fl_change invocation, as the callers don't
    care about the flp return value if the setlease call failed.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
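
The reordering described above can be sketched as follows: the output pointer is written only after the change callback has succeeded, so a failed call leaves the caller's pointer untouched. The struct lease type, the callback signature and all function names here are hypothetical simplifications, not the real setlease interface.

```c
#include <errno.h>

/* Hypothetical, heavily simplified lease object. */
struct lease {
    int type;
};

/* Hypothetical callback type standing in for fl_change. */
typedef int (*change_fn)(struct lease *existing, int new_type);

static int change_ok(struct lease *existing, int new_type)
{
    existing->type = new_type;
    return 0;
}

static int change_fail(struct lease *existing, int new_type)
{
    (void)existing;
    (void)new_type;
    return -EAGAIN;
}

/* Modify an existing lease and report it through *flp on success only. */
static int set_lease(struct lease *existing, int new_type,
                     change_fn fl_change, struct lease **flp)
{
    int error = fl_change(existing, new_type);

    if (error)
        return error;   /* *flp still points at the caller's own lease */

    *flp = existing;    /* assigned only after the callback succeeded */
    return 0;
}
```

Because *flp is unchanged on failure, the caller can free the lease it allocated without having to work out whether the pointer was redirected.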
     
  • The NFSv4 server was initializing the dp->dl_flock pointer by the
    somewhat ridiculous method of a locks_copy_lock callback.

    Now that setlease uses the passed-in lock instead of doing a copy,
    dl_flock no longer gets set, resulting in the lock leaking on delegation
    release, and later possible hangs (among other problems).

    So, initialize dl_flock and get rid of the callback.

    Signed-off-by: J. Bruce Fields
    Acked-by: Arnd Bergmann
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     
  • We modified setlease to require the caller to allocate the new lease in
    the case of creating a new lease, but forgot to fix up the filesystem
    methods.

    Cc: Steven Whitehouse
    Cc: Steve French
    Cc: Trond Myklebust
    Signed-off-by: J. Bruce Fields
    Acked-by: Arnd Bergmann
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     
  • We're depending on setlease to free the passed-in lease on failure.

    Signed-off-by: J. Bruce Fields
    Acked-by: Arnd Bergmann
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     
  • Removing a lock shouldn't require any allocations; a failure due to
    ENOMEM leaves the caller with a choice between retrying or giving up and
    leaking an unused lease.

    Next we should split the other lease calls into add and delete cases.
    I wanted to start with just the bugfix.

    Signed-off-by: J. Bruce Fields
    Acked-by: Arnd Bergmann
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     
  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (22 commits)
    Ensure FMODE_NONOTIFY is not set by userspace
    make fanotify_read() restartable across signals
    fsnotify: remove alignment padding from fsnotify_mark on 64 bit builds
    fs/notify/fanotify/fanotify_user.c: fix warnings
    fanotify: Fix FAN_CLOSE comments
    fanotify: do not recalculate the mask if the ignored mask changed
    fanotify: ignore events on directories unless specifically requested
    fsnotify: rename FS_IN_ISDIR to FS_ISDIR
    fanotify: do not send events for irregular files
    fanotify: limit number of listeners per user
    fanotify: allow userspace to override max marks
    fanotify: limit the number of marks in a single fanotify group
    fanotify: allow userspace to override max queue depth
    fsnotify: implement a default maximum queue depth
    fanotify: ignore fanotify ignore marks if open writers
    fanotify: allow userspace to flush all marks
    fsnotify: call fsnotify_parent in perm events
    fsnotify: correctly handle return codes from listeners
    fanotify: use __aligned_u64 in fanotify userspace metadata
    fanotify: implement fanotify listener ordering
    ...

    Linus Torvalds
     
  • In fanotify_read() return -ERESTARTSYS instead of -EINTR to
    make read() restartable across signals (BSD semantics).

    Signed-off-by: Eric Paris

    Lino Sanfilippo
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (39 commits)
    Btrfs: deal with errors from updating the tree log
    Btrfs: allow subvol deletion by unprivileged user with -o user_subvol_rm_allowed
    Btrfs: make SNAP_DESTROY async
    Btrfs: add SNAP_CREATE_ASYNC ioctl
    Btrfs: add START_SYNC, WAIT_SYNC ioctls
    Btrfs: async transaction commit
    Btrfs: fix deadlock in btrfs_commit_transaction
    Btrfs: fix lockdep warning on clone ioctl
    Btrfs: fix clone ioctl where range is adjacent to extent
    Btrfs: fix delalloc checks in clone ioctl
    Btrfs: drop unused variable in block_alloc_rsv
    Btrfs: cleanup warnings from gcc 4.6 (nonbugs)
    Btrfs: Fix variables set but not read (bugs found by gcc 4.6)
    Btrfs: Use ERR_CAST helpers
    Btrfs: use memdup_user helpers
    Btrfs: fix raid code for removing missing drives
    Btrfs: Switch the extent buffer rbtree into a radix tree
    Btrfs: restructure try_release_extent_buffer()
    Btrfs: use the flusher threads for delalloc throttling
    Btrfs: tune the chunk allocation to 5% of the FS as metadata
    ...

    Fix up trivial conflicts in fs/btrfs/super.c and fs/fs-writeback.c, and
    remove use of INIT_RCU_HEAD in fs/btrfs/extent_io.c (that init macro was
    useless and removed in commit 5e8067adfdba: "rcu head remove init")

    Linus Torvalds
     

30 Oct, 2010

4 commits

  • The btrfs merge looks like hell, because it changes fs-writeback.c, and
    the crazy code has this repeated "estimate number of dirty pages"
    counting that involves three different helper functions. And it's done
    in two different places.

    Just unify that whole calculation as a "get_nr_dirty_pages()" helper
    function, and the merge result will look half-way decent.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
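
The refactoring described above, where a three-part estimate repeated in two places is folded into one helper, can be sketched like this. Only the name get_nr_dirty_pages() comes from the commit message. The struct wb_stats fields and the two callers are hypothetical placeholders for whatever counters and call sites the real code uses.

```c
/* Hypothetical snapshot of the three quantities that were summed in two places. */
struct wb_stats {
    unsigned long file_dirty;
    unsigned long unstable_nfs;
    unsigned long dirty_inodes;
};

/* Single place where the estimate is computed. */
static unsigned long get_nr_dirty_pages(const struct wb_stats *s)
{
    return s->file_dirty + s->unstable_nfs + s->dirty_inodes;
}

/* First hypothetical caller: decide whether background work is needed. */
static int background_work_needed(const struct wb_stats *s,
                                  unsigned long thresh)
{
    return get_nr_dirty_pages(s) > thresh;
}

/* Second hypothetical caller: how many pages a periodic flush should target. */
static unsigned long periodic_flush_target(const struct wb_stats *s)
{
    return get_nr_dirty_pages(s);
}
```

With both callers going through the helper, a later change to how the estimate is computed touches one function instead of two open-coded sums.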
     
  • * git://git.infradead.org/mtd-2.6: (82 commits)
    mtd: fix build error in m25p80.c
    mtd: Remove redundant mutex from mtd_blkdevs.c
    MTD: Fix wrong check register_blkdev return value
    Revert "mtd: cleanup Kconfig dependencies"
    mtd: cfi_cmdset_0002: make sector erase command variable
    mtd: cfi_cmdset_0002: add CFI detection for SST 38VF640x chips
    mtd: cfi_util: add support for switching SST 39VF640xB chips into QRY mode
    mtd: cfi_cmdset_0001: use defined value of P_ID_INTEL_PERFORMANCE instead of hardcoded one
    block2mtd: dubious assignment
    P4080/mtd: Fix the freescale lbc issue with 36bit mode
    P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
    mtd: phram: use KBUILD_MODNAME
    mtd: OneNAND: S5PC110: Fix double call suspend & resume function
    mtd: nand: fix MTD_MODE_RAW writes
    jffs2: use kmemdup
    mtd: sm_ftl: cosmetic, use bool when possible
    mtd: r852: remove useless pci powerup/down from suspend/resume routines
    mtd: blktrans: fix a race vs kthread_stop
    mtd: blktrans: kill BKL
    mtd: allow to unload the mtdtrans module if its block devices aren't open
    ...

    Fix up trivial whitespace-introduced conflict in drivers/mtd/mtdchar.c

    Linus Torvalds
     
  • The definition of PAGE_CACHE_MASK in <linux/pagemap.h> is needed to use
    MAX_RW_COUNT, and on x86-64 that gets done indirectly through the
    architecture header includes. But on MIPS and s390 that doesn't happen,
    and we need to make sure that fs/compat.c includes pagemap.h explicitly.

    Introduced in commit 435f49a518c7 ("readv/writev: do the same
    MAX_RW_COUNT truncation that read/write does").

    Reported-by: Sachin Sant (S390)
    Reported-by: wu zhangjin (MIPS)
    Signed-off-by: Linus Torvalds

    wu zhangjin
     
  • Conflicts:
    drivers/mtd/mtd_blkdevs.c

    Merge Grant's device-tree bits so that we can apply the subsequent fixes.

    Signed-off-by: David Woodhouse

    David Woodhouse