05 Apr, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Major changes for 3.14 include support for the newly added ZERO_RANGE
    and COLLAPSE_RANGE fallocate operations, and scalability improvements
    in the jbd2 layer and in xattr handling when the extended attributes
    spill over into an external block.

    Other than that, the usual clean ups and minor bug fixes"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
    ext4: fix premature freeing of partial clusters split across leaf blocks
    ext4: remove unneeded test of ret variable
    ext4: fix comment typo
    ext4: make ext4_block_zero_page_range static
    ext4: atomically set inode->i_flags in ext4_set_inode_flags()
    ext4: optimize Hurd tests when reading/writing inodes
    ext4: kill i_version support for Hurd-castrated file systems
    ext4: each filesystem creates and uses its own mb_cache
    fs/mbcache.c: doucple the locking of local from global data
    fs/mbcache.c: change block and index hash chain to hlist_bl_node
    ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
    ext4: refactor ext4_fallocate code
    ext4: Update inode i_size after the preallocation
    ext4: fix partial cluster handling for bigalloc file systems
    ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
    ext4: only call sync_filesystm() when remounting read-only
    fs: push sync_filesystem() down to the file system's remount_fs()
    jbd2: improve error messages for inconsistent journal heads
    jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
    jbd2: minimize region locked by j_list_lock in journal_get_create_access()
    ...

    Linus Torvalds
     

13 Mar, 2014

1 commit

  • Previously, the no-op "mount -o remount /dev/xxx" operation when the
    file system is already mounted read-write causes an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    not documented or guaranteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).
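
    The rule described above (sync only on a read-write to read-only
    transition) can be sketched in userspace C. This is an illustration
    only: SKETCH_RDONLY and remount_needs_sync() are hypothetical names,
    not kernel symbols, and a real remount_fs() would call
    sync_filesystem() where this helper returns true.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a "mounted read-only" flag bit. */
#define SKETCH_RDONLY 0x1UL

/*
 * Return true when a remount moves a file system from read-write to
 * read-only, the one transition where syncing first is clearly useful.
 */
static bool remount_needs_sync(unsigned long old_flags, unsigned long new_flags)
{
	bool was_rw = !(old_flags & SKETCH_RDONLY);
	bool now_ro = (new_flags & SKETCH_RDONLY) != 0;

	return was_rw && now_ro;
}
```

    A pseudo-filesystem or something like romfs would simply never ask
    the question, since it has nothing to write back.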

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
     

19 Feb, 2014

1 commit


13 Nov, 2013

1 commit


01 Aug, 2013

1 commit

  • debugfs_remove_recursive() is wrong,

    1. it wrongly assumes that !list_empty(d_subdirs) means that this
    dir should be removed.

    This is not that bad by itself, but:

    2. if d_subdirs does not become empty after __debugfs_remove(),
    it gives up and silently fails; it doesn't even try to remove
    other entries.

    However ->d_subdirs can be non-empty because it still has the
    already deleted !debugfs_positive() entries.

    3. simple_release_fs() is called even if __debugfs_remove() fails.

    Suppose we have

    dir1/
        dir2/
            file2
        file1

    and someone opens dir1/dir2/file2.

    Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes
    away.

    But debugfs_remove_recursive(dir1) silently fails and doesn't remove
    this directory. Because it tries to delete (the already deleted)
    dir1/dir2/file2 again and then fails due to "Avoid infinite loop"
    logic.

    Test-case:

    #!/bin/sh

    cd /sys/kernel/debug/tracing
    echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events
    sleep 1000 < events/probe/sigprocmask/id &
    echo -n >| kprobe_events

    [ -d events/probe ] && echo "ERR!! failed to rm probe"

    And after that it is not possible to create another probe entry.

    With this patch debugfs_remove_recursive() skips !debugfs_positive()
    files although this is not strictly needed. The most important change
    is that it does not try to make ->d_subdirs empty, it simply scans
    the whole list(s) recursively and removes as much as possible.
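
    The approach can be modelled with a toy tree in userspace C. This is
    a sketch under stated assumptions, not the kernel code: struct node,
    its fields and both functions are hypothetical stand-ins for dentries
    and the debugfs helpers. It only shows the idea of skipping entries
    that are already gone and carrying on with the rest of the list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; none of these names are the kernel's. */
struct node {
	bool positive;          /* false once the entry has been deleted */
	struct node *child;     /* first child, or NULL */
	struct node *sibling;   /* next entry in the parent's list, or NULL */
};

/*
 * Walk every entry below 'dir' and remove whatever is still positive.
 * Entries that are already gone are skipped rather than treated as a
 * reason to stop. Returns the number of entries removed.
 */
static int remove_children(struct node *dir)
{
	int removed = 0;

	for (struct node *n = dir->child; n; n = n->sibling) {
		if (!n->positive)
			continue;       /* deleted earlier, still on the list */
		removed += remove_children(n);
		n->positive = false;
		removed++;
	}
	return removed;
}

/* Remove everything below 'dir', then 'dir' itself if it still exists. */
static int remove_recursive(struct node *dir)
{
	int removed = remove_children(dir);

	if (dir->positive) {
		dir->positive = false;
		removed++;
	}
	return removed;
}
```

    With dir1/dir2 and dir1/dir2/file2 already deleted but still linked
    into the list, removing dir1 in this model still removes file1 and
    dir1 instead of giving up.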

    Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com

    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Steven Rostedt

    Oleg Nesterov
     

04 Jun, 2013

2 commits

  • In case userland writes an empty string to a bool debugfs file, buf[]
    will still be uninitialized when being passed to strtobool(), making
    the outcome of that function purely random.

    Fix this by always zero-terminating the buffer.
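
    A userspace sketch of the pattern, for illustration: the function
    name, the buffer size and the accepted characters are assumptions
    loosely modelled on a first-character boolean parser, not the actual
    debugfs code. The point is the unconditional terminator, which makes
    an empty write parse as "" rather than as whatever was on the stack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most sizeof(buf) - 1 bytes and always terminate the buffer.
 * Returns 0 and stores the value on success, -1 if the input is not
 * recognised as a boolean.
 */
static int parse_bool_write(const char *user, size_t count, bool *res)
{
	char buf[32];
	size_t n = count < sizeof(buf) - 1 ? count : sizeof(buf) - 1;

	memcpy(buf, user, n);
	buf[n] = '\0';          /* terminate even when n == 0 */

	switch (buf[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	default:
		return -1;      /* includes the empty string */
	}
}
```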

    Signed-off-by: Mathias Krause
    Signed-off-by: Greg Kroah-Hartman

    Mathias Krause
     
  • debugfs currently lacks the ability to create attributes
    that set/get atomic_t values.

    This patch adds support for this through a new
    debugfs_create_atomic_t() function.

    Signed-off-by: Seth Jennings
    Acked-by: Greg Kroah-Hartman
    Acked-by: Mel Gorman
    Acked-by: Rik van Riel
    Acked-by: Konrad Rzeszutek Wilk
    Signed-off-by: Greg Kroah-Hartman

    Seth Jennings
     

04 Mar, 2013

1 commit

  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.
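
    As an illustration of the naming scheme only, the prefixing can be
    sketched in userspace C. fs_module_alias() is a hypothetical helper,
    not a kernel function; in the kernel the resulting name would be
    handed to request_module() and matched against module aliases.

```c
#include <stdio.h>

/*
 * Build the module alias requested for a file system type by prefixing
 * it with "fs-". Returns 0 on success, -1 if the result does not fit.
 */
static int fs_module_alias(const char *fstype, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "fs-%s", fstype);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

    For the type "autofs" this yields "fs-autofs", which an alias can
    then map to whichever module actually provides that file system.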

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems, trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives. This allows simple, safe,
    well-understood workarounds to known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as its module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases. The most significant is autofs, which lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regard to the user's permissions. In general all a filesystem
    module does once loaded is call register_filesystem() and go to sleep.
    This means there is not much attack surface exposed by loading a
    filesystem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

18 Jan, 2013

2 commits


11 Jan, 2013

1 commit


16 Nov, 2012

1 commit


03 Oct, 2012

1 commit

  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values as far down into
    subsystems and filesystems as reasonable, leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    I let the compile errors from incompatible types (present when user
    namespaces are enabled) guide me to the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privilege checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe, allowing
    root in a user namespace to do those things that today we only forbid
    to non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/gid logging code.
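
    The wrapper-type strategy from the pull message can be illustrated
    with a toy model in userspace C. Everything here is an assumption
    made for the sketch: the struct layouts, struct toy_user_ns with its
    single linear mapping, and the toy_make_kuid()/toy_from_kuid()
    helpers are hypothetical and only mimic the shape of make_kuid and
    from_kuid.

```c
#include <stdint.h>

/* Toy wrapper types: distinct structs, so mixing them up fails to compile. */
typedef struct { uint32_t val; } kuid_t;
typedef struct { uint32_t val; } kgid_t;

/* Hypothetical namespace with a single linear id mapping. */
struct toy_user_ns {
	uint32_t first_inside;  /* first id as seen inside the namespace */
	uint32_t first_kernel;  /* corresponding kernel-internal id */
	uint32_t count;         /* number of ids covered by the mapping */
};

#define TOY_INVALID_ID UINT32_MAX

/* Convert an id coming in from the edge into the kernel-internal form. */
static kuid_t toy_make_kuid(const struct toy_user_ns *ns, uint32_t uid)
{
	kuid_t k = { TOY_INVALID_ID };

	if (uid >= ns->first_inside && uid - ns->first_inside < ns->count)
		k.val = ns->first_kernel + (uid - ns->first_inside);
	return k;
}

/* Convert a kernel-internal id back to what the namespace should see. */
static uint32_t toy_from_kuid(const struct toy_user_ns *ns, kuid_t k)
{
	if (k.val >= ns->first_kernel && k.val - ns->first_kernel < ns->count)
		return ns->first_inside + (k.val - ns->first_kernel);
	return TOY_INVALID_ID;
}
```

    Because kuid_t and kgid_t are separate struct types, passing a
    kgid_t or a bare integer to toy_from_kuid() is rejected by the
    compiler, which is the kind of error the message describes using as
    a guide.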

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
     

02 Oct, 2012

1 commit

  • Pull driver core merge from Greg Kroah-Hartman:
    "Here is the big driver core update for 3.7-rc1.

    A number of firmware_class.c updates (as you saw a month or so ago),
    and some hyper-v updates and some printk fixes as well. All patches
    that are outside of the drivers/base area have been acked by the
    respective maintainers, and have all been in the linux-next tree for a
    while.

    Signed-off-by: Greg Kroah-Hartman"

    * tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits)
    memory: tegra{20,30}-mc: Fix reading incorrect register in mc_readl()
    device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit
    memory: emif: Add ifdef CONFIG_DEBUG_FS guard for emif_debugfs_[init|exit]
    Documentation: Fixes some translation error in Documentation/zh_CN/gpio.txt
    Documentation: Remove 3 byte redundant code at the head of the Documentation/zh_CN/arm/booting
    Documentation: Chinese translation of Documentation/video4linux/omap3isp.txt
    device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit
    dev: Add dev_vprintk_emit and dev_printk_emit
    netdev_printk/netif_printk: Remove a superfluous logging colon
    netdev_printk/dynamic_netdev_dbg: Directly call printk_emit
    dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack
    driver-core: Shut up dev_dbg_reatelimited() without DEBUG
    tools/hv: Parse /etc/os-release
    tools/hv: Check for read/write errors
    tools/hv: Fix exit() error code
    tools/hv: Fix file handle leak
    Tools: hv: Implement the KVP verb - KVP_OP_GET_IP_INFO
    Tools: hv: Rename the function kvp_get_ip_address()
    Tools: hv: Implement the KVP verb - KVP_OP_SET_IP_INFO
    Tools: hv: Add an example script to configure an interface
    ...

    Linus Torvalds
     

22 Sep, 2012

2 commits

  • The format_array_alloc() function is fundamentally racy, in that it
    prints the array twice: once to figure out how much space to allocate
    for the buffer, and the second time to actually print out the data.

    If any of the array contents changes in between, the allocation size may
    be wrong, and the end result may be truncated in odd ways.

    Just don't do it. Allocate a maximum-sized array up-front, and just
    format the array contents once. The only user of the u32_array
    interfaces is the Xen spinlock statistics code, and it has 31 entries in
    the arrays, so the maximum size really isn't that big, and the end
    result is much simpler code without the bug.
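
    A minimal userspace sketch of the single-pass idea, assuming a
    space-separated format with a trailing newline; format_u32_array()
    is a hypothetical name and this is not the kernel implementation.
    The caller supplies one buffer sized for the worst case, and the
    array is read exactly once.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format 'n' values into 'buf' in a single pass. Returns the number of
 * bytes written, excluding the terminator; output is cut short if the
 * buffer fills up.
 */
static size_t format_u32_array(char *buf, size_t bufsize,
			       const uint32_t *array, size_t n)
{
	size_t used = 0;

	if (bufsize == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < n && used < bufsize - 1; i++) {
		const char *term = (i + 1 == n) ? "\n" : " ";
		int len = snprintf(buf + used, bufsize - used, "%u%s",
				   (unsigned int)array[i], term);

		if (len < 0)
			break;
		if ((size_t)len >= bufsize - used) {
			used = bufsize - 1;     /* truncated by snprintf */
			break;
		}
		used += (size_t)len;
	}
	return used;
}
```

    A 32-bit value needs at most 10 digits plus a separator, so for 31
    entries a buffer of 31 * 11 + 1 bytes is enough in this model.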

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • u32_array_open() is racy when multiple threads read from a file with a
    seek position of zero, i.e. when two or more simultaneous reads are
    occurring after the non-seekable files are created. It is possible that
    file->private_data is double-freed because the threads race between

    kfree(file->private_data);

    and

    file->private_data = NULL;

    The fix is to only do format_array_alloc() when the file is opened and
    free it when it is closed.

    Note that because the file has always been non-seekable, you can't open
    it and read it multiple times anyway, so the data has always been
    generated just once. The difference is that now it is generated at open
    time rather than at the time of the first read, and that avoids the
    race.

    Reported-by: Dave Jones
    Acked-by: Konrad Rzeszutek Wilk
    Tested-by: Raghavendra
    Signed-off-by: David Rientjes
    Signed-off-by: Linus Torvalds

    David Rientjes
     

07 Sep, 2012

1 commit


28 Aug, 2012

1 commit


17 Aug, 2012

1 commit


27 Jul, 2012

1 commit

  • Pull driver core changes from Greg Kroah-Hartman:
    "Here's the big driver core pull request for 3.6-rc1.

    Unlike 3.5, this kernel should be a lot tamer, with the printk changes
    now settled down. All we have here is some extcon driver updates, w1
    driver updates, a few printk cleanups that weren't needed for 3.5, but
    are good to have now, and some other minor fixes/changes in the driver
    core.

    All of these have been in the linux-next releases for a while now.

    Signed-off-by: Greg Kroah-Hartman"

    * tag 'driver-core-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits)
    printk: Export struct log size and member offsets through vmcoreinfo
    Drivers: hv: Change the hex constant to a decimal constant
    driver core: don't trigger uevent after failure
    extcon: MAX77693: Add extcon-max77693 driver to support Maxim MAX77693 MUIC device
    sysfs: fail dentry revalidation after namespace change fix
    sysfs: fail dentry revalidation after namespace change
    extcon: spelling of detach in function doc
    extcon: arizona: Stop microphone detection if we give up on it
    extcon: arizona: Update cable reporting calls and split headset
    PM / Runtime: Do not increment device usage counts before probing
    kmsg - do not flush partial lines when the console is busy
    kmsg - export "continuation record" flag to /dev/kmsg
    kmsg - avoid warning for CONFIG_PRINTK=n compilations
    kmsg - properly print over-long continuation lines
    driver-core: Use kobj_to_dev instead of re-implementing it
    driver-core: Move kobj_to_dev from genhd.h to device.h
    driver core: Move deferred devices to the end of dpm_list before probing
    driver core: move uevent call to driver_register
    driver core: fix shutdown races with probe/remove(v3)
    Extcon: Arizona: Add driver for Wolfson Arizona class devices
    ...

    Linus Torvalds
     

14 Jul, 2012

3 commits


14 Jun, 2012

1 commit


17 Apr, 2012

1 commit


06 Apr, 2012

1 commit

  • Many users of debugfs copy the implementation of default_open() when
    they want to support a custom read/write function op. This leads to a
    proliferation of the default_open() implementation across the entire
    tree.

    Now that the common implementation has been consolidated into libfs we
    can replace all the users of this function with simple_open().

    This replacement was done with the following semantic patch:

    @ open @
    identifier open_f != simple_open;
    identifier i, f;
    @@
    -int open_f(struct inode *i, struct file *f)
    -{
    (
    -if (i->i_private)
    -f->private_data = i->i_private;
    |
    -f->private_data = i->i_private;
    )
    -return 0;
    -}

    @ has_open depends on open @
    identifier fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ...
    -.open = open_f,
    +.open = simple_open,
    ...
    };
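
    For reference, the helper that the semantic patch substitutes can be
    modelled in userspace C. The struct definitions below are minimal
    stand-ins with only the fields the patch touches, and the function is
    named simple_open_sketch() to make clear it is an illustration of
    the removed open_f() bodies rather than the libfs source.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel structures; only the fields used here. */
struct inode { void *i_private; };
struct file  { void *private_data; };

/*
 * Hand the inode's private pointer to the file so that custom
 * read/write handlers can find their data. Always succeeds.
 */
static int simple_open_sketch(struct inode *inode, struct file *file)
{
	if (inode->i_private)
		file->private_data = inode->i_private;
	return 0;
}
```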

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Stephen Boyd
    Cc: Greg Kroah-Hartman
    Cc: Al Viro
    Cc: Julia Lawall
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     

22 Mar, 2012

1 commit

  • Pull vfs pile 1 from Al Viro:
    "This is _not_ all; in particular, Miklos' and Jan's stuff is not there
    yet."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
    ext4: initialization of ext4_li_mtx needs to be done earlier
    debugfs-related mode_t whack-a-mole
    hfsplus: add an ioctl to bless files
    hfsplus: change finder_info to u32
    hfsplus: initialise userflags
    qnx4: new helper - try_extent()
    qnx4: get rid of qnx4_bread/qnx4_getblk
    take removal of PF_FORKNOEXEC to flush_old_exec()
    trim includes in inode.c
    um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
    um: embed ->stub_pages[] into mmu_context
    gadgetfs: list_for_each_safe() misuse
    ocfs2: fix leaks on failure exits in module_init
    ecryptfs: make register_filesystem() the last potential failure exit
    ntfs: forgets to unregister sysctls on register_filesystem() failure
    logfs: missing cleanup on register_filesystem() failure
    jfs: mising cleanup on register_filesystem() failure
    make configfs_pin_fs() return root dentry on success
    configfs: configfs_create_dir() has parent dentry in dentry->d_parent
    configfs: sanitize configfs_create()
    ...

    Linus Torvalds
     

21 Mar, 2012

1 commit


03 Feb, 2012

1 commit


27 Jan, 2012

1 commit

  • Cautious admins may want to restrict access to debugfs. Currently a
    manual chown/chmod e.g. in an init script is needed to achieve that.
    Distributions that want to make the mount options configurable need
    to add extra config files. By allowing the root inode's uid, gid and
    mode to be set via mount options, no such hacks are needed anymore.
    Instead, configuration becomes straightforward via fstab.

    Signed-off-by: Ludwig Nussel
    Signed-off-by: Greg Kroah-Hartman

    Ludwig Nussel
     

24 Jan, 2012

1 commit

  • Fix new kernel-doc warnings:

    Warning(fs/debugfs/file.c:556): No description found for parameter 'nregs'
    Warning(fs/debugfs/file.c:556): Excess function parameter 'mregs' description in 'debugfs_print_regs32'

    Signed-off-by: Randy Dunlap
    Cc: Greg Kroah-Hartman
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

09 Jan, 2012

1 commit

  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
    reiserfs: Properly display mount options in /proc/mounts
    vfs: prevent remount read-only if pending removes
    vfs: count unlinked inodes
    vfs: protect remounting superblock read-only
    vfs: keep list of mounts for each superblock
    vfs: switch ->show_options() to struct dentry *
    vfs: switch ->show_path() to struct dentry *
    vfs: switch ->show_devname() to struct dentry *
    vfs: switch ->show_stats to struct dentry *
    switch security_path_chmod() to struct path *
    vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
    vfs: trim includes a bit
    switch mnt_namespace ->root to struct mount
    vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
    vfs: opencode mntget() mnt_set_mountpoint()
    vfs: spread struct mount - remaining argument of next_mnt()
    vfs: move fsnotify junk to struct mount
    vfs: move mnt_devname
    vfs: move mnt_list to struct mount
    vfs: switch pnode.h macros to struct mount *
    ...

    Linus Torvalds
     

04 Jan, 2012

3 commits


27 Nov, 2011

1 commit

  • The cast here causes a Sparse warning:
    fs/debugfs/file.c:561:42: warning: cast removes address space of expression
    fs/debugfs/file.c:561:42: warning: incorrect type in argument 1 (different address spaces)
    fs/debugfs/file.c:561:42: expected void const volatile [noderef] *addr
    fs/debugfs/file.c:561:42: got void *

    It's redundant to cast it to a (void *) anyway when it is already a
    (void __iomem *).

    Signed-off-by: Dan Carpenter
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     

23 Nov, 2011

1 commit


19 Nov, 2011

2 commits


23 Aug, 2011

1 commit