04 Sep, 2010

1 commit


11 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
    no need for list_for_each_entry_safe()/resetting with superblock list
    Fix sget() race with failing mount
    vfs: don't hold s_umount over close_bdev_exclusive() call
    sysv: do not mark superblock dirty on remount
    sysv: do not mark superblock dirty on mount
    btrfs: remove junk sb_dirt change
    BFS: clean up the superblock usage
    AFFS: wait for sb synchronization when needed
    AFFS: clean up dirty flag usage
    cifs: truncate fallout
    mbcache: fix shrinker function return value
    mbcache: Remove unused features
    add f_flags to struct statfs(64)
    pass a struct path to vfs_statfs
    update VFS documentation for method changes.
    All filesystems that need invalidate_inode_buffers() are doing that explicitly
    convert remaining ->clear_inode() to ->evict_inode()
    Make ->drop_inode() just return whether inode needs to be dropped
    fs/inode.c:clear_inode() is gone
    fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
    ...

    Fix up trivial conflicts in fs/nilfs2/super.c

    Linus Torvalds
     

10 Aug, 2010

2 commits


06 Aug, 2010

1 commit


27 Jul, 2010

3 commits

  • Supporting symlinks from untagged to tagged directories is reasonable,
    and needed to support CONFIG_SYSFS_DEPRECATED. So don't fail a priori,
    allowing that case to work.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • This happens for network devices when SYSFS_DEPRECATED is enabled.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Recently my tagged sysfs support revealed a flaw in the device core
    that a few rare drivers are running into such that we don't always put
    network devices in a class subdirectory named net/.

    Since we are not creating the class directory the network devices wind
    up in a non-tagged directory, but the symlinks to the network devices
    from /sys/class/net are in a tagged directory. All of which works
    until we go to remove or rename the symlink. When we remove or rename
    a symlink we look in the namespace of the target of the symlink.
    Since the target of the symlink is in a non-tagged sysfs directory we
    don't have a namespace to look in, and we fail to remove the symlink.

    Detect this problem up front and simply don't create symlinks we won't
    be able to remove later. This prevents symlink leakage and fails in
    a much clearer and more understandable way.

    Signed-off-by: Eric W. Biederman
    Cc: Andrew Morton
    Cc: Rafael J. Wysocki
    Cc: Maciej W. Rozycki
    Cc: Kay Sievers
    Cc: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

05 Jun, 2010

1 commit

  • sysfs and configfs setattr functions have error cases after the generic inode's
    attributes have been changed. Fix consistency by changing the generic inode
    attributes only when it is guaranteed to succeed.
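
    The ordering described above can be sketched in userspace C. This is
    only an illustration: toy_inode, toy_update_backing and toy_setattr
    are hypothetical names, not kernel interfaces. Every step that can
    fail runs first, and the inode is written last.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the generic (vfs) inode. */
struct toy_inode {
    unsigned int mode;
};

/* Hypothetical fallible step, standing in for allocating or updating
 * the filesystem's own backing store for the attributes. */
static bool toy_update_backing(unsigned int *backing_mode,
                               unsigned int new_mode, bool simulate_failure)
{
    if (simulate_failure)
        return false;
    *backing_mode = new_mode;
    return true;
}

/* Validate, then do everything that can fail, and only then touch the
 * inode. On any error the inode is left exactly as it was. */
int toy_setattr(struct toy_inode *inode, unsigned int *backing_mode,
                unsigned int new_mode, bool simulate_failure)
{
    if (new_mode > 07777)
        return -1;                      /* validation failed */
    if (!toy_update_backing(backing_mode, new_mode, simulate_failure))
        return -1;                      /* backing store failed */
    inode->mode = new_mode;             /* this step cannot fail */
    return 0;
}

void toy_setattr_selfcheck(void)
{
    struct toy_inode inode = { 0644 };
    unsigned int backing = 0644;

    /* A failing update must leave the inode untouched. */
    assert(toy_setattr(&inode, &backing, 0600, true) != 0);
    assert(inode.mode == 0644);

    /* A successful update changes both copies. */
    assert(toy_setattr(&inode, &backing, 0600, false) == 0);
    assert(inode.mode == 0600 && backing == 0600);
}
```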

    Signed-off-by: Nick Piggin
    Acked-by: Joel Becker
    Signed-off-by: Greg Kroah-Hartman

    Nick Piggin
     

28 May, 2010

1 commit


22 May, 2010

8 commits

  • This allows bin_attr->read,write,mmap callbacks to check file specific data
    (such as inode owner) as part of any privilege validation.

    Signed-off-by: Chris Wright
    Signed-off-by: Greg Kroah-Hartman

    Chris Wright
     
  • In Al's latest vfs tree the code is reworked and S_BIAS has been removed.

    It turns out that checking to see if a super block is in the
    middle of an unmount in sysfs_exit_ns is unnecessary because we
    remove the super_block from the s_supers/s_instances list before
    struct sysfs_super_info pointed to by sb->s_fs_info is freed.

    For now just delete the unnecessary check to see if a superblock is in the
    middle of an unmount; it isn't necessary with or without Al's changes
    and it just causes a needless conflict.

    Reported-by: Stephen Rothwell
    Cc: Al Viro
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Add some in-line comments to explain the new infrastructure, which
    was introduced to support sysfs directory tagging with namespaces.
    I think an overall description someplace might be good too, but it
    didn't really seem to fit into Documentation/filesystems/sysfs.txt,
    which appears more geared toward users, rather than maintainers, of
    sysfs.

    (Tejun, please let me know if I can make anything clearer or failed
    altogether to comment something that should be commented.)

    Signed-off-by: Serge E. Hallyn
    Cc: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Serge E. Hallyn
     
  • When removing a symlink sysfs_remove_link does not provide
    enough information to figure out which tagged directory the symlink
    falls in. So I need sysfs_delete_link which is passed the target
    of the symlink to delete.

    sysfs_rename_link is updated to call sysfs_delete_link instead
    of sysfs_remove_link as we have all of the information necessary
    and the callers are interesting.

    Both of these functions now have enough information to find a symlink
    in a tagged directory. The only restriction is that they must be called
    before the target kobject is renamed or deleted. If they are called
    later I lose track of which tag the target kobject was marked with
    and can no longer find the old symlink to remove it.

    This patch was split from an earlier patch.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Daniel Lezcano
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • I had hoped to avoid this but the bonding driver adds a file
    to /sys/class/net/ and the easiest way to handle that file is
    to make it untagged and to register it only once.

    So relax the rules on tagged directories, and make bonding work.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
    potentially a few other directories of the form /sys/ ... /net/*.

    What this patch does is to add an additional tag field to the
    sysfs dirent structure. For directories that should show different
    contents depending on the context such as /sys/class/net/, and
    /sys/devices/virtual/net/ this tag field is used to specify the
    context in which those directories should be visible. Effectively
    this is the same as creating multiple distinct directories with
    the same name but internally to sysfs the result is nicer.

    I am calling the concept of a single directory that looks like multiple
    directories all at the same path in the filesystem tagged directories.

    For the networking namespace the set of directories whose contents I need
    to filter with tags can depend on the presence or absence of hotplug
    hardware or which modules are currently loaded. Which means I need
    a simple race free way to setup those directories as tagged.

    To achieve a race free design all tagged directories are created
    and managed by sysfs itself.

    Users of this interface:
    - define a type in the sysfs_tag_type enumeration.
    - call sysfs_register_ns_types with the type and its operations
    - sysfs_exit_ns when an individual tag is no longer valid

    - Implement mount_ns() which returns the ns of the calling process
    so we can attach it to a sysfs superblock.
    - Implement ktype.namespace() which returns the ns of a sysfs kobject.

    Everything else is left up to sysfs and the driver layer.

    For the network namespace mount_ns and namespace() are essentially
    one line functions, and look to remain that.

    Tags are currently represented as const void * pointers as that is
    generic, provides enough information for equality comparisons,
    and is trivial to create for current users, as it is just the
    existing namespace pointer.

    The work needed in sysfs is more extensive. At each directory
    or symlink creation I need to check if the directory it is being
    created in is a tagged directory and if so generate the appropriate
    tag to place on the sysfs_dirent. Likewise at each symlink or
    directory removal I need to check if the sysfs directory it is
    being removed from is a tagged directory and if so figure out
    which tag goes along with the name I am deleting.

    Currently only directories which hold kobjects, and
    symlinks are supported. There is not enough information
    in the current file attribute interfaces to give us anything
    to discriminate on which makes it useless, and there are
    no potential users which makes it an uninteresting problem
    to solve.
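
    As a rough userspace illustration of the tag idea (not the sysfs
    implementation), the sketch below gives each entry a name plus an
    opaque const void * tag and compares tags by pointer equality only.
    The names toy_dirent and toy_find are hypothetical, and the toy treats
    an untagged entry as visible only to an untagged lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a sysfs dirent: a name plus an
 * opaque tag. A NULL tag means "untagged". */
struct toy_dirent {
    const char *name;
    const void *tag;          /* e.g. the owning namespace pointer */
    struct toy_dirent *next;  /* sibling list inside one directory */
};

/* Find the child called `name` that is visible under `tag`.
 * Tags are only ever compared for pointer equality. */
struct toy_dirent *toy_find(struct toy_dirent *children,
                            const void *tag, const char *name)
{
    for (struct toy_dirent *sd = children; sd != NULL; sd = sd->next) {
        if (sd->tag == tag && strcmp(sd->name, name) == 0)
            return sd;
    }
    return NULL;
}

/* Two namespaces each own an "eth0" in the same directory. */
void toy_tag_selfcheck(void)
{
    int ns_a, ns_b, ns_c;     /* only their addresses are used, as tags */
    struct toy_dirent eth0_b = { "eth0", &ns_b, NULL };
    struct toy_dirent eth0_a = { "eth0", &ns_a, &eth0_b };

    assert(toy_find(&eth0_a, &ns_a, "eth0") == &eth0_a);
    assert(toy_find(&eth0_a, &ns_b, "eth0") == &eth0_b);
    assert(toy_find(&eth0_a, &ns_c, "eth0") == NULL);
}
```

    The same name resolves to a different entry depending on the tag
    supplied, which is the "one directory that looks like several" effect
    the message describes.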

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Add all of the necessary boilerplate to support
    multiple superblocks in sysfs.

    Signed-off-by: Eric W. Biederman
    Acked-by: Serge Hallyn
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

16 May, 2010

1 commit

  • Links for each port are created in sysfs using the device
    name, but this could be changed after being added to the
    bridge.

    As well as being unable to remove interfaces after this
    occurs (because userspace tools don't recognise the new
    name, and the kernel won't recognise the old name), adding
    another interface with the old name to the bridge will
    cause an error trying to create the sysfs link.

    This fixes the problem by listening for NETDEV_CHANGENAME
    notifications and renaming the link.

    https://bugzilla.kernel.org/show_bug.cgi?id=12743

    Signed-off-by: Simon Arlott
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Simon Arlott
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

08 Mar, 2010

11 commits

  • Now that there are no more users we can remove
    the sysfs_sb variable.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Currently sysfs_get_inode magically returns an inode on
    sysfs_sb. Make the super_block parameter explicit and
    the code becomes clearer.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Because of rename ordering problems we occasionally give false
    warnings about invalid sysfs operations. So using sysfs_rename
    create a sysfs_rename_link function that doesn't need strange
    workarounds.

    Cc: Benjamin Thery
    Cc: Daniel Lezcano
    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Placing the 16bit s_mode between a pointer and a long doesn't pack well
    especially on 64bit where we waste 48 bits. So move s_mode and
    declare it as an unsigned short. This is the sysfs backing store
    after all; we don't need extra large fields just in case someday
    we want userspace to be able to use a larger value.
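
    The padding effect can be checked with a small standalone C sketch.
    The two structs below are hypothetical and are not the real
    sysfs_dirent layout; they only place a 16-bit field either between a
    pointer and a long or next to another narrow field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout with the 16-bit field wedged between two
 * pointer-sized fields, so the compiler must pad after it. */
struct layout_before {
    void *parent;
    unsigned short mode;
    long ino;
    unsigned int flags;
};

/* Same fields, with the narrow ones grouped at the end so they can
 * share one aligned slot. */
struct layout_after {
    void *parent;
    long ino;
    unsigned int flags;
    unsigned short mode;
};

size_t size_before(void) { return sizeof(struct layout_before); }
size_t size_after(void)  { return sizeof(struct layout_after); }

/* Padding that directly follows `mode` in the first layout. */
size_t padding_after_mode(void)
{
    return offsetof(struct layout_before, ino)
         - offsetof(struct layout_before, mode)
         - sizeof(unsigned short);
}

void layout_selfcheck(void)
{
    /* Not guaranteed by the C standard, but expected on common ABIs:
     * regrouping never makes the struct larger. */
    assert(size_after() <= size_before());
}
```

    On a typical LP64 ABI I would expect padding_after_mode() to report 6
    bytes, the 48 wasted bits the message mentions, but the exact numbers
    depend on the compiler and target.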

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The vfs depends upon filesystem methods to update the
    vfs inode. Sysfs adds to the normal number of places
    where the vfs inode is updated by also updating the
    vfs inode in sysfs_refresh_inode.

    Typically the inode mutex is used to serialize updates
    to the vfs inode, but grabbing the inode mutex in
    sysfs_permission and sysfs_getattr causes deadlocks,
    because sometimes the vfs calls those operations with
    the inode mutex held. Therefore sysfs can not use the
    inode mutex to serialize updates to the vfs inode.

    The sysfs_mutex is acquired in all of the routines
    where sysfs updates the vfs inode, and with a small
    change we can consistently protect sysfs vfs inode
    updates with the sysfs_mutex. To protect the sysfs
    vfs inode updates with the sysfs_mutex simply requires
    extending the scope of sysfs_mutex in sysfs_setattr
    over inode_setattr, and over inode_change_ok (so we
    have an unchanging inode when we perform the check).

    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Acknowledge that the logical sysfs rwsem has one instance per
    sysfs attribute with different locking dependencies for different
    attributes.

    There is a sysfs idiom where writing to one sysfs file causes the
    addition or removal of other sysfs files. Lumping all of the
    sysfs attributes together in one lock class causes lockdep to
    generate lots of false positives.

    This introduces the requirement that non-static sysfs attributes
    need to be initialized with sysfs_attr_init or sysfs_bin_attr_init.
    Strictly speaking this requirement only exists when lockdep is
    enabled, and when lockdep is enabled we get a big fat warning
    if this requirement is not met.

    Signed-off-by: Eric W. Biederman
    Acked-by: WANG Cong
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • If we exclude directories and symlinks from the set of sysfs
    dirents where we need active references we are left with
    sysfs attributes (binary or not).

    - Tweak sysfs_deactivate to only do something on attributes
    - Move lockdep initialization into sysfs_file_add_mode to
    limit it to just attributes.

    Signed-off-by: Eric W. Biederman
    Acked-by: WANG Cong
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • It turns out that holding an active reference on a directory is
    pointless. The purpose of the active references is to allow us to
    block when removing sysfs entries that have custom methods so we don't
    remove modules while running modular code and to keep those custom
    methods from accessing data structures after the files have been
    removed. Further sysfs_remove_dir removes all elements in the
    directory before removing the directory itself, so there is no chance
    we will remove a directory with active children.

    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Constify struct sysfs_ops.

    This is part of the ops structure constification
    effort started by Arjan van de Ven et al.

    Benefits of this constification:

    * prevents modification of data that is shared
    (referenced) by many other structure instances
    at runtime

    * detects/prevents accidental (but not intentional)
    modification attempts on archs that enforce
    read-only kernel data at runtime

    * potentially better optimized code as the compiler
    can assume that the const data cannot be changed

    * the compiler/linker move const data into .rodata
    and therefore exclude them from false sharing

    Signed-off-by: Emese Revfy
    Acked-by: David Teigland
    Acked-by: Matt Domsch
    Acked-by: Maciej Sosnowski
    Acked-by: Hans J. Koch
    Acked-by: Pekka Enberg
    Acked-by: Jens Axboe
    Acked-by: Stephen Hemminger
    Signed-off-by: Greg Kroah-Hartman

    Emese Revfy
     
  • When sysfs_readdir stops short we now cache the next
    sysfs_dirent to return to user space in filp->private_data.
    There is no impact on the rest of sysfs by doing this and
    in the common case it allows us to pick up exactly where
    we left off with no seeking.

    Additionally I drop and regrab the sysfs_mutex around
    filldir to avoid a page fault arbitrarily increasing the
    hold time on the sysfs_mutex.

    v2: Returned to using INT_MAX as the EOF condition.
    seekdir is ambiguous unless all directory entries have
    a unique f_pos value.

    Fixes http://bugzilla.kernel.org/show_bug.cgi?id=14949

    Signed-off-by: Eric W. Biederman
    Cc: Linus Torvalds
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Adding/Removing a whole array of attributes is very common. Add a standard
    utility function to do this with a simple function call, instead of
    requiring drivers to open code this.
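
    The shape of such a helper can be sketched in userspace C. Everything
    here (toy_dir, toy_create_file, toy_create_files, toy_remove_files) is
    a hypothetical model, not the kernel API: it walks a NULL-terminated
    array and, if one creation fails, removes the entries it already
    created so the caller never sees a half-populated directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FILES 8

/* Hypothetical directory: just a small set of file names. */
struct toy_dir {
    const char *names[TOY_MAX_FILES];
    size_t count;
};

/* Create one file; fails if the directory is full or the name exists. */
int toy_create_file(struct toy_dir *dir, const char *name)
{
    if (dir->count == TOY_MAX_FILES)
        return -1;
    for (size_t i = 0; i < dir->count; i++)
        if (strcmp(dir->names[i], name) == 0)
            return -1;
    dir->names[dir->count++] = name;
    return 0;
}

/* Remove one file by name; removing a missing name is a no-op. */
void toy_remove_file(struct toy_dir *dir, const char *name)
{
    for (size_t i = 0; i < dir->count; i++) {
        if (strcmp(dir->names[i], name) == 0) {
            dir->names[i] = dir->names[--dir->count];
            return;
        }
    }
}

/* Create every name in a NULL-terminated array, or none of them. */
int toy_create_files(struct toy_dir *dir, const char *const *names)
{
    for (size_t i = 0; names[i] != NULL; i++) {
        if (toy_create_file(dir, names[i]) != 0) {
            while (i-- > 0)             /* undo what was created so far */
                toy_remove_file(dir, names[i]);
            return -1;
        }
    }
    return 0;
}

/* Remove every name in a NULL-terminated array. */
void toy_remove_files(struct toy_dir *dir, const char *const *names)
{
    for (size_t i = 0; names[i] != NULL; i++)
        toy_remove_file(dir, names[i]);
}

void toy_files_selfcheck(void)
{
    struct toy_dir dir = { .count = 0 };
    const char *const good[] = { "speed", "duplex", NULL };
    const char *const bad[]  = { "mtu", "speed", NULL }; /* "speed" clashes */

    assert(toy_create_files(&dir, good) == 0 && dir.count == 2);
    assert(toy_create_files(&dir, bad) != 0 && dir.count == 2);
    toy_remove_files(&dir, good);
    assert(dir.count == 0);
}
```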

    Signed-off-by: Andi Kleen
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     

17 Feb, 2010

1 commit

  • There is currently a bug in sysfs_sd_setattr inherited from
    sysfs_setattr in 2.6.32 where the first time we set the attributes
    on a sysfs file we allocate backing store but do not set the
    backing store attributes, resulting in overly restrictive
    permissions on sysfs files.

    The fix is to simply modify the code so that it always executes
    when we update the sysfs attributes, as we did in 2.6.31 and earlier.

    Signed-off-by: Eric W. Biederman
    Tested-by: Jean Delvare
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

05 Jan, 2010

1 commit

  • Holding locks over device_del -> kobject_del -> sysfs_deactivate can
    cause deadlocks if those same locks are grabbed in sysfs show or store
    methods.

    I model the s_active count + completion as a sleeping read/write lock.
    I describe to lockdep sysfs_get_active as a read_trylock,
    sysfs_put_active as a read_unlock, and sysfs_deactivate as a
    write_lock and write_unlock pair. This seems to capture the essence
    for purposes of finding deadlocks, and in my testing finds real
    issues and ignores non-issues.
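
    The read/write-lock analogy can be made concrete with a userspace
    sketch using C11 atomics. This is a hedged model, not the kernel code:
    toy_node, toy_get_active, toy_put_active and toy_deactivate are
    hypothetical names, the bias value is an assumption, and where a real
    remover would sleep on a completion the toy only reports whether it
    would have to wait.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed bias: once added, the counter is negative for any realistic
 * number of readers. */
#define TOY_DEACTIVATED_BIAS INT_MIN

struct toy_node {
    atomic_int active;  /* >= 0: number of readers; < 0: deactivated */
};

/* "read_trylock": succeeds only while the node is not deactivated. */
bool toy_get_active(struct toy_node *n)
{
    int v = atomic_load(&n->active);

    while (v >= 0) {
        if (atomic_compare_exchange_weak(&n->active, &v, v + 1))
            return true;
        /* v now holds the fresh value; loop and re-check it */
    }
    return false;
}

/* "read_unlock": returns true when this was the last reader of a
 * deactivated node, i.e. the waiting remover may proceed. */
bool toy_put_active(struct toy_node *n)
{
    int v = atomic_fetch_sub(&n->active, 1) - 1;

    return v == TOY_DEACTIVATED_BIAS;
}

/* "write_lock": refuse new readers. Returns true if no reader is left,
 * false if the remover would have to wait. Call at most once per node. */
bool toy_deactivate(struct toy_node *n)
{
    int v = atomic_fetch_add(&n->active, TOY_DEACTIVATED_BIAS)
          + TOY_DEACTIVATED_BIAS;

    return v == TOY_DEACTIVATED_BIAS;
}

void toy_active_selfcheck(void)
{
    struct toy_node n;

    atomic_init(&n.active, 0);
    assert(toy_get_active(&n));   /* a reader enters */
    assert(!toy_deactivate(&n));  /* remover must wait for that reader */
    assert(!toy_get_active(&n));  /* new readers are refused */
    assert(toy_put_active(&n));   /* last reader leaves, remover may go */
}
```

    A show or store method that takes a lock while holding the "read"
    side, combined with a remover holding that same lock while it waits
    on the "write" side, is the deadlock pattern the message describes.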

    This brings us back to the point that holding locks over kobject_del
    is a problem that ideally we should find a way of addressing, but at
    least lockdep can tell us about the problems instead of requiring
    developers to debug rare strange system deadlocks that happen when
    sysfs files are removed while being written to.

    Signed-off-by: Eric W. Biederman
    Acked-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

24 Dec, 2009

1 commit


12 Dec, 2009

6 commits

  • inode_change_ok already clears the SGID bit when necessary
    so there is no reason for sysfs_setattr to carry code to do
    the same, and it is good to kill the extra copy because when
    I moved the code last, in certain corner cases the code will
    look at the wrong gid.

    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • These two functions do 90% of the same work and it doesn't significantly
    obfuscate the function to allow both the parent dir and the name to change
    at the same time. So merge them together to simplify maintenance, and
    increase testing.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • By teaching sysfs_revalidate to hide a dentry for
    a sysfs_dirent if the sysfs_dirent has been renamed,
    and by teaching sysfs_lookup to return the original
    dentry if the sysfs dirent has been renamed, I can
    show the results of renames correctly without having to
    update the dcache during the directory rename.

    This massively simplifies the rename logic allowing a lot
    of weird sysfs special cases to be removed along with
    a lot of now unnecessary helper code.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • With lazy inode updates and dentry operations bringing everything
    into sync on demand there is no longer any need to immediately
    update the vfs or grab i_mutex to protect those updates as we
    make changes to sysfs.

    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Now that sysfs_getattr and sysfs_permission refresh the vfs
    inode there is no need to immediately push the mode change
    into the vfs cache. Reducing the amount of work needed and
    simplifying the locking.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • With the implementation of sysfs_getattr and sysfs_permission
    sysfs becomes able to lazily propagate inode attribute changes
    from the sysfs_dirents to the vfs inodes. This paves the way
    for deleting significant chunks of now unnecessary code.

    While doing this we did not reference sysfs_setattr from
    sysfs_symlink_inode_operations so I added it along with
    sysfs_getattr and sysfs_permission.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman