07 Jan, 2011

3 commits

  • Require filesystems be aware of .d_revalidate being called in rcu-walk
    mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning
    -ECHILD from all implementations.

    Signed-off-by: Nick Piggin

    Nick Piggin
     
  • Reduce some branches and memory accesses in dcache lookup by adding dentry
    flags to indicate common d_ops are set, rather than having to check them.
    This saves a pointer memory access (dentry->d_op) in common path lookup
    situations, and saves another pointer load and branch in cases where we
    have d_op but not the particular operation.

    Patched with:

    git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i

    Signed-off-by: Nick Piggin

    Nick Piggin
     
  • Change d_delete from a dentry deletion notification to a dentry caching
    advise, more like ->drop_inode. Require it to be constant and idempotent,
    and not take d_lock. This is how all existing filesystems use the callback
    anyway.

    This makes fine grained dentry locking of dput and dentry lru scanning
    much simpler.

    Signed-off-by: Nick Piggin

    Nick Piggin
     

22 May, 2010

3 commits

  • Add some in-line comments to explain the new infrastructure, which
    was introduced to support sysfs directory tagging with namespaces.
    I think an overall description someplace might be good too, but it
    didn't really seem to fit into Documentation/filesystems/sysfs.txt,
    which appears more geared toward users, rather than maintainers, of
    sysfs.

    (Tejun, please let me know if I can make anything clearer or failed
    altogether to comment something that should be commented.)

    Signed-off-by: Serge E. Hallyn
    Cc: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Serge E. Hallyn
     
  • I had hopped to avoid this but the bonding driver adds a file
    to /sys/class/net/ and the easiest way to handle that file is
    to make it untagged and to register it only once.

    So relax the rules on tagged directories, and make bonding work.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
    potentially a few other directories of the form /sys/ ... /net/*.

    What this patch does is to add an additional tag field to the
    sysfs dirent structure. For directories that should show different
    contents depending on the context such as /sys/class/net/, and
    /sys/devices/virtual/net/ this tag field is used to specify the
    context in which those directories should be visible. Effectively
    this is the same as creating multiple distinct directories with
    the same name but internally to sysfs the result is nicer.

    I am calling the concept of a single directory that looks like multiple
    directories all at the same path in the filesystem tagged directories.

    For the networking namespace the set of directories whose contents I need
    to filter with tags can depend on the presence or absence of hotplug
    hardware or which modules are currently loaded. Which means I need
    a simple race free way to setup those directories as tagged.

    To achieve a reace free design all tagged directories are created
    and managed by sysfs itself.

    Users of this interface:
    - define a type in the sysfs_tag_type enumeration.
    - call sysfs_register_ns_types with the type and it's operations
    - sysfs_exit_ns when an individual tag is no longer valid

    - Implement mount_ns() which returns the ns of the calling process
    so we can attach it to a sysfs superblock.
    - Implement ktype.namespace() which returns the ns of a syfs kobject.

    Everything else is left up to sysfs and the driver layer.

    For the network namespace mount_ns and namespace() are essentially
    one line functions, and look to remain that.

    Tags are currently represented a const void * pointers as that is
    both generic, prevides enough information for equality comparisons,
    and is trivial to create for current users, as it is just the
    existing namespace pointer.

    The work needed in sysfs is more extensive. At each directory
    or symlink creating I need to check if the directory it is being
    created in is a tagged directory and if so generate the appropriate
    tag to place on the sysfs_dirent. Likewise at each symlink or
    directory removal I need to check if the sysfs directory it is
    being removed from is a tagged directory and if so figure out
    which tag goes along with the name I am deleting.

    Currently only directories which hold kobjects, and
    symlinks are supported. There is not enough information
    in the current file attribute interfaces to give us anything
    to discriminate on which makes it useless, and there are
    no potential users which makes it an uninteresting problem
    to solve.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

08 Mar, 2010

4 commits

  • Currently sysfs_get_inode magically returns an inode on
    sysfs_sb. Make the super_block parameter explicit and
    the code becomes clearer.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • If we exclude directories and symlinks from the set of sysfs
    dirents where we need active references we are left with
    sysfs attributes (binary or not).

    - Tweak sysfs_deactivate to only do something on attributes
    - Move lockdep initialization into sysfs_file_add_mode to
    limit it to just attributes.

    Signed-off-by: Eric W. Biederman
    Acked-by: WANG Cong
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • It turns out that holding an active reference on a directory is
    pointless. The purpose of the active references are to allows us to
    block when removing sysfs entries that have custom methods so we don't
    remove modules while running modular code and to keep those custom
    methods from accessing data structures after the files have been
    removed. Further sysfs_remove_dir remove all elements in the
    directory before removing the directory itself, so there is no chance
    we will remove a directory with active children.

    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • When sysfs_readdir stops short we now cache the next
    sysfs_dirent to return to user space in filp->private_data.
    There is no impact on the rest of sysfs by doing this and
    in the common case it allows us to pick up exactly where
    we left off with no seeking.

    Additionally I drop and regrab the sysfs_mutex around
    filldir to avoid a page fault abritrarily increasing the
    hold time on the sysfs_mutex.

    v2: Returned to using INT_MAX as the EOF condition.
    seekdir is ambiguous unless all directory entries have
    a unique f_pos value.

    Fixes http://bugzilla.kernel.org/show_bug.cgi?id=14949

    Signed-off-by: Eric W. Biederman
    Cc: Linus Torvalds
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

05 Jan, 2010

1 commit

  • Holding locks over device_del -> kobject_del -> sysfs_deactivate can
    cause deadlocks if those same locks are grabbed in sysfs show or store
    methods.

    The I model s_active count + completion as a sleeping read/write lock.
    I describe to lockdep sysfs_get_active as a read_trylock,
    sysfs_put_active as a read_unlock, and sysfs_deactivate as a
    write_lock and write_unlock pair. This seems to capture the essence
    for purposes of finding deadlocks, and in my testing gives finds real
    issues and ignores non-issues.

    This brings us back to holding locks over kobject_del is a problem
    that ideally we should find a way of addressing, but at least lockdep
    can tell us about the problems instead of requiring developers to debug
    rare strange system deadlocks, that happen when sysfs files are removed
    while being written to.

    Signed-off-by: Eric W. Biederman
    Acked-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

12 Dec, 2009

7 commits

  • These two functions do 90% of the same work and it doesn't significantly
    obfuscate the function to allow both the parent dir and the name to change
    at the same time. So merge them together to simplify maintenance, and
    increase testing.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • By teaching sysfs_revalidate to hide a dentry for
    a sysfs_dirent if the sysfs_dirent has been renamed,
    and by teaching sysfs_lookup to return the original
    dentry if the sysfs dirent has been renamed. I can
    show the results of renames correctly without having to
    update the dcache during the directory rename.

    This massively simplifies the rename logic allowing a lot
    of weird sysfs special cases to be removed along with
    a lot of now unnecesary helper code.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • With lazy inode updates and dentry operations bringing everything
    into sync on demand there is no longer any need to immediately
    update the vfs or grab i_mutex to protect those updates as we
    make changes to sysfs.

    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • With the implementation of sysfs_getattr and sysfs_permission
    sysfs becomes able to lazily propogate inode attribute changes
    from the sysfs_dirents to the vfs inodes. This paves the way
    for deleting significant chunks of now unnecessary code.

    While doing this we did not reference sysfs_setattr from
    sysfs_symlink_inode_operations so I added along with
    sysfs_getattr and sysfs_permission.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Currently sysfs updates the timestamps on the vfs directory
    inode when we create or remove a directory entry but doesn't
    update the cached copy on the sysfs_dirent, fix that oversight.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Calling d_drop unconditionally when a sysfs_dirent is deleted has
    the potential to leak mounts, so instead implement dentry delete
    and revalidate operations that cause sysfs dentries to be removed
    at the appropriate time.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Using dentry instead of d in the function name is what
    several other filesystems are doing and it seems to be
    a more readable convention.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

05 Nov, 2009

1 commit


15 Oct, 2009

1 commit


10 Sep, 2009

1 commit

  • This patch adds a setxattr handler to the file, directory, and symlink
    inode_operations structures for sysfs. The patch uses hooks introduced in the
    previous patch to handle the getting and setting of security information for
    the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
    sysfs_dirent structure has been replaced by a structure which contains the
    iattr, secdata and secdata length to allow the changes to persist in the event
    that the inode representing the sysfs_dirent is evicted. Because sysfs only
    stores this information when a change is made all the optional data is moved
    into one dynamically allocated field.

    This patch addresses an issue where SELinux was denying virtd access to the PCI
    configuration entries in sysfs. The lack of setxattr handlers for sysfs
    required that a single label be assigned to all entries in sysfs. Granting virtd
    access to every entry in sysfs is not an acceptable solution so fine grained
    labeling of sysfs is required such that individual entries can be labeled
    appropriately.

    [sds: Fixed compile-time warnings, coding style, and setting of inode security init flags.]

    Signed-off-by: David P. Quigley
    Signed-off-by: Stephen D. Smalley
    Signed-off-by: James Morris

    David P. Quigley
     

29 Jul, 2009

1 commit

  • Update directory hardlink count when moving kobjects to a new parent.
    Fixes the following problem which occurs when several devices are
    moved to the same parent and then unregistered:

    > ls -laF /sys/devices/css0/defunct/
    > total 0
    > drwxr-xr-x 4294967295 root root 0 2009-07-14 17:02 ./
    > drwxr-xr-x 114 root root 0 2009-07-14 17:02 ../
    > drwxr-xr-x 2 root root 0 2009-07-14 17:01 power/
    > -rw-r--r-- 1 root root 4096 2009-07-14 17:01 uevent

    Signed-off-by: Peter Oberparleiter
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Peter Oberparleiter
     

28 Mar, 2009

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (37 commits)
    fs: avoid I_NEW inodes
    Merge code for single and multiple-instance mounts
    Remove get_init_pts_sb()
    Move common mknod_ptmx() calls into caller
    Parse mount options just once and copy them to super block
    Unroll essentials of do_remount_sb() into devpts
    vfs: simple_set_mnt() should return void
    fs: move bdev code out of buffer.c
    constify dentry_operations: rest
    constify dentry_operations: configfs
    constify dentry_operations: sysfs
    constify dentry_operations: JFS
    constify dentry_operations: OCFS2
    constify dentry_operations: GFS2
    constify dentry_operations: FAT
    constify dentry_operations: FUSE
    constify dentry_operations: procfs
    constify dentry_operations: ecryptfs
    constify dentry_operations: CIFS
    constify dentry_operations: AFS
    ...

    Linus Torvalds
     
  • Signed-off-by: Al Viro

    Al Viro
     

25 Mar, 2009

2 commits

  • Modify sysfs bin files so that we can remove the bin file while they are
    still mapped. When the kobject is removed we unmap the bin file and
    arrange for future accesses to the mapping to receive SIGBUS.

    Implementing this prevents a nasty DOS when pci devices are hot plugged
    and unplugged. Where if any of their resources were mmaped the kernel
    could not free up their pci resources or release their pci data
    structures.

    [akpm@linux-foundation.org: remove unused var]
    Signed-off-by: Eric W. Biederman
    Cc: Jesse Barnes
    Acked-by: Tejun Heo
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • sysfs: sysfs_add_one WARNs with full path to duplicate filename

    As a debugging aid, it can be useful to know the full path to a
    duplicate file being created in sysfs.

    We now will display warnings such as:

    sysfs: cannot create duplicate filename '/foo'

    when attempting to create multiple files named 'foo' in the sysfs
    root, or:

    sysfs: cannot create duplicate filename '/bus/pci/slots/5/foo'

    when attempting to create multiple files named 'foo' under a
    given directory in sysfs.

    The path displayed is always a relative path to sysfs_root. The
    leading '/' in the path name refers to the sysfs_root mount
    point, and should not be confused with the "real" '/'.

    Thanks to Alex Williamson for essentially writing sysfs_pathname.

    Cc: Alex Williamson
    Signed-off-by: Alex Chiang
    Signed-off-by: Greg Kroah-Hartman

    Alex Chiang
     

23 Oct, 2008

1 commit


17 Oct, 2008

3 commits

  • It finally dawned on me what the clean fix to sysfs_rename_dir
    calling kobject_set_name is. Move the work into kobject_rename
    where it belongs. The callers serialize us anyway so this is
    safe.

    Signed-off-by: Eric W. Biederman
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • As inode creation is protected by sysfs_mutex, ilookup5_nowait()
    always either fails to find at all or finds one which is fully
    initialized, so using ilookup5_nowait() or ilookup5() doesn't make any
    difference. Switch to ilookup5() as it's planned to be removed. This
    change also makes lookup return value handling a bit simpler.

    This change was suggested by Al Viro.

    Signed-off-by: Tejun Heo
    Cc: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Support sysfs_notify from atomic context with new sysfs_notify_dirent

    sysfs_notify currently takes sysfs_mutex.
    This means that it cannot be called in atomic context.
    sysfs_mutex is sometimes held over a malloc (sysfs_rename_dir)
    so it can block on low memory.

    In md I want to be able to notify on a sysfs attribute from
    atomic context, and I don't want to block on low memory because I
    could be in the writeout path for freeing memory.

    So:
    - export the "sysfs_dirent" structure along with sysfs_get, sysfs_put
    and sysfs_get_dirent so I can get the sysfs_dirent that I want to
    notify on and hold it in an md structure.
    - split sysfs_notify_dirent out of sysfs_notify so the sysfs_dirent
    can be notified on with no blocking (just a spinlock).

    Signed-off-by: Neil Brown
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Neil Brown
     

27 Jul, 2008

1 commit

  • Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes
    part of the warning section for better reporting/collection. Also, with this,
    one fo the if() sections collapses entirely into the WARN().

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

22 Jul, 2008

1 commit

  • driver core: Suppress sysfs warnings for device_rename().

    Renaming network devices to an already existing name is not
    something we want sysfs to print a scary warning for, since the
    callers can deal with this correctly. So let's introduce
    sysfs_create_link_nowarn() which gets rid of the common warning.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Cornelia Huck
     

15 May, 2008

1 commit

  • It is possible that the entry in sysfs already exists, one case of this is
    when a network device is renamed to bonding_masters. Anyway, in this case
    the proper error path is for device_rename to return an error code, not to
    generate bogus backtrace and errors.

    Also, to avoid possible races, the create link should be done before the
    remove link. This makes a device rename atomic operation like other renames.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

20 Apr, 2008

1 commit


25 Jan, 2008

1 commit


17 Jan, 2008

2 commits

  • sysfs_rename/move_dir() have the following bugs.

    - On dentry lookup failure, kfree() is called on ERR_PTR() value.
    - sysfs_move_dir() has an extra dput() on success path.

    Fix them.

    Signed-off-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • sysfs tries to keep dcache a strict subset of sysfs_dirent tree by
    shooting down dentries when a node is removed, that is, no negative
    dentry for sysfs. However, the lookup function returned NULL and thus
    created negative dentries when the target node didn't exist.

    Make sysfs_lookup() return ERR_PTR(-ENOENT) on lookup failure. This
    fixes the NULL dereference bug in sysfs_get_dentry() discovered by
    bluetooth rfcomm device moving around.

    Signed-off-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

31 Oct, 2007

1 commit


17 Oct, 2007

2 commits

  • Replace some SPIN_LOCK_UNLOCKED with DEFINE_SPINLOCK

    Signed-off-by: Roel Kluin
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roel Kluin
     
  • Try to fix the mess created by sysfs braindamage.

    - refactor code internal to fs/namei.c a little to avoid too much
    duplication:
    o __lookup_hash_kern is renamed back to __lookup_hash
    o the old __lookup_hash goes away, permission checks moves to
    the two callers
    o useless inline qualifiers on above functions go away
    - lookup_one_len_kern loses it's last argument and is renamed to
    lookup_one_noperm to make it's useage a little more clear
    - added kerneldoc comments to describe lookup_one_len aswell as
    lookup_one_noperm and make it very clear that no one should use
    the latter ever.

    Signed-off-by: Christoph Hellwig
    Cc: Josef 'Jeff' Sipek
    Cc: Miklos Szeredi
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig