04 Jan, 2012

1 commit


23 Aug, 2011

4 commits

  • sysfs: use rb-tree for inode number lookup

    This patch makes sysfs use a red-black tree for inode number lookup.
    Together with a previous patch that uses a red-black tree for name lookup,
    this patch gives all sysfs lookups O(log n) complexity.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
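
    The O(log n) claim above can be illustrated in user space. The sketch
    below is not the kernel code: a sorted array searched with bsearch()
    stands in for the red-black tree, and struct entry, find_by_ino() and
    sample_dir are hypothetical names invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a sysfs directory entry. */
struct entry {
    unsigned long ino;   /* inode number, used as the lookup key */
    const char *name;
};

/* bsearch() callback: compare a wanted inode number against an entry. */
static int cmp_ino(const void *key, const void *elem)
{
    unsigned long k = *(const unsigned long *)key;
    unsigned long e = ((const struct entry *)elem)->ino;

    return (k > e) - (k < e);
}

/*
 * Find an entry by inode number in an array sorted by ino.
 * Binary search needs O(log n) comparisons, the same bound a balanced
 * tree gives; a linear list walk would need O(n).
 */
const struct entry *find_by_ino(const struct entry *tab, size_t n,
                                unsigned long ino)
{
    return bsearch(&ino, tab, n, sizeof(*tab), cmp_ino);
}

/* Sample directory contents (made up), sorted by inode number. */
const struct entry sample_dir[] = {
    { 3, "block" }, { 7, "class" }, { 12, "devices" }, { 40, "kernel" },
};
```

    Unlike this array, a red-black tree also keeps insertion and removal at
    O(log n), which matters for a directory whose contents change.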
     
  • sysfs: remove s_sibling hacks

    s_sibling was used for three different purposes:
    1) as a linked list of entries in the directory
    2) as a linked list of entries to be deleted
    3) as a pointer to "struct completion"

    This patch removes the hack and introduces a new union u which
    holds pointers for cases 2) and 3).

    This change is needed for the following patch, which removes s_sibling
    altogether and replaces it with an rb tree.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
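
    A minimal sketch of the idea, assuming a simplified structure:
    dirent_sketch, completion_stub and queue_for_removal() are hypothetical
    names for illustration, and the member names inside the union are an
    assumption rather than a quotation of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct completion. */
struct completion_stub {
    int done;
};

struct dirent_sketch {
    /* Purpose 1 only: linked list of entries in the directory. */
    struct dirent_sketch *s_sibling;

    /* Purposes 2 and 3 get their own storage instead of reusing s_sibling. */
    union {
        struct dirent_sketch *removed_list;   /* list of entries to be deleted */
        struct completion_stub *completion;   /* waiter for the final release */
    } u;
};

/* Push sd onto a to-be-deleted list without touching s_sibling. */
void queue_for_removal(struct dirent_sketch **head, struct dirent_sketch *sd)
{
    sd->u.removed_list = *head;
    *head = sd;
}
```

    With the deletion list and the completion pointer moved out, s_sibling
    has a single meaning and can later be replaced by a different container
    without auditing the other two uses.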
     
  • sysfs: use rb-tree for name lookups

    Use red-black tree for name lookups.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
     
  • sysfs: count subdirectories

    This patch introduces a subdirectory counter for each sysfs directory.

    Without the patch, sysfs_refresh_inode would walk all entries of the directory
    to calculate the number of subdirectories.

    This patch reduces the time of "ls -la /sys/block" with 10000 block
    devices from 9 seconds to 0.19 seconds.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
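
    The trade described above (walk all entries on every refresh versus
    maintain a counter on insertion) can be sketched as follows. This is a
    user-space illustration with hypothetical names (struct node,
    link_child(), dir_nlink()), not the sysfs implementation.

```c
#include <stddef.h>

enum kind { KIND_FILE, KIND_DIR };

struct node {
    enum kind kind;
    struct node *next;       /* next sibling */
    struct node *children;   /* first child, directories only */
    unsigned int subdirs;    /* cached number of child directories */
};

/* Old approach: walk every child on each refresh, O(n) per call. */
unsigned int count_subdirs_slow(const struct node *dir)
{
    unsigned int n = 0;

    for (const struct node *c = dir->children; c; c = c->next)
        if (c->kind == KIND_DIR)
            n++;
    return n;
}

/* New approach: keep the counter up to date when a child is linked. */
void link_child(struct node *dir, struct node *child)
{
    child->next = dir->children;
    dir->children = child;
    if (child->kind == KIND_DIR)
        dir->subdirs++;
}

/* A directory's link count is conventionally 2 plus its subdirectories. */
unsigned int dir_nlink(const struct node *dir)
{
    return dir->subdirs + 2;
}
```

    A matching unlink helper would decrement the counter; it is omitted here
    to keep the sketch short.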
     

20 Jul, 2011

1 commit


13 Jun, 2011

1 commit

  • * new refcount in struct net, controlling actual freeing of the memory
    * new method in kobj_ns_type_operations (->drop_ns())
    * ->current_ns() semantics change - it's supposed to be followed by
    corresponding ->drop_ns(). For struct net in case of CONFIG_NET_NS it bumps
    the new refcount; net_drop_ns() decrements it and calls net_free() if the
    last reference has been dropped. Method renamed to ->grab_current_ns().
    * old net_free() callers call net_drop_ns() instead.
    * sysfs_exit_ns() is gone, along with a large part of callchain
    leading to it; now that the references stored in ->ns[...] stay valid we
    do not need to hunt them down and replace them with NULL. That fixes
    problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
    of sb->s_instances abuse.

    Note that the struct net *shutdown* logic has not changed - net_cleanup()
    is called exactly when it used to be called. The only thing postponed by
    having a sysfs instance referring to that struct net is the actual freeing
    of the memory occupied by struct net.

    Signed-off-by: Al Viro

    Al Viro
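
    The split between shutting a namespace down and freeing its memory can
    be sketched as below. Everything here is a simplified, single-threaded
    assumption: ns_sketch, grab_ns(), drop_ns() and shutdown_ns() are
    hypothetical names, and a plain int stands in for an atomic counter.

```c
#include <stdbool.h>

struct ns_sketch {
    int passive;      /* references that only keep the memory alive */
    bool shut_down;   /* cleanup has run */
    bool freed;       /* stands in for the memory actually being freed */
};

/* Take a reference that pins the memory, e.g. for a sysfs superblock. */
struct ns_sketch *grab_ns(struct ns_sketch *ns)
{
    ns->passive++;
    return ns;
}

/* Drop a memory reference; the last one frees the structure. */
void drop_ns(struct ns_sketch *ns)
{
    if (--ns->passive == 0)
        ns->freed = true;
}

/* Shutdown runs when it always did; it only drops the owner's reference. */
void shutdown_ns(struct ns_sketch *ns)
{
    ns->shut_down = true;
    drop_ns(ns);
}
```

    As long as some holder has not called drop_ns(), pointers stored
    elsewhere stay valid even after shutdown, so there is nothing to hunt
    down and replace with NULL.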
     

11 Jan, 2011

1 commit


07 Jan, 2011

1 commit


10 Aug, 2010

1 commit


22 May, 2010

3 commits

  • Add some in-line comments to explain the new infrastructure, which
    was introduced to support sysfs directory tagging with namespaces.
    I think an overall description someplace might be good too, but it
    didn't really seem to fit into Documentation/filesystems/sysfs.txt,
    which appears more geared toward users, rather than maintainers, of
    sysfs.

    (Tejun, please let me know if I can make anything clearer or failed
    altogether to comment something that should be commented.)

    Signed-off-by: Serge E. Hallyn
    Cc: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Serge E. Hallyn
     
  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
    potentially a few other directories of the form /sys/ ... /net/*.

    What this patch does is to add an additional tag field to the
    sysfs dirent structure. For directories that should show different
    contents depending on the context such as /sys/class/net/, and
    /sys/devices/virtual/net/ this tag field is used to specify the
    context in which those directories should be visible. Effectively
    this is the same as creating multiple distinct directories with
    the same name but internally to sysfs the result is nicer.

    I am calling the concept of a single directory that looks like multiple
    directories all at the same path in the filesystem tagged directories.

    For the networking namespace the set of directories whose contents I need
    to filter with tags can depend on the presence or absence of hotplug
    hardware or which modules are currently loaded, which means I need
    a simple race-free way to set up those directories as tagged.

    To achieve a race-free design all tagged directories are created
    and managed by sysfs itself.

    Users of this interface:
    - define a type in the sysfs_tag_type enumeration.
    - call sysfs_register_ns_types with the type and its operations.
    - call sysfs_exit_ns when an individual tag is no longer valid.

    - implement mount_ns() which returns the ns of the calling process
    so we can attach it to a sysfs superblock.
    - implement ktype.namespace() which returns the ns of a sysfs kobject.

    Everything else is left up to sysfs and the driver layer.

    For the network namespace mount_ns and namespace() are essentially
    one line functions, and look to remain that.

    Tags are currently represented as const void * pointers, as that is
    generic, provides enough information for equality comparisons,
    and is trivial to create for current users, as it is just the
    existing namespace pointer.

    The work needed in sysfs is more extensive. At each directory
    or symlink creation I need to check if the directory it is being
    created in is a tagged directory and if so generate the appropriate
    tag to place on the sysfs_dirent. Likewise at each symlink or
    directory removal I need to check if the sysfs directory it is
    being removed from is a tagged directory and if so figure out
    which tag goes along with the name I am deleting.

    Currently only directories which hold kobjects, and
    symlinks are supported. There is not enough information
    in the current file attribute interfaces to give us anything
    to discriminate on which makes it useless, and there are
    no potential users which makes it an uninteresting problem
    to solve.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
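
    The tagged-directory idea (several entries with the same name in one
    directory, told apart by an opaque pointer compared only for equality)
    can be sketched like this. The names tagged_entry and find_tagged() are
    hypothetical, and a linear scan is used purely for brevity.

```c
#include <stddef.h>
#include <string.h>

struct tagged_entry {
    const char *name;
    const void *ns;   /* tag: opaque namespace pointer, NULL if untagged */
};

/*
 * Look up name as seen from namespace ns. Entries whose tag differs are
 * invisible, so two namespaces can each have their own "eth0".
 */
const struct tagged_entry *find_tagged(const struct tagged_entry *tab,
                                       size_t n, const void *ns,
                                       const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].ns == ns && strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}
```

    Only pointer equality is ever applied to the tag, which is why a
    const void * is sufficient and the lookup code never needs to know what
    kind of namespace it points at.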
     
  • Add all of the necessary boilerplate to support
    multiple superblocks in sysfs.

    Signed-off-by: Eric W. Biederman
    Acked-by: Serge Hallyn
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

08 Mar, 2010

6 commits

  • Now that there are no more users we can remove
    the sysfs_sb variable.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Currently sysfs_get_inode magically returns an inode on
    sysfs_sb. Make the super_block parameter explicit and
    the code becomes clearer.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Placing the 16-bit s_mode between a pointer and a long doesn't pack well,
    especially on 64-bit where we waste 48 bits. So move s_mode and
    declare it as an unsigned short. This is the sysfs backing store
    after all; we don't need extra-large fields just in case someday
    we want userspace to be able to use a larger value.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
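
    The padding effect can be reproduced with two toy layouts. The field
    names and their order below are assumptions chosen to show the effect,
    not a copy of the real structure, and exact sizes depend on the ABI.

```c
#include <stddef.h>

/* 16-bit field wedged between pointer-sized members: padding follows it. */
struct mode_late {
    unsigned int flags;
    unsigned long ino;
    unsigned short mode;
    void *iattr;
};

/* Same members, with the 16-bit field placed next to the 32-bit one. */
struct mode_early {
    unsigned int flags;
    unsigned short mode;
    unsigned long ino;
    void *iattr;
};

/* Bytes saved by the reordering on the current ABI (may be zero). */
size_t bytes_saved(void)
{
    return sizeof(struct mode_late) - sizeof(struct mode_early);
}
```

    On a typical LP64 system the first layout is expected to take 32 bytes
    and the second 24; on 32-bit targets the two usually come out equal.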
     
  • Acknowledge that the logical sysfs rwsem has one instance per
    sysfs attribute with different locking dependencies for different
    attributes.

    There is a sysfs idiom where writing to one sysfs file causes the
    addition or removal of other sysfs files. Lumping all of the
    sysfs attributes together in one lock class causes lockdep to
    generate lots of false positives.

    This introduces the requirement that non-static sysfs attributes
    need to be initialized with sysfs_attr_init or sysfs_bin_attr_init.
    Strictly speaking this requirement only exists when lockdep is
    enabled, and when lockdep is enabled we get a big fat warning
    if this requirement is not met.

    Signed-off-by: Eric W. Biederman
    Acked-by: WANG Cong
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • If we exclude directories and symlinks from the set of sysfs
    dirents where we need active references we are left with
    sysfs attributes (binary or not).

    - Tweak sysfs_deactivate to only do something on attributes
    - Move lockdep initialization into sysfs_file_add_mode to
    limit it to just attributes.

    Signed-off-by: Eric W. Biederman
    Acked-by: WANG Cong
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • It turns out that holding an active reference on a directory is
    pointless. The purpose of the active references is to allow us to
    block when removing sysfs entries that have custom methods, so we don't
    remove modules while running modular code, and to keep those custom
    methods from accessing data structures after the files have been
    removed. Further, sysfs_remove_dir removes all elements in the
    directory before removing the directory itself, so there is no chance
    we will remove a directory with active children.

    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

05 Jan, 2010

1 commit

  • Holding locks over device_del -> kobject_del -> sysfs_deactivate can
    cause deadlocks if those same locks are grabbed in sysfs show or store
    methods.

    Here I model the s_active count + completion as a sleeping read/write lock.
    I describe sysfs_get_active to lockdep as a read_trylock,
    sysfs_put_active as a read_unlock, and sysfs_deactivate as a
    write_lock and write_unlock pair. This seems to capture the essence
    for purposes of finding deadlocks, and in my testing it finds real
    issues and ignores non-issues.

    This brings us back to the point that holding locks over kobject_del is
    a problem that ideally we should find a way of addressing, but at least
    lockdep can tell us about the problems instead of requiring developers to
    debug rare, strange system deadlocks that happen when sysfs files are
    removed while being written to.

    Signed-off-by: Eric W. Biederman
    Acked-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
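
    The read/write-lock analogy can be made concrete with a small counter
    model. This is a sketch under assumptions: it uses C11 atomics in user
    space, the names active_sketch, get_active(), put_active() and
    deactivate() are hypothetical, and the sleeping on a completion is
    reduced to a boolean return value.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Adding this bias makes the count negative, which blocks new readers. */
#define DEACTIVATED_BIAS INT_MIN

struct active_sketch {
    atomic_int s_active;   /* >= 0: number of readers; < 0: deactivated */
};

/* Like a read_trylock: succeeds only while the count is non-negative. */
bool get_active(struct active_sketch *sd)
{
    int v = atomic_load(&sd->s_active);

    while (v >= 0) {
        /* On failure v is reloaded and the sign is checked again. */
        if (atomic_compare_exchange_weak(&sd->s_active, &v, v + 1))
            return true;
    }
    return false;
}

/*
 * Like a read_unlock. Returns true when this was the last reader and a
 * deactivation is pending, i.e. when the waiter should be woken up.
 */
bool put_active(struct active_sketch *sd)
{
    int v = atomic_fetch_sub(&sd->s_active, 1) - 1;

    return v == DEACTIVATED_BIAS;
}

/*
 * Like taking the write lock. Returns true if no readers remain; on
 * false the caller would sleep until put_active() reports the last one.
 */
bool deactivate(struct active_sketch *sd)
{
    int v = atomic_fetch_add(&sd->s_active, DEACTIVATED_BIAS)
            + DEACTIVATED_BIAS;

    return v == DEACTIVATED_BIAS;
}
```

    The deadlock the commit message warns about appears when the caller of
    deactivate() holds a lock that an active reader needs before it can
    reach put_active(): neither side can make progress.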
     

12 Dec, 2009

5 commits

  • These two functions do 90% of the same work and it doesn't significantly
    obfuscate the function to allow both the parent dir and the name to change
    at the same time. So merge them together to simplify maintenance, and
    increase testing.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • By teaching sysfs_revalidate to hide a dentry for
    a sysfs_dirent if the sysfs_dirent has been renamed,
    and by teaching sysfs_lookup to return the original
    dentry if the sysfs dirent has been renamed, I can
    show the results of renames correctly without having to
    update the dcache during the directory rename.

    This massively simplifies the rename logic, allowing a lot
    of weird sysfs special cases to be removed along with
    a lot of now unnecessary helper code.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • With lazy inode updates and dentry operations bringing everything
    into sync on demand there is no longer any need to immediately
    update the vfs or grab i_mutex to protect those updates as we
    make changes to sysfs.

    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • With the implementation of sysfs_getattr and sysfs_permission
    sysfs becomes able to lazily propagate inode attribute changes
    from the sysfs_dirents to the vfs inodes. This paves the way
    for deleting significant chunks of now unnecessary code.

    While doing this we did not reference sysfs_setattr from
    sysfs_symlink_inode_operations, so I added it along with
    sysfs_getattr and sysfs_permission.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Cleanly separate the work that is specific to setting the
    attributes of a sysfs_dirent from what is needed to update
    the attributes of a vfs inode.

    Additionally grab the sysfs_mutex to keep any nasties from
    surprising us when updating the sysfs_dirent.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

10 Sep, 2009

1 commit

  • This patch adds a setxattr handler to the file, directory, and symlink
    inode_operations structures for sysfs. The patch uses hooks introduced in the
    previous patch to handle the getting and setting of security information for
    the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
    sysfs_dirent structure has been replaced by a structure which contains the
    iattr, secdata and secdata length to allow the changes to persist in the event
    that the inode representing the sysfs_dirent is evicted. Because sysfs only
    stores this information when a change is made all the optional data is moved
    into one dynamically allocated field.

    This patch addresses an issue where SELinux was denying virtd access to the PCI
    configuration entries in sysfs. The lack of setxattr handlers for sysfs
    required that a single label be assigned to all entries in sysfs. Granting virtd
    access to every entry in sysfs is not an acceptable solution so fine grained
    labeling of sysfs is required such that individual entries can be labeled
    appropriately.

    [sds: Fixed compile-time warnings, coding style, and setting of inode security init flags.]

    Signed-off-by: David P. Quigley
    Signed-off-by: Stephen D. Smalley
    Signed-off-by: James Morris

    David P. Quigley
     

25 Mar, 2009

2 commits

  • Modify sysfs bin files so that we can remove a bin file while it is
    still mapped. When the kobject is removed we unmap the bin file and
    arrange for future accesses to the mapping to receive SIGBUS.

    Implementing this prevents a nasty DoS when PCI devices are hot plugged
    and unplugged: if any of their resources were mmapped, the kernel
    could not free up their PCI resources or release their PCI data
    structures.

    [akpm@linux-foundation.org: remove unused var]
    Signed-off-by: Eric W. Biederman
    Cc: Jesse Barnes
    Acked-by: Tejun Heo
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The sysfs_dirent serves as both an inode and a directory entry
    for sysfs. To prevent the sysfs inode numbers from being freed
    prematurely hold a reference to sysfs_dirent from the sysfs inode.

    [akpm@linux-foundation.org: add comment]
    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Cc: Al Viro
    Cc: Cornelia Huck
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

17 Oct, 2008

1 commit

  • Support sysfs_notify from atomic context with new sysfs_notify_dirent

    sysfs_notify currently takes sysfs_mutex.
    This means that it cannot be called in atomic context.
    sysfs_mutex is sometimes held over a malloc (sysfs_rename_dir)
    so it can block on low memory.

    In md I want to be able to notify on a sysfs attribute from
    atomic context, and I don't want to block on low memory because I
    could be in the writeout path for freeing memory.

    So:
    - export the "sysfs_dirent" structure along with sysfs_get, sysfs_put
    and sysfs_get_dirent so I can get the sysfs_dirent that I want to
    notify on and hold it in an md structure.
    - split sysfs_notify_dirent out of sysfs_notify so the sysfs_dirent
    can be notified on with no blocking (just a spinlock).

    Signed-off-by: Neil Brown
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Neil Brown
     

22 Jul, 2008

1 commit

  • driver core: Suppress sysfs warnings for device_rename().

    Renaming network devices to an already existing name is not
    something we want sysfs to print a scary warning for, since the
    callers can deal with this correctly. So let's introduce
    sysfs_create_link_nowarn() which gets rid of the common warning.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Cornelia Huck
     

23 Apr, 2008

1 commit

  • We have a problem in scsi_transport_spi in that we need to customise
    not only the visibility of the attributes, but also their mode. Fix
    this by making the is_visible() callback return a mode, with 0
    indicating that the attribute is not visible.

    Also add a sysfs_update_group() API to allow us to change either the
    visibility or mode of the files at any time on the fly.

    Acked-by: Kay Sievers
    Signed-off-by: James Bottomley

    James Bottomley
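
    The callback contract described above (return a mode, with 0 meaning
    hidden) can be sketched in user space as follows. The types and names
    (mode_bits, attr_sketch, effective_mode(), example_visible()) are
    hypothetical and do not reproduce the kernel's signatures.

```c
#include <stddef.h>

typedef unsigned short mode_bits;   /* hypothetical stand-in for a file mode */

struct attr_sketch {
    const char *name;
    mode_bits mode;   /* default permissions of the attribute */
};

typedef mode_bits (*is_visible_fn)(const struct attr_sketch *attr, int index);

/*
 * Mode to create the file with: the attribute's default when the group
 * has no callback, otherwise whatever the callback returns.
 * A result of 0 means "do not create this file".
 */
mode_bits effective_mode(const struct attr_sketch *attr, int index,
                         is_visible_fn is_visible)
{
    if (!is_visible)
        return attr->mode;
    return is_visible(attr, index);
}

/* Example policy: hide entry 1, force entry 2 read-only, keep the rest. */
mode_bits example_visible(const struct attr_sketch *attr, int index)
{
    if (index == 1)
        return 0;
    if (index == 2)
        return 0444;
    return attr->mode;
}
```

    Returning the mode instead of a yes/no answer lets one callback decide
    both whether a file exists and how writable it is.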
     

31 Oct, 2007

1 commit


17 Oct, 2007

1 commit

  • provide BDI constructor/destructor hooks

    [akpm@linux-foundation.org: compile fix]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

13 Oct, 2007

7 commits

  • Sysfs has gone through a considerable amount of reimplementation. Add
    copyrights. Any objections? :-)

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Sysfs file poll implementation is scattered over sysfs and kobject.
    Event numbering is done in sysfs_dirent but the wait itself is done on
    kobject. This not only unnecessarily bloats both kobject and
    sysfs_dirent but is also buggy - if a sysfs_dirent is removed while
    there still are pollers, the association between the kobject and
    sysfs_dirent breaks and kobject may be freed with the pollers still
    sleeping on it.

    This patch moves whole poll implementation into sysfs_open_dirent.
    Each time a sysfs_open_dirent is created, event number restarts from 1
    and pollers sleep on sysfs_open_dirent. As event sequence number is
    meaningless without any open file and pollers should have open file
    and thus sysfs_open_dirent, this ephemeral event counting works and is
    a saner implementation.

    This patch fixes the dangling sleepers bug and reduces the sizes of
    kobject and sysfs_dirent by one pointer.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
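
    The ephemeral event counting can be sketched as below. This is a
    simplified model with hypothetical names (open_dirent_sketch,
    open_file_sketch and the helpers); the point at which an open file
    records the event number it has seen is an assumption, and waking
    sleepers is left out.

```c
#include <stdbool.h>

/* Exists only while at least one file is open on the attribute. */
struct open_dirent_sketch {
    int event;   /* current event sequence number */
};

/* One per open file; remembers the last event number it has seen. */
struct open_file_sketch {
    struct open_dirent_sketch *od;
    int seen;
};

/* Each time the shared object is created, numbering restarts from 1. */
void od_init(struct open_dirent_sketch *od)
{
    od->event = 1;
}

/* Reading the attribute records the current event number. */
void file_read(struct open_file_sketch *f)
{
    f->seen = f->od->event;
}

/* A notification bumps the number (and would wake sleeping pollers). */
void od_notify(struct open_dirent_sketch *od)
{
    od->event++;
}

/* poll reports readiness when something happened since the last read. */
bool poll_ready(const struct open_file_sketch *f)
{
    return f->seen != f->od->event;
}
```

    Because both the counter and the sleepers live in an object that exists
    only while the file is open, removing the entry cannot leave pollers
    waiting on freed memory.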
     
  • Implement sysfs_open_dirent which represents an open file (attribute)
    sysfs_dirent. A file sysfs_dirent with one or more open files has
    one sysfs_open_dirent, and all sysfs_buffers (one for each open
    instance) are linked to it.

    sysfs_open_dirent doesn't actually do anything yet but will be used to
    off-load things which are specific for open file sysfs_dirent from it.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Children list head is only meaningful for directory nodes. Move it
    into s_dir. This doesn't save any space currently but it will with
    further changes.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • sysfs_root is different from a regular directory dirent in that it's
    of type SYSFS_ROOT and doesn't have a name. These differences aren't
    used by anybody and only add to complexity. Make sysfs_root a
    regular directory dirent.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Make s_elem an anonymous union. Prefixing with s_elem makes things
    needlessly longer without any advantage.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Move s_mode downward such that it's side-by-side with s_iattr which is
    used for the same thing.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo