13 Jun, 2011

1 commit

  • * new refcount in struct net, controlling actual freeing of the memory
    * new method in kobj_ns_type_operations (->drop_ns())
    * ->current_ns() semantics change - it's supposed to be followed by
    corresponding ->drop_ns(). For struct net in case of CONFIG_NET_NS it bumps
    the new refcount; net_drop_ns() decrements it and calls net_free() if the
    last reference has been dropped. Method renamed to ->grab_current_ns().
    * old net_free() callers call net_drop_ns() instead.
    * sysfs_exit_ns() is gone, along with a large part of callchain
    leading to it; now that the references stored in ->ns[...] stay valid we
    do not need to hunt them down and replace them with NULL. That fixes
    problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
    of sb->s_instances abuse.

    Note that struct net *shutdown* logics has not changed - net_cleanup()
    is called exactly when it used to be called. The only thing postponed by
    having a sysfs instance refering to that struct net is actual freeing of
    memory occupied by struct net.

    Signed-off-by: Al Viro

    Al Viro
     

29 Oct, 2010

1 commit


10 Aug, 2010

1 commit


22 May, 2010

4 commits

  • In Al's latest vfs tree the code is reworked and S_BIAS has been removed.

    It turns out that checking to see if a super block is in the
    middle of an unmount in sysfs_exit_ns is unnecessary because we
    remove the super_block from the s_supers/s_instances list before
    struct sysfs_super_info pointed to by sb->s_fs_info is freed.

    For now just delete the unnecessary check to see if a superblock is in the
    middle of an unmount, it isn't necessary with or without Al's changes
    and it just causes a needless conflict.

    Reported-by: Stephen Rothwell
    Cc: Al Viro
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
    potentially a few other directories of the form /sys/ ... /net/*.

    What this patch does is to add an additional tag field to the
    sysfs dirent structure. For directories that should show different
    contents depending on the context such as /sys/class/net/, and
    /sys/devices/virtual/net/ this tag field is used to specify the
    context in which those directories should be visible. Effectively
    this is the same as creating multiple distinct directories with
    the same name but internally to sysfs the result is nicer.

    I am calling the concept of a single directory that looks like multiple
    directories all at the same path in the filesystem tagged directories.

    For the networking namespace the set of directories whose contents I need
    to filter with tags can depend on the presence or absence of hotplug
    hardware or which modules are currently loaded. Which means I need
    a simple race free way to setup those directories as tagged.

    To achieve a reace free design all tagged directories are created
    and managed by sysfs itself.

    Users of this interface:
    - define a type in the sysfs_tag_type enumeration.
    - call sysfs_register_ns_types with the type and it's operations
    - sysfs_exit_ns when an individual tag is no longer valid

    - Implement mount_ns() which returns the ns of the calling process
    so we can attach it to a sysfs superblock.
    - Implement ktype.namespace() which returns the ns of a syfs kobject.

    Everything else is left up to sysfs and the driver layer.

    For the network namespace mount_ns and namespace() are essentially
    one line functions, and look to remain that.

    Tags are currently represented a const void * pointers as that is
    both generic, prevides enough information for equality comparisons,
    and is trivial to create for current users, as it is just the
    existing namespace pointer.

    The work needed in sysfs is more extensive. At each directory
    or symlink creating I need to check if the directory it is being
    created in is a tagged directory and if so generate the appropriate
    tag to place on the sysfs_dirent. Likewise at each symlink or
    directory removal I need to check if the sysfs directory it is
    being removed from is a tagged directory and if so figure out
    which tag goes along with the name I am deleting.

    Currently only directories which hold kobjects, and
    symlinks are supported. There is not enough information
    in the current file attribute interfaces to give us anything
    to discriminate on which makes it useless, and there are
    no potential users which makes it an uninteresting problem
    to solve.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Add all of the necessary bioler plate to support
    multiple superblocks in sysfs.

    Signed-off-by: Eric W. Biederman
    Acked-by: Serge Hallyn
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

08 Mar, 2010

2 commits


25 Mar, 2009

3 commits

  • The sysfs_dirent serves as both an inode and a directory entry
    for sysfs. To prevent the sysfs inode numbers from being freed
    prematurely hold a reference to sysfs_dirent from the sysfs inode.

    [akpm@linux-foundation.org: add comment]
    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Cc: Al Viro
    Cc: Cornelia Huck
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • sysfs_get_inode ultimately calls sysfs_count_nlink when the a
    directory inode is fectched. sysfs_count_nlink needs to be
    called under the sysfs_mutex to guard against the unlikely
    but possible scenario that the root directory is changing
    as we are counting the number entries in it, and just in
    general to be consistent.

    Signed-off-by: Eric W. Biederman
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • SYSFS_MAGIC has been added into magic.h, so only use that definition
    in magic.h to avoid potential consistency problem.

    Signed-off-by: Qinghuang Feng
    Signed-off-by: Greg Kroah-Hartman

    Qinghuang Feng
     

17 Oct, 2008

1 commit

  • Support sysfs_notify from atomic context with new sysfs_notify_dirent

    sysfs_notify currently takes sysfs_mutex.
    This means that it cannot be called in atomic context.
    sysfs_mutex is sometimes held over a malloc (sysfs_rename_dir)
    so it can block on low memory.

    In md I want to be able to notify on a sysfs attribute from
    atomic context, and I don't want to block on low memory because I
    could be in the writeout path for freeing memory.

    So:
    - export the "sysfs_dirent" structure along with sysfs_get, sysfs_put
    and sysfs_get_dirent so I can get the sysfs_dirent that I want to
    notify on and hold it in an md structure.
    - split sysfs_notify_dirent out of sysfs_notify so the sysfs_dirent
    can be notified on with no blocking (just a spinlock).

    Signed-off-by: Neil Brown
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Neil Brown
     

30 Apr, 2008

1 commit


17 Oct, 2007

1 commit

  • provide BDI constructor/destructor hooks

    [akpm@linux-foundation.org: compile fix]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

13 Oct, 2007

8 commits

  • Sysfs has gone through considerable amount of reimplementation. Add
    copyrights. Any objections? :-)

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • sysfs_root is different from a regular directory dirent in that it's
    of type SYSFS_ROOT and doesn't have a name. These differences aren't
    used by anybody and only adds to complexity. Make sysfs_root a
    regular directory dirent.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • The only uses of s_dentry left are the code that maintains
    s_dentry and trivial users that don't actually need it.
    So this patch removes the s_dentry maintenance code and
    restructures the trivial uses to use something else.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • This patch modifies the users of sysfs_mount to use sysfs_root
    instead (which is what they are looking for). It then
    makes sysfs_mount static to keep people from using it
    by accident.

    The net result is slightly faster and cleaner code.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Since sysfs no longer stores fs directory information in the dcache
    on a permanent basis kill_litter_super it is inappropriate and actively
    wrong. It will decrement the count on all dentries left in the
    dcache before trying to free them.

    At the moment this is not biting us only because we never unmount sysfs.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • While shadow directories appear to be a good idea, the current scheme
    of controlling their creation and destruction outside of sysfs appears
    to be a locking and maintenance nightmare in the face of sysfs
    directories dynamically coming and going. Which can now occur for
    directories containing network devices when CONFIG_SYSFS_DEPRECATED is
    not set.

    This patch removes everything from the initial shadow directory support
    that allowed the shadow directory creation to be controlled at a higher
    level. So except for a few bits of sysfs_rename_dir everything from
    commit b592fcfe7f06c15ec11774b5be7ce0de3aa86e73 is now gone.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Cleanup semaphore.h

    Signed-off-by: Dave Young
    Signed-off-by: Greg Kroah-Hartman

    Dave Young
     

20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

19 Jul, 2007

1 commit

  • While making sysfs indoes hashed, sysfs root inode was left out. Now
    that nlink accounting depends on the inode being on the hash, sysfs
    root inode nlink isn't adjusted properly.

    Put sysfs root inode on the inode hash by allocating it using
    sysfs_get_inode() like other sysfs inodes. While at it, massage
    comments a bit.

    Signed-off-by: Tejun Heo
    Acked-by: Jean Delvare
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     

12 Jul, 2007

8 commits

  • This patch makes dentries and inodes for sysfs directories
    reclaimable.

    * sysfs_notify() is modified to walk sysfs_dirent tree instead of
    dentry tree.

    * sysfs_update_file() and sysfs_chmod_file() use sysfs_get_dentry() to
    grab the victim dentry.

    * sysfs_rename_dir() and sysfs_move_dir() grab all dentries using
    sysfs_get_dentry() on startup.

    * Dentries for all shadowed directories are pinned in memory to serve
    as lookup start point.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Rename sysfs_dirent->s_type to s_flags, pack type into lower eight
    bits and reserve the rest for flags. sysfs_type() can used to access
    the type. All existing sd->s_type accesses are converted to use
    sysfs_type(). While at it, type test is changed to equality test
    instead of bit-and test where appropriate.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Make sysfs_dirent use singly linked list for its tree structure.
    sysfs_link_sibling() and sysfs_unlink_sibling() functions are added to
    handle simpler cases. It adds some complexity and cpu cycle overhead
    but reduced memory footprint is worthwhile on big machines.

    This change reduces the sizeof sysfs_dirent from 104 to 88 on 64bit
    and from 60 to 52 on 32bit.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • The root sysfs_dirent didn't point to the root dentry fix it.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Reorganize/clean up sysfs_new_inode() and sysfs_create().

    * sysfs_init_inode() is separated out from sysfs_new_inode() and is
    responsible for basic initialization.
    * sysfs_instantiate() replaces the last step of sysfs_create() and is
    responsible for dentry instantitaion.
    * type-specific initialization is moved out to the callers.
    * mode is specified only once when creating a sysfs_dirent.
    * spurious list_del_init(&sd->s_sibling) dropped from create_dir()

    This change is to

    * prepare for inode allocation fix.
    * separate alloc and init code for synchronization update.
    * make dentry/inode initialization more flexible for later changes.

    This patch doesn't introduce visible behavior change.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Now that sysfs_dirent can be disconnected from kobject on deletion,
    there is no need to orphan each attribute files. All [bin_]attribute
    nodes are automatically orphaned when the parent node is deleted.
    Kill attribute file orphaning.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Make sd->s_element a union of sysfs_elem_{dir|symlink|attr|bin_attr}
    and rename it to s_elem. This is to achieve...

    * some level of type checking : changing symlink to point to
    sysfs_dirent instead of kobject is much safer and less painful now.
    * easier / standardized dereferencing
    * allow sysfs_elem_* to contain more than one entry

    Where possible, pointer is obtained by directly deferencing from sd
    instead of going through other entities. This reduces dependencies to
    dentry, inode and kobject. to_attr() and to_bin_attr() are unused now
    and removed.

    This is in preparation of object reference simplification.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Add sysfs_dirent->s_parent. With this patch, each sd points to and
    holds a reference to its parent. This allows walking sysfs tree
    without referencing sd->s_dentry which can go away anytime if the user
    doesn't control when it's deleted.

    sd->s_parent is initialized and parent is referenced in
    sysfs_attach_dirent(). Reference to parent is released when the sd is
    released, so as long as reference to a sd is held, s_parent can be
    followed.

    dentry walk in sysfs_readdir() is convereted to s_parent walk.

    This will be used to reimplement symlink such that it uses only
    sysfs_dirent tree.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     

13 Jun, 2007

1 commit

  • Backport of
    ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

    For regular files in sysfs, sysfs_readdir wants to traverse
    sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
    But, the dentry can be reclaimed under memory pressure, and there is
    no synchronization with readdir. This patch follows Tejun's scheme of
    allocating and storing an inode number in the new s_ino member of a
    sysfs_dirent, when dirents are created, and retrieving it from there
    for readdir, so that the pointer chain doesn't have to be traversed.

    Tejun's upstream patch uses a new-ish "ida" allocator which brings
    along some extra complexity; this -stable patch has a brain-dead
    incrementing counter which does not guarantee uniqueness, but because
    sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
    guaranteed today anyway.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric Sandeen
     

13 Feb, 2007

1 commit


08 Feb, 2007

2 commits

  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*.

    What I want is a separate /sys/class/net directory in sysfs for each
    network namespace, and I want to name each of them /sys/class/net.

    I looked and the VFS actually allows that. All that is needed is
    for /sys/class/net to implement a follow link method to redirect
    lookups to the real directory you want.

    Implementing a follow link method that is sensitive to the current
    network namespace turns out to be 3 lines of code so it looks like a
    clean approach. Modifying sysfs so it doesn't get in my was is a bit
    trickier.

    I am calling the concept of multiple directories all at the same path
    in the filesystem shadow directories. With the directory entry really
    at that location the shadow master.

    The following patch modifies sysfs so it can handle a directory
    structure slightly different from the kobject tree so I can implement
    the shadow directories for handling /sys/class/net/.

    Signed-off-by: Eric W. Biederman
    Cc: Maneesh Soni
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • This patch prevents a race between IO and removing a file from sysfs.
    It introduces a list of sysfs_buffers associated with a file at the inode.
    Upon removal of a file the list is walked and the buffers marked orphaned.
    IO to orphaned buffers fails with -ENODEV. The driver can safely free
    associated data structures or be unloaded.

    Signed-off-by: Oliver Neukum
    Acked-by: Maneesh Soni
    Signed-off-by: Greg Kroah-Hartman

    Oliver Neukum
     

08 Dec, 2006

1 commit

  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
    quilt add $file
    sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
    mv /tmp/$$ $file
    quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

01 Oct, 2006

1 commit