23 Oct, 2010

2 commits

  • bb->vm_ops is a cached copy of the vm_ops of the underlying
    sysfs bin file, which means that after sysfs_remove_bin_file
    completes it is no longer valid to dereference bb->vm_ops.

    So move all of the tests of bb->vm_ops inside of where
    we hold the sysfs active lock.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
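
A minimal userspace sketch of the rule described above: the cached ops pointer is only examined while an active reference is held. Every name here (`bin_buffer_model`, `bb_get_active`, `MODEL_SIGBUS`, and so on) is a hypothetical stand-in and not the kernel's API; the real code works on `struct vm_area_struct` and the sysfs active reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; names are illustrative. */
struct vm_ops_model {
    int (*fault)(void *vma);
};

struct bin_buffer_model {
    const struct vm_ops_model *vm_ops; /* cached copy, stale after removal */
    bool removed;                      /* set once the bin file is removed */
    int active;                        /* active references currently held */
};

#define MODEL_SIGBUS (-1)

/* Take an active reference; fails once the file has been removed. */
bool bb_get_active(struct bin_buffer_model *bb)
{
    if (bb->removed)
        return false;
    bb->active++;
    return true;
}

void bb_put_active(struct bin_buffer_model *bb)
{
    assert(bb->active > 0);
    bb->active--;
}

/* Wrapper: bb->vm_ops is tested only inside the get/put pair. */
int bin_fault_model(struct bin_buffer_model *bb, void *vma)
{
    int ret = MODEL_SIGBUS;

    if (!bb_get_active(bb))
        return MODEL_SIGBUS;      /* file is gone: never look at bb->vm_ops */

    if (bb->vm_ops && bb->vm_ops->fault)
        ret = bb->vm_ops->fault(vma);

    bb_put_active(bb);
    return ret;
}

/* Trivial underlying handler for experimentation. */
int demo_fault(void *vma)
{
    (void)vma;
    return 0;
}
```

With a live file, `bin_fault_model()` forwards to `demo_fault()`; once `removed` is set, it returns `MODEL_SIGBUS` without reading the cached pointer at all.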
     
  • It is not reasonably possible to wrap vma->close(). To correctly
    wrap close would imply calling close on any vmas that remain when
    sysfs_remove_bin_file is called. Finding the proper lists, walking
    them, getting the locking right, etc. requires deep knowledge of the
    mm subsystem and as such would require assistance from the mm
    subsystem to implement. That assistance does not currently exist.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

22 May, 2010

2 commits

  • This allows bin_attr->read,write,mmap callbacks to check file specific data
    (such as inode owner) as part of any privilege validation.

    Signed-off-by: Chris Wright
    Signed-off-by: Greg Kroah-Hartman

    Chris Wright
     
  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
    potentially a few other directories of the form /sys/ ... /net/*.

    What this patch does is to add an additional tag field to the
    sysfs dirent structure. For directories that should show different
    contents depending on the context such as /sys/class/net/, and
    /sys/devices/virtual/net/ this tag field is used to specify the
    context in which those directories should be visible. Effectively
    this is the same as creating multiple distinct directories with
    the same name but internally to sysfs the result is nicer.

    I am calling the concept of a single directory that looks like multiple
    directories all at the same path in the filesystem tagged directories.

    For the networking namespace the set of directories whose contents I need
    to filter with tags can depend on the presence or absence of hotplug
    hardware or which modules are currently loaded. Which means I need
    a simple race free way to setup those directories as tagged.

    To achieve a race-free design all tagged directories are created
    and managed by sysfs itself.

    Users of this interface:
    - define a type in the sysfs_tag_type enumeration.
    - call sysfs_register_ns_types with the type and its operations
    - sysfs_exit_ns when an individual tag is no longer valid

    - Implement mount_ns() which returns the ns of the calling process
    so we can attach it to a sysfs superblock.
    - Implement ktype.namespace() which returns the ns of a sysfs kobject.

    Everything else is left up to sysfs and the driver layer.

    For the network namespace mount_ns and namespace() are essentially
    one line functions, and look to remain that.

    Tags are currently represented as const void * pointers as that is
    both generic, provides enough information for equality comparisons,
    and is trivial to create for current users, as it is just the
    existing namespace pointer.

    The work needed in sysfs is more extensive. At each directory
    or symlink creation I need to check if the directory it is being
    created in is a tagged directory and if so generate the appropriate
    tag to place on the sysfs_dirent. Likewise at each symlink or
    directory removal I need to check if the sysfs directory it is
    being removed from is a tagged directory and if so figure out
    which tag goes along with the name I am deleting.

    Currently only directories which hold kobjects, and
    symlinks are supported. There is not enough information
    in the current file attribute interfaces to give us anything
    to discriminate on which makes it useless, and there are
    no potential users which makes it an uninteresting problem
    to solve.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
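
The tag idea can be sketched in plain C as a lookup keyed on both a name and an opaque tag. This is only an illustration under assumptions: `dirent_model` and `find_tagged` are hypothetical names, the directory is a flat array, and the tags are arbitrary addresses standing in for namespace pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified dirent: a name plus an opaque tag. */
struct dirent_model {
    const char *name;
    const void *tag;   /* e.g. a namespace pointer; only compared, never dereferenced */
};

/*
 * Find the entry called `name` that is visible in the context `tag`.
 * Two entries may share a name as long as their tags differ.
 */
const struct dirent_model *find_tagged(const struct dirent_model *entries,
                                       size_t n, const char *name,
                                       const void *tag)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (entries[i].tag != tag)
            continue;                       /* belongs to another context */
        if (strcmp(entries[i].name, name) == 0)
            return &entries[i];
    }
    return NULL;
}
```

Because the tag is a `const void *` used purely for equality, two entries named `eth0` with different tags behave like entries in two separate directories at the same path.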
     

08 Mar, 2010

1 commit

  • It turns out that holding an active reference on a directory is
    pointless. The purpose of the active references is to allow us to
    block when removing sysfs entries that have custom methods so we don't
    remove modules while running modular code and to keep those custom
    methods from accessing data structures after the files have been
    removed. Further, sysfs_remove_dir removes all elements in the
    directory before removing the directory itself, so there is no chance
    we will remove a directory with active children.

    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

24 Dec, 2009

1 commit


28 Sep, 2009

1 commit


09 Jul, 2009

1 commit


21 Apr, 2009

1 commit


01 Apr, 2009

1 commit

  • Fix warnings and return values in sysfs bin_page_mkwrite(), fixing
    fs/sysfs/bin.c: In function `bin_page_mkwrite':
    fs/sysfs/bin.c:250: warning: passing argument 2 of `bb->vm_ops->page_mkwrite' from incompatible pointer type
    fs/sysfs/bin.c: At top level:
    fs/sysfs/bin.c:280: warning: initialization from incompatible pointer type

    Expects to have my [PATCH next] sysfs: fix some bin_vm_ops errors

    Signed-off-by: Hugh Dickins
    Cc: Nick Piggin
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

25 Mar, 2009

2 commits

  • Commit 86c9508eb1c0ce5aa07b5cf1d36b60c54efc3d7a
    "sysfs: don't block indefinitely for unmapped files" in linux-next
    crashes the PowerMac G5 when X starts up. It's caught out by the way
    powerpc's pci_mmap of legacy_mem uses shmem_zero_setup(), substituting
    a new vma->vm_file whose private_data no longer points to the bin_buffer
    (substitution done because some versions of X crash if that mmap fails).

    The fix to this is straightforward: the original vm_file is fput() in
    that case, so this mmap won't block sysfs at all, so just don't switch
    over to bin_vm_ops if vm_file has changed.

    But more fixes were made before realizing that was the problem:

    It should not be an error if bin_page_mkwrite() finds no underlying
    page_mkwrite().

    Check that a file already mmap'ed has the same underlying vm_ops
    _before_ pointing vma->vm_ops at bin_vm_ops.

    If the file being mmap'ed is a shmem/tmpfs file, don't fail the mmap
    on CONFIG_NUMA=y, just because that has a set_policy and get_policy:
    provide bin_set_policy, bin_get_policy and bin_migrate.

    Signed-off-by: Hugh Dickins
    Acked-by: Eric Biederman
    Signed-off-by: Greg Kroah-Hartman

    Hugh Dickins
     
  • Modify sysfs bin files so that we can remove the bin file while they are
    still mapped. When the kobject is removed we unmap the bin file and
    arrange for future accesses to the mapping to receive SIGBUS.

    Implementing this prevents a nasty DoS when PCI devices are hot plugged
    and unplugged: if any of their resources were mmapped, the kernel
    could not free up their PCI resources or release their PCI data
    structures.

    [akpm@linux-foundation.org: remove unused var]
    Signed-off-by: Eric W. Biederman
    Cc: Jesse Barnes
    Acked-by: Tejun Heo
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     

21 Jan, 2009

1 commit

  • Some sysfs binary files don't like having 0 passed to them as a size.
    Fix this up at the root by just returning to the vfs if userspace asks
    us for a zero sized buffer.

    Thanks to Pavel Roskin for pointing this out.

    Reported-by: Pavel Roskin
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
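
The early return described above might look like the following userspace sketch. The callback type `attr_read_fn`, the counter `backend_calls` and the function `bin_read_model` are hypothetical names chosen for illustration; the real read path has more steps.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical attribute callback type. */
typedef ssize_t (*attr_read_fn)(char *buf, long long off, size_t count);

/* Counts how often the backend was actually invoked. */
int backend_calls;

/* Demo backend: fills the buffer and reports how much it produced. */
ssize_t demo_backend(char *buf, long long off, size_t count)
{
    (void)off;
    backend_calls++;
    for (size_t i = 0; i < count; i++)
        buf[i] = 'x';
    return (ssize_t)count;
}

/* Generic read path: a zero-sized request never reaches the backend. */
ssize_t bin_read_model(attr_read_fn backend, char *buf,
                       long long off, size_t count)
{
    assert(backend != NULL);
    if (count == 0)
        return 0;
    return backend(buf, off, count);
}
```

Handling the zero case once in the generic path means individual attribute methods never see a size of 0.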
     

17 Oct, 2008

1 commit

  • On Thu, Sep 11, 2008 at 10:27:10AM +0200, Ingo Molnar wrote:

    > and it's working fine on most boxes. One testbox found this new locking
    > scenario:
    >
    > PM: Adding info for No Bus:vcsa7
    > EDAC DEBUG: MC0: i82860_check()
    >
    > =======================================================
    > [ INFO: possible circular locking dependency detected ]
    > 2.6.27-rc6-tip #1
    > -------------------------------------------------------
    > X/4873 is trying to acquire lock:
    > (&bb->mutex){--..}, at: [] mmap+0x40/0xa0
    >
    > but task is already holding lock:
    > (&mm->mmap_sem){----}, at: [] sys_mmap2+0x8e/0xc0
    >
    > which lock already depends on the new lock.
    >
    >
    > the existing dependency chain (in reverse order) is:
    >
    > -> #1 (&mm->mmap_sem){----}:
    > [] validate_chain+0xa96/0xf50
    > [] __lock_acquire+0x2cb/0x5b0
    > [] lock_acquire+0x89/0xc0
    > [] might_fault+0x6b/0x90
    > [] copy_to_user+0x38/0x60
    > [] read+0xfb/0x170
    > [] vfs_read+0x95/0x110
    > [] sys_pread64+0x63/0x80
    > [] sysenter_do_call+0x12/0x43
    > [] 0xffffffff
    >
    > -> #0 (&bb->mutex){--..}:
    > [] validate_chain+0x6b7/0xf50
    > [] __lock_acquire+0x2cb/0x5b0
    > [] lock_acquire+0x89/0xc0
    > [] __mutex_lock_common+0xab/0x3c0
    > [] mutex_lock_nested+0x38/0x50
    > [] mmap+0x40/0xa0
    > [] mmap_region+0x14e/0x450
    > [] do_mmap_pgoff+0x2ef/0x310
    > [] sys_mmap2+0xad/0xc0
    > [] sysenter_do_call+0x12/0x43
    > [] 0xffffffff
    >
    > other info that might help us debug this:
    >
    > 1 lock held by X/4873:
    > #0: (&mm->mmap_sem){----}, at: [] sys_mmap2+0x8e/0xc0
    >
    > stack backtrace:
    > Pid: 4873, comm: X Not tainted 2.6.27-rc6-tip #1
    > [] print_circular_bug_tail+0x79/0xc0
    > [] validate_chain+0x6b7/0xf50
    > [] ? trace_hardirqs_off_caller+0x15/0xb0
    > [] __lock_acquire+0x2cb/0x5b0
    > [] lock_acquire+0x89/0xc0
    > [] ? mmap+0x40/0xa0
    > [] __mutex_lock_common+0xab/0x3c0
    > [] ? mmap+0x40/0xa0
    > [] mutex_lock_nested+0x38/0x50
    > [] ? mmap+0x40/0xa0
    > [] mmap+0x40/0xa0
    > [] mmap_region+0x14e/0x450
    > [] ? arch_get_unmapped_area_topdown+0xf8/0x160
    > [] do_mmap_pgoff+0x2ef/0x310
    > [] sys_mmap2+0xad/0xc0
    > [] sysenter_do_call+0x12/0x43
    > [] ? __switch_to+0x130/0x220
    > =======================
    > evbug.c: Event. Dev: input3, Type: 20, Code: 0, Value: 500
    > warning: `sudo' uses deprecated v2 capabilities in a way that may be insecure.
    >
    > i've attached the config.
    >
    > at first sight it looks like a genuine bug in fs/sysfs/bin.c?

    Yes, it is a real bug by the looks. bin.c takes bb->mutex under mmap_sem
    when it is mmapped, and then does its copy_*_user under bb->mutex too.

    Here is a basic fix for the sysfs lor.

    From: Nick Piggin
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Nick Piggin
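
One way to break such a dependency is to keep the user copy outside the buffer lock by going through a private temporary buffer. The sketch below models that idea in userspace under assumptions: the lock is a plain flag, `copy_to_user_model` stands in for copy_to_user(), and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 16

struct bb_model {
    bool locked;            /* stands in for bb->mutex */
    char buffer[BUF_SIZE];  /* per-open-file buffer protected by the lock */
};

void bb_lock(struct bb_model *bb)   { assert(!bb->locked); bb->locked = true; }
void bb_unlock(struct bb_model *bb) { assert(bb->locked);  bb->locked = false; }

/* Stand-in for copy_to_user(): may fault, so it must not run under the lock. */
void copy_to_user_model(const struct bb_model *bb, char *dst,
                        const char *src, size_t n)
{
    assert(!bb->locked);
    memcpy(dst, src, n);
}

/* Read path: produce data under the lock, copy it out after unlocking. */
long read_model(struct bb_model *bb, char *ubuf, size_t count)
{
    char *temp;

    if (count > BUF_SIZE)
        count = BUF_SIZE;
    temp = malloc(count ? count : 1);
    if (!temp)
        return -1;

    bb_lock(bb);
    memset(bb->buffer, 'd', count);    /* plays the role of the attribute read */
    memcpy(temp, bb->buffer, count);   /* snapshot into the private buffer */
    bb_unlock(bb);

    copy_to_user_model(bb, ubuf, temp, count);  /* lock no longer held */
    free(temp);
    return (long)count;
}
```

The extra copy costs a little, but the lock that mmap takes under mmap_sem is never held across an operation that may itself need mmap_sem.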
     

13 Oct, 2007

5 commits

  • Sysfs has gone through a considerable amount of reimplementation. Add
    copyrights. Any objections? :-)

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Make s_elem an anonymous union. Prefixing with s_elem makes things
    needlessly longer without any advantage.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • All bin attr operations require active references of itself and its
    parent. There's no reason to allow open when its parent has been
    deactivated and allowing it is inconsistent with regular sysfs file.
    Use sysfs_get_active_two() in bin attribute open function.

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • There's no reason to get an extra reference to sysfs_dirent for an
    open file. Open file has a reference to the dentry which in turn has
    a reference to sysfs_dirent. This is fairly obvious as otherwise open
    itself won't be able to access the sysfs_dirent. Kill the extra
    sysfs_get() and matching sysfs_put().

    Signed-off-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Cleanup semaphore.h

    Signed-off-by: Dave Young
    Signed-off-by: Greg Kroah-Hartman

    Dave Young
     

23 Aug, 2007

1 commit

  • This patch (as960) removes the error message and stack dump logged by
    sysfs_remove_bin_file() when someone tries to remove a nonexistent
    file. The warning doesn't seem to be needed, since none of the other
    file-, symlink-, or directory-removal routines in sysfs complain in a
    comparable way.

    Signed-off-by: Alan Stern
    Acked-by: Tejun Heo
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Alan Stern
     

12 Jul, 2007

7 commits

  • Well, first of all, I don't want to change so many files either.

    What I do:
    Adding a new parameter "struct bin_attribute *" in the
    .read/.write methods for the sysfs binary attributes.

    In fact, only the four lines change in fs/sysfs/bin.c and
    include/linux/sysfs.h do the real work.
    But I have to update all the files that use binary attributes
    to make them compatible with the new .read and .write methods.
    I'm not sure if I missed any. :(

    Why I do this:
    For a sysfs attribute, we can get a pointer pointing to the
    struct attribute in the .show/.store method,
    while we can't do this for the binary attributes.
    I don't know why this is different, but this does make it not
    so handy to use the binary attributes as the regular ones.
    So I think this patch is reasonable. :)

    Who benefits from it:
    The patch that exposes ACPI tables in sysfs
    requires such an improvement.
    All the table binary attributes share the same .read method.
    Parameter "struct bin_attribute *" is used to get
    the table signature and instance number which are used to
    distinguish different ACPI table binary attributes.

    Without this parameter, we need to offer different .read methods
    for different ACPI table binary attributes.
    This is impossible as there are various ACPI tables on different
    platforms, and we don't know what they are until they are loaded.

    Signed-off-by: Zhang Rui
    Signed-off-by: Greg Kroah-Hartman

    Zhang Rui
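
To illustrate why the extra parameter helps, here is a userspace sketch of one read method shared by several attributes, which recovers its per-table data from the attribute pointer. `bin_attr_model`, `table_attr`, `container_of_model` and `table_read` are hypothetical names, not the kernel or ACPI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct bin_attribute. */
struct bin_attr_model {
    const char *name;
    size_t size;
};

/* Hypothetical per-table wrapper that embeds the attribute. */
struct table_attr {
    struct bin_attr_model attr;
    char signature[5];      /* table signature, NUL terminated */
    int instance;           /* distinguishes tables sharing a signature */
    const char *data;       /* table contents, attr.size bytes */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_model(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* A single read method serving every table attribute. */
long table_read(struct bin_attr_model *attr, char *buf,
                size_t off, size_t count)
{
    struct table_attr *t = container_of_model(attr, struct table_attr, attr);
    size_t len = attr->size;

    assert(buf != NULL);
    if (off >= len)
        return 0;
    if (count > len - off)
        count = len - off;
    memcpy(buf, t->data + off, count);
    return (long)count;
}
```

Without the attribute pointer the method would have no way to tell which table it was asked about, so each table would need its own function.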
     
  • As kobj sysfs dentries and inodes are gonna be made reclaimable,
    dentry can't be used as naming token for sysfs file/directory, replace
    kobj->dentry with kobj->sd. The only external interface change is
    shadow directory handling. All other changes are contained in kobj
    and sysfs.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • sysfs is now completely out of driver/module lifetime game. After
    deletion, a sysfs node doesn't access anything outside sysfs proper,
    so there's no reason to hold onto the attribute owners. Note that
    often the wrong modules were accounted for as owners leading to
    accessing removed modules.

    This patch kills now unnecessary attribute->owner. Note that with
    this change, userland holding a sysfs node does not prevent the
    backing module from being unloaded.

    For more info regarding lifetime rule cleanup, please read the
    following message.

    http://article.gmane.org/gmane.linux.kernel/510293

    (tweaked by Greg to not delete the field just yet, to make it easier to
    merge things properly.)

    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Cc: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • sysfs: implement sysfs_dirent active reference and immediate disconnect

    Opening a sysfs node references its associated kobject, so userland
    can arbitrarily prolong lifetime of a kobject which complicates
    lifetime rules in drivers. This patch implements active reference and
    makes the association between kobject and sysfs immediately breakable.

    Now each sysfs_dirent has two reference counts - s_count and s_active.
    s_count is a regular reference count which guarantees that the
    containing sysfs_dirent is accessible. As long as s_count reference
    is held, all sysfs internal fields in sysfs_dirent are accessible
    including s_parent and s_name.

    The newly added s_active is active reference count. This is acquired
    by invoking sysfs_get_active() and it's the caller's responsibility to
    ensure sysfs_dirent itself is accessible (should be holding s_count
    one way or the other). Dereferencing sysfs_dirent to access objects
    out of sysfs proper requires active reference. This includes access
    to the associated kobjects, attributes and ops.

    The active references can be drained and denied by calling
    sysfs_deactivate(). All active sysfs_dirents must be deactivated
    after deletion but before the default reference is dropped. This
    enables immediate disconnect of sysfs nodes. Once a sysfs_dirent is
    deleted, it won't access any entity external to sysfs proper.

    Because attr/bin_attr ops access both the node itself and its parent
    for kobject, they need to hold active references to both.
    sysfs_get/put_active_two() helpers are provided to help grabbing both
    references. Parent's is acquired first and released last.

    Unlike other operations, mmapped area lingers on after mmap() is
    finished and the module implementing it and kobj need to
    stay referenced till all the mapped pages are gone. This is
    accomplished by holding one set of active references to the bin_attr
    and its parent if there have been any mmap during lifetime of an
    openfile. The references are dropped when the openfile is released.

    This change makes sysfs lifetime rules independent from both kobject's
    and module's. It not only fixes several race conditions caused by
    sysfs not holding onto the proper module when referencing kobject, but
    also helps fixing and simplifying lifetime management in driver model
    and drivers by taking sysfs out of the equation.

    Please read the following message for more info.

    http://article.gmane.org/gmane.linux.kernel/510293

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
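
The two-counter scheme can be modelled roughly as follows. This is a single-threaded sketch under assumptions: the real counters are atomic and the remover blocks until drained, and the names `sd_model`, `sd_get_active`, `sd_put_active`, `sd_deactivate` and `DEACTIVATED_BIAS` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Adding this bias makes the active count negative, which denies new users. */
#define DEACTIVATED_BIAS INT_MIN

struct sd_model {
    int s_count;   /* plain refcount: keeps the structure itself alive */
    int s_active;  /* active refcount: negative once deactivated */
};

/* Needed before touching anything outside sysfs proper (kobject, attr, ops). */
bool sd_get_active(struct sd_model *sd)
{
    assert(sd->s_count > 0);        /* caller must already hold s_count */
    if (sd->s_active < 0)
        return false;               /* deactivated: access denied */
    sd->s_active++;
    return true;
}

void sd_put_active(struct sd_model *sd)
{
    /* There must be at least one active reference to drop. */
    assert(sd->s_active != 0 && sd->s_active != DEACTIVATED_BIAS);
    sd->s_active--;
}

/*
 * Deny new active references. Returns true once all existing ones have
 * been dropped, i.e. the node is fully disconnected.
 */
bool sd_deactivate(struct sd_model *sd)
{
    if (sd->s_active >= 0)
        sd->s_active += DEACTIVATED_BIAS;
    return sd->s_active == DEACTIVATED_BIAS;
}
```

A holder of `s_count` alone can still read sysfs-internal fields after deactivation; only the calls that reach outside sysfs are gated on `sd_get_active()`.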
     
  • Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE
    buffer to properly synchronize accesses to per-openfile buffer and
    prepare for immediate-kobj-disconnect.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • Make sd->s_element a union of sysfs_elem_{dir|symlink|attr|bin_attr}
    and rename it to s_elem. This is to achieve...

    * some level of type checking : changing symlink to point to
    sysfs_dirent instead of kobject is much safer and less painful now.
    * easier / standardized dereferencing
    * allow sysfs_elem_* to contain more than one entry

    Where possible, pointer is obtained by directly dereferencing from sd
    instead of going through other entities. This reduces dependencies to
    dentry, inode and kobject. to_attr() and to_bin_attr() are unused now
    and removed.

    This is in preparation of object reference simplification.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
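
A typed union of per-kind elements could be laid out roughly like this. The field names inside each `elem_*` struct, the `sd_type` enum and the helper `attr_name_of` are hypothetical; only the overall shape (one union member per node kind, symlinks pointing at another dirent) follows the description above.

```c
#include <assert.h>
#include <stddef.h>

struct dirent_model;   /* forward declaration for the symlink target */

enum sd_type { SD_DIR, SD_SYMLINK, SD_ATTR, SD_BIN_ATTR };

/* One small struct per node kind; each may hold more than one entry. */
struct elem_dir      { const void *kobj; };
struct elem_symlink  { const struct dirent_model *target_sd; };
struct elem_attr     { const char *attr_name; };
struct elem_bin_attr { const char *attr_name; size_t size; };

struct dirent_model {
    enum sd_type type;
    union {
        struct elem_dir      dir;
        struct elem_symlink  symlink;
        struct elem_attr     attr;
        struct elem_bin_attr bin_attr;
    } s_elem;
};

/* Typed access: the attribute name is read straight from the dirent. */
const char *attr_name_of(const struct dirent_model *sd)
{
    assert(sd != NULL);
    switch (sd->type) {
    case SD_ATTR:
        return sd->s_elem.attr.attr_name;
    case SD_BIN_ATTR:
        return sd->s_elem.bin_attr.attr_name;
    default:
        return NULL;    /* directories and symlinks carry no attribute */
    }
}
```

Compared with a bare `void *`, a mismatch such as assigning a kobject pointer to the symlink target is caught by the compiler.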
     
  • Error handling in fs/sysfs/bin.c:write() was wrong because size_t
    count is used to receive return value from flush_write() which is
    negative on failure.

    This patch updates write() such that int variable is used instead.
    read() is updated the same way for consistency.

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
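
The class of bug described above can be reproduced in a few lines. In this sketch `flush_write_model` is a hypothetical stand-in for flush_write(), and `MODEL_EIO` is an arbitrary error value; the two helpers differ only in the type of the variable that receives the result.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_EIO 5

/* Stand-in for flush_write(): bytes written, or a negative error code. */
int flush_write_model(size_t count, int fail)
{
    assert(count <= INT_MAX);
    return fail ? -MODEL_EIO : (int)count;
}

/*
 * Buggy shape: the result lands in a size_t, so a negative error wraps
 * to a huge positive value and looks like progress.
 * Returns nonzero if the caller would advance the file offset.
 */
int advances_offset_unsigned(size_t count, int fail)
{
    size_t ret = (size_t)flush_write_model(count, fail);
    return ret > 0;
}

/* Fixed shape: an int keeps the sign, so the error is visible. */
int advances_offset_signed(size_t count, int fail)
{
    int ret = flush_write_model(count, fail);
    return ret > 0;
}
```

On failure the unsigned variant still reports progress, while the signed variant does not.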
     

03 May, 2007

1 commit


28 Apr, 2007

1 commit

  • fs/sysfs/bin.c: In function 'read':
    fs/sysfs/bin.c:77: warning: format '%zd' expects type 'signed size_t', but argument 4 has type 'int'

    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Andrew Morton
     

08 Feb, 2007

2 commits


09 Dec, 2006

1 commit


26 Sep, 2006

1 commit

  • Make sysfs_remove_bin_file() void. If it detects an error,
    printk the file name and call dump_stack().

    sysfs_hash_and_remove() now returns an error code indicating
    its success or failure so that sysfs_remove_bin_file() can
    know success/failure.

    Convert the only driver that checked the return value of
    sysfs_remove_bin_file().

    Signed-off-by: Randy Dunlap
    Signed-off-by: Greg Kroah-Hartman

    Randy.Dunlap
     

29 Mar, 2006

1 commit

  • This is a conversion to make the various file_operations structs in fs/
    const. Basically a regexp job, with a few manual fixups

    The goal is both to increase correctness (harder to accidentally write to
    shared datastructures) and reducing the false sharing of cachelines with
    things that get dirty in .data (while .rodata is nicely read only and thus
    cache clean)

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

21 Jun, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds