29 May, 2012

1 commit

  • Pull writeback tree from Wu Fengguang:
    "Mainly from Jan Kara to avoid iput() in the flusher threads."

    * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
    writeback: Avoid iput() from flusher thread
    vfs: Rename end_writeback() to clear_inode()
    vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
    writeback: Refactor writeback_single_inode()
    writeback: Remove wb->list_lock from writeback_single_inode()
    writeback: Separate inode requeueing after writeback
    writeback: Move I_DIRTY_PAGES handling
    writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
    writeback: Move clearing of I_SYNC into inode_sync_complete()
    writeback: initialize global_dirty_limit
    fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
    mm: page-writeback.c: local functions should not be exposed globally

    Linus Torvalds
     

16 May, 2012

1 commit


06 May, 2012

1 commit

  • After we moved inode_sync_wait() from end_writeback() it doesn't make sense
    to call the function end_writeback() anymore. Rename it to clear_inode()
    which describes well what the function really does: setting the I_CLEAR flag.

    Signed-off-by: Jan Kara
    Signed-off-by: Fengguang Wu

    Jan Kara
     

09 Mar, 2012

1 commit


25 Feb, 2012

1 commit

  • This patch fixes the following two memory leak patterns reported by kmemleak.
    sysfs_sd_setsecdata() is called during a sys_lsetxattr() operation.
    It checks whether sd->s_iattr is NULL and, if so, calls
    sysfs_init_inode_attrs() to allocate memory.
    The code in question is:

    iattrs = sd->s_iattr;
    if (!iattrs)
            iattrs = sysfs_init_inode_attrs(sd);

    iattrs receives the result of sysfs_init_inode_attrs(), but sd->s_iattr
    never learns the address. The address therefore needs to be stored in
    sd->s_iattr so that the memory can be freed in another function.

    unreferenced object 0xffff880250b73e60 (size 32):
    comm "systemd", pid 1, jiffies 4294683888 (age 94.553s)
    hex dump (first 32 bytes):
    73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f system_u:object_
    72 3a 73 79 73 66 73 5f 74 3a 73 30 00 00 00 00 r:sysfs_t:s0....
    backtrace:
    [] kmemleak_alloc+0x73/0x98
    [] __kmalloc+0x100/0x12c
    [] context_struct_to_string+0x106/0x210
    [] security_sid_to_context_core+0x10b/0x129
    [] security_sid_to_context+0x10/0x12
    [] selinux_inode_getsecurity+0x7d/0xa8
    [] selinux_inode_getsecctx+0x22/0x2e
    [] security_inode_getsecctx+0x16/0x18
    [] sysfs_setxattr+0x96/0x117
    [] __vfs_setxattr_noperm+0x73/0xd9
    [] vfs_setxattr+0x83/0xa1
    [] setxattr+0xcf/0x101
    [] sys_lsetxattr+0x6a/0x8f
    [] system_call_fastpath+0x16/0x1b
    [] 0xffffffffffffffff
    unreferenced object 0xffff88024163c5a0 (size 96):
    comm "systemd", pid 1, jiffies 4294683888 (age 94.553s)
    hex dump (first 32 bytes):
    00 00 00 00 ed 41 00 00 00 00 00 00 00 00 00 00 .....A..........
    00 00 00 00 00 00 00 00 0c 64 42 4f 00 00 00 00 .........dBO....
    backtrace:
    [] kmemleak_alloc+0x73/0x98
    [] kmem_cache_alloc_trace+0xc4/0xee
    [] sysfs_init_inode_attrs+0x2a/0x83
    [] sysfs_setxattr+0xbf/0x117
    [] __vfs_setxattr_noperm+0x73/0xd9
    [] vfs_setxattr+0x83/0xa1
    [] setxattr+0xcf/0x101
    [] sys_lsetxattr+0x6a/0x8f
    [] system_call_fastpath+0x16/0x1b
    [] 0xffffffffffffffff

    Signed-off-by: Masami Ichikawa
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Masami Ichikawa
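
    The leak pattern described in this commit can be illustrated with a small
    userspace sketch. The structure and function names below (dirent_node,
    init_inode_attrs, get_attrs, free_attrs) are hypothetical stand-ins, not the
    kernel's definitions, and this is not necessarily how the actual fix was
    written. It only shows why the allocated address has to be stored back.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct inode_attrs {
    unsigned int mode;
};

struct dirent_node {
    struct inode_attrs *s_iattr;   /* optional, allocated on first use */
};

/* Allocates an attribute block; the caller must remember the pointer. */
struct inode_attrs *init_inode_attrs(void)
{
    return calloc(1, sizeof(struct inode_attrs));
}

/* Returns the attribute block of sd, allocating it on first use and
 * recording the address in sd so that free_attrs() can release it later. */
struct inode_attrs *get_attrs(struct dirent_node *sd)
{
    struct inode_attrs *iattrs;

    assert(sd != NULL);
    iattrs = sd->s_iattr;
    if (!iattrs) {
        iattrs = init_inode_attrs();
        if (!iattrs)
            return NULL;
        sd->s_iattr = iattrs;   /* without this line the block is leaked */
    }
    return iattrs;
}

/* Releases the attribute block reachable from sd, if any. */
void free_attrs(struct dirent_node *sd)
{
    free(sd->s_iattr);
    sd->s_iattr = NULL;
}
```

    With the assignment in place, a second call to get_attrs() returns the same
    block instead of allocating a new one.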
     

03 Feb, 2012

1 commit


25 Jan, 2012

2 commits

  • Tracking the number of subdirectories requires an extra field that increases
    the size of sysfs_dirent. nlinks are not particularly interesting for sysfs
    and the nlink counts are wrong when network namespaces are involved so stop
    counting them, and always return nlink == 1. Userspace already knows that
    directories with nlink == 1 have an nlink count they can't use to count
    subdirectories.

    This reduces the size of sysfs_dirent by 8 bytes on 64bit platforms.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
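
    As a rough illustration of the userspace convention this commit relies on,
    the sketch below derives a subdirectory count from a link count and reports
    "unknown" when nlink is less than 2. The helper name subdirs_from_nlink is
    hypothetical, and the rule shown is the conventional one for Unix
    directories, not code taken from any particular tool.

```c
#include <assert.h>

/* Returns the number of subdirectories implied by a directory's link
 * count, or -1 when the count cannot be used for that purpose. */
long subdirs_from_nlink(unsigned long nlink)
{
    /* A conventional directory has two links ("." and its entry in the
     * parent) plus one ".." link for each subdirectory. */
    if (nlink < 2)
        return -1;
    return (long)(nlink - 2);
}
```

    A directory that always reports nlink == 1 therefore falls into the
    "unknown" case, and a caller has to read the directory to count its
    subdirectories.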
     
  • Recently an OOPS was observed from the usb serial io_ti driver when it tried to remove
    sysfs directories. Upon investigation it turns out this driver was always buggy
    and that a recent sysfs change had stopped guarding itself against removing attributes
    from sysfs directories that had already been removed. :(

    Historically we have been silent about attempting to remove files from nonexistent sysfs
    directories and have politely returned error codes. That has resulted in people writing
    broken code that ignores the error codes.

    Issue a kernel WARNING and a stack backtrace to make it clear in no uncertain
    terms that abusing sysfs is not ok, and the callers need to fix their code.

    This change transforms the io_ti OOPS into a more comprehensible error message
    and stack backtrace.

    Signed-off-by: Eric W. Biederman
    Reported-by: Wolfgang Frisch
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
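
    The behaviour described in this commit, warning loudly instead of quietly
    returning an error code, might look like the following userspace sketch. The
    names dir_node, remove_file and warnings_issued are hypothetical, and the
    fprintf() call only stands in for the kernel's warning and backtrace.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical directory record: 'removed' is set once it is torn down. */
struct dir_node {
    int removed;
    int nfiles;
};

/* Counts warnings so that callers (and tests) can observe them. */
static int warnings_issued;

/* Removes one file from dir. Removing from a missing or already removed
 * directory is reported loudly and still returns an error code. */
int remove_file(struct dir_node *dir, const char *name)
{
    if (dir == NULL || dir->removed) {
        fprintf(stderr, "WARNING: removing '%s' from a removed directory\n",
                name);
        warnings_issued++;
        return -ENOENT;
    }
    if (dir->nfiles == 0)
        return -ENOENT;
    dir->nfiles--;
    return 0;
}
```

    The error code is unchanged. The warning is what makes a caller that
    ignores the return value visible.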
     

04 Jan, 2012

1 commit


02 Nov, 2011

1 commit


25 Oct, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits)
    dp83640: free packet queues on remove
    dp83640: use proper function to free transmit time stamping packets
    ipv6: Do not use routes from locally generated RAs
    [PATCH net-next] tg3: add tx_dropped counter
    be2net: don't create multiple RX/TX rings in multi channel mode
    be2net: don't create multiple TXQs in BE2
    be2net: refactor VF setup/teardown code into be_vf_setup/clear()
    be2net: add vlan/rx-mode/flow-control config to be_setup()
    net_sched: cls_flow: use skb_header_pointer()
    ipv4: avoid useless call of the function check_peer_pmtu
    TCP: remove TCP_DEBUG
    net: Fix driver name for mdio-gpio.c
    ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
    rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces
    ipv4: fix ipsec forward performance regression
    jme: fix irq storm after suspend/resume
    route: fix ICMP redirect validation
    net: hold sock reference while processing tx timestamps
    tcp: md5: add more const attributes
    Add ethtool -g support to virtio_net
    ...

    Fix up conflicts in:
    - drivers/net/Kconfig:
    The split-up generated a trivial conflict with removal of a
    stale reference to Documentation/networking/net-modules.txt.
    Remove it from the new location instead.
    - fs/sysfs/dir.c:
    Fairly nasty conflicts with the sysfs rb-tree usage, conflicting
    with Eric Biederman's changes for tagged directories.

    Linus Torvalds
     

20 Oct, 2011

1 commit

  • Now that /sys/class/net/bonding_masters is implemented as a tagged sysfs
    file we can remove support for untagged files in tagged directories.

    This change removes any ambiguity of what a NULL namespace value
    means. A NULL namespace parameter after this patch means
    that we are talking about an untagged sysfs dirent.

    This makes the sysfs code much less prone to mistakes during
    maintenance.

    Signed-off-by: Eric W. Biederman
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

23 Aug, 2011

1 commit

  • sysfs: count subdirectories

    This patch introduces a subdirectory counter for each sysfs directory.

    Without the patch, sysfs_refresh_inode would walk all entries of the directory
    to calculate the number of subdirectories.

    This patch improves time of "ls -la /sys/block" when there are 10000 block
    devices from 9 seconds to 0.19 seconds.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
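
    The trade-off in this commit, keeping a counter up to date instead of walking
    the directory on every refresh, can be sketched in userspace as follows. The
    node structure and the helpers add_child and count_subdirs_slow are
    hypothetical and much simpler than the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_FILE, NODE_DIR };

/* Hypothetical directory tree node with a cached subdirectory count. */
struct node {
    enum node_type type;
    struct node *next_sibling;
    struct node *children;
    unsigned int nr_subdirs;
};

/* Links child into parent and keeps the cached count current (O(1)). */
void add_child(struct node *parent, struct node *child)
{
    assert(parent->type == NODE_DIR);
    child->next_sibling = parent->children;
    parent->children = child;
    if (child->type == NODE_DIR)
        parent->nr_subdirs++;
}

/* The approach the counter replaces: walk every entry (O(n)). */
unsigned int count_subdirs_slow(const struct node *parent)
{
    unsigned int n = 0;
    const struct node *c;

    for (c = parent->children; c != NULL; c = c->next_sibling)
        if (c->type == NODE_DIR)
            n++;
    return n;
}
```

    Both functions give the same answer. The difference is that reading
    nr_subdirs costs the same regardless of how many entries the directory
    holds.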
     

20 Jul, 2011

3 commits


11 Jan, 2011

1 commit


07 Jan, 2011

1 commit


10 Aug, 2010

2 commits


05 Jun, 2010

1 commit

  • sysfs and configfs setattr functions have error cases after the generic inode's
    attributes have been changed. Fix consistency by changing the generic inode
    attributes only when it is guaranteed to succeed.

    Signed-off-by: Nick Piggin
    Acked-by: Joel Becker
    Signed-off-by: Greg Kroah-Hartman

    Nick Piggin
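
    The ordering this commit argues for, doing everything that can fail before
    touching the generic inode, can be sketched as below. All names here
    (fake_inode, fake_dirent, set_mode, simulate_enomem) are hypothetical. The
    simulate_enomem flag exists only so that the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct attrs       { unsigned int mode; };
struct fake_inode  { unsigned int mode; };
struct fake_dirent { struct attrs *saved; };   /* allocated on first change */

/* Changes the mode of an inode and of its backing entry. On any error
 * the inode is left exactly as it was. */
int set_mode(struct fake_inode *inode, struct fake_dirent *sd,
             unsigned int mode, int simulate_enomem)
{
    /* Step 1: validation and allocation, the steps that can fail. */
    if (mode & ~07777u)
        return -EINVAL;
    if (!sd->saved) {
        struct attrs *a = simulate_enomem ? NULL : calloc(1, sizeof(*a));

        if (!a)
            return -ENOMEM;
        sd->saved = a;
    }

    /* Step 2: nothing below can fail, so both copies stay in sync. */
    sd->saved->mode = mode;
    inode->mode = mode;
    return 0;
}
```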
     

28 May, 2010

1 commit


22 May, 2010

2 commits

  • I had hoped to avoid this but the bonding driver adds a file
    to /sys/class/net/ and the easiest way to handle that file is
    to make it untagged and to register it only once.

    So relax the rules on tagged directories, and make bonding work.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The problem. When implementing a network namespace I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
    potentially a few other directories of the form /sys/ ... /net/*.

    What this patch does is to add an additional tag field to the
    sysfs dirent structure. For directories that should show different
    contents depending on the context such as /sys/class/net/, and
    /sys/devices/virtual/net/ this tag field is used to specify the
    context in which those directories should be visible. Effectively
    this is the same as creating multiple distinct directories with
    the same name but internally to sysfs the result is nicer.

    I am calling the concept of a single directory that looks like multiple
    directories all at the same path in the filesystem tagged directories.

    For the networking namespace the set of directories whose contents I need
    to filter with tags can depend on the presence or absence of hotplug
    hardware or which modules are currently loaded. Which means I need
    a simple race free way to setup those directories as tagged.

    To achieve a race free design all tagged directories are created
    and managed by sysfs itself.

    Users of this interface:
    - define a type in the sysfs_tag_type enumeration.
    - call sysfs_register_ns_types with the type and its operations
    - sysfs_exit_ns when an individual tag is no longer valid

    - Implement mount_ns() which returns the ns of the calling process
    so we can attach it to a sysfs superblock.
    - Implement ktype.namespace() which returns the ns of a sysfs kobject.

    Everything else is left up to sysfs and the driver layer.

    For the network namespace mount_ns and namespace() are essentially
    one line functions, and look to remain that.

    Tags are currently represented as const void * pointers as that is
    both generic, provides enough information for equality comparisons,
    and is trivial to create for current users, as it is just the
    existing namespace pointer.

    The work needed in sysfs is more extensive. At each directory
    or symlink creation I need to check if the directory it is being
    created in is a tagged directory and if so generate the appropriate
    tag to place on the sysfs_dirent. Likewise at each symlink or
    directory removal I need to check if the sysfs directory it is
    being removed from is a tagged directory and if so figure out
    which tag goes along with the name I am deleting.

    Currently only directories which hold kobjects, and
    symlinks are supported. There is not enough information
    in the current file attribute interfaces to give us anything
    to discriminate on which makes it useless, and there are
    no potential users which makes it an uninteresting problem
    to solve.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Benjamin Thery
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
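
    The idea of a tagged directory, several entries with the same name told
    apart by an opaque namespace pointer, can be sketched in userspace as
    follows. The tagged_entry structure and find_entry helper are hypothetical
    simplifications. The only property taken from the commit message is that a
    tag is a const void * compared for equality.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory entry carrying a namespace tag. */
struct tagged_entry {
    const char *name;
    const void *ns;              /* tag; compared by pointer equality only */
    struct tagged_entry *next;
};

/* Finds the entry called name that is visible in namespace ns.
 * Entries with the same name but a different tag are skipped. */
struct tagged_entry *find_entry(struct tagged_entry *head,
                                const void *ns, const char *name)
{
    struct tagged_entry *e;

    for (e = head; e != NULL; e = e->next)
        if (e->ns == ns && strcmp(e->name, name) == 0)
            return e;
    return NULL;
}
```

    Two network namespaces can then each own an entry named "eth0" in what
    looks like a single directory, and a lookup returns only the entry whose
    tag matches the caller's namespace.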
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

08 Mar, 2010

2 commits

  • Currently sysfs_get_inode magically returns an inode on
    sysfs_sb. Make the super_block parameter explicit and
    the code becomes clearer.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The vfs depends upon filesystem methods to update the
    vfs inode. Sysfs adds to the normal number of places
    where the vfs inode is updated by also updating the
    vfs inode in sysfs_refresh_inode.

    Typically the inode mutex is used to serialize updates
    to the vfs inode, but grabbing the inode mutex in
    sysfs_permission and sysfs_getattr causes deadlocks,
    because sometimes the vfs calls those operations with
    the inode mutex held. Therefore sysfs can not use the
    inode mutex to serialize updates to the vfs inode.

    The sysfs_mutex is acquired in all of the routines
    where sysfs updates the vfs inode, and with a small
    change we can consistently protect sysfs vfs inode
    updates with the sysfs_mutex. To protect the sysfs
    vfs inode updates with the sysfs_mutex simply requires
    extending the scope of sysfs_mutex in sysfs_setattr
    over inode_setattr, and over inode_change_ok (so we
    have an unchanging inode when we perform the check).

    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
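
    The locking change described in this commit, holding one mutex across both
    the permission check and the update, can be sketched in userspace with a
    pthread mutex. Everything here (shared_inode, tree_mutex, change_ok,
    set_mode_locked) is a hypothetical analogue and the permission rule is
    invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical inode state shared between threads. */
struct shared_inode {
    unsigned int uid;
    unsigned int mode;
};

/* One lock standing in for the filesystem-wide mutex. */
static pthread_mutex_t tree_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Invented rule: only uid 0 or the owner may change the inode. */
int change_ok(const struct shared_inode *inode, unsigned int caller_uid)
{
    return (caller_uid == 0 || caller_uid == inode->uid) ? 0 : -EPERM;
}

/* The check and the update run under the same lock, so the inode
 * cannot change between being checked and being modified. */
int set_mode_locked(struct shared_inode *inode, unsigned int caller_uid,
                    unsigned int mode)
{
    int err;

    pthread_mutex_lock(&tree_mutex);
    err = change_ok(inode, caller_uid);
    if (err == 0)
        inode->mode = mode;
    pthread_mutex_unlock(&tree_mutex);
    return err;
}
```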
     

17 Feb, 2010

1 commit

  • There is currently a bug in sysfs_sd_setattr inherited from
    sysfs_setattr in 2.6.32 where the first time we set the attributes
    on a sysfs file we allocate backing store but do not set the
    backing store attributes, resulting in overly restrictive
    permissions on sysfs files.

    The fix is to simply modify the code so that it always executes
    when we update the sysfs attributes, as we did in 2.6.31 and earlier.

    Signed-off-by: Eric W. Biederman
    Tested-by: Jean Delvare
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
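
    A minimal sketch of the fixed control flow, assuming hypothetical names
    (backing, entry, entry_set_mode): the attributes are written on every call,
    including the call that first allocates the backing store. Putting the
    assignment in an else branch, for example, would skip it on the first call,
    which is the kind of bug described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct backing { unsigned int mode; };
struct entry   { struct backing *store; };   /* NULL until first change */

/* Records mode in the backing store of e, allocating it if needed. */
int entry_set_mode(struct entry *e, unsigned int mode)
{
    if (!e->store) {
        e->store = calloc(1, sizeof(*e->store));
        if (!e->store)
            return -ENOMEM;
    }

    /* Runs unconditionally, so the very first update is not lost. */
    e->store->mode = mode;
    return 0;
}
```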
     

12 Dec, 2009

7 commits

  • inode_change_ok already clears the SGID bit when necessary
    so there is no reason for sysfs_setattr to carry code to do
    the same, and it is good to kill the extra copy because, after
    I last moved the code, in certain corner cases the code will
    look at the wrong gid.

    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • By teaching sysfs_revalidate to hide a dentry for
    a sysfs_dirent if the sysfs_dirent has been renamed,
    and by teaching sysfs_lookup to return the original
    dentry if the sysfs dirent has been renamed, I can
    show the results of renames correctly without having to
    update the dcache during the directory rename.

    This massively simplifies the rename logic allowing a lot
    of weird sysfs special cases to be removed along with
    a lot of now unnecessary helper code.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
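
    The revalidation idea in this commit can be sketched as a check that a
    cached name still matches the entry it points at. The structures s_dirent
    and cached_dentry and the function dentry_still_valid are hypothetical and
    far simpler than the real dcache. A result of 0 means the cached entry
    should be hidden so that a fresh lookup takes place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backing entry; name and parent change on rename. */
struct s_dirent {
    const char *name;
    struct s_dirent *parent;
    int removed;
};

/* Hypothetical cache entry remembering the name and parent it was
 * looked up under. */
struct cached_dentry {
    const char *name;
    struct s_dirent *parent_sd;
    struct s_dirent *sd;
};

/* Returns 1 while the cached entry still describes sd, 0 otherwise. */
int dentry_still_valid(const struct cached_dentry *d)
{
    if (d->sd->removed)
        return 0;                      /* entry was deleted */
    if (d->sd->parent != d->parent_sd)
        return 0;                      /* entry moved to another directory */
    if (strcmp(d->sd->name, d->name) != 0)
        return 0;                      /* entry was renamed */
    return 1;
}
```

    Because stale cache entries are detected lazily like this, a rename only
    has to update the backing entry and does not have to rewrite the cache.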
     
  • With the implementation of sysfs_getattr and sysfs_permission
    sysfs becomes able to lazily propagate inode attribute changes
    from the sysfs_dirents to the vfs inodes. This paves the way
    for deleting significant chunks of now unnecessary code.

    While doing this we did not reference sysfs_setattr from
    sysfs_symlink_inode_operations so I added it along with
    sysfs_getattr and sysfs_permission.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Cleanly separate the work that is specific to setting the
    attributes of a sysfs_dirent from what is needed to update
    the attributes of a vfs inode.

    Additionally grab the sysfs_mutex to keep any nasties from
    surprising us when updating the sysfs_dirent.

    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • The granularity of sysfs time when we keep it is 1 ns, which
    when passed to timestamp_trunc results in a nop. So remove
    the unnecessary function call making sysfs_setattr slightly
    easier to read.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
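
    Why truncating to a 1 ns granularity does nothing can be seen from a small
    sketch of the arithmetic. The helper trunc_nsec below is hypothetical and
    only models rounding a nanosecond value down to a multiple of the
    granularity. It is not the kernel function.

```c
#include <assert.h>

/* Rounds nsec down to a multiple of gran (both in nanoseconds). */
long trunc_nsec(long nsec, long gran)
{
    assert(gran >= 1);
    return nsec - (nsec % gran);
}
```

    Every integer is a multiple of 1, so nsec % 1 is always 0 and the value
    comes back unchanged. Only a coarser granularity discards anything.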
     
  • The sysfs_mutex is required to ensure updates are and will remain
    atomic with respect to other inode iattr updates, that do not happen
    through the filesystem.

    Acked-by: Serge Hallyn
    Acked-by: Tejun Heo
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • Signed-off-by: Stefan Richter
    Acked-by: David P. Quigley
    Signed-off-by: Greg Kroah-Hartman

    Stefan Richter
     

12 Sep, 2009

1 commit

  • * 'writeback' of git://git.kernel.dk/linux-2.6-block:
    writeback: check for registered bdi in flusher add and inode dirty
    writeback: add name to backing_dev_info
    writeback: add some debug inode list counters to bdi stats
    writeback: get rid of pdflush completely
    writeback: switch to per-bdi threads for flushing data
    writeback: move dirty inodes from super_block to backing_dev_info
    writeback: get rid of generic_sync_sb_inodes() export

    Linus Torvalds
     

11 Sep, 2009

1 commit


10 Sep, 2009

1 commit

  • This patch adds a setxattr handler to the file, directory, and symlink
    inode_operations structures for sysfs. The patch uses hooks introduced in the
    previous patch to handle the getting and setting of security information for
    the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
    sysfs_dirent structure has been replaced by a structure which contains the
    iattr, secdata and secdata length to allow the changes to persist in the event
    that the inode representing the sysfs_dirent is evicted. Because sysfs only
    stores this information when a change is made all the optional data is moved
    into one dynamically allocated field.

    This patch addresses an issue where SELinux was denying virtd access to the PCI
    configuration entries in sysfs. The lack of setxattr handlers for sysfs
    required that a single label be assigned to all entries in sysfs. Granting virtd
    access to every entry in sysfs is not an acceptable solution so fine grained
    labeling of sysfs is required such that individual entries can be labeled
    appropriately.

    [sds: Fixed compile-time warnings, coding style, and setting of inode security init flags.]

    Signed-off-by: David P. Quigley
    Signed-off-by: Stephen D. Smalley
    Signed-off-by: James Morris

    David P. Quigley
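
    The data layout described in this commit, one dynamically allocated block
    holding the attributes, the security data and its length, might be sketched
    as follows. The names opt_attrs, p_dirent and dirent_set_secdata are
    hypothetical and the fields are simplified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All optional per-entry data lives in one lazily allocated block. */
struct opt_attrs {
    unsigned int mode;
    void *secdata;          /* security label bytes, owned by this block */
    size_t secdata_len;
};

/* Hypothetical entry; attrs stays NULL until something is changed. */
struct p_dirent {
    unsigned int default_mode;
    struct opt_attrs *attrs;
};

/* Stores a copy of the label on the entry so that it outlives any
 * in-memory inode. Returns 0 on success and -1 on allocation failure. */
int dirent_set_secdata(struct p_dirent *sd, const void *data, size_t len)
{
    void *copy;

    assert(sd != NULL && data != NULL && len > 0);
    if (!sd->attrs) {
        sd->attrs = calloc(1, sizeof(*sd->attrs));
        if (!sd->attrs)
            return -1;
        sd->attrs->mode = sd->default_mode;
    }
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, data, len);
    free(sd->attrs->secdata);       /* drop any previous label */
    sd->attrs->secdata = copy;
    sd->attrs->secdata_len = len;
    return 0;
}
```

    Entries that are never modified pay only for one NULL pointer, which matches
    the commit's reasoning for moving the optional data into a single field.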
     

25 Mar, 2009

1 commit

  • The sysfs_dirent serves as both an inode and a directory entry
    for sysfs. To prevent the sysfs inode numbers from being freed
    prematurely hold a reference to sysfs_dirent from the sysfs inode.

    [akpm@linux-foundation.org: add comment]
    Signed-off-by: Eric W. Biederman
    Cc: Tejun Heo
    Cc: Al Viro
    Cc: Cornelia Huck
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
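
    The lifetime rule in this commit, where the inode pins the entry so that the
    inode number cannot be released early, can be sketched with a plain
    reference count. The names r_dirent, r_inode, dirent_get, dirent_put,
    inode_attach and inode_evict are hypothetical, and the ino_released flag
    merely marks the point at which the number would be handed back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry that owns an inode number. */
struct r_dirent {
    int refcount;
    unsigned long ino;
    int ino_released;       /* set when the last reference is dropped */
};

struct r_inode {
    struct r_dirent *sd;    /* counted reference held by the inode */
};

struct r_dirent *dirent_get(struct r_dirent *sd)
{
    sd->refcount++;
    return sd;
}

void dirent_put(struct r_dirent *sd)
{
    assert(sd->refcount > 0);
    if (--sd->refcount == 0)
        sd->ino_released = 1;   /* only now may the number be reused */
}

/* The inode takes its own reference when it is set up ... */
void inode_attach(struct r_inode *inode, struct r_dirent *sd)
{
    inode->sd = dirent_get(sd);
}

/* ... and drops it only when the inode itself goes away. */
void inode_evict(struct r_inode *inode)
{
    dirent_put(inode->sd);
    inode->sd = NULL;
}
```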
     

06 Jan, 2009

1 commit