05 Jan, 2010
1 commit
-
Holding locks over device_del -> kobject_del -> sysfs_deactivate can
cause deadlocks if those same locks are grabbed in sysfs show or store
methods.The I model s_active count + completion as a sleeping read/write lock.
I describe to lockdep sysfs_get_active as a read_trylock,
sysfs_put_active as a read_unlock, and sysfs_deactivate as a
write_lock and write_unlock pair. This seems to capture the essence
for purposes of finding deadlocks, and in my testing gives finds real
issues and ignores non-issues.This brings us back to holding locks over kobject_del is a problem
that ideally we should find a way of addressing, but at least lockdep
can tell us about the problems instead of requiring developers to debug
rare strange system deadlocks, that happen when sysfs files are removed
while being written to.Signed-off-by: Eric W. Biederman
Acked-by: Tejun Heo
Signed-off-by: Linus Torvalds
24 Dec, 2009
1 commit
-
Many struct bin_attribute descriptors are purely read-only
structures, and there's no need to change them. Therefore
make the promise not to, which will let those descriptors
be put in a ro section.Signed-off-by: Phil Carmody
Signed-off-by: Greg Kroah-Hartman
12 Dec, 2009
15 commits
-
inode_change_ok already clears the SGID bit when necessary
so there is no reason for sysfs_setattr to carry code to do
the same, and it is good to kill the extra copy because when
I moved the code last in certain corner cases the code will
look at the wrong gid.Acked-by: Serge Hallyn
Acked-by: Tejun Heo
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
These two functions do 90% of the same work and it doesn't significantly
obfuscate the function to allow both the parent dir and the name to change
at the same time. So merge them together to simplify maintenance, and
increase testing.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
By teaching sysfs_revalidate to hide a dentry for
a sysfs_dirent if the sysfs_dirent has been renamed,
and by teaching sysfs_lookup to return the original
dentry if the sysfs dirent has been renamed. I can
show the results of renames correctly without having to
update the dcache during the directory rename.This massively simplifies the rename logic allowing a lot
of weird sysfs special cases to be removed along with
a lot of now unnecesary helper code.Acked-by: Tejun Heo
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
With lazy inode updates and dentry operations bringing everything
into sync on demand there is no longer any need to immediately
update the vfs or grab i_mutex to protect those updates as we
make changes to sysfs.Acked-by: Serge Hallyn
Acked-by: Tejun Heo
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Now that sysfs_getattr and sysfs_permission refresh the vfs
inode there is no need to immediatly push the mode change
into the vfs cache. Reducing the amount of work needed and
simplifying the locking.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
With the implementation of sysfs_getattr and sysfs_permission
sysfs becomes able to lazily propogate inode attribute changes
from the sysfs_dirents to the vfs inodes. This paves the way
for deleting significant chunks of now unnecessary code.While doing this we did not reference sysfs_setattr from
sysfs_symlink_inode_operations so I added along with
sysfs_getattr and sysfs_permission.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Lining up the functions in sysfs_symlink_inode_operations
follows the pattern in the rest of sysfs and makes things
slightly more readable.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Currently sysfs updates the timestamps on the vfs directory
inode when we create or remove a directory entry but doesn't
update the cached copy on the sysfs_dirent, fix that oversight.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Cleanly separate the work that is specific to setting the
attributes of a sysfs_dirent from what is needed to update
the attributes of a vfs inode.Additionally grab the sysfs_mutex to keep any nasties from
surprising us when updating the sysfs_dirent.Acked-by: Tejun Heo
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
The granularity of sysfs time when we keep it is 1 ns. Which
when passed to timestamp_trunc results in a nop. So remove
the unnecessary function call making sysfs_setattr slightly
easier to read.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Currently every caller of sysfs_chmod_file happens at either
file creation time to set a non-default mode or in response
to a specific user requested space change in policy. Making
timestamps of when the chmod happens and notification of
a file changing mode uninteresting.Remove the unnecessary time stamp and filesystem change
notification, and removes the last of the explicit inotify
and donitfy support from sysfs.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Calling d_drop unconditionally when a sysfs_dirent is deleted has
the potential to leak mounts, so instead implement dentry delete
and revalidate operations that cause sysfs dentries to be removed
at the appropriate time.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Using dentry instead of d in the function name is what
several other filesystems are doing and it seems to be
a more readable convention.Acked-by: Tejun Heo
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
The sysfs_mutex is required to ensure updates are and will remain
atomic with respect to other inode iattr updates, that do not happen
through the filesystem.Acked-by: Serge Hallyn
Acked-by: Tejun Heo
Signed-off-by: Eric W. Biederman
Signed-off-by: Greg Kroah-Hartman -
Signed-off-by: Stefan Richter
Acked-by: David P. Quigley
Signed-off-by: Greg Kroah-Hartman
05 Nov, 2009
1 commit
-
While refreshing my sysfs patches I noticed a leak in the secdata
implementation. We don't free the secdata when we free the
sysfs dirent.This is a bug in 2.6.32-rc5 that we really should close.
Signed-off-by: Eric W. Biederman
Acked-by: Serge Hallyn
Signed-off-by: James Morris
15 Oct, 2009
2 commits
-
sysfs_notify_dirent is a simple atomic operation that can be used to
alert user-space that new data can be read from a sysfs attribute.Unfortunately it cannot currently be called from non-process context
because of its use of spin_lock which is sometimes taken with
interrupts enabled.So change all lockers of sysfs_open_dirent_lock to disable interrupts,
thus making sysfs_notify_dirent safe to be called from non-process
context (as drivers/md does in md_safemode_timeout).sysfs_get_open_dirent is (documented as being) only called from
process context, so it uses spin_lock_irq. Other places
use spin_lock_irqsave.The usage for sysfs_notify_dirent in md_safemode_timeout was
introduced in 2.6.28, so this patch is suitable for that and more
recent kernels.Reported-by: Joel Andres Granados
Signed-off-by: NeilBrown
Signed-off-by: Dan Williams
Cc: stable
Signed-off-by: Greg Kroah-Hartman -
As device_move() and kobject_move() both handle a NULL destination,
sysfs_move_dir() should do this as well (again) and fall back to
sysfs_root in that case.Signed-off-by: Cornelia Huck
Cc: Phil Carmody
Signed-off-by: Greg Kroah-Hartman
28 Sep, 2009
1 commit
-
* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP codeBut leave TTM code alone, something is fishy there with global vm_ops
being used.Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds
12 Sep, 2009
1 commit
-
* 'writeback' of git://git.kernel.dk/linux-2.6-block:
writeback: check for registered bdi in flusher add and inode dirty
writeback: add name to backing_dev_info
writeback: add some debug inode list counters to bdi stats
writeback: get rid of pdflush completely
writeback: switch to per-bdi threads for flushing data
writeback: move dirty inodes from super_block to backing_dev_info
writeback: get rid of generic_sync_sb_inodes() export
11 Sep, 2009
1 commit
-
This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.Signed-off-by: Jens Axboe
10 Sep, 2009
1 commit
-
This patch adds a setxattr handler to the file, directory, and symlink
inode_operations structures for sysfs. The patch uses hooks introduced in the
previous patch to handle the getting and setting of security information for
the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
sysfs_dirent structure has been replaced by a structure which contains the
iattr, secdata and secdata length to allow the changes to persist in the event
that the inode representing the sysfs_dirent is evicted. Because sysfs only
stores this information when a change is made all the optional data is moved
into one dynamically allocated field.This patch addresses an issue where SELinux was denying virtd access to the PCI
configuration entries in sysfs. The lack of setxattr handlers for sysfs
required that a single label be assigned to all entries in sysfs. Granting virtd
access to every entry in sysfs is not an acceptable solution so fine grained
labeling of sysfs is required such that individual entries can be labeled
appropriately.[sds: Fixed compile-time warnings, coding style, and setting of inode security init flags.]
Signed-off-by: David P. Quigley
Signed-off-by: Stephen D. Smalley
Signed-off-by: James Morris
29 Jul, 2009
1 commit
-
Update directory hardlink count when moving kobjects to a new parent.
Fixes the following problem which occurs when several devices are
moved to the same parent and then unregistered:> ls -laF /sys/devices/css0/defunct/
> total 0
> drwxr-xr-x 4294967295 root root 0 2009-07-14 17:02 ./
> drwxr-xr-x 114 root root 0 2009-07-14 17:02 ../
> drwxr-xr-x 2 root root 0 2009-07-14 17:01 power/
> -rw-r--r-- 1 root root 4096 2009-07-14 17:01 ueventSigned-off-by: Peter Oberparleiter
Cc: stable
Signed-off-by: Greg Kroah-Hartman
09 Jul, 2009
1 commit
-
Commit 1c8542c7bb replaced kmalloc() with memdup_user() in the write()
function but also dropped the kfree(temp). The memdup_user() function
allocates memory which is never freed.Signed-off-by: Catalin Marinas
Cc: Parag Warudkar
Signed-off-by: Linus Torvalds
16 Jun, 2009
1 commit
-
There is the possiblity of a memory leak if a page is allocated and if
sysfs_getlink() fails in the sysfs_follow_link.Signed-off-by: Armin Kuster
Signed-off-by: Greg Kroah-Hartman
29 May, 2009
1 commit
-
We don't need a kernel thread per CPU for this application.
Acked-by: Alex Chiang
Cc: Lai Jiangshan
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Greg Kroah-Hartman
21 Apr, 2009
1 commit
-
Remove open-coded memdup_user().
Signed-off-by: Li Zefan
Signed-off-by: Al Viro
17 Apr, 2009
2 commits
-
Currently, following test programs don't finished.
% ruby -e '
Thread.new { sleep }
File.read("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies")
'strace expose the reason.
...
open("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies", O_RDONLY|O_LARGEFILE) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbf9fa6b8) = -1 ENOTTY (Inappropriate ioctl for device)
fstat64(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
_llseek(3, 0, [0], SEEK_CUR) = 0
select(4, [3], NULL, NULL, NULL) = 1 (in [3])
read(3, "1400000 1300000 1200000 1100000 1"..., 4096) = 62
select(4, [3], NULL, NULL, NULLBecause Ruby (the scripting language) VM assume select system-call
against regular file don't block. it because SUSv3 says "Regular files
shall always poll TRUE for reading and writing". see
http://www.opengroup.org/onlinepubs/009695399/functions/poll.html it
seems valid assumption.But sysfs_poll() don't keep this rule although sysfs file can read and
write always.This patch restore proper poll behavior to sysfs.
/sys/block/md*/md/sync_action polling application and another sysfs
updating sensitive application still can use POLLERR and POLLPRI.Cc: Neil Brown
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Greg Kroah-Hartman -
A sysfs attribute using sysfs_schedule_callback() to commit suicide
may end up calling device_unregister(), which will eventually call
a driver's ->remove function.Drivers may call flush_scheduled_work() in their shutdown routines,
in which case lockdep will complain with something like the following:=============================================
[ INFO: possible recursive locking detected ]
2.6.29-rc8-kk #1
---------------------------------------------
events/4/56 is trying to acquire lock:
(events){--..}, at: [] flush_workqueue+0x0/0xa0but task is already holding lock:
(events){--..}, at: [] run_workqueue+0x108/0x230other info that might help us debug this:
3 locks held by events/4/56:
#0: (events){--..}, at: [] run_workqueue+0x108/0x230
#1: (&ss->work){--..}, at: [] run_workqueue+0x108/0x230
#2: (pci_remove_rescan_mutex){--..}, at: [] remove_callback+0x21/0x40stack backtrace:
Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
Call Trace:
[] validate_chain+0xb7d/0x1260
[] __lock_acquire+0x42e/0xa40
[] lock_acquire+0x58/0x80
[] ? flush_workqueue+0x0/0xa0
[] flush_workqueue+0x4d/0xa0
[] ? flush_workqueue+0x0/0xa0
[] flush_scheduled_work+0x10/0x20
[] e1000_remove+0x55/0xfe [e1000e]
[] ? sysfs_schedule_callback_work+0x0/0x50
[] pci_device_remove+0x32/0x70
[] __device_release_driver+0x59/0x90
[] device_release_driver+0x2b/0x40
[] bus_remove_device+0xa6/0x120
[] device_del+0x12b/0x190
[] device_unregister+0x26/0x70
[] pci_stop_dev+0x49/0x60
[] pci_remove_bus_device+0x40/0xc0
[] remove_callback+0x29/0x40
[] sysfs_schedule_callback_work+0x1f/0x50
[] run_workqueue+0x15a/0x230
[] ? run_workqueue+0x108/0x230
[] worker_thread+0x9f/0x100
[] ? autoremove_wake_function+0x0/0x40
[] ? worker_thread+0x0/0x100
[] kthread+0x4d/0x80
[] child_rip+0xa/0x20
[] ? restore_args+0x0/0x30
[] ? kthread+0x0/0x80
[] ? child_rip+0x0/0x20Although we know that the device_unregister path will never acquire
a lock that a driver might try to acquire in its ->remove, in general
we should never attempt to flush a workqueue from within the same
workqueue, and lockdep rightly complains.So as long as sysfs attributes cannot commit suicide directly and we
are stuck with this callback mechanism, put the sysfs callbacks on
their own workqueue instead of the global one.This has the side benefit that if a suicidal sysfs attribute kicks
off a long chain of ->remove callbacks, we no longer induce a long
delay on the global queue.This also fixes a missing module_put in the error path introduced
by sysfs-only-allow-one-scheduled-removal-callback-per-kobj.patch.We never destroy the workqueue, but I'm not sure that's a
problem.Reported-by: Kenji Kaneshige
Tested-by: Kenji Kaneshige
Signed-off-by: Alex Chiang
Signed-off-by: Greg Kroah-Hartman
01 Apr, 2009
1 commit
-
Fix warnings and return values in sysfs bin_page_mkwrite(), fixing
fs/sysfs/bin.c: In function `bin_page_mkwrite':
fs/sysfs/bin.c:250: warning: passing argument 2 of `bb->vm_ops->page_mkwrite' from incompatible pointer type
fs/sysfs/bin.c: At top level:
fs/sysfs/bin.c:280: warning: initialization from incompatible pointer typeExpects to have my [PATCH next] sysfs: fix some bin_vm_ops errors
Signed-off-by: Hugh Dickins
Cc: Nick Piggin
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Mar, 2009
2 commits
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (37 commits)
fs: avoid I_NEW inodes
Merge code for single and multiple-instance mounts
Remove get_init_pts_sb()
Move common mknod_ptmx() calls into caller
Parse mount options just once and copy them to super block
Unroll essentials of do_remount_sb() into devpts
vfs: simple_set_mnt() should return void
fs: move bdev code out of buffer.c
constify dentry_operations: rest
constify dentry_operations: configfs
constify dentry_operations: sysfs
constify dentry_operations: JFS
constify dentry_operations: OCFS2
constify dentry_operations: GFS2
constify dentry_operations: FAT
constify dentry_operations: FUSE
constify dentry_operations: procfs
constify dentry_operations: ecryptfs
constify dentry_operations: CIFS
constify dentry_operations: AFS
... -
Signed-off-by: Al Viro
25 Mar, 2009
6 commits
-
Commit 86c9508eb1c0ce5aa07b5cf1d36b60c54efc3d7a
"sysfs: don't block indefinitely for unmapped files" in linux-next
crashes the PowerMac G5 when X starts up. It's caught out by the way
powerpc's pci_mmap of legacy_mem uses shmem_zero_setup(), substituting
a new vma->vm_file whose private_data no longer points to the bin_buffer
(substitution done because some versions of X crash if that mmap fails).The fix to this is straightforward: the original vm_file is fput() in
that case, so this mmap won't block sysfs at all, so just don't switch
over to bin_vm_ops if vm_file has changed.But more fixes made before realizing that was the problem:-
It should not be an error if bin_page_mkwrite() finds no underlying
page_mkwrite().Check that a file already mmap'ed has the same underlying vm_ops
_before_ pointing vma->vm_ops at bin_vm_ops.If the file being mmap'ed is a shmem/tmpfs file, don't fail the mmap
on CONFIG_NUMA=y, just because that has a set_policy and get_policy:
provide bin_set_policy, bin_get_policy and bin_migrate.Signed-off-by: Hugh Dickins
Acked-by: Eric Biederman
Signed-off-by: Greg Kroah-Hartman -
The only way for a sysfs attribute to remove itself (without
deadlock) is to use the sysfs_schedule_callback() interface.Vegard Nossum discovered that a poorly written sysfs ->store
callback can repeatedly schedule remove callbacks on the same
device over and over, e.g.$ while true ; do echo 1 > /sys/devices/.../remove ; done
If the 'remove' attribute uses the sysfs_schedule_callback API
and also does not protect itself from concurrent accesses, its
callback handler will be called multiple times, and will
eventually attempt to perform operations on a freed kobject,
leading to many problems.Instead of requiring all callers of sysfs_schedule_callback to
implement their own synchronization, provide the protection in
the infrastructure.Now, sysfs_schedule_callback will only allow one scheduled
callback per kobject. On subsequent calls with the same kobject,
return -EAGAIN.This is a short term fix. The long term fix is to allow sysfs
attributes to remove themselves directly, without any of this
callback hokey pokey.[cornelia.huck@de.ibm.com: s390 ccwgroup bits]
Reported-by: vegard.nossum@gmail.com
Signed-off-by: Alex Chiang
Acked-by: Cornelia Huck
Signed-off-by: Greg Kroah-Hartman -
Modify sysfs bin files so that we can remove the bin file while they are
still mapped. When the kobject is removed we unmap the bin file and
arrange for future accesses to the mapping to receive SIGBUS.Implementing this prevents a nasty DOS when pci devices are hot plugged
and unplugged. Where if any of their resources were mmaped the kernel
could not free up their pci resources or release their pci data
structures.[akpm@linux-foundation.org: remove unused var]
Signed-off-by: Eric W. Biederman
Cc: Jesse Barnes
Acked-by: Tejun Heo
Cc: Kay Sievers
Signed-off-by: Andrew Morton
Signed-off-by: Greg Kroah-Hartman -
The sysfs_dirent serves as both an inode and a directory entry
for sysfs. To prevent the sysfs inode numbers from being freed
prematurely hold a reference to sysfs_dirent from the sysfs inode.[akpm@linux-foundation.org: add comment]
Signed-off-by: Eric W. Biederman
Cc: Tejun Heo
Cc: Al Viro
Cc: Cornelia Huck
Signed-off-by: Andrew Morton
Signed-off-by: Greg Kroah-Hartman -
sysfs: sysfs_add_one WARNs with full path to duplicate filename
As a debugging aid, it can be useful to know the full path to a
duplicate file being created in sysfs.We now will display warnings such as:
sysfs: cannot create duplicate filename '/foo'
when attempting to create multiple files named 'foo' in the sysfs
root, or:sysfs: cannot create duplicate filename '/bus/pci/slots/5/foo'
when attempting to create multiple files named 'foo' under a
given directory in sysfs.The path displayed is always a relative path to sysfs_root. The
leading '/' in the path name refers to the sysfs_root mount
point, and should not be confused with the "real" '/'.Thanks to Alex Williamson for essentially writing sysfs_pathname.
Cc: Alex Williamson
Signed-off-by: Alex Chiang
Signed-off-by: Greg Kroah-Hartman -
sysfs_get_inode ultimately calls sysfs_count_nlink when the a
directory inode is fectched. sysfs_count_nlink needs to be
called under the sysfs_mutex to guard against the unlikely
but possible scenario that the root directory is changing
as we are counting the number entries in it, and just in
general to be consistent.Signed-off-by: Eric W. Biederman
Acked-by: Tejun Heo
Signed-off-by: Greg Kroah-Hartman