12 Sep, 2013
40 commits
-
Now that sem, msgque and shm, through *_down(), all use the lockless
variant of ipcctl_pre_down(), go ahead and delete it.[akpm@linux-foundation.org: fix function name in kerneldoc, cleanups]
Signed-off-by: Davidlohr Bueso
Tested-by: Sedat Dilek
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Instead of holding the ipc lock for the entire function, use the
ipcctl_pre_down_nolock and only acquire the lock for specific commands:
RMID and SET.Signed-off-by: Davidlohr Bueso
Tested-by: Sedat Dilek
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is the third and final patchset that deals with reducing the amount
of contention we impose on the ipc lock (kern_ipc_perm.lock). These
changes mostly deal with shared memory, previous work has already been
done for semaphores and message queues:http://lkml.org/lkml/2013/3/20/546 (sems)
http://lkml.org/lkml/2013/5/15/584 (mqueues)With these patches applied, a custom shm microbenchmark stressing shmctl
doing IPC_STAT with 4 threads a million times, reduces the execution
time by 50%. A similar run, this time with IPC_SET, reduces the
execution time from 3 mins and 35 secs to 27 seconds.Patches 1-8: replaces blindly taking the ipc lock for a smarter
combination of rcu and ipc_obtain_object, only acquiring the spinlock
when updating.Patch 9: renames the ids rw_mutex to rwsem, which is what it already was.
Patch 10: is a trivial mqueue leftover cleanup
Patch 11: adds a brief lock scheme description, requested by Andrew.
This patch:
Add shm_obtain_object() and shm_obtain_object_check(), which will allow us
to get the ipc object without acquiring the lock. Just as with other
forms of ipc, these functions are basically wrappers around
ipc_obtain_object*().Signed-off-by: Davidlohr Bueso
Tested-by: Sedat Dilek
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Command line option rootfstype=ramfs to obtain old initramfs behavior, and
use ramfs instead of tmpfs for stub when root= defined (for cosmetic
reasons).[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Conditionally call the appropriate fs_init function and fill_super
functions. Add a use once guard to shmem_init() to simply succeed on a
second call.(Note that IS_ENABLED() is a compile time constant so dead code
elimination removes unused function calls when CONFIG_TMPFS is disabled.)Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When the rootfs code was a wrapper around ramfs, having them in the same
file made sense. Now that it can wrap another filesystem type, move it in
with the init code instead.This also allows a subsequent patch to access rootfstype= command line
arg.Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Even though ramfs hasn't got a backing device, commit e0bf68ddec4f ("mm:
bdi init hooks") added one anyway, and put the initialization in
init_rootfs() since that's the first user, leaving it out of init_ramfs()
to avoid duplication.But initmpfs uses init_tmpfs() instead, so move the init into the
filesystem's init function, add a "once" guard to prevent duplicate
initialization, and call the filesystem init from rootfs init.This goes part of the way to allowing ramfs to be built as a module.
[akpm@linux-foundation.org; using bit 1 was odd]
Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mounting MS_NOUSER prevents --bind mounts from rootfs. Prevent new rootfs
mounts with a different mechanism that doesn't affect bind mounts.Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With users of radix_tree_preload() run from interrupt (block/blk-ioc.c is
one such possible user), the following race can happen:radix_tree_preload()
...
radix_tree_insert()
radix_tree_node_alloc()
if (rtp->nr) {
ret = rtp->nodes[rtp->nr - 1];...
radix_tree_preload()
...
radix_tree_insert()
radix_tree_node_alloc()
if (rtp->nr) {
ret = rtp->nodes[rtp->nr - 1];And we give out one radix tree node twice. That clearly results in radix
tree corruption with different results (usually OOPS) depending on which
two users of radix tree race.We fix the problem by making radix_tree_node_alloc() always allocate fresh
radix tree nodes when in interrupt. Using preloading when in interrupt
doesn't make sense since all the allocations have to be atomic anyway and
we cannot steal nodes from process-context users because some users rely
on radix_tree_insert() succeeding after radix_tree_preload().
in_interrupt() check is somewhat ugly but we cannot simply key off passed
gfp_mask as that is acquired from root_gfp_mask() and thus the same for
all preload users.Another part of the fix is to avoid node preallocation in
radix_tree_preload() when passed gfp_mask doesn't allow waiting. Again,
preallocation in such case doesn't make sense and when preallocation would
happen in interrupt we could possibly leak some allocated nodes. However,
some users of radix_tree_preload() require following radix_tree_insert()
to succeed. To avoid unexpected effects for these users,
radix_tree_preload() only warns if passed gfp mask doesn't allow waiting
and we provide a new function radix_tree_maybe_preload() for those users
which get different gfp mask from different call sites and which are
prepared to handle radix_tree_insert() failure.Signed-off-by: Jan Kara
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The driver core clears the driver data to NULL after device_release or on
probe failure. Thus, it is not needed to manually clear the device driver
data to NULL.Signed-off-by: Jingoo Han
Cc: Evgeniy Polyakov
Cc: Greg KH
Acked-by: Shawn Guo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The usage of strict_strtol() is not preferred, because strict_strtol() is
obsolete. Thus, kstrtol() should be used.Signed-off-by: Jingoo Han
Cc: Evgeniy Polyakov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Based partially on MS standard spec quotes from Alex Dubov.
As any code that works with user data this driver isn't recommended to use
to write cards that contain valuable data.It tries its best though to avoid data corruption and possible damage to
the card.Tested on MS DUO 64 MB card on Ricoh R592 card reader.
Signed-off-by: Maxim Levitsky
Cc: Valdis Kletnieks
Cc: Jens Axboe
Cc: Alex Dubov
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The driver core clears the driver data to NULL after device_release or on
probe failure. Thus, it is not needed to manually clear the device driver
data to NULL.Signed-off-by: Jingoo Han
Cc: Maxim Levitsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The driver core clears the driver data to NULL after device_release or on
probe failure. Thus, it is not needed to manually clear the device driver
data to NULL.Signed-off-by: Jingoo Han
Cc: Rodolfo Giometti
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix thinkos where pkt_ needs a valid pktcdvd_device * and the
pointer is known to be NULL.Signed-off-by: Joe Perches
Reported-by: Dan Carpenter (go smatch!)
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Allow the device name to be emitted with pkt_err when logging the sense
data.Signed-off-by: Joe Perches
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add a new pkt_info macro to prefix the name to the logging output.
Signed-off-by: Joe Perches
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add a new pkt_notice macro to prefix the name to the logging output.
Signed-off-by: Joe Perches
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add a new pkt_err macro to prefix the name to the logging output. Convert
pr_err where there is a non-null struct pktcdvd_device.Includes improvements from Andy Shevchenko.
Signed-off-by: Joe Perches
Cc: Andy Shevchenko
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add pd->name to output for these debugging messages.
Remove normally compiled out pkt_dbg(2, ...) function entry tracing
equivalents as it's better done via the function tracer.Signed-off-by: Joe Perches
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use the more common pkt_dbg(level, fmt, ...) form.
These messages are emitted at KERN_NOTICE.
Always emit function name with pkt_dbg(2, ...) uses and remove the
sometimes abbreviated embedded function name.This form always verifies the format and arguments.
Signed-off-by: Joe Perches
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use a more current logging style and add messages levels to the logging
messages.Simplify pkt_dump_sense by using %*ph and adding a simple function to emit
the sense string.Includes improvements from Andy Shevchenko and Dan Carpenter.
Signed-off-by: Joe Perches
Cc: Andy Shevchenko
Cc: Dan Carpenter
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Macros should be converted to functions where feasible to
verify arguments and the like.Signed-off-by: Joe Perches
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Since the panic handlers may produce additional information (via printk)
for the kernel log, it should be reported as part of the panic output
saved by kmsg_dump(). Without this re-ordering, nothing that adds
information to a panic will show up in pstore's view when kmsg_dump runs,
and is therefore not visible to crash reporting tools that examine pstore
output.Signed-off-by: Kees Cook
Cc: Anton Vorontsov
Cc: Colin Cross
Acked-by: Tony Luck
Cc: Stephen Boyd
Cc: Vikram Mulukutla
Cc: Peter Zijlstra
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It seems pretty unlikely that AFFS supports files over 4GB but we may as
well leave use loff_t just for cleanness sake instead of truncating it to
32 bits.Signed-off-by: Dan Carpenter
Cc: Marco Stornelli
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When the example udev rules in the documentation are used without
modification, warnings like the one shown below appear in the system logs:/var/log/messages:Aug 22 11:09:11 kung udevd[445]: NAME="%k" \
is superfluous and breaks kernel supplied names, please remove \
it from /etc/udev/rules.d/60-aoe.rules:26Removing the term does not cause any problems with the creation of the
special character and block device nodes.Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If the system has trouble allocating memory for the creation of the aoe
debugfs directory or of a file inside it, the debugfs member of an aoedev
can be NULL.Do not treat a NULL debugfs pointer as a BUG on aoedev shutdown, avoiding
the user impact of an unecessary panic.Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch fixes following compiler warnings:
drivers/block/aoe/aoecmd.c: In function `aoecmd_ata_rw':
drivers/block/aoe/aoecmd.c:383:17: warning: variable `t' set but not used [-Wunused-but-set-variable]
struct aoetgt *t;
^
drivers/block/aoe/aoecmd.c: In function `resend':
drivers/block/aoe/aoecmd.c:488:21: warning: variable `ah' set but not used [-Wunused-but-set-variable]
struct aoe_atahdr *ah;
^Signed-off-by: Andy Shevchenko
Cc: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In the kernel we have a nice helper that may be used here. This patch
substitutes the custom implementation by the native function call.Signed-off-by: Andy Shevchenko
Cc: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This information is presented in a compact format that has evolved for
easy routine scanning by expert humans, mostly developers and support
technicians helping to troubleshoot or test AoE-based systems.Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The place holder in the file contents is filled out in the following
patch.Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This series adds the debugging information that the coraid.com-distributed
aoe driver exports via sysfs, but instead of sysfs, it uses debugfs.With these patches applied, even without AoE targets on the network, KEDR
reports new possible memory leaks, but these are from callers outside the
aoe driver that have used aoe_devnode to get the name of the character
devices through the aoe_class->devnode callback, and I believe they're
responsible for freeing that memory.This patch:
Create and destroy the debugfs directory.
Signed-off-by: Ed Cashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Cody P Schafer
Reviewed-by: Seth Jennings
Cc: David Woodhouse
Cc: Rik van Riel
Cc: Michel Lespinasse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
No reason require rbtree test code to be a module, allow it to be builtin
(streamlines my development process)Signed-off-by: Cody P Schafer
Reviewed-by: Seth Jennings
Cc: David Woodhouse
Cc: Rik van Riel
Cc: Michel Lespinasse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Just check that we examine all nodes in the tree for the postorder
iteration.Signed-off-by: Cody P Schafer
Reviewed-by: Seth Jennings
Cc: David Woodhouse
Cc: Rik van Riel
Cc: Michel Lespinasse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Because deletion (of the entire tree) is a relatively common use of the
rbtree_postorder iteration, and because doing it safely means fiddling
with temporary storage, provide a helper to simplify postorder rbtree
iteration.Signed-off-by: Cody P Schafer
Reviewed-by: Seth Jennings
Cc: David Woodhouse
Cc: Rik van Riel
Cc: Michel Lespinasse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Postorder iteration yields all of a node's children prior to yielding the
node itself, and this particular implementation also avoids examining the
leaf links in a node after that node has been yielded.In what I expect will be its most common usage, postorder iteration allows
the deletion of every node in an rbtree without modifying the rbtree nodes
(no _requirement_ that they be nulled) while avoiding referencing child
nodes after they have been "deleted" (most commonly, freed).I have only updated zswap to use this functionality at this point, but
numerous bits of code (most notably in the filesystem drivers) use a hand
rolled postorder iteration that NULLs child links as it traverses the
tree. Each of those instances could be replaced with this common
implementation.1 & 2 add rbtree postorder iteration functions.
3 adds testing of the iteration to the rbtree runtime tests
4 allows building the rbtree runtime tests as builtins
5 updates zswap.This patch:
Add postorder iteration functions for rbtree. These are useful for safely
freeing an entire rbtree without modifying the tree at all.Signed-off-by: Cody P Schafer
Reviewed-by: Seth Jennings
Cc: David Woodhouse
Cc: Rik van Riel
Cc: Michel Lespinasse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds