Doug / smarc-fsl-linux-kernel | Embedian Git Server

04 Sep, 2013

3 commits

f83b0a4e4 Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull pstore changes from Tony Luck:
"A big part of this is the addition of compression to the generic
pstore layer so that all backends can use the pitiful amounts of
storage they control more effectively. Three other small
fixes/cleanups too."

* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
pstore/ram: (really) fix undefined usage of rounddown_pow_of_two
pstore/ram: Read and write to the 'compressed' flag of pstore
efi-pstore: Read and write to the 'compressed' flag of pstore
erst: Read and write to the 'compressed' flag of pstore
powerpc/pseries: Read and write to the 'compressed' flag of pstore
pstore: Add file extension to pstore file if compressed
pstore: Add decompression support to pstore
pstore: Introduce new argument 'compressed' in the read callback
pstore: Add compression support to pstore
pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selected
pstore: Add new argument 'compressed' in pstore write callback
powerpc/pseries: Remove (de)compression in nvram with pstore enabled
pstore: d_alloc_name() doesn't return an ERR_PTR
acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()

Linus Torvalds
2013-09-04 12:14:06 +0800
32dad03d1 Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
"A lot of activities on the cgroup front. Most changes aren't visible
to userland at all at this point and are laying foundation for the
planned unified hierarchy.

- The biggest change is decoupling the lifetime management of css
(cgroup_subsys_state) from that of cgroup's. Because controllers
(cpu, memory, block and so on) will need to be dynamically enabled
and disabled, css which is the association point between a cgroup
and a controller may come and go dynamically across the lifetime of
a cgroup. Till now, css's were created when the associated cgroup
was created and stayed till the cgroup got destroyed.

Assumptions around this tight coupling permeated through cgroup
core and controllers. These assumptions are gradually removed,
which makes up the bulk of the patches, and the css destruction path is
completely decoupled from the cgroup destruction path. Note that
decoupling of creation path is relatively easy on top of these
changes and the patchset is pending for the next window.

- cgroup has its own event mechanism cgroup.event_control, which is
only used by memcg. It is overly complex trying to achieve high
flexibility whose benefits seem dubious at best. Going forward,
new events will simply generate a file modified event and the
existing mechanism is being made specific to memcg. This pull
request contains preparatory patches for such a change.

- Various fixes and cleanups"

Fixed up conflict in kernel/cgroup.c as per Tejun.

* 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits)
cgroup: fix cgroup_css() invocation in css_from_id()
cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()
cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup
cgroup: implement CFTYPE_NO_PREFIX
cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys
cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax
cgroup: fix cgroup_write_event_control()
cgroup: fix subsystem file accesses on the root cgroup
cgroup: change cgroup_from_id() to css_from_id()
cgroup: use css_get() in cgroup_create() to check CSS_ROOT
cpuset: remove an unncessary forward declaration
cgroup: RCU protect each cgroup_subsys_state release
cgroup: move subsys file removal to kill_css()
cgroup: factor out kill_css()
cgroup: decouple cgroup_subsys_state destruction from cgroup destruction
cgroup: replace cgroup->css_kill_cnt with ->nr_css
cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item
cgroup: move cgroup->subsys[] assignment to online_css()
cgroup: reorganize css init / exit paths
cgroup: add __rcu modifier to cgroup->subsys[]
...

Linus Torvalds
2013-09-04 09:25:03 +0800
542a086ac Merge tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches from Greg KH:
"Here's the big driver core pull request for 3.12-rc1.

Lots of tiny changes here fixing up the way sysfs attributes are
created, to try to make drivers simpler, and fix a whole class of race
conditions with creations of device attributes after the device was
announced to userspace.

All the various pieces are acked by the different subsystem
maintainers"

* tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits)
firmware loader: fix pending_fw_head list corruption
drivers/base/memory.c: introduce help macro to_memory_block
dynamic debug: line queries failing due to uninitialized local variable
sysfs: sysfs_create_groups returns a value.
debugfs: provide debugfs_create_x64() when disabled
rbd: convert bus code to use bus_groups
firmware: dcdbas: use binary attribute groups
sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled
driver core: add #include to core files.
HID: convert bus code to use dev_groups
Input: serio: convert bus code to use drv_groups
Input: gameport: convert bus code to use drv_groups
driver core: firmware: use __ATTR_RW()
driver core: core: use DEVICE_ATTR_RO
driver core: bus: use DRIVER_ATTR_WO()
driver core: create write-only attribute macros for devices and drivers
sysfs: create __ATTR_WO()
driver-core: platform: convert bus code to use dev_groups
workqueue: convert bus code to use dev_groups
MEI: convert bus code to use dev_groups
...

Linus Torvalds
2013-09-04 02:37:15 +0800

03 Sep, 2013

3 commits

fc6d0b037 Merge branch 'lockref' (locked reference counts)

Merge lockref infrastructure code by me and Waiman Long.

I already merged some of the preparatory patches that didn't actually do
any semantic changes earlier, but this merges the actual _reason_ for
those preparatory patches.

The "lockref" structure is a combination "spinlock and reference count"
that allows optimized reference count accesses. In particular, it
guarantees that the reference count will be updated AS IF the spinlock
was held, but using atomic accesses that cover both the reference count
and the spinlock words, we can often do the update without actually
having to take the lock.

This allows us to avoid the nastiest cases of spinlock contention on
large machines under heavy pathname lookup loads. When updating the
dentry reference counts on a large system, we'll still end up with the
cache line bouncing around, but that's much less noticeable than
actually having to spin waiting for the lock.
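
The idea described above can be modelled in userspace with C11 atomics. This is only a rough sketch, not the kernel's struct lockref: the packing (low 32 bits as the lock word, high 32 bits as the count) and every name ending in `_sketch` or starting with `sketch_` are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low 32 bits = lock word (0 means unlocked),
 * high 32 bits = reference count. Both live in one atomic word so a
 * single compare-and-exchange covers them together. */
struct lockref_sketch {
    _Atomic uint64_t lock_count;
};

#define SKETCH_LOCK_MASK 0xffffffffull

static uint32_t sketch_count(struct lockref_sketch *ref)
{
    return (uint32_t)(atomic_load(&ref->lock_count) >> 32);
}

/* Try to increment the count without taking the lock.
 * Returns false if the lock word is non-zero; a real caller would
 * then fall back to lock, count++, unlock. */
static bool sketch_get_fast(struct lockref_sketch *ref)
{
    uint64_t old = atomic_load(&ref->lock_count);

    while ((old & SKETCH_LOCK_MASK) == 0) {
        uint64_t next = old + (1ull << 32);

        /* Succeeds only if nobody changed the lock or the count
         * since we read them, i.e. "as if" the lock had been held. */
        if (atomic_compare_exchange_weak(&ref->lock_count, &old, next))
            return true;
        /* On failure, old now holds the current value; re-check it. */
    }
    return false;
}
```

The fast path still writes the shared word, so the cache line moves between CPUs, but no CPU ever spins waiting for another one.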

* lockref:
lockref: implement lockless reference count updates using cmpxchg()
lockref: uninline lockref helper functions
vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock()
vfs: use lockref_get_not_zero() for optimistic lockless dget_parent()
lockref: add 'lockref_get_or_lock() helper

Linus Torvalds
2013-09-03 23:08:21 +0800
15570086b vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock()

This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c
and re-implements it using the lockref infrastructure instead. It also
adds a lot of comments about what is actually going on, because turning
a dentry that was looked up using RCU into a long-lived reference
counted entry is one of the more subtle parts of the rcu walk.

We also used to be _particularly_ subtle in unlazy_walk() where we
re-validate both the dentry and its parent using the same sequence
count. We used to do it by nesting the locks and then verifying the
sequence count just once.

That was silly, because nested locking is expensive, but the sequence
count check is not. So this just re-validates the dentry and the parent
separately, avoiding the nested locking, and making the lockref lookup
possible.

Acked-by: Waiman Long
Signed-off-by: Linus Torvalds

Linus Torvalds
2013-09-03 02:38:06 +0800
df3d0bbcd vfs: use lockref_get_not_zero() for optimistic lockless dget_parent()

A valid parent pointer is always going to have a non-zero reference
count, but if we look up the parent optimistically without locking, we
have to protect against the (very unlikely) race against renaming
changing the parent from under us.

We do that by using lockref_get_not_zero(), and then re-checking the
parent pointer after getting a valid reference.
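
A userspace sketch of that "get unless zero, then re-check" pattern, using C11 atomics. The node type and function names are hypothetical. The sketch also assumes nodes are never freed, whereas the kernel relies on RCU to keep the parent's memory valid during the optimistic lookup.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: a refcount and a parent link. */
struct node_sketch {
    atomic_int refcount;
    struct node_sketch *_Atomic parent;
};

/* Increment the refcount unless it is already zero. */
static bool get_not_zero(struct node_sketch *n)
{
    int old = atomic_load(&n->refcount);

    while (old != 0) {
        if (atomic_compare_exchange_weak(&n->refcount, &old, old + 1))
            return true;
        /* old was refreshed by the failed exchange; loop re-checks it */
    }
    return false;
}

/* Optimistic parent lookup. Returns NULL when the caller should fall
 * back to a slower, locked path. */
static struct node_sketch *get_parent_optimistic(struct node_sketch *n)
{
    struct node_sketch *p = atomic_load(&n->parent);

    if (get_not_zero(p)) {
        /* Re-check: a concurrent rename may have changed the parent. */
        if (p == atomic_load(&n->parent))
            return p;
        atomic_fetch_sub(&p->refcount, 1); /* drop the stale reference */
    }
    return NULL;
}
```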

[ This is a re-implementation of a chunk from the original patch by
Waiman Long: "dcache: Enable lockless update of dentry's refcount".
I've completely rewritten the patch-series and split it up, but I'm
attributing this part to Waiman as it's close enough to his earlier
patch - Linus ]

Signed-off-by: Waiman Long
Signed-off-by: Linus Torvalds

Waiman Long
2013-09-03 02:29:22 +0800

31 Aug, 2013

1 commit

3bd11cf56 pstore/ram: (really) fix undefined usage of rounddown_pow_of_two

Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170

Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is
false (documented behaviour).
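
To see why a bare !is_power_of_2() test is not enough, here is a small userspace sketch. The `_sketch` helpers and `sanitize_size()` are hypothetical stand-ins, not the kernel macros, and the explicit non-zero guard is an assumption about the shape of the fix rather than a copy of the patch.

```c
#include <stdbool.h>

/* Mirrors the documented behaviour: 0 is not a power of two. */
static bool is_power_of_2_sketch(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Largest power of two <= n. Only meaningful for n != 0, which is
 * why callers must not pass 0. */
static unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
    unsigned long p = 1;

    while (p <= n / 2)
        p *= 2;
    return p;
}

/* Round a size down only when it is non-zero and not already a power
 * of two. Without the "size &&" guard, 0 would reach the rounding
 * helper because is_power_of_2_sketch(0) is false. */
static unsigned long sanitize_size(unsigned long size)
{
    if (size && !is_power_of_2_sketch(size))
        size = rounddown_pow_of_two_sketch(size);
    return size;
}
```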

Signed-off-by: Maxime Bizon
Acked-by: Kees Cook
Signed-off-by: Tony Luck

Maxime Bizon
2013-08-31 06:57:01 +0800

29 Aug, 2013

4 commits

c95389b4c Merge branch 'akpm' (patches from Andrew Morton)

Merge fixes from Andrew Morton:
"Five fixes.

err, make that six. let me try again"

* emailed patches from Andrew Morton:
fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbers
memcg: check that kmem_cache has memcg_params before accessing it
drivers/base/memory.c: fix show_mem_removable() to handle missing sections
IPC: bugfix for msgrcv with msgtyp < 0
Omnikey Cardman 4000: pull in ioctl.h in user header
timer_list: correct the iterator for timer_list

Linus Torvalds
2013-08-29 10:31:33 +0800
49fa8140e fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbers

While using pacemaker/corosync, the node numbers are generated using IP
address as opposed to serial node number generation. This may not fit
in an 8-byte string. Use a bigger string to print the complete node
number.
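
The arithmetic is easy to check in userspace: an unsigned 32-bit value needs up to 10 decimal digits plus a terminating NUL. The buffer size macro and helper below are hypothetical and only illustrate the truncation; they are not the ocfs2 code.

```c
#include <stdint.h>
#include <stdio.h>

/* 10 decimal digits for UINT32_MAX (4294967295) plus the NUL byte. */
#define NODESTR_LEN_SKETCH 11

/* Format a node number into buf. Returns the number of characters the
 * full number needs, which exceeds len - 1 when the output was cut. */
static int format_node(char *buf, size_t len, uint32_t node)
{
    return snprintf(buf, len, "%u", (unsigned int)node);
}
```

With an 8-byte buffer only the first 7 digits survive, so large node numbers would be printed incompletely.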

Signed-off-by: Goldwyn Rodrigues
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Goldwyn Rodrigues
2013-08-29 10:26:38 +0800
98474236f vfs: make the dentry cache use the lockref infrastructure

This just replaces the dentry count/lock combination with the lockref
structure that contains both a count and a spinlock, and does the
mechanical conversion to use the lockref infrastructure.

There are no semantic changes here; it's purely syntactic. The
reference lockref implementation uses the spinlock exactly the same way
that the old dcache code did, and the bulk of this patch is just
expanding the internal "d_count" use in the dcache code to use
"d_lockref.count" instead.

This is purely preparation for the real change to make the reference
count updates be lockless during the 3.12 merge window.

[ As with the previous commit, this is a rewritten version of a concept
originally from Waiman, so credit goes to him, blame for any errors
goes to me.

Waiman's patch had some semantic differences for taking advantage of
the lockless update in dget_parent(), while this patch is
intentionally a pure search-and-replace change with no semantic
changes. - Linus ]

Signed-off-by: Waiman Long
Signed-off-by: Linus Torvalds

Waiman Long
2013-08-29 09:24:59 +0800
f0cc6ffb8 Revert "fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink"

This reverts commit bb2314b47996491bbc5add73633905c3120b6268.

It wasn't necessarily wrong per se, but we're still busily discussing
the exact details of this all, so I'm going to revert it for now.

It's true that you can already do flink() through /proc and that flink()
isn't new. But as Brad Spengler points out, some secure environments do
not mount proc, and flink adds a new interface that can avoid path
lookup of the source for those kinds of environments.

We may re-do this (and even mark it for stable backporting back in 3.11
and possibly earlier) once the whole discussion about the interface is done.

Cc: Andy Lutomirski
Cc: Al Viro
Cc: Oleg Nesterov
Cc: Brad Spengler
Signed-off-by: Linus Torvalds

Linus Torvalds
2013-08-29 00:18:05 +0800

27 Aug, 2013

1 commit

83c425d22 Merge tag 'jfs-3.11-rc8' of git://github.com/kleikamp/linux-shaggy

Pull jfs fix from Dave Kleikamp:
"One JFS patch to fix an incompatibility with NFSv4 resulting in the
nfs client reporting a readdir loop"

* tag 'jfs-3.11-rc8' of git://github.com/kleikamp/linux-shaggy:
jfs: fix readdir cookie incompatibility with NFSv4

Linus Torvalds
2013-08-27 10:22:49 +0800

26 Aug, 2013

1 commit

4d4323ea2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"Assorted fixes from the last week or so"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
VFS: collect_mounts() should return an ERR_PTR
bfs: iget_locked() doesn't return an ERR_PTR
efs: iget_locked() doesn't return an ERR_PTR()
proc: kill the extra proc_readfd_common()->dir_emit_dots()
cope with potentially long ->d_dname() output for shmem/hugetlb

Linus Torvalds
2013-08-26 03:25:38 +0800

25 Aug, 2013

6 commits

5befb98b3 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"This is a set of small bug fixes for lpfc and zfcp and a fix for a
fairly nasty bug in sg where a process which cancels I/O completes in
a kernel thread which would then try to write back to the now gone
userspace and end up writing to a random kernel address instead"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] zfcp: remove access control tables interface (keep sysfs files)
[SCSI] zfcp: fix schedule-inside-lock in scsi_device list loops
[SCSI] zfcp: fix lock imbalance by reworking request queue locking
[SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal
[SCSI] lpfc: Don't force CONFIG_GENERIC_CSUM on

Linus Torvalds
2013-08-25 02:33:21 +0800
52e220d35 VFS: collect_mounts() should return an ERR_PTR

This should actually be returning an ERR_PTR on error instead of NULL.
That was how it was designed and all the callers expect it.

[AV: actually, that's what "VFS: Make clone_mnt()/copy_tree()/collect_mounts()
return errors" missed - originally collect_mounts() was expected to return
NULL on failure]

Cc: # 3.10+
Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro

Dan Carpenter
2013-08-25 00:10:29 +0800
821ff77c6 bfs: iget_locked() doesn't return an ERR_PTR

iget_locked() returns NULL on error; it doesn't return an ERR_PTR.

Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro

Dan Carpenter
2013-08-25 00:10:22 +0800
136eefa48 efs: iget_locked() doesn't return an ERR_PTR()

The iget_locked() function returns NULL on error and never an ERR_PTR.
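
Why mixing up the two conventions is a bug can be shown with a userspace model of error pointers. The `_sketch` helpers below are hypothetical re-creations of the idea (small negative errno values stored in the top of the address range), not the kernel's own definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values encoded into a pointer. */
#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr_sketch(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err_sketch(const void *p)
{
    return (long)(intptr_t)p;
}

/* True only for pointers in the last MAX_ERRNO_SKETCH addresses.
 * NULL is NOT an error pointer by this test. */
static inline bool is_err_sketch(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}
```

A caller that tests an error-pointer check against a function returning NULL never sees the failure, and a caller that tests for NULL against a function returning an error pointer treats the encoded error as a valid object.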

Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro

Dan Carpenter
2013-08-25 00:10:22 +0800
a5a1955e0 proc: kill the extra proc_readfd_common()->dir_emit_dots()

proc_readfd_common() does dir_emit_dots() twice in a row;
we need to do this only once.

Signed-off-by: Oleg Nesterov
Signed-off-by: Al Viro

Oleg Nesterov
2013-08-25 00:10:22 +0800
118b23022 cope with potentially long ->d_dname() output for shmem/hugetlb

dynamic_dname() is both too much and too little for those - the
output may be well in excess of 64 bytes dynamic_dname() assumes
to be enough (thanks to ashmem feeding really long names to
shmem_file_setup()) and vsnprintf() is an overkill for those
guys.

Signed-off-by: Al Viro

Al Viro
2013-08-25 00:10:17 +0800

24 Aug, 2013

2 commits

4bf93b50f nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection

Fix improper counting of the number of in-flight bio requests in the
BIO_EOPNOTSUPP error detection case.

sb_nbio must be incremented exactly as many times as complete() was
called (or will be called), because nilfs_segbuf_wait() calls
wait_for_completion() sb_nbio times:

    do {
        wait_for_completion(&segbuf->sb_bio_event);
    } while (--segbuf->sb_nbio > 0);

Two functions complete() and wait_for_completion() must be called the
same number of times for the same sb_bio_event. Otherwise,
wait_for_completion() will hang or leak.
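
The counting requirement can be illustrated with a single-threaded model in which a completion is just a counter of pending complete() calls. All names are hypothetical, and try_wait_sketch() returns false at the point where a real wait_for_completion() would block.

```c
#include <stdbool.h>

/* Hypothetical model of a completion: number of complete() calls that
 * have not yet been consumed by a waiter. */
struct completion_sketch {
    unsigned int done;
};

static void complete_sketch(struct completion_sketch *c)
{
    c->done++;
}

/* Consume one completion. Returns false where a real waiter would
 * block because nothing has been completed. */
static bool try_wait_sketch(struct completion_sketch *c)
{
    if (c->done == 0)
        return false;
    c->done--;
    return true;
}

/* Same loop shape as the one quoted above: wait nbio times (at least
 * once). Returns false if the loop would hang. */
static bool wait_all_sketch(struct completion_sketch *c, int nbio)
{
    do {
        if (!try_wait_sketch(c))
            return false;
    } while (--nbio > 0);
    return true;
}
```

If nbio is larger than the number of complete() calls the loop stalls; if it is smaller, unconsumed completions are left behind.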

Signed-off-by: Vyacheslav Dubeyko
Cc: Dan Carpenter
Acked-by: Ryusuke Konishi
Tested-by: Ryusuke Konishi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vyacheslav Dubeyko
2013-08-24 00:51:22 +0800
2df37a19c nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error

Remove double call of bio_put() in nilfs_end_bio_write() for the case of
BIO_EOPNOTSUPP error detection. The issue was found by Dan Carpenter,
who also suggested the first version of the fix.

Signed-off-by: Vyacheslav Dubeyko
Reported-by: Dan Carpenter
Acked-by: Ryusuke Konishi
Tested-by: Ryusuke Konishi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vyacheslav Dubeyko
2013-08-24 00:51:22 +0800

23 Aug, 2013

1 commit

09239ed4a sysfs: group.c: fix up kerneldoc

Fix up the wording of sysfs_create/remove_groups() a bit.

Reported-by: Anthony Foiani
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-23 00:23:28 +0800

22 Aug, 2013

16 commits

2c3a908b4 sysfs: sysfs.h: fix coding style issues

This fixes up the remaining coding style issues in sysfs.h

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:40:05 +0800
07ac62a60 sysfs: file.c: fix up broken string warnings

This fixes the coding style warnings in fs/sysfs/file.c for broken
strings across lines.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:37:42 +0800
37814ee0b sysfs: dir.c: fix up odd do/while indentation

This fixes up the odd do/while after an if statement warning in dir.c

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:36:02 +0800
060cc749e sysfs: fix up uaccess.h coding style warnings

This fixes the uaccess.h warnings in the sysfs .c files.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:34:59 +0800
ddfd6d074 sysfs: fix up 80 column coding style issues

This fixes up the 80 column coding style issues in the sysfs .c files.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:33:34 +0800
1b18dc2be sysfs: fix up space coding style issues

This fixes up all of the space-related coding style issues for the sysfs
code.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:28:26 +0800
ab9bf4be4 sysfs: remove trailing whitespace

This removes all trailing whitespace errors in the sysfs code.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:21:17 +0800
1b866757f sysfs: fix placement of EXPORT_SYMBOL()

The export should happen after the function, not at the bottom of the
file, so fix that up.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:17:47 +0800
9e2a47ed6 sysfs: group: update copyright to add myself and the LF

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:14:11 +0800
f9ae443b5 sysfs: group.c: add kerneldoc for sysfs_remove_group

sysfs_remove_group() never had kerneldoc, so add it, and fix up the
kerneldoc for sysfs_remove_groups() which didn't specify the parameters
properly.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:12:34 +0800
16aebf1c5 sysfs: group.c: fix up broken string coding style

checkpatch complains about the broken string in the file, and it's
correct, so fix it up.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:10:02 +0800
995d8ed94 sysfs: group.c: fix up some * coding style issues

This fixes up the * coding style warnings for the group.c sysfs file.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:07:29 +0800
e6c56920f sysfs: group.c: fix trailing whitespace

There were some trailing spaces in the file; fix that up.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:06:14 +0800
d363bc53e sysfs: group.c: move EXPORT_SYMBOL_GPL() to the proper location

This fixes up the coding style issue of incorrectly placing the
EXPORT_SYMBOL_GPL() macro; it should be right after the function itself,
not at the end of the file.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:04:12 +0800
3e9b2bae8 sysfs: add sysfs_create/remove_groups()

These functions are being open-coded in 3 different places in the driver
core, and other driver subsystems will want to start doing this as well,
so move it to the sysfs core to keep it all in one place, where we know
it is written properly.
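
A userspace sketch of the kind of loop such a helper centralizes: create each group in a NULL-terminated array and undo the ones already created if a later one fails. The types and function names are hypothetical, and the roll-back behaviour is an assumption about the pattern, not a copy of the sysfs code.

```c
#include <stddef.h>

/* Hypothetical stand-in for an attribute group. 'fail' lets a caller
 * simulate a creation error; 'created' records the current state. */
struct group_sketch {
    const char *name;
    int fail;
    int created;
};

static int create_group_sketch(struct group_sketch *g)
{
    if (g->fail)
        return -1;
    g->created = 1;
    return 0;
}

static void remove_group_sketch(struct group_sketch *g)
{
    g->created = 0;
}

/* Create every group in a NULL-terminated array. On the first error,
 * remove the groups created so far and return that error. */
static int create_groups_sketch(struct group_sketch **groups)
{
    int error = 0;
    int i;

    if (!groups)
        return 0;

    for (i = 0; groups[i]; i++) {
        error = create_group_sketch(groups[i]);
        if (error) {
            while (--i >= 0)
                remove_group_sketch(groups[i]);
            break;
        }
    }
    return error;
}
```

Keeping the unwind logic in one place means each caller cannot get the partial-failure case wrong in its own way.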

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-08-22 07:02:19 +0800
35dc24838 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal

There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
leads to one process writing data into the address space of some other
random unrelated process if the ioctl is interrupted by a signal.
What happens is the following:

- A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
underlying SCSI command will transfer data from the SCSI device to
the buffer provided in the ioctl)

- Before the command finishes, a signal is sent to the process waiting
in the ioctl. This will end up waking up the sg_ioctl() code:

    result = wait_event_interruptible(sfp->read_wait,
            (srp_done(sfp, srp) || sdp->detached));

but neither srp_done() nor sdp->detached is true, so we end up just
setting srp->orphan and returning to userspace:

    srp->orphan = 1;
    write_unlock_irq(&sfp->rq_list_lock);
    return result; /* -ERESTARTSYS because signal hit process */

At this point the original process is done with the ioctl and
blithely goes ahead handling the signal, reissuing the ioctl, etc.

- Eventually, the SCSI command issued by the first ioctl finishes and
ends up in sg_rq_end_io(). At the end of that function, we run through:

    write_lock_irqsave(&sfp->rq_list_lock, iflags);
    if (unlikely(srp->orphan)) {
        if (sfp->keep_orphan)
            srp->sg_io_owned = 0;
        else
            done = 0;
    }
    srp->done = done;
    write_unlock_irqrestore(&sfp->rq_list_lock, iflags);

    if (likely(done)) {
        /* Now wake up any sg_read() that is waiting for this
         * packet.
         */
        wake_up_interruptible(&sfp->read_wait);
        kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN);
        kref_put(&sfp->f_ref, sg_remove_sfp);
    } else {
        INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext);
        schedule_work(&srp->ew.work);
    }

Since srp->orphan *is* set, we set done to 0 (assuming the
userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
to run in a workqueue.

- In workqueue context we go through sg_rq_end_io_usercontext() ->
sg_finish_rem_req() -> blk_rq_unmap_user() -> ... ->
bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user().

The key point here is that we are doing copy_to_user() on a
workqueue -- that is, we're on a kernel thread with current->mm
equal to whatever random previous user process was scheduled before
this kernel thread. So we end up copying whatever data the SCSI
command returned to the virtual address of the buffer passed into
the original ioctl, but it's quite likely we do this copying into a
different address space!

As suggested by James Bottomley,
add a check for current->mm (which is NULL if we're on a kernel thread
without a real userspace address space) in bio_uncopy_user(), and skip
the copy if we're on a kernel thread.
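
The shape of that check can be sketched in userspace as follows. The task and mm structures, and the use of memcpy() in place of copy_to_user(), are hypothetical simplifications for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a task whose mm is NULL models a kernel
 * thread without a userspace address space of its own. */
struct mm_sketch {
    int id;
};

struct task_sketch {
    struct mm_sketch *mm;
};

/* Copy bounce-buffer data back to the "user" buffer only when the
 * current task has its own address space. Returns the number of
 * bytes copied (0 when the copy is skipped). */
static size_t uncopy_sketch(const struct task_sketch *current_task,
                            char *user_buf, const char *bounce,
                            size_t len)
{
    if (!current_task->mm)
        return 0; /* kernel thread: the destination is not ours */

    memcpy(user_buf, bounce, len);
    return len;
}
```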

There's no reason that I can think of for any caller of bio_uncopy_user()
to want to do copying on a kernel thread with a random active userspace
address space.

Huge thanks to Costa Sapuntzakis for the
original pointer to this bug in the sg code.

Signed-off-by: Roland Dreier
Tested-by: David Milburn
Cc: Jens Axboe
Cc:
Signed-off-by: James Bottomley

Roland Dreier
2013-08-22 01:58:35 +0800

20 Aug, 2013

2 commits

fd3930f70 proc: more readdir conversion bug-fixes

In the previous commit, Richard Genoud fixed proc_root_readdir(), which
had lost the check for whether all of the non-process /proc entries had
been returned or not.

But that in turn exposed _another_ bug, namely that the original readdir
conversion patch had yet another problem: it had lost the return value
of proc_readdir_de(), so now checking whether it had completed
successfully or not didn't actually work right anyway.

This reinstates the non-zero return for the "end of base entries" that
had also gotten lost in commit f0c3b5093add ("[readdir] convert
procfs"). So now you get all the base entries *and* you get all the
process entries, regardless of getdents buffer size.

(Side note: the Linux "getdents" manual page actually has a nice example
application for testing getdents, which can be easily modified to use
different buffers. Who knew? Man-pages can be useful)

Reported-by: Emmanuel Benisty
Reported-by: Marc Dionne
Cc: Richard Genoud
Cc: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2013-08-20 07:26:12 +0800
3f8f80f0c pstore/ram: Read and write to the 'compressed' flag of pstore

In pstore write, add the character 'C' (compressed) or 'D' (decompressed)
to the header while writing to the RAM persistent buffer. In pstore read,
read the header and update the 'compressed' flag accordingly.
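
A minimal sketch of such a one-character flag round trip. The "====" prefix and the overall header layout are assumptions made for illustration and may not match the real ramoops record header; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Write a hypothetical record header ending in 'C' or 'D'.
 * Returns the number of characters the header needs. */
static int write_hdr_sketch(char *buf, size_t len, bool compressed)
{
    return snprintf(buf, len, "====%c\n", compressed ? 'C' : 'D');
}

/* Parse the header and report the flag through *compressed.
 * Returns 0 on success, -1 if the header is missing or malformed. */
static int read_hdr_sketch(const char *buf, bool *compressed)
{
    char flag;

    if (sscanf(buf, "====%c", &flag) != 1)
        return -1;
    if (flag != 'C' && flag != 'D')
        return -1;

    *compressed = (flag == 'C');
    return 0;
}
```

Storing the flag in the record itself lets the reader decide whether to decompress without any state kept outside the persistent buffer.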

Signed-off-by: Aruna Balakrishnaiah
Reviewed-by: Kees Cook
Signed-off-by: Tony Luck

Aruna Balakrishnaiah
2013-08-20 02:53:50 +0800