Eric Lee / smarc-fsl-linux-kernel

03 Apr, 2012

6 commits

eee11e214 module: Remove module size limit

commit f946eeb9313ff1470758e171a60fe7438a2ded3f upstream.

Module size was limited to 64MB; this was a legacy limitation due to vmalloc()
which was removed a while ago.

Limiting module size to 64MB is both pointless and affects real world use
cases.

Cc: Tim Abbott
Signed-off-by: Sasha Levin
Signed-off-by: Rusty Russell
Signed-off-by: Greg Kroah-Hartman

Sasha Levin
2012-04-03 01:32:22 +0800
7ca476a69 PM / Hibernate: Enable usermodehelpers in hibernate() error path

commit 05b4877f6a4f1ba4952d1222213d262bf8c132b7 upstream.

If create_basic_memory_bitmaps() fails, usermodehelpers are not re-enabled
before returning. Fix this. And while at it, reword the goto labels so that
they look more meaningful.

Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman

Srivatsa S. Bhat
2012-04-03 01:32:10 +0800
a65dcdb33 genirq: Fix incorrect check for forced IRQ thread handler

commit 540b60e24f3f4781d80e47122f0c4486a03375b8 upstream.

We do not want a bitwise AND between boolean operands
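
A minimal sketch of why the operator matters, assuming one operand is a
"truth value" that is nonzero but not exactly 1. The helper names below are
hypothetical and are not taken from the kernel source.

```c
#include <stdbool.h>

/* Hypothetical helper: reports "set" as any nonzero value, not strictly 1. */
static int flag_is_set(unsigned long flags, unsigned long mask)
{
    return (int)(flags & mask);
}

/* Buggy form: bitwise AND compares bit patterns, so 1 & 2 is 0. */
static bool both_bitwise(int a, int b)
{
    return (a & b) != 0;
}

/* Fixed form: logical AND treats every nonzero operand as true. */
static bool both_logical(int a, int b)
{
    return a && b;
}
```

With a = 1 and b = flag_is_set(0x2, 0x2), the bitwise form reports false
although both conditions hold, while the logical form reports true.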

Signed-off-by: Alexander Gordeev
Cc: Oleg Nesterov
Link: http://lkml.kernel.org/r/20120309135912.GA2114@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Alexander Gordeev
2012-04-03 01:31:52 +0800
74f8c9894 genirq: Fix long-term regression in genirq irq_set_irq_type() handling

commit a09b659cd68c10ec6a30cb91ebd2c327fcd5bfe5 upstream.

In 2008, commit 0c5d1eb77a8be ("genirq: record trigger type") modified the
way set_irq_type() handles the 'no trigger' condition. However, this has
an adverse effect on PCMCIA support on Intel StrongARM and probably PXA
platforms.

PCMCIA has several status signals on the socket which can trigger
interrupts; some of these status signals depend on the card's mode
(whether it is configured in memory or IO mode). For example, cards have
a 'Ready/IRQ' signal: in memory mode, this provides an indication to
PCMCIA that the card has finished its power up initialization. In IO
mode, it provides the device interrupt signal. Other status signals
switch between on-board battery status and loud speaker output.

In classical PCMCIA implementations, where you have a specific socket
controller, the controller provides a method to mask interrupts from the
socket, and importantly ignore any state transitions on the pins which
correspond with interrupts once masked. This masking prevents unwanted
events caused by the removal and application of socket power being
forwarded.

However, on platforms where there is no socket controller, the PCMCIA
status and interrupt signals are routed to standard edge-triggered GPIOs.
These GPIOs can be configured to interrupt on rising edge, falling edge,
or never. This is where the problems start.

Edge triggered interrupts are required to record events while disabled via
the usual methods of {free,request,disable,enable}_irq() to prevent
problems with dropped interrupts (eg, the 8390 driver uses disable_irq()
to defer the delivery of interrupts). As a result, these interfaces can
not be used to implement the desired behaviour.

The side effect of this is that if the 'Ready/IRQ' GPIO is disabled via
disable_irq() on suspend, and enabled via enable_irq() after resume, we
will record the state transitions caused by powering events as valid
interrupts, and forward them to the card driver, which may attempt to
access a card which is not powered up.

This delays resume while drivers spin in their interrupt handlers,
and leads to complaints from drivers before they realize what's happened.

Moreover, in the case of the 'Ready/IRQ' signal, this is requested and
freed by the card driver itself; the PCMCIA core has no idea whether the
interrupt is requested, and, therefore, whether a call to disable_irq()
would be valid. (We tried this around 2.4.17 / 2.5.1 kernel era, and
ended up throwing it out because of this problem.)

Therefore, it was decided back in around 2002 to disable the edge
triggering instead, resulting in all state transitions on the GPIO being
ignored. That's what we actually need the hardware to do.

The commit above changes this behaviour; it explicitly prevents the 'no
trigger' state being selected.

The reason that request_irq() does not accept the 'no trigger' state is
for compatibility with existing drivers which do not provide their desired
triggering configuration. The set_irq_type() function is 'new' and not
used by non-trigger aware drivers.

Therefore, revert this change, and restore previously working platforms
back to their former state.

Signed-off-by: Russell King
Cc: linux@arm.linux.org.uk
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Russell King
2012-04-03 01:31:52 +0800
b2af31bf8 ntp: Fix integer overflow when setting time

commit a078c6d0e6288fad6d83fb6d5edd91ddb7b6ab33 upstream.

'long secs' is passed as divisor to div_s64, which accepts a 32bit
divisor. On 64bit machines that value is truncated from 8 bytes
to 4, causing a divide by zero when the number is bigger than
(1 << 32) - 1 and all 32 lower bits are 0.

Use div64_long() instead.
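
The truncation can be illustrated in plain user-space C. This is only a
sketch of the arithmetic, not the kernel code; the function names are
hypothetical.

```c
#include <stdint.h>

/*
 * Mimics passing a 64-bit value to a parameter declared as a 32-bit
 * divisor: only the low 32 bits survive. The final conversion to a
 * signed type is implementation-defined for large values, but a zero
 * low word always yields 0.
 */
static int32_t as_s32_divisor(int64_t secs)
{
    return (int32_t)(uint32_t)secs;
}

/* Returns 1 if dividing by the truncated value would divide by zero. */
static int truncated_divisor_is_zero(int64_t secs)
{
    return as_s32_divisor(secs) == 0;
}
```

For example, secs = 1 << 32 is a perfectly valid nonzero 64-bit value, yet
its truncated form is 0.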

Signed-off-by: Sasha Levin
Cc: johnstul@us.ibm.com
Link: http://lkml.kernel.org/r/1331829374-31543-2-git-send-email-levinsasha928@gmail.com
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Sasha Levin
2012-04-03 01:31:52 +0800
0a5ced57e futex: Cover all PI opcodes with cmpxchg enabled check

commit 59263b513c11398cd66a52d4c5b2b118ce1e0359 upstream.

Some of the newer futex PI opcodes do not check the cmpxchg enabled
variable and call unconditionally into the handling functions. Cover
all PI opcodes in a separate check.
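
A rough sketch of what "a separate check" can look like: one switch that
groups every PI opcode ahead of the real dispatch. The enum, the function
name and the -ENOSYS return value are assumptions made for illustration;
they are not quoted from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical opcode set; the real futex opcodes are defined elsewhere. */
enum op {
    OP_WAIT,
    OP_WAKE,
    OP_LOCK_PI,
    OP_UNLOCK_PI,
    OP_TRYLOCK_PI,
    OP_WAIT_REQUEUE_PI,
    OP_CMP_REQUEUE_PI
};

/* Rejects every PI opcode up front when cmpxchg support is unavailable. */
static int check_op(enum op cmd, bool cmpxchg_enabled)
{
    switch (cmd) {
    case OP_LOCK_PI:
    case OP_UNLOCK_PI:
    case OP_TRYLOCK_PI:
    case OP_WAIT_REQUEUE_PI:
    case OP_CMP_REQUEUE_PI:
        if (!cmpxchg_enabled)
            return -ENOSYS; /* assumed error code */
        break;
    default:
        break;
    }
    return 0; /* safe to continue to the per-opcode handler */
}
```

Keeping the guard in one place means a newly added PI opcode only needs a
new case label rather than its own copy of the check.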

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Darren Hart
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2012-04-03 01:31:34 +0800

16 Mar, 2012

1 commit

79f0713d4 prctl: use CAP_SYS_RESOURCE for PR_SET_MM option

CAP_SYS_ADMIN is already overloaded left and right, so to have more
fine-grained access control use CAP_SYS_RESOURCE here.

CAP_SYS_RESOURCE is chosen because this prctl option allows the current
process to adjust some fields of the memory map descriptor, which rather
represents what the process owns: pointers to code, data, stack
segments, command line, auxiliary vector data, etc.

Suggested-by: Michael Kerrisk
Acked-by: Kees Cook
Acked-by: Michael Kerrisk
Cc: Pavel Emelyanov
Cc: Tejun Heo
Cc: Oleg Nesterov
Cc: Paul Bolle
Cc: KOSAKI Motohiro
Signed-off-by: Cyrill Gorcunov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cyrill Gorcunov
2012-03-16 08:03:03 +0800

15 Mar, 2012

1 commit

f1cbd03f5 Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Been sitting on this for a while, but lets get this out the door.
This fixes various important bugs for 3.3 final, along with a few more
trivial ones. Please pull!"

* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix ioc leak in put_io_context
block, sx8: fix pointer math issue getting fw version
Block: use a freezable workqueue for disk-event polling
drivers/block/DAC960: fix -Wuninitialized warning
drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning
block: fix __blkdev_get and add_disk race condition
block: Fix setting bio flags in drivers (sd_dif/floppy)
block: Fix NULL pointer dereference in sd_revalidate_disk
block: exit_io_context() should call elevator_exit_icq_fn()
block: simplify ioc_release_fn()
block: replace icq->changed with icq->flags

Linus Torvalds
2012-03-15 08:16:45 +0800

08 Mar, 2012

1 commit

4293f20c1 Revert "CPU hotplug, cpusets, suspend: Don't touch cpusets during suspend/resume"

This reverts commit 8f2f748b0656257153bcf0941df8d6060acc5ca6.

It causes some odd regression that we have not figured out, and it's too
late in the -rc series to try to figure it out now.

As reported by Konstantin Khlebnikov, it causes consistent hangs on his
laptop (Thinkpad x220: 2x cores + HT). They can be avoided by adding
calls to "rebuild_sched_domains();" in cpuset_cpu_[in]active() for the
CPU_{ONLINE/DOWN_FAILED/DOWN_PREPARE}_FROZEN cases, but it's not at all
clear why, and it makes no sense.

Konstantin's config doesn't even have CONFIG_CPUSETS enabled, just to
make things even more interesting. So it's not the cpusets, it's just
the scheduling domains.

So until this is understood, revert.

Bisected-reported-and-tested-by: Konstantin Khlebnikov
Acked-by: Peter Zijlstra
Acked-by: Ingo Molnar
Acked-by: Srivatsa S. Bhat
Signed-off-by: Linus Torvalds

Linus Torvalds
2012-03-08 00:21:19 +0800

07 Mar, 2012

1 commit

52abb700e genirq: Clear action->thread_mask if IRQ_ONESHOT is not set

Commit ac5637611 (genirq: Unmask oneshot irqs when thread was not woken)
fails to unmask when a !IRQ_ONESHOT threaded handler is handled by
handle_level_irq.

This happens because thread_mask is or'ed unconditionally in
irq_wake_thread(), but for !IRQ_ONESHOT interrupts never cleared. So
the check for !desc->thread_active fails and keeps the interrupt
disabled.

Keep the thread_mask zero for !IRQ_ONESHOT interrupts.

Document the thread_mask magic while at it.

Reported-and-tested-by: Sven Joachim
Reported-and-tested-by: Stefan Lippers-Hollmann
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner
Signed-off-by: Linus Torvalds

Thomas Gleixner
2012-03-07 08:46:39 +0800

06 Mar, 2012

7 commits

6027ce497 hung_task: fix the broken rcu_lock_break() logic

check_hung_uninterruptible_tasks()->rcu_lock_break() introduced by
"softlockup: check all tasks in hung_task" commit ce9dbe24 looks
absolutely wrong.

- rcu_lock_break() does put_task_struct(). If the task has exited
  it is not safe to even read its ->state, nothing protects this
  task_struct.

- The TASK_DEAD checks are wrong too. Contrary to the comment, we
  can't use it to check if the task was unhashed. It can be unhashed
  without TASK_DEAD, or it can be valid with TASK_DEAD.

  For example, an autoreaping task can do release_task(current)
  long before it sets TASK_DEAD in do_exit().

  Or, a zombie task can have ->state == TASK_DEAD but release_task()
  was not called, and in this case we must not break the loop.

Change this code to check pid_alive() instead, and do this before we drop
the reference to the task_struct.

Note: while_each_thread() under rcu_read_lock() is not really safe, it can
livelock. This will be fixed later, but fortunately in this case the
"max_count" logic saves us anyway.

Signed-off-by: Oleg Nesterov
Acked-by: Frederic Weisbecker
Acked-by: Mandeep Singh Baines
Acked-by: Paul E. McKenney
Cc: Tetsuo Handa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-03-06 07:49:42 +0800
6e27f63ed vfork: kill PF_STARTING

Previously it was (ab)used by utrace. Then it was wrongly used by the
scheduler code.

Currently it is not used, kill it before it finds the new erroneous user.

Signed-off-by: Oleg Nesterov
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-03-06 07:49:42 +0800
57b59c4a1 coredump_wait: don't call complete_vfork_done()

Now that CLONE_VFORK is killable, coredump_wait() no longer needs
complete_vfork_done(). zap_threads() should find and kill all tasks with
the same ->mm, this includes our parent if ->vfork_done is set.

mm_release() becomes the only caller, unexport complete_vfork_done().

Signed-off-by: Oleg Nesterov
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-03-06 07:49:42 +0800
d68b46fe1 vfork: make it killable

Make vfork() killable.

Change do_fork(CLONE_VFORK) to do wait_for_completion_killable(). If it
fails we do not return to the user-mode and never touch the memory shared
with our child.

However, in this case we should clear child->vfork_done before return, we
use task_lock() in do_fork()->wait_for_vfork_done() and
complete_vfork_done() to serialize with each other.

Note: now that we use task_lock() we don't really need completion, we
could turn task->vfork_done into "task_struct *wake_up_me" but this needs
some complications.

NOTE: this and the next patches do not affect in-kernel users of
CLONE_VFORK, kernel threads run with all signals ignored including
SIGKILL/SIGSTOP.

However this is obviously the user-visible change. Not only a fatal
signal can kill the vforking parent, a sub-thread can do execve or
exit_group() and kill the thread sleeping in vfork().

Signed-off-by: Oleg Nesterov
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-03-06 07:49:42 +0800
c415c3b47 vfork: introduce complete_vfork_done()

No functional changes.

Move the clear-and-complete-vfork_done code into the new trivial helper,
complete_vfork_done().

Signed-off-by: Oleg Nesterov
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-03-06 07:49:42 +0800
f986a499e kprobes: return proper error code from register_kprobe()

register_kprobe() aborts if the address of the new request falls in a
prohibited area (such as ftrace pouch, __kprobes annotated functions,
non-kernel text addresses, jump label text). We however don't return the
right error on this abort, resulting in a silent failure - incorrect
adding/reporting of kprobes ('perf probe do_fork+18' or 'perf probe
mcount' for instance).

In V2 we are incorporating Masami Hiramatsu's feedback.

This patch fixes it by returning -EINVAL upon failure.
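
The failure mode is a common one with goto-style exits: the return variable
keeps its initial "success" value on the abort path. The sketch below shows
the pattern in isolation; the address check and function names are
hypothetical stand-ins, not the kprobes code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "prohibited area" test. */
static bool addr_is_prohibited(unsigned long addr)
{
    return addr < 0x1000;
}

/* Buggy shape: aborts, but the caller still sees 0 (success). */
static int register_probe_buggy(unsigned long addr)
{
    int ret = 0;

    if (addr_is_prohibited(addr))
        goto out; /* ret was never set to an error */
    /* ... registration work would happen here ... */
out:
    return ret;
}

/* Fixed shape: the abort path sets an error code before leaving. */
static int register_probe_fixed(unsigned long addr)
{
    int ret = 0;

    if (addr_is_prohibited(addr)) {
        ret = -EINVAL;
        goto out;
    }
    /* ... registration work would happen here ... */
out:
    return ret;
}
```

A caller that checks only the return value cannot tell that the buggy
variant did nothing, which is the "silent failure" described above.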

While we are here, rename the label used for exit to be more appropriate.

Signed-off-by: Ananth N Mavinakayanahalli
Signed-off-by: Prashanth K Nageshappa
Acked-by: Masami Hiramatsu
Cc: Jason Baron
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Prashanth Nageshappa
2012-03-06 07:49:42 +0800
c22ab3329 kmsg_dump: don't run on non-error paths by default

Since commit 04c6862c055f ("kmsg_dump: add kmsg_dump() calls to the
reboot, halt, poweroff and emergency_restart paths"), kmsg_dump() gets
run on normal paths including poweroff and reboot.

This is less than ideal given pstore implementations that can only
represent single backtraces, since a reboot may overwrite a stored oops
before it's been picked up by userspace. In addition, some pstore
backends may have low performance and provide a significant delay in
reboot as a result.

This patch adds a printk.always_kmsg_dump kernel parameter (which can also
be changed from userspace). Without it, the code will only be run on
failure paths rather than on normal paths. The option can be enabled in
environments where there's a desire to attempt to audit whether or not a
reboot was cleanly requested or not.

Signed-off-by: Matthew Garrett
Acked-by: Seiji Aguchi
Cc: Seiji Aguchi
Cc: David Woodhouse
Cc: Marco Stornelli
Cc: Artem Bityutskiy
Cc: KOSAKI Motohiro
Cc: Vivek Goyal
Cc: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Garrett
2012-03-06 07:49:42 +0800

03 Mar, 2012

1 commit

2273d5ccb Merge branches 'core-urgent-for-linus', 'perf-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pulling latest branches from Ingo:

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
memblock: Fix size aligning of memblock_alloc_base_nid()

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Ensure offset provided is not greater than function length without DWARF info too
perf tools: Ensure comm string is properly terminated
perf probe: Ensure offset provided is not greater than function length
perf evlist: Return first evsel for non-sample event on old kernel
perf/hwbp: Fix a possible memory leak

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
CPU hotplug, cpusets, suspend: Don't touch cpusets during suspend/resume

Linus Torvalds
2012-03-03 03:38:43 +0800

02 Mar, 2012

1 commit

62d3c5439 Block: use a freezable workqueue for disk-event polling

This patch (as1519) fixes a bug in the block layer's disk-events
polling. The polling is done by a work routine queued on the
system_nrt_wq workqueue. Since that workqueue isn't freezable, the
polling continues even in the middle of a system sleep transition.

Obviously, polling a suspended drive for media changes and such isn't
a good thing to do; in the case of USB mass-storage devices it can
lead to real problems requiring device resets and even re-enumeration.

The patch fixes things by creating a new system-wide, non-reentrant,
freezable workqueue and using it for disk-events polling.

Signed-off-by: Alan Stern
CC:
Acked-by: Tejun Heo
Acked-by: Rafael J. Wysocki
Signed-off-by: Jens Axboe

Alan Stern
2012-03-02 17:51:00 +0800

28 Feb, 2012

1 commit

30ce2f7ee perf/hwbp: Fix a possible memory leak

If kzalloc() for TYPE_DATA failed on a given cpu, previous chunk
of TYPE_INST will be leaked. Fix it.

Thanks to Peter Zijlstra for suggesting this better solution. It
should work as long as the initial value of the region is all
0's and that's the case of static (per-cpu) memory allocation.
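
One way to read the approach described above: because the table starts out
all zero, the error path can free every entry unconditionally, and entries
that were never allocated are simply NULL. The user-space sketch below
assumes that reading; the table layout, the counters and the fault injection
knob are hypothetical.

```c
#include <stdlib.h>

#define NR_CPUS  4
#define NR_TYPES 2 /* stands in for TYPE_INST and TYPE_DATA */

/* Static storage, so every pointer starts out as NULL. */
static unsigned int *slots[NR_CPUS][NR_TYPES];
static int live_allocs;

/* Hypothetical fault injection: this many allocations succeed, then one
 * fails. A negative value disables the injected failure. */
static int fail_at = -1;

static void *alloc_slot(void)
{
    void *p;

    if (fail_at == 0) {
        fail_at = -1;
        return NULL;
    }
    if (fail_at > 0)
        fail_at--;
    p = calloc(1, sizeof(unsigned int));
    if (p)
        live_allocs++;
    return p;
}

static void free_slot(void *p)
{
    if (p)
        live_allocs--;
    free(p); /* free(NULL) is a no-op */
}

/* Allocates every slot; on failure frees the whole table, including the
 * chunk already allocated for the cpu that failed. */
static int init_all(void)
{
    int cpu, type;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        for (type = 0; type < NR_TYPES; type++) {
            slots[cpu][type] = alloc_slot();
            if (!slots[cpu][type])
                goto err;
        }
    }
    return 0;
err:
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        for (type = 0; type < NR_TYPES; type++) {
            free_slot(slots[cpu][type]);
            slots[cpu][type] = NULL;
        }
    }
    return -1;
}
```

Setting fail_at = 3 makes the second allocation for cpu 1 fail, which is the
case where the first chunk for that cpu would otherwise be leaked.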

Signed-off-by: Namhyung Kim
Acked-by: Frederic Weisbecker
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/1330391978-28070-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Ingo Molnar

Namhyung Kim
2012-02-28 16:52:54 +0800

27 Feb, 2012

3 commits

70ca00db1 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/events: Revert trace_sched_stat_sleeptime()

Linus Torvalds
2012-02-27 23:55:39 +0800
faf3502a3 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Handle pending irqs in irq_startup()
genirq: Unmask oneshot irqs when thread was not woken

Linus Torvalds
2012-02-27 23:54:57 +0800
8f2f748b0 CPU hotplug, cpusets, suspend: Don't touch cpusets during suspend/resume

Currently, during CPU hotplug, the cpuset callbacks modify the cpusets
to reflect the state of the system, and this handling is asymmetric.
That is, upon CPU offline, that CPU is removed from all cpusets. However
when it comes back online, it is put back only to the root cpuset.

This gives rise to a significant problem during suspend/resume. During
suspend, we offline all non-boot cpus and during resume we online them back.
Which means, after a resume, all cpusets (except the root cpuset) will be
restricted to just one single CPU (the boot cpu). But the whole point of
suspend/resume is to restore the system to a state which is as close as
possible to how it was before suspend.

So to fix this, don't touch cpusets during suspend/resume. That is, modify
the cpuset-related CPU hotplug callback to just ignore CPU hotplug when it
is initiated as part of the suspend/resume sequence.

Reported-by: Prashanth Nageshappa
Signed-off-by: Srivatsa S. Bhat
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/4F460D7B.1020703@linux.vnet.ibm.com
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar

Srivatsa S. Bhat
2012-02-27 18:38:13 +0800

25 Feb, 2012

1 commit

d80e731ec epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()

This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.

epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.

This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.

__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.

ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This makes this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it is used with
epoll.

The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.

In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.

Note:

- we do not have wake_up_all_poll() but wake_up_poll()
  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.

- signalfd_cleanup() uses POLLHUP along with POLLFREE,
  we need a couple of simple changes in eventpoll.c to
  make sure it can't be "lost".

Reported-by: Maxime Bizon
Cc:
Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-02-25 03:42:50 +0800

22 Feb, 2012

1 commit

8c79a045f sched/events: Revert trace_sched_stat_sleeptime()

Commit 1ac9bc69 ("sched/tracing: Add a new tracepoint for sleeptime")
added a new sched:sched_stat_sleeptime tracepoint.

It's broken: the first sample we get on a task might be bad because
of a stale sleep_start value that wasn't reset at the last task switch
because the tracepoint was not active.

It also breaks the existing schedstat samples due to the side
effects of:

- se->statistics.sleep_start = 0;
...
- se->statistics.block_start = 0;

Nor do I see means to fix it without adding overhead to the scheduler
fast path, which I'm not willing to for the sake of redundant
instrumentation.

Most importantly, sleep time information can already be constructed
by tracing context switches and wakeups, and taking the timestamp
difference between the schedule-out, the wakeup and the schedule-in.
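
As a worked example of that arithmetic, assuming a trace consumer has already
collected the three timestamps for one sleep cycle of a task. The struct and
its field names are hypothetical; they do not correspond to any tracepoint
format.

```c
#include <stdint.h>

/* Hypothetical per-cycle record assembled from trace events. */
struct sleep_cycle {
    uint64_t sched_out_ns; /* task switched out */
    uint64_t wakeup_ns;    /* task woken up */
    uint64_t sched_in_ns;  /* task switched back in */
};

/* Time spent asleep: from schedule-out until the wakeup. */
static uint64_t slept_ns(const struct sleep_cycle *c)
{
    return c->wakeup_ns - c->sched_out_ns;
}

/* Time spent runnable but waiting: from wakeup until schedule-in. */
static uint64_t waited_ns(const struct sleep_cycle *c)
{
    return c->sched_in_ns - c->wakeup_ns;
}
```

For timestamps 1000, 4000 and 4500 this yields 3000 ns asleep and 500 ns
waiting for a CPU.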

Signed-off-by: Peter Zijlstra
Cc: Andrew Vagin
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/n/tip-pc4c9qhl8q6vg3bs4j6k0rbd@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2012-02-22 19:06:55 +0800

21 Feb, 2012

1 commit

8ebbfb495 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Assorted fixes, sat in -next for a week or so...

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ocfs2: deal with wraparounds of i_nlink in ocfs2_rename()
vfs: fix compat_sys_stat() handling of overflows in st_nlink
quota: Fix deadlock with suspend and quotas
vfs: Provide function to get superblock and wait for it to thaw
vfs: fix panic in __d_lookup() with high dentry hashtable counts
autofs4 - fix lockdep splat in autofs
vfs: fix d_inode_lookup() dentry ref leak

Linus Torvalds
2012-02-21 08:13:58 +0800

15 Feb, 2012

2 commits

b4bc724e8 genirq: Handle pending irqs in irq_startup()

An interrupt might be pending when irq_startup() is called, but the
startup code does not invoke the resend logic. In some cases this
prevents the device from issuing another interrupt which renders the
device non functional.

Call the resend function in irq_startup() to keep things going.

Reported-and-tested-by: Russell King
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2012-02-15 18:56:59 +0800
ac5637611 genirq: Unmask oneshot irqs when thread was not woken

When the primary handler of an interrupt which is marked IRQ_ONESHOT
returns IRQ_HANDLED or IRQ_NONE, then the interrupt thread is not
woken and the unmask logic of the interrupt line is never
invoked. This keeps the interrupt masked forever.

This was not noticed as most IRQ_ONESHOT users wake the thread
unconditionally (usually because they cannot access the underlying
device from hard interrupt context). Though this behaviour was nowhere
documented and not necessarily intentional. Some drivers can avoid the
thread wakeup in certain cases and run into the situation where the
interrupt line is kept masked.

Handle it gracefully.

Reported-and-tested-by: Lothar Wassmann
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2012-02-15 18:56:59 +0800

14 Feb, 2012

3 commits

074b85175 vfs: fix panic in __d_lookup() with high dentry hashtable counts

When the number of dentry cache hash table entries gets too high
(2147483648 entries), as happens by default on a 16TB system, use of a
signed integer in the dcache_init() initialization loop prevents the
dentry_hashtable from getting initialized, causing a panic in
__d_lookup(). Fix this in dcache_init() and similar areas.
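
A small sketch of why a signed loop counter fails at exactly this size:
2147483648 does not fit in a 32-bit signed integer, so the loop bound becomes
negative and the loop body never runs. The helpers are hypothetical and only
evaluate the loop condition; they do not iterate.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets the 32 bits as a signed value (int32_t is two's complement). */
static int32_t reinterpret_as_s32(uint32_t v)
{
    int32_t s;

    memcpy(&s, &v, sizeof s);
    return s;
}

/* Would "for (int i = 0; i < n; i++)" execute its body at least once? */
static int signed_loop_runs(uint32_t nr_entries)
{
    return 0 < reinterpret_as_s32(nr_entries);
}

/* Same question for "for (unsigned int i = 0; i < n; i++)". */
static int unsigned_loop_runs(uint32_t nr_entries)
{
    return 0u < nr_entries;
}
```

With 2147483648 entries the signed bound is INT32_MIN, so the table would be
left uninitialized; the unsigned form behaves as intended.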

Signed-off-by: Dimitri Sivanich
Acked-by: David S. Miller
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Dimitri Sivanich
2012-02-14 09:45:38 +0800
e3f89f4ae Merge tag 'for-linus' of git://github.com/rustyrussell/linux

* tag 'for-linus' of git://github.com/rustyrussell/linux:
module: fix broken isapnp handling in file2alias
module: make module param bint handle nul value

Linus Torvalds
2012-02-14 08:59:53 +0800
10f296cbf module: make module param bint handle nul value

Allow the bint param to accept nul values, just as the bool param does.

Signed-off-by: Dave Young
Cc: Rusty Russell
Signed-off-by: Rusty Russell

Dave Young
2012-02-14 08:32:15 +0800

12 Feb, 2012

1 commit

3ec1e88b3 Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Says Jens:

"Time to push off some of the pending items. I really wanted to wait
until we had the regression nailed, but alas it's not quite there yet.
But I'm very confident that it's "just" a missing expire on exit, so
fix from Tejun should be fairly trivial. I'm headed out for a week on
the slopes.

- Killing the barrier part of mtip32xx. It doesn't really support
barriers, and it doesn't need them (writes are fully ordered).

- A few fixes from Dan Carpenter, preventing overflows of integer
multiplication.

- A fixup for loop, fixing a previous commit that didn't quite solve
the partial read problem from Dave Young.

- A bio integer overflow fix from Kent Overstreet.

- Improvement/fix of the door "keep locked" part of the cdrom shared
code from Paolo Benzini.

- A few cfq fixes from Shaohua Li.

- A fix for bsg sysfs warning when removing a file it did not create
from Stanislaw Gruszka.

- Two fixes for floppy from Vivek, preventing a crash.

- A few block core fixes from Tejun. One killing the over-optimized
ioc exit path, cleaning that up nicely. Two others fixing an oops
on elevator switch, due to calling into the scheduler merge check
code without holding the queue lock."

* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix lockdep warning on io_context release put_io_context()
relay: prevent integer overflow in relay_open()
loop: zero fill bio instead of return -EIO for partial read
bio: don't overflow in bio_get_nr_vecs()
floppy: Fix a crash during rmmod
floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never called
cdrom: move shared static to cdrom_device_info
bsg: fix sysfs link remove warning
block: don't call elevator callbacks for plug merges
block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions
mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the unused member of struct driver_data
block: strip out locking optimization in put_io_context()
cdrom: use copy_to_user() without the underscores
block: fix ioc locking warning
block: fix NULL icq_cache reference
block,cfq: change code order

Linus Torvalds
2012-02-12 02:07:11 +0800

11 Feb, 2012

1 commit

ce2814f22 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix double start/stop in x86_pmu_start()
perf evsel: Fix an issue where perf report fails to show the proper percentage
perf tools: Fix prefix matching for kernel maps
perf tools: Fix perf stack to non executable on x86_64
perf: Remove deprecated WARN_ON_ONCE()

Linus Torvalds
2012-02-11 01:05:07 +0800

10 Feb, 2012

1 commit

f6302f1bc relay: prevent integer overflow in relay_open()

"subbuf_size" and "n_subbufs" come from the user and they need to be
capped to prevent an integer overflow.
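
A minimal sketch of one way to cap the two values: test the multiplication
with a division before performing it. The limit of UINT_MAX and the function
name are assumptions for illustration, not a quote of the patch.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Stores subbuf_size * n_subbufs in *total and returns 0, or returns -1
 * when either value is zero or the product would exceed UINT_MAX.
 */
static int checked_total(size_t subbuf_size, size_t n_subbufs, size_t *total)
{
    if (subbuf_size == 0 || n_subbufs == 0)
        return -1;
    /* Dividing first avoids ever computing an overflowing product. */
    if (subbuf_size > UINT_MAX / n_subbufs)
        return -1;
    *total = subbuf_size * n_subbufs;
    return 0;
}
```

The division form is preferred over checking the product afterwards, since
an overflowed product can look like a small, plausible size.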

Signed-off-by: Dan Carpenter
Cc: stable@kernel.org
Signed-off-by: Jens Axboe

Dan Carpenter
2012-02-10 16:04:49 +0800

07 Feb, 2012

2 commits

f39d47ff8 perf: Fix double start/stop in x86_pmu_start()

The following patch fixes a bug introduced by the following
commit:

e050e3f0a71b ("perf: Fix broken interrupt rate throttling")

The patch caused the following warning to pop up depending on
the sampling frequency adjustments:

------------[ cut here ]------------
WARNING: at arch/x86/kernel/cpu/perf_event.c:995 x86_pmu_start+0x79/0xd4()

It was caused by the following call sequence:

  perf_adjust_freq_unthr_context.part() {
      stop()
      if (delta > 0) {
          perf_adjust_period() {
              if (period > 8*...) {
                  stop()
                  ...
                  start()
              }
          }
      }
      start()
  }

Which caused a double start and a double stop, thus triggering
the assert in x86_pmu_start().

The patch fixes the problem by avoiding the double calls. We
pass a new argument to perf_adjust_period() to indicate whether
or not the event is already stopped. We can't just remove the
start/stop from that function because it's called from
__perf_event_overflow where the event needs to be reloaded via a
stop/start back-to-back call.
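
The effect of the extra argument can be modelled with a toy event that counts
redundant start/stop calls. Everything below is a simplified, hypothetical
model of the call sequence shown earlier, not the perf code itself.

```c
#include <stdbool.h>

/* Toy event: tracks state and counts stop-while-stopped or
 * start-while-running calls. */
struct event {
    bool running;
    int double_calls;
};

static void ev_stop(struct event *e)
{
    if (!e->running)
        e->double_calls++;
    e->running = false;
}

static void ev_start(struct event *e)
{
    if (e->running)
        e->double_calls++;
    e->running = true;
}

/* 'disable' says whether this helper must stop/start the event itself. */
static void adjust_period(struct event *e, bool big_change, bool disable)
{
    if (big_change) {
        if (disable)
            ev_stop(e);
        /* ... the period would be reloaded here ... */
        if (disable)
            ev_start(e);
    }
}

/* Frequency adjustment path: the event is already stopped by the caller.
 * With 'fixed' set, the helper is told not to stop/start again. */
static void adjust_freq(struct event *e, bool big_change, bool fixed)
{
    ev_stop(e);
    adjust_period(e, big_change, !fixed);
    ev_start(e);
}
```

Calling adjust_period() directly with disable set models the overflow path,
where the event is still running and the back-to-back stop/start is needed.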

The patch reintroduces the assertion in x86_pmu_start() which
was removed by commit:

84f2b9b ("perf: Remove deprecated WARN_ON_ONCE()")

In this second version, we've added calls to disable/enable PMU
during unthrottling or frequency adjustment based on bug report
of spurious NMI interrupts from Eric Dumazet.

Reported-and-tested-by: Eric Dumazet
Signed-off-by: Stephane Eranian
Acked-by: Peter Zijlstra
Cc: markus@trippelsdorf.de
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/20120207133956.GA4932@quad
[ Minor edits to the changelog and to the code ]
Signed-off-by: Ingo Molnar

Stephane Eranian
2012-02-07 23:58:56 +0800
11a3122f6 block: strip out locking optimization in put_io_context()

put_io_context() performed a complex trylock dancing to avoid
deferring ioc release to workqueue. It was also broken on UP because
trylock was always assumed to succeed which resulted in unbalanced
preemption count.

While there are ways to fix the UP breakage, even the most
pathological microbench (forced ioc allocation and tight fork/exit
loop) fails to show any appreciable performance benefit of the
optimization. Strip it out. If there turns out to be workloads which
are affected by this change, simpler optimization from the discussion
thread can be applied later.

Signed-off-by: Tejun Heo
LKML-Reference:
Signed-off-by: Jens Axboe

Tejun Heo
2012-02-07 14:51:30 +0800

05 Feb, 2012

2 commits

23783f817 Merge tag 'pm-fixes-for-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Power management fixes for 3.3-rc3

Three power management regression fixes, one for a recent regression introduced
by the freezer changes during the 3.3 merge window and two for regressions
in cpuidle (resulting from PM QoS changes) and in the hibernate user space
interface, both introduced during the 3.2 development cycle.

They include:

* Two hibernate (s2disk) regression fixes from Srivatsa S. Bhat (for
regressions introduced during the 3.3 merge window and during the 3.2
development cycle).

* A cpuidle fix from Venki Pallipadi for a regression resulting from PM QoS
changes during the 3.2 development cycle causing cpuidle to work incorrectly
for CONFIG_PM unset.

* tag 'pm-fixes-for-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / QoS: CPU C-state breakage with PM Qos change
PM / Freezer: Thaw only kernel threads if freezing of kernel threads fails
PM / Hibernate: Thaw kernel threads in SNAPSHOT_CREATE_IMAGE ioctl path

Linus Torvalds
2012-02-05 07:21:39 +0800
379e0be81 PM / Freezer: Thaw only kernel threads if freezing of kernel threads fails

If freezing of kernel threads fails, we are expected to automatically
thaw tasks in the error recovery path. However, at times, we encounter
situations in which we would like the automatic error recovery path
to thaw only the kernel threads, because we want to be able to do
some more cleanup before we thaw userspace. Something like:

error = freeze_kernel_threads();
if (error) {
        /* Do some cleanup */

        /* Only then thaw userspace tasks */
        thaw_processes();
}

An example of such a situation is where we freeze/thaw filesystems
during suspend/hibernation. There, if freezing of kernel threads
fails, we would like to thaw the frozen filesystems before thawing
the userspace tasks.

So, modify freeze_kernel_threads() to thaw only kernel threads in
case of freezing failure. And change suspend_freeze_processes()
accordingly. (At the same time, let us also get rid of the rather
cryptic usage of the conditional operator (?:) in that function.)
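The intended ordering can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_system` and every `toy_*` function are hypothetical stand-ins, not the kernel's freezer API, and the "cleanup" step is reduced to unfreezing a pretend filesystem flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of system state; none of this is kernel API. */
struct toy_system {
    bool user_frozen;              /* userspace tasks are frozen */
    bool kthreads_frozen;          /* kernel threads are frozen */
    bool fs_frozen;                /* pretend filesystems are frozen */
    bool fail_kthreads;            /* force the kernel-thread freeze to fail */
    bool cleanup_saw_user_frozen;  /* was userspace still frozen at cleanup? */
};

static void toy_thaw_kernel_threads(struct toy_system *s)
{
    s->kthreads_frozen = false;
}

static void toy_thaw_processes(struct toy_system *s)
{
    s->kthreads_frozen = false;
    s->user_frozen = false;
}

/* On failure, thaw only the kernel threads and leave userspace frozen. */
static int toy_freeze_kernel_threads(struct toy_system *s)
{
    int error = s->fail_kthreads ? -EBUSY : 0;

    if (error)
        toy_thaw_kernel_threads(s);
    else
        s->kthreads_frozen = true;

    return error;
}

/* Caller written with a plain if statement rather than a conditional operator. */
static int toy_suspend_freeze_processes(struct toy_system *s)
{
    int error;

    s->user_frozen = true;  /* stand-in for freezing userspace */

    error = toy_freeze_kernel_threads(s);
    if (error) {
        /* Cleanup runs while userspace is still frozen. */
        s->cleanup_saw_user_frozen = s->user_frozen;
        s->fs_frozen = false;

        /* Only then thaw userspace tasks. */
        toy_thaw_processes(s);
    }

    return error;
}
```

With `fail_kthreads` set, the model returns `-EBUSY`, records that cleanup ran before userspace was thawed, and ends with everything unfrozen.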

[rjw: In fact, this patch fixes a regression introduced during the
3.3 merge window, because without it thaw_processes() may be called
before swsusp_free() in some situations and that may lead to massive
memory allocation failures.]

Signed-off-by: Srivatsa S. Bhat
Acked-by: Tejun Heo
Acked-by: Nigel Cunningham
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2012-02-05 05:23:05 +0800

04 Feb, 2012

1 commit

55ca6140e kprobes: fix a memory leak in function pre_handler_kretprobe()

In function pre_handler_kretprobe(), the allocated kretprobe_instance
object will get leaked if the entry_handler callback returns non-zero.
This may cause all the preallocated kretprobe_instance objects to be exhausted.

This issue can be reproduced by changing
samples/kprobes/kretprobe_example.c to probe "mutex_unlock". And the fix
is straightforward: just put the allocated kretprobe_instance object back
onto the free_instances list.
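A simplified userspace model of the leak and its fix might look like the following. It is a sketch, not the kernel implementation: `struct toy_instance`, `struct toy_kretprobe`, `TOY_MAXACTIVE` and the `toy_*` functions are hypothetical names, a singly linked list stands in for the real free_instances list, and locking is omitted entirely.

```c
#include <stddef.h>

#define TOY_MAXACTIVE 2  /* hypothetical size of the preallocated pool */

struct toy_instance {
    struct toy_instance *next;
};

struct toy_kretprobe {
    struct toy_instance pool[TOY_MAXACTIVE];
    struct toy_instance *free_instances;  /* head of the free list */
    int nmissed;                          /* probes skipped: pool was empty */
    int (*entry_handler)(struct toy_instance *ri);
};

static void toy_init(struct toy_kretprobe *rp,
                     int (*handler)(struct toy_instance *))
{
    size_t i;

    rp->free_instances = NULL;
    for (i = 0; i < TOY_MAXACTIVE; i++) {
        rp->pool[i].next = rp->free_instances;
        rp->free_instances = &rp->pool[i];
    }
    rp->nmissed = 0;
    rp->entry_handler = handler;
}

static size_t toy_free_count(const struct toy_kretprobe *rp)
{
    const struct toy_instance *ri;
    size_t n = 0;

    for (ri = rp->free_instances; ri; ri = ri->next)
        n++;
    return n;
}

/* Returns 1 if an instance was armed, 0 if the probe hit was skipped. */
static int toy_pre_handler(struct toy_kretprobe *rp)
{
    struct toy_instance *ri = rp->free_instances;

    if (!ri) {
        rp->nmissed++;
        return 0;
    }
    rp->free_instances = ri->next;  /* take an instance off the free list */

    if (rp->entry_handler && rp->entry_handler(ri)) {
        /* The fix: hand the instance back instead of dropping it. */
        ri->next = rp->free_instances;
        rp->free_instances = ri;
        return 0;
    }

    return 1;  /* instance stays in use until the probed function returns */
}

/* Entry handler that always declines, as in the reproduction scenario. */
static int toy_reject(struct toy_instance *ri)
{
    (void)ri;
    return 1;
}
```

With `toy_reject` installed, repeated calls to `toy_pre_handler()` leave the pool full and `nmissed` at zero. Without the hand-back step, the pool would drain after `TOY_MAXACTIVE` hits and every later hit would be counted as missed.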

[akpm@linux-foundation.org: use raw_spin_lock/unlock]
Signed-off-by: Jiang Liu
Acked-by: Jim Keniston
Acked-by: Ananth N Mavinakayanahalli
Cc: Masami Hiramatsu
Cc: Anil S Keshavamurthy
Cc: "David S. Miller"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jiang Liu
2012-02-04 08:16:41 +0800

03 Feb, 2012

1 commit

8cdb878dc Fix race in process_vm_rw_core

This fixes the race in process_vm_rw_core found by Oleg (see

http://article.gmane.org/gmane.linux.kernel/1235667/

for details).

This has been updated since I last sent it as the creation of the new
mm_access() function did almost exactly the same thing as parts of the
previous version of this patch did.

In order to use mm_access() even when /proc isn't enabled, we move it to
kernel/fork.c where other related process mm access functions already
are.

Signed-off-by: Chris Yeoh
Signed-off-by: Linus Torvalds

Christopher Yeoh
2012-02-03 04:55:17 +0800