16 Aug, 2018

1 commit

  • commit 3c53776e29f81719efcf8f7a6e30cdf753bee94d upstream.

    Way back in 4.9, we committed 4cd13c21b207 ("softirq: Let ksoftirqd do
    its job"), and ever since we've had small nagging issues with it. For
    example, we've had:

    1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
    8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
    217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")

    all of which worked around some of the effects of that commit.

    The DVB people have also complained that the commit causes excessive USB
    URB latencies, which seems to be due to the USB code using tasklets to
    schedule USB traffic. This seems to be an issue mainly when already
    living on the edge, but waiting for ksoftirqd to handle it really does
    seem to cause excessive latencies.

    Now Hanna Hawa reports that this issue isn't just limited to USB URB and
    DVB, but also causes timeout problems for the Marvell SoC team:

    "I'm facing kernel panic issue while running raid 5 on sata disks
    connected to Macchiatobin (Marvell community board with Armada-8040
    SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine
    and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2)
    uses a tasklet to clean the done descriptors from the queue"

    The latency problem causes a panic:

    mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
    Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction

    We've discussed simply just reverting the original commit entirely, and
    also much more involved solutions (with per-softirq threads etc). This
    patch is intentionally stupid and fairly limited, because the issue
    still remains, and the other solutions either got sidetracked or had
    other issues.

    We should probably also consider the timer softirqs to be synchronous
    and not be delayed to ksoftirqd (since they were the issue with the
    earlier watchdog problems), but that should be done as a separate patch.
    This does only the tasklet cases.

    Reported-and-tested-by: Hanna Hawa
    Reported-and-tested-by: Josef Griebichler
    Reported-by: Mauro Carvalho Chehab
    Cc: Alan Stern
    Cc: Greg Kroah-Hartman
    Cc: Eric Dumazet
    Cc: Ingo Molnar
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
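
    A minimal sketch of the approach described above (helper and mask names
    as in the mainline patch; surrounding code trimmed): softirqs in a
    "handle now" mask are never deferred to ksoftirqd, which keeps tasklet
    and HI-tasklet processing synchronous.

    #define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))

    static bool ksoftirqd_running(unsigned long pending)
    {
        struct task_struct *tsk = __this_cpu_read(ksoftirqd);

        /* HI and TASKLET softirqs stay synchronous: never defer them to
         * ksoftirqd, even if it is already runnable on this CPU */
        if (pending & SOFTIRQ_NOW_MASK)
            return false;
        return tsk && (tsk->state == TASK_RUNNING);
    }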
     

09 Aug, 2018

1 commit

  • commit 0a0e0829f990120cef165bbb804237f400953ec2 upstream.

    The full nohz tick is reprogrammed in irq_exit() only if the exit is not in
    a nesting interrupt. This stands as an optimization: whether a hardirq or a
    softirq is interrupted, the tick is going to be reprogrammed when necessary
    at the end of the inner interrupt, with even potential new updates on the
    timer queue.

    When soft interrupts are interrupted, it's assumed that they are executing
    on the tail of an interrupt return. In that case tick_nohz_irq_exit() is
    called after softirq processing to take care of the tick reprogramming.

    But the assumption is wrong: softirqs can be processed inline as well, ie:
    outside of an interrupt, like in a call to local_bh_enable() or from
    ksoftirqd.

    Inline softirqs don't reprogram the tick once they are done, as opposed to
    interrupt tail softirq processing. So if a tick interrupts an inline
    softirq processing, the next timer will neither be reprogrammed from the
    interrupting tick's irq_exit() nor after the interrupted softirq
    processing. This situation may leave the tick unprogrammed while timers are
    armed.

    To fix this, simply keep reprogramming the tick even if a softirq has been
    interrupted. That can be optimized further, but for now correctness is more
    important.

    Note that new timers enqueued in nohz_full mode after a softirq gets
    interrupted will still be handled just fine through self-IPIs triggered by
    the timer code.

    Reported-by: Anna-Maria Gleixner
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Thomas Gleixner
    Tested-by: Anna-Maria Gleixner
    Cc: stable@vger.kernel.org # 4.14+
    Link: https://lkml.kernel.org/r/1533303094-15855-1-git-send-email-frederic@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Frederic Weisbecker
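
    The essence of the fix, sketched (the check lives in tick_irq_exit() in
    mainline; context trimmed): only skip the tick reprogramming when we
    return to another hardirq, not when we merely interrupted inline softirq
    processing.

    static inline void tick_irq_exit(void)
    {
        int cpu = smp_processor_id();

        if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu)) {
            /* was "!in_interrupt()", which also skipped the case of an
             * interrupted inline softirq and left the tick unprogrammed */
            if (!in_irq())
                tick_nohz_irq_exit();
        }
    }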
     

22 Feb, 2018

1 commit

  • commit 4675ff05de2d76d167336b368bd07f3fef6ed5a6 upstream.

    Fix up makefiles, remove references, and git rm kmemcheck.

    Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin
    Cc: Steven Rostedt
    Cc: Vegard Nossum
    Cc: Pekka Enberg
    Cc: Michal Hocko
    Cc: Eric W. Biederman
    Cc: Alexander Potapenko
    Cc: Tim Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Levin, Alexander (Sasha Levin)
     

11 Apr, 2017

1 commit

  • It is not safe for one thread to modify the ->flags
    of another thread as there is no locking that can protect
    the update.

    So tsk_restore_flags(), which takes a task pointer and modifies
    the flags, is an invitation to do the wrong thing.

    All current users pass "current" as the task, so no developers have
    accepted that invitation. It would be best to ensure it remains
    that way.

    So rename tsk_restore_flags() to current_restore_flags() and don't
    pass in a task_struct pointer. Always operate on current->flags.

    Signed-off-by: NeilBrown
    Cc: Linus Torvalds
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    NeilBrown
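
    A sketch of the renamed helper and its typical save/restore pattern
    (the exact macro form in sched.h may differ slightly):

    /* operates on current->flags only; no cross-thread update possible */
    #define current_restore_flags(orig_flags, flags)        \
        do {                                                 \
            current->flags &= ~(flags);                      \
            current->flags |= (orig_flags) & (flags);        \
        } while (0)

    /* usage: temporarily set PF_MEMALLOC, then restore its previous value */
    unsigned long pflags = current->flags;

    current->flags |= PF_MEMALLOC;
    /* ... memory-allocating work ... */
    current_restore_flags(pflags, PF_MEMALLOC);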
     

16 Oct, 2016

1 commit

  • Pull gcc plugins update from Kees Cook:
    "This adds a new gcc plugin named "latent_entropy". It is designed to
    extract as much uncertainty as possible from a running system at boot
    time, hoping to capitalize on any possible variation in
    CPU operation (due to runtime data differences, hardware differences,
    SMP ordering, thermal timing variation, cache behavior, etc).

    At the very least, this plugin is a much more comprehensive example
    for how to manipulate kernel code using the gcc plugin internals"

    * tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    latent_entropy: Mark functions with __latent_entropy
    gcc-plugins: Add latent_entropy plugin

    Linus Torvalds
     

11 Oct, 2016

1 commit

  • The __latent_entropy gcc attribute can be used only on functions and
    variables. If it is on a function then the plugin will instrument it for
    gathering control-flow entropy. If the attribute is on a variable then
    the plugin will initialize it with random contents. The variable must
    be an integer, an integer array type or a structure with integer fields.

    These specific functions have been selected because they are init
    functions (to help gather boot-time entropy), are called at unpredictable
    times, or they have variable loops, each of which provides some level of
    latent entropy.

    Signed-off-by: Emese Revfy
    [kees: expanded commit message]
    Signed-off-by: Kees Cook

    Emese Revfy
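
    Roughly how the attribute is used (the macro expands to the plugin's
    attribute only when the plugin is enabled; the function and variable
    names below are illustrative):

    #ifdef LATENT_ENTROPY_PLUGIN
    #define __latent_entropy __attribute__((latent_entropy))
    #else
    #define __latent_entropy
    #endif

    /* on a function: the plugin instruments it to mix control-flow entropy */
    static void __init __latent_entropy setup_example_caches(void)
    {
        /* ... boot-time init work ... */
    }

    /* on an integer variable: the plugin initializes it with random contents */
    static unsigned long example_seed __latent_entropy;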
     

04 Oct, 2016

1 commit

  • Pull CPU hotplug updates from Thomas Gleixner:
    "Yet another batch of cpu hotplug core updates and conversions:

    - Provide core infrastructure for multi instance drivers so the
    drivers do not have to keep custom lists.

    - Convert custom lists to the new infrastructure. The block-mq custom
    list conversion comes through the block tree and makes the diffstat
    tip over to more lines removed than added.

    - Handle unbalanced hotplug enable/disable calls more gracefully.

    - Remove the obsolete CPU_STARTING/DYING notifier support.

    - Convert another batch of notifier users.

    The relayfs changes which conflicted with the conversion have been
    shipped to me by Andrew.

    The remaining lot is targeted for 4.10 so that we finally can remove
    the rest of the notifiers"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
    cpufreq: Fix up conversion to hotplug state machine
    blk/mq: Reserve hotplug states for block multiqueue
    x86/apic/uv: Convert to hotplug state machine
    s390/mm/pfault: Convert to hotplug state machine
    mips/loongson/smp: Convert to hotplug state machine
    mips/octeon/smp: Convert to hotplug state machine
    fault-injection/cpu: Convert to hotplug state machine
    padata: Convert to hotplug state machine
    cpufreq: Convert to hotplug state machine
    ACPI/processor: Convert to hotplug state machine
    virtio scsi: Convert to hotplug state machine
    oprofile/timer: Convert to hotplug state machine
    block/softirq: Convert to hotplug state machine
    lib/irq_poll: Convert to hotplug state machine
    x86/microcode: Convert to hotplug state machine
    sh/SH-X3 SMP: Convert to hotplug state machine
    ia64/mca: Convert to hotplug state machine
    ARM/OMAP/wakeupgen: Convert to hotplug state machine
    ARM/shmobile: Convert to hotplug state machine
    arm64/FP/SIMD: Convert to hotplug state machine
    ...

    Linus Torvalds
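
    For reference, a hedged sketch of what a typical notifier-to-state-machine
    conversion in this series looks like (driver and callback names are
    placeholders):

    static enum cpuhp_state my_hp_state;

    static int my_cpu_online(unsigned int cpu)
    {
        /* set up this driver's per-CPU resources */
        return 0;
    }

    static int my_cpu_offline(unsigned int cpu)
    {
        /* tear them down again */
        return 0;
    }

    static int __init my_driver_init(void)
    {
        int ret;

        /* dynamic state: replaces register_cpu_notifier() plus a switch
         * over CPU_ONLINE/CPU_DEAD events */
        ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mydrv:online",
                                my_cpu_online, my_cpu_offline);
        if (ret < 0)
            return ret;
        my_hp_state = ret;
        return 0;
    }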
     

30 Sep, 2016

1 commit

  • A while back, Paolo and Hannes sent an RFC patch adding threaded-able
    napi poll loop support : (https://patchwork.ozlabs.org/patch/620657/)

    The problem seems to be that softirqs are very aggressive and are often
    handled by the current process, even if we are under stress and
    ksoftirqd was scheduled, so that innocent threads would have more chance
    to make progress.

    This patch makes sure that if ksoftirq is running, we let it
    perform the softirq work.

    Jonathan Corbet summarized the issue in https://lwn.net/Articles/687617/

    Tested:

    - NIC receiving traffic handled by CPU 0
    - UDP receiver running on CPU 0, using a single UDP socket.
    - Incoming flood of UDP packets targeting the UDP socket.

    Before the patch, the UDP receiver could almost never get CPU cycles and
    could only receive ~2,000 packets per second.

    After the patch, CPU cycles are split 50/50 between user application and
    ksoftirqd/0, and we can effectively read ~900,000 packets per second,
    a huge improvement in this DoS situation. (Note that more packets are now
    dropped by the NIC itself, since the BH handlers get fewer CPU cycles to
    drain the RX ring buffer.)

    Since the load runs in well identified threads context, an admin can
    more easily tune process scheduling parameters if needed.

    Reported-by: Paolo Abeni
    Reported-by: Hannes Frederic Sowa
    Signed-off-by: Eric Dumazet
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: David Miller
    Cc: Hannes Frederic Sowa
    Cc: Jesper Dangaard Brouer
    Cc: Jonathan Corbet
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1472665349.14381.356.camel@edumazet-glaptop3.roam.corp.google.com
    Signed-off-by: Ingo Molnar

    Eric Dumazet
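
    The core of the change, sketched (helper name as in the mainline patch):
    if ksoftirqd is already runnable on this CPU, don't process softirqs
    inline, let ksoftirqd do the work.

    static bool ksoftirqd_running(void)
    {
        struct task_struct *tsk = __this_cpu_read(ksoftirqd);

        return tsk && (tsk->state == TASK_RUNNING);
    }

    asmlinkage __visible void do_softirq(void)
    {
        __u32 pending;
        unsigned long flags;

        if (in_interrupt())
            return;

        local_irq_save(flags);
        pending = local_softirq_pending();
        /* defer to ksoftirqd instead of stealing cycles from the current,
         * possibly innocent, task */
        if (pending && !ksoftirqd_running())
            do_softirq_own_stack();
        local_irq_restore(flags);
    }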
     

26 Mar, 2016

1 commit

  • KASAN needs to know whether the allocation happens in an IRQ handler.
    This lets us strip everything below the IRQ entry point to reduce the
    number of unique stack traces needed to be stored.

    Move the definition of __irq_entry to <linux/interrupt.h> so that the
    users don't need to pull in <linux/ftrace.h>. Also introduce the
    __softirq_entry macro which is similar to __irq_entry, but puts the
    corresponding functions into the .softirqentry.text section.

    Signed-off-by: Alexander Potapenko
    Acked-by: Steven Rostedt
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
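
    Sketch of the section attribute and how it is applied (modelled on the
    mainline definitions; the linker-script side is omitted):

    /* place the function in .softirqentry.text so that tools like KASAN can
     * recognize it as a softirq entry point and cut stack traces there */
    #define __softirq_entry \
        __attribute__((__section__(".softirqentry.text")))

    asmlinkage __visible void __softirq_entry __do_softirq(void)
    {
        /* ... softirq dispatch loop ... */
    }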
     

29 Feb, 2016

1 commit

    The preempt_disable() invokes preempt_count_add(), which saves the caller
    in ->preempt_disable_ip. It uses CALLER_ADDR1, which does not look for
    its caller but for the parent of the caller. This means we get the
    correct caller for something like spin_lock() unless the architecture
    inlines those invocations, but it is always wrong for preempt_disable()
    or local_bh_disable().

    This patch adds the function get_lock_parent_ip(), which tries
    CALLER_ADDR0, 1 and 2 in turn, moving past any address that is inside a
    locking function. This seems to record the preempt_disable() caller
    properly for preempt_disable() itself as well as for get_cpu_var() or
    local_bh_disable().

    Steven asked for the get_parent_ip() -> get_lock_parent_ip() rename.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160226135456.GB18244@linutronix.de
    Signed-off-by: Ingo Molnar

    Sebastian Andrzej Siewior
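
    A sketch of the helper, close to the mainline version: walk up one caller
    at a time while the return address still points into a locking function.

    static inline unsigned long get_lock_parent_ip(void)
    {
        unsigned long addr = CALLER_ADDR0;

        if (!in_lock_functions(addr))
            return addr;
        addr = CALLER_ADDR1;
        if (!in_lock_functions(addr))
            return addr;
        return CALLER_ADDR2;
    }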
     

10 Feb, 2015

1 commit

  • Pull core locking updates from Ingo Molnar:
    "The main changes are:

    - mutex, completions and rtmutex micro-optimizations
    - lock debugging fix
    - various cleanups in the MCS and the futex code"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/rtmutex: Optimize setting task running after being blocked
    locking/rwsem: Use task->state helpers
    sched/completion: Add lock-free checking of the blocking case
    sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state
    locking/mutex: Explicitly mark task as running after wakeup
    futex: Fix argument handling in futex_lock_pi() calls
    doc: Fix misnamed FUTEX_CMP_REQUEUE_PI op constants
    locking/Documentation: Update code path
    softirq/preempt: Add missing current->preempt_disable_ip update
    locking/osq: No need for load/acquire when acquire-polling
    locking/mcs: Better differentiate between MCS variants
    locking/mutex: Introduce ww_mutex_set_context_slowpath()
    locking/mutex: Move MCS related comments to proper location
    locking/mutex: Checking the stamp is WW only

    Linus Torvalds
     

15 Jan, 2015

2 commits

  • Simplify run_ksoftirqd() by using the new cond_resched_rcu_qs() function
    that conditionally reschedules, but unconditionally supplies an RCU
    quiescent state. This commit is separate from the previous commit by
    Calvin Owens because Calvin's approach can be backported, while this
    commit cannot be. The reason that this commit cannot be backported is
    that cond_resched_rcu_qs() does not always provide the needed quiescent
    state in earlier kernels.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • While debugging an issue with excessive softirq usage, I encountered the
    following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
    infrastructure"):

    [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

    ...but despite this note, the patch still calls RCU with IRQs disabled.

    This seemingly innocuous change caused a significant regression in softirq
    CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
    introducing 0.01% packet loss, the softirq usage would jump to around 25%,
    spiking as high as 50%. Before the change, the usage would never exceed 5%.

    Moving the call to rcu_note_context_switch() after the cond_resched() call,
    as it was originally before the hotplug patch, completely eliminated this
    problem.

    Signed-off-by: Calvin Owens
    Cc: stable@vger.kernel.org
    Signed-off-by: Paul E. McKenney

    Calvin Owens
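
    Putting the two ksoftirqd changes above together, the loop ends up looking
    roughly like this (sketch): the RCU quiescent state is reported together
    with the reschedule, after softirq processing and with interrupts enabled.

    static void run_ksoftirqd(unsigned int cpu)
    {
        local_irq_disable();
        if (local_softirq_pending()) {
            __do_softirq();
            local_irq_enable();
            /* reschedule if needed and unconditionally note an RCU
             * quiescent state, with IRQs enabled */
            cond_resched_rcu_qs();
            return;
        }
        local_irq_enable();
    }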
     

14 Jan, 2015

1 commit

  • While debugging some "sleeping function called from invalid context" bug I
    realized that the debugging message "Preemption disabled at:" pointed to
    an incorrect function.

    In particular if the last function/action that disabled preemption was
    spin_lock_bh() then current->preempt_disable_ip won't be updated.

    The reason for this is that __local_bh_disable_ip() will increase
    preempt_count manually instead of calling preempt_count_add(), which
    would handle the update correctly.

    It looks like the manual handling was done to work around some lockdep issue.

    So add the missing update of current->preempt_disable_ip to
    __local_bh_disable_ip() as well.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Paul E. McKenney
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20150107090441.GC4365@osiris
    Signed-off-by: Ingo Molnar

    Heiko Carstens
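
    The fix, sketched (lockdep and tracing details trimmed): when
    __local_bh_disable_ip() performs the first preempt_count increment itself,
    record the caller exactly like preempt_count_add() would.

    void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
    {
        /* raw preempt_count update, bypassing preempt_count_add() */
        __preempt_count_add(cnt);

    #ifdef CONFIG_DEBUG_PREEMPT
        /* mirror preempt_count_add(): remember who disabled preemption so
         * that "Preemption disabled at:" points to the right place */
        if (preempt_count() == cnt)
            current->preempt_disable_ip = get_parent_ip(CALLER_ADDR1);
    #endif
        /* ... trace_preempt_off() etc. ... */
    }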
     

04 Nov, 2014

1 commit

  • The "cpu" argument to rcu_note_context_switch() is always the current
    CPU, so drop it. This in turn allows the "cpu" argument to
    rcu_preempt_note_context_switch() to be removed, which allows the sole
    use of "cpu" in both functions to be replaced with a this_cpu_ptr().
    Again, the anticipated cross-CPU uses of these functions have been
    replaced by NO_HZ_FULL.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Pranith Kumar

    Paul E. McKenney
     

15 Oct, 2014

1 commit

  • Pull percpu consistent-ops changes from Tejun Heo:
    "Way back, before the current percpu allocator was implemented, static
    and dynamic percpu memory areas were allocated and handled separately
    and had their own accessors. The distinction has been gone for many
    years now; however, the now duplicate two sets of accessors remained
    with the pointer based ones - this_cpu_*() - evolving various other
    operations over time. During the process, we also accumulated other
    inconsistent operations.

    This pull request contains Christoph's patches to clean up the
    duplicate accessor situation. __get_cpu_var() uses are replaced with
    this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

    Unfortunately, the former sometimes is tricky thanks to C being a bit
    messy with the distinction between lvalues and pointers, which led to
    a rather ugly solution for cpumask_var_t involving the introduction of
    this_cpu_cpumask_var_ptr().

    This converts most of the uses but not all. Christoph will follow up
    with the remaining conversions in this merge window and hopefully
    remove the obsolete accessors"

    * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
    irqchip: Properly fetch the per cpu offset
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
    ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
    Revert "powerpc: Replace __get_cpu_var uses"
    percpu: Remove __this_cpu_ptr
    clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
    sparc: Replace __get_cpu_var uses
    avr32: Replace __get_cpu_var with __this_cpu_write
    blackfin: Replace __get_cpu_var uses
    tile: Use this_cpu_ptr() for hardware counters
    tile: Replace __get_cpu_var uses
    powerpc: Replace __get_cpu_var uses
    alpha: Replace __get_cpu_var
    ia64: Replace __get_cpu_var uses
    s390: cio driver &__get_cpu_var replacements
    s390: Replace __get_cpu_var uses
    mips: Replace __get_cpu_var uses
    MIPS: Replace __get_cpu_var uses in FPU emulator.
    arm: Replace __this_cpu_ptr with raw_cpu_ptr
    ...

    Linus Torvalds
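
    The conversion pattern in short (the per-CPU variable name below is
    illustrative):

    DEFINE_PER_CPU(struct my_stats, my_stats);

    struct my_stats *s;

    s = &__get_cpu_var(my_stats);   /* old lvalue-style accessor ...       */
    s = this_cpu_ptr(&my_stats);    /* ... becomes the pointer-based one   */

    s = __this_cpu_ptr(&my_stats);  /* old raw pointer accessor ...        */
    s = raw_cpu_ptr(&my_stats);     /* ... becomes raw_cpu_ptr()           */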
     

08 Sep, 2014

1 commit

  • The rcu_bh_qs(), rcu_preempt_qs(), and rcu_sched_qs() functions use
    old-style per-CPU variable access and write to ->passed_quiesce even
    if it is already set. This commit therefore updates to use the new-style
    per-CPU variable access functions and avoids the spurious writes.
    This commit also eliminates the "cpu" argument to these functions because
    they are always invoked on the indicated CPU.

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

27 Aug, 2014

1 commit

    Convert uses of __get_cpu_var for creating an address from a percpu
    offset to this_cpu_ptr.

    The two cases where get_cpu_var is used to actually access a percpu
    variable are changed to use this_cpu_read/raw_cpu_read.

    Reviewed-by: Thomas Gleixner
    Signed-off-by: Christoph Lameter
    Signed-off-by: Tejun Heo

    Christoph Lameter
     

22 May, 2014

1 commit

  • …/linux-rcu into core/rcu

    Pull RCU updates from Paul E. McKenney:

    " 1. Update RCU documentation. These were posted to LKML at
    https://lkml.org/lkml/2014/4/28/634.

    2. Miscellaneous fixes. These were posted to LKML at
    https://lkml.org/lkml/2014/4/28/645.

    3. Torture-test changes. These were posted to LKML at
    https://lkml.org/lkml/2014/4/28/667.

    4. Variable-name renaming cleanup, sent separately due to conflicts.
    This was posted to LKML at https://lkml.org/lkml/2014/5/13/854.

    5. Patch to suppress RCU stall warnings while sysrq requests are
    being processed. This patch is the RCU portions of the patch
    that Rik posted to LKML at https://lkml.org/lkml/2014/4/29/457.
    The reason for pushing this patch ahead instead of waiting until
    3.17 is that the NMI-based stack traces are messing up sysrq
    output, and in some cases also messing up the system as well."

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

29 Apr, 2014

1 commit

  • Calling rcu_bh_qs() after every softirq action is not really needed.
    What RCU needs is at least one rcu_bh_qs() per softirq round to note a
    quiescent state was passed for rcu_bh.

    Note for Paul and myself : this could be inlined as a single instruction
    and avoid smp_processor_id()
    (one this_cpu_write(rcu_bh_data.passed_quiesce, 1))

    Signed-off-by: Eric Dumazet
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Eric Dumazet
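
    Sketch of where the call ends up in __do_softirq() (restart policy and
    per-vector bookkeeping trimmed): one rcu_bh_qs() per round over the
    pending mask instead of one per action.

    restart:
        while ((softirq_bit = ffs(pending))) {
            /* ... h->action(h); no rcu_bh_qs() here anymore ... */
            pending >>= softirq_bit;
        }

        rcu_bh_qs();            /* one quiescent state per softirq round */

        local_irq_disable();
        pending = local_softirq_pending();
        if (pending && --max_restart)
            goto restart;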
     

28 Apr, 2014

1 commit

  • On x86 the allocation of irq descriptors may allocate interrupts which
    are in the range of the GSI interrupts. That's wrong as those
    interrupts are hardwired and we don't have the irq domain translation
    like PPC. So one of these interrupts can be hooked up later to one of
    the devices which are hard wired to it and the io_apic init code for
    that particular interrupt line happily reuses that descriptor with a
    completely different configuration so hell breaks loose.

    Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs,
    except for a few usage sites which have not yet blown up in our face
    for whatever reason. But for drivers which need an irq range, like the
    GPIO drivers, we have no limit in place and we don't want to expose
    such a detail to a driver.

    To cure this introduce a function which an architecture can implement
    to impose a lower bound on the dynamic interrupt allocations.

    Implement it for x86 and set the lower bound to nr_gsi_irqs, which is
    the end of the hardwired interrupt space, so all dynamic allocations
    happen above.

    That not only allows the GPIO driver to work sanely, it also protects
    the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and
    htirq code. They need to be cleaned up as well, but that's a separate
    issue.

    Reported-by: Jin Yao
    Signed-off-by: Thomas Gleixner
    Tested-by: Mika Westerberg
    Cc: Mathias Nyman
    Cc: Linus Torvalds
    Cc: Grant Likely
    Cc: H. Peter Anvin
    Cc: Rafael J. Wysocki
    Cc: Andy Shevchenko
    Cc: Krogerus Heikki
    Cc: Linus Walleij
    Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
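
    Sketch of the hook: a weak default that imposes no bound, consulted by
    the irq descriptor allocator (allocator shown in simplified form; the x86
    override then returns the first number above the hardwired GSI range).

    /* default: no extra restriction on dynamic irq allocation */
    unsigned int __weak arch_dynirq_lower_bound(unsigned int from)
    {
        return from;
    }

    /* in the descriptor allocator (simplified; the real function takes
     * more parameters) */
    int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node)
    {
        int start;

        /* let the architecture push the search above its hardwired range */
        from = arch_dynirq_lower_bound(from);

        mutex_lock(&sparse_irq_lock);
        start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
                                           from, cnt, 0);
        /* ... reserve and init descriptors starting at 'start' ... */
        mutex_unlock(&sparse_irq_lock);
        return start;
    }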
     

21 Jan, 2014

2 commits

  • Pull timer changes from Ingo Molnar:
    - ARM clocksource/clockevent improvements and fixes
    - generic timekeeping updates: TAI fixes/improvements, cleanups
    - Posix cpu timer cleanups and improvements
    - dynticks updates: full dynticks bugfixes, optimizations and cleanups

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
    clocksource: Timer-sun5i: Switch to sched_clock_register()
    timekeeping: Remove comment that's mostly out of date
    rtc-cmos: Add an alarm disable quirk
    timekeeper: fix comment typo for tk_setup_internals()
    timekeeping: Fix missing timekeeping_update in suspend path
    timekeeping: Fix CLOCK_TAI timer/nanosleep delays
    tick/timekeeping: Call update_wall_time outside the jiffies lock
    timekeeping: Avoid possible deadlock from clock_was_set_delayed
    timekeeping: Fix potential lost pv notification of time change
    timekeeping: Fix lost updates to tai adjustment
    clocksource: sh_cmt: Add clk_prepare/unprepare support
    clocksource: bcm_kona_timer: Remove unused bcm_timer_ids
    clocksource: vt8500: Remove deprecated IRQF_DISABLED
    clocksource: tegra: Remove deprecated IRQF_DISABLED
    clocksource: misc drivers: Remove deprecated IRQF_DISABLED
    clocksource: sh_mtu2: Remove unnecessary platform_set_drvdata()
    clocksource: sh_tmu: Remove unnecessary platform_set_drvdata()
    clocksource: armada-370-xp: Enable timer divider only when needed
    clocksource: clksrc-of: Warn if no clock sources are found
    clocksource: orion: Switch to sched_clock_register()
    ...

    Linus Torvalds
     
  • Pull scheduler changes from Ingo Molnar:

    - Add the initial implementation of SCHED_DEADLINE support: a real-time
    scheduling policy where tasks that meet their deadlines and
    periodically execute their instances in less than their runtime quota
    see real-time scheduling and won't miss any of their deadlines.
    Tasks that go over their quota get delayed (Available to privileged
    users for now)

    - Clean up and fix preempt_enable_no_resched() abuse all around the
    tree

    - Do sched_clock() performance optimizations on x86 and elsewhere

    - Fix and improve auto-NUMA balancing

    - Fix and clean up the idle loop

    - Apply various cleanups and fixes

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
    sched: Fix __sched_setscheduler() nice test
    sched: Move SCHED_RESET_ON_FORK into attr::sched_flags
    sched: Fix up attr::sched_priority warning
    sched: Fix up scheduler syscall LTP fails
    sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls
    sched/core: Fix htmldocs warnings
    sched/deadline: No need to check p if dl_se is valid
    sched/deadline: Remove unused variables
    sched/deadline: Fix sparse static warnings
    m68k: Fix build warning in mac_via.h
    sched, thermal: Clean up preempt_enable_no_resched() abuse
    sched, net: Fixup busy_loop_us_clock()
    sched, net: Clean up preempt_enable_no_resched() abuse
    sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
    sched/preempt, locking: Rework local_bh_{dis,en}able()
    sched/clock, x86: Avoid a runtime condition in native_sched_clock()
    sched/clock: Fix up clear_sched_clock_stable()
    sched/clock, x86: Use a static_key for sched_clock_stable
    sched/clock: Remove local_irq_disable() from the clocks
    sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs
    ...

    Linus Torvalds
     

16 Jan, 2014

1 commit

    This makes the code more symmetric against the existing tick functions
    called on irq exit: tick_irq_exit() and tick_nohz_irq_exit().

    These functions are also symmetric as they mirror each other's action:
    we start to account idle time on irq exit and we stop this accounting
    on irq entry. Also the tick is stopped on irq exit and timekeeping
    catches up with the tickless time elapsed until we reach irq entry.

    This rename was suggested by Peter Zijlstra a long while ago but it
    got forgotten in the mass.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Alex Shi
    Cc: Steven Rostedt
    Cc: Paul E. McKenney
    Cc: John Stultz
    Cc: Kevin Hilman
    Link: http://lkml.kernel.org/r/1387320692-28460-2-git-send-email-fweisbec@gmail.com
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
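
    The resulting symmetry, sketched (assuming the renamed helper is
    tick_irq_enter(), the counterpart of tick_irq_exit()):

    void irq_enter(void)
    {
        rcu_irq_enter();
        if (is_idle_task(current) && !in_interrupt()) {
            /* stop accounting idle time and let timekeeping catch up
             * with the tickless period that just ended */
            local_bh_disable();
            tick_irq_enter();
            _local_bh_enable();
        }
        __irq_enter();
    }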
     

14 Jan, 2014

1 commit

  • Currently local_bh_disable() is out-of-line for no apparent reason.
    So inline it to save a few cycles on call/return nonsense; the
    function body is a single add on x86 (a few loads and stores extra on
    load/store archs).

    Also expose two new local_bh functions:

    __local_bh_{dis,en}able_ip(unsigned long ip, unsigned int cnt);

    Which implement the actual local_bh_{dis,en}able() behaviour.

    The next patch uses the exposed @cnt argument to optimize bh lock
    functions.

    With build fixes from Jacob Pan.

    Cc: rjw@rjwysocki.net
    Cc: rui.zhang@intel.com
    Cc: jacob.jun.pan@linux.intel.com
    Cc: Mike Galbraith
    Cc: hpa@zytor.com
    Cc: Arjan van de Ven
    Cc: lenb@kernel.org
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
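
    Sketch of the now-inline helpers (matching the description above;
    SOFTIRQ_DISABLE_OFFSET is the existing softirq-disable increment):

    static __always_inline void __local_bh_disable_ip(unsigned long ip,
                                                      unsigned int cnt)
    {
        preempt_count_add(cnt);
        barrier();
    }

    static inline void local_bh_disable(void)
    {
        __local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
    }

    static inline void local_bh_enable(void)
    {
        __local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
    }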
     

13 Jan, 2014

1 commit

  • Currently all _bh_ lock functions do two preempt_count operations:

    local_bh_disable();
    preempt_disable();

    and for the unlock:

    preempt_enable_no_resched();
    local_bh_enable();

    Since it's a waste of perfectly good cycles to modify the same variable
    twice when you can do it in one go; use the new
    __local_bh_{dis,en}able_ip() functions that allow us to provide a
    preempt_count value to add/sub.

    So define SOFTIRQ_LOCK_OFFSET as the offset a _bh_ lock needs to
    add/sub to be done in one go.

    As a bonus it gets rid of the preempt_enable_no_resched() usage.

    This reduces a 1000 loops of:

    spin_lock_bh(&bh_lock);
    spin_unlock_bh(&bh_lock);

    from 53596 cycles to 51995 cycles. I didn't do enough measurements to
    say for absolute sure that the result is significant, but the few
    runs I did for each suggest it is so.

    Reviewed-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra
    Cc: jacob.jun.pan@linux.intel.com
    Cc: Mike Galbraith
    Cc: hpa@zytor.com
    Cc: Arjan van de Ven
    Cc: lenb@kernel.org
    Cc: rjw@rjwysocki.net
    Cc: rui.zhang@intel.com
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
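
    Sketch of the resulting _bh lock fast path: a single preempt_count update
    carrying both the softirq-disable and the preempt-disable amounts (the
    exact definition of the offset may differ slightly).

    /* roughly: softirq-disable increment plus one preempt-disable increment */
    #define SOFTIRQ_LOCK_OFFSET (SOFTIRQ_DISABLE_OFFSET + PREEMPT_OFFSET)

    static inline void __raw_spin_lock_bh(raw_spinlock_t *lock)
    {
        __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
        spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
        LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
    }

    static inline void __raw_spin_unlock_bh(raw_spinlock_t *lock)
    {
        spin_release(&lock->dep_map, 1, _RET_IP_);
        do_raw_spin_unlock(lock);
        __local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
    }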
     

03 Dec, 2013

1 commit

  • A few functions use remote per CPU access APIs when they
    deal with local values.

    Just do the right conversion to improve performance, code
    readability and debug checks.

    While at it, let's extend some of these function names with *_this_cpu()
    suffix in order to display their purpose more clearly.

    Signed-off-by: Frederic Weisbecker
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Steven Rostedt

    Frederic Weisbecker
     

27 Nov, 2013

2 commits

    Instead of saving the hardirq state in a per-CPU variable, which requires
    an explicit call before the softirq handling and some complication,
    just save and restore the hardirq tracing state through function
    return values and parameters.

    It simplifies a bit the black magic that works around the fact that
    softirqs can be called from hardirqs while hardirqs can nest on softirqs,
    but those two cases have very different semantics and only the latter
    case assumes both states.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    Cc: Sebastian Andrzej Siewior
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Paul E. McKenney
    Link: http://lkml.kernel.org/r/1384906054-30676-1-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
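
    The shape of the helpers after the change, sketched: the hardirq tracing
    state travels through a return value and a parameter instead of a per-CPU
    variable.

    static inline bool lockdep_softirq_start(void)
    {
        bool in_hardirq = false;

        if (trace_hardirq_context(current)) {
            /* pretend we leave hardirq context while softirqs run */
            in_hardirq = true;
            trace_hardirq_exit();
        }

        lockdep_softirq_enter();
        return in_hardirq;
    }

    static inline void lockdep_softirq_end(bool in_hardirq)
    {
        lockdep_softirq_exit();
        if (in_hardirq)
            trace_hardirq_enter();
    }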
     
  • Prepare for dependent patch.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

20 Nov, 2013

1 commit

  • There was a reported deadlock on -rt which lockdep didn't report.

    It turns out that in irq_exit() we tell lockdep that the hardirq
    context ends and then do all kinds of locking afterwards.

    To fix it, move trace_hardirq_exit() to the very end of irq_exit(), this
    ensures all locking in tick_irq_exit() and rcu_irq_exit() are properly
    recorded as happening from hardirq context.

    This however leads to the 'fun' little problem of running softirqs
    while in hardirq context. To cure this make the softirq code a little
    more complex (in the CONFIG_TRACE_IRQFLAGS case).

    Due to stack swizzling arch dependent trickery we cannot pass an
    argument to __do_softirq() to tell it if it was done from hardirq
    context or not; so use a side-band argument.

    When we do __do_softirq() from hardirq context, 'atomically' flip to
    softirq context and back, so that no locking goes without being in
    either hard- or soft-irq context.

    I didn't find any new problems in mainline using this patch, but it
    did show the -rt problem.

    Reported-by: Sebastian Andrzej Siewior
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-dgwc5cdksbn0jk09vbmcc9sa@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
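
    The reordered exit path, sketched (arch-specific irq-disable handling
    trimmed): everything that can take locks now runs before lockdep is told
    that hardirq context has ended.

    void irq_exit(void)
    {
        local_irq_disable();
        account_irq_exit_time(current);
        preempt_count_sub(HARDIRQ_OFFSET);
        if (!in_interrupt() && local_softirq_pending())
            invoke_softirq();

        tick_irq_exit();
        rcu_irq_exit();
        trace_hardirq_exit();   /* must be last: lockdep still sees hardirq
                                 * context for everything above */
    }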
     

15 Nov, 2013

1 commit

  • This commit was incomplete in that code to remove items from the per-cpu
    lists was missing and never acquired a user in the 5 years it has been in
    the tree. We're going to implement what it seems to try to achieve in a
    simpler way, and this code is in the way of doing so.

    Signed-off-by: Christoph Hellwig
    Cc: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig