Eric Lee / smarc-fsl-linux-kernel

16 Oct, 2016

1 commit

9ffc66941 Merge tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux ... Browse Code »

Pull gcc plugins update from Kees Cook:
"This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot
time as possible, hoping to capitalize on any possible variation in
CPU operation (due to runtime data differences, hardware differences,
SMP ordering, thermal timing variation, cache behavior, etc).

At the very least, this plugin is a much more comprehensive example
for how to manipulate kernel code using the gcc plugin internals"

* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Mark functions with __latent_entropy
gcc-plugins: Add latent_entropy plugin

Linus Torvalds
2016-10-16 01:03:15 +0800

11 Oct, 2016

1 commit

0766f788e latent_entropy: Mark functions with __latent_entropy ... Browse Code »

The __latent_entropy gcc attribute can be used only on functions and
variables. If it is on a function then the plugin will instrument it for
gathering control-flow entropy. If the attribute is on a variable then
the plugin will initialize it with random contents. The variable must
be an integer, an integer array type or a structure with integer fields.

These specific functions have been selected because they are init
functions (to help gather boot-time entropy), are called at unpredictable
times, or they have variable loops, each of which provide some level of
latent entropy.

Signed-off-by: Emese Revfy
[kees: expanded commit message]
Signed-off-by: Kees Cook

Emese Revfy
2016-10-11 05:51:45 +0800

20 Sep, 2016

1 commit

9a659f43d block/softirq: Convert to hotplug state machine ... Browse Code »

Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior
Cc: Peter Zijlstra
Cc: Jens Axboe
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-9-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner

Sebastian Andrzej Siewior
2016-09-20 03:44:28 +0800

10 Apr, 2014

1 commit

360f92c24 block: fix regression with block enabled tagging ... Browse Code »

Martin reported that his test system would not boot with
current git, it oopsed with this:

BUG: unable to handle kernel paging request at ffff88046c6c9e80
IP: [] blk_queue_start_tag+0x90/0x150
PGD 1ddf067 PUD 1de2067 PMD 47fc7d067 PTE 800000046c6c9060
Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: sd_mod lpfc(+) scsi_transport_fc scsi_tgt oracleasm
rpcsec_gss_krb5 ipv6 igb dca i2c_algo_bit i2c_core hwmon
CPU: 3 PID: 87 Comm: kworker/u17:1 Not tainted 3.14.0+ #246
Hardware name: Supermicro X9DRX+-F/X9DRX+-F, BIOS 3.00 07/09/2013
Workqueue: events_unbound async_run_entry_fn
task: ffff8802743c2150 ti: ffff880273d02000 task.ti: ffff880273d02000
RIP: 0010:[] []
blk_queue_start_tag+0x90/0x150
RSP: 0018:ffff880273d03a58 EFLAGS: 00010092
RAX: ffff88046c6c9e78 RBX: ffff880077208e78 RCX: 00000000fffc8da6
RDX: 00000000fffc186d RSI: 0000000000000009 RDI: 00000000fffc8d9d
RBP: ffff880273d03a88 R08: 0000000000000001 R09: ffff8800021c2410
R10: 0000000000000005 R11: 0000000000015b30 R12: ffff88046c5bb8a0
R13: ffff88046c5c0890 R14: 000000000000001e R15: 000000000000001e
FS: 0000000000000000(0000) GS:ffff880277b00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88046c6c9e80 CR3: 00000000018f6000 CR4: 00000000000407e0
Stack:
ffff880273d03a98 ffff880474b18800 0000000000000000 ffff880474157000
ffff88046c5c0890 ffff880077208e78 ffff880273d03ae8 ffffffff813b9e62
ffff880200000010 ffff880474b18968 ffff880474b18848 ffff88046c5c0cd8
Call Trace:
[] scsi_request_fn+0xf2/0x510
[] __blk_run_queue+0x37/0x50
[] blk_execute_rq_nowait+0xb3/0x130
[] blk_execute_rq+0x64/0xf0
[] ? bit_waitqueue+0xd0/0xd0
[] scsi_execute+0xe5/0x180
[] scsi_execute_req_flags+0x9a/0x110
[] sd_spinup_disk+0x94/0x460 [sd_mod]
[] ? __unmap_hugepage_range+0x200/0x2f0
[] sd_revalidate_disk+0xaa/0x3f0 [sd_mod]
[] sd_probe_async+0xd8/0x200 [sd_mod]
[] async_run_entry_fn+0x3f/0x140
[] process_one_work+0x175/0x410
[] worker_thread+0x123/0x400
[] ? manage_workers+0x160/0x160
[] kthread+0xce/0xf0
[] ? kthread_freezable_should_stop+0x70/0x70
[] ret_from_fork+0x7c/0xb0
[] ? kthread_freezable_should_stop+0x70/0x70
Code: 48 0f ab 11 72 db 48 81 4b 40 00 00 10 00 89 83 08 01 00 00 48 89
df 49 8b 04 24 48 89 1c d0 e8 f7 a8 ff ff 49 8b 85 28 05 00 00 89
58 08 48 89 03 49 8d 85 28 05 00 00 48 89 43 08 49 89 9d
RIP [] blk_queue_start_tag+0x90/0x150
RSP
CR2: ffff88046c6c9e80

Martin bisected and found this to be the problem patch;

commit 6d113398dcf4dfcd9787a4ead738b186f7b7ff0f
Author: Jan Kara
Date: Mon Feb 24 16:39:54 2014 +0100

block: Stop abusing rq->csd.list in blk-softirq

and the problem was immediately apparent. The patch states that
it is safe to reuse queuelist at completion time, since it is
no longer used. However, that is not true if a device is using
block enabled tagging. If that is the case, then the queuelist
is reused to keep track of busy tags. If a device also ended
up using softirq completions, we'd reuse ->queuelist for the
IPI handling while block tagging was still using it. Boom.

Fix this by adding a new ipi_list list head, and share the
memory used with the request hash table. The hash table is
never used after the request is moved to the dispatch list,
which happens long before any potential completion of the
request. Add a new request bit for this, so we don't have
cases that check rq->hash while it could potentially have
been reused for the IPI completion.

Reported-by: Martin K. Petersen
Tested-by: Benjamin Herrenschmidt
Signed-off-by: Jens Axboe

Jens Axboe
2014-04-10 11:54:06 +0800

25 Feb, 2014

3 commits

c46fff2a3 smp: Rename __smp_call_function_single() to smp_call_function_single_async() ... Browse Code »

The name __smp_call_function_single() doesn't tell much about the
properties of this function, especially when compared to
smp_call_function_single().

The comments above the implementation are also misleading. The main
point of this function is actually not to be able to embed the csd
in an object. This is actually a requirement that result from the
purpose of this function which is to raise an IPI asynchronously.

As such it can be called with interrupts disabled. And this feature
comes at the cost of the caller who then needs to serialize the
IPIs on this csd.

Lets rename the function and enhance the comments so that they reflect
these properties.

Suggested-by: Christoph Hellwig
Cc: Andrew Morton
Cc: Christoph Hellwig
Cc: Ingo Molnar
Cc: Jan Kara
Cc: Jens Axboe
Signed-off-by: Frederic Weisbecker
Signed-off-by: Jens Axboe

Frederic Weisbecker
2014-02-25 06:47:15 +0800
fce8ad156 smp: Remove wait argument from __smp_call_function_single() ... Browse Code »

The main point of calling __smp_call_function_single() is to send
an IPI in a pure asynchronous way. By embedding a csd in an object,
a caller can send the IPI without waiting for a previous one to complete
as is required by smp_call_function_single() for example. As such,
sending this kind of IPI can be safe even when irqs are disabled.

This flexibility comes at the expense of the caller who then needs to
synchronize the csd lifecycle by himself and make sure that IPIs on a
single csd are serialized.

This is how __smp_call_function_single() works when wait = 0 and this
usecase is relevant.

Now there don't seem to be any usecase with wait = 1 that can't be
covered by smp_call_function_single() instead, which is safer. Lets look
at the two possible scenario:

1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
in an object. It looks like a nice and convenient pattern at the first
sight because we can then retrieve the object from the IPI handler easily.

But actually it is a waste of memory space in the object since the csd
can be allocated from the stack by smp_call_function_single(wait = 1)
and the object can be passed an the IPI argument.

Besides that, embedding the csd in an object is more error prone
because the caller must take care of the serialization of the IPIs
for this csd.

2) The user calls __smp_call_function_single(wait = 1) on a csd that
is allocated on the stack. It's ok but smp_call_function_single()
can do it as well and it already takes care of the allocation on the
stack. Again it's more simple and less error prone.

Therefore, using the underscore prepend API version with wait = 1
is a bad pattern and a sign that the caller can do safer and more
simple.

There was a single user of that which has just been converted.
So lets remove this option to discourage further users.

Cc: Andrew Morton
Cc: Christoph Hellwig
Cc: Ingo Molnar
Cc: Jan Kara
Cc: Jens Axboe
Signed-off-by: Frederic Weisbecker
Signed-off-by: Jens Axboe

Frederic Weisbecker
2014-02-25 06:47:09 +0800
6d113398d block: Stop abusing rq->csd.list in blk-softirq ... Browse Code »

Abusing rq->csd.list for a list of requests to complete is rather ugly.
We use rq->queuelist instead which is much cleaner. It is safe because
queuelist is used by the block layer only for requests waiting to be
submitted to a device. Thus it is unused when irq reports the request IO
is finished.

Signed-off-by: Jan Kara
Cc: Andrew Morton
Cc: Christoph Hellwig
Cc: Ingo Molnar
Cc: Jens Axboe
Signed-off-by: Frederic Weisbecker
Signed-off-by: Jens Axboe

Jan Kara
2014-02-25 06:46:42 +0800

15 Nov, 2013

1 commit

0a06ff068 kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS ... Browse Code »

We've switched over every architecture that supports SMP to it, so
remove the new useless config variable.

Signed-off-by: Christoph Hellwig
Cc: Jan Kara
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2013-11-15 08:32:22 +0800

08 Nov, 2013

1 commit

170d800af block: Replace __get_cpu_var uses ... Browse Code »

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x). This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area. __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset. Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e. using a global
register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);

Converts to

int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);

Converts to

int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu
variable.

DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y)

Converts to

int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);

Converts to

memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;

Converts to

this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++

Converts to

this_cpu_inc(y)

Signed-off-by: Christoph Lameter
Signed-off-by: Jens Axboe

Christoph Lameter
2013-11-08 23:59:58 +0800

15 Jul, 2013

1 commit

0b776b062 block: delete __cpuinit usage from all block files ... Browse Code »

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the drivers/block uses of the __cpuinit macros
from all C files.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Jens Axboe
Signed-off-by: Paul Gortmaker

Paul Gortmaker
2013-07-15 07:36:59 +0800

27 Jan, 2012

1 commit

39be35012 sched, block: Unify cache detection ... Browse Code »

The block layer has some code trying to determine if two CPUs share a
cache, the scheduler has a similar function. Expose the function used
by the scheduler and make the block layer use it, thereby removing the
block layers usage of CONFIG_SCHED* and topology bits.

Signed-off-by: Peter Zijlstra
Acked-by: Jens Axboe
Link: http://lkml.kernel.org/r/1327579450.2446.95.camel@twins

Peter Zijlstra
2012-01-27 20:28:48 +0800

14 Sep, 2011

1 commit

8ad6a56f5 block: Don't check QUEUE_FLAG_SAME_COMP in __blk_complete_request ... Browse Code »

In __blk_complete_request, we check both QUEUE_FLAG_SAME_COMP and req->cpu
to decide whether we should use req->cpu. Actually the user can also
select the complete cpu by either setting BIO_CPU_AFFINE or by calling
bio_set_completion_cpu. Current solution makes these 2 ways don't work
any more. So we'd better just check req->cpu.

Signed-off-by: Tao Ma
Signed-off-by: Jens Axboe

Tao Ma
2011-09-14 15:31:01 +0800

11 Aug, 2011

1 commit

bcf30e75b block: improve rq_affinity placement ... Browse Code »

This patch reverts commit 35ae66e0a09ab70ed(block: Make rq_affinity = 1
work as expected). The purpose is to avoid an unnecessary IPI.
Let's take an example. My test box has cpu 0-7, one socket. Say request is
added from CPU 1, blk_complete_request() occurs at CPU 7. Without the reverted
patch, softirq will be done at CPU 7. With it, an IPI will be directed to CPU
0, and softirq will be done at CPU 0. In this case, doing softirq at CPU 0 and
CPU 7 have no difference from cache sharing point view and we can avoid an
ipi if doing it in CPU 7.
An immediate concern is this is just like QUEUE_FLAG_SAME_FORCE, but actually
not. blk_complete_request() is running in interrupt handler, and currently
I/O controller doesn't support multiple interrupts (I checked several LSI
cards and AHCI), so only one CPU can run blk_complete_request(). This is
still quite different as QUEUE_FLAG_SAME_FORCE.
Since only one CPU runs softirq, the only difference with below patch is
softirq not always runs at the first CPU of a group.

Signed-off-by: Shaohua Li
Signed-off-by: Jens Axboe

Shaohua Li
2011-08-11 16:39:04 +0800

05 Aug, 2011

1 commit

35ae66e0a block: Make rq_affinity = 1 work as expected ... Browse Code »

Commit 5757a6d76c introduced a new rq_affinity = 2 so as to make
the request completed in the __make_request cpu. But it makes the
old rq_affinity = 1 not work any more. The root cause is that
if the 'cpu' and 'req->cpu' is in the same group and cpu != req->cpu,
ccpu will be the same as group_cpu, so the completion will be
excuted in the 'cpu' not 'group_cpu'.

This patch fix problem by simpling removing group_cpu and the codes
are more explicit now. If ccpu == cpu, we complete in cpu, otherwise
we raise_blk_irq to ccpu.

Cc: Christoph Hellwig
Cc: Roland Dreier
Cc: Dan Williams
Cc: Jens Axboe
Signed-off-by: Tao Ma
Reviewed-by: Shaohua Li
Signed-off-by: Jens Axboe

Tao Ma
2011-08-05 15:37:47 +0800

24 Jul, 2011

1 commit

5757a6d76 block: strict rq_affinity ... Browse Code »

Some systems benefit from completions always being steered to the strict
requester cpu rather than the looser "per-socket" steering that
blk_cpu_to_group() attempts by default. This is because the first
CPU in the group mask ends up being completely overloaded with work,
while the others (including the original submitter) has power left
to spare.

Allow the strict mode to be set by writing '2' to the sysfs control
file. This is identical to the scheme used for the nomerges file,
where '2' is a more aggressive setting than just being turned on.

echo 2 > /sys/block//queue/rq_affinity

Cc: Christoph Hellwig
Cc: Roland Dreier
Tested-by: Dave Jiang
Signed-off-by: Dan Williams
Signed-off-by: Jens Axboe

Dan Williams
2011-07-24 02:44:25 +0800

25 Feb, 2009

1 commit

6e2756376 generic-ipi: remove CSD_FLAG_WAIT ... Browse Code »

Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
the code so that we can use CSD_FLAG_LOCK for both purposes.

Signed-off-by: Peter Zijlstra
Cc: Oleg Nesterov
Cc: Linus Torvalds
Cc: Nick Piggin
Cc: Jens Axboe
Cc: "Paul E. McKenney"
Cc: Rusty Russell
Signed-off-by: Ingo Molnar

Peter Zijlstra
2009-02-25 21:13:44 +0800

29 Dec, 2008

1 commit

3c18ce71a block: make blk_softirq_init() static ... Browse Code »

Sparse asked whether these could be static.

Signed-off-by: Roel Kluin
Signed-off-by: Jens Axboe

Roel Kluin
2008-12-29 15:29:51 +0800

09 Oct, 2008

4 commits

581d4e28d block: add fault injection mechanism for faking request timeouts ... Browse Code »

Only works for the generic request timer handling. Allows one to
sporadically ignore request completions, thus exercising the timeout
handling.

Signed-off-by: Jens Axboe

Jens Axboe
2008-10-09 14:56:17 +0800
242f9dcb8 block: unify request timeout handling ... Browse Code »

Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.

Signed-off-by: Mike Anderson
Signed-off-by: Jens Axboe

Jens Axboe
2008-10-09 14:56:13 +0800
c7c22e4d5 block: add support for IO CPU affinity ... Browse Code »

This patch adds support for controlling the IO completion CPU of
either all requests on a queue, or on a per-request basis. We export
a sysfs variable (rq_affinity) which, if set, migrates completions
of requests to the CPU that originally submitted it. A bio helper
(bio_set_completion_cpu()) is also added, so that queuers can ask
for completion on that specific CPU.

In testing, this has been show to cut the system time by as much
as 20-40% on synthetic workloads where CPU affinity is desired.

This requires a little help from the architecture, so it'll only
work as designed for archs that are using the new generic smp
helper infrastructure.

Signed-off-by: Jens Axboe

Jens Axboe
2008-10-09 14:56:09 +0800
b646fc59b block: split softirq handling into blk-softirq.c ... Browse Code »

Signed-off-by: Jens Axboe

Jens Axboe
2008-10-09 14:56:09 +0800