25 Apr, 2015

1 commit

  • Pull md updates from Neil Brown:
    "More updates that usual this time. A few have performance impacts
    which hould mostly be positive, but RAID5 (in particular) can be very
    work-load ensitive... We'll have to wait and see.

    Highlights:

    - "experimental" code for managing md/raid1 across a cluster using
    DLM. Code is not ready for general use and triggers a WARNING if
    used. However it is looking good and mostly done, and having it in
    mainline will help co-ordinate development.

    - RAID5/6 can now batch multiple (4K wide) stripe_heads so as to
    handle a full (chunk wide) stripe as a single unit.

    - RAID6 can now perform read-modify-write cycles which should help
    performance on larger arrays: 6 or more devices.

    - RAID5/6 stripe cache now grows and shrinks dynamically. The value
    set is used as a minimum.

    - Resync is now allowed to go a little faster than the 'minimum' when
    there is competing IO. How much faster depends on the speed of the
    devices, so the effective minimum should scale with device speed to
    some extent"

    * tag 'md/4.1' of git://neil.brown.name/md: (58 commits)
    md/raid5: don't do chunk aligned read on degraded array.
    md/raid5: allow the stripe_cache to grow and shrink.
    md/raid5: change ->inactive_blocked to a bit-flag.
    md/raid5: move max_nr_stripes management into grow_one_stripe and drop_one_stripe
    md/raid5: pass gfp_t arg to grow_one_stripe()
    md/raid5: introduce configuration option rmw_level
    md/raid5: activate raid6 rmw feature
    md/raid6 algorithms: xor_syndrome() for SSE2
    md/raid6 algorithms: xor_syndrome() for generic int
    md/raid6 algorithms: improve test program
    md/raid6 algorithms: delta syndrome functions
    raid5: handle expansion/resync case with stripe batching
    raid5: handle io error of batch list
    RAID5: batch adjacent full stripe write
    raid5: track overwrite disk count
    raid5: add a new flag to track if a stripe can be batched
    raid5: use flex_array for scribble data
    md raid0: access mddev->queue (request queue member) conditionally because it is not set when accessed from dm-raid
    md: allow resync to go faster when there is competing IO.
    md: remove 'go_faster' option from ->sync_request()
    ...

    Linus Torvalds
     

22 Apr, 2015

6 commits

  • Pull sparc fixes from David Miller:

    1) ldc_alloc_exp_dring() can be called from softints, so use
    GFP_ATOMIC. From Sowmini Varadhan.

    2) Some minor warning/build fixups for the new iommu-common code on
    certain archs and with certain debug options enabled. Also from
    Sowmini Varadhan.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context
    sparc64: Use M7 PMC write on all chips T4 and onward.
    iommu-common: rename iommu_pool_hash to iommu_hash_common
    iommu-common: fix x86_64 compiler warnings

    Linus Torvalds
     
  • The second (and last) optimized XOR syndrome calculation. This version
    supports right and left side optimization. All CPUs with architecture
    older than Haswell will benefit from it.

    It should be noted that SSE2 movntdq kills performance for memory areas
    that are read and written simultaneously in chunks smaller than cache
    line size. So use movdqa instead for P/Q writes in sse21 and sse22 XOR
    functions.
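
    For illustration, a minimal userspace sketch of the two store flavours
    using SSE2 intrinsics; the function names are made up, and the kernel's
    real code is hand-written assembler:

    #include <emmintrin.h>

    /* The same 16-byte P/Q store, done through the cache (movdqa) and
     * bypassing it (movntdq). The non-temporal form hurts when the
     * written area is re-read moments later in sub-cache-line chunks,
     * as happens in an rmw cycle. */
    static void write_pq_cached(__m128i *dst, __m128i v)
    {
            _mm_store_si128(dst, v);        /* movdqa */
    }

    static void write_pq_streaming(__m128i *dst, __m128i v)
    {
            _mm_stream_si128(dst, v);       /* movntdq */
    }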

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • Start the algorithms with the very basic one. It is left and right
    optimized. That means we can avoid all calculations for unneeded pages
    above the right stop offset. For pages below the left start offset we
    still need the syndrome multiplication but without reading data pages.
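
    For illustration, a hedged byte-at-a-time sketch of the idea (the
    kernel's generic int version does the same a machine word at a time;
    all names here are illustrative):

    #include <stddef.h>
    #include <stdint.h>

    /* GF(2^8) multiply by 2 with the RAID-6 generator polynomial 0x11d. */
    static uint8_t gf_mul2(uint8_t b)
    {
            return (uint8_t)((b << 1) ^ ((b & 0x80) ? 0x1d : 0));
    }

    /* ptr[0..ndisks-3] are data pages, followed by P and Q. Pages above
     * 'stop' are skipped entirely (right optimization); pages below
     * 'start' contribute only the Q multiplication, with no data reads
     * (left optimization). */
    static void xor_syndrome_sketch(int ndisks, int start, int stop,
                                    size_t bytes, uint8_t **ptr)
    {
            uint8_t *p = ptr[ndisks - 2], *q = ptr[ndisks - 1];
            size_t i;
            int d;

            for (i = 0; i < bytes; i++) {
                    uint8_t wp = ptr[stop][i], wq = wp;

                    for (d = stop - 1; d >= start; d--) {
                            wp ^= ptr[d][i];
                            wq = gf_mul2(wq) ^ ptr[d][i];
                    }
                    for (d = start - 1; d >= 0; d--)
                            wq = gf_mul2(wq);       /* no data read */

                    p[i] ^= wp;                     /* fold delta into P */
                    q[i] ^= wq;                     /* and into Q */
            }
    }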

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • It is always helpful to have a test tool in place when we implement
    new data-critical algorithms. So add some test routines to the raid6
    checker that can prove whether the new xor_syndrome() works as expected.

    Run through all permutations of start/stop pages per algorithm and
    simulate an xor_syndrome()-assisted rmw run. After each rmw, check
    whether the recovery algorithm still confirms that the stripe is fine.

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • v3: s-o-b comment, explanation of performance and decision for
    the start/stop implementation

    Implementing rmw functionality for RAID6 requires optimized syndrome
    calculation. Up to now we can only generate a complete syndrome. The
    target P/Q pages are always overwritten. With this patch we provide
    a framework for in-place P/Q modification. To start with, those
    function pointers are simply filled with NULL values.

    xor_syndrome() has two additional parameters: start & stop. These
    indicate the first and last page that change during an rmw run. That
    makes it possible to avoid several unnecessary loops and speed up
    the calculation. The caller needs to implement the following logic
    to make the functions work.

    1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source
    blocks inside P/Q between (and including) start and end.

    2) modify any block with start <= block <= stop

    3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of
    source blocks into P/Q between (and including) start and stop.
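
    For illustration, a hedged sketch of that calling sequence for a
    one-block rmw (start == stop == b); the helper is hypothetical, only
    xor_syndrome() itself is introduced by this patch:

    #include <string.h>

    extern void xor_syndrome(int disks, int start, int stop,
                             size_t bytes, void **ptrs);

    /* ptrs[] holds the data pages followed by P and Q. */
    static void rmw_one_block(int disks, int b, size_t bytes, void **ptrs,
                              const void *new_data)
    {
            xor_syndrome(disks, b, b, bytes, ptrs); /* 1) remove old data */
            memcpy(ptrs[b], new_data, bytes);       /* 2) modify the block */
            xor_syndrome(disks, b, b, bytes, ptrs); /* 3) reinsert into P/Q */
    }
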
    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • Pull char/misc driver updates from Greg KH:
    "Here's the big char/misc driver patchset for 4.1-rc1.

    Lots of different driver subsystem updates here, nothing major, full
    details are in the shortlog.

    All of this has been in linux-next for a while"

    * tag 'char-misc-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (133 commits)
    mei: trace: remove unused TRACE_SYSTEM_STRING
    DTS: ARM: OMAP3-N900: Add lis3lv02d support
    Documentation: DT: lis302: update wakeup binding
    lis3lv02d: DT: add wakeup unit 2 and wakeup threshold
    lis3lv02d: DT: use s32 to support negative values
    Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case
    Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case
    coresight-tmc: Adding a status interface to sysfs
    coresight: remove the unnecessary configuration coresight-default-sink
    ...

    Linus Torvalds
     

21 Apr, 2015

3 commits

  • When CONFIG_DEBUG_FORCE_WEAK_PER_CPU is set, the DEFINE_PER_CPU_SECTION
    macro will define an extern __pcpu_unique_##name variable that could
    conflict with the same definition in powerpc at this time. Avoid that
    conflict by renaming iommu_pool_hash in iommu-common.c.

    Thanks to Guenter Roeck for catching this, and helping to test the fix.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Guenter Roeck
    Reviewed-by: Guenter Roeck
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • Declare iommu_large_alloc as static. Remove extern definition for
    iommu_tbl_pool_init().

    Signed-off-by: Sowmini Varadhan
    Tested-by: Guenter Roeck
    Reviewed-by: Guenter Roeck
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
    "This is the final removal (after several years!) of the obsolete
    cpus_* functions, prompted by their mis-use in staging.

    With these functions removed, all cpu functions should only iterate to
    nr_cpu_ids, so we finally only allocate that many bits when cpumasks
    are allocated offstack"
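
    For illustration, a hedged before/after sketch of the conversion
    pattern (the commented-out calls are the by-value API removed here;
    the helper name is made up):

    #include <linux/cpumask.h>
    #include <linux/printk.h>

    static int count_and_walk(const struct cpumask *mask)
    {
            int cpu;
            /* old: n = cpus_weight(*mask); for_each_cpu_mask(cpu, *mask) */
            int n = cpumask_weight(mask);   /* new: pointer-based */

            for_each_cpu(cpu, mask)         /* iterates only to nr_cpu_ids */
                    pr_debug("cpu %d is set\n", cpu);
            return n;
    }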

    * tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
    cpumask: remove __first_cpu / __next_cpu
    cpumask: resurrect CPU_MASK_CPU0
    linux/cpumask.h: add typechecking to cpumask_test_cpu
    cpumask: only allocate nr_cpumask_bits.
    Fix weird uses of num_online_cpus().
    cpumask: remove deprecated functions.
    mips: fix obsolete cpumask_of_cpu usage.
    x86: fix more deprecated cpu function usage.
    ia64: remove deprecated cpus_ usage.
    powerpc: fix deprecated CPU_MASK_CPU0 usage.
    CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
    staging/lustre/o2iblnd: Don't use cpus_weight
    staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
    staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
    blackfin: fix up obsolete cpu function usage.
    parisc: fix up obsolete cpu function usage.
    tile: fix up obsolete cpu function usage.
    arm64: fix up obsolete cpu function usage.
    mips: fix up obsolete cpu function usage.
    x86: fix up obsolete cpu function usage.
    ...

    Linus Torvalds
     

20 Apr, 2015

1 commit

  • The test_data_1_le[] array is a const array of const char *. To avoid
    dropping any const information, we need to use "const char * const *",
    not just "const char **".

    I'm not sure why the different test arrays end up having different
    const'ness, but let's make the pointer we use to traverse them as const
    as possible, since we modify neither the array of pointers _nor_ the
    pointers we find in the array.
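
    A minimal illustration of the const levels involved (hypothetical
    names):

    /* The array is const and so are the strings it points to, so a
     * traversal pointer must carry const at both levels. */
    static const char * const test_data[] = { "ab", "cd" };

    static const char * const *ok = test_data; /* fine: no const dropped */
    static const char **bad = test_data;       /* warns: drops the array's
                                                  own const qualifier */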

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

19 Apr, 2015

4 commits

  • They were for use by the deprecated first_cpu() and next_cpu() wrappers,
    but sparc used them directly.

    They're now replaced by cpumask_first / cpumask_next. And __next_cpu_nr
    is completely obsolete.

    Signed-off-by: Rusty Russell
    Acked-by: David S. Miller

    Rusty Russell
     
  • Fixes warnings due to
    - no DMA_ERROR_CODE on PARISC,
    - sizeof (unsigned long) == 4 bytes on PARISC.

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • Investigation of multithreaded iperf experiments on an ethernet
    interface show the iommu->lock as the hottest lock identified by
    lockstat, with something of the order of 21M contentions out of
    27M acquisitions, and an average wait time of 26 us for the lock.
    This is not efficient. A more scalable design is to follow the ppc
    model, where the iommu_map_table has multiple pools, each stretching
    over a segment of the map, and with a separate lock for each pool.
    This model allows for better parallelization of the iommu map search.

    This patch adds the iommu range alloc/free function infrastructure.
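
    For illustration, a hedged sketch of the multi-pool idea; the names
    are made up and much is elided (hint wraparound, large allocations),
    but the locking structure is the point:

    #include <linux/bitmap.h>
    #include <linux/spinlock.h>

    #define SKETCH_NPOOLS 4

    struct pool_sketch {
            spinlock_t      lock;
            unsigned long   start, end;     /* pool's segment of the map */
            unsigned long   hint;           /* where to start searching */
    };

    struct map_table_sketch {
            unsigned long           *map;   /* one bit per IOMMU page */
            struct pool_sketch      pools[SKETCH_NPOOLS];
    };

    /* Allocations hitting different pools no longer contend on one lock. */
    static unsigned long range_alloc_sketch(struct map_table_sketch *tbl,
                                            unsigned int cpu,
                                            unsigned int npages)
    {
            struct pool_sketch *p = &tbl->pools[cpu & (SKETCH_NPOOLS - 1)];
            unsigned long n;

            spin_lock(&p->lock);
            n = bitmap_find_next_zero_area(tbl->map, p->end, p->hint,
                                           npages, 0);
            if (n < p->end) {
                    bitmap_set(tbl->map, n, npages);
                    p->hint = n + npages;
            }
            spin_unlock(&p->lock);
            return n;       /* n >= p->end signals failure here */
    }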

    Signed-off-by: Sowmini Varadhan
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • I applied the wrong version of this patch series, V4 instead
    of V10, due to a patchwork bundling snafu.

    Signed-off-by: David S. Miller

    David S. Miller
     

18 Apr, 2015

2 commits

  • Pull sparc updates from David Miller:
    "The PowerPC folks have a really nice scalable IOMMU pool allocator
    that we wanted to make use of for sparc. So here we have a series
    that abstracts out their code into a common layer that anyone can make
    use of.

    Sparc is converted, and the PowerPC folks have reviewed and ACK'd this
    series and plan to convert PowerPC over as well"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    iommu-common: Fix PARISC compile-time warnings
    sparc: Make LDC use common iommu poll management functions
    sparc: Make sparc64 use scalable lib/iommu-common.c functions
    sparc: Break up monolithic iommu table/lock into finer granularity pools and lock

    Linus Torvalds
     
  • Fixes warnings due to
    - no DMA_ERROR_CODE on PARISC,
    - sizeof (unsigned long) == 4 bytes on PARISC.

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Apr, 2015

12 commits

  • Merge third patchbomb from Andrew Morton:

    - various misc things

    - a couple of lib/ optimisations

    - provide DIV_ROUND_CLOSEST_ULL()

    - checkpatch updates

    - rtc tree

    - befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs

    - ptrace fixes

    - fork() fixes

    - seccomp cleanups

    - more mmap_sem hold time reductions from Davidlohr

    * emailed patches from Andrew Morton : (138 commits)
    proc: show locks in /proc/pid/fdinfo/X
    docs: add missing and new /proc/PID/status file entries, fix typos
    drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
    Documentation/spi/spidev_test.c: fix warning
    drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
    .gitignore: ignore *.tar
    MAINTAINERS: add Mediatek SoC mailing list
    tomoyo: reduce mmap_sem hold for mm->exe_file
    powerpc/oprofile: reduce mmap_sem hold for exe_file
    oprofile: reduce mmap_sem hold for mm->exe_file
    mips: ip32: add platform data hooks to use DS1685 driver
    lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
    x86: switch to using asm-generic for seccomp.h
    sparc: switch to using asm-generic for seccomp.h
    powerpc: switch to using asm-generic for seccomp.h
    parisc: switch to using asm-generic for seccomp.h
    mips: switch to using asm-generic for seccomp.h
    microblaze: use asm-generic for seccomp.h
    arm: use asm-generic for seccomp.h
    seccomp: allow COMPAT sigreturn overrides
    ...

    Linus Torvalds
     
  • Cc: Yalin Wang
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • cpumask_next_and() looks for cpumask_next() in src1 in a loop and
    tests whether the found cpu is also present in src2. Remove that
    loop: perform cpumask_and() of src1 and src2 first, and use the
    resulting mask to find cpumask_next().

    Apart from removing the while loop, ./bloat-o-meter on x86_64 shows

    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
    function                             old     new   delta
    cpumask_next_and                      62      54      -8
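
    For illustration, roughly the before and after (hedged; local names):

    #include <linux/cpumask.h>

    /* Before: walk src1, testing each hit against src2. */
    static int next_and_old(int n, const struct cpumask *src1p,
                            const struct cpumask *src2p)
    {
            while ((n = cpumask_next(n, src1p)) < nr_cpu_ids)
                    if (cpumask_test_cpu(n, src2p))
                            break;
            return n;
    }

    /* After: AND the masks once, then a single find. */
    static int next_and_new(int n, const struct cpumask *src1p,
                            const struct cpumask *src2p)
    {
            struct cpumask tmp;

            if (cpumask_and(&tmp, src1p, src2p))
                    return cpumask_next(n, &tmp);
            return nr_cpu_ids;
    }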

    Signed-off-by: Sergey Senozhatsky
    Cc: Tejun Heo
    Cc: "David S. Miller"
    Cc: Amir Vadai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergey Senozhatsky
     
  • bitmap_empty() has its own implementation. But it's clearly as simple as:

    find_first_bit(src, nbits) == nbits

    The same is true for 'bitmap_full'.
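
    Roughly the reduction the patch performs, sketched with local names:

    #include <linux/bitops.h>
    #include <linux/types.h>

    static inline bool bitmap_empty_sketch(const unsigned long *src,
                                           unsigned int nbits)
    {
            return find_first_bit(src, nbits) == nbits;
    }

    static inline bool bitmap_full_sketch(const unsigned long *src,
                                          unsigned int nbits)
    {
            return find_first_zero_bit(src, nbits) == nbits;
    }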

    Signed-off-by: Yury Norov
    Cc: George Spelvin
    Cc: Alexey Klimov
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • I hadn't had enough coffee when I wrote this. Currently, the final
    increment of buf depends on the value loaded from the table, and
    causes gcc to emit a cmov immediately before the return. It is smarter
    to let it depend on r, since the increment can then be computed in
    parallel with the final load/store pair. It also shaves 16 bytes of
    .text.

    Signed-off-by: Rasmus Villemoes
    Cc: Tejun Heo
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • bucket_find_contain() will search the bucket list for a dma_debug_entry.
    When the entry isn't found it needs to search other buckets too, since
    only the start address of a dma range is hashed (which might be in a
    different bucket).

    A copy of the dma_debug_entry is used to get the previous hash bucket,
    but when its list is searched, the original dma_debug_entry must be
    used, not its modified copy.

    This fixes false "device driver tries to sync DMA memory it has not allocated"
    warnings.

    Signed-off-by: Sebastian Ott
    Cc: Florian Fainelli
    Cc: Horia Geanta
    Cc: Jiri Kosina
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sebastian Ott
     
  • The most expensive part of decimal conversion is the divisions by 10
    (albeit done using reciprocal multiplication with appropriately chosen
    constants). I decided to see if one could eliminate around half of
    these multiplications by emitting two digits at a time, at the cost of a
    200 byte lookup table, and it does indeed seem like there is something
    to be gained, especially on 64 bits. Microbenchmarking shows
    improvements ranging from -50% (for numbers uniformly distributed in [0,
    2^64-1]) to -25% (for numbers heavily biased toward the smaller end, a
    more realistic distribution).

    On a larger scale, perf shows that top, one of the big consumers of /proc
    data, uses 0.5-1.0% fewer cpu cycles.

    I had to jump through some hoops to get the 32 bit code to compile and run
    on my 64 bit machine, so I'm not sure how relevant these numbers are, but
    just for comparison the microbenchmark showed improvements between -30%
    and -10%.

    The bloat-o-meter costs are around 150 bytes (the generated code is a
    little smaller, so it's not the full 200 bytes) on both 32 and 64 bit.
    I'm aware that extra cache misses won't show up in a microbenchmark as
    used above, but on the other hand decimal conversions often happen in bulk
    (for example in the case of top).

    I have of course tested that the new code generates the same output as the
    old, for both the first and last 1e10 numbers in [0,2^64-1] and 4e9
    'random' numbers in-between.

    Test and verification code on github: https://github.com/Villemoes/dec.
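
    For illustration, a hedged userspace sketch of the two-digits-per-step
    idea (not the kernel's exact put_dec; names are made up):

    #include <stdint.h>

    /* The 200-byte table: entry r holds the two characters of r. */
    static const char two_digits[200] =
            "0001020304050607080910111213141516171819"
            "2021222324252627282930313233343536373839"
            "4041424344454647484950515253545556575859"
            "6061626364656667686970717273747576777879"
            "8081828384858687888990919293949596979899";

    /* Emit the decimal digits of v into buf in reverse order; one
     * division by 100 replaces two divisions by 10. Returns the count
     * of digits written (the caller reverses them). */
    static int put_dec_sketch(char *buf, uint64_t v)
    {
            int len = 0;

            while (v >= 100) {
                    unsigned int r = (unsigned int)(v % 100);

                    v /= 100;
                    buf[len++] = two_digits[2 * r + 1];     /* units */
                    buf[len++] = two_digits[2 * r];         /* tens */
            }
            buf[len++] = two_digits[2 * v + 1];             /* last units */
            if (v >= 10)
                    buf[len++] = two_digits[2 * v];         /* last tens */
            return len;
    }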

    Signed-off-by: Rasmus Villemoes
    Tested-by: Jeff Epler
    Cc: "Peter Zijlstra (Intel)"
    Cc: Tejun Heo
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • This file contains the implementation of all find_*_bit{,_le}
    variants, so giving it a more generic name looks reasonable.

    Signed-off-by: Yury Norov
    Reviewed-by: Rasmus Villemoes
    Reviewed-by: George Spelvin
    Cc: Alexey Klimov
    Cc: David S. Miller
    Cc: Daniel Borkmann
    Cc: Hannes Frederic Sowa
    Cc: Lai Jiangshan
    Cc: Mark Salter
    Cc: AKASHI Takahiro
    Cc: Thomas Graf
    Cc: Valentin Rothberg
    Cc: Chris Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • Currently the whole 'find_*_bit' family is located in lib/find_next_bit.c,
    except 'find_last_bit', which is in lib/find_last_bit.c. It seems
    there's no major benefit to keeping it separate.

    Signed-off-by: Yury Norov
    Reviewed-by: Rasmus Villemoes
    Reviewed-by: George Spelvin
    Cc: Alexey Klimov
    Cc: David S. Miller
    Cc: Daniel Borkmann
    Cc: Hannes Frederic Sowa
    Cc: Lai Jiangshan
    Cc: Mark Salter
    Cc: AKASHI Takahiro
    Cc: Thomas Graf
    Cc: Valentin Rothberg
    Cc: Chris Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • This patchset reworks the find_bit function family to achieve better
    performance and decrease the size of text. All the rework is done in
    patch 1; patches 2 and 3 are about code moving and renaming.

    It was boot-tested on x86_64 and MIPS (big-endian) machines.
    Performance tests were run in userspace with code like this:

    /* addr[] is filled from /dev/urandom */
    start = clock();
    while (ret < nbits)
            ret = find_next_bit(addr, nbits, ret + 1);

    end = clock();
    printf("%ld\t", (unsigned long) end - start);

    On an Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz the measurements are as
    follows (for find_next_bit, nbits is 8M; for find_first_bit, 80K):

    find_next_bit:         find_first_bit:
    new      current       new      current
    26932    43151         14777    14925
    26947    43182         14521    15423
    26507    43824         15053    14705
    27329    43759         14473    14777
    26895    43367         14847    15023
    26990    43693         15103    15163
    26775    43299         15067    15232
    27282    42752         14544    15121
    27504    43088         14644    14858
    26761    43856         14699    15193
    26692    43075         14781    14681
    27137    42969         14451    15061
      ...                    ...

    find_next_bit performance gain is 35-40%;
    find_first_bit - no measurable difference.

    On ARM machines, there is an arch-specific implementation of find_bit.

    Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
    helpful discussions.

    This patch (of 3):

    The new implementation takes less space in the source file (see
    diffstat) and in the object: for me it's 710 vs 453 bytes of text.
    It also shows better performance.

    find_last_bit description fixed due to obvious typo.

    [akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus]
    Signed-off-by: Yury Norov
    Reviewed-by: Rasmus Villemoes
    Reviewed-by: George Spelvin
    Cc: Alexey Klimov
    Cc: David S. Miller
    Cc: Daniel Borkmann
    Cc: Hannes Frederic Sowa
    Cc: Lai Jiangshan
    Cc: Mark Salter
    Cc: AKASHI Takahiro
    Cc: Thomas Graf
    Cc: Valentin Rothberg
    Cc: Chris Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • Pull SCSI updates from James Bottomley:
    "This is the usual grab bag of driver updates (lpfc, qla2xxx, storvsc,
    aacraid, ipr) plus an assortment of minor updates. There's also a
    major update to aic1542 which moves the driver into this millennium"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (106 commits)
    change SCSI Maintainer email
    sd, mmc, virtio_blk, string_helpers: fix block size units
    ufs: add support to allow non standard behaviours (quirks)
    ufs-qcom: save controller revision info in internal structure
    qla2xxx: Update driver version to 8.07.00.18-k
    qla2xxx: Restore physical port WWPN only, when port down detected for FA-WWPN port.
    qla2xxx: Fix virtual port configuration, when switch port is disabled/enabled.
    qla2xxx: Prevent multiple firmware dump collection for ISP27XX.
    qla2xxx: Disable Interrupt handshake for ISP27XX.
    qla2xxx: Add debugging info for MBX timeout.
    qla2xxx: Add serdes read/write support for ISP27XX
    qla2xxx: Add udev notification to save fw dump for ISP27XX
    qla2xxx: Add message for successful FW dump collected for ISP27XX.
    qla2xxx: Add support to load firmware from file for ISP 26XX/27XX.
    qla2xxx: Fix beacon blink for ISP27XX.
    qla2xxx: Increase the wait time for firmware to be ready for P3P.
    qla2xxx: Fix crash due to wrong casting of reg for ISP27XX.
    qla2xxx: Fix warnings reported by static checker.
    lpfc: Update version to 10.5.0.0 for upstream patch set
    lpfc: Update copyright to 2015
    ...

    Linus Torvalds
     
  • Investigation of multithreaded iperf experiments on an ethernet
    interface show the iommu->lock as the hottest lock identified by
    lockstat, with something of the order of 21M contentions out of
    27M acquisitions, and an average wait time of 26 us for the lock.
    This is not efficient. A more scalable design is to follow the ppc
    model, where the iommu_table has multiple pools, each stretching
    over a segment of the map, and with a separate lock for each pool.
    This model allows for better parallelization of the iommu map search.

    This patch adds the iommu range alloc/free function infrastructure.

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

16 Apr, 2015

11 commits

  • Merge second patchbomb from Andrew Morton:

    - the rest of MM

    - various misc bits

    - add ability to run /sbin/reboot at reboot time

    - printk/vsprintf changes

    - fiddle with seq_printf() return value

    * akpm: (114 commits)
    parisc: remove use of seq_printf return value
    lru_cache: remove use of seq_printf return value
    tracing: remove use of seq_printf return value
    cgroup: remove use of seq_printf return value
    proc: remove use of seq_printf return value
    s390: remove use of seq_printf return value
    cris fasttimer: remove use of seq_printf return value
    cris: remove use of seq_printf return value
    openrisc: remove use of seq_printf return value
    ARM: plat-pxa: remove use of seq_printf return value
    nios2: cpuinfo: remove use of seq_printf return value
    microblaze: mb: remove use of seq_printf return value
    ipc: remove use of seq_printf return value
    rtc: remove use of seq_printf return value
    power: wakeup: remove use of seq_printf return value
    x86: mtrr: if: remove use of seq_printf return value
    linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK
    MAINTAINERS: CREDITS: remove Stefano Brivio from B43
    .mailmap: add Ricardo Ribalda
    CREDITS: add Ricardo Ribalda Delgado
    ...

    Linus Torvalds
     
  • The seq_printf return value, because it's frequently misused,
    will eventually be converted to void.

    See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
    seq_has_overflowed() and make public")
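
    For illustration, a hedged sketch of the replacement pattern (the
    show function is hypothetical):

    #include <linux/seq_file.h>
    #include <linux/printk.h>

    static int my_show(struct seq_file *m, void *v)
    {
            seq_printf(m, "value: %d\n", 42);  /* don't test the result */

            /* If overflow matters, ask the seq_file itself; the core
             * notices too and retries with a bigger buffer. */
            if (seq_has_overflowed(m))
                    pr_debug("seq buffer full\n");
            return 0;
    }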

    Signed-off-by: Joe Perches
    Cc: Lars Ellenberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • The current semantics of string_escape_mem are inadequate for one of its
    current users, vsnprintf(). If that is to honour its contract, it must
    know how much space would be needed for the entire escaped buffer, and
    string_escape_mem provides no way of obtaining that (short of allocating a
    large enough buffer (~4 times input string) to let it play with, and
    that's definitely a big no-no inside vsnprintf).

    So change the semantics for string_escape_mem to be more snprintf-like:
    Return the size of the output that would be generated if the destination
    buffer was big enough, but of course still only write to the part of dst
    it is allowed to, and (contrary to snprintf) don't do '\0'-termination.
    It is then up to the caller to detect whether output was truncated and to
    append a '\0' if desired. Also, we must output partial escape sequences,
    otherwise a call such as snprintf(buf, 3, "%1pE", "\123") would cause
    printf to write a \0 to buf[2] but leave buf[0] and buf[1] with whatever
    they previously contained.

    This also fixes a bug in the escaped_string() helper function, which used
    to unconditionally pass a length of "end-buf" to string_escape_mem();
    since the latter doesn't check osz for being insanely large, it would
    happily write to dst. For example, kasprintf(GFP_KERNEL, "something and
    then %pE", ...); is an easy way to trigger an oops.

    In test-string_helpers.c, the -ENOMEM test is replaced with testing for
    getting the expected return value even if the buffer is too small. We
    also ensure that nothing is written (by relying on a NULL pointer deref)
    if the output size is 0 by passing NULL - this has to work for
    kasprintf("%pE") to work.

    In net/sunrpc/cache.c, I think qword_add still has the same semantics.
    Someone should definitely double-check this.

    In fs/proc/array.c, I made the minimum possible change, but longer-term it
    should stop poking around in seq_file internals.
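
    For illustration, a hedged usage sketch of the new contract (helper
    name made up):

    #include <linux/kernel.h>
    #include <linux/string_helpers.h>

    static int escape_into(char *dst, size_t dstsz,
                           const char *src, size_t srclen)
    {
            int want = string_escape_mem(src, srclen, dst, dstsz,
                                         ESCAPE_ANY_NP, NULL);

            /* want >= dstsz means the output was truncated; 'want' is
             * what a full escape would have needed. '\0'-termination,
             * if desired, is now the caller's job: */
            if (dstsz)
                    dst[min_t(size_t, want, dstsz - 1)] = '\0';
            return want;
    }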

    [andriy.shevchenko@linux.intel.com: simplify qword_add]
    [andriy.shevchenko@linux.intel.com: add missed curly braces]
    Signed-off-by: Rasmus Villemoes
    Acked-by: Andy Shevchenko
    Signed-off-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • When printf is given the format specifier %pE, it needs a way of obtaining
    the total output size that would be generated if the buffer was large
    enough, and string_escape_mem doesn't easily provide that. This is a
    refactorization of string_escape_mem in preparation for changing its
    external API to provide that information.

    The somewhat ugly early returns and subsequent seemingly redundant
    conditionals are to make the following patch touch as little as possible
    in string_helpers.c while still preserving the current behaviour of never
    outputting partial escape sequences. That behaviour must also change for
    %pE to work as one expects from every other printf specifier.

    Signed-off-by: Rasmus Villemoes
    Acked-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • The helper hex_string() is broken in two ways. First, it doesn't
    increment buf regardless of whether there is room to print, so callers
    such as kasprintf() that try to probe the correct storage to allocate will
    get a too small return value. But even worse, kasprintf() (and likely
    anyone else trying to find the size of the result) pass NULL for buf and 0
    for size, so we also have end == NULL. But this means that the end-1 in
    hex_string() is (char*)-1, so buf < end-1 is true and we get a NULL
    pointer deref. I double-checked this with a trivial kernel module that
    just did a kasprintf(GFP_KERNEL, "%14ph", "CrashBoomBang").

    Nobody seems to be using %ph with kasprintf, but we might as well fix it
    before it hits someone.
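
    For illustration, a hedged sketch of the safe pattern (modeled on
    number(); the helper is hypothetical): check the bound before every
    write but always advance buf, so buf's final position still reports
    the full length even on a size probe where buf == end == NULL:

    #include <linux/kernel.h>
    #include <linux/types.h>

    static char *hex_bytes(char *buf, char *end, const u8 *addr, int len)
    {
            int i;

            for (i = 0; i < len; i++) {
                    if (buf < end)
                            *buf = hex_asc_hi(addr[i]);
                    ++buf;
                    if (buf < end)
                            *buf = hex_asc_lo(addr[i]);
                    ++buf;
            }
            return buf;
    }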

    Signed-off-by: Rasmus Villemoes
    Acked-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • Add format specifiers for printing struct clk:
    - '%pC' or '%pCn': name (Common Clock Framework) or address (legacy
    clock framework) of the clock,
    - '%pCr': rate of the clock.
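
    A hedged usage sketch, assuming CONFIG_HAVE_CLK and a valid
    struct clk pointer:

    #include <linux/clk.h>
    #include <linux/printk.h>

    static void report_clk(struct clk *clk)
    {
            pr_info("using clock %pC at %pCr Hz\n", clk, clk);
    }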

    [akpm@linux-foundation.org: omit code if !CONFIG_HAVE_CLK]
    Signed-off-by: Geert Uytterhoeven
    Cc: Jonathan Corbet
    Cc: Mike Turquette
    Cc: Stephen Boyd
    Cc: Tetsuo Handa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • By making ZEROPAD == '0' - ' ', we can eliminate a few more
    instructions.
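
    A hedged illustration of the trick, with a local stand-in for the
    flag: since '0' - ' ' == 16, the pad character falls out of the
    flags word without a branch:

    #define MY_ZEROPAD ('0' - ' ')          /* 16 */

    static char pad_char(unsigned int flags)
    {
            return ' ' + (flags & MY_ZEROPAD);      /* ' ' or '0' */
    }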

    Signed-off-by: Rasmus Villemoes
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • gcc doesn't merge or overlap const char[] objects with identical contents
    (probably language lawyers would also insist that these things have
    different addresses), but there's no reason to have the string
    "0123456789ABCDEF" occur in multiple places. hex_asc_upper is declared in
    kernel.h and defined in lib/hexdump.c, which is unconditionally compiled
    in.

    Signed-off-by: Rasmus Villemoes
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • At least since the initial git commit, when base was passed as a separate
    parameter, number() has only been called with bases 8, 10 and 16. I'm
    guessing that 66 was to accommodate 64 0/1 digits, a sign and a '\0', but the
    buffer is only used for the actual digits. Octal digits carry 3 bits of
    information, so 24 is enough. Spell that 3*sizeof(num) so one less place
    needs to be changed should long long ever be 128 bits. Also remove the
    commented-out code that would handle an arbitrary base.
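
    For illustration: of the three bases used (8, 10, 16), octal needs
    the most digits at 3 bits each, so this bound always suffices:

    char tmp[3 * sizeof(unsigned long long)];   /* 24 for a 64-bit num,
                                                   vs. the old 66 */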

    Signed-off-by: Rasmus Villemoes
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • Since FORMAT_TYPE_INT is simply 1 more than FORMAT_TYPE_UINT, and
    similarly for BYTE/UBYTE, SHORT/USHORT, LONG/ULONG, we can eliminate a few
    instructions by making SIGN have the value 1 instead of 2, and then use
    arithmetic instead of branches for computing the right spec->type. It's a
    little hacky, but certainly in the same spirit as SMALL needing to have
    the value 0x20. For example for the spec->qualifier == 'l' case, gcc now
    generates

    75e: 0f b6 53 01 movzbl 0x1(%rbx),%edx
    762: 83 e2 01 and $0x1,%edx
    765: 83 c2 09 add $0x9,%edx
    768: 88 13 mov %dl,(%rbx)

    instead of

    763: 0f b6 53 01 movzbl 0x1(%rbx),%edx
    767: 83 e2 02 and $0x2,%edx
    76a: 80 fa 01 cmp $0x1,%dl
    76d: 19 d2 sbb %edx,%edx
    76f: 83 c2 0a add $0xa,%edx
    772: 88 13 mov %dl,(%rbx)
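
    A hedged, self-contained illustration of the enum trick: each signed
    type sits one past its unsigned partner and the sign flag is bit 0,
    so the final type is plain arithmetic on the flag instead of a branch
    (names are local stand-ins):

    enum { TYPE_UBYTE, TYPE_BYTE, TYPE_USHORT, TYPE_SHORT,
           TYPE_UINT, TYPE_INT, TYPE_ULONG, TYPE_LONG };
    #define SIGN_FLAG 1u

    static int pick_type(unsigned int flags, int unsigned_type)
    {
            return unsigned_type + (flags & SIGN_FLAG);     /* branch-free */
    }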

    Signed-off-by: Rasmus Villemoes
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • const char *...[] is not const, but an array of pointers to const. So
    these arrays cannot be __initconst, but must be __initdata.

    This fixes section conflicts with LTO.
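
    A minimal illustration (hypothetical names):

    #include <linux/init.h>

    /* The array object itself is writable data (it holds pointers);
     * only the pointed-to strings are const. */
    static const char *names[] __initdata = { "foo", "bar" };   /* ok */
    /*
     * static const char *names[] __initconst = { ... };  -- wrong: the
     * array is not itself const, so it cannot live in a const section.
     */
    static const char * const cnames[] __initconst = { "foo" }; /* also ok */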

    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen