28 Feb, 2013

16 commits

  • Until recently, when a negative ID was specified, idr functions used to
    ignore the sign bit and proceed with the operation on the remaining
    bits, which is bizarre and error-prone. The behavior has since been
    changed so that negative IDs are treated as invalid, but we still trigger
    WARN_ON_ONCE() on negative IDs in case somebody was depending on the
    sign bit being ignored, so that those callers can be detected and fixed
    easily.

    We only need this for a while. Explain why WARN_ON_ONCE()s are there and
    that they can be removed later.
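
    A hypothetical sketch of the pattern described above (the real checks
    sit inside the idr entry points in lib/idr.c):

    #include <linux/idr.h>

    void *idr_lookup_sketch(struct idr *idp, int id)
    {
            /*
             * Negative IDs used to have their sign bit silently ignored.
             * They are invalid now; the WARN_ON_ONCE() is temporary, so
             * callers relying on the old behavior can be found and fixed.
             */
            if (WARN_ON_ONCE(id < 0))
                    return NULL;

            return idr_find(idp, id);
    }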

    Signed-off-by: Tejun Heo
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • While idr lookup isn't a particularly heavy operation, it is still too
    substantial to use in hot paths without worrying about the performance
    implications. With recent changes, each idr_layer covers 256 slots,
    which should be enough to cover most use cases with a single idr_layer,
    making a lookup hint very attractive.

    This patch adds idr->hint which points to the idr_layer which
    allocated an ID most recently and the fast path lookup becomes

    if (lookup target's prefix matches that of the hinted layer)
            return hint->ary[ID's offset in the leaf layer];

    which can be inlined.

    idr->hint is set to the leaf node on idr_fill_slot() and cleared from
    free_layer().
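
    In C, the inlined fast path is roughly the following sketch (assuming,
    as described above, that ->prefix holds the layer's ID prefix and
    IDR_MASK covers the offset within a leaf layer):

    #include <linux/idr.h>

    static inline void *idr_find_fast_sketch(struct idr *idr, int id)
    {
            struct idr_layer *hint = rcu_dereference_raw(idr->hint);

            /* fast path: the hinted leaf layer covers this ID */
            if (hint && (id & ~IDR_MASK) == hint->prefix)
                    return rcu_dereference_raw(hint->ary[id & IDR_MASK]);

            return idr_find_slowpath(idr, id);   /* fall back to the tree walk */
    }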

    [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized]
    Signed-off-by: Tejun Heo
    Cc: Kirill A. Shutemov
    Cc: Sasha Levin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Add a field which carries the prefix of the IDs the idr_layer covers.
    This will be used to implement the lookup hint.

    This patch doesn't make use of the new field and doesn't introduce any
    behavior difference.
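
    A sketch of the new field (abridged; the real struct idr_layer has
    several more members):

    struct idr_layer {
            int     prefix;         /* the ID prefix of this idr_layer */
            /* ... bitmap, ary[], count, layer, rcu_head, ... */
    };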

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Currently, idr_layer->bitmap is declared as a single unsigned long,
    which restricts the number of bits an idr_layer can contain. All bitops
    can handle an arbitrary number of bits and there's no reason for this
    restriction.

    Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
    unsigned long.

    * idr_layer->bitmap is now an array. '&' dropped from params to
    bitops.

    * Replaced "== IDR_FULL" tests with bitmap_full() and removed
    IDR_FULL.

    * Replaced find_next_bit() on ~bitmap with find_next_zero_bit().

    * Replaced "bitmap = 0" with bitmap_clear().

    This patch doesn't (or at least shouldn't) introduce any behavior
    changes.
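
    A rough before/after sketch (illustrative names; IDR_SIZE_SKETCH stands
    in for the real per-layer slot count):

    #include <linux/bitmap.h>
    #include <linux/errno.h>

    #define IDR_SIZE_SKETCH 256

    struct idr_layer_sketch {
            DECLARE_BITMAP(bitmap, IDR_SIZE_SKETCH); /* was: unsigned long bitmap; */
    };

    static int find_free_slot_sketch(struct idr_layer_sketch *layer)
    {
            /* was: if (layer->bitmap == IDR_FULL) return -ENOSPC; */
            if (bitmap_full(layer->bitmap, IDR_SIZE_SKETCH))
                    return -ENOSPC;

            /* was: find_next_bit() on ~bitmap */
            return find_next_zero_bit(layer->bitmap, IDR_SIZE_SKETCH, 0);
    }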

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • MAX_IDR_MASK is another weirdness in the idr interface. As idr covers
    the whole positive integer range, it's defined as 0x7fffffff or INT_MAX.

    Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
    They basically mask off the sign bit and operate on the rest, so if
    the caller, by accident, passes in a negative number, the sign bit
    will be masked off and the remaining part will be used as if that was
    the input, which is worse than crashing.

    The constant is visible in idr.h and there are several users in the
    kernel.

    * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()

    Basically used to test if adap->nr is a negative number which isn't
    -1 and returns -EINVAL if so. idr_alloc() already has negative
    @start checking (w/ WARN_ON_ONCE), so this can go away.

    * drivers/infiniband/core/cm.c:cm_alloc_id()
    drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()

    Used to wrap the cyclic @start; can be replaced with max(next, 0)
    (see the sketch below). Note that this type of cyclic allocation
    using idr is buggy; it is prone to spurious -ENOSPC failures after
    the first wraparound.

    * fs/super.c:get_anon_bdev()

    The ID allocated from ida is masked off before being tested for
    whether it's inside the valid range. An ida-allocated ID can never be
    negative, so the masking is unnecessary.

    Update idr_*() functions to fail with -EINVAL when negative @id is
    specified and update other MAX_IDR_MASK users as described above.

    This leaves MAX_IDR_MASK without any users; remove it and relocate the
    other MAX_IDR_* constants to lib/idr.c.
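
    A hedged sketch of the cyclic-allocation replacement mentioned for the
    IB CM code above (hypothetical helper and variable names; the real call
    sites differ in detail):

    #include <linux/gfp.h>
    #include <linux/idr.h>
    #include <linux/kernel.h>

    static int alloc_cyclic_id_sketch(struct idr *idr, void *ptr, int *next)
    {
            /* was: start = *next & MAX_IDR_MASK; */
            int id = idr_alloc(idr, ptr, max(*next, 0), 0, GFP_KERNEL);

            if (id >= 0)
                    *next = id + 1; /* may wrap negative; max() clamps it next time */
            return id;
    }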

    Signed-off-by: Tejun Heo
    Cc: Jean Delvare
    Cc: Roland Dreier
    Cc: Sean Hefty
    Cc: Hal Rosenstock
    Cc: "Marciniszyn, Mike"
    Cc: Jack Morgenstein
    Cc: Or Gerlitz
    Cc: Al Viro
    Acked-by: Wolfram Sang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Most functions in idr fail to deal with the high bits when the idr
    tree grows to the maximum height.

    * idr_get_empty_slot() stops growing the idr tree once the depth
    reaches MAX_IDR_LEVEL - 1, which is one level shallower than necessary
    to cover the whole range. The function doesn't even notice that it
    didn't grow the tree enough and ends up allocating the wrong ID
    given a sufficiently high @starting_id.

    For example, on 64 bit, if the starting id is 0x7fffff01,
    idr_get_empty_slot() will grow the tree 5 layers deep, which only
    covers 30 bits, and then proceeds to allocate as if bit 30 weren't
    specified. It ends up allocating 0x3fffff01, without bit 30, but
    still returns 0x7fffff01.

    * __idr_remove_all() will not remove anything if the tree is fully
    grown.

    * idr_find() can't find anything if the tree is fully grown.

    * idr_for_each() and idr_get_next() can't iterate anything if the tree
    is fully grown.

    Fix it by introducing idr_max(), which returns the maximum possible ID
    given the depth of the tree, and replacing the id limit checks in all
    affected places.

    As the idr_layer pointer array pa[] needs to be 1 larger than the
    maximum depth, enlarge pa[] arrays by one.
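
    The new idr_max() helper is essentially the following sketch (IDR_BITS
    is the number of ID bits covered per level and MAX_IDR_SHIFT clamps the
    result to the positive int range):

    #include <linux/kernel.h>

    static int idr_max(int layers)
    {
            int bits = min_t(int, layers * IDR_BITS, MAX_IDR_SHIFT);

            return (1 << bits) - 1;
    }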

    While this plugs the discovered issues, the whole code base is
    horrible and in desperate need of a rewrite. It's fragile like hell.

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Cc:

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • The current idr interface is very cumbersome.

    * For all allocations, two function calls - idr_pre_get() and
    idr_get_new*() - should be made.

    * idr_pre_get() doesn't guarantee that the following idr_get_new*()
    will not fail from memory shortage. If idr_get_new*() returns
    -EAGAIN, the caller is expected to retry pre_get and allocation.

    * idr_get_new*() can't enforce an upper limit. An upper limit can only
    be enforced by allocating and then freeing the ID if it is above the
    limit.

    * idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping
    around MAX_IDR_FREE idr_layers. The memory consumed per idr is
    under two pages but it makes it difficult to make idr_layer larger.

    This patch implements the following new set of allocation functions.

    * idr_preload[_end]() - Similar to radix tree preloading but doesn't
    fail. The first idr_alloc() inside the preload section can be treated
    as if it were called with the @gfp_mask used for idr_preload().

    * idr_alloc() - Allocate an ID with lower and upper limits. Takes
    @gfp_flags and can be used without preloading. When used inside a
    preloaded section, the allocation mask of the preloading can be assumed.

    If idr_alloc() can be called from a context which allows a sufficiently
    relaxed @gfp_mask, it can be used by itself. If, for example,
    idr_alloc() is called inside a spinlock-protected region, preloading
    can be used as follows.

    idr_preload(GFP_KERNEL);
    spin_lock(lock);

    id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);

    spin_unlock(lock);
    idr_preload_end();
    if (id < 0)
            error;

    which is much simpler and less error-prone than the idr_pre_get() and
    idr_get_new*() loop.

    The new interface uses a per-cpu idr_layer buffer and thus the number
    of idrs in the system doesn't affect the amount of memory used for
    preloading.

    idr_layer_alloc() is introduced to handle idr_layer allocations for
    both the old and new ID allocation paths. This is a bit hairy now but
    the new interface is expected to replace the old one, and the internal
    implementation will eventually become simpler.

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Move slot filling to idr_fill_slot() from idr_get_new_above_int() and
    make idr_get_new_above() directly call it. idr_get_new_above_int() is
    no longer needed and removed.

    This will be used to implement a new ID allocation interface.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
    exception conditions internally. The return value is later translated
    to errno values using _idr_rc_to_errno().

    This is confusing. Drop the custom ones and consistently use -EAGAIN
    for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
    "ran out of ID space".

    Due to the weird memory preloading mechanism, id[ra]_get_new*() return
    -EAGAIN on memory shortage, so we need to substitute -ENOMEM with
    -EAGAIN in those interface functions. They'll eventually be cleaned
    up and the translations will go away.

    This patch doesn't introduce any functional changes.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • * Move the idr_for_each_entry() definition next to the other
    idr-related definitions.

    * Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().

    This changes the implementation of idr_get_new() but the new
    implementation is trivial. This patch doesn't introduce any
    functional change.
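
    For reference, the wrappers become trivial inlines along these lines
    (sketch):

    static inline int idr_get_new(struct idr *idp, void *ptr, int *id)
    {
            return idr_get_new_above(idp, ptr, 0, id);
    }

    static inline int ida_get_new(struct ida *ida, int *p_id)
    {
            return ida_get_new_above(ida, 0, p_id);
    }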

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • There was only one legitimate use of idr_remove_all() and many more
    incorrect uses (or omissions of it). Now that idr_destroy() implies
    idr_remove_all() and all the in-kernel users have been updated not to
    use it, there's no reason to keep it around. Mark it deprecated so that
    we can later unexport it.

    idr_remove_all() is made an inline function calling __idr_remove_all()
    to avoid triggering a deprecation warning on EXPORT_SYMBOL().
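
    The header-side shape of the change is roughly (sketch):

    void __idr_remove_all(struct idr *idp);         /* exported from lib/idr.c */

    /* inline wrapper so EXPORT_SYMBOL(__idr_remove_all) itself doesn't
     * trip the deprecation warning */
    static inline void __deprecated idr_remove_all(struct idr *idp)
    {
            __idr_remove_all(idp);
    }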

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • idr is silly in quite a few ways, one of which is how it's supposed to
    be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
    if the idr isn't empty. If the caller forgets idr_remove_all(), it
    simply leaks memory.

    Even ida gets this wrong and leaks memory on destruction. There is
    absolutely no reason not to call idr_remove_all() from idr_destroy().
    Nobody is abusing idr_destroy() to shrink the free layer buffer while
    continuing to use the idr afterwards, so it's safe to do remove_all
    from destroy.

    In the whole kernel, there is only one place where idr_remove_all() is
    legitimately used without a following idr_destroy(), while there are
    quite a few places where the caller forgets either idr_remove_all() or
    idr_destroy(), leaking memory.

    This patch makes idr_destroy() call idr_remove_all() and updates the
    function description accordingly.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • The iteration logic of idr_get_next() is borrowed mostly verbatim from
    idr_for_each(). It walks down the tree looking for the slot matching
    the current ID. If the matching slot is not found, the ID is
    incremented by the distance of a single slot at the given level and
    the walk repeats.

    The implementation assumes that during the whole iteration id is aligned
    to the layer boundaries of the level closest to the leaf, which is true
    for all iterations starting from zero or an existing element and thus is
    fine for idr_for_each().

    However, idr_get_next() may be given any point, and if the starting id
    hits in the middle of a non-existent layer, the increment to the next
    layer will end up skipping to the same offset into it. For example, an
    IDR with IDs filled in [64, 127] would look like the following.

        [  0   64  ...  ]
       /       |
      |        |
    NULL   [ 64 ... 127 ]

    If idr_get_next() is called with 63 as the starting point, it will try
    to follow down the pointer from 0. As it is NULL, it will then try to
    proceed to the next slot in the same level by adding the slot distance
    at that level, which is 64, making the next try 127. It goes around the
    loop, finds 127, and returns it, skipping [64, 126].

    Note that this bug also triggers in an idr_for_each_entry() loop which
    deletes during iteration, as deletions can make layers go away, leaving
    the iteration with an unaligned ID pointing into missing layers.

    Fix it by ensuring proceeding to the next slot doesn't carry over the
    unaligned offset - ie. use round_up(id + 1, slot_distance) instead of
    id += slot_distance.
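
    The stepping change itself is a one-liner; as a sketch (n is the bit
    shift of the current level, so 1 << n is the slot distance there):

    #include <linux/kernel.h>

    static int next_candidate_id_sketch(int id, int n)
    {
            /* was: return id + (1 << n);  -- carries the unaligned offset */
            return round_up(id + 1, 1 << n);
    }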

    Signed-off-by: Tejun Heo
    Reported-by: David Teigland
    Cc: KAMEZAWA Hiroyuki
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • For better code reuse, use the newly added page iterator to iterate
    through the pages. The offset and length within the page are still
    calculated by the mapping iterator, as is the actual mapping. Idea
    from Tejun Heo.

    Signed-off-by: Imre Deak
    Cc: Maxim Levitsky
    Cc: Tejun Heo
    Cc: Daniel Vetter
    Cc: James Hogan
    Cc: Stephen Warren
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Imre Deak
     
  • Add an iterator to walk through a scatter list a page at a time,
    starting at a specific page offset. As opposed to the mapping iterator,
    this is meant to be small, performing well even in simple loops like
    collecting all pages of a scatterlist into an array or setting up an
    IOMMU table based on the pages' DMA addresses.
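
    A hedged usage sketch of the iterator (accessor names as in later
    kernels; the exact helpers in this initial version may differ slightly):

    #include <linux/scatterlist.h>

    /* gather every page of a scatterlist into @pages, starting at offset 0 */
    static int sg_to_page_array_sketch(struct scatterlist *sgl, int nents,
                                       struct page **pages)
    {
            struct sg_page_iter piter;
            int i = 0;

            for_each_sg_page(sgl, &piter, nents, 0)
                    pages[i++] = sg_page_iter_page(&piter);

            return i;
    }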

    Signed-off-by: Imre Deak
    Cc: Maxim Levitsky
    Cc: Tejun Heo
    Cc: Daniel Vetter
    Tested-by: Stephen Warren
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Imre Deak
     
  • A misplaced #endif causes link errors related to pcim_*() functions.

    This is because the pcim_*() functions depend on the CONFIG_PCI option,
    not on the CONFIG_HAS_IOPORT option. Therefore, when CONFIG_PCI is
    enabled and CONFIG_HAS_IOPORT is not, the build fails with link errors
    related to the pcim_*() functions, as below:

    drivers/ata/libata-sff.c:3233: undefined reference to `pcim_iomap_regions'
    drivers/ata/libata-sff.c:3238: undefined reference to `pcim_iomap_table'
    drivers/built-in.o: In function `ata_pci_sff_init_host':
    drivers/ata/libata-sff.c:2318: undefined reference to `pcim_iomap_regions'
    drivers/ata/libata-sff.c:2329: undefined reference to `pcim_iomap_table'
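
    The intended layout is roughly the following (sketch, presumably of
    lib/devres.c, where these helpers live):

    #ifdef CONFIG_HAS_IOPORT
    /* devm_ioport_map() and friends ... */
    #endif /* CONFIG_HAS_IOPORT -- must close before the PCI helpers */

    #ifdef CONFIG_PCI
    /* pcim_iomap_table(), pcim_iomap_regions(), ... */
    #endif /* CONFIG_PCI */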

    Signed-off-by: Jingoo Han
    Cc: Greg KH
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jingoo Han
     

26 Feb, 2013

1 commit

  • Pull module update from Rusty Russell:
    "The sweeping change is to make add_taint() explicitly indicate whether
    to disable lockdep, but it's a mechanical change."

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    MODSIGN: Add option to not sign modules during modules_install
    MODSIGN: Add -s option to sign-file
    MODSIGN: Specify the hash algorithm on sign-file command line
    MODSIGN: Simplify Makefile with a Kconfig helper
    module: clean up load_module a little more.
    modpost: Ignore ARC specific non-alloc sections
    module: constify within_module_*
    taint: add explicit flag to show whether lock dep is still OK.
    module: printk message when module signature fail taints kernel.

    Linus Torvalds
     

23 Feb, 2013

1 commit

  • Pull core locking changes from Ingo Molnar:
    "The biggest change is the rwsem lock-steal improvements, both to the
    assembly optimized and the spinlock based variants.

    The other notable change is the clean up of the seqlock implementation
    to be based on the seqcount infrastructure.

    The rest is assorted smaller debuggability, cleanup and continued -rt
    locking changes."

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    rwsem-spinlock: Implement writer lock-stealing for better scalability
    futex: Revert "futex: Mark get_robust_list as deprecated"
    generic: Use raw local irq variant for generic cmpxchg
    lockdep: Selftest: convert spinlock to raw spinlock
    seqlock: Use seqcount infrastructure
    seqlock: Remove unused functions
    ntp: Make ntp_lock raw
    intel_idle: Convert i7300_idle_lock to raw_spinlock
    locking: Various static lock initializer fixes
    lockdep: Print more info when MAX_LOCK_DEPTH is exceeded
    rwsem: Implement writer lock-stealing for better scalability
    lockdep: Silence warning if CONFIG_LOCKDEP isn't set
    watchdog: Use local_clock for get_timestamp()
    lockdep: Rename print_unlock_inbalance_bug() to print_unlock_imbalance_bug()
    locking/stat: Fix a typo

    Linus Torvalds
     

22 Feb, 2013

11 commits

  • Pull x86 mm changes from Peter Anvin:
    "This is a huge set of several partly interrelated (and concurrently
    developed) changes, which is why the branch history is messier than
    one would like.

    The *really* big items are two humongous patchsets mostly developed
    by Yinghai Lu at my request, which completely revamp the way we
    create initial page tables. In particular, rather than estimating how
    much memory we will need for page tables and then building them into
    that memory -- a calculation that has shown itself to be incredibly fragile -- we
    now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
    a #PF handler which creates temporary page tables on demand.

    This has several advantages:

    1. It makes it much easier to support things that need access to data
    very early (a followon patchset uses this to load microcode way
    early in the kernel startup).

    2. It allows the kernel and all the kernel data objects to be invoked
    from above the 4 GB limit. This allows kdump to work on very large
    systems.

    3. It greatly reduces the difference between Xen and native (Xen's
    equivalent of the #PF handler are the temporary page tables created
    by the domain builder), eliminating a bunch of fragile hooks.

    The patch series also gets us a bit closer to W^X.

    Additional work in this pull is the 64-bit get_user() work which you
    were also involved with, and a bunch of cleanups/speedups to
    __phys_addr()/__pa()."

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits)
    x86, mm: Move reserving low memory later in initialization
    x86, doc: Clarify the use of asm("%edx") in uaccess.h
    x86, mm: Redesign get_user with a __builtin_choose_expr hack
    x86: Be consistent with data size in getuser.S
    x86, mm: Use a bitfield to mask nuisance get_user() warnings
    x86/kvm: Fix compile warning in kvm_register_steal_time()
    x86-32: Add support for 64bit get_user()
    x86-32, mm: Remove reference to alloc_remap()
    x86-32, mm: Remove reference to resume_map_numa_kva()
    x86-32, mm: Rip out x86_32 NUMA remapping code
    x86/numa: Use __pa_nodebug() instead
    x86: Don't panic if can not alloc buffer for swiotlb
    mm: Add alloc_bootmem_low_pages_nopanic()
    x86, 64bit, mm: hibernate use generic mapping_init
    x86, 64bit, mm: Mark data/bss/brk to nx
    x86: Merge early kernel reserve for 32bit and 64bit
    x86: Add Crash kernel low reservation
    x86, kdump: Remove crashkernel range find limit for 64bit
    memblock: Add memblock_mem_size()
    x86, boot: Not need to check setup_header version for setup_data
    ...

    Linus Torvalds
     
  • Merge misc patches from Andrew Morton:

    - Florian has vanished so I appear to have become fbdev maintainer
    again :(

    - Joel and Mark are distracted, so welcome the new OCFS2 maintainer

    - The backlight queue

    - Small core kernel changes

    - lib/ updates

    - The rtc queue

    - Various random bits

    * akpm: (164 commits)
    rtc: rtc-davinci: use devm_*() functions
    rtc: rtc-max8997: use devm_request_threaded_irq()
    rtc: rtc-max8907: use devm_request_threaded_irq()
    rtc: rtc-da9052: use devm_request_threaded_irq()
    rtc: rtc-wm831x: use devm_request_threaded_irq()
    rtc: rtc-tps80031: use devm_request_threaded_irq()
    rtc: rtc-lp8788: use devm_request_threaded_irq()
    rtc: rtc-coh901331: use devm_clk_get()
    rtc: rtc-vt8500: use devm_*() functions
    rtc: rtc-tps6586x: use devm_request_threaded_irq()
    rtc: rtc-imxdi: use devm_clk_get()
    rtc: rtc-cmos: use dev_warn()/dev_dbg() instead of printk()/pr_debug()
    rtc: rtc-pcf8583: use dev_warn() instead of printk()
    rtc: rtc-sun4v: use pr_warn() instead of printk()
    rtc: rtc-vr41xx: use dev_info() instead of printk()
    rtc: rtc-rs5c313: use pr_err() instead of printk()
    rtc: rtc-at91rm9200: use dev_dbg()/dev_err() instead of printk()/pr_debug()
    rtc: rtc-rs5c372: use dev_dbg()/dev_warn() instead of printk()/pr_debug()
    rtc: rtc-ds2404: use dev_err() instead of printk()
    rtc: rtc-efi: use dev_err()/dev_warn()/pr_err() instead of printk()
    ...

    Linus Torvalds
     
  • Change the default of the XZ_DEC_* config symbols to match the
    configured architecture. It is perfectly legitimate to support multiple
    XZ BCJ filters for different architectures (e.g. to mount foreign
    squashfs/xz compressed filesystems); it is however more natural not to
    select them all by default, but only the one matching the configured
    architecture.

    Signed-off-by: Florian Fainelli
    Acked-by: Lasse Collin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Florian Fainelli
     
  • Remove the XZ_DEC_* dependency on CONFIG_EXPERT, as recommended by
    Lasse Collin.

    Signed-off-by: Florian Fainelli
    Acked-by: Lasse Collin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Florian Fainelli
     
  • Group all architecture-specific BCJ filter configuration symbols under an
    if XZ_BCJ / endif statement.

    Signed-off-by: Florian Fainelli
    Acked-by: Lasse Collin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Florian Fainelli
     
  • match_number() has return values of -ENOMEM, -EINVAL and -ERANGE, so
    the documented return values of the functions calling match_number()
    should include these. Fix up the comments to reflect the correct
    values.

    Signed-off-by: Namjae Jeon
    Signed-off-by: Amit Sahrawat
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namjae Jeon
     
  • Add the %pa format specifier for printing a phys_addr_t type and its
    derivative types (such as resource_size_t), since the physical address
    size on some platforms can vary based on build options, regardless of
    the size of the native integer type.
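
    A minimal usage sketch; like other %p extensions, %pa expects a pointer
    to the value:

    #include <linux/printk.h>
    #include <linux/types.h>

    static void report_region_sketch(phys_addr_t start, resource_size_t size)
    {
            pr_info("region: start %pa, size %pa\n", &start, &size);
    }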

    Signed-off-by: Stepan Moskovchenko
    Cc: Rob Landley
    Cc: George Spelvin
    Cc: Andy Shevchenko
    Cc: Stephen Boyd
    Cc: Andrei Emeltchenko
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stepan Moskovchenko
     
  • The dependency on CONFIG_EXPERT doesn't really make sense, and hides
    the option unintentionally. Remove the superfluous "default n" pointed
    out by Ingo as well.

    Signed-off-by: Kyle McMartin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kyle McMartin
     
  • Pull tty/serial patches from Greg Kroah-Hartman:
    "Here's the big tty/serial driver patches for 3.9-rc1.

    More tty port rework and fixes from Jiri here, as well as lots of
    individual serial driver updates and fixes.

    All of these have been in the linux-next tree for a while."

    * tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (140 commits)
    tty: mxser: improve error handling in mxser_probe() and mxser_module_init()
    serial: imx: fix uninitialized variable warning
    serial: tegra: assume CONFIG_OF
    TTY: do not update atime/mtime on read/write
    lguest: select CONFIG_TTY to build properly.
    ARM defconfigs: add missing inclusions of linux/platform_device.h
    fb/exynos: include platform_device.h
    ARM: sa1100/assabet: include platform_device.h directly
    serial: imx: Fix recursive locking bug
    pps: Fix build breakage from decoupling pps from tty
    tty: Remove ancient hardpps()
    pps: Additional cleanups in uart_handle_dcd_change
    pps: Move timestamp read into PPS code proper
    pps: Don't crash the machine when exiting will do
    pps: Fix a use-after free bug when unregistering a source.
    pps: Use pps_lookup_dev to reduce ldisc coupling
    pps: Add pps_lookup_dev() function
    tty: serial: uartlite: Support uartlite on big and little endian systems
    tty: serial: uartlite: Fix sparse and checkpatch warnings
    serial/arc-uart: Miscll DT related updates (Grant's review comments)
    ...

    Fix up trivial conflicts, mostly just due to the TTY config option
    clashing with the EXPERIMENTAL removal.

    Linus Torvalds
     
  • Pull driver core patches from Greg Kroah-Hartman:
    "Here is the big driver core merge for 3.9-rc1

    There are two major series here, both of which touch lots of drivers
    all over the kernel, and will cause you some merge conflicts:

    - add a new function called devm_ioremap_resource() to properly be
    able to check return values.

    - remove CONFIG_EXPERIMENTAL

    Other than those patches, there's not much here, some minor fixes and
    updates"

    Fix up trivial conflicts

    * tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
    base: memory: fix soft/hard_offline_page permissions
    drivercore: Fix ordering between deferred_probe and exiting initcalls
    backlight: fix class_find_device() arguments
    TTY: mark tty_get_device call with the proper const values
    driver-core: constify data for class_find_device()
    firmware: Ignore abort check when no user-helper is used
    firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
    firmware: Make user-mode helper optional
    firmware: Refactoring for splitting user-mode helper code
    Driver core: treat unregistered bus_types as having no devices
    watchdog: Convert to devm_ioremap_resource()
    thermal: Convert to devm_ioremap_resource()
    spi: Convert to devm_ioremap_resource()
    power: Convert to devm_ioremap_resource()
    mtd: Convert to devm_ioremap_resource()
    mmc: Convert to devm_ioremap_resource()
    mfd: Convert to devm_ioremap_resource()
    media: Convert to devm_ioremap_resource()
    iommu: Convert to devm_ioremap_resource()
    drm: Convert to devm_ioremap_resource()
    ...

    Linus Torvalds
     
  • Pull security subsystem updates from James Morris:
    "This is basically a maintenance update for the TPM driver and EVM/IMA"

    Fix up conflicts in lib/digsig.c and security/integrity/ima/ima_main.c

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (45 commits)
    tpm/ibmvtpm: build only when IBM pseries is configured
    ima: digital signature verification using asymmetric keys
    ima: rename hash calculation functions
    ima: use new crypto_shash API instead of old crypto_hash
    ima: add policy support for file system uuid
    evm: add file system uuid to EVM hmac
    tpm_tis: check pnp_acpi_device return code
    char/tpm/tpm_i2c_stm_st33: drop temporary variable for return value
    char/tpm/tpm_i2c_stm_st33: remove dead assignment in tpm_st33_i2c_probe
    char/tpm/tpm_i2c_stm_st33: Remove __devexit attribute
    char/tpm/tpm_i2c_stm_st33: Don't use memcpy for one byte assignment
    tpm_i2c_stm_st33: removed unused variables/code
    TPM: Wait for TPM_ACCESS tpmRegValidSts to go high at startup
    tpm: Fix cancellation of TPM commands (interrupt mode)
    tpm: Fix cancellation of TPM commands (polling mode)
    tpm: Store TPM vendor ID
    TPM: Work around buggy TPMs that block during continue self test
    tpm_i2c_stm_st33: fix oops when i2c client is unavailable
    char/tpm: Use struct dev_pm_ops for power management
    TPM: STMicroelectronics ST33 I2C BUILD STUFF
    ...

    Linus Torvalds
     

20 Feb, 2013

1 commit

  • Pull RCU changes from Ingo Molnar:
    "SRCU changes:

    - These include debugging aids, updates that move towards the goal of
    permitting srcu_read_lock() and srcu_read_unlock() to be used from
    idle and offline CPUs, and a few small fixes.

    Changes to rcutorture and to RCU documentation:

    - Posted to LKML at https://lkml.org/lkml/2013/1/26/188

    Enhancements to uniprocessor handling in tiny RCU:

    - Posted to LKML at https://lkml.org/lkml/2013/1/27/2

    Tag RCU callbacks with grace-period number to simplify callback
    advancement:

    - Posted to LKML at https://lkml.org/lkml/2013/1/26/203

    Miscellaneous fixes:

    - Posted to LKML at https://lkml.org/lkml/2013/1/26/204"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()
    srcu: Update synchronize_srcu_expedited()'s comments
    srcu: Update synchronize_srcu()'s comments
    srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
    srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
    srcu: Simple cleanup for cleanup_srcu_struct()
    srcu: Add might_sleep() annotation to synchronize_srcu()
    srcu: Simplify __srcu_read_unlock() via this_cpu_dec()
    rcu: Allow rcutorture to be built at low optimization levels
    rcu: Make rcutorture's shuffler task shuffle recently added tasks
    rcu: Allow TREE_PREEMPT_RCU on UP systems
    rcu: Provide RCU CPU stall warnings for tiny RCU
    context_tracking: Add comments on interface and internals
    rcu: Remove obsolete Kconfig option from comment
    rcu: Remove unused code originally used for context tracking
    rcu: Consolidate debugging Kconfig options
    rcu: Correct 'optimized' to 'optimize' in header comment
    rcu: Trace callback acceleration
    rcu: Tag callback lists with corresponding grace-period number
    rcutorture: Don't compare ptr with 0
    ...

    Linus Torvalds
     

19 Feb, 2013

3 commits

  • We (Linux Kernel Performance project) found a regression
    introduced by commit:

    5a505085f043 mm/rmap: Convert the struct anon_vma::mutex to an rwsem

    which converted all anon_vma::mutex locks to rwsem write locks.

    The semantics are the same, but the behavioral difference is
    quite huge in some cases. After investigating it we found the
    root cause: mutexes support lock stealing while rwsems don't.

    Here is the link for the detailed regression report:

    https://lkml.org/lkml/2013/1/29/84

    Ingo suggested adding write lock stealing to rwsems:

    "I think we should allow lock-steal between rwsem writers - that
    will not hurt fairness as most rwsem fairness concerns relate to
    reader vs. writer fairness"

    And here is the rwsem-spinlock version.

    With this patch, we got a double performance increase in one test box
    with the following aim7 workfile:

    FILESIZE: 1M
    POOLSIZE: 10M
    10 fork_test

    /usr/bin/time output w/o patch             /usr/bin/time output with patch
    -----------------------------------        -----------------------------------
    Percent of CPU this job got: 369%          Percent of CPU this job got: 537%
    Voluntary context switches: 640595016      Voluntary context switches: 157915561

    We got a 45% increase in CPU usage and saved about 3/4 of the voluntary
    context switches.

    Reported-by: LKP project
    Suggested-by: Ingo Molnar
    Signed-off-by: Yuanhan Liu
    Cc: Alex Shi
    Cc: David Howells
    Cc: Michel Lespinasse
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Anton Blanchard
    Cc: Arjan van de Ven
    Cc: paul.gortmaker@windriver.com
    Link: http://lkml.kernel.org/r/1359716356-23865-1-git-send-email-yuanhan.liu@linux.intel.com
    Signed-off-by: Ingo Molnar

    Yuanhan Liu
     
  • To make the lockdep selftest work on RT we need to convert the
    spinlock tests to a raw spinlock; otherwise we cannot run the irq
    context checks. For mainline this is merely an annotation change, as
    spinlocks are mapped to raw_spinlocks anyway.

    Signed-off-by: Yong Zhang
    Link: http://lkml.kernel.org/r/1334559716-18447-2-git-send-email-yong.zhang0@gmail.com
    Signed-off-by: Thomas Gleixner

    Yong Zhang
     
  • Commit 5a505085f043 ("mm/rmap: Convert the struct anon_vma::mutex
    to an rwsem") changed struct anon_vma::mutex to an rwsem, which
    caused aim7 fork_test performance to drop by 50%.

    Yuanhan Liu did the following excellent analysis:

    https://lkml.org/lkml/2013/1/29/84

    and found that the regression is caused by strict, serialized,
    FIFO sequential write-ownership of rwsems. Ingo suggested
    implementing opportunistic lock-stealing for the front writer
    task in the waitqueue.

    Yuanhan Liu implemented lock-stealing for spinlock-rwsems,
    which indeed recovered much of the regression - confirming
    the analysis that the main factor in the regression was the
    FIFO writer-fairness of rwsems.

    In this patch we allow lock-stealing to happen when the first
    waiter is also a writer. With that change in place the
    aim7 fork_test performance is fully recovered on my
    Intel NHM EP, NHM EX, SNB EP 2S and 4S test machines.
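
    Purely as an illustration of the stealing condition (hypothetical names,
    not the kernel's rwsem code): a writer takes the lock only if the count
    shows it is currently free, instead of waiting for a strict FIFO
    hand-off to the writer at the head of the wait queue.

    #include <linux/atomic.h>

    static bool writer_steal_sketch(atomic_long_t *count_sketch)
    {
            long unlocked = 0;      /* 0 == unlocked in this sketch */

            /* succeed only if nobody holds the lock right now */
            return atomic_long_cmpxchg(count_sketch, unlocked, -1) == unlocked;
    }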

    Reported-by: lkp@linux.intel.com
    Reported-by: Yuanhan Liu
    Signed-off-by: Alex Shi
    Cc: David Howells
    Cc: Michel Lespinasse
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Anton Blanchard
    Cc: Arjan van de Ven
    Cc: paul.gortmaker@windriver.com
    Link: https://lkml.org/lkml/2013/1/29/84
    Link: http://lkml.kernel.org/r/1360069915-31619-1-git-send-email-alex.shi@intel.com
    [ Small stylistic fixes, updated changelog. ]
    Signed-off-by: Ingo Molnar

    Alex Shi
     

05 Feb, 2013

1 commit

  • …/linux-rcu into core/rcu

    Pull RCU updates from Paul E. McKenney:

    1. Changes to rcutorture and to RCU documentation. Posted to LKML at
    https://lkml.org/lkml/2013/1/26/188.

    2. Enhancements to uniprocessor handling in tiny RCU. Posted to LKML
    at https://lkml.org/lkml/2013/1/27/2.

    3. Tag RCU callbacks with grace-period number to simplify callback
    advancement. Posted to LKML at https://lkml.org/lkml/2013/1/26/203.

    4. Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/1/26/204.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

01 Feb, 2013

3 commits


30 Jan, 2013

1 commit

  • Normal boot path on a system with IOMMU support: the swiotlb buffer is
    allocated early at first, and then we try to initialize the IOMMU; if
    the Intel or AMD IOMMU can be set up properly, the swiotlb buffer is
    freed.

    The early allocation is done with bootmem and could panic when we try
    to use kdump with the buffer above 4G only, or with memmap used to
    limit memory to under 4G, for example memmap=4095M$1M to remove memory
    under 4G.

    According to Eric, add a _nopanic version and no_iotlb_memory to fail
    map_single later if swiotlb is still needed.

    -v2: don't pass nopanic, and use -ENOMEM return value according to Eric.
    panic early instead of using swiotlb_full to panic... according to Eric/Konrad.
    -v3: make swiotlb_init nopanic, but this will affect:
    arm64, ia64, powerpc, tile, unicore32, x86.
    -v4: clean up swiotlb_init by removing swiotlb_init_with_default_size.

    Suggested-by: Eric W. Biederman
    Signed-off-by: Yinghai Lu
    Link: http://lkml.kernel.org/r/1359058816-7615-36-git-send-email-yinghai@kernel.org
    Reviewed-and-tested-by: Konrad Rzeszutek Wilk
    Cc: Joerg Roedel
    Cc: Ralf Baechle
    Cc: Jeremy Fitzhardinge
    Cc: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Arnd Bergmann
    Cc: Andrzej Pietrasiewicz
    Cc: linux-mips@linux-mips.org
    Cc: xen-devel@lists.xensource.com
    Cc: virtualization@lists.linux-foundation.org
    Cc: Shuah Khan
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     

29 Jan, 2013

2 commits

  • …' and 'tiny.2013.01.29b' into HEAD

    doctorture.2013.01.11a: Changes to rcutorture and to RCU documentation.

    fixes.2013.01.26a: Miscellaneous fixes.

    tagcb.2013.01.24a: Tag RCU callbacks with grace-period number to
    simplify callback advancement.

    tiny.2013.01.29b: Enhancements to uniprocessor handling in tiny RCU.

    Paul E. McKenney
     
  • Tiny RCU has historically omitted RCU CPU stall warnings in order to
    reduce memory requirements; however, the lack of these warnings caused
    Thomas Gleixner some debugging pain recently. Therefore, this commit
    adds RCU CPU stall warnings to tiny RCU if RCU_TRACE=y. This keeps
    the memory footprint small, while still enabling CPU stall warnings
    in kernels built to enable them.

    Updated to include Josh Triplett's suggested use of RCU_STALL_COMMON
    config variable to simplify #if expressions.

    Reported-by: Thomas Gleixner
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney