Eric Lee / smarc-fsl-linux-kernel

25 May, 2011

40 commits

4440673a9 leds: provide helper to register "leds-gpio" devices ...

This function makes a deep copy of the platform data to allow it to live
in init memory. For a kernel that supports several machines and so
includes the definition for several leds-gpio devices this saves quite
some memory because all but one definition can be free'd after boot.

As the function is used by arch code it must be builtin and so cannot go
into leds-gpio.c.
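
As a rough userspace illustration of the deep copy described above (not the
kernel helper itself), the sketch below duplicates a descriptor and the array
it points to, so the original can be thrown away afterwards. The names
led_desc, led_pdata, led_pdata_dup and led_pdata_free are hypothetical
stand-ins, and malloc() stands in for the kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a per-LED entry and the platform data. */
struct led_desc {
    const char *name;
    unsigned int gpio;
};

struct led_pdata {
    int num_leds;
    const struct led_desc *leds;
};

/*
 * Deep copy: duplicate the outer struct and the LED array it points to.
 * The name strings are shared, not copied. Returns NULL on failure.
 */
struct led_pdata *led_pdata_dup(const struct led_pdata *src)
{
    size_t bytes = (size_t)src->num_leds * sizeof(struct led_desc);
    struct led_pdata *copy = malloc(sizeof(*copy));
    struct led_desc *leds;

    if (!copy)
        return NULL;
    leds = malloc(bytes ? bytes : 1);
    if (!leds) {
        free(copy);
        return NULL;
    }
    if (bytes)
        memcpy(leds, src->leds, bytes);
    copy->num_leds = src->num_leds;
    copy->leds = leds;
    return copy;
}

void led_pdata_free(struct led_pdata *p)
{
    if (p) {
        free((void *)p->leds);
        free(p);
    }
}
```

Once the copy exists, the source definition no longer has to outlive boot,
which is what lets it live in init memory.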

[akpm@linux-foundation.org: s/CONFIG_LED_REGISTER_GPIO/CONFIG_LEDS_REGISTER_GPIO/]
Signed-off-by: Uwe Kleine-König
Cc: Russell King
Acked-by: Richard Purdie
Cc: Fabio Estevam
Cc: Sascha Hauer
Tested-by: H Hartley Sweeten
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Uwe Kleine-König
2011-05-25 23:39:51 +0800
9b2da53f7 drivers/leds/leds-lm3530.c: add regulator ...

Add regulator support to the lm3530 driver. The lm3530 driver needs to
get the proper regulator during device probe and enable it before accessing
the device. It also disables the regulator when brightness ==
LED_OFF, and puts it back during driver removal.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Shreshtha Kumar Sahu
Cc: Lee Jones
Cc: Shreshtha Kumar Sahu
Cc: Richard Purdie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shreshtha Kumar Sahu
2011-05-25 23:39:51 +0800
402f75881 leds: remove the leds-h1940 driver ...

The H1940 machine now uses leds-gpio and leds-h1940 has no users anymore.

Signed-off-by: Vasily Khoruzhick
Cc: "Arnaud Patard (Rtp)"
Cc: Ben Dooks
Cc: Richard Purdie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasily Khoruzhick
2011-05-25 23:39:51 +0800
3dbf622c1 drivers/leds/leds-pca9532.c: add support pca9530, pca9531 and pca9533 ...

The members of the pca953x family differ only in the number of leds and the
register layout. Add chipinfo so the driver can be used with the whole
pca953x family. Rename the driver to pca953x, but leave the files and
platform flags named pca9532.

Tested with pca9530 and pca9533

Tested-by: Juergen Kilb
Signed-off-by: Jan Weitzel
Acked-by: Joachim Eastwood
Tested-by: Joachim Eastwood
Cc: Wolfram Sang
Cc: H Hartley Sweeten
Cc: Richard Purdie
Cc: Grant Likely
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Weitzel
2011-05-25 23:39:50 +0800
3c1ab50d0 drivers/leds/leds-pca9532.c: add gpio capability ...

Allow unused leds on pca9532 to be used as gpio. The board I am working
on now has no less than 6 pca9532 chips. One chip is used for only leds,
one has 14 leds and 2 gpio and the rest of the chips are gpio only.

There is also one board in mainline which could use this capability;
arch/arm/mach-iop32x/n2100.c
{ .type = PCA9532_TYPE_NONE }, /* power OFF gpio */
{ .type = PCA9532_TYPE_NONE }, /* reset gpio */

This patch defines a new pin type, PCA9532_TYPE_GPIO, and registers a
gpiochip if any pin has this type set. The gpiochip registers all chip
pins but filters on gpio_request.
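
The "register everything, filter on request" idea can be sketched in plain C
as below. This is only an illustration: chip_pdata, the pin_type enum and
both functions are hypothetical names, and the choice of -EBUSY for a pin
that is not marked as gpio is an assumption, not something the text above
states.

```c
#include <errno.h>
#include <stddef.h>

#define CHIP_NUM_PINS 16

/* Hypothetical pin types, loosely modeled on PCA9532_TYPE_*. */
enum pin_type { PIN_TYPE_NONE, PIN_TYPE_LED, PIN_TYPE_GPIO };

struct chip_pdata {
    enum pin_type type[CHIP_NUM_PINS];
};

/* A gpiochip is only worth registering if at least one pin is a gpio. */
int chip_wants_gpiochip(const struct chip_pdata *pd)
{
    size_t i;

    for (i = 0; i < CHIP_NUM_PINS; i++)
        if (pd->type[i] == PIN_TYPE_GPIO)
            return 1;
    return 0;
}

/* All pins are exposed, but a request succeeds only for gpio pins. */
int chip_gpio_request(const struct chip_pdata *pd, unsigned int offset)
{
    if (offset >= CHIP_NUM_PINS)
        return -EINVAL;
    return pd->type[offset] == PIN_TYPE_GPIO ? 0 : -EBUSY;
}
```

Filtering at request time keeps the gpio numbering identical to the chip's
pin numbering, at the cost of exposing offsets that can never be claimed.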

[randy.dunlap@oracle.com: fix build when GPIOLIB is not enabled]
Signed-off-by: Joachim Eastwood
Reviewed-by: Wolfram Sang
Reviewed-by: H Hartley Sweeten
Cc: Richard Purdie
Cc: Grant Likely
Signed-off-by: Randy Dunlap
Cc: Jan Weitzel
Cc: Juergen Kilb
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joachim Eastwood
2011-05-25 23:39:50 +0800
fff26f814 leds: support automatic start of blinking with ledtrig-timer ...

By setting initial values blink_delay_on and blink_delay_off in a
led_classdev struct, this change starts the blinking when the led is
initialized.

With this patch, you can initialize blink_delay_on and blink_delay_off in
led_classdev with default_trigger set to "timer", and the led will start
up blinking. The current ledtrig-timer implementation ignores any initial
blink_delay_on/blink_delay_off settings, and requires setting
blink_delay_on/blink_delay_off (typically from userspace) before the led
blinks.

Signed-off-by: Esben Haabendal
Cc: Richard Purdie
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Esben Haabendal
2011-05-25 23:39:49 +0800
5ff77428e MAINTAINERS: orphan DMFE, move Tobias Ringstrom to CREDITS ...

Tobias's email bounces and he hasn't submitted or acked a patch in git
history.

Signed-off-by: Joe Perches
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2011-05-25 23:39:49 +0800
8007798f5 MAINTAINERS: remove stale reference to Chris Wright's LSM tree ...

This tree hasn't been updated since June 2008.

Signed-off-by: Lucian Adrian Grijincu
Acked-by: Chris Wright
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lucian Adrian Grijincu
2011-05-25 23:39:48 +0800
162a7e750 printk: allocate kernel log buffer earlier ...

On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
the static log buffer overflows before the larger one specified by the
log_buf_len param is allocated. Minimize the overflow by allocating the
new log buffer as soon as possible.

On kernels without memblock, a later call to setup_log_buf from
kernel/init.c is the fallback.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
Signed-off-by: Mike Travis
Cc: Yinghai Lu
Cc: "H. Peter Anvin"
Cc: Jack Steiner
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Travis
2011-05-25 23:39:48 +0800
95dde5019 memblock: add error return when CONFIG_HAVE_MEMBLOCK is not set ...

On larger systems, information in the kernel log is lost because there is
so much early text printed, that it overflows the static log buffer before
the log_buf_len kernel parameter can be processed, and a bigger log buffer
allocated.

Distros are reluctant to increase memory usage by increasing the size of
the static log buffer, so minimize the problem by allocating the new log
buffer as early as possible.

This patch:

Add an error return if CONFIG_HAVE_MEMBLOCK is not set instead of having
to add #ifdef CONFIG_HAVE_MEMBLOCK around blocks of code calling that
function.

Signed-off-by: Mike Travis
Cc: Yinghai Lu
Cc: "H. Peter Anvin"
Cc: Jack Steiner
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yinghai Lu
2011-05-25 23:39:48 +0800
d9be9b90d lib/vsprintf.c: fix interaction of kasprintf() and vsnprintf() when using %pV ...

Otherwise, the warning at the top of vsnprintf() gets triggered by
kvasprintf()'s first invocation (with NULL buffer and zero size) of
vsnprintf().
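
For context, the two-pass pattern mentioned above (a first vsnprintf() call
with a NULL buffer and zero size to learn the length, then a second call
into an allocated buffer) looks roughly like this in userspace C. The
functions xvasprintf and xasprintf are hypothetical names for this sketch;
it does not reproduce the kernel's kvasprintf() or the %pV handling.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string; returns NULL on failure. */
char *xvasprintf(const char *fmt, va_list ap)
{
    va_list aq;
    int len;
    char *buf;

    /* First pass: NULL buffer and zero size, only the length is wanted. */
    va_copy(aq, ap);
    len = vsnprintf(NULL, 0, fmt, aq);
    va_end(aq);
    if (len < 0)
        return NULL;

    buf = malloc((size_t)len + 1);
    if (!buf)
        return NULL;

    /* Second pass: the buffer is now known to be large enough. */
    vsnprintf(buf, (size_t)len + 1, fmt, ap);
    return buf;
}

char *xasprintf(const char *fmt, ...)
{
    va_list ap;
    char *p;

    va_start(ap, fmt);
    p = xvasprintf(fmt, ap);
    va_end(ap);
    return p;
}
```

Any formatter reached from the first pass has to tolerate the NULL buffer
and zero size, which is the situation the warning was tripping over.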

Signed-off-by: Jan Beulich
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2011-05-25 23:39:47 +0800
746a2a838 sparse: Undef __compiletime_{warning,error} if __CHECKER__ is defined ...

sparse can't parse the warning and error attributes, so they should be
hidden from sparse.

Signed-off-by: KOSAKI Motohiro
Cc: Arjan van de Ven
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-05-25 23:39:47 +0800
5bd7e6a30 sparse: define __must_be_array() for __CHECKER__ ...

commit c5e631cf65f ("ARRAY_SIZE: check for type") added __must_be_array().
But sparse can't parse this gcc extension.

As a result, make C=2 now produces a lot of sparse errors like this one:

kernel/futex.c:2699:25: error: No right hand side of '+'-expression

This is because __must_be_array() is used by the ARRAY_SIZE() macro, which
is used very widely.

This patch fixes it.
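
A minimal sketch of the idea, assuming GCC or Clang for the builtins. The
macro names SAME_TYPE, BUILD_BUG_ON_ZERO and MUST_BE_ARRAY are illustrative
stand-ins written for this example and are not copied from the kernel
headers. Under __CHECKER__ the array check collapses to a plain 0 that
sparse can parse.

```c
#include <stddef.h>

#ifdef __CHECKER__
/* sparse cannot parse the builtins below, so the check becomes a no-op. */
#define MUST_BE_ARRAY(a) 0
#else
/* 1 when both expressions have compatible types (GCC/Clang builtin). */
#define SAME_TYPE(a, b) \
    __builtin_types_compatible_p(__typeof__(a), __typeof__(b))
/* Evaluates to 0, or fails to compile (negative array size) if e is true. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
/* An array never has the same type as a pointer to its first element. */
#define MUST_BE_ARRAY(a) BUILD_BUG_ON_ZERO(SAME_TYPE((a), &(a)[0]))
#endif

/* Element count of a true array; passing a pointer breaks the build. */
#define ARRAY_SIZE(arr) \
    (sizeof(arr) / sizeof((arr)[0]) + MUST_BE_ARRAY(arr))
```

With a real array the extra term is 0, so ARRAY_SIZE() is unchanged; with a
pointer argument the compiler rejects the negative array size.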

Signed-off-by: KOSAKI Motohiro
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-05-25 23:39:46 +0800
903c0c7cd sparse: define dummy BUILD_BUG_ON definition for sparse ...

BUILD_BUG_ON() causes a syntax error to detect coding errors. So it
causes sparse to detect an error too. This reduces sparse's usefulness.

This patch makes a dummy BUILD_BUG_ON() definition for sparse.

Signed-off-by: KOSAKI Motohiro
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-05-25 23:39:46 +0800
d2b463135 init/calibrate.c: fix for critical bogoMIPS intermittent calculation failure ...

A fix to the TSC (Time Stamp Counter) based bogoMIPS calculation used on
secondary CPUs which has two faults:

1: Not handling wrapping of the lower 32 bits of the TSC counter on
32bit kernel - perhaps TSC is not reset by a warm reset?

2: TSC and jiffies are not incrementing together properly. Either
jiffies increment too quickly, or the Time Stamp Counter isn't incremented
during an SMI but the real time clock is and jiffies are
incremented.

Case 1 can result in a factor of 16 too large a value which makes udelay()
values too small and can cause mysterious driver errors. Case 2 appears
to give smaller 10-15% errors after averaging but enough to cause
occasional failures on my own board.
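
The two cases can be illustrated with a much simplified userspace sketch.
It is not the calibrate_delay_direct() code: the sample struct, the
estimate_rate() name and the rule "drop anything more than 1/8 below the
largest sample" are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* One measurement: 32-bit counter value before and after the interval. */
struct sample {
    uint32_t start;
    uint32_t end;
};

/*
 * Average the per-interval counter deltas, skipping
 *  - samples where the 32-bit counter wrapped (start >= end), and
 *  - samples far below the largest delta (counter stalled, e.g. by an SMI).
 * Returns 0 if no usable sample remains.
 */
uint32_t estimate_rate(const struct sample *s, size_t n)
{
    uint32_t max = 0;
    uint64_t sum = 0;
    size_t i, used = 0;

    for (i = 0; i < n; i++) {
        if (s[i].start >= s[i].end)
            continue;                   /* Case 1: counter wrapped */
        if (s[i].end - s[i].start > max)
            max = s[i].end - s[i].start;
    }
    for (i = 0; i < n; i++) {
        uint32_t rate;

        if (s[i].start >= s[i].end)
            continue;
        rate = s[i].end - s[i].start;
        if (rate < max - max / 8)
            continue;                   /* Case 2: implausibly low delta */
        sum += rate;
        used++;
    }
    return used ? (uint32_t)(sum / used) : 0;
}
```

The point of both filters is the same: a single bad sample should be
discarded rather than averaged into the result.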

I have tested this code on my own branch and attach patch suitable for
current kernel code. See below for examples of the failures and how the
fix handles these situations now.

I reported this issue earlier here:
Intermittent problem with BogoMIPs calculation on Intel AP CPUs -
http://marc.info/?l=linux-kernel&m=129947246316875&w=4

I suspect this issue has been seen by others but as it is intermittent and
bogoMIPS for secondary CPUs are no longer printed out it might have been
difficult to identify this as the cause. Perhaps these unresolved issues,
although quite old, might be relevant as possibly this fault has been
around for a while. In particular Case 1 may only be relevant to 32bit
kernels on newer HW (most people run 64bit kernels?). Case 2 is less
dramatic since the earlier fix in this area and also intermittent.

Re: bogomips discrepancy on Intel Core2 Quad CPU -
http://marc.info/?l=linux-kernel&m=118929277524298&w=4
slow system and bogus bogomips -
http://marc.info/?l=linux-kernel&m=116791286716107&w=4
Re: Re: [RFC-PATCH] clocksource: update lpj if clocksource has -
http://marc.info/?l=linux-kernel&m=128952775819467&w=4

This issue is masked a little by commit feae3203d711db0a ("timers, init:
Limit the number of per cpu calibration bootup messages") which only
prints out the first bogoMIPS value making it much harder to notice other
values differing. Perhaps it should be changed to only suppress them when
they are similar values?

Here are some outputs showing faults occurring and the new code handling
them properly. See my earlier message for examples of the original
failure.

Case 1: A Time Stamp Counter wrap:
...
Calibrating delay loop (skipped), value calculated using timer
frequency.. 6332.70 BogoMIPS (lpj=31663540)
....
calibrate_delay_direct() timer_rate_max=31666493
timer_rate_min=31666151 pre_start=4170369255 pre_end=4202035539
calibrate_delay_direct() timer_rate_max=2425955274
timer_rate_min=2425954941 pre_start=4265368533 pre_end=2396356387
calibrate_delay_direct() ignoring timer_rate as we had a TSC wrap
around start=4265368581 >=post_end=2396356511
calibrate_delay_direct() timer_rate_max=31666274
timer_rate_min=31665942 pre_start=2440373374 pre_end=2472039515
calibrate_delay_direct() timer_rate_max=31666492
timer_rate_min=31666160 pre_start=2535372139 pre_end=2567038422
calibrate_delay_direct() timer_rate_max=31666455
timer_rate_min=31666207 pre_start=2630371084 pre_end=2662037415
Calibrating delay using timer specific routine.. 6333.28 BogoMIPS (lpj=31666428)
Total of 2 processors activated (12665.99 BogoMIPS).
....

Case 2: Something (presumably the SMM interrupt?) causing a
very low increase in the TSC counter for the DELAY_CALIBRATION_TICKS
increase in jiffies
...
Calibrating delay loop (skipped), value calculated using timer
frequency.. 6333.25 BogoMIPS (lpj=31666270)
...
calibrate_delay_direct() timer_rate_max=31666483
timer_rate_min=31666074 pre_start=4199536526 pre_end=4231202809
calibrate_delay_direct() timer_rate_max=864348 timer_rate_min=864016
pre_start=2405343672 pre_end=2406207897
calibrate_delay_direct() timer_rate_max=31666483
timer_rate_min=31666179 pre_start=2469540464 pre_end=2501206823
calibrate_delay_direct() timer_rate_max=31666511
timer_rate_min=31666122 pre_start=2564539400 pre_end=2596205712
calibrate_delay_direct() timer_rate_max=31666084
timer_rate_min=31665685 pre_start=2659538782 pre_end=2691204657
calibrate_delay_direct() dropping min bogoMips estimate 1 = 864348
Calibrating delay using timer specific routine.. 6333.27 BogoMIPS (lpj=31666390)
Total of 2 processors activated (12666.53 BogoMIPS).
...

After 70 boots I saw 2 variations
Reviewed-by: Phil Carmody
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Worsley
2011-05-25 23:39:46 +0800
1dbe39424 xattr.h: expose string defines to userspace ...

af4f136056c9 ("security: move LSM xattrnames to xattr.h") moved the
XATTR_CAPS_SUFFIX define from capability.h to xattr.h. This makes sense
except it was previously exported to userspace but xattr.h does not export
it to userspace. This patch exports these headers to userspace to fix the
ABI regression.

There is some slight possibility that this will cause problems in other
applications which used these #defines differently (wrongly) and I could
JUST export the capabilities xattr name that we broke. Does anyone have an
idea how exposing these headers could cause a problem?

Below is what is being exposed to userspace, included here since it isn't
clear exactly what is going to be made available from the patch.

/* Namespaces */
#define XATTR_OS2_PREFIX "os2."
#define XATTR_OS2_PREFIX_LEN (sizeof (XATTR_OS2_PREFIX) - 1)

#define XATTR_SECURITY_PREFIX "security."
#define XATTR_SECURITY_PREFIX_LEN (sizeof (XATTR_SECURITY_PREFIX) - 1)

#define XATTR_SYSTEM_PREFIX "system."
#define XATTR_SYSTEM_PREFIX_LEN (sizeof (XATTR_SYSTEM_PREFIX) - 1)

#define XATTR_TRUSTED_PREFIX "trusted."
#define XATTR_TRUSTED_PREFIX_LEN (sizeof (XATTR_TRUSTED_PREFIX) - 1)

#define XATTR_USER_PREFIX "user."
#define XATTR_USER_PREFIX_LEN (sizeof (XATTR_USER_PREFIX) - 1)

/* Security namespace */
#define XATTR_SELINUX_SUFFIX "selinux"
#define XATTR_NAME_SELINUX XATTR_SECURITY_PREFIX XATTR_SELINUX_SUFFIX

#define XATTR_SMACK_SUFFIX "SMACK64"
#define XATTR_SMACK_IPIN "SMACK64IPIN"
#define XATTR_SMACK_IPOUT "SMACK64IPOUT"
#define XATTR_NAME_SMACK XATTR_SECURITY_PREFIX XATTR_SMACK_SUFFIX
#define XATTR_NAME_SMACKIPIN XATTR_SECURITY_PREFIX XATTR_SMACK_IPIN
#define XATTR_NAME_SMACKIPOUT XATTR_SECURITY_PREFIX XATTR_SMACK_IPOUT

#define XATTR_CAPS_SUFFIX "capability"
#define XATTR_NAME_CAPS XATTR_SECURITY_PREFIX XATTR_CAPS_SUFFIX
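
To show how userspace might consume these defines, here is a small sketch
that repeats four of them locally so it builds without the kernel header.
The helper security_xattr_suffix() is hypothetical and exists only for this
example. The full names rely on C's adjacent string literal concatenation.

```c
#include <string.h>

/* Repeated from the list above so the sketch is self-contained. */
#define XATTR_SECURITY_PREFIX "security."
#define XATTR_SECURITY_PREFIX_LEN (sizeof (XATTR_SECURITY_PREFIX) - 1)
#define XATTR_CAPS_SUFFIX "capability"
#define XATTR_NAME_CAPS XATTR_SECURITY_PREFIX XATTR_CAPS_SUFFIX

/*
 * Hypothetical helper: return the part after "security." if the name is
 * in the security namespace, or NULL otherwise.
 */
const char *security_xattr_suffix(const char *name)
{
    if (strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN) != 0)
        return NULL;
    return name + XATTR_SECURITY_PREFIX_LEN;
}
```

Because the _LEN macros use sizeof on the literal, the prefix length is a
compile-time constant that excludes the terminating NUL.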

Reported-by: Ozan Çaglayan
Signed-off-by: Eric Paris
Cc: Mimi Zohar
Cc: Serge Hallyn
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Paris
2011-05-25 23:39:45 +0800
4b060420a bitmap, irq: add smp_affinity_list interface to /proc/irq ...

Manually adjusting the smp_affinity for IRQs becomes unwieldy when the
cpu count is large.

Setting smp affinity to cpus 256 to 263 would be:

echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity

instead of:

echo 256-263 > smp_affinity_list

Think about what it looks like for cpus around say, 4088 to 4095.

We already have many alternate "list" interfaces:

/sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
/sys/devices/system/cpu/cpuX/topology/thread_siblings_list
/sys/devices/system/cpu/cpuX/topology/core_siblings_list
/sys/devices/system/node/nodeX/cpulist
/sys/devices/pci***/***/local_cpulist

Add a companion interface, smp_affinity_list to use cpu lists instead of
cpu maps. This conforms to other companion interfaces where both a map
and a list interface exists.

This required adding a bitmap_parselist_user() function in a manner
similar to the bitmap_parse_user() function.
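
The relationship between the list form and the map form can be sketched in
userspace C as follows. This is not bitmap_parselist_user(); the function
names, the 4096-cpu limit and the error handling are assumptions for the
example. The output format follows the map convention of comma-separated
32-bit hex groups, most significant group first.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS  4096                 /* assumed upper bound */
#define CPU_WORDS (MAX_CPUS / 32)

/* Parse a list such as "0,2-3,256-263" into a bitmap. 0 on success. */
int parse_cpu_list(const char *list, uint32_t mask[CPU_WORDS])
{
    const char *p = list;

    memset(mask, 0, CPU_WORDS * sizeof(mask[0]));
    while (*p) {
        char *end;
        unsigned long lo, hi, cpu;

        lo = strtoul(p, &end, 10);
        if (end == p)
            return -1;
        hi = lo;
        p = end;
        if (*p == '-') {                /* range "lo-hi" */
            p++;
            hi = strtoul(p, &end, 10);
            if (end == p)
                return -1;
            p = end;
        }
        if (lo > hi || hi >= MAX_CPUS)
            return -1;
        for (cpu = lo; cpu <= hi; cpu++)
            mask[cpu / 32] |= (uint32_t)1 << (cpu % 32);
        if (*p == ',')
            p++;
        else if (*p == '\n' && p[1] == '\0')
            break;                      /* tolerate echo's newline */
        else if (*p != '\0')
            return -1;
    }
    return 0;
}

/* Print the bitmap as hex groups, highest non-zero group first. */
void format_cpu_mask(const uint32_t mask[CPU_WORDS], char *out, size_t outlen)
{
    int top = CPU_WORDS - 1, i;
    size_t used = 0;

    while (top > 0 && mask[top] == 0)
        top--;
    for (i = top; i >= 0 && used < outlen; i--)
        used += (size_t)snprintf(out + used, outlen - used, "%08x%s",
                                 (unsigned int)mask[i], i ? "," : "");
}
```

Each group covers 32 cpus, so cpus 256 to 263 fall in the ninth group
counted from the right, which is why the map form grows so quickly.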

[akpm@linux-foundation.org: make __bitmap_parselist() static]
Signed-off-by: Mike Travis
Cc: Thomas Gleixner
Cc: Jack Steiner
Cc: Lee Schermerhorn
Cc: Andy Shevchenko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Travis
2011-05-25 23:39:45 +0800
e50c1f609 fscache: remove dead code under CONFIG_WORKQUEUE_DEBUGFS ...

There is no CONFIG_WORKQUEUE_DEBUGFS any more, so this code is dead.

Signed-off-by: WANG Cong
Cc: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Amerigo Wang
2011-05-25 23:39:44 +0800
dbee8a0af x86: remove 32-bit versions of readq()/writeq() ...

The presence of a writeq() implementation on 32-bit x86 that splits the
64-bit write into two 32-bit writes turns out to break the mpt2sas driver
(and in general is risky for drivers as was discussed in
). To fix this,
revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too") and
follow-on cleanups.

This unfortunately leads to pushing non-atomic definitions of readq() and
writeq() to various x86-only drivers that in the meantime started using the
definitions in the x86 version of . However as discussed
exhaustively, this is actually the right thing to do, because the right
way to split a 64-bit transaction is hardware dependent and therefore
belongs in the hardware driver (eg mpt2sas needs a spinlock to make sure
no other accesses occur in between the two halves of the access).
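
A driver-local, non-atomic writeq() along the lines described above might
look like the following userspace sketch. Everything here is an assumption
for illustration: the drv_writeq() name, the low-half-first ordering (which
is exactly the hardware dependent part), and the C11 atomic_flag used as a
stand-in for a driver spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for a per-device spinlock. */
static atomic_flag reg_lock = ATOMIC_FLAG_INIT;

static void reg_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&reg_lock, memory_order_acquire))
        ;                               /* spin until the flag is free */
}

static void reg_lock_release(void)
{
    atomic_flag_clear_explicit(&reg_lock, memory_order_release);
}

/*
 * Write a 64-bit value as two 32-bit stores, low half first, holding the
 * lock so that no other register access can land between the halves.
 */
void drv_writeq(uint64_t val, volatile uint32_t *reg)
{
    reg_lock_acquire();
    reg[0] = (uint32_t)val;
    reg[1] = (uint32_t)(val >> 32);
    reg_lock_release();
}
```

A device that latches on the high half, or needs the halves in the other
order, would use a different helper, which is the argument for keeping the
split in the driver rather than in a generic header.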

Build tested on 32- and 64-bit x86 allmodconfig.

Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
Acked-by: Hitoshi Mitake
Cc: Kashyap Desai
Cc: Len Brown
Cc: Ravi Anand
Cc: Vikas Chaudhary
Cc: Matthew Garrett
Cc: Jason Uhlenkott
Acked-by: James Bottomley
Acked-by: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Signed-off-by: Roland Dreier
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roland Dreier
2011-05-25 23:39:44 +0800
818b667ba Remove unused PROC_CHANGE_PENALTY constant ...

This constant hasn't been used since before the git era (2.6.12) and thus
can be dropped.

Signed-off-by: Stephen Boyd
Cc: Russell King
Cc: Richard Weinberger
Cc: Hirokazu Takata
Cc: Kyle McMartin
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Matt Turner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stephen Boyd
2011-05-25 23:39:43 +0800
21b7d815b include/linux/c2port.h: remove wrong and never used macros ...

The macro to_class_dev() uses the deprecated structure class_device, and
the c2port_device has no member named class in the definition of the macro
to_c2port_device.

Signed-off-by: Wanlong Gao
Cc: Rodolfo Giometti
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wanlong Gao
2011-05-25 23:39:43 +0800
0ac1ee0bf ulimit: raise default hard ulimit on number of files to 4096 ...

Apps are increasingly using more than 1024 file descriptors. See
discussion in several distro bug trackers, e.g. BugLink:
http://bugs.launchpad.net/bugs/663090
https://issues.rpath.com/browse/RPL-2054

You don't want to raise the default soft limit, since that might break
apps that use select(), but it's safe to raise the default hard limit;
that way, apps that know they need lots of file descriptors can raise
their soft limit without needing root, and without user intervention.
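
An app that knows it needs many descriptors could raise its own soft limit
roughly like this, using the POSIX getrlimit()/setrlimit() calls. The helper
name raise_nofile_soft_limit() is hypothetical; the soft limit is raised to
the requested value or to the hard limit, whichever is lower.

```c
#include <sys/resource.h>

/*
 * Raise the soft RLIMIT_NOFILE towards `want`, capped at the hard limit.
 * Never lowers the limit. Stores the resulting soft limit in *result.
 * Returns 0 on success, -1 on error.
 */
int raise_nofile_soft_limit(rlim_t want, rlim_t *result)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && want > rl.rlim_max)
        want = rl.rlim_max;             /* cannot exceed the hard limit */
    if (rl.rlim_cur != RLIM_INFINITY && want > rl.rlim_cur) {
        rl.rlim_cur = want;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }
    *result = rl.rlim_cur;
    return 0;
}
```

Raising the soft limit up to the hard limit needs no privilege, so the hard
limit is the value that decides what such a call can achieve.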

Ubuntu is doing this with a kernel change because they have a policy of
not changing kernel defaults in userland.

While 4096 might not be enough for *all* apps, it seems to be plenty for
the apps I've seen lately that are unhappy with 1024.

Signed-off-by: Tim Gardner
Cc: Dan Kegel
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tim Gardner
2011-05-25 23:39:43 +0800
db271cf03 um: fix crash while os_dump_core() ...

os_dump_core() emits SIGTERM to terminate all UML processes. Kernel
threads have to exit on SIGTERM instead of calling last_ditch_exit().
Multiple calls to last_ditch_exit() can cause a crash.

Signed-off-by: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2011-05-25 23:39:42 +0800
607647ab0 um: include linux/prefetch.h ...

Fix build failures on UML.

Signed-off-by: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2011-05-25 23:39:42 +0800
3ef6130ab um: print info about fatal segfaults ...

Print a short info about fatal segfaults like other archs do.

Signed-off-by: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2011-05-25 23:39:41 +0800
4ff4d8d34 um: add ucast ethernet transport ...

The ucast transport is similar to the mcast transport (and, in fact,
shares most of its code), only it uses UDP unicast to move packets.

Obviously this is only useful for point-to-point connections between
virtual ethernet devices.

Signed-off-by: Nolan Leake
Signed-off-by: Richard Weinberger
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nolan Leake
2011-05-25 23:39:41 +0800
d634f194d um: add earlyprintk support ...

User Mode Linux can also benefit from earlyprintk. UML's earlyprintk
writes kernel messages directly to stdout.

Signed-off-by: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2011-05-25 23:39:41 +0800
2525e70d4 um: remove SIGHUP handler ...

The UML kernel ignores SIGHUP anyway. This handler is in vain.

Signed-off-by: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2011-05-25 23:39:40 +0800
0ce451acb um: fix UML_LIB_PATH ...

UML_LIB_PATH is hardcoded to /usr/lib/uml/, but on 64bit systems
UML_LIB_PATH needs to be /usr/lib64/uml/.

Signed-off-by: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Weinberger
2011-05-25 23:39:40 +0800
8aebe21e0 cris: convert old cpumask API into new one ...

Adapt to the new API.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro
Cc: Mikael Starvik
Cc: Jesper Nilsson
Cc: Thiago Farina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-05-25 23:39:39 +0800
8ea9716fd mn10300: convert old cpumask API into new one ...

Adapt to the new API.

We plan to remove old cpumask APIs later. Thus this patch converts them
into the new one.

Signed-off-by: KOSAKI Motohiro
Cc: David Howells
Cc: Koichi Yasutake
Cc: Hugh Dickins
Cc: Chris Metcalf
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-05-25 23:39:39 +0800
81ee42baa alpha: hook up gpiolib support ...

Allow people to use gpiolib on Alpha if they want to, mostly for build
coverage. The header is a straight copy of that for Microblaze, which in
turn was taken from PowerPC.

[akpm@linux-foundation.org: define GENERIC_GPIO]
Signed-off-by: Mark Brown
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Matt Turner
Acked-by: Grant Likely
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mark Brown
2011-05-25 23:39:38 +0800
81740fc6b alpha: replace with new cpumask APIs ...

We plan to remove cpu_xx() old APIs. Thus convert them. This patch has
no functional change.

Signed-off-by: KOSAKI Motohiro
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Matt Turner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2011-05-25 23:39:38 +0800
f67d9b157 nommu: add page alignment to mmap ...

Currently on nommu arches mmap(), mremap() and munmap() don't do
page_align(), which isn't consistent with mmu arches and causes some issues.

First, some drivers' mmap() functions depend on vma->vm_end - vma->vm_start
being page aligned, which is true on mmu arches but not on nommu, e.g. the
uvc camera driver.

Second, munmap() may return an -EINVAL[split file] error in cases where end
is not page aligned (passed in from userspace) but vma->vm_end is aligned
due to a split or the driver's mmap() ops.

Add page alignment to fix those issues.
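
The effect of the alignment on the munmap() case can be sketched as below.
The DEMO_ macros and unmap_end_matches() are hypothetical, and a 4096 byte
page is assumed; the real check lives in the nommu mm code and is not
reproduced here.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL           /* assumed page size */
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))
/* Round x up to the next page boundary. */
#define DEMO_PAGE_ALIGN(x) (((x) + DEMO_PAGE_SIZE - 1) & DEMO_PAGE_MASK)

/*
 * Does an unmap request [start, start + len) end exactly at vm_end?
 * With `align` set, len is rounded up to a page first, as the patch does.
 */
int unmap_end_matches(unsigned long vm_end, unsigned long start,
                      size_t len, int align)
{
    unsigned long end = start + (align ? DEMO_PAGE_ALIGN(len) : len);

    return end == vm_end;
}
```

A 5000 byte request against a two page mapping only lines up with the page
aligned vm_end once the length has been rounded up.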

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Bob Liu
Cc: David Howells
Cc: Paul Mundt
Cc: Greg Ungerer
Cc: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bob Liu
2011-05-25 23:39:38 +0800
eb709b0d0 mm: batch activate_page() to reduce lock contention ...

The zone->lru_lock is heavily contended in workloads where activate_page()
is frequently used. We could batch activate_page() to reduce the lock
contention. The batched pages will be added to the zone list when the pool
is full or page reclaim is trying to drain them.
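
The batching idea can be shown with a small single-threaded model. It is
only a sketch: the page and batch structs, the batch size of 14 and the
lock_acquisitions counter (standing in for taking zone->lru_lock) are all
assumptions made for this example.

```c
#include <stddef.h>

#define BATCH_SIZE 14                   /* assumed pool size */

struct page {
    int active;
};

struct batch {
    size_t nr;
    struct page *pages[BATCH_SIZE];
};

/* Counts how often the (simulated) lru lock would have been taken. */
static unsigned long lock_acquisitions;

/* Move every queued page to the active state under one lock round-trip. */
void batch_drain(struct batch *b)
{
    size_t i;

    if (b->nr == 0)
        return;
    lock_acquisitions++;                /* lock taken once per batch */
    for (i = 0; i < b->nr; i++)
        b->pages[i]->active = 1;
    b->nr = 0;
}

/* Queue a page; the pool is drained automatically when it fills up. */
void batch_activate(struct batch *b, struct page *p)
{
    b->pages[b->nr++] = p;
    if (b->nr == BATCH_SIZE)
        batch_drain(b);
}
```

Activating 28 pages this way costs two lock acquisitions instead of 28; the
price is that a queued page is not active until its batch is drained.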

For example, on a 4 socket 64 CPU system, create a sparse file and 64
processes that share a mapping of the file. Each process reads the
whole file and then exits. The process exit will do unmap_vmas() and cause
a lot of activate_page() calls. In such a workload, we saw about a 58% total
time reduction with the patch below. Other workloads with a lot of
activate_page() calls also benefit a lot.

Andrew Morton suggested activate_page() and putback_lru_pages() should
follow the same path to active pages, but this is hard to implement (see
commit 7a608572a282a ("Revert "mm: batch activate_page() to reduce lock
contention")). On the other hand, do we really need putback_lru_pages()
to follow the same path? I tested several FIO/FFSB benchmarks (about 20
scripts for each benchmark) on 3 machines here, from 2 sockets to 4
sockets. My tests don't show anything significant with/without the patch
below (there is a slight difference, but it is mostly noise which we found
even without the patch below). The patch below basically returns to the
same as my first post.

I tested some microbenchmarks:
case-anon-cow-rand-mt 0.58%
case-anon-cow-rand -3.30%
case-anon-cow-seq-mt -0.51%
case-anon-cow-seq -5.68%
case-anon-r-rand-mt 0.23%
case-anon-r-rand 0.81%
case-anon-r-seq-mt -0.71%
case-anon-r-seq -1.99%
case-anon-rx-rand-mt 2.11%
case-anon-rx-seq-mt 3.46%
case-anon-w-rand-mt -0.03%
case-anon-w-rand -0.50%
case-anon-w-seq-mt -1.08%
case-anon-w-seq -0.12%
case-anon-wx-rand-mt -5.02%
case-anon-wx-seq-mt -1.43%
case-fork 1.65%
case-fork-sleep -0.07%
case-fork-withmem 1.39%
case-hugetlb -0.59%
case-lru-file-mmap-read-mt -0.54%
case-lru-file-mmap-read 0.61%
case-lru-file-mmap-read-rand -2.24%
case-lru-file-readonce -0.64%
case-lru-file-readtwice -11.69%
case-lru-memcg -1.35%
case-mmap-pread-rand-mt 1.88%
case-mmap-pread-rand -15.26%
case-mmap-pread-seq-mt 0.89%
case-mmap-pread-seq -69.72%
case-mmap-xread-rand-mt 0.71%
case-mmap-xread-seq-mt 0.38%

The most significant are:
case-lru-file-readtwice -11.69%
case-mmap-pread-rand -15.26%
case-mmap-pread-seq -69.72%

which use activate_page() a lot. The others are basically noise because
each run differs slightly.

In UP case, 'size mm/swap.o'
before the two patches:
text data bss dec hex filename
6466 896 4 7366 1cc6 mm/swap.o
after the two patches:
text data bss dec hex filename
6343 896 4 7243 1c4b mm/swap.o

Signed-off-by: Shaohua Li
Cc: KOSAKI Motohiro
Cc: Hiroyuki Kamezawa
Cc: Andi Kleen
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shaohua Li
2011-05-25 23:39:37 +0800
f68aa5b44 asm-generic/cacheflush.h: flush icache when copying to user pages ...

The copy_to_user_page() function is supposed to flush the icache on the
memory that was written, but the current asm-generic version lacks that
logic. While normally it isn't a big deal as the asm-generic version of
icache flushing is a stub, it is a deal for ports that want to use the
asm-generic version as a baseline and then overlay its own specific parts
(like icache flushing).
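
In spirit, the change makes the generic copy helper do "copy, then flush".
The userspace sketch below uses simplified, assumed signatures (the real
macros take more arguments), and the flush function is a counting stub so
that the call is observable; an arch would overlay its own version.

```c
#include <stddef.h>
#include <string.h>

/* Counts flush calls so the effect of the macro can be observed. */
static unsigned long icache_flushes;

/* Generic fallback: a stub here, replaced by ports that need a flush. */
static void flush_icache_user_range(void *addr, size_t len)
{
    (void)addr;
    (void)len;
    icache_flushes++;
}

/* Copy into the user page, then flush the icache for what was written. */
#define copy_to_user_page(dst, src, len)            \
    do {                                            \
        memcpy((dst), (src), (len));                \
        flush_icache_user_range((dst), (len));      \
    } while (0)
```

With a stub flush this costs nothing, while a port that defines a real
flush gets it called without having to redefine the copy macro.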

Signed-off-by: Mike Frysinger
Cc: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Frysinger
2011-05-25 23:39:37 +0800
cfa54a0fc mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath() ...

I believe I found a problem in __alloc_pages_slowpath, which allows a
process to get stuck endlessly looping, even when lots of memory is
available.

Running an I/O and memory intensive stress-test I see a 0-order page
allocation with __GFP_IO and __GFP_WAIT, running on a system with very
little free memory. Right about the same time that the stress-test gets
killed by the OOM-killer, the utility trying to allocate memory gets stuck
in __alloc_pages_slowpath even though most of the system's memory was freed
by the oom-kill of the stress-test.

The utility ends up looping from the rebalance label down through the
wait_iff_congested continuously. Because order=0,
__alloc_pages_direct_compact skips the call to get_page_from_freelist.
Because all of the reclaimable memory on the system has already been
reclaimed, __alloc_pages_direct_reclaim skips the call to
get_page_from_freelist. Since there is no __GFP_FS flag, the block with
__alloc_pages_may_oom is skipped. The loop hits the wait_iff_congested,
then jumps back to rebalance without ever trying to
get_page_from_freelist. This loop repeats infinitely.

The test case is pretty pathological. Running a mix of I/O stress-tests
that do a lot of fork() and consume all of the system memory, I can pretty
reliably hit this on 600 nodes, in about 12 hours. 32GB/node.

Signed-off-by: Andrew Barry
Signed-off-by: Minchan Kim
Reviewed-by: Rik van Riel
Acked-by: Mel Gorman
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Barry
2011-05-25 23:39:36 +0800
a539f3533 mm: add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro ...

Add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macros, which align a given
pfn to the upper and lower section boundary respectively.
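
A standalone sketch of what such macros compute, assuming 32768 pages per
section (a PFN_SECTION_SHIFT of 15, chosen only for this example; the real
value is architecture dependent). The sections_spanned() helper is
hypothetical and just shows the two macros working together.

```c
/* Assumed for illustration: 2^15 pages per memory section. */
#define PFN_SECTION_SHIFT 15
#define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)
#define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))

/* Round a pfn up or down to a section boundary. */
#define SECTION_ALIGN_UP(pfn) \
    (((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
#define SECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SECTION_MASK)

/* Number of whole sections needed to cover [start_pfn, end_pfn). */
unsigned long sections_spanned(unsigned long start_pfn, unsigned long end_pfn)
{
    return (SECTION_ALIGN_UP(end_pfn) - SECTION_ALIGN_DOWN(start_pfn))
           / PAGES_PER_SECTION;
}
```

Aligning the start down and the end up gives the smallest section aligned
range that still contains the original pfn range.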

Required for the latest memory hotplug support for the Xen balloon driver.

Signed-off-by: Daniel Kiper
Reviewed-by: Konrad Rzeszutek Wilk
David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel Kiper
2011-05-25 23:39:36 +0800
a2c8990ae memsw: remove noswapaccount kernel parameter ...

The noswapaccount parameter has been deprecated since 2.6.38 without any
complaints from users so we can remove it. swapaccount=0|1 can be used
instead.

As we are removing the parameter we can also clean up swapaccount because
it doesn't have to accept an empty string anymore (to match noswapaccount)
and so we can push "=" into the __setup macro rather than checking the "=1"
and "=0" strings.
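
The simplification can be pictured with this userspace model. The
parse_boot_param() function is a hypothetical stand-in for the __setup()
dispatch, and the default value of really_do_swap_account is an assumption;
the point is only that the handler sees "0" or "1" once the "=" is part of
the registered key.

```c
#include <string.h>

/* Assumed default for this sketch. */
static int really_do_swap_account = 1;

/* Handler: receives only the text after "swapaccount=". */
int enable_swap_account(const char *s)
{
    if (!strcmp(s, "1"))
        really_do_swap_account = 1;
    else if (!strcmp(s, "0"))
        really_do_swap_account = 0;
    return 1;
}

/* Hypothetical stand-in for __setup(): match the key, pass on the rest. */
int parse_boot_param(const char *param)
{
    static const char key[] = "swapaccount=";

    if (strncmp(param, key, sizeof(key) - 1) != 0)
        return 0;                       /* not our parameter */
    return enable_swap_account(param + sizeof(key) - 1);
}
```

With "=" in the key, a bare "swapaccount" or the removed "noswapaccount" no
longer reaches the handler at all.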

Signed-off-by: Michal Hocko
Cc: Hiroyuki Kamezawa
Cc: Daisuke Nishimura
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2011-05-25 23:39:36 +0800
5b52fc890 proc: allocate storage for numa_maps statistics once ...

In show_numa_map() we collect statistics into a numa_maps structure.
Since the number of NUMA nodes can be very large, this structure is not a
candidate for stack allocation.

Instead of going thru a kmalloc()+kfree() cycle each time show_numa_map()
is invoked, perform the allocation just once when /proc/pid/numa_maps is
opened.

Performing the allocation when numa_maps is opened, and thus before a
reference to the target task's mm is taken, eliminates a potential
stalemate condition in the oom-killer as originally described by Hugh
Dickins:

... imagine what happens if the system is out of memory, and the mm
we're looking at is selected for killing by the OOM killer: while
we wait in __get_free_page for more memory, no memory is freed
from the selected mm because it cannot reach exit_mmap while we hold
that reference.

Signed-off-by: Stephen Wilson
Reviewed-by: KOSAKI Motohiro
Cc: Hugh Dickins
Cc: David Rientjes
Cc: Lee Schermerhorn
Cc: Alexey Dobriyan
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stephen Wilson
2011-05-25 23:39:35 +0800