Eric Lee / smarc-fsl-linux-kernel

30 Sep, 2010

1 commit

77f890223 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: fix interrupt clearing for mv_xor
missing inline keyword for static function in linux/dmaengine.h
dma/shdma: move dereference below the NULL check

Linus Torvalds
2010-09-30 09:41:19 +0800

29 Sep, 2010

1 commit

a2724f28d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
tcp: Fix >4GB writes on 64-bit.
net/9p: Mount only matching virtio channels
de2104x: fix ethtool
tproxy: check for transparent flag in ip_route_newports
ipv6: add IPv6 to neighbour table overflow warning
tcp: fix TSO FACK loss marking in tcp_mark_head_lost
3c59x: fix regression from patch "Add ethtool WOL support"
ipv6: add a missing unregister_pernet_subsys call
s390: use free_netdev(netdev) instead of kfree()
sgiseeq: use free_netdev(netdev) instead of kfree()
rionet: use free_netdev(netdev) instead of kfree()
ibm_newemac: use free_netdev(netdev) instead of kfree()
smsc911x: Add MODULE_ALIAS()
net: reset skb queue mapping when rx'ing over tunnel
br2684: fix scheduling while atomic
de2104x: fix TP link detection
de2104x: fix power management
de2104x: disable autonegotiation on broken hardware
net: fix a lockdep splat
e1000e: 82579 do not gate auto config of PHY by hardware during nominal use
...

Linus Torvalds
2010-09-29 03:01:26 +0800

28 Sep, 2010

3 commits

01db403cf tcp: Fix >4GB writes on 64-bit.

Fixes kernel bugzilla #16603

tcp_sendmsg() truncates iov_len to an 'int', which causes a 4GB write to
write zero bytes, for example.

There is also the problem higher up of how verify_iovec() works. It
wants to prevent the total length from looking like an error return
value.

However it does this using 'int', but syscalls return 'long' (and
thus signed 64-bit on 64-bit machines). So it could trigger
false-positives on 64-bit as written. So fix it to use 'long'.
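
The truncation described above can be reproduced in user space. This is a minimal sketch, not the kernel code: `len_as_int` and `len_as_long` are hypothetical helpers that only model what happens when a 64-bit length is squeezed into a 32-bit integer versus kept at full width.

```c
#include <stdint.h>

/* Hypothetical model of the old behaviour: the 64-bit length is
 * narrowed to 32 bits, so only the low 32 bits survive. */
static int32_t len_as_int(uint64_t iov_len)
{
    return (int32_t)(uint32_t)iov_len;
}

/* Hypothetical model of the fixed behaviour: the length keeps its
 * full 64-bit width, as a 'long' does on 64-bit Linux. */
static int64_t len_as_long(uint64_t iov_len)
{
    return (int64_t)iov_len;
}
```

With these helpers, a length of exactly 4GB (`1ULL << 32`) narrows to 0, which matches the "write zero bytes" symptom, while the wide variant preserves the value.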

Reported-by: Olaf Bonorden
Reported-by: Daniel Büse
Reported-by: Andrew Morton
Signed-off-by: David S. Miller

David S. Miller
2010-09-28 11:24:54 +0800
fb0c5f0bc tproxy: check for transparent flag in ip_route_newports

as done in ip_route_connect()

Signed-off-by: Ulrich Weber
Signed-off-by: David S. Miller

Ulrich Weber
2010-09-28 06:03:33 +0800
6a6aa2b7e Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Fix rounding-bug in __unmap_single
x86/amd-iommu: Work around S3 BIOS bug
x86/amd-iommu: Set iommu configuration flags in enable-loop
x86, setup: Fix earlyprintk=serial,0x3f8,115200
x86, setup: Fix earlyprintk=serial,ttyS0,115200

Linus Torvalds
2010-09-28 03:22:21 +0800

27 Sep, 2010

2 commits

2cc6d2bf3 ipv6: add a missing unregister_pernet_subsys call

Clean up a missing exit path in the ipv6 module init routines. In
addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
for the ipv6_addr_label_ops structure. But if module loading fails, or if the
ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
which leaves a now-bogus address on the pernet_list, leading to oopses in
subsequent registrations. This patch cleans up both the failed load path and
the unload path. Tested by myself with good results.
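
The shape of the fix (undo a registration on both the failed-load path and the unload path) can be sketched as follows. This is a user-space illustration under stated assumptions: `register_label_ops`, `unregister_label_ops`, `init_later_step` and the `registered_count` counter are hypothetical stand-ins, not the functions touched by the patch.

```c
#include <stdbool.h>

/* Hypothetical counter standing in for entries on the pernet list. */
static int registered_count;

static int register_label_ops(void)    { registered_count++; return 0; }
static void unregister_label_ops(void) { registered_count--; }

/* Hypothetical later init step that can be forced to fail. */
static int init_later_step(bool fail)  { return fail ? -1 : 0; }

/* Init path: if a later step fails, unwind the earlier registration. */
static int module_init_sketch(bool fail_later)
{
    int err;

    err = register_label_ops();
    if (err)
        return err;

    err = init_later_step(fail_later);
    if (err)
        goto out_unregister;

    return 0;

out_unregister:
    unregister_label_ops();
    return err;
}

/* Unload path: mirror the registration done at init time. */
static void module_exit_sketch(void)
{
    unregister_label_ops();
}
```

After a failed `module_init_sketch(true)` the counter is back at zero, and a successful init followed by `module_exit_sketch()` also leaves nothing registered.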

Signed-off-by: Neil Horman

include/net/addrconf.h | 1 +
net/ipv6/addrconf.c | 11 ++++++++---
net/ipv6/addrlabel.c | 5 +++++
3 files changed, 14 insertions(+), 3 deletions(-)
Signed-off-by: David S. Miller

Neil Horman
2010-09-27 10:09:25 +0800
693019e90 net: reset skb queue mapping when rx'ing over tunnel

Reset queue mapping when an skb is reentering the stack via a tunnel.
On second pass, the queue mapping from the original device is no
longer valid.

Signed-off-by: Tom Herbert
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Tom Herbert
2010-09-27 09:48:40 +0800

24 Sep, 2010

1 commit

7329cf020 Merge branch 'amd-iommu/2.6.36' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent

Ingo Molnar
2010-09-24 17:19:53 +0800

23 Sep, 2010

4 commits

4c894f47b x86/amd-iommu: Work around S3 BIOS bug

This patch adds a workaround for an IOMMU BIOS problem to
the AMD IOMMU driver. The result of the bug is that the
IOMMU does not execute commands anymore when the system
comes out of the S3 state resulting in system failure. The
bug in the BIOS is that it does not restore certain hardware
specific registers correctly. This workaround reads out the
contents of these registers at boot time and restores them
on resume from S3. The workaround is limited to the specific
IOMMU chipset where this problem occurs.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel

Joerg Roedel
2010-09-23 22:26:03 +0800
710224fa2 arm: fix "arm: fix pci_set_consistent_dma_mask for dmabounce devices"

This fixes the regression caused by the commit 6fee48cd330c68
("dma-mapping: arm: use generic pci_set_dma_mask and
pci_set_consistent_dma_mask").

ARM needs to clip the dma coherent mask for dmabounce devices. This
restores the old trick.

Note that strictly speaking, the DMA API doesn't allow architectures to do
this, but I'm not sure it's worth adding a new API to set the dma mask
that allows architectures to clip it.

Reported-by: Krzysztof Halasa
Signed-off-by: FUJITA Tomonori
Acked-by: Russell King
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

FUJITA Tomonori
2010-09-23 08:22:38 +0800
d3f3cf859 missing inline keyword for static function in linux/dmaengine.h

Add a missing inline keyword for static function in linux/dmaengine.h to
avoid duplicate symbol definitions.

Signed-off-by: Mathieu Lacage
Signed-off-by: Dan Williams

Mathieu Lacage
2010-09-23 06:29:32 +0800
56b49f4b8 net: Move "struct net" declaration inside the __KERNEL__ macro guard

This patch reduces namespace pollution by moving the "struct net" declaration
out of the userspace-facing portion of linux/netlink.h. It has no impact on
the kernel.

(This came up because we have several C++ applications which use "net" as a
namespace name.)

Signed-off-by: Ollie Wild
Signed-off-by: David S. Miller

Ollie Wild
2010-09-23 04:21:05 +0800

22 Sep, 2010

1 commit

8b15575ca fs: {lock,unlock}_flocks() stubs to prepare for BKL removal

The lock structs are currently protected by the BKL, but are accessed by
code in fs/locks.c and misc file system and DLM code. These stubs will
allow all users to switch to the new interface before the implementation
is changed to a spinlock.

Acked-by: Arnd Bergmann
Signed-off-by: Sage Weil
Signed-off-by: Linus Torvalds

Sage Weil
2010-09-22 08:27:44 +0800

21 Sep, 2010

1 commit

8444cf712 xfrm: Allow different selector family in temporary state

The family parameter of xfrm_state_find is used to find a state matching a
certain policy. This value is set to the template's family
(encap_family) right before xfrm_state_find is called.
The family parameter is however also used to construct a temporary state
in xfrm_state_find itself which is wrong for inter-family scenarios
because it produces a selector for the wrong family. Since this selector
is included in the xfrm_user_acquire structure, user space programs
misinterpret IPv6 addresses as IPv4 and vice versa.
This patch splits the original init_tempsel function into one part that
initializes the selector and another that initializes the props and id of the
temporary state, to allow for differing IP address families within the state.

Signed-off-by: Thomas Egerer
Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Thomas Egerer
2010-09-21 02:11:38 +0800

20 Sep, 2010

1 commit

7d7dee96e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
netpoll: Disable IRQ around RCU dereference in netpoll_rx
sctp: Do not reset the packet during sctp_packet_config().
net/llc: storing negative error codes in unsigned short
MAINTAINERS: move atlx discussions to netdev
drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
drivers/net/eql.c: prevent reading uninitialized stack memory
drivers/net/usb/hso.c: prevent reading uninitialized memory
xfrm: dont assume rcu_read_lock in xfrm_output_one()
r8169: Handle rxfifo errors on 8168 chips
3c59x: Remove atomic context inside vortex_{set|get}_wol
tcp: Prevent overzealous packetization by SWS logic.
net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
phylib: fix PAL state machine restart on resume
net: use rcu_barrier() in rollback_registered_many
bonding: correctly process non-linear skbs
ipv4: enable getsockopt() for IP_NODEFRAG
ipv4: force_igmp_version ignored when a IGMPv3 query received
ppp: potential NULL dereference in ppp_mp_explode()
net/llc: make opt unsigned in llc_ui_setsockopt()
...

Linus Torvalds
2010-09-20 02:05:50 +0800

18 Sep, 2010

1 commit

f0f9deae9 netpoll: Disable IRQ around RCU dereference in netpoll_rx

We cannot use rcu_dereference_bh safely in netpoll_rx as we may
be called with IRQs disabled. We could however simply disable
IRQs as that too causes BH to be disabled and is safe in either
case.

Thanks to John Linville for discovering this bug and providing
a patch.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2010-09-18 07:55:03 +0800

17 Sep, 2010

2 commits

94ca9d669 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: add documentation

Linus Torvalds
2010-09-17 03:50:31 +0800
2c35cd019 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: only warn on mipmap size checks in r600 cs checker (v2)
drm/radeon/kms: force legacy pll algo for RV620 LVDS
drm: fix race between driver loading and userspace open.
drm: Use a nondestructive mode for output detect when polling (v2)
drm/radeon/kms: fix the colorbuffer CS checker for r300-r500
drm/radeon/kms: increase lockup detection interval to 10 sec for r100-r500
drm/radeon/kms/evergreen: fix backend setup
drm: Use a nondestructive mode for output detect when polling
drm/radeon: add some missing copyright headers
drm: Only decouple the old_fb from the crtc is we call mode_set*
drm/radeon/kms: don't enable underscan with interlaced modes
drm/radeon/kms: add connector table for Mac x800
drm/radeon/kms: fix regression in RMX code (v2)
drm: Fix regression in disable polling e58f637

Linus Torvalds
2010-09-17 03:48:58 +0800

16 Sep, 2010

1 commit

01f83d698 tcp: Prevent overzealous packetization by SWS logic.

If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised
window, the SWS logic will packetize to half the MSS unnecessarily.

This causes problems with some embedded devices.

However for large MSS devices we do want to half-MSS packetize
otherwise we never get enough packets into the pipe for things
like fast retransmit and recovery to work.

Be careful also to handle the case where MSS > window, otherwise
we'll never send until the probe timer.
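
The three cases above (tiny window, large window, MSS larger than the window) can be sketched as a single bounding function. This is an illustration under stated assumptions, not the patch itself: the function name `bound_to_half_wnd` and the 512-byte threshold separating "tiny" from "large" windows are chosen here for the example.

```c
/* Assumed threshold (in bytes) below which a peer window counts as tiny. */
#define SMALL_WINDOW_THRESHOLD 512

/* Return the packet size to use, given the largest window the peer
 * has advertised (max_window) and the size we would like to send. */
static int bound_to_half_wnd(int max_window, int pktsize)
{
    int cutoff;

    if (max_window >= SMALL_WINDOW_THRESHOLD)
        cutoff = max_window / 2;  /* large window: packetize to half of it */
    else
        cutoff = max_window;      /* tiny window: only cap at the window */

    /* The cap also covers MSS > window, so something can still be sent. */
    if (cutoff > 0 && pktsize > cutoff)
        return cutoff;
    return pktsize;
}
```

With a 100-byte window and a 75-byte MSS the packet size stays at 75 instead of being halved, while a 2000-byte window still bounds a 1460-byte segment to 1000 bytes.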

Reported-by: ツ Leandro Melo de Sales
Signed-off-by: David S. Miller

Alexey Kuznetsov
2010-09-16 03:01:44 +0800

15 Sep, 2010

3 commits

9c03f1622 Merge ssh://master.kernel.org/home/hpa/tree/sec

* ssh://master.kernel.org/home/hpa/tree/sec:
x86-64, compat: Retruncate rax after ia32 syscall entry tracing
x86-64, compat: Test %rax for the syscall number, not %eax
compat: Make compat_alloc_user_space() incorporate the access_ok()

Linus Torvalds
2010-09-15 08:07:51 +0800
de8d4f5d7 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
statfs() gives ESTALE error
NFS: Fix a typo in nfs_sockaddr_match_ipaddr6
sunrpc: increase MAX_HASHTABLE_BITS to 14
gss:spkm3 miss returning error to caller when import security context
gss:krb5 miss returning error to caller when import security context
Remove incorrect do_vfs_lock message
SUNRPC: cleanup state-machine ordering
SUNRPC: Fix a race in rpc_info_open
SUNRPC: Fix race corrupting rpc upcall
Fix null dereference in call_allocate

Linus Torvalds
2010-09-15 08:04:48 +0800
c41d68a51 compat: Make compat_alloc_user_space() incorporate the access_ok()

compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area. A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures. This should be
followed by checking the return value of compat_alloc_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.
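
The wrapper pattern described above (a generic function that sanity-checks the length, calls the per-architecture helper, then verifies the returned range) might look like this in a user-space model. Everything here is hypothetical: `USER_LIMIT`, `USER_STACK`, `arch_alloc_sketch`, `range_ok` and `alloc_user_space_sketch` are illustrative stand-ins, and addresses are plain integers rather than user pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: "user space" is the address range [0, USER_LIMIT). */
#define USER_LIMIT 0x10000u
#define USER_STACK 0x0f000u  /* hypothetical current user stack pointer */

/* Stand-in for the per-architecture helper: carve len bytes below the stack. */
static uintptr_t arch_alloc_sketch(size_t len)
{
    return USER_STACK - len;
}

/* Stand-in for access_ok(): is [addr, addr + len) inside user space? */
static int range_ok(uintptr_t addr, size_t len)
{
    return addr < USER_LIMIT && len <= USER_LIMIT - addr;
}

/* Generic wrapper: check the length, allocate, verify.
 * Returns 0 (this model's NULL) on failure. */
static uintptr_t alloc_user_space_sketch(size_t len)
{
    uintptr_t addr;

    if (len >= USER_STACK)  /* sanity check on the length */
        return 0;

    addr = arch_alloc_sketch(len);
    if (!range_ok(addr, len))
        return 0;
    return addr;
}
```

Callers in this model only need to test the result against 0, which is the direction the message suggests for the real callers once they check for NULL.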

Reported-by: Ben Hawkes
Signed-off-by: H. Peter Anvin
Acked-by: Benjamin Herrenschmidt
Acked-by: Chris Metcalf
Acked-by: David S. Miller
Acked-by: Ingo Molnar
Acked-by: Thomas Gleixner
Acked-by: Tony Luck
Cc: Andrew Morton
Cc: Arnd Bergmann
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Heiko Carstens
Cc: Helge Deller
Cc: James Bottomley
Cc: Kyle McMartin
Cc: Martin Schwidefsky
Cc: Paul Mackerras
Cc: Ralf Baechle
Cc:

H. Peter Anvin
2010-09-15 07:08:45 +0800

14 Sep, 2010

3 commits

930a9e283 drm: Use a nondestructive mode for output detect when polling (v2)

v2: Julien Cristau pointed out that @nondestructive results in
double-negatives and confusion when trying to interpret the parameter,
so use @force instead. Much easier to type as well. ;-)

And fix the miscompilation of vmgfx reported by Sedat Dilek.

Signed-off-by: Chris Wilson
Cc: stable@kernel.org
Signed-off-by: Dave Airlie

Chris Wilson
2010-09-14 18:38:48 +0800
2bb3a259d Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
dquot: do full inode dirty in allocating space

Linus Torvalds
2010-09-14 03:46:09 +0800
6142811a3 Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6

* 'next-spi' of git://git.secretlab.ca/git/linux-2.6:
spi/pl022: move probe call to subsys_initcall()
powerpc/5200: mpc52xx_uart.c: Add of_node_put to avoid memory leak
spi/pl022: fix APB pclk power regression on U300
spi/spi_s3c64xx: Warn if PIO transfers time out
spi/s3c64xx: Fix incorrect reuse of 'val' local variable.
spi/s3c64xx: Fix compilation warning
spi/dw_spi: clean the cs_control code
spi/dw_spi: Allow interrupt sharing
spi/spi_s3c64xx: Increase dead reckoning time in wait_for_xfer()
spi/spi_s3c64xx: Move to subsys_initcall()
spi: free children in spi_unregister_master, not siblings
gpiolib: Add 'struct gpio_chip' forward declaration for !GPIOLIB case
of: Fix missing includes - ll_temac
spi/spi_s3c64xx: Staticise non-exported functions
spi/spi_s3c64xx: Make probe more robust against missing board config

Linus Torvalds
2010-09-14 03:45:50 +0800

13 Sep, 2010

3 commits

7b334fcb4 drm: Use a nondestructive mode for output detect when polling

Destructive load-detection is very expensive and due to failings
elsewhere can trigger system wide stalls of up to 600ms. A simple
first step to correcting this is not to invoke such an expensive
and destructive load-detection operation automatically.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29536
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16265
Reported-by: Bruno Prémont
Tested-by: Sitsofe Wheeler
Signed-off-by: Chris Wilson
Cc: stable@kernel.org
Signed-off-by: Dave Airlie

Chris Wilson
2010-09-13 18:29:11 +0800
c54fce6ef workqueue: add documentation

Update copyright notice and add Documentation/workqueue.txt.

Randy Dunlap, Dave Chinner: misc fixes.

Signed-off-by: Tejun Heo
Reviewed-By: Florian Mickler
Cc: Ingo Molnar
Cc: Christoph Lameter
Cc: Randy Dunlap
Cc: Dave Chinner

Tejun Heo
2010-09-13 16:26:52 +0800
006abe887 SUNRPC: Fix a race in rpc_info_open

There is a race between rpc_info_open and rpc_release_client()
in that nothing stops a process from opening the file after
the clnt->cl_kref goes to zero.

Fix this by using atomic_inc_not_zero()...
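
The idea of taking a reference only while the count is still non-zero can be sketched in user space with C11 atomics. This is an illustration of the technique, not the kernel primitive: `inc_not_zero` is a hypothetical helper built on a compare-and-swap loop.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *count only if it is currently non-zero.
 * Returns true if a reference was taken, false if the object
 * had already dropped to zero and must not be revived. */
static bool inc_not_zero(atomic_int *count)
{
    int old = atomic_load(count);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value
         * and the loop re-checks it against zero. */
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return true;
    }
    return false;
}
```

An opener that gets `false` back would treat the client as already gone instead of touching it, which is what closes the window described above.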

Reported-by: J. Bruce Fields
Signed-off-by: Trond Myklebust
Cc: stable@kernel.org

Trond Myklebust
2010-09-13 07:55:25 +0800

11 Sep, 2010

1 commit

002e473d1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (28 commits)
ipheth: remove incorrect devtype to WWAN
MAINTAINERS: Add CAIF
sctp: fix test for end of loop
KS8851: Correct RX packet allocation
udp: add rehash on connect()
net: blackhole route should always be recalculated
ipv4: Suppress lockdep-RCU false positive in FIB trie (3)
niu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL
ipvs: fix active FTP
gro: Re-fix different skb headrooms
via-velocity: Turn scatter-gather support back off.
ipv4: Fix reverse path filtering with multipath routing.
UNIX: Do not loop forever at unix_autobind().
PATCH: b44 Handle RX FIFO overflow better (simplified)
irda: off by one
3c59x: Fix deadlock in vortex_error()
netfilter: discard overlapping IPv6 fragment
ipv6: discard overlapping fragment
net: fix tx queue selection for bridged devices implementing select_queue
bonding: Fix jiffies overflow problems (again)
...

Fix up trivial conflicts due to the same cgroup API thinko fix going
through both Andrew and the networking tree. However, there were small
differences between the two, with Andrew's version generally being the
nicer one, and the one I merged first. So pick that one.

Conflicts in: include/linux/cgroup.h and kernel/cgroup.c

Linus Torvalds
2010-09-11 23:06:38 +0800

10 Sep, 2010

11 commits

ff3cb3fec Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: Range check cpu in blk_cpu_to_group
scatterlist: prevent invalid free when alloc fails
writeback: Fix lost wake-up shutting down writeback thread
writeback: do not lose wakeup events when forking bdi threads
cciss: fix reporting of max queue depth since init
block: switch s390 tape_block and mg_disk to elevator_change()
block: add function call to switch the IO scheduler from a driver
fs/bio-integrity.c: return -ENOMEM on kmalloc failure
bio-integrity.c: remove dependency on __GFP_NOFAIL
BLOCK: fix bio.bi_rw handling
block: put dev->kobj in blk_register_queue fail path
cciss: handle allocation failure
cfq-iosched: Documentation help for new tunables
cfq-iosched: blktrace print per slice sector stats
cfq-iosched: Implement tunable group_idle
cfq-iosched: Do group share accounting in IOPS when slice_idle=0
cfq-iosched: Do not idle if slice_idle=0
cciss: disable doorbell reset on reset_devices
blkio: Fix return code for mkdir calls

Linus Torvalds
2010-09-10 22:26:27 +0800
053d8f662 Merge branch 'vhost-net' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

David S. Miller
2010-09-10 12:59:51 +0800
df423dc7f Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata-sff: Reenable Port Multiplier after libata-sff remodeling.
libata: skip EH autopsy and recovery during suspend
ahci: AHCI and RAID mode SATA patch for Intel Patsburg DeviceIDs
ata_piix: IDE Mode SATA patch for Intel Patsburg DeviceIDs
libata,pata_via: revert ata_wait_idle() removal from ata_sff/via_tf_load()
ahci: fix hang on failed softreset
pata_artop: Fix device ID parity check

Linus Torvalds
2010-09-10 11:28:19 +0800
ea3c64506 libata-sff: Reenable Port Multiplier after libata-sff remodeling.

Keep track of the link on which the current request is in progress.
This allows support of links behind a port multiplier.

Not all of libata-sff is PMP compliant. The code for the native BMDMA
controller does not take PMP into account.

Tested on Marvell 7042 and Sil7526.

Signed-off-by: Gwendal Grignou
Signed-off-by: Jeff Garzik

Gwendal Grignou
2010-09-10 10:31:55 +0800
e2f3d75fc libata: skip EH autopsy and recovery during suspend

For some mysterious reason, certain hardware reacts badly to usual EH
actions while the system is going for suspend. As the devices won't
be needed until the system is resumed, ask EH to skip usual autopsy
and recovery and proceed directly to suspend.

Signed-off-by: Tejun Heo
Tested-by: Stephan Diestelhorst
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik

Tejun Heo
2010-09-10 10:27:59 +0800
aa4548403 mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake

Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it is
cheaper than scanning a number of lists. To avoid synchronization
overhead, counter deltas are maintained on a per-cpu basis and drained
both periodically and when the delta is above a threshold. On large CPU
systems, the difference between the estimated and real value of
NR_FREE_PAGES can be very high. If NR_FREE_PAGES is much higher than
the number of real free pages in the buddy allocator, the VM can allocate pages below min
watermark, at worst reducing the real number of pages to zero. Even if
the OOM killer kills some victim for freeing memory, it may not free
memory if the exit path requires a new page resulting in livelock.

This patch introduces a zone_page_state_snapshot() function (courtesy of
Christoph) that takes a slightly more accurate view of an arbitrary vmstat
counter. It is used to read NR_FREE_PAGES while kswapd is awake to avoid
the watermark being accidentally broken. The estimate is not perfect and
may result in cache line bounces but is expected to be lighter than the
IPI calls necessary to continually drain the per-cpu counters while kswapd
is awake.
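
The difference between the cheap estimate and the snapshot can be sketched as follows. This is a user-space model under stated assumptions: `struct zone_sketch`, `NR_CPUS_SKETCH` and both functions are hypothetical, and the clamp at zero is a design choice of this sketch rather than something stated in the message.

```c
#define NR_CPUS_SKETCH 4

/* Hypothetical model of one zone's free-page accounting. */
struct zone_sketch {
    long global_free;                /* counter updated when deltas are drained */
    long cpu_delta[NR_CPUS_SKETCH];  /* per-CPU deltas not yet folded in */
};

/* Cheap estimate: read the global counter only. */
static long free_pages_estimate(const struct zone_sketch *z)
{
    return z->global_free;
}

/* Snapshot: global counter plus every pending per-CPU delta,
 * clamped so that a transiently negative sum reads as zero. */
static long free_pages_snapshot(const struct zone_sketch *z)
{
    long x = z->global_free;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        x += z->cpu_delta[cpu];

    return x < 0 ? 0 : x;
}
```

With a global counter of 100 and pending deltas of -30, -40, -20 and -5, the estimate still reports 100 free pages while the snapshot reports 5, which is the kind of gap that lets allocations slip below the min watermark.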

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Christoph Lameter
2010-09-10 09:57:25 +0800
339944663 swap: discard while swapping only if SWAP_FLAG_DISCARD

Tests with recent firmware on Intel X25-M 80GB and OCZ Vertex 60GB SSDs
show a shift since I last tested in December: in part because of firmware
updates, in part because of the necessary move from barriers to awaiting
completion at the block layer. While discard at swapon still shows as
slightly beneficial on both, discarding 1MB swap cluster when allocating
is now disadvantageous: adds 25% overhead on Intel, adds 230% on OCZ (YMMV).

Surrender: discard as presently implemented is more hindrance than help
for swap; but might prove useful on other devices, or with improvements.
So continue to do the discard at swapon, but make discard while swapping
conditional on a SWAP_FLAG_DISCARD to sys_swapon() (which has been using
only the lower 16 bits of int flags).

We can add a --discard or -d to swapon(8), and a "discard" to swap in
/etc/fstab: matching the mount option for btrfs, ext4, fat, gfs2, nilfs2.
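
Since the existing flags occupy only the lower 16 bits, the new flag can take the next bit up. The sketch below shows how such a flags word might be decoded; the `SKETCH_` constants and both helpers are assumptions made for illustration, and the exact bit values should be checked against the kernel's swap header.

```c
#include <stdbool.h>

/* Assumed layout of the swapon flags word (illustrative values). */
#define SKETCH_SWAP_FLAG_PRIO_MASK 0x7fff   /* priority, lower 15 bits */
#define SKETCH_SWAP_FLAG_PREFER    0x8000   /* priority field is valid */
#define SKETCH_SWAP_FLAG_DISCARD   0x10000  /* first bit above the lower 16 */

/* Should discards be issued while swapping? */
static bool swap_wants_discard(int flags)
{
    return (flags & SKETCH_SWAP_FLAG_DISCARD) != 0;
}

/* Requested priority, or -1 if the caller did not set one. */
static int swap_priority(int flags)
{
    if (!(flags & SKETCH_SWAP_FLAG_PREFER))
        return -1;
    return flags & SKETCH_SWAP_FLAG_PRIO_MASK;
}
```

A caller that passes `SKETCH_SWAP_FLAG_PREFER | 5` keeps the old behaviour (priority 5, no discard while swapping), and OR-ing in `SKETCH_SWAP_FLAG_DISCARD` opts in without disturbing the priority bits.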

Signed-off-by: Hugh Dickins
Cc: Christoph Hellwig
Cc: Nigel Cunningham
Cc: Tejun Heo
Cc: Jens Axboe
Cc: James Bottomley
Cc: "Martin K. Petersen"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2010-09-10 09:57:25 +0800
910321ea8 swap: revert special hibernation allocation

Please revert 2.6.36-rc commit d2997b1042ec150616c1963b5e5e919ffd0b0ebf
"hibernation: freeze swap at hibernation". It complicated matters by
adding a second swap allocation path, just for hibernation; without in any
way fixing the issue that it was intended to address - page reclaim after
fixing the hibernation image might free swap from a page already imaged as
swapcache, letting its swap be reallocated to store a different page of
the image: resulting in data corruption if the imaged page were freed as
clean then swapped back in. Pages freed to si->swap_map were still in
danger of being reallocated by the alternative allocation path.

I guess it inadvertently fixed slow SSD swap allocation for hibernation,
as reported by Nigel Cunningham: by missing out the discards that occur on
the usual swap allocation path; but that was unintentional, and needs a
separate fix.

Signed-off-by: Hugh Dickins
Cc: KAMEZAWA Hiroyuki
Cc: KOSAKI Motohiro
Cc: "Rafael J. Wysocki"
Cc: Ondrej Zary
Cc: Andrea Gelmini
Cc: Balbir Singh
Cc: Andrea Arcangeli
Cc: Nigel Cunningham
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2010-09-10 09:57:25 +0800
c956126c1 gpio: doc updates

There's been some recent confusion about error checking GPIO numbers.
Briefly, it should be handled mostly during setup, when gpio_request() is
called, and NEVER by expecting gpio_is_valid() to report more than
never-usable GPIO numbers.

[akpm@linux-foundation.org: terminate unterminated comment]
Signed-off-by: David Brownell
Cc: "Eric Miao"
Cc: "Ryan Mallon"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Brownell
2010-09-10 09:57:24 +0800
5affb6077 gpio: sx150x: correct and refine reset-on-probe behavior

Replace the arbitrary software-reset call from the device-probe
method, because:

- It is defective. To work correctly, it should be two byte writes,
not a single word write. As it stands, it does nothing.

- Some devices with sx150x expanders installed have their NRESET pins
ganged on the same line, so resetting one causes the others to reset -
not a nice thing to do arbitrarily!

- The probe, usually taking place at boot, implies a recent hard-reset,
so a software reset at this point is just a waste of energy anyway.

Therefore, make it optional, defaulting to off, as this will match the
common case of probing at powerup and also matches the current broken
no-op behavior.

Signed-off-by: Gregory Bean
Reviewed-by: Jean Delvare
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Gregory Bean
2010-09-10 09:57:24 +0800
4969c1192 mm: fix swapin race condition

The pte_same check is reliable only if the swap entry remains pinned (by
the page lock on swapcache). We also have to ensure the swapcache isn't
removed before we take the lock as try_to_free_swap won't care about the
page pin.

One of the possible impacts of this patch is that a KSM-shared page can
point to the anon_vma of another process, which could exit before the page
is freed.

This can leave a page with a pointer to a recycled anon_vma object, or
worse, a pointer to something that is no longer an anon_vma.

[riel@redhat.com: changelog help]
Signed-off-by: Andrea Arcangeli
Acked-by: Hugh Dickins
Reviewed-by: Rik van Riel
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrea Arcangeli
2010-09-10 09:57:24 +0800