Eric Lee / smarc-fsl-linux-kernel

21 Oct, 2011

12 commits

7c9ca6211 GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme

Here is an update of Bob's original rbtree patch which, in addition, also
resolves the rather strange ref counting that was being done relating to
the bitmap blocks.

Originally we had a dual system for journaling resource groups. The metadata
blocks were journaled and also the rgrp itself was added to a list. The reason
for adding the rgrp to the list in the journal was so that the "repolish
clones" code could be run to update the free space, and potentially send any
discard requests when the log was flushed. This was done by comparing the
"cloned" bitmap with what had been written back on disk during the transaction
commit.

Due to this, there was a requirement to hang on to the rgrps' bitmap buffers
until the journal had been flushed. For that reason, there was a rather
complicated set up in the ->go_lock ->go_unlock functions for rgrps involving
both a mutex and a spinlock (the ->sd_rindex_spin) to maintain a reference
count on the buffers.

However, the journal maintains a reference count on the buffers anyway, since
they are being journaled as metadata buffers. So by moving the code which deals
with the post-journal accounting for bitmap blocks to the metadata journaling
code, we can entirely dispense with the rather strange buffer ref counting
scheme and also the requirement to journal the rgrps.

The net result of all this is that the ->sd_rindex_spin is left to do exactly
one job, and that is to look after the rbtree of rgrps.

This patch is designed to be a stepping stone towards using RCU for the rbtree
of resource groups, however the reduction in the number of uses of the
->sd_rindex_spin is likely to have benefits for multi-threaded workloads,
anyway.

The patch retains ->go_lock and ->go_unlock for rgrps, however these may also
be removed in future in favour of calling the functions directly where required
in the code. That will allow locking of resource groups without needing to
actually read them in - something that could be useful in speeding up statfs.

In the meantime, though, it is valid to dereference ->bi_bh only when the rgrp
is locked. This is basically the same rule as before, modulo the references not
being valid until the following journal flush.

Signed-off-by: Steven Whitehouse
Signed-off-by: Bob Peterson
Cc: Benjamin Marzinski

Bob Peterson
2011-10-21 19:39:31 +0800
9453615a1 GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added

We need to take the inode's glock whenever the inode's size
is referenced, otherwise it might not be uptodate. Even
though generic_file_llseek_unlocked() doesn't implement
SEEK_DATA, SEEK_HOLE directly, it does reference the inode's
size in those cases, so we need to add them to the list
of origins which need the glock.
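
As a rough illustration of the rule described above, the check can be written as a predicate over the seek origin. This is a minimal sketch, not the GFS2 code: the enum `seek_origin` and the function `origin_reads_size()` are hypothetical names, and the enum values are placeholders rather than the real SEEK_* constants.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the seek origins; values are illustrative. */
enum seek_origin {
    ORIGIN_SET,
    ORIGIN_CUR,
    ORIGIN_END,
    ORIGIN_DATA,
    ORIGIN_HOLE
};

/*
 * Returns true when resolving the origin consults the file size, in which
 * case the caller is assumed to take the inode's lock first so that the
 * size it reads is current.
 */
static bool origin_reads_size(enum seek_origin origin)
{
    switch (origin) {
    case ORIGIN_END:
    case ORIGIN_DATA:
    case ORIGIN_HOLE:
        return true;
    default:
        return false;
    }
}
```

A caller would take the lock only when `origin_reads_size()` returns true and seek without it otherwise.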

Signed-off-by: Steven Whitehouse
Cc: Andi Kleen

Steven Whitehouse
2011-10-21 19:39:29 +0800
9a63edd12 GFS2: Clean up gfs2_create

If we pass through knowledge of whether the creation is intended to be
exclusive or not, then we can deal with that in gfs2_create_inode
and remove one set of locking. Also this removes the loop in
gfs2_create and simplifies the code a bit.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:28 +0800
ab9bbda02 GFS2: Use ->dirty_inode()

The aim of this patch is to use the newly enhanced ->dirty_inode()
super block operation to deal with atime updates, rather than
piggy backing that code into ->write_inode() as is currently
done.

The net result is a simplification of the code in various places
and a reduction of the number of gfs2_dinode_out() calls since
this is now implied by ->dirty_inode().

Some of the mark_inode_dirty() calls have been moved under glocks
in order to take advantage of then being able to avoid locking in
->dirty_inode() when we already have suitable locks.

One consequence is that generic_write_end() now correctly deals
with file size updates, so that we do not need a separate check
for that afterwards. This also, indirectly, means that fdatasync
should work correctly on GFS2 - the current code always syncs the
metadata whether it needs to or not.

Has survived testing with postmark (with and without atime) and
also fsx.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:26 +0800
f18185291 GFS2: Fix bug trap and journaled data fsync

Journaled data requires that a complete flush of all dirty data for
the file is done, in order that the ail flush which comes after
will succeed.

Also the recently enhanced bug trap can trigger falsely in case
an ail flush from fsync races with a page read. This updates the
bug trap such that it will ignore buffers which are locked and
only trigger on dirty and/or pinned buffers when the ail flush
is run from fsync. The original bug trap is retained when ail
flush is run from ->go_sync().

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:25 +0800
40ac218f5 GFS2: Fix inode allocation error path

If we have got far enough through the inode allocation code
path that an inode has already been allocated, then we must
call iput to dispose of it, if an error occurs during a
later part of the process. This will always be the final iput
since there will be no other references to the inode.

Unlike when the inode has been unlinked, its block state will
be GFS2_BLKST_INODE rather than GFS2_BLKST_UNLINKED so we need
to skip the test in ->evict_inode() for this one case in order
to ensure that it will be deallocated correctly. This patch adds
a new flag in order to ensure that this will happen correctly.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:23 +0800
1d4ec642d GFS2: Make atime checks more efficient

We do not need to start a transaction unless the atime
check has proved positive. Also if we are going to flush
the complete ail list anyway, we might as well skip the
writeback for this specific inode's metadata, since that
will be done as part of the ail writeback process in an
order offering potentially more efficient I/O.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:21 +0800
75549186e GFS2: Fix bug-trap in ail flush code

The assert was being tested under the wrong lock, a
legacy of the original code. Also, if it does trigger,
the resulting information was not always a lot of help.

This moves the assert under the correct lock and also
prints out more useful information for tracking down the
source of the problem.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:20 +0800
2f0264d59 GFS2: Split data write & wait in fsync

Now that the data writing is part of fsync proper, we can split
the waiting part out and do it later on. This reduces the
number of waits that we do during fsync on average.

There is also no need to take the i_mutex unless we are flushing
metadata to disk, so we can move that to within the metadata
flushing code.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:18 +0800
4c28d3380 GFS2: Clean up dir hash table reading

Since there is now only a single caller to gfs2_dir_read_data()
and it has a number of constant arguments, we can factor
those out. Also some tests relating to the inode size were
being done twice.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:17 +0800
fd11e153b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Add alignment flag to PCI expansion resources
sparc: Avoid calling sigprocmask()
sparc: Use set_current_blocked()
sparc32,leon: SRMMU MMU Table probe fix

Linus Torvalds
2011-10-21 03:16:28 +0800
505f48b53 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
fib_rules: fix unresolved_rules counting
r8169: fix wrong eee setting for rlt8111evl
r8169: fix driver shutdown WoL regression.
ehea: Change maintainer to me
pptp: pptp_rcv_core() misses pskb_may_pull() call
tproxy: copy transparent flag when creating a time wait
pptp: fix skb leak in pptp_xmit()
bonding: use local function pointer of bond->recv_probe in bond_handle_frame
smsc911x: Add support for SMSC LAN89218
tg3: negate USE_PHYLIB flag check
netconsole: enable netconsole can make net_device refcnt incorrent
bluetooth: Properly clone LSM attributes to newly created child connections
l2tp: fix a potential skb leak in l2tp_xmit_skb()
bridge: fix hang on removal of bridge via netlink
x25: Prevent skb overreads when checking call user data
x25: Handle undersized/fragmented skbs
x25: Validate incoming call user data lengths
udplite: fast-path computation of checksum coverage
IPVS netns shutdown/startup dead-lock
netfilter: nf_conntrack: fix event flooding in GRE protocol tracker

Linus Torvalds
2011-10-21 03:15:20 +0800

20 Oct, 2011

6 commits

486cf46f3 mm: fix race between mremap and removing migration entry

I don't usually pay much attention to the stale "? " addresses in
stack backtraces, but this lucky report from Pawel Sikora hints that
mremap's move_ptes() has inadequate locking against page migration.

3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
kernel BUG at include/linux/swapops.h:105!
RIP: 0010: migration_entry_wait+0x156/0x160
handle_pte_fault+0xae1/0xaf0
? __pte_alloc+0x42/0x120
? do_huge_pmd_anonymous_page+0xab/0x310
handle_mm_fault+0x181/0x310
? vma_adjust+0x537/0x570
do_page_fault+0x11d/0x4e0
? do_mremap+0x2d5/0x570
page_fault+0x1f/0x30

mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
and pagetable locks, were good enough before page migration (with its
requirement that every migration entry be found) came in, and enough
while migration always held mmap_sem; but not enough nowadays, when
there's memory hotremove and compaction.

The danger is that move_ptes() lets a migration entry dodge around
behind remove_migration_pte()'s back, so it's in the old location when
looking at the new, then in the new location when looking at the old.

Either mremap's move_ptes() must additionally take anon_vma lock(), or
migration's remove_migration_pte() must stop peeking for is_swap_entry()
before it takes pagetable lock.

Consensus chooses the latter: we prefer to add overhead to migration
than to mremapping, which gets used by JVMs and by exec stack setup.

Reported-and-tested-by: Paweł Sikora
Signed-off-by: Hugh Dickins
Acked-by: Andrea Arcangeli
Acked-by: Mel Gorman
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds

Hugh Dickins
2011-10-20 14:42:58 +0800
aad456449 sparc: Add alignment flag to PCI expansion resources

Currently no type of alignment is specified for PCI expansion roms while
parsing the openfirmware tree. This causes calls to pci_map_rom() to fail.
IORESOURCE_SIZEALIGN is the default alignment used for rom resources in
pci/probe.c, and has been verified to work with various cards on an Ultra 10.

Signed-off-By: Kjetil Oftedal
Signed-off-by: David S. Miller

Kjetil Oftedal
2011-10-20 07:20:50 +0800
afaef734e fib_rules: fix unresolved_rules counting

We should decrement ops->unresolved_rules when deleting an unresolved rule.

Signed-off-by: Zheng Yan
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Yan, Zheng
2011-10-20 07:17:41 +0800
1b23a3e3d r8169: fix wrong eee setting for rlt8111evl

Correct the wrong parameter for setting EEE for RTL8111E-VL.

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

hayeswang
2011-10-20 06:48:17 +0800
649b3b8c4 r8169: fix driver shutdown WoL regression.

Due to commit 92fc43b4159b518f5baae57301f26d770b0834c9 ("r8169: modify the
flow of the hw reset."), rtl8169_hw_reset stomps during driver shutdown on
RxConfig bits which are needed for WOL on some versions of the hardware.

As these bits were formerly set from the r81{0x, 68}_pll_power_down methods,
factor them out for use in the driver shutdown (rtl_shutdown) handler.

I favored __rtl8169_get_wol() -hardware state indication- over
RTL_FEATURE_WOL as the latter has become a good candidate for removal.

Signed-off-by: Francois Romieu
Cc: Hayes
Tested-by: Marc Ballarin
Signed-off-by: David S. Miller

françois romieu
2011-10-20 05:08:21 +0800
34b1901ab ehea: Change maintainer to me

Breno Leitao has passed the maintainership to me.

Signed-off-by: Thadeu Lima de Souza Cascardo
Cc: Breno Leitao
Acked-by: Breno Leitão
Signed-off-by: David S. Miller

Thadeu Lima de Souza Cascardo
2011-10-20 04:01:20 +0800

19 Oct, 2011

14 commits

e4fcd69c9 Merge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus

* 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus:
[media] videodev: fix a NULL pointer dereference in v4l2_device_release()

Linus Torvalds
2011-10-19 21:44:11 +0800
f91f6cfd4 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms/atom: fix handling of FB scratch indices
drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
ttm: Fix error-path using an uninitialized value

Linus Torvalds
2011-10-19 21:43:24 +0800
e58fced20 [media] videodev: fix a NULL pointer dereference in v4l2_device_release()

The change in 8280b66 does not cover the case when v4l2_dev is already
NULL, fix that.

With a Kinect sensor, seen as a USB camera using GSPCA in this context,
a NULL pointer dereference BUG can be triggered by just unplugging the
device after the camera driver has been loaded.

Signed-off-by: Antonio Ospite
Signed-off-by: Mauro Carvalho Chehab

Antonio Ospite
2011-10-19 19:48:08 +0800
5a6e8482a drm/radeon/kms/atom: fix handling of FB scratch indices

FB scratch indices are dword indices, but we were treating
them as byte indices. As such, we were getting the wrong
FB scratch data for non-0 indices. Fix the indices and
guard the indexing against indices larger than the scratch
allocation.

Fixes memory corruption on some boards if data was written
past the end of the FB scratch array.
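
The indexing rule described above can be sketched as follows. This is an illustration under stated assumptions, not the radeon code: `scratch_read()` and its parameters are hypothetical names, the scratch area is modelled as a plain array of 32-bit words, and the allocation size is assumed to be tracked in bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Reads one 32-bit word from a scratch buffer.
 * idx counts dwords, not bytes; size_bytes is the size of the allocation.
 * Returns false, leaving *out untouched, when idx is out of range.
 */
static bool scratch_read(const uint32_t *scratch, size_t size_bytes,
                         size_t idx, uint32_t *out)
{
    /* Convert the byte size to a dword count before comparing. */
    if (idx >= size_bytes / sizeof(uint32_t))
        return false;

    *out = scratch[idx];
    return true;
}
```

Keeping the index and the limit in the same unit (dwords) is what stops a non-zero index from landing on the wrong word or past the end of the allocation.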

Signed-off-by: Alex Deucher
Reported-by: Dave Airlie
Tested-by: Dave Airlie
Cc: stable@kernel.org
Signed-off-by: Dave Airlie

Alex Deucher
2011-10-19 16:47:47 +0800
4ea2739ea pptp: pptp_rcv_core() misses pskb_may_pull() call

e1000e uses paged frags, so any layer incorrectly pulling bytes from skb
can trigger a BUG in skb_pull()

[951.142737] skb_pull+0x15/0x17
[951.142737] pptp_rcv_core+0x126/0x19a [pptp]
[951.152725] sk_receive_skb+0x69/0x105
[951.163558] pptp_rcv+0xc8/0xdc [pptp]
[951.165092] gre_rcv+0x62/0x75 [gre]
[951.165092] ip_local_deliver_finish+0x150/0x1c1
[951.177599] ? ip_local_deliver_finish+0x0/0x1c1
[951.177599] NF_HOOK.clone.7+0x51/0x58
[951.177599] ip_local_deliver+0x51/0x55
[951.177599] ip_rcv_finish+0x31a/0x33e
[951.177599] ? ip_rcv_finish+0x0/0x33e
[951.204898] NF_HOOK.clone.7+0x51/0x58
[951.214651] ip_rcv+0x21b/0x246

pptp_rcv_core() is a nice example of a function assuming everything it
needs is available in skb head.

Reported-by: Bradley Peterson
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-10-19 15:50:43 +0800
58af19e38 tproxy: copy transparent flag when creating a time wait

The transparent socket option setting was not copied to the time wait
socket when an inet socket was being replaced by a time wait socket. This
broke the --transparent option of the socket match and may have caused
FIN packets belonging to sockets in FIN_WAIT2 or TIME_WAIT state
to be dropped by the packet filter.

Signed-off-by: KOVACS Krisztian
Signed-off-by: David S. Miller

KOVACS Krisztian
2011-10-19 15:21:35 +0800
8bae8bd6c pptp: fix skb leak in pptp_xmit()

In case we can't transmit the skb, we must free it.

Signed-off-by: Eric Dumazet
CC: Dmitry Kozlov
Signed-off-by: David S. Miller

Eric Dumazet
2011-10-19 14:39:43 +0800
4d97480b1 bonding: use local function pointer of bond->recv_probe in bond_handle_frame

The bond->recv_probe is called in bond_handle_frame() when
a packet is received, but bond_close() sets it to NULL. So,
a panic occurs when both functions work in parallel.

Why this happens:
After null pointer check of bond->recv_probe, an sk_buff is
duplicated and bond->recv_probe is called in bond_handle_frame.
So, a panic occurs when bond_close() is called between the
check and call of bond->recv_probe.

Patch:
This patch uses a local function pointer of bond->recv_probe
in bond_handle_frame(). So, it can avoid the null pointer
dereference.
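
The pattern can be sketched in plain C as below. This is a simplified illustration, not the bonding driver: `struct bond_sketch`, `handle_frame()` and `double_frame()` are hypothetical names, and the plain load used for the snapshot is an assumption that only holds for this single-threaded demo. Code that really races with another thread needs a read the compiler is guaranteed to perform exactly once.

```c
#include <stddef.h>

/* Hypothetical stand-in for the bonding structure. */
struct bond_sketch {
    int (*recv_probe)(int frame);   /* may be cleared by a closing thread */
};

/* Example probe used only to exercise the sketch. */
static int double_frame(int frame)
{
    return frame * 2;
}

static int handle_frame(struct bond_sketch *bond, int frame)
{
    /*
     * Take one snapshot of the pointer, then test and call that same
     * value. Checking bond->recv_probe and later calling
     * bond->recv_probe again would leave a window in which it can
     * become NULL between the check and the call.
     */
    int (*probe)(int) = bond->recv_probe;

    if (probe == NULL)
        return -1;      /* no probe installed */

    return probe(frame);
}
```

Because the check and the call use the local copy, clearing `recv_probe` in between can no longer turn the call into a NULL dereference.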

Signed-off-by: Mitsuo Hayasaka
Cc: Jay Vosburgh
Cc: Andy Gospodarek
Cc: Eric Dumazet
Cc: WANG Cong
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Mitsuo Hayasaka
2011-10-19 12:14:22 +0800
28c213793 smsc911x: Add support for SMSC LAN89218

LAN89218 is register compatible with LAN911x.

Signed-off-by: Phil Edworthy
Signed-off-by: David S. Miller

Phil Edworthy
2011-10-19 12:01:01 +0800
e730c8234 tg3: negate USE_PHYLIB flag check

The USE_PHYLIB flag in tg3_remove_one() is being checked incorrectly. As a
result, tg3_phy_fini->phy_disconnect is never called when the tg3 module
is removed.

In my case this resulted in panics in phy_state_machine calling function
phydev->adjust_link.

So correct this check.

Signed-off-by: Jiri Pirko
Acked-by: Matt Carlson
Signed-off-by: David S. Miller

Jiri Pirko
2011-10-19 11:59:33 +0800
d5123480b netconsole: enable netconsole can make net_device refcnt incorrent

There is no check for whether netconsole is currently enabled,
so each time "echo 1 > enabled" is executed,
the reference count of the net_device is incremented again.
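
One way to picture the missing check is a store handler that refuses a request matching the current state. This is a minimal sketch with hypothetical names (`struct target_sketch`, `store_enabled()`), where an integer stands in for the net_device reference count; it is not the netconsole code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a netconsole target. */
struct target_sketch {
    bool enabled;
    int dev_refcnt;     /* stands in for the net_device reference count */
};

/*
 * Returns 0 when the state changes and -1 when the request matches the
 * current state, in which case the reference count is left alone.
 */
static int store_enabled(struct target_sketch *t, bool enable)
{
    if (enable == t->enabled)
        return -1;

    if (enable)
        t->dev_refcnt++;    /* take a reference on enable */
    else
        t->dev_refcnt--;    /* drop it again on disable */

    t->enabled = enable;
    return 0;
}
```

With the early return in place, writing 1 repeatedly takes a single reference instead of one per write.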

Signed-off-by: Gao feng
Acked-by: Flavio Leitner
Signed-off-by: David S. Miller

Gao feng
2011-10-19 11:55:29 +0800
6230c9b4f bluetooth: Properly clone LSM attributes to newly created child connections

The Bluetooth stack has internal connection handlers for all of the various
Bluetooth protocols, and unfortunately, they are currently lacking the LSM
hooks found in the core network stack's connection handlers. I say
unfortunately, because this can cause problems for users who have an
LSM enabled and are using certain Bluetooth devices. See one problem
report below:

* http://bugzilla.redhat.com/show_bug.cgi?id=741703

In order to keep things simple at this point in time, this patch fixes the
problem by cloning the parent socket's LSM attributes to the newly created
child socket. If we decide we need a more elaborate LSM marking mechanism
for Bluetooth (I somewhat doubt this) we can always revisit this decision
in the future.

Reported-by: James M. Cape
Signed-off-by: Paul Moore
Acked-by: James Morris
Signed-off-by: David S. Miller

Paul Moore
2011-10-19 11:36:43 +0800
835acf5da l2tp: fix a potential skb leak in l2tp_xmit_skb()

l2tp_xmit_skb() can leak one skb if skb_cow_head() returns an error.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-10-19 11:32:00 +0800
1ce5cce89 bridge: fix hang on removal of bridge via netlink

Bridge device timers and ports need to be cleaned up when a bridge
device is removed via netlink.

This fixes the problem observed when doing:
ip link add br0 type bridge
ip link set dev eth1 master br0
ip link set br0 up
ip link del br0

which would cause br0 to hang in unregister_netdev because
of leftover reference count.

Reported-by: Sridhar Samudrala
Signed-off-by: Stephen Hemminger
Acked-by: Sridhar Samudrala
Signed-off-by: David S. Miller

stephen hemminger
2011-10-19 11:24:16 +0800

18 Oct, 2011

8 commits

bcd5cff72 cputimer: Cure lock inversion

There's a lock inversion between the cputimer->lock and rq->lock;
notably the two callchains involved are:

update_rlimit_cpu()
sighand->siglock
set_process_cpu_timer()
cpu_timer_sample_group()
thread_group_cputimer()
cputimer->lock
thread_group_cputime()
task_sched_runtime()
->pi_lock
rq->lock

scheduler_tick()
rq->lock
task_tick_fair()
update_curr()
account_group_exec()
cputimer->lock

Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping it up to date.

This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer->lock and rq->lock nesting.
This leaves concurrent enablers doing duplicate work, but the time
wasted should be on the same order as that otherwise wasted spinning on
the lock, and the greater-than assignment filter should ensure we
preserve monotonicity.
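
A greater-than assignment filter of the kind mentioned above might look like the sketch below. The names `struct cputime_sketch` and `update_gt()` are hypothetical, and the three fields are assumed for illustration only. The point is that a stale sample written by a concurrent enabler can never move a counter backwards.

```c
#include <stdint.h>

/* Hypothetical group CPU time totals. */
struct cputime_sketch {
    uint64_t utime;
    uint64_t stime;
    uint64_t sum_exec_runtime;
};

/*
 * Merge sample b into total a, field by field, keeping whichever value
 * is larger. Applying samples in any order, or more than once, leaves
 * every field monotonically non-decreasing.
 */
static void update_gt(struct cputime_sketch *a, const struct cputime_sketch *b)
{
    if (b->utime > a->utime)
        a->utime = b->utime;
    if (b->stime > a->stime)
        a->stime = b->stime;
    if (b->sum_exec_runtime > a->sum_exec_runtime)
        a->sum_exec_runtime = b->sum_exec_runtime;
}
```

Since the merge is idempotent, two enablers computing the same sample do redundant work but cannot corrupt the totals.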

Reported-by: Dave Jones
Reported-by: Simon Kirby
Signed-off-by: Peter Zijlstra
Cc: stable@kernel.org
Cc: Linus Torvalds
Cc: Martin Schwidefsky
Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
Signed-off-by: Thomas Gleixner

Peter Zijlstra
2011-10-18 17:36:59 +0800
a4863ca93 drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.

Signed-off-by: Alex Deucher
Signed-off-by: Dave Airlie

Alex Deucher
2011-10-18 17:16:55 +0800
09cc6506f drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls

It's handled via external clock. It should already be protected
by the external ss flag, but add an explicit check just in case.

Signed-off-by: Alex Deucher
Signed-off-by: Dave Airlie

Alex Deucher
2011-10-18 17:16:33 +0800
3a6dea314 drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping

llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.

Signed-off-by: Alex Deucher
Signed-off-by: Dave Airlie

Alex Deucher
2011-10-18 17:16:10 +0800
e22469ca8 ttm: Fix error-path using an uninitialized value

Pointed out by Michel Daenzer.

Signed-off-by: Thomas Hellstrom
Signed-off-by: Dave Airlie

Thomas Hellstrom
2011-10-18 16:37:49 +0800
899e3ee40 Linux 3.1-rc10

Linus Torvalds
2011-10-18 12:06:23 +0800
ae2a45831 Merge branch 'nf' of git://1984.lsi.us.es/net

David S. Miller
2011-10-18 07:38:03 +0800
7f81e25be x25: Prevent skb overreads when checking call user data

x25_find_listener does not check that the amount of call user data given
in the skb is big enough in per-socket comparisons, hence buffer
overreads may occur. Fix this by adding a check.
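
The shape of such a check can be sketched as follows. This is an illustration with hypothetical names (`cud_matches()` and its parameters), using plain byte buffers in place of an skb; it is not the x25 code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare the call user data carried by a packet against the data a
 * listener is bound to. The length test comes first so that memcmp()
 * never reads past the end of a short packet.
 */
static bool cud_matches(const unsigned char *pkt, size_t pkt_len,
                        const unsigned char *cud, size_t cud_len)
{
    if (pkt_len < cud_len)
        return false;   /* packet too short to contain the listener's data */

    return memcmp(pkt, cud, cud_len) == 0;
}
```

A packet shorter than the listener's call user data is treated as a non-match rather than being compared.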

Signed-off-by: Matthew Daley
Cc: Eric Dumazet
Cc: Andrew Hendry
Cc: stable
Acked-by: Andrew Hendry
Signed-off-by: David S. Miller

Matthew Daley
2011-10-18 07:31:40 +0800