Eric Lee / smarc-fsl-linux-kernel

03 Dec, 2008

1 commit

51eaaa677 Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6

* 'linux-next' of git://git.infradead.org/ubifs-2.6:
UBIFS: pre-allocate bulk-read buffer
UBIFS: do not allocate too much
UBIFS: do not print scary memory allocation warnings
UBIFS: allow for gaps when dirtying the LPT
UBIFS: fix compilation warnings
MAINTAINERS: change UBI/UBIFS git tree URLs
UBIFS: endian handling fixes and annotations
UBIFS: remove printk

Linus Torvalds
2008-12-03 07:56:55 +0800

02 Dec, 2008

7 commits

038015536 ntfs: don't fool kernel-doc

kernel-doc handles macros now (it has for quite some time), so change the
ntfs_debug() macro's kernel-doc to be just before the macro instead of
before a phony function prototype.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Randy Dunlap
Cc: Anton Altaparmakov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2008-12-02 11:55:25 +0800
7ef9964e6 epoll: introduce resource usage limits

It has been thought that the per-user file descriptors limit would also
limit the resources that a normal user can request via the epoll
interface. Vegard Nossum reported a very simple program (a modified
version attached) that lets a normal user request a pretty large
amount of kernel memory while staying well within its maximum number of fds. To
solve this problem, default limits are now imposed, and /proc based
configuration has been introduced. A new directory has been created,
named /proc/sys/fs/epoll/ and inside there, there are two configuration
points:

max_user_instances = Maximum number of devices - per user

max_user_watches = Maximum number of "watched" fds - per user

The current default for "max_user_watches" limits the memory used by epoll
to store "watches" to 1/32 of the amount of low RAM. As an example, a
256MB 32bit machine will have "max_user_watches" set to roughly 90000.
That should be enough to not break existing heavy epoll users. The
default value for "max_user_instances" is set to 128, which should be
enough too.
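
The arithmetic behind that default can be sketched as below. This is only an illustration: the function name epoll_watch_budget() is hypothetical, and the per-watch cost of 93 bytes used in the usage note is an assumption back-derived from the 256MB example above (8 MiB divided by roughly 90000), not a value taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Budget 1/32 of low RAM for epoll watches, then divide that budget by
 * the bookkeeping cost of a single watch. Both the function name and the
 * idea of passing the cost as a parameter are illustrative assumptions.
 */
static uint64_t epoll_watch_budget(uint64_t low_ram_bytes,
                                   uint64_t per_watch_cost)
{
    assert(per_watch_cost > 0);
    return (low_ram_bytes / 32) / per_watch_cost;
}
```

With 256 MiB of low RAM and an assumed cost of 93 bytes per watch, epoll_watch_budget(256ull << 20, 93) yields 90200, in line with the "roughly 90000" figure quoted above.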

This also changes the userspace, because a new error code can now come out
from EPOLL_CTL_ADD (-ENOSPC). The EMFILE from epoll_create() was already
listed, so that should be ok.

[akpm@linux-foundation.org: use get_current_user()]
Signed-off-by: Davide Libenzi
Cc: Michael Kerrisk
Cc:
Cc: Cyrill Gorcunov
Reported-by: Vegard Nossum
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davide Libenzi
2008-12-02 11:55:24 +0800
d6b58f89f ocfs2: fix regression in ocfs2_read_blocks_sync()

We're panicking in ocfs2_read_blocks_sync() if a jbd-managed buffer is seen.
At first glance, this seems ok but in reality it can happen. My test case
was to just run 'exorcist'. A struct inode is being pushed out of memory but
is then re-read at a later time, before the buffer has been checkpointed by
jbd. This causes a BUG to be hit in ocfs2_read_blocks_sync().

Reviewed-by: Joel Becker
Signed-off-by: Mark Fasheh

Mark Fasheh
2008-12-02 06:46:58 +0800
07d9a3954 ocfs2: fix return value set in init_dlmfs_fs()

In init_dlmfs_fs(), if the call to kmem_cache_create() fails, the code uses the return value from
calling bdi_init(). The correct behavior is to set status to -ENOMEM before going to "bail:".
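
A minimal userspace sketch of this error-path pattern follows. The helpers fake_bdi_init() and fake_cache_create() are hypothetical stand-ins for bdi_init() and kmem_cache_create(), and init_fs_sketch() is not the real init_dlmfs_fs(); the point is only that status must be overwritten before jumping to the common exit label.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for bdi_init(): always succeeds. */
static int fake_bdi_init(void)
{
    return 0;
}

/* Hypothetical stand-in for kmem_cache_create(): NULL means failure. */
static void *fake_cache_create(int should_fail)
{
    static int cache;
    return should_fail ? NULL : &cache;
}

static int init_fs_sketch(int cache_should_fail)
{
    int status;
    void *cache;

    status = fake_bdi_init();
    if (status)
        return status;

    cache = fake_cache_create(cache_should_fail);
    if (!cache) {
        /*
         * Without this assignment, status would still hold the 0 left
         * over from fake_bdi_init() and the failure would be reported
         * as success.
         */
        status = -ENOMEM;
        goto bail;
    }

    status = 0;
bail:
    assert(status <= 0);
    return status;
}
```

Calling init_fs_sketch(1) returns -ENOMEM, while init_fs_sketch(0) returns 0.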

Signed-off-by: Coly Li
Acked-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Coly Li
2008-12-02 06:46:55 +0800
07f9eebcd ocfs2: fix wake_up in unlock_ast

In ocfs2_unlock_ast(), call wake_up() on lockres before releasing
the spin lock on it. As soon as the spin lock is released, the
lockres can be freed.

Signed-off-by: David Teigland
Signed-off-by: Mark Fasheh

David Teigland
2008-12-02 06:46:45 +0800
66f502a41 ocfs2: initialize stack_user lvbptr

The locking_state dump, ocfs2_dlm_seq_show, reads the lvb on locks where it
has not yet been initialized by a lock call.

Signed-off-by: David Teigland
Acked-by: Joel Becker
Signed-off-by: Mark Fasheh

David Teigland
2008-12-02 06:46:39 +0800
3b5da0189 ocfs2: comments typo fix

This patch fixes two typos in comments of ocfs2.

Signed-off-by: Coly Li
Signed-off-by: Mark Fasheh

Coly Li
2008-12-02 06:46:31 +0800

01 Dec, 2008

1 commit

8e36a5d6a Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] fix regression in cifs_write_begin/cifs_write_end

Linus Torvalds
2008-12-01 06:04:02 +0800

28 Nov, 2008

1 commit

52b19ac99 udf: Fix BUG_ON() in destroy_inode()

udf_clear_inode() can leave behind buffers on mapping's i_private list (when
we truncated preallocation). Call invalidate_inode_buffers() so that the list
is properly cleaned-up before we return from udf_clear_inode(). This is ugly
and suggests that we should clean up preallocation earlier than in clear_inode()
but currently there's no such call available since drop_inode() is called under
inode lock and thus is unusable for disk operations.

Signed-off-by: Jan Kara

Jan Kara
2008-11-28 00:38:28 +0800

27 Nov, 2008

1 commit

a98ee8c1c [CIFS] fix regression in cifs_write_begin/cifs_write_end

The conversion to write_begin/write_end interfaces had a bug where we
were passing a bad parameter to cifs_readpage_worker. Rather than
passing the page offset of the start of the write, we needed to pass the
offset of the beginning of the page. This was reliably showing up as
data corruption in the fsx-linux test from LTP.
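
One reading of the two offsets being contrasted here can be sketched with plain bit arithmetic. The 4096-byte page size and the helper names offset_in_page_sketch() and page_start_sketch() are assumptions for illustration; they are not the names used in the CIFS code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* The masks below only work when the page size is a power of two. */
static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Offset of a file position within the page that contains it. */
static uint64_t offset_in_page_sketch(uint64_t pos)
{
    return pos & (SKETCH_PAGE_SIZE - 1);
}

/* File offset at which the page containing pos begins. */
static uint64_t page_start_sketch(uint64_t pos)
{
    return pos & ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
}
```

For a write starting at file position 10000, page_start_sketch() gives 8192 and offset_in_page_sketch() gives 1808; a routine that reads in a whole page needs the former.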

It also became evident that this code was occasionally doing unnecessary
read calls. Optimize those away by using the PG_checked flag to indicate
that the unwritten part of the page has been initialized.

CC: Nick Piggin
Acked-by: Dave Kleikamp
Signed-off-by: Jeff Layton
Signed-off-by: Steve French

Jeff Layton
2008-11-27 03:32:33 +0800

22 Nov, 2008

3 commits

3477d2046 UBIFS: pre-allocate bulk-read buffer

To avoid memory allocation failure during bulk-read, pre-allocate
a bulk-read buffer, so that if there is only one bulk-reader at
a time, it would just use the pre-allocated buffer and would not
do any memory allocation. However, if there is more than one
bulk-reader, then only one reader would use the pre-allocated buffer,
while the other readers would allocate buffers for themselves.
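
The "use the pre-allocated buffer if it is free, otherwise allocate" idea can be sketched in userspace as follows. This is an assumption-laden illustration, not the UBIFS code: a C11 atomic_flag stands in for whatever lock guards the shared buffer, malloc() stands in for kmalloc(), and the names bulk_buf_get() and bulk_buf_put() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define PREALLOC_SIZE (128 * 1024) /* size of the shared buffer */

static unsigned char prealloc_buf[PREALLOC_SIZE];
static atomic_flag prealloc_busy = ATOMIC_FLAG_INIT;

struct bulk_buf {
    unsigned char *data;
    bool from_prealloc; /* true if data points at prealloc_buf */
};

/*
 * Try to claim the shared buffer first. If another reader already holds
 * it, or the request does not fit, fall back to a private allocation.
 * Returns false only if no buffer could be obtained at all.
 */
static bool bulk_buf_get(struct bulk_buf *b, size_t len)
{
    assert(b != NULL);
    if (len <= PREALLOC_SIZE && !atomic_flag_test_and_set(&prealloc_busy)) {
        b->data = prealloc_buf;
        b->from_prealloc = true;
        return true;
    }
    b->data = malloc(len);
    b->from_prealloc = false;
    return b->data != NULL;
}

/* Release the shared buffer or free the private one. */
static void bulk_buf_put(struct bulk_buf *b)
{
    assert(b != NULL);
    if (b->from_prealloc)
        atomic_flag_clear(&prealloc_busy);
    else
        free(b->data);
    b->data = NULL;
}
```

A single reader never touches the allocator in this sketch; only a second, concurrent reader pays for an allocation, which it can still fail and recover from.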

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-11-22 00:59:33 +0800
6c0c42cdf UBIFS: do not allocate too much

Bulk-read allocates 128KiB or more using kmalloc. The allocation
starts failing often when the memory gets fragmented. UBIFS still
works fine in this case, though, because it falls back to the standard
(non-optimized) read method. This patch teaches bulk-read
to allocate exactly the amount of memory it needs, instead of
allocating 128KiB every time.

This patch is also a preparation to the further fix where we'll
have a pre-allocated bulk-read buffer as well. For example, now
the @bu object is prepared in 'ubifs_bulk_read()', so we could
pass either pre-allocated or allocated information to
'ubifs_do_bulk_read()' later. Or we could teach 'ubifs_do_bulk_read()'
not to allocate 'bu->buf' if it is already there.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-11-22 00:59:25 +0800
39ce81ce7 UBIFS: do not print scary memory allocation warnings

Bulk-read allocates a lot of memory with 'kmalloc()', and when memory
is or gets fragmented, 'kmalloc()' fails with a scary warning. But
because bulk-read is just an optimization, UBIFS keeps working fine.
Suppress the warning by passing the __GFP_NOWARN option to 'kmalloc()'.

This patch also introduces a macro for the magic 128KiB constant.
This is just neater.

Note, this does not really fix the problem we had, but just hides
the warnings. Further patches fix the problem.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-11-22 00:59:16 +0800

21 Nov, 2008

2 commits

0cb39aa0a Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Do not attempt to close invalidated file handles
[CIFS] fix check for dead tcon in smb_init

Linus Torvalds
2008-11-21 05:14:16 +0800
ddb4cbfc5 [CIFS] Do not attempt to close invalidated file handles

If a connection with open file handles has gone down
and come back up and reconnected without reopening
the file handle yet, do not attempt to send an SMB close
request for this handle in cifs_close. We were
checking for the connection being invalid in cifs_close
but since the connection may have been reconnected
we also need to check whether the file handle
was marked invalid (otherwise we could close the
wrong file handle by accident).

Acked-by: Jeff Layton
Signed-off-by: Steve French

Steve French
2008-11-21 04:14:13 +0800

20 Nov, 2008

3 commits

ea7e743e4 hostfs: fix a duplicated global function name

fs/hostfs/hostfs_user.c defines do_readlink() as non-static, and so does
fs/xfs/linux-2.6/xfs_ioctl.c when CONFIG_XFS_DEBUG=y. So rename
do_readlink() in hostfs to hostfs_do_readlink().

I think it would be better if the XFS guys also renamed their do_readlink();
it's not necessary to use such a general name.

Signed-off-by: WANG Cong
Cc: Jeff Dike
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

WANG Cong
2008-11-20 10:50:00 +0800
f9454548e don't unlink an active swapfile

Peter Cordes is sorry that he rm'ed his swapfiles while they were in use,
he then had no pathname to swapoff. It's a curious little oversight, but
not one worth a lot of hackery. Kudos to Willy Tarreau for turning this
around from a discussion of synthetic pathnames to how to prevent unlink.
Mimic immutable: prohibit unlinking an active swapfile in may_delete()
(and don't worry my little head over the tiny race window).

Signed-off-by: Hugh Dickins
Cc: Willy Tarreau
Acked-by: Christoph Hellwig
Cc: Peter Cordes
Cc: Bodo Eggert
Cc: David Newall
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2008-11-20 10:49:59 +0800
ac97b9f9a eCryptfs: Allocate up to two scatterlists for crypto ops on keys

I have received some reports of out-of-memory errors on some older AMD
architectures. These errors are what I would expect to see if
crypt_stat->key were split between two separate pages. eCryptfs should
not assume that any of the memory sent through virt_to_scatterlist() is
all contained in a single page, and so this patch allocates two
scatterlist structs instead of one when processing keys. I have received
confirmation from one person affected by this bug that this patch resolves
the issue for him, and so I am submitting it for inclusion in a future
stable release.
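
Why a small key can need two scatterlist entries is a matter of page arithmetic, sketched below. The 4096-byte page size and the helper name pages_spanned() are assumptions for illustration; this is not eCryptfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
    assert(len > 0);
    uintptr_t first = addr / SKETCH_PAGE_SIZE;
    uintptr_t last = (addr + len - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last - first + 1);
}
```

A 64-byte buffer that starts 16 bytes before a page boundary spans two pages, so any buffer shorter than a page needs at most two entries, one per page.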

Note that virt_to_scatterlist() runs sg_init_table() on the scatterlist
structs passed to it, so the calls to sg_init_table() in
decrypt_passphrase_encrypted_session_key() are redundant.

Signed-off-by: Michael Halcrow
Reported-by: Paulo J. S. Silva
Cc: "Leon Woestenberg"
Cc: Tim Gardner
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Halcrow
2008-11-20 10:49:58 +0800

19 Nov, 2008

1 commit

bfb59820e [CIFS] fix check for dead tcon in smb_init

This was recently changed to check for need_reconnect, but should
actually be a check for a tidStatus of CifsExiting.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French

Steve French
2008-11-19 00:33:48 +0800

18 Nov, 2008

7 commits

55e8e30c3 block/md: fix md autodetection

Block ext devt conversion missed md_autodetect_dev() call in
rescan_partitions() leaving md autodetect unable to see partitions.
Fix it.

Signed-off-by: Tejun Heo
Cc: Neil Brown
Signed-off-by: Jens Axboe

Tejun Heo
2008-11-18 22:08:56 +0800
ba32929a9 block: make add_partition() return pointer to hd_struct

Make add_partition() return pointer to the new hd_struct on success
and ERR_PTR() value on failure. This change will be used to fix md
autodetection bug.
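
The "pointer on success, ERR_PTR() value on failure" convention can be imitated in userspace as in the sketch below. The helpers err_ptr(), is_err() and ptr_err() only mimic the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); struct part and add_part_sketch() are hypothetical and are not the real hd_struct or add_partition().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095 /* error codes occupy the top 4095 addresses */

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if p is an encoded error rather than a usable pointer. */
static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}

/* Decode the errno value from an error pointer. */
static inline long ptr_err(const void *p)
{
    assert(is_err(p));
    return (long)(intptr_t)p;
}

struct part {
    int partno;
};

/* Returns a new object on success or an encoded error on failure. */
static struct part *add_part_sketch(int partno)
{
    struct part *p;

    if (partno <= 0)
        return err_ptr(-EINVAL);

    p = malloc(sizeof(*p));
    if (!p)
        return err_ptr(-ENOMEM);

    p->partno = partno;
    return p;
}
```

A caller checks is_err() on the result before using it, which lets one return value carry both the new object and the reason for a failure.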

Signed-off-by: Tejun Heo
Cc: Neil Brown
Signed-off-by: Jens Axboe

Tejun Heo
2008-11-18 22:08:56 +0800
eb60fa106 block: fix add_partition() error path

Partition stats structure was not freed on devt allocation failure
path. Fix it.

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe

Tejun Heo
2008-11-18 22:08:55 +0800
4e14e833a Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
prevent cifs_writepages() from skipping unwritten pages
Fixed parsing of mount options when doing DFS submount
[CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier patch
[CIFS] Fix build break
cifs: reinstate sharing of tree connections
[CIFS] minor cleanup to cifs_mount
cifs: reinstate sharing of SMB sessions sans races
cifs: disable sharing session and tcon and add new TCP sharing code
[CIFS] clean up server protocol handling
[CIFS] remove unused list, add new cifs sock list to prepare for mount/umount fix
[CIFS] Fix cifs reconnection flags
[CIFS] Can't rely on iov length and base when kernel_recvmsg returns error

Linus Torvalds
2008-11-18 12:53:31 +0800
b066a48c9 prevent cifs_writepages() from skipping unwritten pages

Fixes a data corruption under heavy stress in which pages could be left
dirty after all open instances of an inode have been closed.

In order to write contiguous pages whenever possible, cifs_writepages()
asks pagevec_lookup_tag() for more pages than it may write at one time.
Normally, it then resets index just past the last page written before calling
pagevec_lookup_tag() again.

If cifs_writepages() can't write the first page returned, it wasn't resetting
index, and the next call to pagevec_lookup_tag() resulted in skipping all of
the pages it previously returned, even though cifs_writepages() did nothing
with them. This can result in data loss when the file descriptor is about
to be closed.

This patch ensures that index gets set back to the next returned page so
that none get skipped.

Signed-off-by: Dave Kleikamp
Acked-by: Jeff Layton
Cc: Shirish S Pargaonkar
Signed-off-by: Steve French

Dave Kleikamp
2008-11-18 12:30:07 +0800
2c55608f2 Fixed parsing of mount options when doing DFS submount

Since these hit the same routines, and are relatively small, it is easier to review
them as one patch.

Fixed incorrect handling of the last option in some cases.
Fixed prefixpath handling: convert path_consumed into a host-dependent string length (in bytes).
Use a non-default separator if it is provided in the original mount options.

Acked-by: Jeff Layton
Signed-off-by: Igor Mammedov
Signed-off-by: Steve French

Igor Mammedov
2008-11-18 12:29:06 +0800
ab3f99298 [CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier patch

set tcon->ses earlier

If the initial tree connect fails, we'll end up calling cifs_put_smb_ses
with a NULL pointer. Fix it by setting the tcon->ses earlier.

Acked-by: Jeff Layton
Signed-off-by: Steve French

Steve French
2008-11-18 00:03:00 +0800

17 Nov, 2008

3 commits

c2b3382cd [CIFS] Fix build break

Signed-off-by: Steve French

Steve French
2008-11-17 11:57:13 +0800
f1987b44f cifs: reinstate sharing of tree connections

Use a similar approach to the SMB session sharing. Add a list of tcons
attached to each SMB session. Move the refcount to non-atomic. Protect
all of the above with the cifs_tcp_ses_lock. Add functions to
properly find and put references to the tcons.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French

Jeff Layton
2008-11-17 11:14:12 +0800
5c06fe772 Fix broken ownership of /proc/sys/ files

D'oh...

Signed-off-by: Al Viro
Reported-and-tested-by: Peter Palfrader
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Al Viro
2008-11-17 07:09:52 +0800

16 Nov, 2008

1 commit

8f7b0ba1c Fix inotify watch removal/umount races

Inotify watch removals suck violently.

To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex. That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount. We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.

Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_unmount_inodes() until
we are done. Cleanup is just deactivate_super().

However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait
until the superblock is shut down and the watch in question is pining
for fjords. That's fine, but there is a problem - we might have hit the
window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
the moment when superblock is past the point of no return and is heading
for shutdown) and the moment when deactivate_super() acquires
->s_umount.

We could just do drop_super(), yield() and retry, but that's rather
antisocial and this stuff is luser-triggerable. OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e. that
->s_root is non-NULL) we know that we won't race with
inotify_unmount_inodes().

So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong. We had
to drop ih->mutex before we could grab ->s_umount. So the watch
could've been gone already.

That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer. If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address. That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that. Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.

So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check. If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.
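
The "save the id, re-look it up, compare with our pointer" check can be sketched in userspace as follows. Everything here is an illustrative assumption: a fixed array stands in for the idr, and struct super, struct inode_s, struct watch, wd_find() and still_killable() are hypothetical names, not inotify code.

```c
#include <assert.h>
#include <stddef.h>

struct super { int id; };
struct inode_s { struct super *i_sb; };
struct watch { int wd; struct inode_s *inode; };

#define MAX_WD 16
static struct watch *wd_table[MAX_WD]; /* stand-in for the idr */

/* Look up the watch currently registered under wd, if any. */
static struct watch *wd_find(int wd)
{
    return (wd >= 0 && wd < MAX_WD) ? wd_table[wd] : NULL;
}

/*
 * After the locks have been re-taken: is whatever now lives under
 * saved_wd the object at our address, on the superblock we pinned?
 * Only the pointer returned by the lookup is dereferenced, because
 * the table vouches for it being alive.
 */
static int still_killable(const struct watch *watch, int saved_wd,
                          const struct super *sb)
{
    struct watch *found;

    assert(watch != NULL && sb != NULL);
    found = wd_find(saved_wd);
    return found != NULL && found == watch && found->inode->i_sb == sb;
}
```

If the lookup returns nothing, or a different object, the caller backs off; if it matches, the caller may proceed whether the object is the original or a newcomer at the same address.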

And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.

Signed-off-by: Al Viro
Acked-by: Greg KH
Signed-off-by: Linus Torvalds

Al Viro
2008-11-16 04:26:44 +0800

15 Nov, 2008

3 commits

d82c2df54 [CIFS] minor cleanup to cifs_mount

Signed-off-by: Steve French

Steve French
2008-11-15 08:07:26 +0800
14fbf50d6 cifs: reinstate sharing of SMB sessions sans races

We do this by abandoning the global list of SMB sessions and instead
moving to a per-server list. This entails adding a new list head to the
TCP_Server_Info struct. The refcounting for the cifsSesInfo is moved to
a non-atomic variable. We have to protect it by a lock anyway, so there's
no benefit to making it an atomic. The list and refcount are protected
by the global cifs_tcp_ses_lock.

The patch also adds new routines to find and put SMB sessions
that properly take and put references under the lock.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French

Jeff Layton
2008-11-15 07:56:55 +0800
e7ddee903 cifs: disable sharing session and tcon and add new TCP sharing code

The code that allows these structs to be shared is extremely racy.
Disable the sharing of SMB and tcon structs for now until we can
come up with a way to do this that's race free.

We want to continue to share TCP sessions, however, since they are
required for multiuser mounts. For that, implement a new (hopefully
race-free) scheme. Add a new global list of TCP sessions, and take
care to get a reference to it whenever we're dealing with one.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French

Jeff Layton
2008-11-15 07:42:32 +0800

14 Nov, 2008

5 commits

3ec332ef7 [CIFS] clean up server protocol handling

We're currently declaring both a sockaddr_in and sockaddr_in6 on the
stack, but we really only need storage for one of them. Declare a
sockaddr struct and cast it to the proper type. Also, eliminate the
protocolType field in the TCP_Server_Info struct. It's redundant since
we have a sa_family field in the sockaddr anyway.

We may need to revisit this if SCTP is ever implemented, but for now
this will simplify the code.

CIFS over IPv6 also has a number of problems currently. This fixes all
of them that I found. Eventually, it would be nice to move more of the
code to be protocol independent, but this is a start.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French

Steve French
2008-11-14 11:35:10 +0800
fb3960166 [CIFS] remove unused list, add new cifs sock list to prepare for mount/umount fix

Also adds two lines missing from the previous patch (for the need reconnect flag in the
/proc/fs/cifs/DebugData handling)

The new global_cifs_sock_list is added, and initialized in init_cifs but not used yet.
Jeff Layton will be adding code in to use that and to remove the GlobalTcon and GlobalSMBSession
lists.

CC: Jeff Layton
CC: Shirish Pargaonkar
Signed-off-by: Steve French

Steve French
2008-11-14 04:04:07 +0800
7b4236539 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: fix shutdown cleanup

Linus Torvalds
2008-11-14 03:56:05 +0800
3b7952109 [CIFS] Fix cifs reconnection flags

In preparation for Jeff's big umount/mount fixes to remove the possibility of
various races in cifs mount and linked list handling of sessions, sockets and
tree connections, this patch cleans up some repetitive code in cifs_mount,
and addresses a problem with ses->status and tcon->tidStatus in which we
were overloading the "need_reconnect" state with other status in that
field. So the "need_reconnect" flag has been broken out from those
two state fields (need reconnect was not mutually exclusive from some of the
other possible tid and ses states). In addition, a few exit cases in
cifs_mount were cleaned up, and a problem was fixed in which a tcon flag (for lease support)
was not being set consistently for the 2nd mount of the same share.

CC: Jeff Layton
CC: Shirish Pargaonkar
Signed-off-by: Steve French

Steve French
2008-11-14 03:45:32 +0800
278afcbf4 dlm: fix shutdown cleanup

Fixes a regression from commit 0f8e0d9a317406612700426fad3efab0b7bbc467,
"dlm: allow multiple lockspace creates".

An extraneous 'else' slipped into a code fragment being moved from
release_lockspace() to dlm_release_lockspace(). The result of the
unwanted 'else' is that dlm threads and structures are not stopped
and cleaned up when the final dlm lockspace is removed. Trying to
create a new lockspace again afterward will fail with
"kmem_cache_create: duplicate cache dlm_conn" because the cache
was not previously destroyed.

Signed-off-by: David Teigland

David Teigland
2008-11-14 03:22:34 +0800

13 Nov, 2008

1 commit

6cdfcc275 ext3: Clean up outdated and incorrect comment for ext3_write_super()

Signed-off-by: "Theodore Ts'o"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Theodore Tso
2008-11-13 09:17:17 +0800