20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

17 Jul, 2007

4 commits

  • Every file should include the headers containing the prototypes for
    its global functions.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • While working on unshare support for the network namespace I noticed we
    were putting clone flags in an int, which is weird because the syscall
    uses unsigned long and we at least need an unsigned to properly hold all of
    the unshare flags.

    So to make the code consistent, this patch updates the code to use
    unsigned long instead of int for the clone flags in those places
    where we get it wrong today.

    Signed-off-by: Eric W. Biederman
    Acked-by: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
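
A small userspace sketch of why the type matters here. The helper names and the bit-31 flag value are hypothetical and chosen only for illustration. Converting an out-of-range value to int is implementation-defined in C, so the sign-extension noted in the comments assumes a typical two's-complement target where unsigned long is 64 bits wide.

```c
/* Hypothetical flag occupying bit 31; not a specific clone flag. */
#define SKETCH_FLAG_BIT31 0x80000000UL

/* Pass the flags through an int on the way, as the old code paths did. */
static unsigned long flags_via_int(unsigned long flags)
{
    int narrowed = (int)flags;       /* implementation-defined if out of range */
    return (unsigned long)narrowed;  /* sign-extends where long is 64 bits */
}

/* Keep the flags in the same type the syscall uses. */
static unsigned long flags_via_ulong(unsigned long flags)
{
    unsigned long kept = flags;
    return kept;
}
```

Small flag values survive both paths. A flag in bit 31 may come back with extra high bits set after the detour through int, which is the kind of inconsistency the patch avoids by using unsigned long throughout.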
     
  • dup_mnt_ns() and clone_uts_ns() return NULL on failure. This is wrong:
    create_new_namespaces() uses ERR_PTR() to catch an error. This means that the
    subsequent create_new_namespaces() will hit BUG_ON() in copy_mnt_ns() or
    copy_utsname().

    Modify create_new_namespaces() to also use the errors returned by the
    copy_*_ns routines and not to systematically return ENOMEM.

    [oleg@tv-sign.ru: better changelog]
    Signed-off-by: Cedric Le Goater
    Cc: Serge E. Hallyn
    Cc: Badari Pulavarty
    Cc: Pavel Emelianov
    Cc: Herbert Poetzl
    Cc: Eric W. Biederman
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
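
The mismatch described above can be illustrated with a userspace re-creation of the error-pointer idiom. This is a sketch, not the kernel's code: every sketch_* name, the struct, and the 4095 limit are assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values encoded into a pointer. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long sketch_ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }

/* True only for pointers in the top SKETCH_MAX_ERRNO addresses; NULL is NOT an error here. */
static inline int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_ns { int id; };

/* Hypothetical copy routine: reports failure as an encoded errno, never as NULL. */
static struct sketch_ns *sketch_copy_ns(struct sketch_ns *storage, int fail_with)
{
    if (fail_with)
        return sketch_err_ptr(-fail_with);
    storage->id = 1;
    return storage;
}

/* Caller passes the callee's error through instead of always returning -ENOMEM. */
static int sketch_create(struct sketch_ns *storage, int fail_with)
{
    struct sketch_ns *ns = sketch_copy_ns(storage, fail_with);

    if (sketch_is_err(ns))
        return (int)sketch_ptr_err(ns);
    return 0;
}
```

Because sketch_is_err(NULL) is false, a callee that returns NULL on failure slips past a caller that only tests for error pointers. That is the situation the changelog describes.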
     
  • One more simple and stupid switching to the new API.

    Signed-off-by: Pavel Emelianov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     

09 May, 2007

4 commits

  • There's a missing check for CAP_SYS_ADMIN in do_change_type().

    Signed-off-by: Miklos Szeredi
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • There are many places in the kernel where a construction like

    foo = list_entry(head->next, struct foo_struct, list);

    is used.
    The code might look more descriptive and neat if it used the macro

    #define list_first_entry(head, type, member) \
        list_entry((head)->next, type, member)

    Here is the macro itself and examples of its usage in the generic code.
    If it turns out to be useful, I can prepare a set of patches to
    inject it into arch-specific code, drivers, networking, etc.

    Signed-off-by: Pavel Emelianov
    Signed-off-by: Kirill Korotaev
    Cc: Randy Dunlap
    Cc: Andi Kleen
    Cc: Zach Brown
    Cc: Davide Libenzi
    Cc: John McCutchan
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: john stultz
    Cc: Ram Pai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
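
A self-contained userspace sketch of the macro in use. The list_head type, list_entry and the sketch_* helpers are minimal re-creations written for this example, not copies of the kernel headers. struct foo_struct is the placeholder type from the message above.

```c
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled on the kernel's. */
struct list_head { struct list_head *next, *prev; };

static void sketch_list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Append entry just before head, i.e. at the tail of the list. */
static void sketch_list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Recover the containing structure from a pointer to its list member. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The proposed shorthand for "the first element after head". */
#define list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)

struct foo_struct {
    int value;
    struct list_head list;
};
```

With two elements appended in order, list_first_entry(&head, struct foo_struct, list) yields the first one appended. As with the open-coded form, calling it on an empty list is not meaningful, because head->next then points back at the head itself.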
     
  • There's a slight problem with filesystem type representation in fuse
    based filesystems.

    From the kernel's view, there are just two filesystem types: fuse and
    fuseblk. From the user's view there are lots of different filesystem
    types. The user is not even much concerned if the filesystem is fuse based
    or not. So there's a conflict of interest in how this should be
    represented in fstab, mtab and /proc/mounts.

    The current scheme is to encode the real filesystem type in the mount
    source. So an sshfs mount looks like this:

    sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,...

    This url-ish syntax works OK for sshfs and similar filesystems. However
    for block device based filesystems (ntfs-3g, zfs) it doesn't work, since
    the kernel expects the mount source to be a real device name.

    A possibly better scheme would be to encode the real type in the type
    field as "type.subtype". So fuse mounts would look like this:

    /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,...
    user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,...

    This patch adds the necessary code to the kernel so that this can be
    correctly displayed in /proc/mounts.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
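
To make the "type.subtype" convention concrete, here is a userspace sketch of splitting such a string at the first dot. The function name is hypothetical and this is not the kernel's parsing code. It only illustrates the naming scheme proposed above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split "type.subtype" at the first '.'.
 * Stores the length of the type part in *type_len and returns a pointer
 * to the subtype, or NULL when the string carries no subtype.
 */
static const char *sketch_split_fstype(const char *fstype, size_t *type_len)
{
    const char *dot = strchr(fstype, '.');

    if (dot == NULL) {
        *type_len = strlen(fstype);
        return NULL;
    }
    *type_len = (size_t)(dot - fstype);
    return dot + 1;
}
```

For "fuseblk.ntfs-3g" the type part is the first 7 characters and the subtype is "ntfs-3g". A plain type such as "ext3" has no subtype.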
     
  • sys_clone() and sys_unshare() both make copies of nsproxy and its associated
    namespaces. But they have different code paths.

    This patch merges all the nsproxy and its associated namespace copy/clone
    handling (as much as possible). Posted on container list earlier for
    feedback.

    - Create a new nsproxy and its associated namespaces and pass it back to
    caller to attach it to right process.

    - Changed all copy_*_ns() routines to return a new copy of namespace
    instead of attaching it to task->nsproxy.

    - Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines.

    - Removed unnecessary !ns checks from copy_*_ns() and added BUG_ON()
    just in case.

    - Get rid of all individual unshare_*_ns() routines and make use of
    copy_*_ns() instead.

    [akpm@osdl.org: cleanups, warning fix]
    [clg@fr.ibm.com: remove dup_namespaces() declaration]
    [serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval]
    [akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n]
    Signed-off-by: Badari Pulavarty
    Signed-off-by: Serge Hallyn
    Cc: Cedric Le Goater
    Cc: "Eric W. Biederman"
    Signed-off-by: Cedric Le Goater
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     

12 Feb, 2007

1 commit

  • Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
    corresponding "kmem_cache_zalloc()" call.

    Signed-off-by: Robert P. J. Day
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: Roland McGrath
    Cc: James Bottomley
    Cc: Greg KH
    Acked-by: Joel Becker
    Cc: Steven Whitehouse
    Cc: Jan Kara
    Cc: Michael Halcrow
    Cc: "David S. Miller"
    Cc: Stephen Smalley
    Cc: James Morris
    Cc: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
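
The same transformation has a familiar userspace analogue, sketched below with malloc()/memset() versus calloc(). This is an analogy only: the slab calls named above are kernel-only, and the struct and function names here are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_obj {
    int refs;
    void *data;
};

/* Before: allocate, then clear the object in a second step. */
static struct sketch_obj *sketch_alloc_then_clear(void)
{
    struct sketch_obj *obj = malloc(sizeof(*obj));

    if (obj)
        memset(obj, 0, sizeof(*obj));
    return obj;
}

/* After: one call that returns zeroed memory. */
static struct sketch_obj *sketch_zalloc(void)
{
    return calloc(1, sizeof(struct sketch_obj));
}
```

Both functions hand back an object whose bytes are all zero. The second form is shorter and cannot forget the clearing step.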
     

14 Dec, 2006

1 commit

  • Add "relatime" (relative atime) support. Relative atime only updates the
    atime if the previous atime is older than the mtime or ctime. Like
    noatime, but useful for applications like mutt that need to know when a
    file has been read since it was last modified.

    A corresponding patch against mount(8) is available at
    http://userweb.kernel.org/~akpm/mount-relative-atime.txt

    Signed-off-by: Valerie Henson
    Cc: Mark Fasheh
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Karel Zak
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Valerie Henson
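
The rule stated above can be written as a small predicate. This is a userspace sketch with a hypothetical function name, using whole-second time_t values. How equal timestamps are treated is an assumption of the sketch (here they do not trigger an update).

```c
#include <time.h>

/*
 * Relative atime: update only when the stored atime is older than
 * the mtime or the ctime. Returns 1 if atime should be updated.
 */
static int sketch_relatime_needs_update(time_t atime, time_t mtime, time_t ctime_ts)
{
    return atime < mtime || atime < ctime_ts;
}
```

A file modified after it was last read has atime older than mtime, so the next read records a new atime. After that, further reads leave atime alone until the file changes again, which is what lets a mail reader tell "read since last modification" apart from "not read".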
     

09 Dec, 2006

1 commit

  • Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with
    other namespaces being developed for the containers: pid, uts, ipc, etc.
    'namespace' variables and attributes are also renamed to 'mnt_ns'.

    Signed-off-by: Kirill Korotaev
    Signed-off-by: Cedric Le Goater
    Cc: Eric W. Biederman
    Cc: Herbert Poetzl
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Korotaev
     

08 Dec, 2006

1 commit

  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
        quilt add $file
        sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
        mv /tmp/$$ $file
        quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

02 Oct, 2006

1 commit

  • This moves the mount namespace into the nsproxy. The mount namespace count
    now refers to the number of nsproxies pointing to it, rather than the number
    of tasks. As a result, the unshare_namespace() function in kernel/fork.c no
    longer checks whether it is being shared.

    Signed-off-by: Serge Hallyn
    Cc: Kirill Korotaev
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Cc: Andrey Savochkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     

01 Oct, 2006

1 commit


30 Sep, 2006

1 commit


26 Sep, 2006

1 commit


01 Jul, 2006

1 commit


27 Jun, 2006

1 commit

  • This patch converts the combination of list_del(A) and list_add(A, B) to
    list_move(A, B).

    Cc: Greg Kroah-Hartman
    Cc: Ram Pai
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
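
A userspace sketch of the equivalence. The list_head type and the sketch_* helpers are minimal re-creations written for this example. A and B follow the naming in the message above.

```c
/* Minimal circular doubly linked list node, modelled on the kernel's. */
struct list_head { struct list_head *next, *prev; };

static void sketch_list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void sketch_list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from whatever list it is on. */
static void sketch_list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* The combined operation: take A off its list and put it after B. */
static void sketch_list_move(struct list_head *a, struct list_head *b)
{
    sketch_list_del(a);
    sketch_list_add(a, b);
}
```

Calling sketch_list_move(A, B) leaves both lists in the same state as the two-call sequence, with one call at the callsite instead of two.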
     

25 Jun, 2006

1 commit


23 Jun, 2006

1 commit

  • Extend the get_sb() filesystem operation to take an extra argument that
    permits the VFS to pass in the target vfsmount that defines the mountpoint.

    The filesystem is then required to manually set the superblock and root dentry
    pointers. For most filesystems, this should be done with simple_set_mnt()
    which will set the superblock pointer and then set the root dentry to the
    superblock's s_root (as per the old default behaviour).

    The get_sb() op now returns an integer as there's now no need to return the
    superblock pointer.

    This patch permits a superblock to be implicitly shared amongst several mount
    points, such as can be done with NFS to avoid potential inode aliasing. In
    such a case, simple_set_mnt() would not be called, and instead the mnt_root
    and mnt_sb would be set directly.

    The patch also makes the following changes:

    (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
    pointer argument and return an integer, so most filesystems have to change
    very little.

    (*) If one of the convenience functions is not used, then get_sb() should
    normally call simple_set_mnt() to instantiate the vfsmount. This will
    always return 0, and so can be tail-called from get_sb().

    (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
    dcache upon superblock destruction rather than shrink_dcache_anon().

    This is required because the superblock may now have multiple trees that
    aren't actually bound to s_root, but that still need to be cleaned up. The
    currently called functions assume that the whole tree is rooted at s_root,
    and that anonymous dentries are not the roots of trees, which results in
    dentries being left unculled.

    However, with the way NFS superblock sharing is currently set to be
    implemented, these assumptions are violated: the root of the filesystem is
    simply a dummy dentry and inode (the real inode for '/' may well be
    inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
    with child trees.

    [*] Anonymous until discovered from another tree.

    (*) The documentation has been adjusted, including the additional bit of
    changing ext2_* into foo_* in the documentation.

    [akpm@osdl.org: convert ipath_fs, do other stuff]
    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
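
The calling convention described above can be sketched in userspace with stand-in types. Everything prefixed sketch_ is hypothetical, the structs carry only the fields the text mentions, and the reference count bump is an assumption about how the mount would pin the root dentry.

```c
/* Stand-in types holding only the fields named in the text. */
struct sketch_dentry { int refcount; };
struct sketch_super_block { struct sketch_dentry *s_root; };
struct sketch_vfsmount {
    struct sketch_super_block *mnt_sb;
    struct sketch_dentry *mnt_root;
};

/* Set the superblock pointer, then root the mount at the superblock's s_root. */
static int sketch_simple_set_mnt(struct sketch_vfsmount *mnt,
                                 struct sketch_super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = sb->s_root;
    sb->s_root->refcount++;   /* assumption: the mount takes its own reference */
    return 0;                 /* always 0, so callers can tail-call it */
}

/* A filesystem's get_sb() now fills in the vfsmount and returns an int. */
static int sketch_foo_get_sb(struct sketch_vfsmount *mnt,
                             struct sketch_super_block *sb)
{
    /* ... the filesystem would locate or create sb before this point ... */
    return sketch_simple_set_mnt(mnt, sb);
}
```

A filesystem that shares one superblock among several mount points would skip the helper and assign mnt_sb and mnt_root itself, choosing a root other than s_root.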
     

09 Jun, 2006

2 commits


16 May, 2006

1 commit

  • Revert commit f6422f17d3a480f21917a3895e2a46b968f56a08, due to

    Valdis.Kletnieks@vt.edu wrote:
    >
    > There seems to have been a bug introduced in this changeset:
    >
    > Am running 2.6.17-rc3-mm1. When this changeset is applied, 'mount --bind'
    > misbehaves:
    >
    > > # mkdir /foo
    > > # mount -t tmpfs -o rw,nosuid,nodev,noexec,noatime,nodiratime none /foo
    > > # mkdir /foo/bar
    > > # mount --bind /foo/bar /foo
    > > # tail -2 /proc/mounts
    > > none /foo tmpfs rw,nosuid,nodev,noexec,noatime,nodiratime 0 0
    > > none /foo tmpfs rw 0 0
    >
    > Reverting this changeset causes both mounts to have the same options.
    >
    > (Thanks to Stephen Smalley for tracking down the changeset...)
    >

    Cc: Herbert Poetzl
    Cc: Christoph Hellwig
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

11 Apr, 2006

1 commit


28 Mar, 2006

1 commit


27 Mar, 2006

1 commit

  • While hunting with oprofile on an SMP platform, I discovered that dentry
    lookups were slowed down because d_hash_mask, d_hash_shift and
    dentry_hashtable were in a cache line that contained inodes_stat. So each
    time inodes_stat is changed by a cpu, other cpus have to refill their
    cache line.

    This patch moves some variables to the __read_mostly section, in order to
    avoid false sharing. RCU dentry lookups can go full speed.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
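
The idea of keeping read-mostly data away from frequently written data can be shown in userspace with explicit alignment. This is a different mechanism from the section placement the patch uses, shown only to illustrate false sharing. The struct, its field names, and the 64-byte line size are assumptions. The sketch needs a C11 compiler for _Alignas.

```c
#include <stddef.h>

#define SKETCH_CACHE_LINE 64   /* assumed cache line size in bytes */

struct sketch_dcache_globals {
    /* Read-mostly lookup parameters: grouped on one cache line. */
    _Alignas(SKETCH_CACHE_LINE) unsigned int hash_mask;
    unsigned int hash_shift;
    void *hashtable;

    /* Frequently written counter: pushed onto its own cache line. */
    _Alignas(SKETCH_CACHE_LINE) long nr_inodes;
};
```

With this layout, a CPU updating nr_inodes dirties only the second line, so other CPUs reading hash_mask, hash_shift and hashtable keep their cached copy of the first line.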
     

21 Mar, 2006

1 commit

  • Create a new file under /proc/self, called mountstats, where mounted file
    systems can export information (configuration options, performance counters,
    and so on). Use a mechanism similar to /proc/mounts and s_ops->show_options.

    This mechanism does not violate namespace security, and is safe to use while
    other processes are unmounting file systems.

    Thanks to Mike Waychison for his review and comments.

    Test-plan:
    Test concurrent mount/unmount operations while cat'ing /proc/self/mountstats.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

16 Mar, 2006

1 commit


08 Feb, 2006

2 commits


17 Jan, 2006

1 commit


12 Jan, 2006

1 commit


11 Jan, 2006

1 commit

  • Turn noatime and nodiratime into per-mount instead of per-sb flags.

    After all the preparations this is a rather trivial patch. The mount code
    needs to treat the two options as per-mount instead of per-superblock, and
    touch_atime needs to be changed to check the new MNT_ flags in addition to
    the MS_ flags that are kept for filesystems that are always
    noatime/nodiratime but not user settable anymore. Besides that core code,
    only nfs needed an update because it's leaving atime updates to the server
    and thus sets the S_NOATIME flag on every inode, but needs to know whether
    it's a real noatime mount for a getattr optimization.

    While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were
    only used by touch_atime.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
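
A userspace sketch of the check described above, where the per-mount flags are consulted in addition to the per-superblock ones. The function name and the SKETCH_* constants are hypothetical, and their bit values are arbitrary rather than the kernel's.

```c
/* Illustrative flag bits; the values are arbitrary for this sketch. */
#define SKETCH_MNT_NOATIME     0x1u   /* per-mount, user settable */
#define SKETCH_MNT_NODIRATIME  0x2u
#define SKETCH_SB_NOATIME      0x1u   /* per-superblock, set by the filesystem */
#define SKETCH_SB_NODIRATIME   0x2u

/* Returns 1 when the atime update should be skipped. */
static int sketch_skip_atime(unsigned int mnt_flags, unsigned int sb_flags, int is_dir)
{
    if ((mnt_flags & SKETCH_MNT_NOATIME) || (sb_flags & SKETCH_SB_NOATIME))
        return 1;
    if (is_dir &&
        ((mnt_flags & SKETCH_MNT_NODIRATIME) || (sb_flags & SKETCH_SB_NODIRATIME)))
        return 1;
    return 0;
}
```

Keeping the two flag words separate is what lets one superblock be mounted noatime in one place and with atime updates in another, while a filesystem that never wants atime updates can still say so once at the superblock level.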
     

10 Jan, 2006

1 commit


09 Jan, 2006

2 commits


09 Nov, 2005

1 commit

  • Most permission() calls have a struct nameidata * available. This helper
    takes that as an argument and thus makes sure we pass it down for lookup
    intents and prepares for per-mount read-only support where we need a struct
    vfsmount for checking whether a file is writeable.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

08 Nov, 2005

2 commits

  • An unbindable mount does not forward or receive propagation. An
    unbindable mount also disallows bind mounts. The semantics are as follows.

    Bind semantics:
    It is invalid to bind mount an unbindable mount.

    Move semantics:
    It is invalid to move an unbindable mount under a shared mount.

    Clone-namespace semantics:
    If a mount is unbindable in the parent namespace, the corresponding
    cloned mount in the child namespace becomes unbindable too. Note:
    there is a subtle difference: unbindable mounts cannot be bind mounted
    but can be cloned during clone-namespace.

    Signed-off-by: Ram Pai
    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Ram Pai
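
The three rules above can be restated as small predicates. This is a userspace sketch with hypothetical names. It encodes only what the message states about unbindable mounts. Passing the other mount types through unchanged in the clone case is a placeholder, since the text does not specify them.

```c
enum sketch_mount_type {
    SKETCH_OTHER,       /* any type the rules above do not single out */
    SKETCH_SHARED,
    SKETCH_UNBINDABLE
};

/* Bind semantics: an unbindable mount cannot be the source of a bind mount. */
static int sketch_may_bind(enum sketch_mount_type source)
{
    return source != SKETCH_UNBINDABLE;
}

/* Move semantics: an unbindable mount cannot be moved under a shared mount. */
static int sketch_may_move(enum sketch_mount_type source,
                           enum sketch_mount_type dest_parent)
{
    return !(source == SKETCH_UNBINDABLE && dest_parent == SKETCH_SHARED);
}

/* Clone-namespace semantics: the clone is allowed and stays unbindable. */
static enum sketch_mount_type sketch_type_after_clone_ns(enum sketch_mount_type parent)
{
    return parent;   /* placeholder pass-through for the other types */
}
```

The contrast between sketch_may_bind() and sketch_type_after_clone_ns() is the subtle difference noted above: an unbindable mount is refused as a bind source yet is still copied when a namespace is cloned.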
     
  • This makes bind, rbind, move, clone namespace and umount operations
    aware of the semantics of slave mount (see Documentation/sharedsubtree.txt
    in the last patch of the series for detailed description).

    Signed-off-by: Ram Pai
    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Ram Pai