18 Oct, 2007

5 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
    9p: remove sysctl
    9p: fix bad kconfig cross-dependency
    9p: soften invalidation in loose_mode
    9p: attach-per-user
    9p: rename uid and gid parameters
    9p: define session flags
    9p: Make transports dynamic

    Linus Torvalds
     
  • The 9P2000 protocol requires the authentication and permission checks to be
    done in the file server. For that reason every user that accesses the file
    server tree has to authenticate and attach to the server separately.
    Multiple users can share the same connection to the server.

    Currently v9fs does a single attach and executes all I/O operations as a
    single user. This makes using v9fs in a multiuser environment unsafe, as it
    depends on the client doing the permission checking.

    This patch improves the 9P2000 support by allowing every user to attach
    separately. The patch defines three modes of access (new mount option
    'access'):

    - attach-per-user (access=user) (default mode for 9P2000.u)
    If a user tries to access a file served by v9fs for the first time, v9fs
    sends an attach command to the server (Tattach) specifying the user. If
    the attach succeeds, the user can access the v9fs tree.
    As there is no uname->uid (string->integer) mapping yet, this mode works
    only with the 9P2000.u dialect.

    - allow only one user to access the tree (access=)
    Only the user with uid can access the v9fs tree. Other users that attempt
    to access it will get an EPERM error.

    - do all operations as a single user (access=any) (default for 9P2000)
    V9fs does a single attach and all operations are done as a single user.
    If this mode is selected, the v9fs behavior is identical with the current
    one.
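
The three access modes above can be sketched as a small userspace option parser. This is only an illustration, not the v9fs code: the enum, struct, and function names are hypothetical, and it assumes the single-user mode is selected by passing a decimal uid as the option value.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names: they mirror the three modes in the commit message,
 * not the identifiers used in the v9fs source. */
enum access_mode { ACCESS_USER, ACCESS_SINGLE, ACCESS_ANY };

struct access_opt {
    enum access_mode mode;
    unsigned long uid;      /* only meaningful for ACCESS_SINGLE */
};

/* Parse the value of an "access=" mount option. Returns 0 or -EINVAL. */
static int parse_access(const char *val, struct access_opt *out)
{
    char *end;

    if (strcmp(val, "user") == 0) {
        out->mode = ACCESS_USER;
        return 0;
    }
    if (strcmp(val, "any") == 0) {
        out->mode = ACCESS_ANY;
        return 0;
    }
    /* Assumption: anything else must be a plain decimal uid. */
    if (*val < '0' || *val > '9')
        return -EINVAL;
    errno = 0;
    out->uid = strtoul(val, &end, 10);
    if (errno != 0 || *end != '\0')
        return -EINVAL;
    out->mode = ACCESS_SINGLE;
    return 0;
}

/* In single-user mode every caller except the named uid gets -EPERM. */
static int check_access(const struct access_opt *opt, unsigned long caller_uid)
{
    if (opt->mode == ACCESS_SINGLE && caller_uid != opt->uid)
        return -EPERM;
    return 0;
}
```

In the other two modes the check always passes here, because the permission decision is left to the server (attach-per-user) or to the single attached identity (access=any).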

    Signed-off-by: Latchesar Ionkov
    Signed-off-by: Eric Van Hensbergen

    Latchesar Ionkov
     
  • Change the names of 'uid' and 'gid' parameters to the more appropriate
    'dfltuid' and 'dfltgid'. This also sets the default uid/gid to -2
    (aka nfsnobody)

    Signed-off-by: Latchesar Ionkov
    Signed-off-by: Eric Van Hensbergen

    Latchesar Ionkov
     
  • This patch abstracts out the interfaces to underlying transports so that
    new transports can be added as modules. This should also allow kernel
    configuration of transports without ifdef-hell.

    Signed-off-by: Eric Van Hensbergen

    Eric Van Hensbergen
     
  • Add missing IRQs and IRQ descriptions to /proc/interrupts.

    /proc/interrupts is most useful when it displays every IRQ vector in use by
    the system, not just those somebody thought would be interesting.

    This patch inserts the following vector displays to the i386 and x86_64
    platforms, as appropriate:

    rescheduling interrupts
    TLB flush interrupts
    function call interrupts
    thermal event interrupts
    threshold interrupts
    spurious interrupts

    A threshold interrupt occurs when ECC memory correction is occurring at too
    high a frequency. Thresholds are used by the ECC hardware as occasional
    ECC failures are part of normal operation, but long sequences of ECC
    failures usually indicate a memory chip that is about to fail.

    Thermal event interrupts occur when a temperature threshold has been
    exceeded for some CPU chip. IIRC, a thermal interrupt is also generated
    when the temperature drops back to a normal level.

    A spurious interrupt is an interrupt that was raised then lowered by the
    device before it could be fully processed by the APIC. Hence the APIC sees
    the interrupt but does not know what device it came from. For this case
    the APIC hardware will assume a vector of 0xff.

    Rescheduling, call, and TLB flush interrupts are sent from one CPU to
    another per the needs of the OS. Typically, their statistics would be used
    to discover if an interrupt flood of the given type has been occurring.
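
To check for such a flood from userspace, one could sum the per-CPU counters on a single /proc/interrupts row. The sketch below assumes a row shaped like "label: count count ... description"; the function name and the "RES" label used in the usage note are assumptions, not taken from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one /proc/interrupts row.
 * Assumed row shape: "<label>: <n0> <n1> ... <description text>".
 * Copies the label into 'label' and returns the total, or -1 if the
 * row has no "label:" prefix. */
static long long sum_irq_row(const char *row, char *label, size_t label_len)
{
    const char *colon = strchr(row, ':');
    const char *p;
    long long total = 0;
    size_t n;

    if (colon == NULL || label_len == 0)
        return -1;

    /* Copy the label without its leading blanks. */
    while (*row == ' ' || *row == '\t')
        row++;
    n = (size_t)(colon - row);
    if (n >= label_len)
        n = label_len - 1;
    memcpy(label, row, n);
    label[n] = '\0';

    /* Consume numeric columns until the description text starts. */
    p = colon + 1;
    for (;;) {
        char *end;
        unsigned long long v = strtoull(p, &end, 10);

        if (end == p)
            break;              /* not a number: description reached */
        total += (long long)v;
        p = end;
    }
    return total;
}
```

For a row such as "RES:   120    35   Rescheduling interrupts" this yields the label "RES" and a total of 155. Sampling the total twice and comparing the difference gives a rate.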

    AK: merged v2 and v4 which had some more tweaks
    AK: replace Local interrupts with Local timer interrupts
    AK: Fixed description of interrupt types.

    [ tglx: arch/x86 adaptation ]
    [ mingo: small cleanup ]

    Signed-off-by: Joe Korty
    Signed-off-by: Andi Kleen
    Cc: Tim Hockin
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Joe Korty
     

17 Oct, 2007

5 commits

  • Signed-off-by: Denis Cheng
    Cc: Rob Landley
    Cc: "Randy.Dunlap"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denis Cheng
     
  • Implement sending of quota messages via netlink interface. The advantage
    is that in userspace we can better decide what to do with the message - for
    example display a dialogue in your X session or just write the message to
    the console. As a bonus, we can get rid of problems with console locking
    deep inside filesystem code once we remove the old printing mechanism.

    Signed-off-by: Jan Kara
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • initrd/initramfs/ramdisk docs:
    - fix typos/spellos/grammar
    - clarify RAM disk config location
    - correct cpio option

    Acked-by: Bryan O'Sullivan
    Acked-by: Rob Landley
    Cc: Werner Almesberger
    Cc: H. Peter Anvin
    Signed-off-by: Randy Dunlap
    Acked-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • prepare/commit_write no longer returns AOP_TRUNCATED_PAGE since OCFS2 and
    GFS2 were converted to the new aops, so we can make some simplifications
    for that.

    [michal.k.k.piotrowski@gmail.com: fix warning]
    Signed-off-by: Nick Piggin
    Cc: Michael Halcrow
    Cc: Mark Fasheh
    Cc: Steven Whitehouse
    Signed-off-by: Michal Piotrowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • These are intended to replace prepare_write and commit_write with more
    flexible alternatives that are also able to avoid the buffered write
    deadlock problems efficiently (which prepare_write is unable to do).

    [mark.fasheh@oracle.com: API design contributions, code review and fixes]
    [akpm@linux-foundation.org: various fixes]
    [dmonakhov@sw.ru: new aop block_write_begin fix]
    Signed-off-by: Nick Piggin
    Signed-off-by: Mark Fasheh
    Signed-off-by: Dmitriy Monakhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

16 Oct, 2007

1 commit

  • * 'locks' of git://linux-nfs.org/~bfields/linux:
    nfsd: remove IS_ISMNDLCK macro
    Rework /proc/locks via seq_files and seq_list helpers
    fs/locks.c: use list_for_each_entry() instead of list_for_each()
    NFS: clean up explicit check for mandatory locks
    AFS: clean up explicit check for mandatory locks
    9PFS: clean up explicit check for mandatory locks
    GFS2: clean up explicit check for mandatory locks
    Cleanup macros for distinguishing mandatory locks
    Documentation: move locks.txt in filesystems/
    locks: add warning about mandatory locking races
    Documentation: move mandatory locking documentation to filesystems/
    locks: Fix potential OOPS in generic_setlease()
    Use list_first_entry in locks_wake_up_blocks
    locks: fix flock_lock_file() comment
    Memory shortage can result in inconsistent flocks state
    locks: kill redundant local variable
    locks: reverse order of posix_locks_conflict() arguments

    Linus Torvalds
     

13 Oct, 2007

1 commit

  • Big thanks go to Mathias Kolehmainen for reporting the bug, providing
    debug output and testing the patches I sent him to get it working.

    The fix was to stop calling ntfs_attr_set() at mount time as that causes
    balance_dirty_pages_ratelimited() to be called which on systems with
    little memory actually tries to go and balance the dirty pages which tries
    to take the s_umount semaphore but because we are still in fill_super()
    across which the VFS holds s_umount for writing this results in a
    deadlock.

    We now do the dirty work by hand by submitting individual buffers. This
    has the annoying "feature" that mounting can take a few seconds if the
    journal is large as we have to clear it all. One day someone should improve
    on this by deferring the journal clearing to a helper kernel thread so it
    can be done in the background but I don't have time for this at the moment
    and the current solution works fine so I am leaving it like this for now.

    Signed-off-by: Anton Altaparmakov
    Signed-off-by: Linus Torvalds

    Anton Altaparmakov
     

10 Oct, 2007

3 commits


12 Sep, 2007

3 commits


23 Aug, 2007

1 commit

  • Updates to the MAINTAINERS file and documentation for 9p to point to the
    swik wiki versus the outdated sf.net page. Also updated some email addresses
    and added pointers to papers which better describe the implementation and
    application of the Linux 9p client.

    Signed-off-by: Eric Van Hensbergen

    Eric Van Hensbergen
     

01 Aug, 2007

1 commit


20 Jul, 2007

6 commits

  • Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).

    Here is a short excerpt of the semantic patch performing
    this transformation:

    @@
    type T2;
    expression x;
    identifier f,fld;
    expression E;
    expression E1,E2;
    expression e1,e2,e3,y;
    statement S;
    @@

    x =
    - kmalloc
    + kzalloc
    (E1,E2)
    ... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
    - memset((T2)x,0,E1);

    @@
    expression E1,E2,E3;
    @@

    - kzalloc(E1 * E2,E3)
    + kcalloc(E1,E2,E3)
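
As a rough userspace analogue of the transformation above, malloc followed by memset collapses into a single calloc call. The kernel functions are not available outside the kernel, so this sketch substitutes the C library equivalents; the struct and function names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct item {
    int id;
    char name[16];
};

/* Before: allocate, then clear by hand (the kmalloc + memset shape). */
static struct item *alloc_items_old(size_t n)
{
    struct item *p = malloc(n * sizeof(*p));   /* n * size can overflow */

    if (p != NULL)
        memset(p, 0, n * sizeof(*p));
    return p;
}

/* After: one zeroing allocator call (the kzalloc/kcalloc shape).
 * calloc takes (count, element size), the same order as the first
 * two arguments of kcalloc. */
static struct item *alloc_items_new(size_t n)
{
    return calloc(n, sizeof(struct item));
}
```

Besides being shorter, the count-and-size form lets the allocator check the multiplication for overflow, which the hand-written product cannot do.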

    [akpm@linux-foundation.org: get kcalloc args the right way around]
    Signed-off-by: Yoann Padioleau
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Acked-by: Russell King
    Cc: Bryan Wu
    Acked-by: Jiri Slaby
    Cc: Dave Airlie
    Acked-by: Roland Dreier
    Cc: Jiri Kosina
    Acked-by: Dmitry Torokhov
    Cc: Benjamin Herrenschmidt
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Pierre Ossman
    Cc: Jeff Garzik
    Cc: "David S. Miller"
    Acked-by: Greg KH
    Cc: James Bottomley
    Cc: "Antonino A. Daplas"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoann Padioleau
     
  • This patch adds the documentation for /proc//coredump_filter.
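
A minimal sketch of how a process might build and write a coredump_filter value follows. It assumes the bit layout given in the kernel's proc documentation (bit 0 anonymous private, bit 1 anonymous shared, bit 2 file-backed private, bit 3 file-backed shared); the macro and function names are invented for this example, and the path is left to the caller.

```c
#include <stdio.h>

/* Bit positions as assumed above; the macro names are hypothetical. */
#define CD_ANON_PRIVATE  (1u << 0)
#define CD_ANON_SHARED   (1u << 1)
#define CD_FILE_PRIVATE  (1u << 2)
#define CD_FILE_SHARED   (1u << 3)

/* Format a filter mask as hexadecimal text, one value per line. */
static int format_filter(unsigned int mask, char *buf, size_t len)
{
    return snprintf(buf, len, "0x%x\n", mask);
}

/* Write the mask to the given path. Returns 0 on success, -1 on failure. */
static int write_filter(const char *path, unsigned int mask)
{
    char buf[16];
    int ok;
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;
    format_filter(mask, buf, sizeof buf);
    ok = fputs(buf, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would pass the coredump_filter path of the target process and a mask such as CD_ANON_PRIVATE | CD_ANON_SHARED, which formats as 0x3.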

    Signed-off-by: Hidehiro Kawai
    Cc: Alan Cox
    Cc: David Howells
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Cc: "Randy.Dunlap"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kawai, Hidehiro
     
  • The purpose of audit_bprm() is to log the argv array to a userspace daemon at
    the end of the execve system call. Since user-space hasn't had time to run,
    this array is still in pristine state on the process' stack; so no need to
    copy it, we can just grab it from there.

    In order to minimize the damage to audit_log_*() copy each string into a
    temporary kernel buffer first.

    Currently the audit code requires that the full argument vector fits in a
    single packet. So currently it does clip the argv size to a (sysctl) limit,
    but only when execve auditing is enabled.

    If the audit protocol gets extended to allow for multiple packets this check
    can be removed.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ollie Wild
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Change ->fault prototype. We now return an int, which contains
    VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
    FAULT_RET_ code tells the VM whether a page was found, whether it has been
    locked, and potentially other things. This is not quite the way he wanted
    it yet, but that's changed in the next patch (which requires changes to
    arch code).

    This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
    that a page is locked which requires filemap_nopage to go away (because we
    can no longer remain backward compatible without that flag), but we were
    going to do that anyway.

    struct fault_data is renamed to struct vm_fault as Linus asked. address
    is now a void __user * that we should firmly encourage drivers not to use
    without really good reason.

    The page is now returned via a page pointer in the vm_fault struct.
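
The two-byte return encoding described above can be sketched as follows. The constant values and helper names are illustrative only; the real VM_FAULT_ and FAULT_RET_ definitions live in the kernel headers and are not reproduced here.

```c
/* Hypothetical example codes, chosen only to show the byte layout. */
#define EX_FAULT_MINOR   0x01u   /* stands in for a VM_FAULT_xxx code */
#define EX_RET_LOCKED    0x02u   /* stands in for a FAULT_RET_xxx code */

/* Pack the fault code into the low byte and the return flags into
 * the next byte of an int. */
static int pack_fault(unsigned int fault_code, unsigned int ret_flags)
{
    return (int)(((ret_flags & 0xffu) << 8) | (fault_code & 0xffu));
}

/* Extract the low byte (the fault code). */
static unsigned int fault_code_of(int ret)
{
    return (unsigned int)ret & 0xffu;
}

/* Extract the second byte (the return flags). */
static unsigned int ret_flags_of(int ret)
{
    return ((unsigned int)ret >> 8) & 0xffu;
}
```

With these example values, pack_fault(EX_FAULT_MINOR, EX_RET_LOCKED) gives 0x0201, and each helper recovers its own byte without disturbing the other.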

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • There seems to be very little documentation about this callback in general.
    The locking in particular is a bit tricky, so it's worth having this in
    writing.

    Signed-off-by: Mark Fasheh
    Cc: Nick Piggin
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Fasheh
     
  • Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
    the virtual address -> file offset differently from linear mappings.

    ->populate is a layering violation because the filesystem/pagecache code
    should not need to know anything about the virtual memory mapping. The hitch here
    is that the ->nopage handler didn't pass down enough information (ie. pgoff).
    But it is more logical to pass pgoff rather than have the ->nopage function
    calculate it itself anyway (because that's a similar layering violation).

    Having the populate handler install the pte itself is likewise a nasty thing
    to be doing.

    This patch introduces a new fault handler that replaces ->nopage and
    ->populate and (later) ->nopfn. Most of the old mechanism is still in place
    so there is a lot of duplication and nice cleanups that can be removed if
    everyone switches over.

    The rationale for doing this in the first place is that nonlinear mappings are
    subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
    to duplicate the synchronisation logic rather than just consolidate the two.

    After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
    pagecache. Seems like a fringe functionality anyway.

    NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
    users have hit mainline yet.

    [akpm@linux-foundation.org: cleanup]
    [randy.dunlap@oracle.com: doc. fixes for readahead]
    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Nick Piggin
    Signed-off-by: Randy Dunlap
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

18 Jul, 2007

2 commits

  • Signed-off-by: Josef 'Jeff' Sipek
    Acked-by: Michael Halcrow
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef 'Jeff' Sipek
     
  • This patch adds the kernelcore= parameter for x86.

    Once all patches are applied, a new command-line parameter and a new
    sysctl exist. This patch adds the necessary documentation.

    From: Yasunori Goto

    When "kernelcore" boot option is specified, kernel can't boot up on ia64
    because of an infinite loop. In addition, the parsing code can be handled
    in an architecture-independent manner.

    This patch uses common code to handle the kernelcore= parameter. It is
    only available to architectures that support arch-independent zone-sizing
    (i.e. define CONFIG_ARCH_POPULATES_NODE_MAP). Other architectures will
    ignore the boot parameter.
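
Parsing a size argument of the form used by such boot parameters can be sketched in userspace as below. This approximates the behaviour of the kernel's memparse() helper and is not the kernel code; the function name is hypothetical, and it assumes the value is a number with an optional K, M, or G suffix.

```c
#include <stdlib.h>

/* Parse "<number>[K|M|G]" into bytes. If 'rest' is non-NULL it is set
 * to the first character after the parsed value. */
static unsigned long long parse_size(const char *s, const char **rest)
{
    char *end;
    unsigned long long v = strtoull(s, &end, 0);

    switch (*end) {
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        end++;
        break;
    default:
        break;
    }
    if (rest != NULL)
        *rest = end;
    return v;
}
```

For example, "512M" parses to 512 * 1024 * 1024 bytes; dividing by the page size would then give the number of pages to reserve for the kernel.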

    [bunk@stusta.de: make cmdline_parse_kernelcore() static]
    Signed-off-by: Mel Gorman
    Signed-off-by: Yasunori Goto
    Acked-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

17 Jul, 2007

4 commits

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
    [PATCH] ocfs2: zero_user_page conversion
    ocfs2: Support xfs style space reservation ioctls
    ocfs2: support for removing file regions
    ocfs2: update truncate handling of partial clusters
    ocfs2: btree support for removal of arbitrary extents
    ocfs2: Support creation of unwritten extents
    ocfs2: support writing of unwritten extents
    ocfs2: small cleanup of ocfs2_write_begin_nolock()
    ocfs2: btree changes for unwritten extents
    ocfs2: abstract btree growing calls
    ocfs2: use all extent block suballocators
    ocfs2: plug truncate into cached dealloc routines
    ocfs2: simplify deallocation locking
    ocfs2: harden buffer check during mapping of page blocks
    ocfs2: shared writeable mmap
    ocfs2: factor out write aops into nolock variants
    ocfs2: rework ocfs2_buffered_write_cluster()
    ocfs2: take ip_alloc_sem during entire truncate
    ocfs2: Add "preferred slot" mount option
    [KJ PATCH] Replacing memset(,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c
    ...

    Linus Torvalds
     
  • Update Documentation/filesystems/vfs.txt

    Signed-off-by: Borislav Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
     
  • Update the description of struct file_system_type and get_sb() in
    Documentation/filesystems/vfs.txt to match the current code.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
     
  • Documentation for the /proc/$pid/stat file.
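
As a small illustration of reading that file, the sketch below parses the first four fields of a stat line (pid, command name in parentheses, state, parent pid). The struct and function names are assumptions made for this example. Because the command name may itself contain spaces or parentheses, the parser looks for the last closing parenthesis.

```c
#include <stdio.h>
#include <string.h>

struct stat_head {
    int pid;
    char comm[64];
    char state;
    int ppid;
};

/* Parse "pid (comm) state ppid ..." from one stat line.
 * Returns 0 on success, -1 if the line does not match. */
static int parse_stat_head(const char *line, struct stat_head *out)
{
    const char *lparen = strchr(line, '(');
    const char *rparen = strrchr(line, ')');   /* last ')' ends comm */
    size_t n;

    if (lparen == NULL || rparen == NULL || rparen < lparen)
        return -1;
    if (sscanf(line, "%d", &out->pid) != 1)
        return -1;

    /* Copy the command name, truncating if it does not fit. */
    n = (size_t)(rparen - lparen - 1);
    if (n >= sizeof(out->comm))
        n = sizeof(out->comm) - 1;
    memcpy(out->comm, lparen + 1, n);
    out->comm[n] = '\0';

    if (sscanf(rparen + 1, " %c %d", &out->state, &out->ppid) != 2)
        return -1;
    return 0;
}
```

Splitting the line on whitespace alone would misparse a process whose name contains a space, which is why the parentheses are located first.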

    Signed-off-by: Kees Cook
    Cc: Rob Landley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

11 Jul, 2007

3 commits

  • Sometimes other drivers depend on particular configfs items. For
    example, ocfs2 mounts depend on a heartbeat region item. If that
    region item is removed with rmdir(2), the ocfs2 mount must BUG or go
    readonly. Not happy.

    This provides two additional API calls: configfs_depend_item() and
    configfs_undepend_item(). A client driver can call
    configfs_depend_item() on an existing item to tell configfs that it is
    depended on. configfs will then return -EBUSY from rmdir(2) for that
    item. When the item is no longer depended on, the client driver calls
    configfs_undepend_item() on it.

    These APIs cannot be called underneath any configfs callbacks, as
    they will conflict. They can block and allocate. A client driver
    probably shouldn't be calling them of its own gumption. Rather it should
    be providing an API that external subsystems call.

    How does this work? Imagine the ocfs2 mount process. When it mounts,
    it asks for a heartbeat region item. This is done via a call into the
    heartbeat code. Inside the heartbeat code, the region item is looked
    up. Here, the heartbeat code calls configfs_depend_item(). If it
    succeeds, then heartbeat knows the region is safe to give to ocfs2.
    If it fails, it was being torn down anyway, and heartbeat can gracefully
    pass up an error.
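
The pinning behaviour described above can be modelled with a toy userspace sketch. Nothing here is the configfs implementation: the struct, the function names, and the -ENOENT error for an item that is already gone are all assumptions chosen to imitate the described semantics.

```c
#include <errno.h>

/* Toy model of a configfs item that can be depended on. */
struct toy_item {
    int dependents;   /* how many callers currently depend on the item */
    int removed;      /* set once rmdir has succeeded */
};

/* Models configfs_depend_item(): fails if the item is already gone. */
static int toy_depend(struct toy_item *it)
{
    if (it->removed)
        return -ENOENT;   /* assumed error code for this sketch */
    it->dependents++;
    return 0;
}

/* Models configfs_undepend_item(). */
static void toy_undepend(struct toy_item *it)
{
    if (it->dependents > 0)
        it->dependents--;
}

/* Models rmdir(2) on the item: refused with -EBUSY while depended on. */
static int toy_rmdir(struct toy_item *it)
{
    if (it->dependents > 0)
        return -EBUSY;
    it->removed = 1;
    return 0;
}
```

In the ocfs2 scenario, the heartbeat code would play the role of the caller of toy_depend() at mount time and toy_undepend() at unmount, so that removing the region in between fails with -EBUSY instead of breaking the mount.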

    [ Fixed some bad whitespace in configfs.txt. --Mark ]

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Add a notification callback, ops->disconnect_notify(). It has the same
    prototype as ->drop_item(), but it will be called just before the item
    linkage is broken. This way, configfs users who want to do work while
    the object is still in the hierarchy have a chance.

    Client drivers will still need to config_item_put() in their
    ->drop_item(), if they implement it. They need do nothing in
    ->disconnect_notify(). They don't have to provide it if they don't
    care. But someone who wants to be notified before ci_parent is set to
    NULL can now be notified.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Convert the su_sem member of struct configfs_subsystem to a struct
    mutex, as that's what it is. Also convert all the users and update
    Documentation/configfs.txt and Documentation/configfs_example.c
    accordingly.

    [ Conflict in fs/dlm/config.c with commit
    3168b0780d06ace875696f8a648d04d6089654e5 manually resolved. --Mark ]

    Inspired-by: Satyam Sharma
    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     

09 Jun, 2007

1 commit

  • Randy Dunlap reports that a tmpfs, mounted with NUMA mpol= specifying an
    offline node, crashes as soon as data is allocated upon it. Now restrict
    mpol= to online nodes; previously it was only restricted to MAX_NUMNODES.

    Signed-off-by: Hugh Dickins
    Cc: Robin Holt
    Cc: Christoph Lameter
    Cc: Andi Kleen
    Tested-and-acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

25 May, 2007

1 commit


09 May, 2007

3 commits