16 Mar, 2011

20 commits

  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
    perf probe: Clean up probe_point_lazy_walker() return value
    tracing: Fix irqoff selftest expanding max buffer
    tracing: Align 4 byte ints together in struct tracer
    tracing: Export trace_set_clr_event()
    tracing: Explain about unstable clock on resume with ring buffer warning
    ftrace/graph: Trace function entry before updating index
    ftrace: Add .ref.text as one of the safe areas to trace
    tracing: Adjust conditional expression latency formatting.
    tracing: Fix event alignment: skb:kfree_skb
    tracing: Fix event alignment: mce:mce_record
    tracing: Fix event alignment: kvm:kvm_hv_hypercall
    tracing: Fix event alignment: module:module_request
    tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
    tracing: Remove lock_depth from event entry
    perf header: Stop using 'self'
    perf session: Use evlist/evsel for managing perf.data attributes
    perf top: Don't let events to eat up whole header line
    perf top: Fix events overflow in top command
    ring-buffer: Remove unused #include <linux/trace_irq.h>
    tracing: Add an 'overwrite' trace_option.
    ...

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    rtmutex: tester: Remove the remaining BKL leftovers
    lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
    rtmutex: Simplify PI algorithm and make highest prio task get lock
    rwsem: Remove redundant asmregparm annotation
    rwsem: Move duplicate function prototypes to linux/rwsem.h
    rwsem: Unify the duplicate rwsem_is_locked() inlines
    rwsem: Move duplicate init macros and functions to linux/rwsem.h
    rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
    x86: Cleanup rwsem_count_t typedef
    rwsem: Cleanup includes
    locking: Remove deprecated lock initializers
    cred: Replace deprecated spinlock initialization
    kthread: Replace deprecated spinlock initialization
    xtensa: Replace deprecated spinlock initialization
    um: Replace deprecated spinlock initialization
    sparc: Replace deprecated spinlock initialization
    mips: Replace deprecated spinlock initialization
    cris: Replace deprecated spinlock initialization
    alpha: Replace deprecated spinlock initialization
    rtmutex-tester: Remove BKL tests

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic()
    futex: Deobfuscate handle_futex_death()
    plist: Add priority list test
    plist: Shrink struct plist_head
    futex,plist: Remove debug lock assignment from plist_node
    futex,plist: Pass the real head of the priority list to plist_del()
    futex: Sanitize futex ops argument types
    futex: Sanitize cmpxchg_futex_value_locked API
    futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic()
    futex: Avoid redudant evaluation of task_pid_vnr()
    futex: Update futex_wait_setup comments about locking

    Linus Torvalds
     
  • …/kernel/git/tip/linux-2.6-tip

    * 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    debugobjects: Add hint for better object identification

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits)
    tidy the trailing symlinks traversal up
    Turn resolution of trailing symlinks iterative everywhere
    simplify link_path_walk() tail
    Make trailing symlink resolution in path_lookupat() iterative
    update nd->inode in __do_follow_link() instead of after do_follow_link()
    pull handling of one pathname component into a helper
    fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH
    Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
    readlinkat(), fchownat() and fstatat() with empty relative pathnames
    Allow O_PATH for symlinks
    New kind of open files - "location only".
    ext4: Copy fs UUID to superblock
    ext3: Copy fs UUID to superblock.
    vfs: Export file system uuid via /proc/<pid>/mountinfo
    unistd.h: Add new syscalls numbers to asm-generic
    x86: Add new syscalls for x86_64
    x86: Add new syscalls for x86_32
    fs: Remove i_nlink check from file system link callback
    fs: Don't allow to create hardlink for deleted file
    vfs: Add open by file handle support
    ...

    Linus Torvalds
     
  • The new vfs locking scheme introduced in 2.6.38 breaks NFS sillyrename
    because the latter relies on being able to determine the parent
    directory of the dentry in the ->iput() callback in order to send the
    appropriate unlink rpc call.

    Looking at the code that cares about races with dput(), there doesn't
    seem to be anything that specifically uses d_parent as a test for
    whether or not there is a race:
    - __d_lookup_rcu(), __d_lookup() all test for d_hashed() after d_parent
    - shrink_dcache_for_umount() is safe since nothing else can rearrange
    the dentries in that super block.
    - have_submount(), select_parent() and d_genocide() can test for a
    deletion if we set the DCACHE_DISCONNECTED flag when the dentry
    is removed from the parent's d_subdirs list.

    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org (2.6.38, needs commit c826cb7dfce8 "dcache.c:
    create helper function for duplicated functionality" )
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • This creates a helper function for the "try to ascend into the parent
    directory" case, which was written out in triplicate before. With all
    the locking and subtle sequence number stuff, we really don't want to
    duplicate that kind of code.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * pull the handling of current->total_link_count into
    __do_follow_link()
    * put the common "do ->put_link() if needed and path_put() the link"
    stuff into a helper (put_link(nd, link, cookie))
    * rename __do_follow_link() to follow_link(), while we are at it

    Signed-off-by: Al Viro

    Al Viro
     
  • The last remaining place (resolution of nested symlink) converted
    to the loop of the same kind we have in path_lookupat() and
    path_openat().

    Note that we still *do* have a recursion in pathname resolution;
    can't avoid it, really. However, it's strictly for nested symlinks
    now - i.e. ones in the middle of a pathname.

    link_path_walk() has lost the tail now - it always walks everything
    except the last component.

    do_follow_link() renamed to nested_symlink() and moved down.

    Signed-off-by: Al Viro

    Al Viro
     
  • Now that link_path_walk() is called without LOOKUP_PARENT
    only from do_follow_link(), we can simplify the checks in
    last component handling. First of all, checking if we'd
    arrived at a directory is not needed - the caller will check
    it anyway. And LOOKUP_FOLLOW is guaranteed to be there,
    since we only get to that place with nd->depth > 0.

    Signed-off-by: Al Viro

    Al Viro
     
  • Now the only caller of link_path_walk() that does *not* pass
    LOOKUP_PARENT is do_follow_link()

    Signed-off-by: Al Viro

    Al Viro
     
  • ... and note that we only need to do it for LAST_BIND symlinks

    Signed-off-by: Al Viro

    Al Viro
     
  • new helper: walk_component(). Handles everything except symlinks;
    returns negative on error, 0 on success and 1 on symlinks we decided
    to follow. Drops out of RCU mode on such symlinks.

    link_path_walk() and do_last() switched to using that.

    Signed-off-by: Al Viro

    Al Viro
     
  • We don't want to allow creation of private hardlinks by a different
    application using the fd passed to it via SCM_RIGHTS. So limit the null
    relative name usage in the linkat syscall to CAP_DAC_READ_SEARCH.

    Signed-off-by: Aneesh Kumar K.V

    Aneesh Kumar K.V
     
  • Newer compilers (gcc 4.6) complain about:

    return ret < 0 ?: 0;

    For the following reason:

    util/probe-finder.c: In function ‘probe_point_lazy_walker’:
    util/probe-finder.c:1331:18: error: the omitted middle operand in ?: will always be ‘true’, suggest explicit middle operand [-Werror=parentheses]

    And indeed the return value is a somewhat obscure (but correct) value
    of 'true', so return 'ret' instead - this is cleaner and unconfuses
    GCC as well.

    Cc: Arnaldo Carvalho de Melo
    Cc: Masami Hiramatsu
    Cc: Frederic Weisbecker
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
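
Why the old expression yields 'true' rather than the error code: GNU C's `x ?: y` behaves like `x ? x : y`, and `<` binds tighter than `?:`, so the value returned is the comparison result itself. A minimal sketch, with the extension written out explicitly so it compiles without the warning; the function names are hypothetical and are not taken from util/probe-finder.c:

```c
/* What `return ret < 0 ?: 0;` computes, written without the GNU
 * extension: the omitted middle operand is the condition itself. */
int old_return_value(int ret)
{
    return (ret < 0) ? (ret < 0) : 0;   /* 1 on error, 0 otherwise */
}

/* The form the commit message describes as cleaner. */
int new_return_value(int ret)
{
    return ret;                         /* the error code itself */
}
```

With ret = -22, the first form returns 1 and the second returns -22.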
     
  • * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
    xen: suspend: remove xen_hvm_suspend
    xen: suspend: pull pre/post suspend hooks out into suspend_info
    xen: suspend: move arch specific pre/post suspend hooks into generic hooks
    xen: suspend: refactor non-arch specific pre/post suspend hooks
    xen: suspend: add "arch" to pre/post suspend hooks
    xen: suspend: pass extra hypercall argument via suspend_info struct
    xen: suspend: refactor cancellation flag into a structure
    xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding
    xen: switch to new schedop hypercall by default.
    xen: use new schedop interface for suspend
    xen: do not respond to unknown xenstore control requests
    xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled
    xen: PV on HVM: support PV spinlocks and IPIs
    xen: make the ballon driver work for hvm domains
    xen-blkfront: handle Xen major numbers other than XENVBD
    xen: do not use xen_info on HVM, set pv_info name to "Xen HVM"
    xen: no need to delay xen_setup_shutdown_event for hvm guests anymore

    Linus Torvalds
     
  • …git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

    * 'stable/ia64' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen: ia64 build broken due to "xen: switch to new schedop hypercall by default."

    * 'stable/blkfront-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen: Union the blkif_request request specific fields

    * 'stable/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen: annotate functions which only call into __init at start of day
    xen p2m: annotate variable which appears unused
    xen: events: mark cpu_evtchn_mask_p as __refdata

    Linus Torvalds
     
  • * 'stable/irq.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen: events: remove dom0 specific xen_create_msi_irq
    xen: events: use xen_bind_pirq_msi_to_irq from xen_create_msi_irq
    xen: events: push set_irq_msi down into xen_create_msi_irq
    xen: events: update pirq_to_irq in xen_create_msi_irq
    xen: events: refactor xen_create_msi_irq slightly
    xen: events: separate MSI PIRQ allocation from PIRQ binding to IRQ
    xen: events: assume PHYSDEVOP_get_free_pirq exists
    xen: pci: collapse apic_register_gsi_xen_hvm and xen_hvm_register_pirq
    xen: events: return irq from xen_allocate_pirq_msi
    xen: events: drop XEN_ALLOC_IRQ flag to xen_allocate_pirq_msi
    xen: events: do not leak IRQ from xen_allocate_pirq_msi when no pirq available.
    xen: pci: only define xen_initdom_setup_msi_irqs if CONFIG_XEN_DOM0

    Linus Torvalds
     
  • …el.org/pub/scm/linux/kernel/git/konrad/xen

    * 'stable/irq.rework' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/irq: Cleanup up the pirq_to_irq for DomU PV PCI passthrough guests as well.
    xen: Use IRQF_FORCE_RESUME
    xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
    xen: Fix compile error introduced by "switch to new irq_chip functions"
    xen: Switch to new irq_chip functions
    xen: Remove stale irq_chip.end
    xen: events: do not free legacy IRQs
    xen: events: allocate GSIs and dynamic IRQs from separate IRQ ranges.
    xen: events: add xen_allocate_irq_{dynamic, gsi} and xen_free_irq
    xen:events: move find_unbound_irq inside CONFIG_PCI_MSI
    xen: handled remapped IRQs when enabling a pcifront PCI device.
    genirq: Add IRQF_FORCE_RESUME

    * 'stable/pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    pci/xen: When free-ing MSI-X/MSI irq->desc also use generic code.
    pci/xen: Cleanup: convert int** to int[]
    pci/xen: Use xen_allocate_pirq_msi instead of xen_allocate_pirq
    xen-pcifront: Sanity check the MSI/MSI-X values
    xen-pcifront: don't use flush_scheduled_work()

    Linus Torvalds
     
  • …l.org/pub/scm/linux/kernel/git/konrad/xen

    * 'stable/p2m-identity.v4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/m2p: Check whether the MFN has IDENTITY_FRAME bit set..
    xen/m2p: No need to catch exceptions when we know that there is no RAM
    xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set.
    xen/debugfs: Add 'p2m' file for printing out the P2M layout.
    xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
    xen/mmu: WARN_ON when racing to swap middle leaf.
    xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN.
    xen/mmu: Add the notion of identity (1-1) mapping.
    xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.

    * 'stable/e820' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/e820: Don't mark balloon memory as E820_UNUSABLE when running as guest and fix overflow.
    xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps.

    Linus Torvalds
     

15 Mar, 2011

20 commits

  • Just need to make sure that the AF_UNIX garbage collector won't
    confuse an O_PATHed socket on a filesystem for a real opened
    AF_UNIX socket.

    Signed-off-by: Al Viro

    Al Viro
     
  • For readlinkat() we simply allow empty pathname; it will fail unless
    we have dfd equal to O_PATH-opened symlink, so we are outside of
    POSIX scope here. For fchownat() and fstatat() we allow AT_EMPTY_PATH;
    let the caller explicitly ask for such behaviour.

    Signed-off-by: Al Viro

    Al Viro
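
A minimal userspace sketch of the fstatat() half of this, assuming a Linux libc that exposes AT_EMPTY_PATH under _GNU_SOURCE. The helper names are hypothetical, and the fallback define is only there in case the headers hide the flag:

```c
#define _GNU_SOURCE             /* for AT_EMPTY_PATH in <fcntl.h> */
#include <fcntl.h>
#include <sys/stat.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000    /* Linux value; fallback if headers hide it */
#endif

/* stat the object that `fd` itself refers to, via an empty pathname. */
int stat_fd_itself(int fd, struct stat *st)
{
    return fstatat(fd, "", st, AT_EMPTY_PATH);
}

/* Same call without the flag: the empty pathname is rejected
 * (the call returns -1 with errno set to ENOENT). */
int stat_empty_without_flag(int fd, struct stat *st)
{
    return fstatat(fd, "", st, 0);
}
```

The caller has to opt in with the flag; an empty pathname by itself keeps failing as before.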
     
  • At that point we can do almost nothing with them. They can be opened
    with O_PATH, we can manipulate such descriptors with dup(), etc. and
    we can see them in /proc/*/{fd,fdinfo}/*.

    We can't (and won't be able to) follow /proc/*/fd/* symlinks for those;
    there's simply not enough information for pathname resolution to go on
    from such point - to resolve a symlink we need to know which directory
    it lives in.

    We will be able to do useful things with them after the next commit, though -
    readlinkat() and fchownat() will be possible to use with dfd being an
    O_PATH-opened symlink and empty relative pathname. Combined with
    open_by_handle() it'll give us a way to do readlink-by-handle and
    lchown-by-handle without messing with more redundant syscalls.

    Signed-off-by: Al Viro

    Al Viro
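
The readlinkat() use described above can be sketched from userspace as follows. This assumes a kernel and libc with O_PATH support; `readlink_via_fd` is a hypothetical helper name, and the fallback define carries the x86/ARM value only:

```c
#define _GNU_SOURCE             /* for O_PATH in <fcntl.h> */
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000        /* value on x86 and ARM; fallback only */
#endif

/* Read a symlink's target through an O_PATH descriptor.
 * Returns the target length, or -1 on error; `buf` is NUL-terminated
 * on success. */
ssize_t readlink_via_fd(const char *linkpath, char *buf, size_t size)
{
    if (size == 0)
        return -1;

    /* O_NOFOLLOW makes the descriptor refer to the link, not its target. */
    int fd = open(linkpath, O_PATH | O_NOFOLLOW);
    if (fd < 0)
        return -1;

    /* Empty relative pathname: operate on the symlink `fd` refers to. */
    ssize_t n = readlinkat(fd, "", buf, size - 1);
    close(fd);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```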
     
  • New flag for open(2) - O_PATH. Semantics:
    * pathname is resolved, but the file itself is _NOT_ opened
    as far as filesystem is concerned.
    * almost all operations on the resulting descriptors shall
    fail with -EBADF. Exceptions are:
    1) operations on descriptors themselves (i.e.
    close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
    fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
    fcntl(fd, F_SETFD, ...))
    2) fcntl(fd, F_GETFL), for a common non-destructive way to
    check if descriptor is open
    3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
    points of pathname resolution
    * closing such descriptor does *NOT* affect dnotify or
    posix locks.
    * permissions are checked as usual along the way to file;
    no permission checks are applied to the file itself. Of course,
    giving such thing to syscall will result in permission checks (at
    the moment it means checking that starting point of ...at() is
    a directory and caller has exec permissions on it).

    fget() and fget_light() return NULL on such descriptors; use of
    fget_raw() and fget_raw_light() is needed to get them. That protects
    existing code from dealing with those things.

    There are two things still missing (they come in the next commits):
    one is handling of symlinks (right now we refuse to open them that
    way; see the next commit for semantics related to those) and another
    is descriptor passing via SCM_RIGHTS datagrams.

    Signed-off-by: Al Viro

    Al Viro
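
A small userspace sketch of two of the semantics listed above: ordinary I/O on an O_PATH descriptor fails with EBADF, while the descriptor still works as the "dfd" of an ...at() call. The helper names are hypothetical, and the fallback define carries the x86/ARM value only:

```c
#define _GNU_SOURCE             /* for O_PATH in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000        /* value on x86 and ARM; fallback only */
#endif

/* Returns the errno produced by read() on an O_PATH descriptor for
 * `path`, 0 if read() unexpectedly succeeds, or -1 if open() fails. */
int o_path_read_errno(const char *path)
{
    char byte;
    int fd = open(path, O_PATH);
    if (fd < 0)
        return -1;
    int err = (read(fd, &byte, 1) < 0) ? errno : 0;
    close(fd);
    return err;
}

/* stat `name` relative to `dir`, using an O_PATH descriptor as the
 * starting point of pathname resolution. */
int stat_relative(const char *dir, const char *name, struct stat *st)
{
    int dfd = open(dir, O_PATH | O_DIRECTORY);
    if (dfd < 0)
        return -1;
    int rc = fstatat(dfd, name, st, 0);
    close(dfd);
    return rc;
}
```

For example, `o_path_read_errno("/")` yields EBADF, whereas `stat_relative("/", ".", &st)` succeeds.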
     
  • File system UUID is made available to applications
    via /proc/<pid>/mountinfo

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • File system UUID is made available to applications
    via /proc/<pid>/mountinfo

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • We add a per superblock uuid field. File systems should
    update the uuid in the fill_super callback.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • This patch adds new syscalls to x86_64

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • This patch adds new syscalls to x86_32

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • Now that VFS checks for inode->i_nlink == 0 and returns a proper
    error, remove the similar check from the file systems

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • Add inode->i_nlink == 0 check in VFS. Some of the file systems
    do this internally. A followup patch will remove those instances.
    This is needed to ensure that with link by handle we don't allow
    creating a hardlink to an unlinked file. The check also prevents a race
    between unlink and link.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
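
The check can be pictured with a toy userspace model. This is a sketch only, not the kernel code: `struct toy_inode` and `toy_vfs_link()` are hypothetical stand-ins, there is no locking here (so the unlink/link race is not modelled), and the -ENOENT return value is an assumption about what the VFS reports:

```c
#include <errno.h>

/* Toy stand-in for the one inode field the check looks at. */
struct toy_inode {
    unsigned int i_nlink;       /* number of names referring to the inode */
};

/* Refuse to add a name to an inode that has already been unlinked;
 * otherwise account for the new link. */
int toy_vfs_link(struct toy_inode *inode)
{
    if (inode->i_nlink == 0)
        return -ENOENT;         /* file is already deleted */
    inode->i_nlink++;
    return 0;
}
```

Doing the test once in the generic layer is what lets the per-filesystem copies of the same check go away.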
     
  • [AV: duplicate of open() guts removed; file_open_root() used instead]

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
     
  • The syscall also returns a mount id which can be used
    to look up file system specific information such as uuid
    in /proc/<pid>/mountinfo

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Aneesh Kumar K.V
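
One way to use the returned mount id from userspace is to match it against the first field of each mountinfo line, which holds the mount ID. A minimal sketch with hypothetical helper names; lines longer than the caller's buffer are not handled:

```c
#include <stdio.h>

/* Return the mount ID (first field) of one mountinfo line, or -1 if
 * the line does not start with a number. */
int mountinfo_mount_id(const char *line)
{
    int id;
    return sscanf(line, "%d", &id) == 1 ? id : -1;
}

/* Scan an open mountinfo-style stream for the line whose mount ID is
 * `mount_id`. On success the line is left in `out` and 0 is returned;
 * otherwise -1. */
int find_mount_line(FILE *f, int mount_id, char *out, size_t outsz)
{
    while (fgets(out, (int)outsz, f)) {
        if (mountinfo_mount_id(out) == mount_id)
            return 0;
    }
    return -1;
}
```

The remaining fields of the matching line can then be parsed for whatever file system specific information is needed.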
     
  • Linus Torvalds
     
  • For name_to_handle_at(2) we'll want an ...at()-style syscall that
    is also usable for non-directory descriptors (with an empty relative
    pathname). Introduce a new flag (AT_EMPTY_PATH) to deal with that,
    along with the corresponding LOOKUP_EMPTY; teach user_path_at() and
    path_init() to deal with the latter.

    Signed-off-by: Al Viro

    Al Viro
     
  • * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
    MN10300: atomic_read() should ensure it emits a load
    MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
    MN10300: Proper use of macros get_user() in the case of incremented pointers

    Linus Torvalds
     
  • * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (26 commits)
    MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
    MIPS: MTX-1: Make au1000_eth probe all PHY addresses
    MIPS: Jz4740: Add HAVE_CLK
    MIPS: Move idle task creation to work queue
    MIPS, Perf-events: Use unsigned delta for right shift in event update
    MIPS, Perf-events: Work with the new callchain interface
    MIPS, Perf-events: Fix event check in validate_event()
    MIPS, Perf-events: Work with the new PMU interface
    MIPS, Perf-events: Work with irq_work
    MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
    MIPS: Loongson: Fix potentially wrong string handling
    MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
    MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
    MIPS: Remove unused code from arch/mips/kernel/syscall.c
    MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
    MIPS: MSP: Fix MSP71xx bpci interrupt handler return value
    MIPS: Select R4K timer lib for all MSP platforms
    MIPS: Loongson: Remove ad-hoc cmdline default
    MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
    MIPS: Add an unreachable return statement to satisfy buggy GCCs.
    ...

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: ce4100: Set pci ops via callback instead of module init
    x86/mm: Fix pgd_lock deadlock
    x86/mm: Handle mm_fault_error() in kernel space
    x86: Don't check for BIOS corruption in first 64K when there's no need to

    Linus Torvalds
     
  • This reverts the parent commit. I hate doing that, but it's generating
    some discussion ("half of it is right"), and since I am planning on
    doing the 2.6.38 release later today we can punt it to stable if
    required. Let's not rock the boat right now.

    Signed-off-by: Linus Torvalds

    Linus Torvalds