08 Nov, 2011

1 commit

  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
    cpuidle: Single/Global registration of idle states
    cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
    cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
    cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
    ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
    ACPI: Export FADT pm_profile integer value to userspace
    thermal: Prevent polling from happening during system suspend
    ACPI: Drop ACPI_NO_HARDWARE_INIT
    ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
    PNPACPI: Simplify disabled resource registration
    ACPI: Fix possible recursive locking in hwregs.c
    ACPI: use kstrdup()
    mrst pmu: update comment
    tools/power turbostat: less verbose debugging

    Linus Torvalds
     

07 Nov, 2011

6 commits

  • …/kernel/git/jeremy/xen

    * 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
    jump-label: initialize jump-label subsystem much earlier
    x86/jump_label: add arch_jump_label_transform_static()
    s390/jump-label: add arch_jump_label_transform_static()
    jump_label: add arch_jump_label_transform_static() to optimise non-live code updates
    sparc/jump_label: drop arch_jump_label_text_poke_early()
    x86/jump_label: drop arch_jump_label_text_poke_early()
    jump_label: if a key has already been initialized, don't nop it out
    stop_machine: make stop_machine safe and efficient to call early
    jump_label: use proper atomic_t initializer

    Conflicts:
    - arch/x86/kernel/jump_label.c
    Added __init_or_module to arch_jump_label_text_poke_early vs
    removal of that function entirely
    - kernel/stop_machine.c
    same patch ("stop_machine: make stop_machine safe and efficient
    to call early") merged twice, with whitespace fix in one version

    Linus Torvalds
     
  • * 'upstream/xen-settime' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
    xen/dom0: set wallclock time in Xen
    xen: add dom0_op hypercall
    xen/acpi: Domain0 acpi parser related platform hypercall

    Linus Torvalds
     
  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     
  • * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    scsi: drop unused Kconfig symbol
    pci: drop unused Kconfig symbol
    stmmac: drop unused Kconfig symbol
    x86: drop unused Kconfig symbol
    powerpc: drop unused Kconfig symbols
    powerpc: 40x: drop unused Kconfig symbol
    mips: drop unused Kconfig symbols
    openrisc: drop unused Kconfig symbols
    arm: at91: drop unused Kconfig symbol
    samples: drop unused Kconfig symbol
    m32r: drop unused Kconfig symbol
    score: drop unused Kconfig symbols
    sh: drop unused Kconfig symbol
    um: drop unused Kconfig symbol
    sparc: drop unused Kconfig symbol
    alpha: drop unused Kconfig symbol

    Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig
    as per Michal: the STMMAC_DUAL_MAC config variable is still unused and
    should be deleted.

    Linus Torvalds
     
  • * 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    net: xen-netback: use API provided by xenbus module to map rings
    block: xen-blkback: use API provided by xenbus module to map rings
    xen: use generic functions instead of xen_{alloc, free}_vm_area()

    Linus Torvalds
     
  • referenced MeeGo, in particular, but really means Linux, in general.

    Signed-off-by: Len Brown

    Len Brown
     

04 Nov, 2011

1 commit

  • * 'for-next' of git://git.infradead.org/users/sameo/mfd-2.6: (80 commits)
    mfd: Fix missing abx500 header file updates
    mfd: Add missing include to intel_msic
    x86, mrst: add platform support for MSIC MFD driver
    mfd: Expose TurnOnStatus in ab8500 sysfs
    mfd: Remove support for early drop ab8500 chip
    mfd: Add support for ab8500 v3.3
    mfd: Add ab8500 interrupt disable hook
    mfd: Convert db8500-prcmu panic() into pr_crit()
    mfd: Refactor db8500-prcmu request_clock() function
    mfd: Rename db8500-prcmu init function
    mfd: Fix db5500-prcmu defines
    mfd: db8500-prcmu voltage domain consumers additions
    mfd: db8500-prcmu reset code retrieval
    mfd: db8500-prcmu tweak for modem wakeup
    mfd: Add db8500-pcmu watchdog accessor functions for watchdog
    mfd: hwacc power state db8500-prcmu accessor
    mfd: Add db8500-prcmu accessors for PLL and SGA clock
    mfd: Move to the new db500 PRCMU API
    mfd: Create a common interface for dbx500 PRCMU drivers
    mfd: Initialize DB8500 PRCMU regs
    ...

    Fix up trivial conflicts in
    arch/arm/mach-imx/mach-mx31moboard.c
    arch/arm/mach-omap2/board-omap3beagle.c
    arch/arm/mach-u300/include/mach/irqs.h
    drivers/mfd/wm831x-spi.c

    Linus Torvalds
     

03 Nov, 2011

5 commits

  • * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (21 commits)
    MAINTAINERS: add an entry for Edac Sandy Bridge driver
    edac: tag sb_edac as EXPERIMENTAL, as it requires more testing
    EDAC: Fix incorrect edac mode reporting in sb_edac
    edac: sb_edac: Add it to the building system
    edac: Add an experimental new driver to support Sandy Bridge CPU's
    i7300_edac: Fix error cleanup logic
    i7core_edac: Initialize memory name with cpu, channel, bank
    i7core_edac: Fix compilation on 32 bits arch
    i7core_edac: scrubbing fixups
    EDAC: Correct Kconfig dependencies
    i7core_edac: return -ENODEV if no MC is found
    i7core_edac: use edac's own way to print errors
    MAINTAINERS: remove dropped edac_mce.* from the file
    i7core_edac: Drop the edac_mce facility
    x86, MCE: Use notifier chain only for MCE decoding
    EDAC i7core: Use mce socketid for better compatibility
    i7core_edac: Don't enable memory scrubbing for Xeon 35xx
    i7core_edac: Add scrubbing support
    edac: Move edac main structs to include/linux/edac.h
    i7core_edac: Fix oops when trying to inject errors
    ...

    Linus Torvalds
     
  • Says Andrew:

    "60 patches. That's good enough for -rc1 I guess. I have quite a lot
    of detritus to be rechecked, work through maintainers, etc.

    - most of the remains of MM
    - rtc
    - various misc
    - cgroups
    - memcg
    - cpusets
    - procfs
    - ipc
    - rapidio
    - sysctl
    - pps
    - w1
    - drivers/misc
    - aio"

    * akpm: (60 commits)
    memcg: replace ss->id_lock with a rwlock
    aio: allocate kiocbs in batches
    drivers/misc/vmw_balloon.c: fix typo in code comment
    drivers/misc/vmw_balloon.c: determine page allocation flag can_sleep outside loop
    w1: disable irqs in critical section
    drivers/w1/w1_int.c: multiple masters used same init_name
    drivers/power/ds2780_battery.c: fix deadlock upon insertion and removal
    drivers/power/ds2780_battery.c: add a nolock function to w1 interface
    drivers/power/ds2780_battery.c: create central point for calling w1 interface
    w1: ds2760 and ds2780, use ida for id and ida_simple_get() to get it
    pps gpio client: add missing dependency
    pps: new client driver using GPIO
    pps: default echo function
    include/linux/dma-mapping.h: add dma_zalloc_coherent()
    sysctl: make CONFIG_SYSCTL_SYSCALL default to n
    sysctl: add support for poll()
    RapidIO: documentation update
    drivers/net/rionet.c: fix ethernet address macros for LE platforms
    RapidIO: fix potential null deref in rio_setup_device()
    RapidIO: add mport driver for Tsi721 bridge
    ...

    Linus Torvalds
     
  • This avoids duplicating the function in every arch gup_fast.

    Signed-off-by: Andrea Arcangeli
    Cc: Peter Zijlstra
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Benjamin Herrenschmidt
    Cc: David Gibson
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Michel while working on the working set estimation code, noticed that
    calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
    wasn't safe, if the pfn ended up being a tail page of a transparent
    hugepage under splitting by __split_huge_page_refcount().

    He then found the problem could also theoretically materialize with
    page_cache_get_speculative() during the speculative radix tree lookups
    that uses get_page_unless_zero() in SMP if the radix tree page is freed
    and reallocated and get_user_pages is called on it before
    page_cache_get_speculative has a chance to call get_page_unless_zero().

    So the best way to fix the problem is to keep page_tail->_count zero at
    all times. This will guarantee that get_page_unless_zero() can never
    succeed on any tail page. page_tail->_mapcount is guaranteed zero and
    is unused for all tail pages of a compound page, so we can simply
    account the tail page references there and transfer them to
    tail_page->_count in __split_huge_page_refcount() (in addition to the
    head_page->_mapcount).

    While debugging this s/_count/_mapcount/ change I also noticed get_page is
    called by direct-io.c on pages returned by get_user_pages. That wasn't
    entirely safe because the two atomic_inc in get_page weren't atomic. As
    opposed to other get_user_page users like secondary-MMU page fault to
    establish the shadow pagetables would never call any superflous get_page
    after get_user_page returns. It's safer to make get_page universally safe
    for tail pages and to use get_page_foll() within follow_page (inside
    get_user_pages()). get_page_foll() is safe to do the refcounting for tail
    pages without taking any locks because it is run within PT lock protected
    critical sections (PT lock for pte and page_table_lock for
    pmd_trans_huge).

    The standard get_page() as invoked by direct-io instead will now take
    the compound_lock but still only for tail pages. The direct-io paths
    are usually I/O bound and the compound_lock is per THP so very
    finegrined, so there's no risk of scalability issues with it. A simple
    direct-io benchmarks with all lockdep prove locking and spinlock
    debugging infrastructure enabled shows identical performance and no
    overhead. So it's worth it. Ideally direct-io should stop calling
    get_page() on pages returned by get_user_pages(). The spinlock in
    get_page() is already optimized away for no-THP builds but doing
    get_page() on tail pages returned by GUP is generally a rare operation
    and usually only run in I/O paths.

    This new refcounting on page_tail->_mapcount in addition to avoiding new
    RCU critical sections will also allow the working set estimation code to
    work without any further complexity associated to the tail page
    refcounting with THP.

    Signed-off-by: Andrea Arcangeli
    Reported-by: Michel Lespinasse
    Reviewed-by: Michel Lespinasse
    Reviewed-by: Minchan Kim
    Cc: Peter Zijlstra
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Benjamin Herrenschmidt
    Cc: David Gibson
    Cc:
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • * 'for-linus' of git://github.com/richardweinberger/linux: (90 commits)
    um: fix ubd cow size
    um: Fix kmalloc argument order in um/vdso/vma.c
    um: switch to use of drivers/Kconfig
    UserModeLinux-HOWTO.txt: fix a typo
    UserModeLinux-HOWTO.txt: remove ^H characters
    um: we need sys/user.h only on i386
    um: merge delay_{32,64}.c
    um: distribute exports to where exported stuff is defined
    um: kill system-um.h
    um: generic ftrace.h will do...
    um: segment.h is x86-only and needed only there
    um: asm/pda.h is not needed anymore
    um: hw_irq.h can go generic as well
    um: switch to generic-y
    um: clean Kconfig up a bit
    um: a couple of missing dependencies...
    um: kill useless argument of free_chan() and free_one_chan()
    um: unify ptrace_user.h
    um: unify KSTK_...
    um: fix gcov build breakage
    ...

    Linus Torvalds
     

02 Nov, 2011

25 commits


01 Nov, 2011

2 commits

  • Remove edac_mce pieces and use the normal MCE decoder notifier chain by
    retaining the same functionality with considerably less code.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Mauro Carvalho Chehab

    Borislav Petkov
     
  • The basic idea behind cross memory attach is to allow MPI programs doing
    intra-node communication to do a single copy of the message rather than a
    double copy of the message via shared memory.

    The following patch attempts to achieve this by allowing a destination
    process, given an address and size from a source process, to copy memory
    directly from the source process into its own address space via a system
    call. There is also a symmetrical ability to copy from the current
    process's address space into a destination process's address space.

    - Use of /proc/pid/mem has been considered, but there are issues with
    using it:
    - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
    written to would need to be contiguous.
    - Currently mem_read allows only processes who are currently
    ptrace'ing the target and are still able to ptrace the target to read
    from the target. This check could possibly be moved to the open call,
    but its not clear exactly what race this restriction is stopping
    (reason appears to have been lost)
    - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
    domain socket is a bit ugly from a userspace point of view,
    especially when you may have hundreds if not (eventually) thousands
    of processes that all need to do this with each other
    - Doesn't allow for some future use of the interface we would like to
    consider adding in the future (see below)
    - Interestingly reading from /proc/pid/mem currently actually
    involves two copies! (But this could be fixed pretty easily)

    As mentioned previously use of vmsplice instead was considered, but has
    problems. Since you need the reader and writer working co-operatively if
    the pipe is not drained then you block. Which requires some wrapping to
    do non blocking on the send side or polling on the receive. In all to all
    communication it requires ordering otherwise you can deadlock. And in the
    example of many MPI tasks writing to one MPI task vmsplice serialises the
    copying.

    There are some cases of MPI collectives where even a single copy interface
    does not get us the performance gain we could. For example in an
    MPI_Reduce rather than copy the data from the source we would like to
    instead use it directly in a mathops (say the reduce is doing a sum) as
    this would save us doing a copy. We don't need to keep a copy of the data
    from the source. I haven't implemented this, but I think this interface
    could in the future do all this through the use of the flags - eg could
    specify the math operation and type and the kernel rather than just
    copying the data would apply the specified operation between the source
    and destination and store it in the destination.

    Although we don't have a "second user" of the interface (though I've had
    some nibbles from people who may be interested in using it for intra
    process messaging which is not MPI). This interface is something which
    hardware vendors are already doing for their custom drivers to implement
    fast local communication. And so in addition to this being useful for
    OpenMPI it would mean the driver maintainers don't have to fix things up
    when the mm changes.

    There was some discussion about how much faster a true zero copy would
    go. Here's a link back to the email with some testing I did on that:

    http://marc.info/?l=linux-mm&m=130105930902915&w=2

    There is a basic man page for the proposed interface here:

    http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

    This has been implemented for x86 and powerpc, other architecture should
    mainly (I think) just need to add syscall numbers for the process_vm_readv
    and process_vm_writev. There are 32 bit compatibility versions for
    64-bit kernels.

    For arch maintainers there are some simple tests to be able to quickly
    verify that the syscalls are working correctly here:

    http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

    Signed-off-by: Chris Yeoh
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Arnd Bergmann
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: James Morris
    Cc:
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christopher Yeoh