15 Oct, 2009

1 commit


13 Oct, 2009

2 commits

  • recvmmsg means "receive multiple messages": one call receives several
    datagrams, reducing the number of syscalls and net stack entry/exit
    operations.

    Next patches will introduce mechanisms where protocols that want to
    optimize this operation will provide an unlocked_recvmsg operation.

    This takes into account comments made by:

    . Paul Moore: sock_recvmsg is called only for the first datagram,
    sock_recvmsg_nosec is used for the rest.

    . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
    works in the same fashion as the ppoll one.

    If the underlying protocol returns a datagram with MSG_OOB set, this
    will make recvmmsg return right away with as many datagrams (+ the OOB
    one) as it has received so far.

    . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
    datagrams and then recvmsg returns an error, recvmmsg will return
    the successfully received datagrams, store the error and return it
    in the next call.

    This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
    where we will be able to acquire the lock only at batch start and end, not at
    every underlying recvmsg call.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
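
The batching described above can be sketched from user space as follows. This is a minimal illustration and not code from the patch: the names `recv_batch`, `demo_recvmmsg`, `BATCH` and `BUFSZ` are hypothetical, and it assumes a Linux system whose C library exposes `recvmmsg()` under `_GNU_SOURCE`. It passes `MSG_DONTWAIT` so the call returns whatever is already queued instead of waiting for all `BATCH` slots to fill.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

#define BATCH 8    /* hypothetical vlen: max datagrams per call */
#define BUFSZ 256  /* hypothetical per-datagram buffer size */

/*
 * Receive up to BATCH datagrams from fd with a single recvmmsg() call.
 * lens[i] is set to the length of datagram i.
 * Returns the number of datagrams received, or -1 with errno set.
 */
int recv_batch(int fd, char bufs[BATCH][BUFSZ], unsigned int lens[BATCH])
{
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
    int i, n;

    assert(fd >= 0);
    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = BUFSZ;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* One syscall for the whole batch; do not block if the queue runs dry. */
    n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, &timeout);
    for (i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;
    return n;
}

/* Queue three datagrams on a local socket pair, then drain them in one call. */
int demo_recvmmsg(unsigned int lens[BATCH])
{
    static char bufs[BATCH][BUFSZ];
    int sv[2], n = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (send(sv[0], "a", 1, 0) == 1 &&
        send(sv[0], "bb", 2, 0) == 2 &&
        send(sv[0], "ccc", 3, 0) == 3)
        n = recv_batch(sv[1], bufs, lens);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

A caller would normally loop on `recv_batch()` and treat a short count as "queue drained for now"; per the commit message, an error hit after some datagrams were received is reported by the next call.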
     
  • Create a new socket level option to report number of queue overflows

    Recently I augmented the AF_PACKET protocol to report the number of frames lost
    on the socket receive queue between any two enqueued frames. This value was
    exported via a SOL_PACKET level cmsg. After I completed that work it was
    requested that this feature be generalized so that any datagram oriented socket
    could make use of this option. As such I've created this patch. It creates a
    new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
    SOL_SOCKET level cmsg that reports the number of times the sk_receive_queue
    overflowed between any two given frames. It also augments the AF_PACKET
    protocol to take advantage of this new feature (as it previously did not touch
    sk->sk_drops, which this patch uses to record the overflow count). Tested
    successfully by me.

    Notes:

    1) Unlike my previous patch, this patch simply records the sk_drops value, which
    is not a number of drops between packets, but rather a total number of drops.
    Deltas must be computed in user space.

    2) While this patch currently works with datagram oriented protocols, it will
    also be accepted by non-datagram oriented protocols. I'm not sure if that's
    agreeable to everyone, but my argument in favor of doing so is that, for those
    protocols which aren't applicable to this option, sk_drops will always be zero,
    and reporting no drops on a receive queue that isn't used for those
    non-participating protocols seems reasonable to me. This also saves us having
    to code in a per-protocol opt in mechanism.

    3) This applies cleanly to net-next assuming that commit
    977750076d98c7ff6cbda51858bb5a5894a9d9ab (my af packet cmsg patch) is reverted

    Signed-off-by: Neil Horman
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Neil Horman
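
Note 1 leaves the delta computation to user space. A rough sketch of what a receiver might do is shown below. The helper names (`enable_rxq_ovfl`, `read_drop_total`, `drops_since`) are hypothetical, the fallback value 40 for `SO_RXQ_OVFL` is an assumption that holds on most but not all architectures, and the sketch assumes the counter arrives as a 32-bit unsigned value in the cmsg payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SO_RXQ_OVFL
#define SO_RXQ_OVFL 40  /* assumption: value used on most architectures */
#endif

/* Ask the kernel to attach the socket's drop counter to received messages. */
int enable_rxq_ovfl(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_RXQ_OVFL, &on, sizeof(on));
}

/*
 * Scan the control messages filled in by recvmsg().
 * Returns 1 and stores the running drop total if the cmsg is present,
 * 0 otherwise.
 */
int read_drop_total(struct msghdr *msg, uint32_t *total)
{
    struct cmsghdr *c;

    assert(msg != NULL && total != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SO_RXQ_OVFL) {
            memcpy(total, CMSG_DATA(c), sizeof(*total));
            return 1;
        }
    }
    return 0;
}

/*
 * The kernel reports a running total, so the number of drops between two
 * received frames is the difference of two totals. Unsigned subtraction
 * keeps the result correct across a 32-bit counter wraparound.
 */
uint32_t drops_since(uint32_t prev_total, uint32_t cur_total)
{
    return cur_total - prev_total;
}
```

The receiver would keep the last total it saw per socket and call `drops_since()` after each `recvmsg()` that carries the cmsg.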
     

29 Sep, 2009

1 commit

  • Implement page mapping percpu first chunk allocator as a fallback to
    the embedding allocator. The next patch will make the embedding
    allocator check distances between units to determine whether it fits
    within the vmalloc area so that this fallback can be used on such
    cases.

    sparc64 currently has a relatively small vmalloc area, which makes it
    impossible to create any dynamic chunks on certain configurations,
    leading to percpu allocation failures. This and the next patch should
    allow those configurations to keep working until a proper solution is
    found.

    While at it, mark pcpu_cpu_distance() with __init.

    Signed-off-by: Tejun Heo
    Acked-by: David S. Miller

    Tejun Heo
     

27 Sep, 2009

1 commit


24 Sep, 2009

6 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits)
    cpumask: Move deprecated functions to end of header.
    cpumask: remove unused deprecated functions, avoid accusations of insanity
    cpumask: use new-style cpumask ops in mm/quicklist.
    cpumask: use mm_cpumask() wrapper: x86
    cpumask: use mm_cpumask() wrapper: um
    cpumask: use mm_cpumask() wrapper: mips
    cpumask: use mm_cpumask() wrapper: mn10300
    cpumask: use mm_cpumask() wrapper: m32r
    cpumask: use mm_cpumask() wrapper: arm
    cpumask: Use accessors for cpu_*_mask: um
    cpumask: Use accessors for cpu_*_mask: powerpc
    cpumask: Use accessors for cpu_*_mask: mips
    cpumask: Use accessors for cpu_*_mask: m32r
    cpumask: remove arch_send_call_function_ipi
    cpumask: arch_send_call_function_ipi_mask: s390
    cpumask: arch_send_call_function_ipi_mask: powerpc
    cpumask: arch_send_call_function_ipi_mask: mips
    cpumask: arch_send_call_function_ipi_mask: m32r
    cpumask: arch_send_call_function_ipi_mask: alpha
    cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64
    ...

    Linus Torvalds
     
  • * remove asm/atomic.h inclusion from linux/utsname.h --
    not needed after kref conversion
    * remove linux/utsname.h inclusion from files which do not need it

    NOTE: it looks like fs/binfmt_elf.c does not need utsname.h, however
    due to some personality stuff it _is_ needed -- cowardly leave ELF-related
    headers and files alone.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Now everyone is converted to arch_send_call_function_ipi_mask, remove
    the shim and the #defines.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • These were replaced by topology_core_cpumask and topology_thread_cpumask.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Signed-off-by: Rusty Russell

    Rusty Russell
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits)
    Use macros for .data.page_aligned section.
    Use macros for .bss.page_aligned section.
    Use new __init_task_data macro in arch init_task.c files.
    kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts.
    arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0
    kbuild: add static to prototypes
    kbuild: fail build if recordmcount.pl fails
    kbuild: set -fconserve-stack option for gcc 4.5
    kbuild: echo the record_mcount command
    gconfig: disable "typeahead find" search in treeviews
    kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling
    checkincludes.pl: add option to remove duplicates in place
    markup_oops: use modinfo to avoid confusion with underscored module names
    checkincludes.pl: provide usage helper
    checkincludes.pl: close file as soon as we're done with it
    ctags: usability fix
    kernel hacking: move STRIP_ASM_SYMS from General
    gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma
    kbuild: Check if linker supports the -X option
    kbuild: introduce ld-option
    ...

    Fix trivial conflict in scripts/basic/fixdep.c

    Linus Torvalds
     

23 Sep, 2009

1 commit

  • gcc permitting variable length arrays makes the current construct used for
    BUILD_BUG_ON() useless, as that doesn't produce any diagnostic if the
    controlling expression isn't really constant. Instead, this patch makes
    it so that a bit field gets used here. Consequently, those uses where the
    condition isn't really constant now also need fixing.

    Note that in the gfp.h, kmemcheck.h, and virtio_config.h cases
    MAYBE_BUILD_BUG_ON() really just serves documentation purposes - even if
    the expression is compile time constant (__builtin_constant_p() yields
    true), the array is still deemed of variable length by gcc, and hence the
    whole expression doesn't have the intended effect.

    [akpm@linux-foundation.org: make arch/sparc/include/asm/vio.h compile]
    [akpm@linux-foundation.org: more nonsensical assertions in tpm.c..]
    Signed-off-by: Jan Beulich
    Cc: Andi Kleen
    Cc: Rusty Russell
    Cc: Catalin Marinas
    Cc: "David S. Miller"
    Cc: Rajiv Andrade
    Cc: Mimi Zohar
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
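
The bit-field approach described above can be illustrated with a small stand-alone sketch. The macro names `MY_BUILD_BUG_ON_ZERO` and `MY_BUILD_BUG_ON` and the `packet_header` struct are hypothetical and chosen to avoid implying this is the exact kernel definition. The sketch relies on GCC/Clang accepting a struct that contains only an unnamed bit-field.

```c
#include <assert.h>
#include <stddef.h>

/*
 * If cond is true, -!!(cond) is -1 and the unnamed bit-field gets a
 * negative width, which is a hard compile error. If cond is false the
 * width is 0, which is valid. A bit-field width must be an integer
 * constant expression, so a non-constant cond is rejected as well,
 * unlike the older array-size trick, where the compiler may silently
 * treat char[1 - 2*!!(cond)] as a variable length array.
 */
#define MY_BUILD_BUG_ON_ZERO(cond) (sizeof(struct { int : -!!(cond); }))
#define MY_BUILD_BUG_ON(cond)      ((void)MY_BUILD_BUG_ON_ZERO(cond))

/* Hypothetical on-wire header whose size must stay at 4 bytes. */
struct packet_header {
    unsigned char type;
    unsigned char flags;
    unsigned short len;
};

int header_layout_ok(void)
{
    /* Compiles: the condition is false, so the bit-field width is 0. */
    MY_BUILD_BUG_ON(sizeof(struct packet_header) != 4);

    /*
     * Uncommenting the next line breaks the build with a
     * "negative width in bit-field" style diagnostic:
     *
     * MY_BUILD_BUG_ON(sizeof(struct packet_header) != 8);
     */
    return 1;
}
```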
     

22 Sep, 2009

3 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
    trivial: fix typo in aic7xxx comment
    trivial: fix comment typo in drivers/ata/pata_hpt37x.c
    trivial: typo in kernel-parameters.txt
    trivial: fix typo in tracing documentation
    trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
    trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
    trivial: remove unnecessary semicolons
    trivial: Fix duplicated word "options" in comment
    trivial: kbuild: remove extraneous blank line after declaration of usage()
    trivial: improve help text for mm debug config options
    trivial: doc: hpfall: accept disk device to unload as argument
    trivial: doc: hpfall: reduce risk that hpfall can do harm
    trivial: SubmittingPatches: Fix reference to renumbered step
    trivial: fix typos "man[ae]g?ment" -> "management"
    trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
    trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
    trivial: fix missing printk space in amd_k7_smp_check
    trivial: fix typo s/ketymap/keymap/ in comment
    trivial: fix typo "to to" in multiple files
    trivial: fix typos in comments s/DGBU/DBGU/
    ...

    Linus Torvalds
     
  • Add a flag for mmap that will be used to request a huge page region that
    will look like anonymous memory to user space. This is accomplished by
    using a file on the internal vfsmount. MAP_HUGETLB is a modifier of
    MAP_ANONYMOUS and so must be specified with it. The region will behave
    the same as a MAP_ANONYMOUS region using small pages.

    The patch also adds the MAP_STACK flag, which was previously defined only
    on some architectures but not on others. Since MAP_STACK is meant to be a
    hint only, architectures can define it without assigning a specific
    meaning to it.

    Signed-off-by: Arnd Bergmann
    Cc: Eric B Munson
    Cc: Hugh Dickins
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
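
From user space, requesting such a region might look like the sketch below. It is an illustration under stated assumptions, not code from the patch: `alloc_region` and `HUGE_LEN` are hypothetical names, the 2 MiB length assumes that huge page size, and the fallback to ordinary pages is a design choice of this example for systems where no huge pages are reserved.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define HUGE_LEN (2UL * 1024 * 1024)  /* assumption: 2 MiB huge pages */

/*
 * Map an anonymous read/write region of len bytes, preferring huge pages.
 * MAP_HUGETLB is a modifier of MAP_ANONYMOUS, so both flags are passed.
 * *used_huge is set to 1 if the huge page mapping succeeded, 0 if the
 * function fell back to normal pages. Returns NULL if both attempts fail.
 */
void *alloc_region(size_t len, int *used_huge)
{
    int base = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    assert(used_huge != NULL);
#ifdef MAP_HUGETLB
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, base | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }
#endif
    /* No huge pages available (or flag unknown): use small pages instead. */
    *used_huge = 0;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, base, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The region is released with `munmap(p, len)` in either case; with huge pages, `len` should be a multiple of the huge page size.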
     
  • Commit 96177299416dbccb73b54e6b344260154a445375 ("Drop free_pages()")
    modified nr_free_pages() to return 'unsigned long' instead of 'unsigned
    int'. This made the casts to 'unsigned long' in most callers superfluous,
    so remove them.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Christoph Lameter
    Acked-by: Ingo Molnar
    Acked-by: Russell King
    Acked-by: David S. Miller
    Acked-by: Kyle McMartin
    Acked-by: WANG Cong
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Haavard Skinnemoen
    Cc: Mikael Starvik
    Cc: "Luck, Tony"
    Cc: Hirokazu Takata
    Cc: Ralf Baechle
    Cc: David Howells
    Acked-by: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: Chris Zankel
    Cc: Michal Simek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

21 Sep, 2009

5 commits

  • Signed-off-by: Joe Perches
    Signed-off-by: Jiri Kosina

    Joe Perches
     
  • Bye-bye Performance Counters, welcome Performance Events!

    In the past few months the perfcounters subsystem has grown out of its
    initial role of counting hardware events, and has become (and is
    becoming) a much broader generic event enumeration, reporting, logging,
    monitoring, analysis facility.

    Naming its core object 'perf_counter' and naming the subsystem
    'perfcounters' has become more and more of a misnomer. With pending
    code like hw-breakpoints support the 'counter' name is less and
    less appropriate.

    All in all, we've decided to rename the subsystem to 'performance
    events' and to propagate this rename through all fields, variables
    and API names. (in an ABI compatible fashion)

    The word 'event' is also a bit shorter than 'counter' - which makes
    it slightly more convenient to write/handle as well.

    Thanks go to Stephane Eranian who first observed this misnomer and
    suggested a rename.

    User-space tooling and ABI compatibility are not affected - this patch
    should be function-invariant. (Also, defconfigs were not touched to
    keep the size down.)

    This patch has been generated via the following script:

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

    sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

    for N in $(find . -name perf_counter.[ch]); do
      M=$(echo $N | sed 's/perf_counter/perf_event/g')
      mv $N $M
    done

    FILES=$(find . -name perf_event.*)

    sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

    ... to keep it as correct as possible. This script can also be
    used by anyone who has pending perfcounters patches - it converts
    a Linux kernel tree over to the new naming. We tried to time this
    change to the point in time where the amount of pending patches
    is the smallest: the end of the merge window.

    Namespace clashes were fixed up in a preparatory patch - and some
    stylistic fallout will be fixed up in a subsequent patch.

    ( NOTE: 'counters' are still the proper terminology when we deal
    with hardware registers - and these sed scripts are a bit
    over-eager in renaming them. I've undone some of that, but
    in case there's something left where 'counter' would be
    better than 'event' we can undo that on an individual basis
    instead of touching an otherwise nicely automated patch. )

    Suggested-by: Stephane Eranian
    Acked-by: Peter Zijlstra
    Acked-by: Paul Mackerras
    Reviewed-by: Arjan van de Ven
    Cc: Mike Galbraith
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Kyle McMartin
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Merge reason: pull in all the latest code before doing the rename.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Commit 5622f295 ("x86, perf_counter, bts: Optimize BTS overflow
    handling") removed the regs field from struct perf_sample_data and
    added a regs parameter to perf_counter_overflow(). This breaks the
    build on powerpc (and Sparc) as reported by Sachin Sant:

    arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
    arch/powerpc/kernel/perf_counter.c:1165: error: unknown field 'regs' specified in initializer

    This adjusts arch/powerpc/kernel/perf_counter.c to correspond with the
    new struct perf_sample_data and perf_counter_overflow().

    [ v2: also fix Sparc, Markus Metzger ]

    Reported-by: Sachin Sant
    Signed-off-by: Paul Mackerras
    Cc: Markus Metzger
    Cc: David S. Miller
    Cc: benh@kernel.crashing.org
    Cc: linuxppc-dev@ozlabs.org
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Paul Mackerras
     
  • Signed-off-by: Joe Perches
    Signed-off-by: Tim Abbott
    Acked-by: Paul Mundt
    Signed-off-by: Sam Ravnborg

    Joe Perches
     

20 Sep, 2009

1 commit

  • Albin Tonnerre reported:

    Bash 4 filters out variables which contain a dot in them.
    This happens to be the case of CPPFLAGS_vmlinux.lds.
    This is rather unfortunate, as it now causes
    build failures when using SHELL=/bin/bash to compile,
    or when bash happens to be used by make (eg when it's /bin/sh)

    Remove the common definition of CPPFLAGS_vmlinux.lds by
    pushing relevant stuff to either Makefile.build or the
    arch specific kernel/Makefile where we build the linker script.

    This is also a nice cleanup as we move the information out to where
    it is used.

    Notes for the different architectures touched:

    arm - we use an already exported symbol
    cris - we use a config symbol already available
           [Not build tested]
    mips - the jiffies complexity has moved to vmlinux.lds.S where we need it.
           Added a few variables to CPPFLAGS - they are only used by
           the linker script.
           [Not build tested]
    powerpc - removed assignment that is not needed
              [not build tested]
    sparc - simplified it using $(BITS)
    um - introduced a few new exported variables to deal with this
    xtensa - added options to CPP invocation
             [not build tested]

    Cc: Albin Tonnerre
    Cc: Russell King
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "David S. Miller"
    Cc: Jeff Dike
    Cc: Chris Zankel
    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     

19 Sep, 2009

1 commit


18 Sep, 2009

3 commits

  • …/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
    sched: Fix SD_POWERSAVING_BALANCE|SD_PREFER_LOCAL vs SD_WAKE_AFFINE
    sched: Stop buddies from hogging the system
    sched: Add new wakeup preemption mode: WAKEUP_RUNNING
    sched: Fix TASK_WAKING & loadaverage breakage
    sched: Disable wakeup balancing
    sched: Rename flags to wake_flags
    sched: Clean up the load_idx selection in select_task_rq_fair
    sched: Optimize cgroup vs wakeup a bit
    sched: x86: Name old_perf in a unique way
    sched: Implement a gentler fair-sleepers feature
    sched: Add SD_PREFER_LOCAL
    sched: Add a few SYNC hint knobs to play with
    sched: Fix sync wakeups again
    sched: Add WF_FORK
    sched: Rename sync arguments
    sched: Rename select_task_rq() argument
    sched: Feature to disable APERF/MPERF cpu_power
    x86: sched: Provide arch implementations using aperf/mperf
    x86: Add generic aperf/mperf code
    x86: Move APERF/MPERF into a X86_FEATURE
    ...

    Fix up trivial conflict in arch/x86/include/asm/processor.h due to
    nearby addition of amd_get_nb_id() declaration from the EDAC merge.

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
    sparc: Update defconfigs.
    sparc: Kill PROM console driver.

    Linus Torvalds
     
  • GCC can't see the 'constant' properly as computed by
    is_power_of_2() etc.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 Sep, 2009

6 commits

  • * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits)
    PCI hotplug: clean up acpi_run_hpp()
    PCI hotplug: acpiphp: use generic pci_configure_slot()
    PCI hotplug: shpchp: use generic pci_configure_slot()
    PCI hotplug: pciehp: use generic pci_configure_slot()
    PCI hotplug: add pci_configure_slot()
    PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface
    PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge
    PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation
    PCI: Clear saved_state after the state has been restored
    PCI PM: Return error codes from pci_pm_resume()
    PCI: use dev_printk in quirk messages
    PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset()
    PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle
    PCI Hotplug: acpiphp: find bridges the easy way
    PCI: pcie portdrv: remove unused variable
    PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support
    ACPI PM: Replace wakeup.prepared with reference counter
    PCI PM: Introduce device flag wakeup_prepared
    PCI / ACPI PM: Rework some debug messages
    PCI PM: Simplify PCI wake-up code
    ...

    Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree
    scanning having been moved and merged for the 32- and 64-bit cases. The
    'needs_freset' initialization added in 6e19314cc ("PCI/powerpc: support
    PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.

    Linus Torvalds
     
  • Sysbench thinks SD_BALANCE_WAKE is too aggressive and kbuild doesn't
    really mind too much, SD_BALANCE_NEWIDLE picks up most of the
    slack.

    On a dual socket, quad core, dual thread nehalem system:

    sysbench (--num_threads=16):

    SD_BALANCE_WAKE-: 13982 tx/s
    SD_BALANCE_WAKE+: 15688 tx/s

    kbuild (-j16):

    SD_BALANCE_WAKE-: 47.648295846 seconds time elapsed ( +- 0.312% )
    SD_BALANCE_WAKE+: 47.608607360 seconds time elapsed ( +- 0.026% )

    (same within noise)

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Many years ago when this driver was written, it had a use, but these
    days it's nothing but trouble and distributions should not enable it
    in any situation.

    Pretty much every console device a sparc machine could see has a
    bonafide real driver, making the PROM console hack unnecessary.

    If any new device shows up, we should write a driver instead of
    depending upon this crutch to save us. We've been able to take care
    of this even when no chip documentation exists (sunxvr500, sunxvr2500)
    so there are no excuses.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits)
    powerpc64: convert to dynamic percpu allocator
    sparc64: use embedding percpu first chunk allocator
    percpu: kill lpage first chunk allocator
    x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA
    percpu: update embedding first chunk allocator to handle sparse units
    percpu: use group information to allocate vmap areas sparsely
    vmalloc: implement pcpu_get_vm_areas()
    vmalloc: separate out insert_vmalloc_vm()
    percpu: add chunk->base_addr
    percpu: add pcpu_unit_offsets[]
    percpu: introduce pcpu_alloc_info and pcpu_group_info
    percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward
    percpu: add @align to pcpu_fc_alloc_fn_t
    percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()
    percpu: drop @static_size from first chunk allocators
    percpu: generalize first chunk allocator selection
    percpu: build first chunk allocators selectively
    percpu: rename 4k first chunk allocator to page
    percpu: improve boot messages
    percpu: fix pcpu_reclaim() locking
    ...

    Fix trivial conflict as by Tejun Heo in kernel/sched.c

    Linus Torvalds
     
  • * 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
    agp/intel: remove restore in resume
    agp: fix uninorth build
    intel-agp: Set dma mask for i915
    agp: kill phys_to_gart() and gart_to_phys()
    intel-agp: fix sglist allocation to avoid vmalloc()
    intel-agp: Move repeated sglist free into separate function
    agp: Switch agp_{un,}map_page() to take struct page * argument
    agp: tidy up handling of scratch pages w.r.t. DMA API
    intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU
    agp: Add generic support for graphics dma remapping
    agp: Switch mask_memory() method to take address argument again, not page

    Linus Torvalds
     

15 Sep, 2009

6 commits

  • If we're looking to place a new task, we might as well find the
    idlest position _now_, not 1 tick ago.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • When merging select_task_rq_fair() and sched_balance_self() we lost
    the use of wake_idx, restore that and set them to 0 to make wake
    balancing more aggressive.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The problem with wake_idle() is that it doesn't respect things like
    cpu_power, which means it doesn't deal well with SMT nor the recent
    RT interaction.

    To cure this, it needs to do what sched_balance_self() does, which
    leads to the possibility of merging select_task_rq_fair() and
    sched_balance_self().

    Modify sched_balance_self() to:

    - update_shares() when walking up the domain tree,
    (it only called it for the top domain, but it should
    have done this anyway), which allows us to remove
    this ugly bit from try_to_wake_up().

    - do wake_affine() on the smallest domain that contains
    both this (the waking) and the prev (the wakee) cpu for
    WAKE invocations.

    Then use the top-down balance steps it had to replace wake_idle().

    This leads to the disappearance of SD_WAKE_BALANCE and
    SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE.

    SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective.

    Touch all topology bits to replace the old with new SD flags --
    platforms might need re-tuning, enabling SD_BALANCE_WAKE
    conditionally on a NUMA distance seems like a good additional
    feature, magny-core and small nehalem systems would want this
    enabled, systems with slow interconnects would not.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Rafael J. Wysocki
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (21 commits)
    sparc64: Initial niagara2 perf counter support.
    sparc64: Perf counter 'nop' event is not constant.
    sparc64: Provide a way to specify a perf counter overflow IRQ enable bit.
    sparc64: Provide hypervisor tracing bit support for perf counters.
    sparc64: Initial hw perf counter support.
    sparc64: Implement a real set_perf_counter_pending().
    sparc64: Use nmi_enter() and nmi_exit(), as needed.
    sparc64: Provide extern decls for sparc_??u_type strings.
    sparc64: Make touch_nmi_watchdog() actually work.
    sparc64: Kill unnecessary cast in profile_timer_exceptions_notify().
    sparc64: Manage NMI watchdog enabling like x86.
    sparc: add basic support for 'perf'
    sparc: convert /proc/io_map, /proc/dvma_map to seq_file
    sparc, leon: sparc-leon specific SRMMU initialization and bootup fixes.
    sparc,leon: Added support for AMBAPP bus.
    sparc,leon: Introduce the sparc-leon CPU type.
    sparc,leon: Redefine MMU register access asi if CONFIG_LEON
    sparc,leon: CONFIG_SPARC_LEON option and leon specific files.
    sparc64: cheaper asm/uaccess.h inclusion
    SPARC: fix duplicate declaration
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
    netxen: update copyright
    netxen: fix tx timeout recovery
    netxen: fix file firmware leak
    netxen: improve pci memory access
    netxen: change firmware write size
    tg3: Fix return ring size breakage
    netxen: build fix for INET=n
    cdc-phonet: autoconfigure Phonet address
    Phonet: back-end for autoconfigured addresses
    Phonet: fix netlink address dump error handling
    ipv6: Add IFA_F_DADFAILED flag
    net: Add DEVTYPE support for Ethernet based devices
    mv643xx_eth.c: remove unused txq_set_wrr()
    ucc_geth: Fix hangs after switching from full to half duplex
    ucc_geth: Rearrange some code to avoid forward declarations
    phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
    drivers/net/phy: introduce missing kfree
    drivers/net/wan: introduce missing kfree
    net: force bridge module(s) to be GPL
    Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
    ...

    Fixed up trivial conflicts:

    - arch/x86/include/asm/socket.h

    converted to in the x86 tree. The generic
    header has the same new #define's, so that works out fine.

    - drivers/net/tun.c

    fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
    switched over to using 'tun->socket.sk' instead of the redundantly
    available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
    to the TUN driver") which added a new 'tun->sk' use.

    Noted in 'next' by Stephen Rothwell.

    Linus Torvalds
     

12 Sep, 2009

3 commits

  • Conflicts:
    arch/sparc/Kconfig

    David S. Miller
     
  • …el/git/tip/linux-2.6-tip

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
    locking, m68k/asm-offsets: Rename signal defines
    locking: Inline spinlock code for all locking variants on s390
    locking: Simplify spinlock inlining
    locking: Allow arch-inlined spinlocks
    locking: Move spinlock function bodies to header file
    locking, m68k: Calculate thread_info offset with asm offset
    locking, m68k/asm-offsets: Rename pt_regs offset defines
    locking, sparc: Rename __spin_try_lock() and friends
    locking, powerpc: Rename __spin_try_lock() and friends
    lockdep: Remove recursion stattistics
    lockdep: Simplify lock_stat seqfile code
    lockdep: Simplify lockdep_chains seqfile code
    lockdep: Simplify lockdep seqfile code
    lockdep: Fix missing entries in /proc/lock_chains
    lockdep: Fix missing entry in /proc/lock_stat
    lockdep: Fix memory usage info of BFS
    lockdep: Reintroduce generation count to make BFS faster
    lockdep: Deal with many similar locks
    lockdep: Introduce lockdep_assert_held()
    lockdep: Fix style nits
    ...

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits)
    x86/gart: Do not select AGP for GART_IOMMU
    x86/amd-iommu: Initialize passthrough mode when requested
    x86/amd-iommu: Don't detach device from pt domain on driver unbind
    x86/amd-iommu: Make sure a device is assigned in passthrough mode
    x86/amd-iommu: Align locking between attach_device and detach_device
    x86/amd-iommu: Fix device table write order
    x86/amd-iommu: Add passthrough mode initialization functions
    x86/amd-iommu: Add core functions for pd allocation/freeing
    x86/dma: Mark iommu_pass_through as __read_mostly
    x86/amd-iommu: Change iommu_map_page to support multiple page sizes
    x86/amd-iommu: Support higher level PTEs in iommu_page_unmap
    x86/amd-iommu: Remove old page table handling macros
    x86/amd-iommu: Use 2-level page tables for dma_ops domains
    x86/amd-iommu: Remove bus_addr check in iommu_map_page
    x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX
    x86/amd-iommu: Change alloc_pte to support 64 bit address space
    x86/amd-iommu: Introduce increase_address_space function
    x86/amd-iommu: Flush domains if address space size was increased
    x86/amd-iommu: Introduce set_dte_entry function
    x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices
    ...

    Linus Torvalds