31 Oct, 2007

1 commit


30 Oct, 2007

9 commits


29 Oct, 2007

5 commits


25 Oct, 2007

9 commits

  • At the moment, a lot of load balancing code that is irrelevant to non
    SMP systems gets included during non SMP builds.

    This patch addresses this issue and reduces the binary size on non
    SMP systems:

       text    data     bss     dec     hex filename
      10983      28    1192   12203    2fab sched.o.before
      10739      28    1192   11959    2eb7 sched.o.after

    Signed-off-by: Peter Williams
    Signed-off-by: Ingo Molnar

    Peter Williams
     
  • At the moment, balance_tasks() provides low level functionality for both
    move_tasks() and move_one_task() (indirectly) via the load_balance()
    function (in the sched_class interface) which also provides dual
    functionality. This dual functionality complicates the interfaces and
    internal mechanisms and increases the run time overhead of operations
    that are called with two run queue locks held.

    This patch addresses this issue and reduces the overhead of these
    operations.

    Signed-off-by: Peter Williams
    Signed-off-by: Ingo Molnar

    Peter Williams
     
  • cpu_shares_{show,store}() can become static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Ingo Molnar

    Adrian Bunk
     
  • - replace "cont" with "cgrp" in a few places in the CFS cgroup code,
    - use write_uint rather than write for cpu.shares write function

    Signed-off-by: Paul Menage
    Acked-by: Srivatsa Vaddagiri
    Signed-off-by: Ingo Molnar

    Paul Menage
     
  • profile=sleep only works if CONFIG_SCHEDSTATS is set. This patch notes
    the limitation in Documentation/kernel-parameters.txt and prints a
    warning at boot-time if profile=sleep is used without CONFIG_SCHEDSTATS.

    Signed-off-by: Mel Gorman
    Signed-off-by: Ingo Molnar

    Mel Gorman
     
  • A full register dump along with stack backtrace would make the
    "scheduling while atomic" message more helpful. Use show_regs() instead
    of dump_stack() for this. We already know we're atomic in here (that is
    why this function was called) so show_regs()'s atomicity expectations
    are guaranteed.

    Also, modify the output of the "BUG: scheduling while atomic:" header a
    bit to keep task->comm and task->pid together and preempt_count() after
    them.

    Signed-off-by: Satyam Sharma
    Signed-off-by: Ingo Molnar

    Satyam Sharma
     
  • clean up sched_domain_debug().

    this also shrinks the code a bit:

       text    data     bss     dec     hex filename
      50474    4306     480   55260    d7dc sched.o.before
      50404    4306     480   55190    d796 sched.o.after

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Jeff Dike noticed that wait_for_completion_interruptible()'s prototype
    had a mismatched fastcall.

    Fix this by removing the fastcall attributes from all the completion APIs.

    Found-by: Jeff Dike
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • commit 029190c515f15f512ac85de8fc686d4dbd0ae731 (cpuset
    sched_load_balance flag) was not tested with SCHED_DEBUG enabled
    as committed: it dereferences NULL when used, and it reordered
    the sysctl registration so that it never shows any domains
    or their tunables.

    Fixes:

    1) restore arch_init_sched_domains ordering
    we can't walk the domains before we build them

    presently we register cpus with empty directories (no domain
    directories or files).

    2) make unregister_sched_domain_sysctl do nothing when already unregistered
    detach_destroy_domains is now called one set of cpus at a time
    unregister_sysctl_table dereferences NULL if called with a NULL.

    While the function would always dereference NULL if called
    twice, in the previous code it was always called once and then
    was followed by a register. So only the hidden bug of the
    sysctl_root_table not being allocated followed by an attempt to
    free it would have shown the error.

    3) always call unregister and register in partition_sched_domains
    The code is "smart" about unregistering only needed domains.
    Since we aren't guaranteed any calls to unregister, always
    unregister. Without calling register on the way out we
    will not have a table or any sysctl tree.

    4) warn if register is called without unregistering
    The previous table memory is lost, leaving pointers to the
    later freed memory in sysctl and leaking the memory of the
    tables.

    Before this patch on a 2-core 4-thread box compiled for SMT and NUMA,
    the domains appear empty (there are actually 3 levels per cpu). And as
    soon as there are two domains, a NULL pointer is dereferenced (the
    entry marked unreliable in this case is stack garbage):

    bu19a:~# ls -R /proc/sys/kernel/sched_domain/
    /proc/sys/kernel/sched_domain/:
    cpu0 cpu1 cpu2 cpu3

    /proc/sys/kernel/sched_domain/cpu0:

    /proc/sys/kernel/sched_domain/cpu1:

    /proc/sys/kernel/sched_domain/cpu2:

    /proc/sys/kernel/sched_domain/cpu3:

    bu19a:~# mkdir /dev/cpuset
    bu19a:~# mount -tcpuset cpuset /dev/cpuset/
    bu19a:~# cd /dev/cpuset/
    bu19a:/dev/cpuset# echo 0 > sched_load_balance
    bu19a:/dev/cpuset# mkdir one
    bu19a:/dev/cpuset# echo 1 > one/cpus
    bu19a:/dev/cpuset# echo 0 > one/sched_load_balance
    Unable to handle kernel paging request for data at address 0x00000018
    Faulting instruction address: 0xc00000000006b608
    NIP: c00000000006b608 LR: c00000000006b604 CTR: 0000000000000000
    REGS: c000000018d973f0 TRAP: 0300 Not tainted (2.6.23-bml)
    MSR: 9000000000009032 CR: 28242442 XER: 00000000
    DAR: 0000000000000018, DSISR: 0000000040000000
    TASK = c00000001912e340[1987] 'bash' THREAD: c000000018d94000 CPU: 2
    ..
    NIP [c00000000006b608] .unregister_sysctl_table+0x38/0x110
    LR [c00000000006b604] .unregister_sysctl_table+0x34/0x110
    Call Trace:
    [c000000018d97670] [c000000007017270] 0xc000000007017270 (unreliable)
    [c000000018d97720] [c000000000058710] .detach_destroy_domains+0x30/0xb0
    [c000000018d977b0] [c00000000005cf1c] .partition_sched_domains+0x1bc/0x230
    [c000000018d97870] [c00000000009fdc4] .rebuild_sched_domains+0xb4/0x4c0
    [c000000018d97970] [c0000000000a02e8] .update_flag+0x118/0x170
    [c000000018d97a80] [c0000000000a1768] .cpuset_common_file_write+0x568/0x820
    [c000000018d97c00] [c00000000009d95c] .cgroup_file_write+0x7c/0x180
    [c000000018d97cf0] [c0000000000e76b8] .vfs_write+0xe8/0x1b0
    [c000000018d97d90] [c0000000000e810c] .sys_write+0x4c/0x90
    [c000000018d97e30] [c00000000000852c] syscall_exit+0x0/0x40

    Signed-off-by: Milton Miller
    Signed-off-by: Ingo Molnar

    Milton Miller
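
Fixes 2) and 4) above amount to making unregister idempotent and making a double register visible. The following is a minimal userspace sketch of that pattern, not the kernel code: `model_table`, `model_register()`, `model_unregister()` and `model_sysctl_demo()` are hypothetical names, and refusing the second register (rather than only warning) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the sysctl table; not the kernel's structure. */
struct model_table {
    int ncpus;
};

static struct model_table *registered; /* NULL means nothing is registered */
static int warnings;                   /* count of double-register warnings */

/* Register a table; warn and refuse if one is already registered. */
int model_register(int ncpus)
{
    struct model_table *t;

    if (registered) {
        warnings++;
        fprintf(stderr, "warning: register called without unregister\n");
        return -1;
    }
    t = malloc(sizeof(*t));
    if (!t)
        return -1;
    t->ncpus = ncpus;
    registered = t;
    return 0;
}

/* Unregister; does nothing when already unregistered. */
void model_unregister(void)
{
    if (!registered)
        return;
    free(registered);
    registered = NULL;
}

/* Walk through the call sequences the commit message describes. */
int model_sysctl_demo(void)
{
    int first, second;

    model_unregister();          /* nothing registered yet: no-op */
    first = model_register(4);
    second = model_register(4);  /* register without unregister: warns */
    model_unregister();
    model_unregister();          /* repeated unregister is harmless */

    assert(first == 0);
    assert(second == -1);
    assert(registered == NULL);
    return warnings;
}
```

With the early return in `model_unregister()`, a caller that tears down one set of cpus at a time can call it repeatedly without tracking whether a table is currently registered.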
     

24 Oct, 2007

2 commits

  • Signed-off-by: Jeff Garzik

    Jeff Garzik
     
  • As it is some callers of synchronize_irq rely on memory barriers
    to provide synchronisation against the IRQ handlers. For example,
    the tg3 driver does

    tp->irq_sync = 1;
    smp_mb();
    synchronize_irq();

    and then in the IRQ handler:

    if (!tp->irq_sync)
            netif_rx_schedule(dev, &tp->napi);

    Unfortunately memory barriers only work well when they come in
    pairs. Because we don't actually have memory barriers on the
    IRQ path, the memory barrier before the synchronize_irq() doesn't
    actually protect us.

    In particular, synchronize_irq() may return followed by the
    result of netif_rx_schedule being made visible.

    This patch (mostly written by Linus) fixes this by using spin
    locks instead of memory barriers on the synchronize_irq() path.

    Signed-off-by: Herbert Xu
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Herbert Xu
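
The idea of replacing an unpaired barrier with a lock taken on both sides can be modelled in userspace. This is a rough sketch using C11 atomics, not the kernel implementation: `irq_model` and all `model_*` functions are hypothetical names, and the `atomic_flag` spin lock merely stands in for the descriptor lock. Because the handler side releases the lock after clearing its in-progress flag and the waiter acquires the same lock before reading it, the handler's earlier writes are visible once the waiter sees the flag clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an IRQ descriptor; not kernel code. */
struct irq_model {
    atomic_flag lock;   /* plays the role of the descriptor spin lock */
    bool in_progress;   /* plays the role of IRQ_INPROGRESS; guarded by lock */
    int handled;        /* side effect written by the "handler" */
};

void model_lock(struct irq_model *d)
{
    /* Acquire: spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
        ;
}

void model_unlock(struct irq_model *d)
{
    /* Release: publishes every write made before the unlock. */
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

void model_irq_enter(struct irq_model *d)
{
    model_lock(d);
    d->in_progress = true;
    model_unlock(d);
}

void model_irq_exit(struct irq_model *d)
{
    model_lock(d);
    d->in_progress = false;
    model_unlock(d);
}

/* Read the in-progress flag under the lock, pairing with enter/exit. */
bool model_irq_busy(struct irq_model *d)
{
    bool busy;

    model_lock(d);
    busy = d->in_progress;
    model_unlock(d);
    return busy;
}

/* Wait until no handler is in flight. */
void model_synchronize_irq(struct irq_model *d)
{
    while (model_irq_busy(d))
        ;
}

/* Single-threaded walk through the protocol. */
int model_demo(void)
{
    struct irq_model d = { ATOMIC_FLAG_INIT, false, 0 };
    bool busy;

    model_irq_enter(&d);
    busy = model_irq_busy(&d);  /* a waiter would keep spinning here */
    d.handled = 1;              /* handler work */
    model_irq_exit(&d);

    model_synchronize_irq(&d);  /* returns at once: nothing in flight */
    assert(busy);
    return d.handled;
}
```

The demo runs on one thread only to show the call order; the ordering guarantee matters when `model_irq_enter()`/`model_irq_exit()` and `model_synchronize_irq()` run on different threads.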
     

23 Oct, 2007

3 commits

  • Fix kernel-doc for auditsc parameter changes.

    Warning(linux-2.6.23-git17//kernel/auditsc.c:1623): No description found for parameter 'dentry'
    Warning(linux-2.6.23-git17//kernel/auditsc.c:1666): No description found for parameter 'dentry'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
    KVM: Use new smp_call_function_mask() in kvm_flush_remote_tlbs()
    sched: don't clear PF_VCPU in scheduler
    KVM: Improve local apic timer wraparound handling
    KVM: Fix local apic timer divide by zero
    KVM: Move kvm_guest_exit() after local_irq_enable()
    KVM: x86 emulator: fix access registers for instructions with ModR/M byte and Mod = 3
    KVM: VMX: Force vm86 mode if setting flags during real mode
    KVM: x86 emulator: implement 'movnti mem, reg'
    KVM: VMX: Reset mmu context when entering real mode
    KVM: VMX: Handle NMIs before enabling interrupts and preemption
    KVM: MMU: Set shadow pte atomically in mmu_pte_write_zap_pte()
    KVM: x86 emulator: fix repne/repnz decoding
    KVM: x86 emulator: fix merge screwup due to emulator split

    Linus Torvalds
     
  • Gabriel C reported that modprobing appletalk on current git gives a
    warning in dmesg :

    "sysctl table check failed: /net/appletalk .3.7 procname does not match binary path procname"

    Oops. My apologies, it appears I made a mistake when creating my table
    to check up on sysctl values.

    Signed-off-by: "Eric W. Biederman"
    Tested-by: Gabriel C
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

22 Oct, 2007

1 commit


21 Oct, 2007

2 commits

  • New kind of audit rule predicates: "object is visible in given subtree".
    The part that can be sanely implemented, that is. Limitations:
      * if you have hardlink from outside of tree, you'd better watch
        it too (or just watch the object itself, obviously)
      * if you mount something under a watched tree, tell audit
        that new chunk should be added to watched subtrees
      * if you umount something in a watched tree and it's still mounted
        elsewhere, you will get matches on events happening there. New command
        tells audit to recalculate the trees, trimming such sources of false
        positives.

    Note that it's _not_ about path - if something is mounted in several places
    (multiple mount, bindings, different namespaces, etc.), the match does
    _not_ depend on which one we are using for access.

    Signed-off-by: Al Viro

    Al Viro
     
  • makes the caller simpler *and* allows scanning ancestors

    Signed-off-by: Al Viro

    Al Viro
     

20 Oct, 2007

8 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits)
    fix do_sys_open() prototype
    sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake
    Documentation: Fix typo in SubmitChecklist.
    Typo: depricated -> deprecated
    Add missing profile=kvm option to Documentation/kernel-parameters.txt
    fix typo about TBI in e1000 comment
    proc.txt: Add /proc/stat field
    small documentation fixes
    Fix compiler warning in smount example program from sharedsubtree.txt
    docs/sysfs: add missing word to sysfs attribute explanation
    documentation/ext3: grammar fixes
    Documentation/java.txt: typo and grammar fixes
    Documentation/filesystems/vfs.txt: typo fix
    include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros
    trivial copy_data_pages() tidy up
    Fix typo in arch/x86/kernel/tsc_32.c
    file link fix for Pegasus USB net driver help
    remove unused return within void return function
    Typo fixes retrun -> return
    x86 hpet.h: remove broken links
    ...

    Linus Torvalds
     
  • Weird, I thought I had written the makefile so this would be handled. Oh
    well, this should fix it.

    Sorry about that.

    Signed-off-by: Eric W. Biederman
    Acked-and-tested-by: Randy Dunlap
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Change the loop style of copy_data_pages() to remove a duplicate condition.

    Signed-off-by: Fengguang Wu
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Adrian Bunk

    Fengguang Wu
     
  • Signed-off-by: Uwe Kleine-König
    Signed-off-by: Adrian Bunk

    Uwe Kleine-König
     
  • hardirq_offset is no longer needed.

    Signed-off-by: Michael Neuling
    Signed-off-by: Adrian Bunk

    Michael Neuling
     
  • Signed-off-by: Daniel Roesen
    Signed-off-by: Adrian Bunk

    Daniel Roesen
     
  • Fix the various misspellings of "system", "controller", "interrupt" and
    "[un]necessary".

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk

    Robert P. J. Day
     
  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (41 commits)
    ACPICA: hw: Don't carry spinlock over suspend
    ACPICA: hw: remove use_lock flag from acpi_hw_register_{read, write}
    ACPI: cpuidle: port idle timer suspend/resume workaround to cpuidle
    ACPI: clean up acpi_enter_sleep_state_prep
    Hibernation: Make sure that ACPI is enabled in acpi_hibernation_finish
    ACPI: suppress uninitialized var warning
    cpuidle: consolidate 2.6.22 cpuidle branch into one patch
    ACPI: thinkpad-acpi: skip blanks before the data when parsing sysfs
    ACPI: AC: Add sysfs interface
    ACPI: SBS: Add sysfs alarm
    ACPI: SBS: Add ACPI_PROCFS around procfs handling code.
    ACPI: SBS: Add support for power_supply class (and sysfs)
    ACPI: SBS: Make SBS reads table-driven.
    ACPI: SBS: Simplify data structures in SBS
    ACPI: SBS: Split host controller (ACPI0001) from SBS driver (ACPI0002)
    ACPI: EC: Add new query handler to list head.
    ACPI: Add acpi_bus_generate_event4() function
    ACPI: Battery: add sysfs alarm
    ACPI: Battery: Add sysfs support
    ACPI: Battery: Misc clean-ups, no functional changes
    ...

    Fix up conflicts in drivers/misc/thinkpad_acpi.[ch] manually

    Linus Torvalds