26 Jul, 2008

7 commits

  • The following patch corrects the URL of "The GNU Accounting Utilities" in init/Kconfig.

    Noticed by: Bart Van Assche

    Signed-off-by: S.Çağlar Onur
    Signed-off-by: Sam Ravnborg

    S.Çağlar Onur
     
  • This patch adds proper prototypes for pid{hash,map}_init() in
    include/linux/pid_namespace.h

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • int Version_* is only used with ksymoops, which is only needed (according
    to README and Documentation/Changes) if CONFIG_KALLSYMS is NOT defined.
    Therefore this patch defines version_string only if CONFIG_KALLSYMS is not
    defined.

    Signed-off-by: Daniel Guilak
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Guilak
     
  • Signed-off-by: Daniel Guilak
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Guilak
     
  • Inflate requires some dynamic memory allocation very early in the boot
    process and this is provided with a set of four functions:
    malloc/free/gzip_mark/gzip_release.

    The old inflate code used a mark/release strategy rather than implement
    free. This new version instead keeps a count on the number of outstanding
    allocations and when it hits zero, it resets the malloc arena.

    This allows removing all the mark and release implementations and unifying
    all the malloc/free implementations.

    The architecture-dependent code must define two addresses:
    - free_mem_ptr, the address of the beginning of the area in which
    allocations should be made
    - free_mem_end_ptr, the address of the end of the area in which
    allocations should be made. If set to 0, then no check is made on
    the number of allocations, it just grows as much as needed

    The architecture-dependent code can also provide an arch_decomp_wdog()
    function. This function will be called several times during the
    decompression process, allowing the architecture to notify the watchdog
    that the system is still running. If an architecture provides such a
    function, then it must define ARCH_HAS_DECOMP_WDOG so that the generic
    inflate code calls arch_decomp_wdog().
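
    The counting strategy described above can be sketched in user space as
    follows. This is a minimal illustration, not the kernel's code: the
    static arena buffer and the names arena_setup(), arena_alloc() and
    arena_free() are hypothetical, and the 4-byte alignment is an assumption.
    Only free_mem_ptr and free_mem_end_ptr are taken from the description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the region the architecture code would supply. */
static unsigned char arena[4096];

static uintptr_t free_mem_ptr;      /* start of the allocation area */
static uintptr_t free_mem_end_ptr;  /* end of the area; 0 disables the bounds check */
static uintptr_t malloc_ptr;        /* next free byte in the arena */
static int malloc_count;            /* number of outstanding allocations */

/* Point the allocator at the arena (the job of the arch code in the text). */
void arena_setup(void)
{
    free_mem_ptr = (uintptr_t)arena;
    free_mem_end_ptr = (uintptr_t)arena + sizeof(arena);
    malloc_ptr = 0;
    malloc_count = 0;
}

/* Bump allocation: hand out the next aligned chunk and count it. */
void *arena_alloc(size_t size)
{
    uintptr_t start;

    if (!malloc_ptr)
        malloc_ptr = free_mem_ptr;

    start = (malloc_ptr + 3) & ~(uintptr_t)3;   /* assumed 4-byte alignment */
    if (free_mem_end_ptr && start + size > free_mem_end_ptr)
        return NULL;                            /* arena exhausted */

    malloc_ptr = start + size;
    malloc_count++;
    return (void *)start;
}

/* Individual frees reclaim nothing; the arena resets when the count hits zero. */
void arena_free(void *p)
{
    (void)p;
    if (malloc_count > 0 && --malloc_count == 0)
        malloc_ptr = free_mem_ptr;
}
```

    The trade-off is that memory is only reclaimed once every allocation has
    been released, which suits a decompressor that allocates a few tables and
    frees them all together.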

    Work initially done by Matt Mackall, updated to a recent version of the
    kernel and improved by me.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Thomas Petazzoni
    Cc: Matt Mackall
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Cc: Haavard Skinnemoen
    Cc: David Howells
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andi Kleen
    Cc: "H. Peter Anvin"
    Acked-by: Paul Mundt
    Acked-by: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Petazzoni
     
  • There seems to be little point in explicitly setting, then testing the macro
    BUILD_CRAMDISK within the context of a single source file.

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     
  • Every file should include the headers containing the externs for its
    global code (in this case for rd_doload).

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

24 Jul, 2008

1 commit

  • * 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: hrtick_enabled() should use cpu_active()
    sched, x86: clean up hrtick implementation
    sched: fix build error, provide partition_sched_domains() unconditionally
    sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
    cpu hotplug: Make cpu_active_map synchronization dependency clear
    cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
    sched: rework of "prioritize non-migratable tasks over migratable ones"
    sched: reduce stack size in isolated_cpu_setup()
    Revert parts of "ftrace: do not trace scheduler functions"

    Fixed up conflicts in include/asm-x86/thread_info.h (due to the
    TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
    kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
    introduction).

    Linus Torvalds
     

22 Jul, 2008

2 commits

  • ... as preparation for removing it completely, make it an
    invisible bool defaulting to yes.

    Signed-off-by: Johannes Berg
    Signed-off-by: Rusty Russell

    Johannes Berg
     
  • module.c and module.h contain code for finding
    exported symbols which are declared with EXPORT_UNUSED_SYMBOL,
    and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set
    and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway
    (because EXPORT_UNUSED_SYMBOL(x) are compiled out to nothing then).

    This patch adds required #ifdefs.

    Signed-off-by: Denys Vlasenko
    Signed-off-by: Rusty Russell

    Denys Vlasenko
     

21 Jul, 2008

1 commit

  • On recent kernels, I get the following error when using an initrd:

    | initrd overwritten (0x00b78000 < 0x07668000) - disabling it.

    My Amiga 4000 has 12 MiB of RAM at physical address 0x07400000 (virtual
    0x00000000).
    The initrd is located at the end of RAM: 0x00b78000 - 0x00c00000 (virtual).
    The overwrite test compares the (virtual) initrd location to the (physical)
    first available memory location, which fails.

    This patch converts initrd_start to a page frame number, so it can safely be
    compared with min_low_pfn.
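
    The mismatch can be reproduced with the numbers quoted above. The sketch
    below assumes 4 KiB pages and a simple linear mapping (physical = virtual
    + base); the function names are hypothetical and are not the kernel's
    helpers.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed 4 KiB pages */

/* Assumed simple linear mapping: physical = virtual + phys_base. */
uint32_t virt_to_phys_linear(uint32_t vaddr, uint32_t phys_base)
{
    return vaddr + phys_base;
}

/* Page frame number of a physical address. */
uint32_t phys_to_pfn(uint32_t paddr)
{
    return paddr >> PAGE_SHIFT;
}

/* Old-style test: compares a virtual address with a physical one. */
int initrd_overwritten_mixed(uint32_t initrd_start_virt, uint32_t min_low_pfn)
{
    return initrd_start_virt < (min_low_pfn << PAGE_SHIFT);
}

/* Test with both sides expressed as page frame numbers. */
int initrd_overwritten_pfn(uint32_t initrd_start_virt, uint32_t phys_base,
                           uint32_t min_low_pfn)
{
    uint32_t pfn = phys_to_pfn(virt_to_phys_linear(initrd_start_virt, phys_base));

    return pfn < min_low_pfn;
}
```

    With initrd_start at virtual 0x00b78000, RAM at physical 0x07400000 and
    first free memory at physical 0x07668000, the mixed comparison reports an
    overwrite while the page frame comparison does not.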

    Before the introduction of discontiguous memory support on m68k
    (12d810c1b8c2b913d48e629e2b5c01d105029839), min_low_pfn was just left
    untouched by the m68k-specific code (zero, I guess), and everything worked
    fine.

    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

18 Jul, 2008

1 commit

  • This is based on Linus' idea of creating cpu_active_map that prevents
    scheduler load balancer from migrating tasks to the cpu that is going
    down.

    It allows us to simplify domain management code and avoid unnecessary
    domain rebuilds during cpu hotplug event handling.

    Please ignore the cpusets part for now. It needs some more work in order
    to avoid crazy lock nesting. I did, however, simplify and unify the domain
    reinitialization logic. We now simply call partition_sched_domains() in
    all the cases. This means that we're using the exact same code paths as in
    the cpusets case and hence the tests below cover cpusets too.
    Cpuset changes to make rebuild_sched_domains() callable from various
    contexts are in the separate patch (right next after this one).

    This not only boots but also easily handles
    while true; do make clean; make -j 8; done
    and
    while true; do on-off-cpu 1; done
    at the same time.
    (on-off-cpu 1 simply does the echo 0/1 > /sys/.../cpu1/online thing).

    Surprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
    this on it right now in gnome-terminal and things are moving just fine.

    Also this is running with most of the debug features enabled (lockdep,
    mutex, etc) no BUG_ONs or lockdep complaints so far.

    I believe I addressed all of Dmitry's comments on Linus' original
    version. I changed both fair and rt balancer to mask out non-active cpus.
    And replaced cpu_is_offline() with !cpu_active() in the main scheduler
    code where it made sense (to me).
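
    As a rough illustration of what masking out non-active cpus means for a
    balancer, the sketch below intersects a task's allowed CPUs with an
    active mask before choosing a target. The 64-bit mask type and the
    function name are hypothetical; the kernel uses its own cpumask type and
    helpers.

```c
#include <stdint.h>

typedef uint64_t cpumask_bits;  /* hypothetical: one bit per CPU, up to 64 CPUs */

/* Return the lowest-numbered CPU that is both allowed and active, or -1. */
int pick_balance_target(cpumask_bits allowed, cpumask_bits active)
{
    cpumask_bits candidates = allowed & active;
    int cpu;

    for (cpu = 0; cpu < 64; cpu++)
        if (candidates & ((cpumask_bits)1 << cpu))
            return cpu;

    return -1;  /* every allowed CPU is on its way down */
}
```

    A CPU that is still online but already cleared from the active mask is
    never picked, so no new tasks migrate to it while it is going down.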

    Signed-off-by: Max Krasnyanskiy
    Acked-by: Linus Torvalds
    Acked-by: Peter Zijlstra
    Acked-by: Gregory Haskins
    Cc: dmitry.adamushko@gmail.com
    Cc: pj@sgi.com
    Signed-off-by: Ingo Molnar

    Max Krasnyansky
     

17 Jul, 2008

1 commit


16 Jul, 2008

3 commits

  • …l/git/tip/linux-2.6-tip

    * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    generic-ipi: more merge fallout
    generic-ipi: merge fix
    x86, visws: use mach-default/entry_arch.h
    x86, visws: fix generic-ipi build
    generic-ipi: fixlet
    generic-ipi: fix s390 build bug
    generic-ipi: fix linux-next tree build failure
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix "smp_call_function: get rid of the unused nonatomic/retry argument"
    on_each_cpu(): kill unused 'retry' parameter
    smp_call_function: get rid of the unused nonatomic/retry argument
    sh: convert to generic helpers for IPI function calls
    parisc: convert to generic helpers for IPI function calls
    mips: convert to generic helpers for IPI function calls
    m32r: convert to generic helpers for IPI function calls
    arm: convert to generic helpers for IPI function calls
    alpha: convert to generic helpers for IPI function calls
    ia64: convert to generic helpers for IPI function calls
    powerpc: convert to generic helpers for IPI function calls
    ...

    Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually

    Linus Torvalds
     
  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Ingo Molnar
     

15 Jul, 2008

1 commit


26 Jun, 2008

1 commit

  • This adds kernel/smp.c which contains helpers for IPI function calls. In
    addition to supporting the existing smp_call_function() in a more efficient
    manner, it also adds a more scalable variant called smp_call_function_single()
    for calling a given function on a single CPU only.

    The core of this is based on the x86-64 patch from Nick Piggin, with lots
    of changes since then. Alan D. Brunelle has contributed lots of fixes and
    suggestions as well. Also thanks to Paul E. McKenney for reviewing RCU
    usage and getting rid of the data allocation fallback deadlock.

    Acked-by: Ingo Molnar
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Jens Axboe

    Jens Axboe
     

24 Jun, 2008

2 commits

  • As suggested by Ingo, remove all references to tsc from init/calibrate.c

    TSC is x86 specific, and using tsc in variable names in a generic file should
    be avoided. lpj_tsc is now called lpj_fine, since it is related to fine tuning
    of lpj value. Also tsc_rate_* is called timer_rate_*

    Signed-off-by: Alok N Kataria
    Cc: Arjan van de Ven
    Cc: Daniel Hecht
    Cc: Tim Mann
    Cc: Zach Amsden
    Cc: Sahil Rihan
    Signed-off-by: Ingo Molnar

    Alok Kataria
     
  • On the x86 platform we can use the value of tsc_khz computed during tsc
    calibration to calculate the loops_per_jiffy value. It's very important
    to keep the error in lpj values to a minimum, as any error there may
    result in a kernel panic in check_timer. In a virtualization environment,
    on a highly overloaded host, the guest delay calibration may sometimes
    result in errors beyond the ~50% that timer_irq_works can handle,
    resulting in the guest panicking.

    Also makes some formatting changes to the lpj_setup code so that a single
    printk prints the bogomips value.

    We do this only for the boot processor because the APs can have
    different base frequencies or the BIOS might boot an AP at a different
    frequency.
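
    The arithmetic behind deriving loops_per_jiffy from a calibrated timer
    frequency can be sketched as below. It assumes one delay-loop iteration
    per timer tick, which is an assumption of this example, and the function
    name and the hz parameter are hypothetical.

```c
#include <stdint.h>

/*
 * timer_khz is the calibrated frequency in kHz (ticks per millisecond).
 * A jiffy lasts 1/hz seconds, so ticks per jiffy = timer_khz * 1000 / hz.
 */
uint64_t lpj_from_timer_khz(uint64_t timer_khz, unsigned int hz)
{
    return timer_khz * 1000u / hz;
}
```

    For a 2 GHz timer (2000000 kHz) and 250 jiffies per second this gives
    8000000 loops per jiffy, without running a timing loop that an overloaded
    host could distort.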

    Signed-off-by: Alok N Kataria
    Cc: Arjan van de Ven
    Cc: Daniel Hecht
    Cc: Tim Mann
    Cc: Zach Amsden
    Cc: Sahil Rihan
    Signed-off-by: Ingo Molnar

    Alok Kataria
     

16 Jun, 2008

1 commit


26 May, 2008

1 commit

  • init/Kconfig contains a list of configs that are searched
    for if 'make *config' is used with no .config present.
    Extend this list to look at the config identified by
    ARCH_DEFCONFIG.

    With this change we now try the defconfig targets last.

    This fixes a regression reported
    by: Linus Torvalds

    Signed-off-by: Sam Ravnborg
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"

    Sam Ravnborg
     

25 May, 2008

1 commit


19 May, 2008

1 commit

  • Fourth cut of the patch to provide call_rcu_sched(). Again, this is to
    synchronize_sched() as call_rcu() is to synchronize_rcu().

    Should be fine for experimental and -rt use, but not ready for inclusion.
    With some luck, I will be able to tell Andrew to come out of hiding on
    the next round.

    Passes multi-day rcutorture sessions with concurrent CPU hotplugging.

    Fixes since the first version include a bug that could result in
    indefinite blocking (spotted by Gautham Shenoy), better resiliency
    against CPU-hotplug operations, and other minor fixes.

    Fixes since the second version include reworking grace-period detection
    to avoid deadlocks that could happen when running concurrently with
    CPU hotplug, adding Mathieu's fix to avoid the softlockup messages,
    as well as Mathieu's fix to allow use earlier in boot.

    Fixes since the third version include a wrong-CPU bug spotted by
    Andrew, getting rid of the obsolete synchronize_kernel API that somehow
    snuck back in, merging spin_unlock() and local_irq_restore() in a
    few places, commenting the code that checks for quiescent states based
    on interrupting from user-mode execution or the idle loop, removing
    some inline attributes, and some code-style changes.

    Known/suspected shortcomings:

    o I still do not entirely trust the sleep/wakeup logic. Next step
    will be to use a private snapshot of the CPU online mask in
    rcu_sched_grace_period() -- if the CPU wasn't there at the start
    of the grace period, we don't need to hear from it. And the
    bit about accounting for changes in online CPUs inside of
    rcu_sched_grace_period() is ugly anyway.

    o It might be good for rcu_sched_grace_period() to invoke
    resched_cpu() when a given CPU wasn't responding quickly,
    but resched_cpu() is declared static...

    This patch also fixes a long-standing bug in the earlier preemptable-RCU
    implementation of synchronize_rcu() that could result in loss of
    concurrent external changes to a task's CPU affinity mask. I still cannot
    remember who reported this...

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Paul E. McKenney
     

16 May, 2008

3 commits

  • This patch fixes a build bug on m68k - gcc decides to emit a call to the
    strlen library function, which we don't implement.

    More importantly - my previous patch "init: don't lose initcall return
    values" (commit e662e1cfd434aa234b72fbc781f1d70211cb785b) introduced a
    potential buffer overflow through a wrong calculation of the string
    accumulator size.

    Use strlcat() instead, fixing both bugs.
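
    To show why a size-bounded append avoids this class of overflow, here is
    a user-space sketch of strlcat()-style semantics. The function name
    bounded_cat() is hypothetical and this is not the kernel's
    implementation; it only mirrors the contract that the caller passes the
    full size of the destination buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Append src to dst without writing more than size bytes in total,
 * always leaving dst NUL-terminated when there is room for a terminator.
 * Returns the length the combined string would have had if unbounded.
 */
size_t bounded_cat(char *dst, const char *src, size_t size)
{
    size_t dlen = 0;
    size_t slen = strlen(src);
    size_t room, n;

    /* Length of the existing string, never looking past the buffer. */
    while (dlen < size && dst[dlen] != '\0')
        dlen++;

    if (dlen == size)
        return size + slen;     /* dst was not terminated inside the buffer */

    room = size - dlen - 1;     /* bytes left, keeping one for the NUL */
    n = slen < room ? slen : room;
    memcpy(dst + dlen, src, n);
    dst[dlen + n] = '\0';

    return dlen + slen;
}
```

    Because the bound is the buffer size rather than a hand-computed
    remainder, repeated appends to the same accumulator cannot run past its
    end; a return value of at least size signals truncation.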

    Many thanks to Andreas Schwab and Geert Uytterhoeven for helping
    to catch and fix the bug.

    Signed-off-by: Cyrill Gorcunov
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     
  • One function to just loop over the entries, one function to actually do
    the call and the associated debugging code.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Everybody wants to pass it a function pointer, and in fact, that is what
    you _must_ pass it for it to make sense (since it knows that ia64 and
    ppc64 use descriptors for function pointers and fetches the actual
    address from there).

    So don't make the argument be an 'unsigned long' and force everybody to
    add a cast.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 May, 2008

1 commit

  • Some devices, like md, may create partitions only at first access,
    so allow root= to be set to a valid non-existent partition of an
    existing disk. This applies only to non-initramfs root mounting.

    This fixes a regression from 2.6.24, which did allow this to happen, and
    which broke some users' machines :(

    Acked-by: Neil Brown
    Tested-by: Joao Luis Meloni Assirati
    Cc: stable
    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

13 May, 2008

1 commit


09 May, 2008

1 commit

  • Linus found a logic bug: we ignore the version number in a module's
    vermagic string if we have CONFIG_MODVERSIONS set, but modversions
    also lets through a module with no __versions section for modprobe
    --force (with tainting, but still).

    We should only ignore the start of the vermagic string if the module
    actually *has* crcs to check. Rather than (say) having an
    entertaining hissy fit and creating a config option to work around the
    buggy code.
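
    The intended rule can be sketched as a small comparison helper: the
    leading version token of the vermagic string is skipped only when the
    module really carries CRCs. This is an illustration under that reading;
    the function name and the example strings are assumptions, not a copy of
    the kernel source.

```c
#include <string.h>

/*
 * Compare two vermagic strings. If the module has CRCs to check,
 * ignore everything up to the first space (the kernel version token);
 * otherwise the whole string, version included, must match.
 */
int vermagic_matches(const char *module_magic, const char *kernel_magic,
                     int has_crcs)
{
    if (has_crcs) {
        module_magic += strcspn(module_magic, " ");
        kernel_magic += strcspn(kernel_magic, " ");
    }
    return strcmp(module_magic, kernel_magic) == 0;
}
```

    A module built for a different version but carrying CRCs is then judged
    by its symbol checksums, while a module without a __versions section
    falls back to the full string comparison.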

    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Rusty Russell
     

07 May, 2008

1 commit

  • Fix pcspkr dependencies: make the pcspkr platform
    drivers depend on a platform device, and
    not the other way around.

    Signed-off-by: Stas Sergeev
    Acked-by: Thomas Gleixner
    Acked-by: Dmitry Torokhov
    CC: Vojtech Pavlik
    CC: Michael Opdenacker
    [fixed for 2.6.26-rc1 by tiwai]
    Signed-off-by: Takashi Iwai

    Stas Sergeev
     

06 May, 2008

3 commits

  • GROUP_SCHED is confirmed to cause unacceptable latencies, see:

    http://lkml.org/lkml/2008/5/2/370.

    Mark it EXPERIMENTAL and default to no for now.

    Signed-off-by: Parag Warudkar
    Signed-off-by: Ingo Molnar

    Parag Warudkar
     
  • this replaces the rq->clock stuff (and possibly cpu_clock()).

    - architectures that have an 'imperfect' hardware clock can set
    CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

    - the 'jiffie' window might be superfluous when we update tick_gtod
    before the __update_sched_clock() call in sched_clock_tick()

    - cpu_clock() might be implemented as:

    sched_clock_cpu(smp_processor_id())

    if the accuracy proves good enough - how far can TSC drift in a
    single jiffie when considering the filtering and idle hooks?

    [ mingo@elte.hu: various fixes and cleanups ]

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • add the HAVE_UNSTABLE_SCHED_CLOCK, for architectures to select.

    the next change utilizes it.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

05 May, 2008

1 commit

  • The kernel module loader used to be much too happy to allow loading of
    modules for the wrong kernel version by default. For example, if you
    had MODVERSIONS enabled, but tried to load a module with no version
    info, it would happily load it and taint the kernel - whether it was
    likely to actually work or not!

    Generally, such forced module loading should be considered a really
    really bad idea, so make it conditional on a new config option
    (MODULE_FORCE_LOAD), and make it default to off.

    If somebody really wants to force module loads, that's their problem,
    but we should not encourage it. Especially as it happened to me by
    mistake (ie regular unversioned Fedora modules getting loaded) causing
    lots of strange behavior.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

02 May, 2008

1 commit


30 Apr, 2008

3 commits

  • We can see an ever repeating problem pattern with objects of any kind in the
    kernel:

    1) freeing of active objects
    2) reinitialization of active objects

    Both problems can be hard to debug because the crash happens at a point where
    we have no chance to decode the root cause anymore. One problem spot is
    kernel timers, where the detection of the problem often happens in interrupt
    context and usually causes the machine to panic.

    While working on a timer related bug report I had to hack specialized code
    into the timer subsystem to get a reasonable hint for the root cause. This
    debug hack was fine for temporary use, but far from a mergeable solution due
    to the intrusiveness into the timer code.

    The code further lacked the ability to detect and report the root cause
    instantly and keep the system operational.

    Keeping the system operational is important to get hold of the debug
    information without special debugging aids like serial consoles and special
    knowledge of the bug reporter.

    The problems described above are not restricted to timers, but timers tend to
    expose them as a full system crash. Other objects are less explosive,
    but the symptoms caused by such mistakes can be even harder to debug.

    Instead of creating specialized debugging code for the timer subsystem a
    generic infrastructure is created which allows developers to verify their code
    and provides an easy to enable debug facility for users in case of trouble.

    The debugobjects core code keeps track of operations on static and dynamic
    objects by inserting them into a hashed list and sanity checking them on
    object operations and provides additional checks whenever kernel memory is
    freed.

    The tracked object operations are:
    - initializing an object
    - adding an object to a subsystem list
    - deleting an object from a subsystem list

    Each operation is sanity checked before the operation is executed and the
    subsystem specific code can provide a fixup function which can prevent
    the damage the operation would cause. When the sanity check triggers, a
    warning message and a stack trace are printed.

    The list of operations can be extended if the need arises. For now it's
    limited to the requirements of the first user (timers).

    The core code enqueues the objects into hash buckets. The hash index is
    generated from the address of the object to simplify the lookup for the check
    on kfree/vfree. Each bucket has its own spinlock to avoid contention on a
    global lock.
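
    A much-simplified sketch of address-hashed tracking follows. The bucket
    count, the chunk shift, the multiplicative hash and all names here are
    assumptions made for illustration; locking is omitted and only an exact
    address is checked, whereas the text describes a per-bucket spinlock and
    checks on whole freed regions.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS   6
#define HASH_SIZE   (1u << HASH_BITS)
#define CHUNK_SHIFT 12              /* assumed: hash on page-sized chunks */

enum obj_state { OBJ_NONE, OBJ_INIT, OBJ_ACTIVE };

struct tracked_obj {
    struct tracked_obj *next;
    const void *addr;               /* address of the object being tracked */
    enum obj_state state;
};

struct bucket {
    struct tracked_obj *head;       /* a real implementation pairs this with a lock */
};

static struct bucket buckets[HASH_SIZE];

/* Derive the bucket from the object's address. */
unsigned int bucket_index(const void *addr)
{
    uint64_t chunk = (uint64_t)(uintptr_t)addr >> CHUNK_SHIFT;

    return (unsigned int)((chunk * 0x9E3779B97F4A7C15ull) >> (64 - HASH_BITS));
}

/* Find the tracking record for an address, or NULL if it is not tracked. */
struct tracked_obj *lookup(const void *addr)
{
    struct tracked_obj *o;

    for (o = buckets[bucket_index(addr)].head; o; o = o->next)
        if (o->addr == addr)
            return o;
    return NULL;
}

/* Start tracking an object in the given state. */
void track(struct tracked_obj *o, const void *addr, enum obj_state state)
{
    struct bucket *b = &buckets[bucket_index(addr)];

    o->addr = addr;
    o->state = state;
    o->next = b->head;
    b->head = o;
}

/* Nonzero if freeing addr would free an object still marked active. */
int free_would_hit_active(const void *addr)
{
    struct tracked_obj *o = lookup(addr);

    return o && o->state == OBJ_ACTIVE;
}
```

    Hashing on the address means a free-time check needs nothing but the
    pointer being freed to find the relevant bucket.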

    The debug code can be compiled in without being active. The runtime overhead
    is minimal and could be optimized by asm alternatives. A kernel command line
    option enables the debugging code.

    Thanks to Ingo Molnar for review, suggestions and cleanup patches.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • There are some places that are known to operate on tasks'
    global pids only:

    * the rest_init() call (called on boot)
    * the kgdb's getthread
    * the create_kthread() (since the kthread is run in init ns)

    So use the find_task_by_pid_ns(..., &init_pid_ns) there
    and schedule the find_task_by_pid for removal.

    [sukadev@us.ibm.com: Fix warning in kernel/pid.c]
    Signed-off-by: Pavel Emelyanov
    Cc: "Eric W. Biederman"
    Signed-off-by: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • The global init has a lot of long standing problems with the unhandled fatal
    signals.

    - The "is_global_init(current)" check in get_signal_to_deliver()
    protects only the main thread. A sub-thread can dequeue the fatal
    signal and shut down the whole thread group except the main thread.
    If it dequeues SIGSTOP, /sbin/init will be stopped, which is not
    right either. Note that we can't use is_global_init(->group_leader):
    this breaks exec and can't solve the other problems we have.

    - Even if ignored afterwards, a fatal signal sets SIGNAL_GROUP_EXIT
    on delivery. This breaks exec, has other bad implications, and is
    just wrong.

    Introduce the new SIGNAL_UNKILLABLE flag to fix these problems. It also helps
    to solve some other problems addressed by the subsequent patches.

    Currently we use this flag for the global init only, but it could also be used
    by kthreads and (perhaps) by the sub-namespace inits.

    Signed-off-by: Oleg Nesterov
    Acked-by: "Eric W. Biederman"
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

29 Apr, 2008

1 commit

  • Avoid a possible kmem_cache_create() failure by creating idr_layer_cache
    unconditionally at boot time rather than creating it on-demand when idr_init()
    is called the first time.

    This change also enables us to eliminate the check every time idr_init() is
    called.

    [akpm@linux-foundation.org: rename init_id_cache() to idr_init_cache()]
    [akpm@linux-foundation.org: fix alpha build]
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita