28 Jul, 2008

5 commits


27 Jul, 2008

1 commit

  • Remove arch-specific show_mem() in favor of the generic version.

    This also removes the following redundant information display:

    - free swap pages, printed by show_swap_cache_info()
    - pages in swapcache, printed by show_swap_cache_info()
    - dirty pages, writeback pages, mapped pages, slab pages,
    pagetables pages, printed by show_free_areas()

    where show_mem() calls show_free_areas(), which calls
    show_swap_cache_info().

    Signed-off-by: Johannes Weiner
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

26 Jul, 2008

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
    sparc: Wire up new system calls.

    Linus Torvalds
     
  • This wires up the recently added signalfd4, eventfd2,
    epoll_create1, dup3, pipe2, and inotify_init1 system calls.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Currently the list of kretprobe instances is stored in the kretprobe object
    (as used_instances, free_instances) and in the kretprobe hash table. One
    global kretprobe lock serialises access to these lists, so only one
    kretprobe handler can execute at a time. This hurts system performance,
    particularly on SMP systems and when return probes are set on many
    functions (for example on all system calls).

    The solution proposed here uses fine-grained locks that perform better on
    SMP systems than the present kretprobe implementation.

    Solution:

    1) Instead of having one global lock to protect the kretprobe instances
    present in the kretprobe object and the kretprobe hash table, we will
    have two locks: one protecting the kretprobe hash table and another
    for the kretprobe object.

    2) We hold lock present in kretprobe object while we modify kretprobe
    instance in kretprobe object and we hold per-hash-list lock while
    modifying kretprobe instances present in that hash list. To prevent
    deadlock, we never grab a per-hash-list lock while holding a kretprobe
    lock.

    3) We can remove used_instances from struct kretprobe, as we can
    track used instances of kretprobe instances using kretprobe hash
    table.

    Time taken for kernel compilation ("make -j 8") on an 8-way ppc64 system
    with return probes set on all system calls:

                  cacheline        non-cacheline    Un-patched kernel
                  aligned patch    aligned patch
    ===============================================================================
    real          9m46.784s        9m54.412s        10m2.450s
    user          40m5.715s        40m7.142s        40m4.273s
    sys           2m57.754s        2m58.583s        3m17.430s
    ===============================================================================

    Time taken for kernel compilation ("make -j 8") on the same system when the
    kernel is not probed:
    =========================
    real 9m26.389s
    user 40m8.775s
    sys 2m7.283s
    =========================

    Signed-off-by: Srinivasa DS
    Signed-off-by: Jim Keniston
    Acked-by: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: David S. Miller
    Cc: Masami Hiramatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Srinivasa D S
     

25 Jul, 2008

7 commits

  • …el/git/tip/linux-2.6-tip

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well
    nohz: prevent tick stop outside of the idle loop

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
    sparc64: Fix cpufreq notifier registry.
    sparc64: Fix lockdep issues in LDC protocol layer.

    Linus Torvalds
     
  • This patch introduces the new syscall pipe2 which is like pipe but it also
    takes an additional parameter which takes a flag value. This patch implements
    the handling of O_CLOEXEC for the flag. I did not add support for the new
    syscall for the architectures which have a special sys_pipe implementation. I
    think the maintainers of those archs have the chance to go with the unified
    implementation but that's up to them.

    The implementation introduces do_pipe_flags. I did that instead of changing
    all callers of do_pipe because some of the callers are written in assembler.
    I would probably screw up changing the assembly code. To avoid breaking code
    do_pipe is now a small wrapper around do_pipe_flags. Once all callers are
    changed over to do_pipe_flags the old do_pipe function can be removed.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_pipe2
    # ifdef __x86_64__
    #  define __NR_pipe2 293
    # elif defined __i386__
    #  define __NR_pipe2 331
    # else
    #  error "need __NR_pipe2"
    # endif
    #endif

    int
    main (void)
    {
      int fd[2];
      if (syscall (__NR_pipe2, fd, 0) != 0)
        {
          puts ("pipe2(0) failed");
          return 1;
        }
      for (int i = 0; i < 2; ++i)
        {
          int coe = fcntl (fd[i], F_GETFD);
          if (coe == -1)
            {
              puts ("fcntl failed");
              return 1;
            }
          if (coe & FD_CLOEXEC)
            {
              printf ("pipe2(0) set close-on-exit for fd[%d]\n", i);
              return 1;
            }
        }
      close (fd[0]);
      close (fd[1]);

      if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0)
        {
          puts ("pipe2(O_CLOEXEC) failed");
          return 1;
        }
      for (int i = 0; i < 2; ++i)
        {
          int coe = fcntl (fd[i], F_GETFD);
          if (coe == -1)
            {
              puts ("fcntl failed");
              return 1;
            }
          if ((coe & FD_CLOEXEC) == 0)
            {
              printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i);
              return 1;
            }
        }
      close (fd[0]);
      close (fd[1]);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
    boundary. For example:

    u64 val = PAGE_ALIGN(size);

    always returns a value < 4GB even if size is greater than 4GB.

    The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
    example):

    #define PAGE_SHIFT 12
    #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
    #define PAGE_MASK (~(PAGE_SIZE-1))
    ...
    #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)

    The "~" is performed on a 32-bit value, so PAGE_MASK has no bits set above
    bit 31 and any value of 4GB or more that is ANDed with it is truncated to
    the 32-bit boundary. Using the ALIGN() macro seems to be the right way,
    because it uses typeof(addr) for the mask.

    Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h and
    into include/linux/mm.h.

    See also lkml discussion: http://lkml.org/lkml/2008/6/11/237

    [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
    [akpm@linux-foundation.org: fix v850]
    [akpm@linux-foundation.org: fix powerpc]
    [akpm@linux-foundation.org: fix arm]
    [akpm@linux-foundation.org: fix mips]
    [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
    [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
    [akpm@linux-foundation.org: fix powerpc]
    Signed-off-by: Andrea Righi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Righi
     
  • Straightforward extensions for huge pages located in the PUD instead of
    PMDs.

    Signed-off-by: Andi Kleen
    Signed-off-by: Nick Piggin
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • The goal of this patchset is to support multiple hugetlb page sizes. This
    is achieved by introducing a new struct hstate structure, which
    encapsulates the important hugetlb state and constants (e.g. huge page
    size, number of huge pages currently allocated, etc).

    The hstate structure is then passed to the code that requires these
    fields, so that code does the right thing regardless of the exact hstate
    it is operating on.

    This patch adds the hstate structure, with a single global instance of it
    (default_hstate), and does the basic work of converting hugetlb to use the
    hstate.

    Future patches will add more hstate structures to allow for different
    hugetlbfs mounts to have different page sizes.

    [akpm@linux-foundation.org: coding-style fixes]
    Acked-by: Adam Litke
    Acked-by: Nishanth Aravamudan
    Signed-off-by: Andi Kleen
    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • There are a lot of places that define either a single bootmem descriptor or an
    array of them. Use only one central array with MAX_NUMNODES items instead.

    Signed-off-by: Johannes Weiner
    Acked-by: Ralf Baechle
    Cc: Ingo Molnar
    Cc: Richard Henderson
    Cc: Russell King
    Cc: Tony Luck
    Cc: Hirokazu Takata
    Cc: Geert Uytterhoeven
    Cc: Kyle McMartin
    Cc: Paul Mackerras
    Cc: Paul Mundt
    Cc: David S. Miller
    Cc: Yinghai Lu
    Cc: Christoph Lameter
    Cc: Mel Gorman
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

24 Jul, 2008

1 commit


23 Jul, 2008

2 commits

  • We're calling request_irq() with IRQs disabled.

    No straightforward fix exists because we want to
    enable these IRQs and setup state atomically before
    getting into the IRQ handler the first time.

    What happens now is that we mark the VIRQ to not be
    automatically enabled by request_irq(). Then we
    make explicit enable_irq() calls when we grab the
    LDC channel.

    This way we don't need to call request_irq() illegally
    under the LDC channel lock any more.

    Bump LDC version and release date.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
    remove CONFIG_KMOD from core kernel code
    remove CONFIG_KMOD from lib
    remove CONFIG_KMOD from sparc64
    rework try_then_request_module to do less in non-modular kernels
    remove mention of CONFIG_KMOD from documentation
    make CONFIG_KMOD invisible
    modules: Take a shortcut for checking if an address is in a module
    module: turn longs into ints for module sizes
    Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs
    module: reorder struct module to save space on 64 bit builds
    module: generic each_symbol iterator function
    module: don't use stop_machine for waiting rmmod

    Linus Torvalds
     

22 Jul, 2008

4 commits

  • One place is just a comment, the other a conditional, unused
    inclusion of linux/kmod.h.

    Signed-off-by: Johannes Berg
    Cc: David S. Miller
    Signed-off-by: Rusty Russell

    Johannes Berg
     
  • This converts all instances of bus_id in the sparc core kernel to use
    either dev_set_name(), or dev_name() depending on the need.

    This is done in anticipation of removing the bus_id field from struct
    device.

    Cc: Kay Sievers
    Acked-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • This allows attributes to be generated dynamically and show/store
    functions to be shared between attributes. Right now most attributes are
    generated by special macros, with lots of duplicated code. With the
    attribute passed in, it is instead possible to attach some data to the
    attribute and then use that in shared low-level functions to do different
    things.

    I need this for the dynamically generated bank attributes in the x86
    machine check code, but it'll allow some further cleanups.

    I converted all users in tree to the new show/store prototype. It's a single
    huge patch to avoid unbisectable sections.

    Runtime tested: x86-32, x86-64
    Compiled only: ia64, powerpc
    Not compile tested/only grep converted: sh, arm, avr32

    Signed-off-by: Andi Kleen
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     
  • Kobjects have not had a limit on name size for a while, so stop
    pretending that they do.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

19 Jul, 2008

2 commits

  • Ingo Molnar
     
  • Jack Ren and Eric Miao tracked down the following long standing
    problem in the NOHZ code:

    scheduler switch to idle task
    enable interrupts

    Window starts here

    ----> interrupt happens (does not set NEED_RESCHED)
    irq_exit() stops the tick

    ----> interrupt happens (does set NEED_RESCHED)

    return from schedule()

    cpu_idle(): preempt_disable();

    Window ends here

    The interrupts can happen at any point inside the race window. The
    first interrupt stops the tick, the second one causes the scheduler to
    rerun and switch away from idle again and we end up with the tick
    disabled.

    The fact that it needs two interrupts, where the first one does not set
    NEED_RESCHED and the second one does, made the bug obscure and extremely
    hard to reproduce and analyse. Kudos to Jack and Eric.

    Solution: Limit the NOHZ functionality to the idle loop to make sure
    that we can not run into such a situation ever again.

    cpu_idle()
    {
    preempt_disable();

    while(1) {
    tick_nohz_stop_sched_tick(1);
    ...
    Debugged-by: eric miao
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

18 Jul, 2008

9 commits

  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Adrian Bunk reported that enabling 4MB page size breaks the build.
    The problem is that MAX_ORDER combined with the page shift exceeds the
    SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h

    There are several ways I suppose we could work around this. For one
    we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
    these higher page size cases.

    But I also know that these page size cases are broken wrt. TLB miss
    handling especially on pre-hypervisor systems, and there isn't an easy
    way to fix that.

    These options were meant to be fun experimental hacks anyways, and
    only 8K and 64K make any sense to support.

    So remove 512K and 4M base page size support. Of course, we still
    support these page sizes for huge pages.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • kernel bugzilla #11059:

    sparc64 config menu is missing "Processor type and features",
    so add that and move General Setup before Processor menu.

    Signed-off-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Randy Dunlap
     
  • Renaming the function sparc64_mmap_check() to
    sparc_mmap_check() was enough to make the two
    header files identical.

    :$ diff -u include/asm-sparc/mman.h include/asm-sparc64/mman.h
    :-- include/asm-sparc/mman.h 2008-06-13 06:46:39.000000000 +0200
    :++ include/asm-sparc64/mman.h 2008-06-13 06:46:39.000000000 +0200
    :@@ -1,5 +1,5 @@
    :-#ifndef __SPARC_MMAN_H__
    :-#define __SPARC_MMAN_H__
    :+#ifndef __SPARC64_MMAN_H__
    :+#define __SPARC64_MMAN_H__
    :
    : #include
    :
    :@@ -23,9 +23,9 @@
    :
    : #ifdef __KERNEL__
    : #ifndef __ASSEMBLY__
    :-#define arch_mmap_check(addr,len,flags) sparc_mmap_check(addr,len)
    :-int sparc_mmap_check(unsigned long addr, unsigned long len);
    :+#define arch_mmap_check(addr,len,flags) sparc64_mmap_check(addr,len)
    :+int sparc64_mmap_check(unsigned long addr, unsigned long len);
    : #endif
    : #endif
    :
    :-#endif /* __SPARC_MMAN_H__ */
    :+#endif /* __SPARC64_MMAN_H__ */

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     
  • David Miller noticed that the build of vmlinux.lds
    failed to use the -m64 specifier.
    This caused the build to break with a bi-arch gcc with
    unified headers.

    Add the -m64 option to CPPFLAGS_vmlinux.lds so we
    have the correct defines available when building
    vmlinux.lds.

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     
  • This patch makes the following needlessly global code static:
    - central.c: struct central_bus
    - central.c: struct fhc_list
    - central.c: apply_fhc_ranges()
    - central.c: apply_central_ranges()
    - ds.c: struct ds_states_template[]
    - pci_msi.c: sparc64_setup_msi_irq()
    - pci_msi.c: sparc64_teardown_msi_irq()
    - pci_sun4v.c: struct sun4v_dma_ops
    - sys_sparc32.c: cp_compat_stat64()

    Signed-off-by: Adrian Bunk
    Signed-off-by: David S. Miller

    Adrian Bunk
     

17 Jul, 2008

1 commit

  • * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
    Revert "x86/PCI: ACPI based PCI gap calculation"
    PCI: remove unnecessary volatile in PCIe hotplug struct controller
    x86/PCI: ACPI based PCI gap calculation
    PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
    PCI PM: Fix pci_prepare_to_sleep
    x86/PCI: Fix PCI config space for domains > 0
    Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
    PCI: Simplify PCI device PM code
    PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
    PCI ACPI: Rework PCI handling of wake-up
    ACPI: Introduce new device wakeup flag 'prepared'
    ACPI: Introduce acpi_device_sleep_wake function
    PCI: rework pci_set_power_state function to call platform first
    PCI: Introduce platform_pci_power_manageable function
    ACPI: Introduce acpi_bus_power_manageable function
    PCI: make pci_name use dev_name
    PCI: handle pci_name() being const
    PCI: add stub for pci_set_consistent_dma_mask()
    PCI: remove unused arch pcibios_update_resource() functions
    PCI: fix pci_setup_device()'s sprinting into a const buffer
    ...

    Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
    arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
    drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
    and ACPI updates manually.

    Linus Torvalds
     

16 Jul, 2008

2 commits

  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • * 'core/stacktrace' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    generic-ipi: powerpc/generic-ipi tree build failure
    stacktrace: fix build failure on sparc64
    stacktrace: export save_stack_trace[_tsk]
    stacktrace: fix modular build, export print_stack_trace and save_stack_trace
    backtrace: replace timer with tasklet + completions
    stacktrace: add saved stack traces to backtrace self-test
    stacktrace: print_stack_trace() cleanup
    debugging: make stacktrace independent from DEBUG_KERNEL
    stacktrace: don't crash on invalid stack trace structs

    Linus Torvalds
     

15 Jul, 2008

2 commits

  • * 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits)
    ftrace: build fix for ftraced_suspend
    ftrace: separate out the function enabled variable
    ftrace: add ftrace_kill_atomic
    ftrace: use current CPU for function startup
    ftrace: start wakeup tracing after setting function tracer
    ftrace: check proper config for preempt type
    ftrace: trace schedule
    ftrace: define function trace nop
    ftrace: move sched_switch enable after markers
    ftrace: prevent ftrace modifications while being kprobe'd, v2
    fix "ftrace: store mcount address in rec->ip"
    mmiotrace broken in linux-next (8-bit writes only)
    ftrace: avoid modifying kprobe'd records
    ftrace: freeze kprobe'd records
    kprobes: enable clean usage of get_kprobe
    ftrace: store mcount address in rec->ip
    ftrace: build fix with gcc 4.3
    namespacecheck: fixes
    ftrace: fix "notrace" filtering priority
    ftrace: fix printout
    ...

    Linus Torvalds
     
  • Jonathan Corbet
     

08 Jul, 2008

1 commit

  • Today's linux-next build (sparc64 allmodconfig) failed like this:

    arch/sparc64/kernel/stacktrace.c:50: warning: type defaults to `int' in declaration of `EXPORT_SYMBOL_GPL'
    arch/sparc64/kernel/stacktrace.c:50: warning: parameter names (without types) in function declaration
    arch/sparc64/kernel/stacktrace.c:50: warning: data definition has no type or storage class

    Signed-off-by: Stephen Rothwell
    Cc: "David S. Miller"
    Signed-off-by: Ingo Molnar

    Stephen Rothwell