27 Apr, 2008

1 commit

  • Implement __fls on all 64-bit archs:

    alpha has an implementation of fls64.
    Added __fls(x) = fls64(x) - 1.

    ia64 has fls, but not __fls.
    Added __fls based on code of fls.

    mips and powerpc have __ilog2, which is the same as __fls.
    Added __fls = __ilog2.

    parisc, s390, sh and sparc64:
    Include generic __fls.

    x86_64 already has __fls.

    Signed-off-by: Alexander van Heukelum
    Signed-off-by: Ingo Molnar

    Alexander van Heukelum
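
    The relationship __fls(x) = fls64(x) - 1 used for alpha in this commit
    can be illustrated with a small userspace sketch. The helper names
    sketch_fls64 and sketch_fls_index are hypothetical and the loop is a
    plain portable version, not the code of any architecture.

```c
#include <assert.h>
#include <stdint.h>

/* fls64-style helper: 1-based position of the most significant set bit,
 * 0 when no bit is set. Hypothetical portable loop, not the alpha code. */
static int sketch_fls64(uint64_t x)
{
    int pos = 0;

    while (x != 0) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* __fls-style helper: 0-based index of the most significant set bit.
 * The result is undefined for 0, so the sketch asserts on that input. */
static unsigned long sketch_fls_index(uint64_t x)
{
    assert(x != 0);
    return (unsigned long)(sketch_fls64(x) - 1);
}
```

    For example, sketch_fls64(6) is 3 and sketch_fls_index(6) is 2.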
     

18 Apr, 2008

1 commit


17 Apr, 2008

15 commits

  • Semaphores are no longer performance-critical, so a generic C
    implementation is better for maintainability, debuggability and
    extensibility. Thanks to Peter Zijlstra for fixing the lockdep
    warning. Thanks to Harvey Harrison for pointing out that the
    unlikely() was unnecessary.

    Signed-off-by: Matthew Wilcox
    Acked-by: Ingo Molnar

    Matthew Wilcox
     
  • Move the function that prints the segment warning messages found in the
    monreader driver and the dcssblk driver to the extmem base code.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Martin Schwidefsky
     
  • Newer s390 models have a breaking-event-address-recording register.
    Each time an instruction causes a break in the sequential instruction
    execution, the address is saved in that hardware register. On a program
    interrupt the address is copied to lowcore addresses 272-279, which
    makes it accessible to software.

    This patch changes the program check handler and the stack overflow
    checker to copy the value into the pt_regs argument.
    The oops output is enhanced to show the last known breaking address.
    It might give additional information if the stack trace is corrupted.

    The feature is only available on 64 bit.

    The new oops output looks like:

    [---------snip----------]
    Modules linked in: vmcp sunrpc qeth_l2 dm_mod qeth ccwgroup
    CPU: 2 Not tainted 2.6.24zlive-host #8
    Process modprobe (pid: 4788, task: 00000000bf3d8718, ksp: 00000000b2b0b8e0)
    Krnl PSW : 0704200180000000 000003e000020028 (vmcp_init+0x28/0xe4 [vmcp])
    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
    Krnl GPRS: 0000000004000002 000003e000020000 0000000000000000 0000000000000001
    000000000015734c ffffffffffffffff 000003e0000b3b00 0000000000000000
    000003e00007ca30 00000000b5bb5d40 00000000b5bb5800 000003e0000b3b00
    000003e0000a2000 00000000003ecf50 00000000b2b0bd50 00000000b2b0bcb0
    Krnl Code: 000003e000020018: c0c000040ff4 larl %r12,3e0000a2000
    000003e00002001e: e3e0f0000024 stg %r14,0(%r15)
    000003e000020024: a7f40001 brc 15,3e000020026
    >000003e000020028: e310c0100004 lg %r1,16(%r12)
    000003e00002002e: c020000413dc larl %r2,3e0000a27e6
    000003e000020034: c0a00004aee6 larl %r10,3e0000b5e00
    000003e00002003a: a7490001 lghi %r4,1
    000003e00002003e: a75900f0 lghi %r5,240
    Call Trace:
    ([] blocking_notifier_call_chain+0x2c/0x40)
    [] sys_init_module+0x19d8/0x1b08
    [] sysc_noemu+0x10/0x16
    [] 0x2000011cda2
    Last Breaking-Event-Address:
    [] vmcp_init+0x24/0xe4 [vmcp]
    [---------snip----------]

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Christian Borntraeger
     
  • As noted by akpm:

    > kernel/time/tick-sched.c: In function 'tick_nohz_stop_sched_tick':
    > kernel/time/tick-sched.c:229: warning: format '%02x' expects type 'unsigned int', but argument 2 has type '__u64'
    >
    > I don't think the architecture's local_softirq_pending() should return u64.
    > This is the sort of thing which should be consistent across architectures.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
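
    The warning quoted above comes from passing a 64-bit value to a '%02x'
    conversion, which expects an unsigned int. A minimal userspace sketch
    of the mismatch and of a narrowed return type follows. The function
    names are hypothetical stand-ins, not the s390 code, and the sketch
    assumes the pending bitmask fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a pending-softirq bitmask that an
 * architecture declared as a 64-bit value. */
static uint64_t sketch_pending_u64(void)
{
    return 0x0aULL;
}

/* Narrowed variant: returns unsigned int so that it matches '%02x'.
 * Assumes the bitmask fits in 32 bits. */
static unsigned int sketch_pending_uint(void)
{
    return (unsigned int)sketch_pending_u64();
}

/* Formats the pending mask the same way the warning site does. */
static int sketch_format_pending(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%02x", sketch_pending_uint());
}
```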
     
  • The most notable part of this commit is the new local header file entry.h,
    which contains the declarations of all functions that are only called
    from asm code or are arch-internal. That way we can avoid extern
    declarations in C files.
    This is more or less the same as what was done for sparc64.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
    variant instead. In addition we get high resolution timers for free.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • Remove the program check generating monitor calls and use function
    calls instead. There is no real advantage in using monitor calls,
    but they do make debugging harder, because of all the program checks
    they generate.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • The new function supports setting of permissions for the debugfs files
    created by the debug feature. In addition to that, the function provides
    uid and gid as parameters for future use. Currently only root is allowed
    for uid and gid.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Michael Holzheu
     
  • Add get_clock_xt to read an 8 byte clock value using store clock
    extended (STCKE) and use get_clock_xt for sched_clock. STCKE should
    be faster than STCK on newer machines.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Jan Glauber
     
  • If vertical cpu polarization is active then the hypervisor will
    dispatch certain cpus for a longer time than other cpus for maximum
    performance. For example, if a guest has three virtual cpus, each
    with a share of 33 percent, then with vertical cpu polarization all
    of the processing time is combined on a single cpu which runs all
    the time, while the other two cpus get nearly no cpu time.

    There are three different types of vertical cpus: high, medium and
    low. Low cpus hardly get any real cpu time, while high cpus get a
    full real cpu. Medium cpus get something in between.

    In order to switch between the two possible modes (default is
    horizontal) a 0 for horizontal polarization or a 1 for vertical
    polarization must be written to the dispatching sysfs attribute:

    /sys/devices/system/cpu/dispatching

    The polarization of each single cpu can be read from the
    polarization sysfs attribute of each cpu:

    /sys/devices/system/cpu/cpuX/polarization

    Possible values are horizontal, vertical:high, vertical:medium,
    vertical:low or unknown.

    When switching polarization the polarization attribute may contain
    the value unknown until the configuration change is done and the
    kernel has figured out the new polarization of each cpu.

    Note that running a system with different types of vertical cpus may
    result in significant performance regressions. If possible only one
    type of vertical cpus should be used. All other cpus should be
    offlined.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
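
    A userspace tool reading the per-cpu polarization attribute described
    above has to map the attribute text to a state. The following is a
    minimal sketch of such a parser. The enum and function names are
    hypothetical, and it assumes the attribute text may end with a newline,
    as sysfs attributes conventionally do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical representation of the values listed in the commit text. */
enum cpu_polarization {
    POL_UNKNOWN,
    POL_HORIZONTAL,
    POL_VERTICAL_HIGH,
    POL_VERTICAL_MEDIUM,
    POL_VERTICAL_LOW
};

/* Maps the text of a polarization attribute to an enum value.
 * Anything that is not recognized is reported as POL_UNKNOWN. */
static enum cpu_polarization parse_polarization(const char *text)
{
    static const struct {
        const char *name;
        enum cpu_polarization value;
    } known[] = {
        { "horizontal",      POL_HORIZONTAL },
        { "vertical:high",   POL_VERTICAL_HIGH },
        { "vertical:medium", POL_VERTICAL_MEDIUM },
        { "vertical:low",    POL_VERTICAL_LOW },
    };
    size_t len, i;

    assert(text != NULL);
    len = strcspn(text, "\n");  /* ignore a trailing newline, if any */
    for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
        if (strlen(known[i].name) == len &&
            memcmp(text, known[i].name, len) == 0)
            return known[i].value;
    }
    return POL_UNKNOWN;
}
```

    A caller would read the contents of
    /sys/devices/system/cpu/cpuX/polarization into a buffer and pass it to
    parse_polarization(). While a polarization switch is in progress the
    result may be POL_UNKNOWN.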
     
  • Add s390 backend so we can give the scheduler some hints about the
    cpu topology.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • Make stfle visible so other code can call it.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • Add permanent and temporary model capacity and the corresponding
    capacity value fields for the three capacity identifiers to the
    output of /proc/sysinfo.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Martin Schwidefsky
     
  • drivers/s390/sysinfo.c uses the store system information instruction to
    query information about the machine, the LPAR and additional
    hypervisors. KVM has to implement the host part for this instruction.

    To avoid code duplication, this patch splits the common definitions from
    sysinfo.c into a separate header file include/asm-s390/sysinfo.h for KVM use.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Carsten Otte
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Christian Borntraeger
     
  • Fix the following link error with allnoconfig:

    vmem.c:(.text+0x175c): undefined reference to `smp_ptlb_all'
    vmem.c:(.text+0x1b24): undefined reference to `smp_ptlb_all'
    fork.c:(.text+0x4190): undefined reference to `smp_ptlb_all'
    : undefined reference to `smp_ptlb_all'
    : undefined reference to `smp_ptlb_all'
    mm/built-in.o:: more undefined references to `smp_ptlb_all' follow
    make[1]: *** [.tmp_vmlinux1] Error 1
    make: *** [sub-make] Error 2

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Martin Schwidefsky
     

03 Apr, 2008

1 commit

  • Currently include/linux/kvm.h is not considered by make headers_install,
    because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h". This problem
    was introduced by

    commit fb56dbb31c4738a3918db81fd24da732ce3b4ae6
    Author: Avi Kivity
    Date: Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to , which
    includes, not existing. Rather than add a zillion s, export kvm.
    only if the arch actually supports it.

    Signed-off-by: Avi Kivity

    which makes this a 2.6.25 regression.

    One way of solving the issue is to enhance Kbuild, but Avi and David convinced
    me that changing headers_install is not the way to go. This patch changes
    the definition for linux/kvm.h to unifdef-y.

    If unifdef-y is used for linux/kvm.h, "make headers_check" will fail on all
    architectures without asm/kvm.h. Therefore, this patch also provides
    asm/kvm.h on all architectures.

    Signed-off-by: Christian Borntraeger
    Acked-by: Avi Kivity
    Cc: Sam Ravnborg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Borntraeger
     

05 Mar, 2008

1 commit

  • Add CONFIG_HAVE_KRETPROBES to the arch//Kconfig file for relevant
    architectures with kprobes support. This facilitates easy handling of
    in-kernel modules (like samples/kprobes/kretprobe_example.c) that depend on
    kretprobes being present in the kernel.

    Thanks to Sam Ravnborg for helping make the patch more lean.

    Per Mathieu's suggestion, added CONFIG_KRETPROBES and fixed up dependencies.

    Signed-off-by: Ananth N Mavinakayanahalli
    Acked-by: Mathieu Desnoyers
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ananth N Mavinakayanahalli
     

19 Feb, 2008

1 commit


10 Feb, 2008

7 commits


09 Feb, 2008

4 commits

  • Background: I've implemented 1K/2K page tables for s390. These sub-page
    page tables are required to properly support the s390 virtualization
    instruction with KVM. The SIE instruction requires that the page tables
    have 256 page table entries (pte) followed by 256 page status table entries
    (pgste). The pgstes are only required if the process is using the SIE
    instruction. The pgstes are updated by the hardware and by the hypervisor
    for a number of reasons, one of them is dirty and reference bit tracking.
    To avoid wasting memory the standard pte table allocation should return
    1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

    Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
    the s390 version of pte_alloc_one cannot return a pointer to a struct
    page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
    cannot return a pointer to a pte either, since that would require more than
    32 bits for the return value of pte_alloc_one (and the pte * would not be
    accessible since it's not kmapped).

    Solution: The only solution I found to this dilemma is a new typedef: a
    pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
    later patch. For everybody else it will be a (struct page *). The
    additional problem with the initialization of the ptl lock and the
    NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
    a destructor pgtable_page_dtor. The page table allocation and free
    functions need to call these two whenever a page table page is allocated or
    freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
    To get the pgtable_t back from a pmd entry that has been installed with
    pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
    call in free_pte_range and apply_to_pte_range.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
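
    The table sizes quoted in the background paragraph follow from the
    entry counts given there: 256 ptes, optionally followed by 256 pgstes.
    The arithmetic can be sketched as below. The function name is
    hypothetical, the entry sizes of 4 and 8 bytes are derived from the
    stated 1K/2K sizes, and pgstes are assumed to be as large as ptes,
    which matches the stated 2K/4K sizes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PTRS_PER_PTE 256   /* entries per pte table, per the text */
#define SKETCH_PAGE_SIZE    4096  /* s390 page size, per the text */

/* Returns the size in bytes of one pte table.
 * is_64bit: nonzero for 64 bit, zero for 31 bit.
 * uses_sie: nonzero if the process needs pgstes after the ptes. */
static size_t sketch_pte_table_bytes(int is_64bit, int uses_sie)
{
    size_t entry_bytes = is_64bit ? 8 : 4;
    size_t bytes = SKETCH_PTRS_PER_PTE * entry_bytes;  /* the 256 ptes */

    if (uses_sie)
        bytes *= 2;  /* 256 pgstes of the same size follow the ptes */

    /* Every variant fits in a single 4K page. */
    assert(bytes <= SKETCH_PAGE_SIZE);
    return bytes;
}
```

    Only the 64 bit table with pgstes fills a whole page. The other three
    variants are sub-page tables, which is why a struct page pointer cannot
    identify them.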
     
  • Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're
    required whether or not A.OUT format is available.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Backend for s390.

    Acked-by: Alan Cox
    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • Currently we may look up the pid in the wrong pid namespace. So
    convert proc_pid_status to seq_file, which ensures the proper pid
    namespace is passed in.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: another build fix]
    [akpm@linux-foundation.org: s390 build fix]
    [akpm@linux-foundation.org: fix task_name() output]
    [akpm@linux-foundation.org: fix nommu build]
    Signed-off-by: Eric W. Biederman
    Cc: Andrew Morgan
    Cc: Serge Hallyn
    Cc: Cedric Le Goater
    Cc: Pavel Emelyanov
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Paul Menage
    Cc: Paul Jackson
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

08 Feb, 2008

4 commits

  • Use the standard __cmpxchg for every type that can be updated atomically.
    Use the new generic cmpxchg_local (disables interrupts) for other types.

    Signed-off-by: Mathieu Desnoyers
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • struct user.u_ar0 is defined to contain a pointer offset on all
    architectures in which it is defined (all architectures which define an
    a.out format except SPARC.) However, it has a pointer type in the headers,
    which is pointless -- is not exported to userspace, and it
    just makes the code messy.

    Redefine the field as "unsigned long" (which is the same size as a pointer
    on all Linux architectures) and change the setting code to use offsetof()
    instead of hand-coded arithmetic.

    Cc: Linux Arch Mailing List
    Cc: Bryan Wu
    Cc: Roman Zippel
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: Lennert Buytenhek
    Cc: Håvard Skinnemoen
    Cc: Mikael Starvik
    Cc: Yoshinori Sato
    Cc: Tony Luck
    Cc: Hirokazu Takata
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    H. Peter Anvin
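
    The switch from hand-coded pointer arithmetic to offsetof() described
    in this commit can be sketched in userspace as follows. The structures
    sketch_regs and sketch_user are hypothetical and much smaller than any
    real struct user. Only the technique is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register block and dump header. */
struct sketch_regs {
    long gpr[4];
    long pc;
};

struct sketch_user {
    unsigned long u_tsize;
    struct sketch_regs regs;
    unsigned long u_ar0;  /* holds an offset, so it is not a pointer type */
};

/* Old style: compute the offset of regs by subtracting addresses. */
static unsigned long sketch_offset_by_hand(void)
{
    struct sketch_user dump;

    return (unsigned long)((char *)&dump.regs - (char *)&dump);
}

/* New style: let the compiler compute the offset and store it. */
static void sketch_set_u_ar0(struct sketch_user *dump)
{
    assert(dump != NULL);
    dump->u_ar0 = offsetof(struct sketch_user, regs);
}
```

    Both variants produce the same value. The offsetof() form states the
    intent directly and needs no casts.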
     
  • Do not export asm/page.h during make headers_install. This removes PAGE_SIZE
    from userspace headers.

    Signed-off-by: Kirill A. Shutemov
    Reviewed-by: David Woodhouse
    Cc: David Howells
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • asm/elf.h, asm/page.h and asm/user.h are no longer exported to userspace,
    so we can drop #ifdef __KERNEL__ for them.

    [k.shutemov@gmail.com: remove #ifdef __KERNEL_]
    Signed-off-by: Kirill A. Shutemov
    Reviewed-by: David Woodhouse
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

07 Feb, 2008

1 commit

  • This moves the ability to scale cputime into generic code. This allows us
    to fix the issue in kernel/timer.c (noticed by Balbir) where we could only
    add an unscaled value to the scaled utime/stime.

    This adds a cputime_to_scaled function. As before, the POWERPC version
    does the scaling based on the last SPURR/PURR ratio calculated. The
    generic and s390 (only other arch to implement asm/cputime.h) versions are
    both NOPs.

    Also moves the SPURR and PURR snapshots closer.

    Signed-off-by: Michael Neuling
    Cc: Jay Lan
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Neuling
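
    The two kinds of cputime_to_scaled behaviour described in this commit,
    a no-op and a ratio-based scaling, can be sketched as below. The type
    and function names are hypothetical, and the ratio version only
    illustrates scaling by a SPURR/PURR style ratio. It is not the POWERPC
    implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_cputime_t;  /* hypothetical cputime type */

/* Generic and s390 style: scaled time equals unscaled time. */
static sketch_cputime_t sketch_cputime_to_scaled_nop(sketch_cputime_t ct)
{
    return ct;
}

/* Ratio style: scale by the last measured spurr/purr delta ratio.
 * Multiplies before dividing, so very large inputs could overflow. */
static sketch_cputime_t sketch_cputime_to_scaled_ratio(sketch_cputime_t ct,
                                                       uint64_t spurr_delta,
                                                       uint64_t purr_delta)
{
    assert(purr_delta != 0);
    return ct * spurr_delta / purr_delta;
}
```

    With both variants behind one function name, generic code can always
    add a scaled value to the scaled utime/stime, whatever the
    architecture.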
     

06 Feb, 2008

3 commits

  • * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
    [S390] dcss: Initialize workqueue before using it.
    [S390] Remove BUILD_BUG_ON() in vmem code.
    [S390] sclp_tty/sclp_vt220: Fix scheduling while atomic
    [S390] dasd: fix panic caused by alias device offline
    [S390] dasd: add ifcc handling
    [S390] latencytop s390 support.
    [S390] Implement ext2_find_next_bit.
    [S390] Cleanup & optimize bitops.
    [S390] Define GENERIC_LOCKBREAK.
    [S390] console: allow vt220 console to be the only console
    [S390] Fix couple of section mismatches.
    [S390] Fix smp_call_function_mask semantics.
    [S390] Fix linker script.
    [S390] DEBUG_PAGEALLOC support for s390.
    [S390] cio: Add shutdown callback for ccwgroup.
    [S390] cio: Update documentation.
    [S390] cio: Clean up chsc response code handling.
    [S390] cio: make sense id procedure work with partial hardware response

    Linus Torvalds
     
  • (with Martin Schwidefsky)

    The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as
    first argument. The free functions do not get the mm_struct argument. This
    is asymmetrical, and to do mm-related page table allocations the mm
    argument is needed on the free functions as well.

    [kamalesh@linux.vnet.ibm.com: i386 fix]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Kamalesh Babulal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     
  • The following replaces the earlier patches sent. It should address
    David Rientjes's comments, and has been compile tested on all the
    architectures that it touches, save for parisc.

    For the /proc//pagemap code[1], we need to be able to query how
    much virtual address space a particular task has. The trick is
    that we do it through /proc and can't use TASK_SIZE since it
    references "current" on some arches. The process opening the
    /proc file might be a 32-bit process opening a 64-bit process's
    pagemap file.

    x86_64 already has a TASK_SIZE_OF() macro:

    #define TASK_SIZE_OF(child) ((test_tsk_thread_flag(child, TIF_IA32)) ? IA32_PAGE_OFFSET : TASK_SIZE64)

    I'd like to have that for other architectures. So, add it
    for all the architectures that actually use "current" in
    their TASK_SIZE. For the others, just add a quick #define
    in sched.h to use plain old TASK_SIZE.

    1. http://www.linuxworld.com/news/2007/042407-kernel.html

    - MIPS portion from Ralf Baechle

    [akpm@linux-foundation.org: fix mips build]
    Signed-off-by: Dave Hansen
    Signed-off-by: Ralf Baechle
    Signed-off-by: Matt Mackall
    Acked-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
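
    The difference between a TASK_SIZE that implicitly refers to "current"
    and a TASK_SIZE_OF() that takes the task as an argument can be modelled
    in userspace as follows. The task structure, the flag and the two size
    constants are hypothetical and do not match any real architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address space limits for 32-bit and 64-bit tasks. */
#define SKETCH_TASK_SIZE_32 0x80000000ULL
#define SKETCH_TASK_SIZE_64 0x20000000000ULL

/* Hypothetical task descriptor carrying only the flag of interest. */
struct sketch_task {
    bool is_32bit;
};

/* Per-task variant: the answer depends on the task passed in,
 * not on whichever task happens to be running. */
#define SKETCH_TASK_SIZE_OF(tsk) \
    ((tsk)->is_32bit ? SKETCH_TASK_SIZE_32 : SKETCH_TASK_SIZE_64)

/* Function wrapper around the macro for callers that prefer one. */
static unsigned long long sketch_task_size_of(const struct sketch_task *tsk)
{
    assert(tsk != NULL);
    return SKETCH_TASK_SIZE_OF(tsk);
}
```

    A 32-bit reader inspecting a 64-bit task gets the 64-bit limit this
    way, because the inspected task is passed in explicitly.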
     

05 Feb, 2008

1 commit

  • Remove BUILD_BUG_ON() in vmem code since it causes build failures if
    the size of struct page increases. Instead calculate at compile time
    the highest physical address that can be added to the 1:1 mapping.
    This is supposed to fix a build failure with the page owner tracking
    leak detector patches as reported by akpm.

    page-owner-tracking-leak-detector-broken-on-s390.patch can be removed
    from -mm again when this is merged.

    Cc: Andrew Morton
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens