30 Oct, 2011

2 commits


24 Jul, 2011

1 commit

  • The alignment is missing for various global symbols in s390 assembly code.
    With a recent gcc and an instruction like stgrl this can lead to a
    specification exception if the instruction uses such a mis-aligned address.

    Specify the alignment explicitly and, while at it, define __ALIGN for s390
    and use the ENTRY define to save some lines of code.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky

    Jan Glauber
     

26 May, 2011

1 commit


11 Mar, 2011

2 commits

  • Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
    prototypes to use u32 types for the futex as this is the data type the
    futex core code uses all over the place.

    Signed-off-by: Michel Lespinasse
    Cc: Darren Hart
    Cc: Peter Zijlstra
    Cc: Matt Turner
    Cc: Russell King
    Cc: David Howells
    Cc: Tony Luck
    Cc: Michal Simek
    Cc: Ralf Baechle
    Cc: "James E.J. Bottomley"
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Michel Lespinasse
     
  • The cmpxchg_futex_value_locked API was funny in that it returned either
    the original, user-exposed futex value OR an error code such as -EFAULT.
    This was confusing at best, and could be a source of livelocks in places
    that retry the cmpxchg_futex_value_locked after trying to fix the issue
    by running fault_in_user_writeable().

    This change makes the cmpxchg_futex_value_locked API more similar to the
    get_futex_value_locked one, returning an error code and updating the
    original value through a reference argument.

    Signed-off-by: Michel Lespinasse
    Acked-by: Chris Metcalf [tile]
    Acked-by: Tony Luck [ia64]
    Acked-by: Thomas Gleixner
    Tested-by: Michal Simek [microblaze]
    Acked-by: David Howells [frv]
    Cc: Darren Hart
    Cc: Peter Zijlstra
    Cc: Matt Turner
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: "James E.J. Bottomley"
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Michel Lespinasse
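
The ambiguity described above can be reproduced in a small user-space sketch. Everything here is hypothetical and for illustration only: `struct fake_futex`, `old_style_cmpxchg` and `new_style_cmpxchg` are invented names, the `faulting` flag merely simulates an inaccessible user page, and none of this is the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a futex word in user memory. "faulting"
 * simulates a page that cannot be accessed. This is not kernel code. */
struct fake_futex {
    uint32_t value;
    int faulting;
};

/* Old convention: one return value carries either the previous futex
 * value or a negative error code. */
static int old_style_cmpxchg(struct fake_futex *f, uint32_t oldval,
                             uint32_t newval)
{
    uint32_t cur;

    assert(f != NULL);
    if (f->faulting)
        return -EFAULT;
    cur = f->value;
    if (cur == oldval)
        f->value = newval;
    /* A stored value equal to (uint32_t)-EFAULT is indistinguishable
     * from a real fault once it is returned through an int. */
    return (int)cur;
}

/* New convention: the return value is only a status code, and the
 * previous futex value is written through a separate pointer. */
static int new_style_cmpxchg(uint32_t *curval, struct fake_futex *f,
                             uint32_t oldval, uint32_t newval)
{
    assert(curval != NULL && f != NULL);
    if (f->faulting)
        return -EFAULT;
    *curval = f->value;
    if (*curval == oldval)
        f->value = newval;
    return 0;
}
```

With the old shape, a caller that sees -EFAULT cannot tell whether to fault the page in and retry or to treat the result as a futex value. With the new shape, the status and the value never share a channel, so a retry loop only runs on a genuine fault.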
     

31 Jan, 2011

1 commit

  • The uaccess functions copy_in_user_std and clear_user_std fail to
    switch back from secondary space mode to primary space mode with sacf
    in case of an unresolvable page fault. We need to make sure that the
    switch back to primary mode is done in all cases, otherwise the code
    following the uaccess inline assembly will crash.

    Reported-by: Alexander Graf
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

05 Jan, 2011

1 commit


25 Nov, 2010

1 commit

  • On each machine check all registers are revalidated. The save area for
    the clock comparator however only contains the uppermost seven bytes
    of the former contents, if valid.
    Therefore the machine check handler uses a store clock instruction to
    get the current time and writes that to the clock comparator register
    which in turn will generate an immediate timer interrupt.
    However within the lowcore the expected time of the next timer
    interrupt is stored. If the interrupt happens before that time the
    handler won't be called. In turn the clock comparator won't be
    reprogrammed and therefore the interrupt condition stays pending which
    causes an interrupt loop until the expected time is reached.

    On NOHZ machines this can result in unresponsive machines since the
    time of the next expected interrupt can be a couple of days in the
    future.

    To fix this just revalidate the clock comparator register with the
    expected value.
    In addition the special handling for udelay must be changed as well.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

08 Mar, 2010

1 commit


27 Feb, 2010

2 commits

  • This patch introduces a new function that checks the running status
    of a cpu in a hypervisor. This status is not virtualized, so the check
    is only correct if running in an LPAR. On acquiring a spinlock, if the
    cpu holding the lock is scheduled by the hypervisor, we do a busy wait
    on the lock. If it is not scheduled, we yield over to that cpu.

    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
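
The wait policy described above can be sketched as a single decision function. This is only an illustration: `struct lock_sketch`, `enum wait_step` and `next_wait_step` are hypothetical names, and the `owner_running` argument stands in for the result of the new running-status check, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock word: the owning cpu number, or -1 when free. */
struct lock_sketch {
    int owner_cpu;
};

enum wait_step {
    STEP_ACQUIRED, /* lock was free and is now ours */
    STEP_SPIN,     /* owner is scheduled: busy wait on the lock */
    STEP_YIELD     /* owner is not scheduled: yield to that cpu */
};

/* Decide what a cpu should do on one pass of the lock acquisition loop.
 * owner_running is the (assumed) answer of the hypervisor status check
 * for the cpu that currently holds the lock. */
static enum wait_step next_wait_step(struct lock_sketch *lock, int self,
                                     bool owner_running)
{
    assert(lock != NULL);
    if (lock->owner_cpu < 0) {
        lock->owner_cpu = self;
        return STEP_ACQUIRED;
    }
    return owner_running ? STEP_SPIN : STEP_YIELD;
}
```

Spinning is only worthwhile while the holder can make progress. If the hypervisor has descheduled the holder, yielding to it is more likely to get the lock released.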
     
  • Same as on x86 and sparc, besides the fact that enabling the option
    will just emit compile time warnings instead of errors.
    Keeps allyesconfig kernels compiling.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

14 Jan, 2010

1 commit


15 Dec, 2009

4 commits

  • Name space cleanup for rwlock functions. No functional change.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Cc: linux-arch@vger.kernel.org

    Thomas Gleixner
     
  • Not strictly necessary for -rt as -rt does not have non sleeping
    rwlocks, but it's odd to not have a consistent naming convention.

    No functional change.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Cc: linux-arch@vger.kernel.org

    Thomas Gleixner
     
  • Name space cleanup. No functional change.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Cc: linux-arch@vger.kernel.org

    Thomas Gleixner
     
  • The raw_spin* namespace was taken by lockdep for the architecture
    specific implementations. raw_spin_* would be the ideal name space for
    the spinlocks which are not converted to sleeping locks in preempt-rt.

    Linus suggested converting the raw_ to arch_ locks and cleaning up the
    name space instead of using an artificial name like core_spin,
    atomic_spin or whatever.

    No functional change.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Cc: linux-arch@vger.kernel.org

    Thomas Gleixner
     

07 Dec, 2009

2 commits

  • The pagetable walk usercopy functions have used a modified copy of the
    do_exception() function for fault handling. This led to inconsistencies
    with recent changes to do_exception(), e.g. performance counters. This
    patch changes the pagetable walk usercopy code to call do_exception()
    directly, eliminating the redundancy. A new parameter is added to
    do_exception() to specify the fault address.

    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
     
  • Introduce user_mode to replace the two variables switch_amode and
    s390_noexec. There are three valid combinations of the old values:
    1) switch_amode == 0 && s390_noexec == 0
    2) switch_amode == 1 && s390_noexec == 0
    3) switch_amode == 1 && s390_noexec == 1
    They get replaced by
    1) user_mode == HOME_SPACE_MODE
    2) user_mode == PRIMARY_SPACE_MODE
    3) user_mode == SECONDARY_SPACE_MODE
    The new kernel parameter user_mode=[primary,secondary,home] lets
    you choose the address space mode the user space processes should
    use. In addition the CONFIG_S390_SWITCH_AMODE config option
    is removed.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
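
The mapping from the two old variables to the new one can be written down as a small function. This is a sketch only: the enum name `user_mode_sketch`, its numeric values and the helper `legacy_to_user_mode` are assumptions made for illustration, and the -1 result for the fourth combination simply reflects that the text above lists it as not valid.

```c
#include <assert.h>

/* Hypothetical enum; the numeric values are assumptions. */
enum user_mode_sketch {
    HOME_SPACE_MODE,
    PRIMARY_SPACE_MODE,
    SECONDARY_SPACE_MODE
};

/* Translate the old (switch_amode, s390_noexec) pair into the new
 * user_mode value. Returns -1 for switch_amode == 0 with
 * s390_noexec == 1, which is not one of the three valid combinations. */
static int legacy_to_user_mode(int switch_amode, int s390_noexec)
{
    assert(switch_amode == 0 || switch_amode == 1);
    assert(s390_noexec == 0 || s390_noexec == 1);

    if (!switch_amode)
        return s390_noexec ? -1 : HOME_SPACE_MODE;
    return s390_noexec ? SECONDARY_SPACE_MODE : PRIMARY_SPACE_MODE;
}
```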
     

06 Oct, 2009

3 commits

  • This patch adds an EX_TABLE entry to mvc{p|s|os} usercopy functions that
    may be called with KERNEL_DS. In combination with collaborative memory
    management, kernel pages marked as unused may trigger an addressing exception
    in the usercopy functions. This fixes an unhandled addressing exception bug
    where strncpy_from_user() is used with len > strnlen and KERNEL_DS, crossing
    a page boundary to an unused page.

    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
     
  • Use our own implementation instead of the common code udelay loop.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • When udelay() gets called with a delay that would expire before the
    next clock event it reprograms the clock comparator.
    When the interrupt happens the clock comparator won't be reset,
    therefore the interrupt condition doesn't get cleared.
    The result is an endless timer interrupt loop until the next clock
    event would expire (stored in lowcore).
    So udelay() usually would wait much longer for small delays than it
    should.

    Fix this by disabling the local tick which makes sure that the clock
    comparator will be reset when a timer interrupt happens.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
     

07 Jul, 2009

2 commits

  • Provide __ucmpdi2() helper function on 31 bit so we don't run
    into compile errors like this one again and again:

    kernel/built-in.o: In function `T.689':
    perf_counter.c:(.text+0x56c86): undefined reference to `__ucmpdi2'

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
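
For readers unfamiliar with the helper: the compiler emits a call to `__ucmpdi2` for some unsigned 64-bit comparisons on 32-bit targets, and the libgcc convention is to return 0, 1 or 2 for less than, equal and greater than. A minimal sketch of such a comparison follows, built from 32-bit halves. The name `ucmpdi2_sketch` is hypothetical, and this is not the code added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 64-bit three-way compare using only 32-bit comparisons.
 * Returns 0 if a < b, 1 if a == b, 2 if a > b (libgcc convention). */
static int ucmpdi2_sketch(uint64_t a, uint64_t b)
{
    uint32_t ah = (uint32_t)(a >> 32), al = (uint32_t)a;
    uint32_t bh = (uint32_t)(b >> 32), bl = (uint32_t)b;

    if (ah < bh)
        return 0;
    if (ah > bh)
        return 2;
    /* High words are equal: the low words decide. */
    if (al < bl)
        return 0;
    if (al > bl)
        return 2;
    return 1;
}

/* Small usage example covering each outcome. */
static void ucmpdi2_selfcheck(void)
{
    assert(ucmpdi2_sketch(1, 2) == 0);
    assert(ucmpdi2_sketch(2, 2) == 1);
    assert(ucmpdi2_sketch(3, 2) == 2);
}
```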
     
  • Our udelay implementation enables interrupts to receive a special timer
    interrupt regardless of the context it is called from.
    This might lead to false positive lockdep reports. Since lockdep isn't
    aware of the fact that only a single interrupt source is enabled it
    warns about possible deadlocks that in reality won't happen, like
    the one below.
    To fix this disable lockdep before enabling interrupts.

    [ 254.040888] =================================
    [ 254.040904] [ INFO: inconsistent lock state ]
    [ 254.040910] 2.6.30 #9
    [ 254.040914] ---------------------------------
    [ 254.040920] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
    [ 254.040927] swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
    [ 254.040934] (sch->lock){?.-...}, at: [] ccw_device_timeout+0x48/0x2f0
    [ 254.040961] {IN-HARDIRQ-W} state was registered at:
    [ 254.040969] [] __lock_acquire+0x9d4/0x188c
    [ 254.040985] [] lock_acquire+0x13c/0x16c
    [ 254.040998] [] _spin_lock+0x74/0xb8
    [ 254.041016] [] do_IRQ+0xde/0x208
    [ 254.041031] [] io_return+0x0/0x8
    [ 254.041049] [] vtime_stop_cpu+0xbe/0x114
    [ 254.041066] irq event stamp: 259629
    [ 254.041076] hardirqs last enabled at (259628): [] _spin_unlock_irq+0x5e/0x9c
    [ 254.041095] hardirqs last disabled at (259629): [] _spin_lock_irq+0x4a/0xc4
    [ 254.041126] softirqs last enabled at (259614): [] __do_softirq+0x296/0x2b0
    [ 254.041137] softirqs last disabled at (259619): [] do_softirq+0x102/0x108
    [ 254.041147]
    [ 254.041148] other info that might help us debug this:
    [ 254.041153] 2 locks held by swapper/0:
    [ 254.041157] #0: (&priv->timer){+.-...}, at: [] run_timer_softirq+0x19a/0x340
    [ 254.041170] #1: (sch->lock){?.-...}, at: [] ccw_device_timeout+0x48/0x2f0
    [ 254.041182]
    [ 254.041310] Call Trace:
    [ 254.041313] ([] show_trace+0x16c/0x170)
    [ 254.041321] [] show_stack+0x78/0x104
    [ 254.041327] [] dump_stack+0xc6/0xd4
    [ 254.041342] [] print_usage_bug+0x1c8/0x1fc
    [ 254.041353] [] mark_lock+0x4a2/0x670
    [ 254.041364] [] mark_held_locks+0x8a/0xb4
    [ 254.041375] [] trace_hardirqs_on_caller+0x74/0x1ac
    [ 254.041388] [] trace_hardirqs_on+0x2a/0x38
    [ 254.041402] [] __udelay_disabled+0xac/0xfc
    [ 254.041419] [] __udelay+0x12a/0x148
    [ 254.041433] [] cio_commit_config+0x170/0x290
    [ 254.041451] [] cio_disable_subchannel+0x120/0x1cc
    [ 254.041468] [] ccw_device_recog_done+0x54/0x2f4
    [ 254.041485] [] ccw_device_sense_id_done+0x50/0x90
    [ 254.041508] [] snsid_callback+0xfa/0x3a8
    [ 254.041515] [] ccwreq_stop+0x80/0x90
    [ 254.041523] [] ccw_request_timeout+0xc2/0xd0
    [ 254.041530] [] ccw_device_request_event+0x58/0x90
    [ 254.041537] [] ccw_device_timeout+0x7e/0x2f0
    [ 254.041555] [] run_timer_softirq+0x22a/0x340
    [ 254.041566] [] __do_softirq+0x138/0x2b0
    [ 254.041578] [] do_softirq+0x102/0x108
    [ 254.041590] [] irq_exit+0xee/0x114
    [ 254.041603] [] do_extint+0x130/0x17c
    [ 254.041617] [] ext_no_vtime+0x1e/0x22
    [ 254.041631] [] vtime_stop_cpu+0xbe/0x114
    [ 254.041646] ([] vtime_stop_cpu+0x6c/0x114)
    [ 254.041662] [] cpu_idle+0x122/0x1c0
    [ 254.041679] [] start_secondary+0xce/0xe0
    [ 254.041696] [] 0x0
    [ 254.041715] [] 0x0
    [ 254.041745] INFO: lockdep is turned off.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

22 Jun, 2009

1 commit

  • This allows the callers to now pass down the full set of FAULT_FLAG_xyz
    flags to handle_mm_fault(). All callers have been (mechanically)
    converted to the new calling convention. There's almost certainly room
    for architectures to clean up their code and then add FAULT_FLAG_RETRY
    when that support is added.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

12 Jun, 2009

1 commit


26 Mar, 2009

2 commits


18 Mar, 2009

2 commits

  • pfn_valid() actually checks for a valid struct page and not for a
    valid pfn. Using xip mappings w/o struct pages, this will result in
    -EFAULT returned by the (page table walk) user copy functions,
    even though there is valid memory. Those user copy functions don't
    need a struct page, so this patch just removes the pfn_valid() check.

    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
     
  • The implementation of __div64_31 for G5 machines is broken. The comments
    in __div64_31 are correct, only the code does not do what the comments
    say. The part "If the remainder has overflown subtract base and increase
    the quotient" is only partially realized, the base is subtracted correctly
    but the quotient is only increased if the dividend had the last bit set.
    Using the correct instruction fixes the problem.

    Cc: stable@kernel.org
    Reported-by: Frans Pop
    Tested-by: Frans Pop
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

11 Oct, 2008

1 commit


04 Oct, 2008

1 commit

  • This fixes a regression that came with 934b2857cc576ae53c92a66e63fce7ddcfa74691
    ("[S390] nohz/sclp: disable timer on synchronous waits.").
    If udelay() gets called from a disabled context it sets the clock comparator
    to a value where it expects the next interrupt. When the interrupt happens
    the clock comparator does not get reset and therefore the interrupt condition
    doesn't get cleared. The result is an endless timer interrupt loop.

    In addition this patch also fixes the following:

    rcutorture reveals that our __udelay implementation is still buggy,
    since it might schedule tasklets, but prevents their execution:

    NOHZ: local_softirq_pending 42
    NOHZ: local_softirq_pending 02
    NOHZ: local_softirq_pending 142
    NOHZ: local_softirq_pending 02

    To fix this we make sure that only the clock comparator interrupt
    is enabled when the enabled wait psw is loaded.
    Also no code gets called anymore which might schedule tasklets.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

01 Aug, 2008

1 commit

  • sclp_sync_wait waits synchronously for an sclp interrupt and disables
    timer interrupts. However on the irq enter paths there is an extra
    check that calls the timer callback if a timer interrupt is due.
    This would schedule softirqs in the wrong context.
    So introduce local_tick_enable/disable which prevents this.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

30 Apr, 2008

2 commits


17 Apr, 2008

2 commits

  • The current uaccess page table walk code assumes at a few places that
    any access is a user space access. This is not correct if somebody
    has issued a set_fs(KERNEL_DS) in advance.
    Add code which checks which address space we are in and with this make
    sure we access the correct address space. This way we also get rid of
    the dirty
        if (!current->mm)
                return -EFAULT;
    hack in futex_atomic_cmpxchg_pt.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
    variant instead. In addition we get high resolution timers for free.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Heiko Carstens
     

21 Mar, 2008

1 commit

  • a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
    robust functionality" introduces a test whether futex in atomic stuff
    works or not.
    It does that by writing to address 0 of the kernel address space. This
    will crash on older machines where addressing mode switching is enabled
    but where the mvcos instruction is not available. Page table walking is
    done by hand and therefore the code tries to access current->mm which
    is NULL.
    Therefore add an extra check, so we survive the early test.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

19 Feb, 2008

1 commit

  • Add missing exception table entry so that the kernel can handle
    protection exceptions as well on the cs instruction. Currently only
    specification exceptions are handled correctly.
    The missing entry allows user space to crash the kernel.

    Cc: stable
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

26 Jan, 2008

1 commit

  • In s390's spin_lock_irqsave, interrupts remain disabled while
    spinning. In other architectures like x86 and powerpc, interrupts are
    re-enabled while spinning if IRQ is not masked before spin_lock_irqsave
    is called.

    The following patch re-enables interrupts through local_irq_restore
    while spinning for a lock acquisition.
    This can improve system response.

    [heiko.carstens@de.ibm.com: removed saving of pc]

    Signed-off-by: Hisashi Hifumi
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Hisashi Hifumi