14 Jun, 2007

1 commit


14 Dec, 2006

1 commit

  • Currently, to tell a task that it should go to the refrigerator, we set the
    PF_FREEZE flag for it and send a fake signal to it. Unfortunately there
    are two SMP-related problems with this approach. First, a task running on
    another CPU may be updating its flags while the freezer attempts to set
    PF_FREEZE for it and this may leave the task's flags in an inconsistent
    state. Second, there is a potential race between freeze_process() and
    refrigerator() in which freeze_process() running on one CPU is reading a
    task's PF_FREEZE flag while refrigerator() running on another CPU has just
    set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it. If
    the refrigerator wins the race, freeze_process() will conclude that PF_FREEZE
    hasn't been set for the task and will set it unnecessarily, so the task
    will go to the refrigerator once again after it's been thawed.

    To solve the first of these problems we need to stop using PF_FREEZE to tell
    tasks that they should go to the refrigerator. Instead, we can introduce a
    special TIF_*** flag and use it for this purpose, since changing other
    tasks' TIF_*** flags is allowed and there are special calls for it.

    To avoid the freeze_process()-refrigerator() race we can make
    freeze_process() always check the task's PF_FROZEN flag after it has read
    its "freeze" flag. We should also make sure that refrigerator() always
    resets the task's "freeze" flag after it has set PF_FROZEN for it.
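
The ordering described above can be modelled in user space. This is only a minimal sketch under stated assumptions: `struct task`, the bit values, `refrigerator_enter()` and `thaw()` are hypothetical stand-ins, and C11 atomics stand in for the kernel's own flag helpers. It is not the patch's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's task state (assumptions). */
#define TIF_FREEZE_BIT (1u << 0) /* "freeze" request; other CPUs may set it */
#define PF_FROZEN_BIT  (1u << 1) /* set by the task once it is frozen */

struct task {
    atomic_uint tif_flags; /* models the thread_info flags */
    atomic_uint pf_flags;  /* models task->flags */
};

/* Freezer side: request a freeze unless the task is already frozen.
 * Returns true if a freeze request is pending afterwards. */
static bool freeze_process(struct task *t)
{
    unsigned int tif = atomic_load(&t->tif_flags);

    /* Always check PF_FROZEN after reading the "freeze" flag, so a
     * refrigerator that has just cleared the request is noticed. */
    if (atomic_load(&t->pf_flags) & PF_FROZEN_BIT)
        return false;

    if (!(tif & TIF_FREEZE_BIT))
        atomic_fetch_or(&t->tif_flags, TIF_FREEZE_BIT);
    return true;
}

/* Task side: set PF_FROZEN first, then clear the "freeze" request. */
static void refrigerator_enter(struct task *t)
{
    atomic_fetch_or(&t->pf_flags, PF_FROZEN_BIT);
    atomic_fetch_and(&t->tif_flags, ~TIF_FREEZE_BIT);
}

/* Thawing clears PF_FROZEN; no stale freeze request should remain. */
static void thaw(struct task *t)
{
    atomic_fetch_and(&t->pf_flags, ~PF_FROZEN_BIT);
}
```

Because the request bit is changed only with atomic read-modify-write operations, a concurrent update of the task's other flags cannot be lost, which is the property the TIF_*** flag is meant to provide.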

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Russell King
    Cc: David Howells
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

26 Apr, 2006

1 commit


18 Apr, 2006

1 commit

  • We weren't using the recommended sequence for putting the CPU into
    nap mode. When I changed the idle loop, for some reason 7447A cpus
    started hanging when we put them into nap mode. Changing to the
    recommended sequence fixes that.

    The complexity here is that the recommended sequence is a loop that
    keeps putting the cpu back into nap mode. Clearly we need some way
    to break out of the loop when an interrupt (external interrupt,
    decrementer, performance monitor) occurs. Here we use a bit in
    the thread_info struct to indicate that we need this, and the exception
    entry code notices this and arranges for the exception to return
    to the value in the link register, thus breaking out of the loop.
    We use a new `local_flags' field in the thread_info which we can
    alter without needing to use an atomic update sequence.

    The PPC970 has the same recommended sequence, so we do the same thing
    there too.

    This also fixes a bug in the kernel stack overflow handling code on
    32-bit, since it was causing a value that we needed in a register to
    get trashed.

    Signed-off-by: Paul Mackerras

    Paul Mackerras
     

08 Mar, 2006

1 commit

  • A careful reading of the recent changes to the system call entry/exit
    paths revealed several problems, plus some things that could be
    simplified and improved:

    * 32-bit wasn't testing the _TIF_NOERROR bit in the syscall fast exit
    path, so it was only doing anything with it once it saw some other
    bit being set. In other words, the noerror behaviour would apply to
    the next system call where we had to reschedule or deliver a signal,
    which is not necessarily the current system call.

    * 32-bit wasn't doing the call to ptrace_notify in the syscall exit
    path when the _TIF_SINGLESTEP bit was set.

    * _TIF_RESTOREALL was in both _TIF_USER_WORK_MASK and
    _TIF_PERSYSCALL_MASK, which is odd since _TIF_RESTOREALL is only set
    by system calls. I took it out of _TIF_USER_WORK_MASK.

    * On 64-bit, _TIF_RESTOREALL wasn't causing the non-volatile registers
    to be restored (unless perhaps a signal was delivered or the syscall
    was traced or single-stepped). Thus the non-volatile registers
    weren't restored on exit from a signal handler. We probably got
    away with it mostly because signal handlers written in C wouldn't
    alter the non-volatile registers.

    * On 32-bit I simplified the code and made it more like 64-bit by
    making the syscall exit path jump to ret_from_except to handle
    preemption and signal delivery.

    * 32-bit was calling do_signal unnecessarily when _TIF_RESTOREALL was
    set - but I think because of that 32-bit was actually restoring the
    non-volatile registers on exit from a signal handler.

    * I changed the order of enabling interrupts and saving the
    non-volatile registers before calling do_syscall_trace_leave; now we
    enable interrupts first.
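
The first and third points can be illustrated with a small C model of the exit-path flag test. It is a sketch under assumptions: the bit values, the `*_BIT` names, and the `syscall_exit()` helper are hypothetical, and the real code is assembly. The model shows why a per-syscall bit such as the noerror bit has to be part of the fast-path test, and keeps the restore-all bit out of the user-work mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this model. */
#define TIF_SIGPENDING_BIT   (1u << 0)
#define TIF_NEED_RESCHED_BIT (1u << 1)
#define TIF_NOERROR_BIT      (1u << 2)
#define TIF_RESTOREALL_BIT   (1u << 3)

/* Work that may be pending on any return to user mode. */
#define USER_WORK_MASK  (TIF_SIGPENDING_BIT | TIF_NEED_RESCHED_BIT)
/* Bits set by a system call that apply to that call only. */
#define PERSYSCALL_MASK (TIF_NOERROR_BIT | TIF_RESTOREALL_BIT)

/* Old behaviour in the model: per-syscall bits do not divert the exit. */
static bool needs_slow_path_buggy(unsigned int flags)
{
    return (flags & USER_WORK_MASK) != 0;
}

/* Fixed behaviour in the model: per-syscall bits are tested as well. */
static bool needs_slow_path_fixed(unsigned int flags)
{
    return (flags & (USER_WORK_MASK | PERSYSCALL_MASK)) != 0;
}

/* Returns the flags left behind after the exit path has run. */
static unsigned int syscall_exit(unsigned int flags,
                                 bool (*needs_slow)(unsigned int),
                                 bool *noerror_applied)
{
    *noerror_applied = false;
    if (!needs_slow(flags))
        return flags;                /* fast path: flags untouched */
    if (flags & TIF_NOERROR_BIT)
        *noerror_applied = true;     /* slow path honours noerror */
    return flags & ~PERSYSCALL_MASK; /* per-syscall bits are consumed */
}
```

With the buggy test, a call that sets only the noerror bit leaves through the fast path with the bit still set, so it takes effect on some later call that happens to need rescheduling or signal delivery.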

    Signed-off-by: Paul Mackerras

    Paul Mackerras
     

24 Feb, 2006

1 commit

  • The runlatch SPR can take a lot of time to write. My original runlatch
    code would set it on every exception entry even though most of the time
    this was not required. It would also continually set it in the idle
    loop, which is an issue on an SMT capable processor.

    Now we cache the runlatch value in a threadinfo bit, and only check for
    it in decrementer and hardware interrupt exceptions as well as the idle
    loop. Boot tested on POWER3, POWER5 and iseries, and compile tested on
    pmac32.
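
The caching idea can be sketched in plain C. Everything here is an assumption made for illustration: `struct cpu_model`, the bit value, and the write counter are hypothetical, and the register write is simulated rather than a real SPR access.

```c
#include <assert.h>
#include <stdbool.h>

#define TIF_RUNLATCH_BIT (1u << 0) /* hypothetical bit value */

struct cpu_model {
    unsigned int ti_flags;   /* models the thread_info flags word */
    unsigned int spr_writes; /* counts simulated (slow) SPR writes */
    bool runlatch;           /* simulated runlatch register state */
};

/* Stand-in for the expensive register write. */
static void write_runlatch_spr(struct cpu_model *c, bool on)
{
    c->runlatch = on;
    c->spr_writes++;
}

/* Turn the runlatch on only if the cached bit says it is off. */
static void runlatch_on(struct cpu_model *c)
{
    if (c->ti_flags & TIF_RUNLATCH_BIT)
        return; /* already on: skip the slow write */
    write_runlatch_spr(c, true);
    c->ti_flags |= TIF_RUNLATCH_BIT;
}

/* Turn the runlatch off only if the cached bit says it is on. */
static void runlatch_off(struct cpu_model *c)
{
    if (!(c->ti_flags & TIF_RUNLATCH_BIT))
        return; /* already off: skip the slow write */
    write_runlatch_spr(c, false);
    c->ti_flags &= ~TIF_RUNLATCH_BIT;
}
```

In this model, repeated calls from the idle loop or from exception entry cost one flag test each, and the register is written only when the state actually changes.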

    Signed-off-by: Anton Blanchard
    Signed-off-by: Paul Mackerras

    Anton Blanchard
     

08 Feb, 2006

1 commit


19 Jan, 2006

1 commit


13 Jan, 2006

1 commit


09 Jan, 2006

1 commit

  • This cleanup patch speeds up the null syscall path on ppc64 by about 3%,
    and brings the ppc32 and ppc64 code slightly closer together.

    The ppc64 code was checking current_thread_info()->flags twice in the
    syscall exit path; once for TIF_SYSCALL_T_OR_A before disabling
    interrupts, and then again for TIF_SIGPENDING|TIF_NEED_RESCHED etc after
    disabling interrupts. Now we do the same as ppc32 -- check the flags
    only once in the fast path, and re-enable interrupts if necessary in the
    ptrace case.

    The patch abolishes the 'syscall_noerror' member of struct thread_info
    and replaces it with a TIF_NOERROR bit in the flags, which is handled in
    the slow path. This shortens the syscall entry code, which no longer
    needs to clear syscall_noerror.

    The patch adds a TIF_SAVE_NVGPRS flag which causes the syscall exit slow
    path to save the non-volatile GPRs into a signal frame. This removes the
    need for the assembly wrappers around sys_sigsuspend(),
    sys_rt_sigsuspend(), et al which existed solely to save those registers
    in advance. It also means I don't have to add new wrappers for ppoll()
    and pselect(), which is what I was supposed to be doing when I got
    distracted into this...

    Finally, it unifies the ppc64 and ppc32 methods of handling syscall exit
    directly into a signal handler (as required by sigsuspend et al) by
    introducing a TIF_RESTOREALL flag which causes _all_ the registers to be
    reloaded from the pt_regs by taking the ret_from_exception path, instead
    of the normal syscall exit path which stomps on the callee-saved GPRs.

    It appears to pass an LTP test run on ppc64, and passes basic testing on
    ppc32 too. Brief tests of ptrace functionality with strace and gdb also
    appear OK. I wouldn't send it to Linus for 2.6.15 just yet though :)

    Signed-off-by: David Woodhouse
    Signed-off-by: Paul Mackerras

    David Woodhouse
     

07 Nov, 2005

1 commit

  • Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
    base page size to 64K. The resulting kernel still boots on any
    hardware. On current machines with 4K page support only, the kernel
    will maintain 16 "subpages" for each 64K page transparently.
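
The figure of 16 follows from 64K / 4K. The arithmetic can be written out as below; the macro and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; only the sizes come from the text. */
#define BASE_PAGE_SHIFT 16 /* 64K kernel base page */
#define HW_PAGE_SHIFT   12 /* 4K hardware page */
#define SUBPAGES_PER_PAGE (1u << (BASE_PAGE_SHIFT - HW_PAGE_SHIFT))

/* Number of the 64K base page that contains addr. */
static uint64_t base_page_number(uint64_t addr)
{
    return addr >> BASE_PAGE_SHIFT;
}

/* Index (0..15) of the 4K subpage within its 64K base page. */
static unsigned int subpage_index(uint64_t addr)
{
    return (unsigned int)((addr >> HW_PAGE_SHIFT) & (SUBPAGES_PER_PAGE - 1));
}
```

For example, address 0x12345 lies in base page 1, subpage 2.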

    Note that while real 64K capable HW has been tested, the current patch
    will not enable it yet as such hardware is not released yet, and I'm
    still verifying with the firmware architects the proper way to get the
    information from the newer hypervisors.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

27 Oct, 2005

1 commit

  • In readiness for 64k pages, when THREAD_SIZE will be less than
    PAGE_SIZE, ppc64 uses kmalloc() rather than __get_free_pages() to
    allocate kernel stacks, and since thread_info.h was merged, so does
    ppc32. However that adds some overhead which we don't really want
    when PAGE_SIZE

    Signed-off-by: Paul Mackerras

    David Gibson
     

21 Oct, 2005

1 commit

  • Merge ppc32 and ppc64 versions of thread_info.h. They were pretty
    similar already; the chief changes are:

    - Instead of inline asm to implement current_thread_info(),
    which needs to be different for ppc32 and ppc64, we use C with an
    asm("r1") register variable. gcc turns it into the same asm as we
    used to have for both platforms.
    - We replace ppc32's 'local_flags' with the ppc64
    'syscall_noerror' field. The noerror flag was in fact the only thing
    in the local_flags field anyway, so the ppc64 approach is simpler, and
    means we only need a load-immediate/store instead of load/mask/store
    when clearing the flag.
    - In readiness for 64k pages, when THREAD_SIZE will be less
    than a page, ppc64 used kmalloc() rather than get_free_pages() to
    allocate the kernel stack. With this patch we do the same for ppc32,
    since there's no strong reason not to.
    - For ppc64, we no longer export THREAD_SHIFT and THREAD_SIZE
    via asm-offsets; thread_info.h can now be safely included in asm, as
    on ppc32.
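
The first item can be illustrated with a portable sketch. It assumes that thread_info sits at the base of a THREAD_SIZE-aligned kernel stack, so masking the stack pointer yields its address. The THREAD_SHIFT value of 13 and the `thread_info_base()` helper are assumptions for this example; the stack pointer is passed in as a parameter rather than read from r1.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for this sketch: an 8K kernel stack. */
#define THREAD_SHIFT 13
#define THREAD_SIZE  (1UL << THREAD_SHIFT)

/* Round a stack pointer down to the start of its THREAD_SIZE-aligned
 * stack, where thread_info is assumed to live. */
static uintptr_t thread_info_base(uintptr_t sp)
{
    return sp & ~(uintptr_t)(THREAD_SIZE - 1);
}
```

In the kernel, the stack pointer would come from a variable declared with the asm("r1") register binding the commit mentions, leaving only the mask to compute.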

    Built and booted on G4 Powerbook (ARCH=ppc and ARCH=powerpc) and
    Power5 (ARCH=ppc64 and ARCH=powerpc).

    Signed-off-by: David Gibson
    Signed-off-by: Paul Mackerras

    David Gibson