01 Jul, 2011

1 commit

  • The nmi parameter indicated whether we could do wakeups from the
    current context; if not, we would set some state, self-IPI, and let
    the resulting interrupt do the wakeup.

    For the various event classes:

    - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
    - tracepoint: nmi=0; since a tracepoint could fire from NMI context.
    - software: nmi=[0,1]; some, like the schedule thing, cannot
    perform wakeups, and hence need 0.

    As one can see, there is very little nmi=1 usage, and the down-side of
    not using it is that on some platforms some software events can have a
    jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

    The up-side however is that we can remove the nmi parameter and save a
    bunch of conditionals in fast paths.

    Signed-off-by: Peter Zijlstra
    Cc: Michael Cree
    Cc: Will Deacon
    Cc: Deng-Cheng Zhu
    Cc: Anton Blanchard
    Cc: Eric B Munson
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Cc: David S. Miller
    Cc: Frederic Weisbecker
    Cc: Jason Wessel
    Cc: Don Zickus
    Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Oct, 2010

1 commit


26 May, 2010

1 commit

  • This reverts commit b3b77c8caef1750ebeea1054e39e358550ea9f55, which was
    also totally broken (see commit 0d2daf5cc858 that reverted the crc32
    version of it). As reported by Stephen Rothwell, it causes problems on
    big-endian machines:

    > In file included from fs/jfs/jfs_types.h:33,
    > from fs/jfs/jfs_incore.h:26,
    > from fs/jfs/file.c:22:
    > fs/jfs/endian24.h:36:101: warning: "__LITTLE_ENDIAN" is not defined

    The kernel has never had that crazy "__BYTE_ORDER == __LITTLE_ENDIAN"
    model. It's not how we do things, and it isn't how we _should_ do
    things. So don't go there.
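
    A minimal sketch of why the quoted warning appears, assuming the
    offending header tested "__BYTE_ORDER == __LITTLE_ENDIAN" on a
    big-endian build where only the big-endian macro exists. The DEMO_*
    macros and function names are hypothetical stand-ins, not the kernel's
    actual definitions:

    ```c
    #include <assert.h>

    /* Hypothetical stand-in for a big-endian kernel header:
     * only the BIG macro is defined, the LITTLE one is absent. */
    #define DEMO_BIG_ENDIAN 4321
    #define DEMO_BYTE_ORDER DEMO_BIG_ENDIAN

    /* Kernel-style test: checks only whether the macro exists. */
    static int kernel_style_is_little(void)
    {
    #ifdef DEMO_LITTLE_ENDIAN
        return 1;
    #else
        return 0;
    #endif
    }

    /* Userspace-style test: compares values. DEMO_LITTLE_ENDIAN is
     * undefined here, so the preprocessor silently treats it as 0
     * (and warns when -Wundef is enabled). */
    static int userspace_style_is_little(void)
    {
    #if DEMO_BYTE_ORDER == DEMO_LITTLE_ENDIAN
        return 1;
    #else
        return 0;
    #endif
    }

    /* Both styles must agree that this simulated build is big-endian. */
    static void demo_self_check(void)
    {
        assert(kernel_style_is_little() == 0);
        assert(userspace_style_is_little() == 0);
    }
    ```

    Both tests give the same answer here, but the value comparison only
    works by accident of the undefined macro evaluating to 0, which is
    what the compiler complains about.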

    Requested-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

25 May, 2010

1 commit

  • Linux does not define __BYTE_ORDER in its endian header files, which makes
    some header files bend over backwards to get at the current endianness.
    Let's #define __BYTE_ORDER in big_endian.h/little_endian.h to make it
    easier for header files that are used in user space too.

    In userspace the convention is that:

    1. _both_ __LITTLE_ENDIAN and __BIG_ENDIAN are defined,
    2. you have to test for e.g. __BYTE_ORDER == __BIG_ENDIAN.
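
    The userspace convention can be sketched as follows, assuming a
    glibc-style <endian.h> that defines __BYTE_ORDER, __LITTLE_ENDIAN and
    __BIG_ENDIAN. The function names are hypothetical and only serve the
    illustration:

    ```c
    #include <assert.h>
    #include <endian.h>   /* assumed: glibc-style header defining __BYTE_ORDER */
    #include <stdint.h>
    #include <string.h>

    /* Compile-time answer: both macros are defined, so compare values. */
    static int compiled_as_big_endian(void)
    {
    #if __BYTE_ORDER == __BIG_ENDIAN
        return 1;
    #elif __BYTE_ORDER == __LITTLE_ENDIAN
        return 0;
    #else
    #error "unknown byte order"
    #endif
    }

    /* Run-time probe: inspect the first byte in memory of a known value. */
    static int runs_as_big_endian(void)
    {
        const uint32_t probe = 0x01020304u;
        unsigned char first;

        memcpy(&first, &probe, 1);
        return first == 0x01;
    }

    /* The preprocessor answer should match what the hardware does. */
    static void check_byte_order_agrees(void)
    {
        assert(compiled_as_big_endian() == runs_as_big_endian());
    }
    ```

    Calling check_byte_order_agrees() from a test program confirms that
    the macro comparison matches the machine's actual byte order.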

    Signed-off-by: Joakim Tjernlund
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joakim Tjernlund
     

13 Jan, 2010

1 commit

  • This follows the x86 xstate changes and implements a task_xstate slab
    cache that is dynamically sized to match one of hard FP/soft FP/FPU-less.

    This also tidies up and consolidates some of the SH-2A/SH-4 FPU
    fragmentation. Now fpu state restorers are commonly defined, with the
    init_fpu()/fpu_init() mess reworked to follow the x86 convention.
    The fpu_init() register initialization has been replaced by xstate setup
    followed by writing out to hardware via the standard restore path.

    As init_fpu() now performs a slab allocation, a secondary, lighter-weight
    restorer is also introduced for the context switch.

    In the future the DSP state will be rolled in here, too.

    More work remains for math emulation and the SH-5 FPU, which presently
    uses its own special (UP-only) interfaces.

    Signed-off-by: Paul Mundt

    Paul Mundt
     

24 Nov, 2009

1 commit

  • A number of small optimisations to FPU handling, in particular:

    - move the task USEDFPU flag from the thread_info flags field (which
    is accessed asynchronously to the thread) to a new status field,
    which is only accessed by the thread itself. This allows locking to
    be removed in most cases, or can be reduced to a preempt_lock().
    This mimics the i386 behaviour.

    - move the modification of regs->sr and thread_info->status flags out
    of save_fpu() to __unlazy_fpu(). This gives the compiler a better
    chance to optimise things, as well as making save_fpu() symmetrical
    with restore_fpu() and init_fpu().

    - implement prepare_to_copy(), so that when creating a thread, we can
    unlazy the FPU prior to copying the thread data structures.

    Also make sure that the FPU is disabled while in the kernel, in
    particular while booting, and for newly created kernel threads.

    In a very artificial benchmark, the execution time for 2500000
    context switches was reduced from 50 to 45 seconds.

    Signed-off-by: Stuart Menefy
    Signed-off-by: Paul Mundt

    Stuart Menefy
     

11 Jun, 2007

1 commit


21 May, 2007

1 commit


03 Oct, 2006

1 commit


27 Sep, 2006

1 commit