17 Feb, 2010

1 commit

  • Both the store queue API and the PMB remapping take unsigned long for
    their pgprot flags, which cuts off the extended protection bits. In the
    case of the PMB this isn't really a problem since the cache attribute
    bits that we care about are all in the lower 32 bits, but we do it just
    to be safe. The store queue remapping on the other hand depends on the
    extended prot bits for enabling userspace access to the mappings.

    Signed-off-by: Paul Mundt

    Paul Mundt
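
The truncation described in this commit can be illustrated in isolation. The following is a minimal sketch, not the kernel's code: `prot64_t`, the `DEMO_PROT_*` flag positions and both helper functions are hypothetical names chosen for illustration, and a `uint32_t` cast stands in for a 32-bit `unsigned long` parameter.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit protection value (not the kernel's pgprot_t). */
typedef uint64_t prot64_t;

/* Hypothetical flag positions, chosen only to land in different halves. */
#define DEMO_PROT_CACHEABLE (UINT64_C(1) << 3)   /* lives in the lower 32 bits */
#define DEMO_PROT_EXT_USER  (UINT64_C(1) << 35)  /* lives in the upper 32 bits */

/* Models an interface that takes a 32-bit unsigned long: the upper half is lost. */
static inline uint32_t demo_pass_as_32bit_long(prot64_t prot)
{
    return (uint32_t)prot;
}

/* Models an interface with a full-width parameter: nothing is lost. */
static inline prot64_t demo_pass_full_width(prot64_t prot)
{
    return prot;
}
```

Passing `DEMO_PROT_CACHEABLE | DEMO_PROT_EXT_USER` through the 32-bit variant keeps only the cacheable bit, which mirrors why the cache attributes survived while the extended bits needed for userspace access did not.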
     

26 Jan, 2010

1 commit

  • The old ctrl in/out routines are non-portable and unsuitable for
    cross-platform use. While drivers/sh has already been sanitized, there
    is still quite a lot of code that is not. This converts the arch/sh/ bits
    over, which permits us to flag the routines as deprecated whilst still
    building with -Werror for the architecture code, and to ensure that
    future users are not added.

    Signed-off-by: Paul Mundt

    Paul Mundt
     

20 Jan, 2010

1 commit


13 Jan, 2010

3 commits

  • Valid sizes include 256kB, not 258kB.

    Signed-off-by: Paul Mundt

    Paul Mundt
     
  • The mass-produced cuts use an updated PVR value; add them to the list.

    Signed-off-by: Matt Fleming
    Signed-off-by: Paul Mundt

    Matt Fleming
     
  • This follows the x86 xstate changes and implements a task_xstate slab
    cache that is dynamically sized to match one of hard FP/soft FP/FPU-less.

    This also tidies up and consolidates some of the SH-2A/SH-4 FPU
    fragmentation. Now fpu state restorers are commonly defined, with the
    init_fpu()/fpu_init() mess reworked to follow the x86 convention.
    The fpu_init() register initialization has been replaced by xstate setup
    followed by writing out to hardware via the standard restore path.

    As init_fpu() now performs a slab allocation, a secondary, lighter-weight
    restorer is also introduced for the context switch.

    In the future the DSP state will be rolled in here, too.

    More work remains for math emulation and the SH-5 FPU, which presently
    uses its own special (UP-only) interfaces.

    Signed-off-by: Paul Mundt

    Paul Mundt
     

15 Dec, 2009

1 commit

  • This patch breaks out the sh4 scif serial port platform
    data from a shared platform device to one platform
    device per port. Also, add serial ports to the list of
    early platform devices.

    While at it, get rid of the R2D ifdef in the processor
    code and adjust the defconfigs to use ttySC1.

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
     

24 Nov, 2009

3 commits

  • A number of small optimisations to FPU handling, in particular:

    - move the task USEDFPU flag from the thread_info flags field (which
    is accessed asynchronously to the thread) to a new status field,
    which is only accessed by the thread itself. This allows locking to
    be removed in most cases, or can be reduced to a preempt_lock().
    This mimics the i386 behaviour.

    - move the modification of regs->sr and thread_info->status flags out
    of save_fpu() to __unlazy_fpu(). This gives the compiler a better
    chance to optimise things, as well as making save_fpu() symmetrical
    with restore_fpu() and init_fpu().

    - implement prepare_to_copy(), so that when creating a thread, we can
    unlazy the FPU prior to copying the thread data structures.

    Also make sure that the FPU is disabled while in the kernel, in
    particular while booting, and for newly created kernel threads.

    In a very artificial benchmark, the execution time for 2500000
    context switches was reduced from 50 to 45 seconds.

    Signed-off-by: Stuart Menefy
    Signed-off-by: Paul Mundt

    Stuart Menefy
     
  • Paul Mundt
     
  • SH port of the sLeAZY-fpu feature currently implemented for some architectures
    such as i386.

    Right now the SH kernel has a 100% lazy fpu behaviour.
    This is of course great for applications that have very sporadic or no FPU use.
    However, for very frequent FPU users, you take an extra trap on every
    context switch.
    The patch below adds a simple heuristic to this code: after 5 consecutive
    context switches of FPU use, the lazy behavior is disabled and the context
    gets restored every context switch.
    After 256 switches, this is reset and the 100% lazy behavior is returned.

    Tests with LMbench showed no regression.
    I saw a little improvement due to the prefetching (~2%).

    The tests below also show that, with this sLeAZY patch, the number of
    FPU exceptions is indeed reduced.
    To test this, I hacked the LMbench lat_ctx test to use the FPU a little more.

    sLeAZY implementation
    ===========================================
    switch_to calls            | 79326
    sleasy calls               | 42577
    do_fpu_state_restore calls | 59232
    restore_fpu calls          | 59032

    Exceptions: 0x800 (FPU disabled): 16604

    100% lazy (default implementation)
    ===========================================
    switch_to calls            | 79690
    do_fpu_state_restore calls | 53299
    restore_fpu calls          | 53101

    Exceptions: 0x800 (FPU disabled): 53273

    Signed-off-by: Giuseppe Cavallaro
    Signed-off-by: Stuart Menefy
    Signed-off-by: Paul Mundt

    Giuseppe CAVALLARO
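
The heuristic described in this commit can be sketched as a small per-task counter. This is a sketch under stated assumptions, not the patch itself: `struct demo_task`, `demo_switch_out()` and `demo_should_preload()` are hypothetical names, the 8-bit counter is assumed as the mechanism behind the reset after 256 switches, and resetting the counter when a time slice did not touch the FPU is assumed from the word "consecutive".

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold taken from the commit text: 5 consecutive switches with FPU use. */
#define DEMO_EAGER_THRESHOLD 5

/* Hypothetical per-task state; an 8-bit counter wraps to 0 after 256 increments. */
struct demo_task {
    uint8_t fpu_counter;
};

/* Called when switching away from a task; used_fpu says whether the task
 * touched the FPU during the time slice that just ended. */
static inline void demo_switch_out(struct demo_task *t, bool used_fpu)
{
    if (used_fpu)
        t->fpu_counter = (uint8_t)(t->fpu_counter + 1); /* 255 wraps to 0 */
    else
        t->fpu_counter = 0; /* streak broken: back to fully lazy (assumption) */
}

/* Called when switching to a task. True means restore the FPU state right
 * away; false means leave the FPU disabled and wait for the trap. */
static inline bool demo_should_preload(const struct demo_task *t)
{
    return t->fpu_counter >= DEMO_EAGER_THRESHOLD;
}
```

A task that uses the FPU in every slice is switched to eager restore after its fifth slice, stays there until the counter wraps, and then pays one lazy trap before the streak starts building again.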
     

12 Nov, 2009

1 commit


05 Nov, 2009

1 commit


01 Sep, 2009

1 commit


27 Aug, 2009

1 commit


21 Aug, 2009

1 commit


15 Aug, 2009

2 commits

  • This is superfluous, as the default CPU type and family are already
    established by the initial cpuinfo definition. Given that we are still
    able to probe for the CPU family even if we are not able to detect the
    subtype, it's preferable to let the probing code fill out what it can and
    leave the rest.

    Signed-off-by: Paul Mundt

    Paul Mundt
     
  • This adds a family member to struct sh_cpuinfo, which allows us to fall
    back more on the probe routines to work out what sort of subtype we are
    running on. This will be used by the CPU cache initialization code in
    order to first do family-level initialization, followed by subtype-level
    optimizations.

    Signed-off-by: Paul Mundt

    Paul Mundt
     

23 Jul, 2009

1 commit

  • Convert the processor platform device setup
    functions from __initcall() and sometimes
    device_initcall() to arch_initcall().

    This makes sure that the platform devices are
    registered a bit earlier so the devices are
    available when drivers register using initcall
    levels earlier than device_initcall().

    A good example is platform devices needed by
    i2c-sh_mobile.c which registers a bit earlier
    using subsys_initcall().

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
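
The ordering problem this commit addresses can be modelled in userspace. The sketch below is only a simulation: the `demo_*` names are hypothetical, and the numeric values follow the kernel's initcall level numbering (arch = 3, subsys = 4, device = 6), with lower levels running first.

```c
#include <stddef.h>
#include <stdlib.h>

/* Initcall levels relevant to this commit; lower numbers run earlier. */
enum demo_level { DEMO_ARCH = 3, DEMO_SUBSYS = 4, DEMO_DEVICE = 6 };

/* Hypothetical record of one registered init function. */
struct demo_initcall {
    enum demo_level level;
    const char *name;
};

/* Comparator that orders initcalls by ascending level. */
static int demo_cmp(const void *a, const void *b)
{
    const struct demo_initcall *x = a;
    const struct demo_initcall *y = b;
    return (int)x->level - (int)y->level;
}

/* Sorts the array into the order in which the calls would run at boot. */
static inline void demo_run_order(struct demo_initcall *calls, size_t n)
{
    qsort(calls, n, sizeof(*calls), demo_cmp);
}
```

With the platform devices registered at `DEMO_DEVICE`, a driver at `DEMO_SUBSYS` sorts first and finds no device; moving the registration to `DEMO_ARCH` puts the devices ahead of it.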
     

01 Jun, 2009

3 commits


13 May, 2009

2 commits


12 May, 2009

7 commits


11 May, 2009

3 commits


16 Apr, 2009

1 commit


02 Apr, 2009

1 commit

  • Forcing direct-mapped worked on certain older 2-way set associative
    parts, but was always error prone on 4-way parts. As these are the
    norm these days, there is not much point in continuing to support this
    mode. Most of the folks that used direct-mapped mode generally just
    wanted writethrough caching in the first place.

    Signed-off-by: Paul Mundt

    Paul Mundt
     

17 Mar, 2009

1 commit

  • This adds support for extended ASIDs (up to 16 bits) on newer SH-X3 cores
    that implement the PTAEX register and respective functionality. Presently
    this is only the 65nm SH7786 (the 90nm part only supports legacy 8-bit ASIDs).

    The main change is in how the PTE is written out when loading the entry
    into the TLB, as well as in how the TLB entry is selectively flushed.

    While SH-X2 extended mode splits out the memory-mapped U and I-TLB data
    arrays for extra bits, extended ASID mode splits out the address arrays.
    While we don't use the memory-mapped data array access, the address
    array accesses are necessary for selective TLB flushes, so these are
    implemented newly and replace the generic SH-4 implementation.

    With this, TLB flushes in switch_mm() are almost non-existent on newer
    parts.

    Signed-off-by: Paul Mundt

    Paul Mundt
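
To see why a wider ASID makes flushes in switch_mm() rare, consider a generic allocator that hands out ASIDs in sequence and performs a full TLB flush whenever the ASID space is exhausted. This is an illustrative model only: `demo_flushes_for_contexts()` is a hypothetical function, reserving ASID 0 is an assumption, and the real allocation policy may differ.

```c
#include <stdint.h>

/* Counts the full TLB flushes needed to hand out new_contexts fresh ASIDs
 * when the hardware ASID field is asid_bits wide (asid_bits < 64).
 * ASID 0 is treated as reserved, which is an assumption of this sketch. */
static inline uint64_t demo_flushes_for_contexts(unsigned asid_bits,
                                                 uint64_t new_contexts)
{
    uint64_t asid_mask = (UINT64_C(1) << asid_bits) - 1;
    uint64_t next = 1;      /* next ASID to hand out */
    uint64_t flushes = 0;

    for (uint64_t i = 0; i < new_contexts; i++) {
        if (next > asid_mask) {
            /* ASID space exhausted: flush everything and start over. */
            flushes++;
            next = 1;
        }
        next++;
    }
    return flushes;
}
```

In this model, 1000 new contexts cost three full flushes with 8-bit ASIDs and none with 16-bit ASIDs, since the wider field only runs out after 65535 allocations.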
     

10 Mar, 2009

1 commit

  • Add Suspend-to-disk / swsusp / CONFIG_HIBERNATION support
    to the SuperH architecture.

    To suspend, use "swapon /dev/sda2; echo disk > /sys/power/state"
    To resume, pass "resume=/dev/sda2" on the kernel command line.

    The patch "pm: rework includes, remove arch ifdefs V2" is
    needed to allow the generic swsusp code to build properly.

    Hibernation is not enabled by this patch, though; a patch
    setting ARCH_HIBERNATION_POSSIBLE will be submitted later.

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
     

03 Mar, 2009

1 commit

  • This adds preliminary support for the SH7786 CPU subtype.

    While this is a dual-core CPU, only UP is supported for now. L2 cache
    support is likewise not yet implemented.

    More information on this particular CPU subtype is available at:

    http://www.renesas.com/fmwk.jsp?cnt=sh7786_root.jsp&fp=/products/mpumcu/superh_family/sh7780_series/sh7786_group/

    Signed-off-by: Kuninori Morimoto
    Signed-off-by: Paul Mundt

    Kuninori Morimoto
     

27 Feb, 2009

1 commit

  • Update intc tables and platform data to use one linux irq
    per maskable interrupt source instead of keeping the one-to-one
    mapping between vectors and linux irqs.

    This fixes potential irq masking issues for sh775x hardware
    blocks such as SCI/SCIF/RTC/DMAC/TMU2/REF.

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
     

29 Jan, 2009

1 commit

  • This fixes a bug in the FPU exception handler for the FCNVDS instruction.
    To get the register number the instruction is shifted right by 9,
    though it should be shifted right by 8.

    More information at ST Linux bugzilla:

    https://bugzilla.stlinux.com/show_bug.cgi?id=4892

    Signed-off-by: Giuseppe Di Giore
    Signed-off-by: Carmelo Amoroso
    Signed-off-by: Stuart Menefy
    Signed-off-by: Paul Mundt

    Carmelo AMOROSO
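
The effect of the shift amount can be checked on a concrete opcode. The sketch below assumes the SH-4 encoding of FCNVDS DRm,FPUL as 1111 mmm0 1011 1101; the `demo_*` helpers are hypothetical and are not the kernel's handler. It shows one reading of the fix: bits 11..8 give the index of the single-precision register that starts the pair, while shifting by 9 yields the pair number, which is half that index.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: FCNVDS DRm,FPUL = 1111 mmm0 1011 1101. */
#define DEMO_FCNVDS_MASK  0xf1ffu
#define DEMO_FCNVDS_MATCH 0xf0bdu

/* True if the 16-bit opcode matches the assumed FCNVDS pattern. */
static inline bool demo_is_fcnvds(uint16_t insn)
{
    return (insn & DEMO_FCNVDS_MASK) == DEMO_FCNVDS_MATCH;
}

/* Shift by 8: index into a flat array of single-precision registers
 * (FR0..FR15). For a DR operand this is always even. */
static inline unsigned demo_fr_index(uint16_t insn)
{
    return (insn >> 8) & 0xfu;
}

/* Shift by 9: the double-precision pair number (0..7), not an FR index. */
static inline unsigned demo_dr_pair_number(uint16_t insn)
{
    return (insn >> 9) & 0x7u;
}
```

For opcode 0xf2bd (FCNVDS DR2,FPUL under the assumed encoding), the shift by 8 yields index 2 and the shift by 9 yields 1, so indexing a flat register array with the latter would read the wrong register.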