14 Dec, 2010

1 commit

  • Alignment of alloc_bootmem() depends on the value of
    L1_CACHE_SHIFT. What we need here, however, is 64 byte alignment. Use
    alloc_bootmem_align() and explicitly specify the alignment instead.

    This fixes a kernel boot crash reported by Jody when the cpu in .config
    is set to MPENTIUMII but the kernel is booted on an xsave-capable CPU.

    Reported-by: Jody Bruchon
    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin
    Cc:

    Suresh Siddha
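
The alignment requirement described above can be illustrated in user space. This is only a sketch: alloc_xsave_area() and is_xsave_aligned() are hypothetical helpers built on C11 aligned_alloc(), not the kernel's alloc_bootmem_align().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define XSAVE_ALIGN 64u	/* the 64 byte alignment named in the commit text */

/* Hypothetical helper: allocate at least `size` bytes on a 64 byte boundary. */
void *alloc_xsave_area(size_t size)
{
	/* aligned_alloc() expects the size to be a multiple of the alignment. */
	size_t rounded = (size + XSAVE_ALIGN - 1) & ~(size_t)(XSAVE_ALIGN - 1);

	return aligned_alloc(XSAVE_ALIGN, rounded);
}

/* Returns 1 when the pointer satisfies the explicit alignment. */
int is_xsave_aligned(const void *p)
{
	return ((uintptr_t)p & (XSAVE_ALIGN - 1)) == 0;
}
```

The point of the sketch is that the alignment is stated explicitly at the call site instead of being inherited from a build-time constant such as L1_CACHE_SHIFT.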
     

07 Aug, 2010

2 commits

  • …git/tip/linux-2.6-tip

    * 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, xsave: Make xstate_enable_boot_cpu() __init, protect on CPU 0
    x86, xsave: Add __init attribute to setup_xstate_features()
    x86, xsave: Make init_xstate_buf static
    x86, xsave: Check cpuid level for XSTATE_CPUID (0x0d)
    x86, xsave: Introduce xstate enable functions
    x86, xsave: Separate fpu and xsave initialization
    x86, xsave: Move boot cpu initialization to xsave_init()
    x86, xsave: 32/64 bit boot cpu check unification in initialization
    x86, xsave: Do not include asm/i387.h in asm/xsave.h
    x86, xsave: Use xsaveopt in context-switch path when supported
    x86, xsave: Sync xsave memory layout with its header for user handling
    x86, xsave: Track the offset, size of state in the xsave layout

    Linus Torvalds
     
  • …inus', 'x86-apic-for-linus', 'x86-fpu-for-linus' and 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Clean up arch/x86/kernel/cpu/mtrr/cleanup.c: use ";" not "," to terminate statements

    * 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, vmware: Preset lpj values when on VMware.

    * 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, mtrr: Use stop machine context to rendezvous all the cpu's

    * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86/apic/es7000_32: Remove unused variable

    * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Avoid unnecessary __clear_user() and xrstor in signal handling

    * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, vdso: Unmap vdso pages

    Linus Torvalds
     

22 Jul, 2010

6 commits

  • xstate_enable_boot_cpu() is, as the name implies, only used on the
    boot CPU; furthermore, it invokes alloc_bootmem(), which is __init;
    hence it needs to be tagged __init rather than __cpuinit.

    Furthermore, it is *not* safe in the long run to rely on CPU 0 only
    coming online during the early boot -- at some point we're going to
    support offlining (and re-onlining) the boot CPU, and at that point we
    must not call xstate_enable_boot_cpu() again.

    The code is a fair bit more obscure than one would like, because the
    __ref overrides aren't quite powerful enough.

    Signed-off-by: H. Peter Anvin
    Acked-by: Suresh Siddha
    Cc: Robert Richter
    LKML-Reference:

    H. Peter Anvin
     
  • This is called only from initialization code.

    Signed-off-by: Robert Richter
    LKML-Reference:
    Acked-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Robert Richter
     
  • The pointer is only used in xsave.c. Make it static.

    Signed-off-by: Robert Richter
    LKML-Reference:
    Acked-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Robert Richter
     
  • The patch introduces the XSTATE_CPUID macro and adds a check that
    tests if XSTATE_CPUID exists.

    Signed-off-by: Robert Richter
    LKML-Reference:
    Acked-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Robert Richter
     
  • The patch renames xsave_cntxt_init() and __xsave_init() to
    xstate_enable_boot_cpu() and xstate_enable(), as these names are more
    meaningful.

    It also removes the duplicate xcr setup for the boot cpu.

    Signed-off-by: Robert Richter
    LKML-Reference:
    Acked-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Robert Richter
     
  • As xsave also supports features other than the fpu, it should be
    initialized independently of the fpu. This patch moves xsave
    initialization out of fpu initialization.

    There is also a lot of cross referencing between fpu and xsave
    code. This patch reduces this by making xsave_cntxt_init() and
    init_thread_xstate() static functions.

    The patch moves the cpu_has_xsave check to the beginning of
    xsave_init(). All other checks may then be removed.

    Signed-off-by: Robert Richter
    LKML-Reference:
    Acked-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Robert Richter
     

21 Jul, 2010

1 commit


20 Jul, 2010

2 commits

  • With xsaveopt, if a processor implementation discerns that a processor state
    component is in its initialized state, it may set the corresponding bit in
    xsave_hdr.xstate_bv to '0' without modifying the corresponding memory
    layout. Hence, while presenting the xstate information to the user, we always
    ensure that the memory layout of a feature is in the init state if the
    corresponding header bit is zero. This ensures consistency and keeps the
    user from seeing stale state in the memory layout during signal handling,
    debugging, etc.

    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
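
The sanitizing step described above can be sketched on a toy buffer. Everything here is an assumption made for illustration: struct feature_layout, sanitize_xstate() and the idea of a separate init_buf are hypothetical and do not reproduce the kernel's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-feature layout entry: where a feature's state lives. */
struct feature_layout {
	uint32_t offset;
	uint32_t size;
};

/*
 * For every feature whose bit is clear in xstate_bv, overwrite its area in
 * buf with the init state, so that a zero header bit and the memory
 * contents agree. Features whose bit is set are left untouched.
 */
void sanitize_xstate(uint8_t *buf, const uint8_t *init_buf, uint64_t xstate_bv,
		     const struct feature_layout *layout, int nr_features)
{
	for (int i = 0; i < nr_features; i++) {
		if (xstate_bv & (1ull << i))
			continue;
		memcpy(buf + layout[i].offset, init_buf + layout[i].offset,
		       layout[i].size);
	}
}
```

With this shape, a reader of the buffer never needs to consult xstate_bv to decide whether the bytes of a feature are meaningful.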
     
  • Subleaves of the cpuid vector 0xd provide the offset and size of the different
    feature states that are managed by xsave/xrstor. Track these for upcoming
    use during signal handling.

    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
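
A minimal sketch of the bookkeeping described above, assuming a cpuid accessor is passed in as a function pointer so the logic can be exercised without the instruction itself. struct xstate_layout, record_xstate_layout() and fake_cpuid_count() are hypothetical names. For subleaves 2 and up, cpuid 0xd reports the size in EAX and the offset in EBX.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XSTATE_CPUID	0x0000000d
#define MAX_XFEATURES	64

struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical accessor type: returns the registers for (leaf, subleaf). */
typedef struct cpuid_regs (*cpuid_count_fn)(uint32_t leaf, uint32_t subleaf);

struct xstate_layout {
	uint32_t offset[MAX_XFEATURES];
	uint32_t size[MAX_XFEATURES];
	int nr;		/* one past the highest feature recorded */
};

/* Walk subleaves 2.. and record offset/size until a subleaf reports size 0. */
void record_xstate_layout(struct xstate_layout *l, cpuid_count_fn cpuid_count)
{
	memset(l, 0, sizeof(*l));
	l->nr = 2;	/* features 0 and 1 (FP, SSE) live in the legacy area */

	for (uint32_t i = 2; i < MAX_XFEATURES; i++) {
		struct cpuid_regs r = cpuid_count(XSTATE_CPUID, i);

		if (r.eax == 0)
			break;
		l->size[i] = r.eax;
		l->offset[i] = r.ebx;
		l->nr = (int)i + 1;
	}
}

/* Stand-in for the real instruction: reports only the AVX (YMM) component. */
struct cpuid_regs fake_cpuid_count(uint32_t leaf, uint32_t subleaf)
{
	struct cpuid_regs r = { 0, 0, 0, 0 };

	if (leaf == XSTATE_CPUID && subleaf == 2) {
		r.eax = 256;	/* size of the YMM state */
		r.ebx = 576;	/* its offset in the standard xsave layout */
	}
	return r;
}
```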
     

07 Jul, 2010

1 commit

  • fxsave/xsave doesn't touch all the bytes in the memory layout used by
    these instructions. Specifically, it leaves untouched the SW reserved
    fields (bytes 464..511) in the fxsave frame and the reserved fields in
    the xsave header.

    To present a clean context for the signal handling, just clear these fields
    instead of clearing the complete fxsave/xsave memory layout, when we dump these
    registers directly to the user signal frame.

    Also avoid the call to second xrstor (which inits the state not passed
    in the signal frame) in restore_user_xstate() if all the state has already
    been restored by the first xrstor.

    These changes improve the performance of signal handling (by ~3-5% as
    measured by lat_sig).

    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
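
The selective clearing described above can be sketched on a plain byte buffer. The helper names are hypothetical, and the header geometry (a 64 byte xsave header at offset 512 whose first 8 bytes hold xstate_bv, with the rest treated as reserved) is an assumption about the layout at the time of this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FXSAVE_SIZE		512
#define SW_RESERVED_OFFSET	464	/* bytes 464..511 of the fxsave frame */
#define SW_RESERVED_SIZE	48
#define XSAVE_HDR_OFFSET	512	/* assumed: header follows the fxsave area */
#define XSAVE_HDR_SIZE		64
#define XSTATE_BV_SIZE		8

/* Clear only the SW reserved bytes instead of the whole fxsave frame. */
void clear_sw_reserved(uint8_t *frame)
{
	memset(frame + SW_RESERVED_OFFSET, 0, SW_RESERVED_SIZE);
}

/* Clear only the reserved part of the xsave header, keeping xstate_bv. */
void clear_xsave_hdr_reserved(uint8_t *xsave_area)
{
	memset(xsave_area + XSAVE_HDR_OFFSET + XSTATE_BV_SIZE, 0,
	       XSAVE_HDR_SIZE - XSTATE_BV_SIZE);
}
```

Clearing 48 + 56 bytes rather than the complete layout is what makes this cheaper than zeroing the whole frame before every save.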
     

10 Jun, 2010

1 commit

  • The places which call check_for_xstate() only care about zero or
    non-zero so this patch doesn't change how the code runs, but it's a
    cleanup. The main reason for this patch is that I'm looking for places
    which don't return -EFAULT for copy_from_user() failures.

    Signed-off-by: Dan Carpenter
    LKML-Reference:
    Signed-off-by: H. Peter Anvin
    Cc: Suresh Siddha

    Dan Carpenter
     

11 May, 2010

2 commits

  • Currently all fpu state access is through tsk->thread.xstate. Since we wish
    to generalize fpu access to non-task contexts, wrap the state in a new
    'struct fpu' and convert existing access to use an fpu API.

    Signal frame handlers are not converted to the API since they will remain
    task context only things.

    Signed-off-by: Avi Kivity
    Acked-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Avi Kivity
     
  • The fpu code currently uses current->thread_info->status & TS_XSAVE as
    a way to distinguish between XSAVE capable processors and older processors.
    The decision is not really task specific; instead we use the task status to
    avoid a global memory reference - the value should be the same across all
    threads.

    Eliminate this tie-in into the task structure by using an alternative
    instruction keyed off the XSAVE cpu feature; this results in shorter and
    faster code, without introducing a global memory reference.

    [ hpa: in the future, this probably should use an asm jmp ]

    Signed-off-by: Avi Kivity
    Acked-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Avi Kivity
     

12 Feb, 2010

1 commit

  • Add the xstate regset support which helps extend the kernel ptrace and the
    core-dump interfaces to support AVX state etc.

    This regset interface is designed to support all the future state that gets
    supported using xsave/xrstor infrastructure.

    Looking at the memory layout saved by "xsave", one can't say which state
    is represented in the memory layout. This is because if a particular state is
    in init state, it can be represented by bit '0' in the xsave hdr. Hence
    we can't tell from the xsave header whether a state is in init state or
    the state is not saved in the memory layout.

    And hence the xsave memory layout available through this regset
    interface uses SW usable bytes [464..511] to convey what state is represented
    in the memory layout.

    First 8 bytes of the sw_usable_bytes[464..471] will be set to the OS enabled
    xstate mask (which is the same as the 64bit mask returned by xgetbv for XCR0).

    The note NT_X86_XSTATE represents the extended state information in the
    core file, using the above mentioned memory layout.

    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: Hongjiu Lu
    Cc: Roland McGrath
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
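
As a rough illustration of the convention above, the sketch below stores a 64-bit mask in the first 8 SW usable bytes of a regset image and reads it back. set_regset_xstate_mask() and get_regset_xstate_mask() are hypothetical names, and the image is modelled as a raw byte buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SW_USABLE_OFFSET 464	/* SW usable bytes occupy [464..511] */

/* Store the OS enabled xstate mask in the first 8 SW usable bytes. */
void set_regset_xstate_mask(uint8_t *image, uint64_t mask)
{
	memcpy(image + SW_USABLE_OFFSET, &mask, sizeof(mask));
}

/* What a debugger or core-file reader would do to recover the mask. */
uint64_t get_regset_xstate_mask(const uint8_t *image)
{
	uint64_t mask;

	memcpy(&mask, image + SW_USABLE_OFFSET, sizeof(mask));
	return mask;
}
```

A consumer can then tell "bit clear in xstate_bv but set in the mask" (state is in init state) apart from "bit clear in the mask" (state is not part of the layout).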
     

21 Apr, 2009

1 commit

  • In the 64bit signal delivery path, clear_used_math() was happening before saving
    the current active FPU state onto the user stack for signal handling. Between
    clear_used_math() and the state store onto the user stack, we can potentially
    get a page fault for the user address and can block. In fact, while testing
    we were hitting the might_fault() in __clear_user() which can do a schedule().

    At a later point in time, we will schedule back into this process and
    resume the state save (using the "xsave/fxsave" instruction), which can lead
    to a DNA fault. And as used_math was cleared before, we will reinit the FP state
    in the DNA fault and continue. This reinit will result in losing the
    FPU state of the process.

    Move clear_used_math() to a point after the FPU state has been stored
    onto the user stack.

    This issue has been present for a long time (even before the xsave changes
    and the x86 merge). But it can easily be exposed in the 2.6.28.x and 2.6.29.x
    series because of the __clear_user() in this path, which has an explicit
    __cond_resched() leading to a context switch with CONFIG_PREEMPT_VOLUNTARY.

    [ Impact: fix FPU state corruption ]

    Signed-off-by: Suresh Siddha
    Cc: [2.6.28.x, 2.6.29.x]
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     

12 Apr, 2009

1 commit

  • Impact: save/restore Intel-AVX state properly between tasks

    Intel Advanced Vector Extensions (AVX) introduce 256-bit vector processing
    capability. More about AVX at http://software.intel.com/sites/avx

    Add OS support for YMM state management using xsave/xrstor infrastructure
    to support AVX.

    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

31 Dec, 2008

1 commit


20 Nov, 2008

1 commit


22 Oct, 2008

1 commit


12 Oct, 2008

1 commit

  • fix warning:

    arch/x86/kernel/xsave.c: In function ‘save_i387_xstate’:
    arch/x86/kernel/xsave.c:98: warning: ignoring return value of ‘__clear_user’, declared with attribute warn_unused_result

    check the return value and act on it. We should not be ignoring faults
    at this point.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

08 Oct, 2008

2 commits

  • If a processor implementation discerns that a processor state component is in
    its initialized state, it may set the corresponding bit in the
    xsave header.xstate_bv to '0'. State in the memory layout set up by 'xsave'
    will be consistent with the bit values in the header.

    During signal handling, legacy applications may change the FP/SSE bits
    in the sigcontext memory layout without touching the FP/SSE header bits
    in the xsave header. So always set FP/SSE bits in the xsave header
    while saving the sigcontext state to the user space. During signal return,
    this will enable the kernel to capture any changes to the FP/SSE bits by the
    legacy applications which don't touch xsave headers.

    xsave-aware apps can change the xstate_bv in the xsave header as well
    as any contents in the memory layout. xrstor as part of sigreturn
    will capture all the changes.

    Signed-off-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
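
The rule above amounts to a single OR on the header value that is written to the signal frame. A minimal sketch, with xstate_bv_for_sigframe() as a hypothetical name (bit 0 is the x87 FP state, bit 1 the SSE state):

```c
#include <assert.h>
#include <stdint.h>

#define XSTATE_FP	0x1ull
#define XSTATE_SSE	0x2ull
#define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)

/*
 * Header value to store in the signal frame: whatever the hardware wrote,
 * with the FP and SSE bits forced on so that legacy handlers which edit
 * the FP/SSE area without touching the header are still honoured on
 * sigreturn.
 */
uint64_t xstate_bv_for_sigframe(uint64_t hw_xstate_bv)
{
	return hw_xstate_bv | XSTATE_FPSSE;
}
```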
     
  • Actually return failure on error.

    Signed-off-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     

07 Sep, 2008

1 commit

  • WARNING: vmlinux.o(.text+0x22453): Section mismatch in reference from the function setup_xstate_init() to the function .init.text:__alloc_bootmem()
    The function setup_xstate_init() references the function __init __alloc_bootmem().
    This is often because setup_xstate_init lacks a __init annotation or the annotation of __alloc_bootmem is wrong.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Ingo Molnar

    Alexey Dobriyan
     

14 Aug, 2008

2 commits

  • All these structure sizes are runtime determined. So use a runtime
    bug check.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     
  • fxsave/xsave instructions will not touch all the bytes in the
    fxsave/xsave frame. Clear the user buffer before doing fxsave/xsave
    directly to user buffer during the sigcontext setup.

    This is essentially needed in the context of xsave (for example,
    some of the fields in the xsave header are not touched by xsave
    and are defined as must-be-zero).

    This will also present a uniform and clean context to the user (from
    which the user can safely do fxrstor/xrstor).

    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

31 Jul, 2008

5 commits

  • The XSAVE feature mask is a 64-bit number; keep it that way, in order
    to avoid the mistake made with rdmsr/wrmsr. Use the xsetbv() function
    provided in the previous patch.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar

    H. Peter Anvin
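
The xsetbv instruction takes its 64-bit value split across EDX:EAX, so keeping the mask as a single 64-bit quantity and splitting it only at the instruction boundary avoids truncation mistakes. The sketch below shows only that split; struct xcr_halves and the two helpers are hypothetical names, not the kernel's xsetbv() wrapper.

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit halves an xsetbv-style instruction consumes. */
struct xcr_halves {
	uint32_t eax;	/* low 32 bits */
	uint32_t edx;	/* high 32 bits */
};

/* Split a 64-bit feature mask only at the last moment. */
struct xcr_halves split_xcr_value(uint64_t value)
{
	struct xcr_halves h;

	h.eax = (uint32_t)value;
	h.edx = (uint32_t)(value >> 32);
	return h;
}

/* Inverse operation, as needed after an xgetbv-style read. */
uint64_t join_xcr_value(struct xcr_halves h)
{
	return ((uint64_t)h.edx << 32) | h.eax;
}
```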
     
  • On cpus supporting xsave/xrstor, the fpstate pointer in the sigcontext will
    include the extended state information along with the fpstate information.
    Presence of extended state information is indicated by the presence
    of FP_XSTATE_MAGIC1 at fpstate.sw_reserved.magic1 and FP_XSTATE_MAGIC2
    at fpstate + (fpstate.sw_reserved.extended_size - FP_XSTATE_MAGIC2_SIZE).

    The extended feature bit mask that is saved in the memory layout is
    represented by fpstate.sw_reserved.xstate_bv.

    For RT signal frames, UC_FP_XSTATE in uc_flags also indicates the
    presence of extended state information in the sigcontext's fpstate
    pointer.

    Signed-off-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar

    Suresh Siddha
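
A user-space check for the two magic values described above might look like the following. This is a sketch under assumptions: the numeric magic values and the struct fpx_sw_bytes field order are written from memory of the kernel's sigcontext header and should be verified against it, and has_extended_state() is a hypothetical helper operating on a raw copy of the frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; verify against the kernel's sigcontext header. */
#define FP_XSTATE_MAGIC1	0x46505853U
#define FP_XSTATE_MAGIC2	0x46505845U
#define FP_XSTATE_MAGIC2_SIZE	sizeof(uint32_t)
#define SW_RESERVED_OFFSET	464	/* sw_reserved within the fpstate */

/* Assumed layout of the 48 SW reserved bytes. */
struct fpx_sw_bytes {
	uint32_t magic1;
	uint32_t extended_size;
	uint64_t xstate_bv;
	uint32_t xstate_size;
	uint32_t padding[7];
};

/* Returns 1 if both magic values are present, 0 otherwise. */
int has_extended_state(const uint8_t *fpstate, size_t buf_len)
{
	struct fpx_sw_bytes sw;
	uint32_t magic2;

	if (buf_len < SW_RESERVED_OFFSET + sizeof(sw))
		return 0;
	memcpy(&sw, fpstate + SW_RESERVED_OFFSET, sizeof(sw));
	if (sw.magic1 != FP_XSTATE_MAGIC1)
		return 0;
	/* extended_size must cover the legacy area plus MAGIC2 and fit. */
	if (sw.extended_size < 512 + FP_XSTATE_MAGIC2_SIZE ||
	    sw.extended_size > buf_len)
		return 0;
	memcpy(&magic2, fpstate + sw.extended_size - FP_XSTATE_MAGIC2_SIZE,
	       sizeof(magic2));
	return magic2 == FP_XSTATE_MAGIC2;
}
```

Checking MAGIC2 at the end of the extended area guards against a frame that a legacy application copied or truncated to the old 512 byte size.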
     
  • Move the 64bit routines that save/restore fpstate to/from the user stack from
    signal_64.c to xsave.c.

    restore_i387_xstate() now handles the condition when the user passes a
    NULL fpstate.

    Other misc changes in preparation for xsave/xrstor sigcontext support.

    Signed-off-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     
  • Dynamically allocate fpstate on the stack, instead of statically allocating it
    in the current sigframe layout on the user stack. This will allow the
    fpstate structure to grow in the future to include the extended state
    information supporting xsave/xrstor.

    Signal handlers will be able to access the fpstate pointer from the
    sigcontext structure as usual, with no change. For the non-RT sigframes
    (which are supported only for 32bit apps), the current static fpstate layout
    in the sigframe will be unused (so that we don't change the extramask[]
    offset in the sigframe and thus avoid breaking apps which modify
    extramask[]).

    Signed-off-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     
  • Enable xsave/xrstor by turning on cr4.osxsave on cpus which have
    xsave support. For now, the features that the OS supports/enables are
    FP and SSE.

    Signed-off-by: Suresh Siddha
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar

    Suresh Siddha