30 Jan, 2008

7 commits


19 Dec, 2007

1 commit


18 Oct, 2007

3 commits

  • This patch removes the __STR() and STR() macros from x86_64 header files.
    They appear to be legacy and have no remaining users; even if there were
    users, they should use __stringify() instead.

    In fact, there was a third place in which this macro was defined
    (ia32_binfmt.c) and used just below its definition. In that file, the
    usage was converted to __stringify().

    [ tglx: arch/x86 adaptation ]

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Andi Kleen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Glauber de Oliveira Costa
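
    For reference, __stringify() is the kernel's two-level stringification
    idiom (the real definition lives in include/linux/stringify.h); a minimal
    standalone sketch of why the extra level of expansion matters:

    ```c
    #include <stdio.h>

    /* Minimal sketch of the kernel's __stringify() idiom. The extra level
     * of indirection forces the argument to be macro-expanded *before* it
     * is stringified, which a single #x cannot do. */
    #define __stringify_1(x...) #x
    #define __stringify(x...)   __stringify_1(x)

    #define NR_SYSCALL 273  /* hypothetical constant for illustration */

    int main(void)
    {
        /* a one-level # yields the literal macro name */
        printf("%s\n", __stringify_1(NR_SYSCALL));
        /* the two-level form expands the macro first */
        printf("%s\n", __stringify(NR_SYSCALL));
        return 0;
    }
    ```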
     
  • Create an inline function for clflush(), with the proper arguments,
    and use it instead of hard-coding the instruction.

    This also removes one instance of hard-coded wbinvd, based on a patch
    by Glauber de Oliveira Costa.

    [ tglx: arch/x86 adaptation ]

    Cc: Andi Kleen
    Cc: Glauber de Oliveira Costa
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Andi Kleen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    H. Peter Anvin
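
    A userspace sketch of such a wrapper, in the spirit of this commit (the
    kernel's version lives in the arch headers; the argument type and exact
    asm constraints here are assumptions, not the kernel's code):

    ```c
    #include <stdio.h>

    /* Sketch of an inline clflush() wrapper (illustrative, not the
     * kernel's exact definition). The "+m" constraint tells the compiler
     * the pointed-to memory is both read and written, so the flush is
     * neither reordered across accesses to *p nor optimized away. */
    static inline void clflush(volatile void *p)
    {
        asm volatile("clflush %0" : "+m" (*(volatile char *)p));
    }

    int main(void)
    {
        int x = 42;
        clflush(&x);        /* evict the cache line holding x */
        printf("%d\n", x);  /* the data itself is unchanged */
        return 0;
    }
    ```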
     
  • Some gcc versions (at least 4.1.1 from RHEL5 and 4.1.2 from Gentoo) can
    generate incorrect code when read_crX() and write_crX() calls are mixed,
    because the compiler may reuse a cached result of read_crX() across an
    intervening write_crX().

    The small x86_64 test program below, compiled with -O2, demonstrates this
    (i686 behaves the same way):

    Kirill Korotaev
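
    The original reproducer is not shown here. As an illustration only,
    using an unprivileged mov in place of a control-register access, the
    fix amounts to marking the asm volatile with a "memory" clobber so gcc
    cannot common up two reads:

    ```c
    #include <stdio.h>

    /* Not the original test program; a hedged illustration of the same
     * compiler behavior. Without "volatile" and a "memory" clobber, gcc
     * may CSE two identical asm statements and reuse the first result --
     * exactly how a cached read_crX() can miss an intervening
     * write_crX(). */
    static unsigned long reg = 1;  /* stands in for a control register */

    static inline unsigned long read_reg_fixed(void)
    {
        unsigned long v;
        /* volatile: must not be CSE'd; "memory": must not be moved
         * past surrounding stores */
        asm volatile("mov %1, %0" : "=r" (v) : "m" (reg) : "memory");
        return v;
    }

    int main(void)
    {
        unsigned long a = read_reg_fixed();
        reg = 2;                         /* plays the role of write_crX() */
        unsigned long b = read_reg_fixed();
        printf("%lu %lu\n", a, b);       /* second read is not cached */
        return 0;
    }
    ```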
     

13 Oct, 2007

2 commits

  • According to latest memory ordering specification documents from Intel
    and AMD, both manufacturers are committed to in-order loads from
    cacheable memory for the x86 architecture. Hence, smp_rmb() may be a
    simple barrier.

    Also according to those documents, and according to existing practice in
    Linux (eg. spin_unlock doesn't enforce ordering), stores to cacheable
    memory are visible in program order too. Special string stores are safe
    -- their constituent stores may be out of order, but they must complete
    in order WRT surrounding stores. Nontemporal stores to WB memory can go
    out of order, and so they should be fenced explicitly to make them
    appear in-order WRT other stores. Hence, smp_wmb() may be a simple
    barrier.

    http://developer.intel.com/products/processor/manuals/318147.pdf
    http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

    In userspace microbenchmarks on a core2 system, fence instructions range
    anywhere from around 15 cycles to 50, which may not be totally
    insignificant in performance critical paths (code size will go down
    too).

    However the primary motivation for this is to have the canonical barrier
    implementation for x86 architecture.

    smp_rmb() on buggy Pentium Pro CPUs remains a locked operation, which is
    apparently required.

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
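
    A simplified userspace sketch of what the change means in practice (the
    macro names match the kernel's; the bodies below are illustrations, not
    the kernel source):

    ```c
    #include <stdio.h>

    /* Illustrative definitions. Since x86 keeps cacheable loads and
     * cacheable stores in program order, smp_rmb()/smp_wmb() only need
     * to stop the *compiler* from reordering -- no lfence/sfence. */
    #define barrier()  asm volatile("" ::: "memory")
    #define smp_rmb()  barrier()   /* loads already ordered on x86 */
    #define smp_wmb()  barrier()   /* cacheable stores already ordered */

    static int data, flag;

    int main(void)
    {
        /* classic publish pattern: write the data, then set the flag */
        data = 123;
        smp_wmb();          /* compiler may not sink the data store */
        flag = 1;

        if (flag) {
            smp_rmb();      /* compiler may not hoist the data load */
            printf("%d\n", data);
        }
        return 0;
    }
    ```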
     
  • wmb() on x86 must always include a barrier, because stores can go out of
    order in many cases when dealing with devices (eg. WC memory).

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
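
    By contrast, a plain wmb() keeps a real fence instruction; a minimal
    sketch (again illustrative, not the kernel's exact definition):

    ```c
    #include <stdio.h>

    /* wmb() must stay a real fence: write-combining and device stores
     * can leave the CPU out of program order, and only sfence (or a
     * stronger serializing operation) orders them. */
    #define wmb() asm volatile("sfence" ::: "memory")

    int main(void)
    {
        static int mmio_stub;  /* stands in for a device register */
        mmio_stub = 1;         /* payload write */
        wmb();                 /* all prior stores become visible first */
        mmio_stub = 2;         /* "doorbell" write after the payload */
        printf("%d\n", mmio_stub);
        return 0;
    }
    ```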
     

11 Oct, 2007

1 commit