25 May, 2013

1 commit

  • gdbserver inserting a breakpoint ends up calling copy_user_page() for a
    code page. The generic version of which (non-aliasing config) didn't set
    the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
    corresponding dynamic loader code page - causing garbage to be executed.

    So now the aliasing versions of copy_user_highpage()/clear_page() are made
    the default. There is no significant overhead since all of the special
    alias handling code is compiled out for non-aliasing builds.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     

23 May, 2013

3 commits

  • The VM_EXEC check in update_mmu_cache() was getting optimized away
    because of a stupid error in definition of macro addr_not_cache_congruent()

    The intention was to have the equivalent of following:

    if (a || (1 ? b : 0))

    but we ended up with following:

    if (a || 1 ? b : 0)

    And because the precedence of '||' is higher than that of '?:', gcc was
    optimizing away the evaluation of a.

    Nasty Repercussions:
    1. For non-aliasing configs it would mean some extraneous dcache flushes
    for non-code pages if U/K mappings were not congruent.
    2. For aliasing config, some needed dcache flush for code pages might
    be missed if U/K mappings were congruent.
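
    The precedence problem described above can be reproduced in isolation.
    The following is a minimal sketch, not the kernel macro itself: the
    function names are hypothetical and a/b are arbitrary truth values
    standing in for the real sub-expressions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: 'a' and 'b' are plain truth values, not the
 * actual expressions used inside addr_not_cache_congruent(). */

/* Intended form: a || (1 ? b : 0), which reduces to a || b. */
static bool cond_intended(bool a, bool b)
{
    return a || (1 ? b : 0);
}

/* Form that was actually written: it parses as (a || 1) ? b : 0,
 * which reduces to just b, so 'a' no longer affects the result. */
static bool cond_as_written(bool a, bool b)
{
    return a || 1 ? b : 0;
}
```

    The two forms differ only when a is true and b is false: the intended
    form yields true, the form as written yields false.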

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • This manifested as grep failing pseudo-randomly:

    -------------->8---------------------
    [ARCLinux]$ ip address show lo | grep inet
    [ARCLinux]$ ip address show lo | grep inet
    [ARCLinux]$ ip address show lo | grep inet
    [ARCLinux]$
    [ARCLinux]$ ip address show lo | grep inet
    inet 127.0.0.1/8 scope host lo
    -------------->8---------------------

    ARC700 MMU provides fully orthogonal permission bits per page:
    Ur, Uw, Ux, Kr, Kw, Kx

    The user mode page permission templates used to have all Kernel mode
    access bits enabled.
    This caused a tricky race condition observed with uClibc buffered file
    read and UNIX pipes.

    1. Read access to an anon mapped page in libc .bss: write-protected
    zero_page mapped: TLB Entry installed with Ur + K[rwx]

    2. grep calls libc:getc() -> buffered read layer calls read(2) with the
    internal read buffer in same .bss page.
    The read() call is on STDIN which has been redirected to a pipe.
    read(2) => sys_read() => pipe_read() => copy_to_user()

    3. Since page has Kernel-write permission (despite being user-mode
    write-protected), copy_to_user() succeeds w/o taking an MMU TLB-Miss
    Exception (page-fault for ARC). core-MM is unaware that kernel
    erroneously wrote to the reserved read-only zero-page (BUG #1)

    4. Control returns to userspace which now does a write to same .bss page
    Since Linux MM is not aware that page has been modified by kernel, it
    simply reassigns a new writable zero-init page to mapping, losing the
    prior write by kernel - effectively zero'ing out the libc read buffer
    under the hood - hence grep doesn't see right data (BUG #2)

    The fix is to make all kernel-mode access permissions mirror the
    user-mode ones. Note that the kernel still has full access to pages,
    when accessed directly (w/o MMU) - this fix ensures that kernel-mode
    access in the copy_{to,from}_user() path uses the same faulting access model as for
    pure user accesses to keep MM fully aware of page state.
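
    The difference between the old and new permission templates can be
    sketched as below. This is an illustration only: the bit positions and
    all names are hypothetical and do not reproduce the real ARC700 PTE
    layout.

```c
#include <stdint.h>

/* Hypothetical bit positions for the six per-page permission bits. */
#define PERM_UR (1u << 0)
#define PERM_UW (1u << 1)
#define PERM_UX (1u << 2)
#define PERM_KR (1u << 3)
#define PERM_KW (1u << 4)
#define PERM_KX (1u << 5)

#define PERM_U_ALL (PERM_UR | PERM_UW | PERM_UX)
#define PERM_K_ALL (PERM_KR | PERM_KW | PERM_KX)

/* Old scheme: user-mode bits plus every kernel-mode bit. */
static uint32_t perms_old(uint32_t user_bits)
{
    return (user_bits & PERM_U_ALL) | PERM_K_ALL;
}

/* New scheme: kernel-mode bits mirror the user-mode ones. */
static uint32_t perms_mirrored(uint32_t user_bits)
{
    uint32_t u = user_bits & PERM_U_ALL;

    return u | (u << 3);   /* Ur->Kr, Uw->Kw, Ux->Kx in this toy layout */
}

/* Would a kernel-mode write through this entry proceed without faulting? */
static int kernel_write_allowed(uint32_t perms)
{
    return (perms & PERM_KW) != 0;
}
```

    For the write-protected zero-page entry from step 1 (Ur only), the old
    template lets the kernel write go through silently, while the mirrored
    template makes it fault so that core MM sees the write.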

    The issue is pseudo-random because it only shows up if the TLB entry
    installed in #1 is present at the time of #3. If it has been evicted, due
    to TLB pressure or some such, then copy_to_user() does take a TLB Miss
    Exception, with routine write-to-anon COW processing installing a
    fresh page for kernel writes which is also usable as-is in userspace.

    Further the issue was dormant for so long as it depends on where the
    libc internal read buffer (in .bss) is mapped at runtime.
    If it happens to reside in the file-backed data mapping of libc (in the
    page-aligned slack space trailing the file-backed data), the loader
    zero-padding the slack space does the early COW page replacement,
    setting things up at the very beginning itself.

    With gcc 4.8 based builds, the libc buffer got pushed out to a real
    anon mapping which triggers the issue.

    Reported-by: Anton Kolesov
    Cc: # 3.9
    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Flush and INVALIDATE the dcache page.

    This helper is only used for writeback of CODE pages to memory. So
    there's no value in keeping the dcache lines around. In fact it is risky,
    as natural eviction under pressure can cause an unneeded writeback,
    with weird issues on aliasing dcache configurations.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     

10 May, 2013

6 commits

  • Pull second set of arc arch updates from Vineet Gupta:
    "Aliasing VIPT dcache support for ARC

    I'm satisfied with testing, especially with fuse which has
    historically given grief to VIPT arches (ARM/PARISC...)"

    * tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: [TB10x] Remove GENERIC_GPIO
    ARC: [mm] Aliasing VIPT dcache support 4/4
    ARC: [mm] Aliasing VIPT dcache support 3/4
    ARC: [mm] Aliasing VIPT dcache support 2/4
    ARC: [mm] Aliasing VIPT dcache support 1/4
    ARC: [mm] refactor the core (i|d)cache line ops loops
    ARC: [mm] serious bug in vaddr based icache flush

    Linus Torvalds
     
  • Pull ARC port updates from Vineet Gupta:
    "Support for two new platforms based on ARC700:
    - Abilis TB10x SoC [Christian/Pierrick]
    - Simulator only System-C Model [Mischa]

    ARC specific MM improvements:
    - Avoid full TLB flush (ASID increment) on munmap (even single page)
    - VIPT Cache Flushing improvements
    + Delayed dcache flush for non-aliasing dcache (big performance boost)
    + icache flush aliasing agnostic (no need to kill all possible aliases)

    Others:
    - Avoid needless rebuild of DTB files for every kernel build
    - Remove builtin cmdline as that is already provided by DeviceTree/bootargs
    - Fixing unaligned access emulation corner case
    - checkpatch fixes [Sachin]
    - Various fixlets [Noam]
    - Minor build failures/cleanups"

    * tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (35 commits)
    ARC: [mm] Lazy D-cache flush (non aliasing VIPT)
    ARC: [mm] micro-optimize page size icache invalidate
    ARC: [mm] remove the pessimistic all-alias-invalidate icache helpers
    ARC: [mm] consolidate icache/dcache sync code
    ARC: [mm] optimise icache flush for kernel mappings
    ARC: [mm] optimise icache flush for user mappings
    ARC: [mm] optimize needless full mm TLB flush on munmap
    ARC: Add support for nSIM OSCI System C model
    ARC: [TB10x] Adapt device tree to new compatible string
    ARC: [TB10x] Add support for TB10x platform
    ARC: [TB10x] Device tree of TB100 and TB101 Development Kits
    ARC: Prepare interrupt code for external controllers
    ARC: Allow embedded arc-intc to be properly placed in DT intc hierarchy
    ARC: [cmdline] Don't overwrite u-boot provided bootargs
    ARC: [cmdline] Remove CONFIG_CMDLINE
    ARC: [plat-arcfpga] defconfig update
    ARC: unaligned access emulation broken if callee-reg dest of LD/ST
    ARC: unaligned access emulation error handling consolidation
    ARC: Debug/crash-printing Improvements
    ARC: fix typo with clock speed
    ...

    Linus Torvalds
     
  • Enforce congruency of userspace shared mappings

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Fix the one zillion warnings

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • This is the meat of the series which prevents any dcache alias creation
    by always keeping the U and K mapping of a page congruent.
    If a mapping already exists and another one tries to access the page,
    the previous one is flushed to the physical page (wback+inv).

    Essentially flush_dcache_page()/copy_user_highpage() create K-mapping
    of a page, but try to defer flushing, unless a U-mapping exists.
    When page is actually mapped to userspace, update_mmu_cache() flushes
    the K-mapping (in certain cases this can be optimised out)

    Additionally flush_cache_mm(), flush_cache_range(), flush_cache_page()
    handle the purging of stale userspace mappings on exit/munmap...

    flush_anon_page() handles the existing U-mapping for anon page before
    kernel reads it via the GUP path.

    Note that while not complete, this is enough to boot a simple
    dynamically linked Busybox based rootfs
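
    "Congruent" here means that two virtual addresses index the same set
    of lines in the VIPT cache. A minimal sketch of such a check follows;
    the page size and way size are assumed values chosen for illustration,
    and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. */
#define TOY_PAGE_SHIFT 13u              /* 8 KiB pages */
#define TOY_WAY_SIZE   (16u * 1024u)    /* cache size divided by ways */

/* The "color" is the part of the cache index that lies above the page
 * offset; it is the only part that differs between aliases. */
static unsigned cache_color(uintptr_t vaddr)
{
    return (unsigned)((vaddr & (TOY_WAY_SIZE - 1)) >> TOY_PAGE_SHIFT);
}

/* Two mappings of the same physical page cannot alias in the cache
 * if their virtual addresses have the same color. */
static bool addrs_congruent(uintptr_t a, uintptr_t b)
{
    return cache_color(a) == cache_color(b);
}
```

    With these assumed values there are two colors, so addresses 0x0000
    and 0x4000 are congruent while 0x0000 and 0x2000 are not.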

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • This preps the low level dcache flush helpers to take vaddr argument in
    addition to the existing paddr to properly flush the VIPT dcache

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     

09 May, 2013

2 commits


07 May, 2013

7 commits

  • flush_dcache_page() is the MM hook to ensure that a page has consistent
    views between kernel and userspace. Thus it is called when

    * kernel writes to a page which at some later point could get mapped to
    userspace (so kernel mapping needs to be flushed-n-inv)
    * kernel is about to read from a page with possible userspace mappings
    (so userspace mappings need to be made coherent with kernel ones)

    However, for a non-aliasing VIPT dcache, any userspace mapping will always
    be congruent to the kernel mapping. Thus the d-cache need not be flushed
    at all (or the flush can be delayed indefinitely).

    The only reason it does need to be flushed is when mapping code pages.
    Since the icache doesn't snoop the dcache, those dirty dcache lines need
    to be written back to memory and the icache lines invalidated so that
    subsequent icache line fetches get the right data.
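
    The deferral described above can be modelled as a small state machine.
    This is a toy sketch under stated assumptions: the struct and function
    names are hypothetical, the boolean plays the role that a per-page
    flag would play in the kernel, and the counter stands in for the
    actual writeback + icache invalidate.

```c
#include <stdbool.h>

/* Toy model of a page; not the kernel's struct page. */
struct toy_page {
    bool dcache_dirty;   /* a flush is owed for this page */
    int  flushes;        /* dcache writeback + icache invalidate ops done */
};

/* Kernel wrote to the page: only record that a flush is owed. */
static void toy_flush_dcache_page(struct toy_page *pg)
{
    pg->dcache_dirty = true;
}

/* Page is being mapped into userspace: pay for the flush only when
 * the mapping is executable and a flush is actually pending. */
static void toy_update_mmu_cache(struct toy_page *pg, bool vm_exec)
{
    if (vm_exec && pg->dcache_dirty) {
        pg->flushes++;
        pg->dcache_dirty = false;
    }
}
```

    In this model a data page never incurs a flush, and a code page
    incurs exactly one flush, at the point where it is mapped.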

    Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.

    (1) FPGA @ 80 MHz

    Processor, Processes - times in microseconds - smaller is better
    ------------------------------------------------------------------------------
    Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                 call  I/O stat clos TCP  inst hndl proc proc proc
    --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
    3.9-rc6-a Linux 3.9.0-r   80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
    3.9-rc6-b Linux 3.9.0-r   80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
    3.9-rc7-c Linux 3.9.0-r   80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
                                                                    ^^^^ ^^^^ ^^^

    File & VM system latencies in microseconds - smaller is better
    -------------------------------------------------------------------------------
    Host                 OS    0K File      10K File     Mmap   Prot   Page   100fd
                            Create Delete Create Delete Latency Fault  Fault  selct
    --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
    3.9-rc6-a Linux 3.9.0-r  317.8  204.2 1122.3  375.1  3522.0 4.288    20.7 126.8
    3.9-rc6-b Linux 3.9.0-r  298.7  223.0 1141.6  367.8  3531.0 4.866    20.9 126.4
    3.9-rc7-c Linux 3.9.0-r  278.4  179.2  862.1  339.3  3705.0 3.223    20.3 126.6
                             ^^^^^  ^^^^^  ^^^^^  ^^^^

    (2) Customer Silicon @ 500 MHz (166 MHz mem)

    ------------------------------------------------------------------------------
    Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                 call  I/O stat clos TCP  inst hndl proc proc proc
    --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
    abilis-ba Linux 3.9.0-r  497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
    abilis-ca Linux 3.9.0-r  497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
                                                                    ^^^^ ^^^^ ^^^

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • start address is already page aligned and size is const PAGE_SIZE,
    thus fixups for alignment not needed in generated code.

    bloat-o-meter vmlinux-mm5 vmlinux
    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-32 (-32)
    function old new delta
    __inv_icache_page 82 50 -32

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • No users of this code anymore - so RIP !

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Now that we have same helper used for all icache invalidates (i.e.
    vaddr+paddr based exact line invalidate), consolidate the open coded
    calls into one place.

    Also rename flush_icache_range_vaddr => __sync_icache_dcache

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • This change continues the theme from prev commit - this time icache
    handling for kernel's own code modification (vmalloc: loadable modules,
    breakpoints for kprobes/kgdb...)

    flush_icache_range() calls the CDU icache helper with vaddr to enable
    exact line invalidate.

    For a true kernel-virtual mapping, the vaddr is actually virtual hence
    valid as index into cache. For kprobes breakpoint however, the vaddr arg
    is actually paddr - since that's how normal kernel is mapped in ARC
    memory map. This implies that CDU will use the same addr for
    indexing as for tag match - which is fine since kernel code would only
    have that "implicit" mapping and none other.

    This should speed up module loading significantly - especially on default
    ARC700 icache configurations (32k) which alias.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • ARC icache doesn't snoop dcache thus executable pages need to be made
    coherent before mapping into userspace in flush_icache_page().

    However ARC700 CDU (hardware cache flush module) requires both vaddr
    (index in cache) as well as paddr (tag match) to correctly identify a
    line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus
    the paddr only based flush_icache_page() API couldn't be implemented
    efficiently. It had to loop through all possible alias indexes and perform
    the invalidate operation (of course the cache op would only succeed at
    the index(es) where the tag matches - typically only 1, but the cost of
    visiting all the cache-bins needs to be paid nevertheless).

    Turns out however that the vaddr (along with paddr) is available in
    update_mmu_cache() hence better suits ARC icache flush semantics.
    With both vaddr+paddr, exactly one flush operation per line is done.
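
    The cost difference can be estimated with a little arithmetic. The
    sketch below is illustrative only: the helper names are hypothetical
    and the cache geometry is passed in as parameters rather than taken
    from any real configuration.

```c
/* Number of distinct cache indexes a physical page could occupy:
 * one per page-sized slice of a single cache way. */
static unsigned alias_count(unsigned cache_bytes, unsigned ways,
                            unsigned page_bytes)
{
    unsigned way_bytes = cache_bytes / ways;

    return way_bytes > page_bytes ? way_bytes / page_bytes : 1;
}

/* paddr-only flush of one page: every line is tried at every alias index. */
static unsigned ops_paddr_only(unsigned cache_bytes, unsigned ways,
                               unsigned page_bytes, unsigned line_bytes)
{
    return alias_count(cache_bytes, ways, page_bytes) *
           (page_bytes / line_bytes);
}

/* vaddr+paddr flush of one page: exactly one operation per line. */
static unsigned ops_vaddr_paddr(unsigned page_bytes, unsigned line_bytes)
{
    return page_bytes / line_bytes;
}
```

    Assuming, purely as an example, a 32 KiB 2-way cache, 8 KiB pages and
    64-byte lines, the paddr-only scheme issues 256 operations per page
    against 128 for the vaddr+paddr scheme.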

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Signed-off-by: Noam Camus
    Signed-off-by: Vineet Gupta

    Noam Camus
     

30 Apr, 2013

1 commit


09 Apr, 2013

5 commits


16 Feb, 2013

15 commits

  • !CONFIG_ARC_HAS_(I|D)CACHE makes Linux disable caches (assuming they
    exist in hardware) - mostly for debugging issues with new peripherals.
    However, independent of CONFIG_ARC_HAS_(I|D)CACHE, Linux also needs to
    handle non-existent caches, using the information in the Cache BCRs
    (Build Configuration Registers).

    Reported-by: Alexey Brodkin
    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Implement ioremap_prot() to allow mapping IO memory with variable
    protection via TLB.

    Implementing this allows the /dev/mem driver to use its generic access()
    VMA callback, which in turn allows ptrace to examine data in regions
    memory-mapped via /dev/mem, such as ARC DCCM.

    The end result is that it is possible to examine values of variables
    placed into DCCM in user space programs via GDB.

    CC: Alexey Brodkin
    CC: Noam Camus
    Acked-by: Vineet Gupta
    Signed-off-by: Gilad Ben-Yossef
    Signed-off-by: Vineet Gupta

    Gilad Ben-Yossef
     
  • * Includes mapping of CCMs in address space
    * Annotations to move arbitrary code/data into CCM
    * Moving some of the critical code/data into CCM
    * Runtime detection/reporting

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • ARC common code to enable a SMP system + ISS provided SMP extensions.

    ARC700 natively lacks SMP support, hence some of the core features are
    only enabled if SoCs have the necessary h/w pixie-dust. This
    includes:
    -Inter Processor Interrupts (IPI)
    -Cache coherency
    -load-locked/store-conditional
    ...

    The low level exception handling would be completely broken in SMP
    because we don't have hardware assisted stack switching. Thus a fair bit
    of this code is repurposing the MMU_SCRATCH reg for event handler
    prologues to keep them re-entrant.

    Many thanks to Rajeshwar Ranga for his initial "major" contributions to
    SMP Port (back in 2008), and to Noam Camus and Gilad Ben-Yossef for help
    with resurrecting that in 3.2 kernel (2012).

    Note that this platform code is again singleton design pattern - so
    multiple SMP platforms won't build at the moment - this deficiency is
    addressed in subsequent patches within this series.

    Signed-off-by: Vineet Gupta
    Cc: Arnd Bergmann
    Cc: Thomas Gleixner
    Cc: Rajeshwar Ranga
    Cc: Noam Camus
    Cc: Gilad Ben-Yossef

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • * mem size now runtime configured (prev CONFIG_ARC_PLAT_SDRAM_SIZE)
    * core cpu clk runtime configured (prev CONFIG_ARC_PLAT_CLK)

    Signed-off-by: Vineet Gupta
    Cc: Arnd Bergmann
    Cc: Grant Likely

    Vineet Gupta
     
  • This is minimal infrastructure needed for devicetree work.
    It uses a sample "skeleton" devicetree - embedded in kernel image -
    to print the board, manufacturer by parsing the top-level "compatible"
    string.

    As of now we don't need any additional "board" specific "machine_desc".

    TODO: support interpreting the command line as boot-loader passed dtb

    Signed-off-by: Vineet Gupta
    Cc: Arnd Bergmann
    Cc: Grant Likely
    Cc: devicetree-discuss@lists.ozlabs.org
    Cc: Rob Herring
    Cc: James Hogan
    Reviewed-by: Rob Herring
    Reviewed-by: James Hogan

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta
    Acked-by: Arnd Bergmann

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • This includes recent changes to make handler "retry" and/or "killable"

    The killable (early exit) logic is loosely based on how SH implements it:
        return if SIGKILL + either of VM_FAULT_OOM or VM_FAULT_RETRY
    which is different from the Hexagon implementation, which would NOT
    early exit for:
        SIGKILL + VM_FAULT_OOM + !VM_FAULT_RETRY
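
    The two early-exit policies can be written as predicates. This is a
    sketch of the rule as stated above, not the actual fault handler code:
    the flag values and function names are hypothetical, and the second
    predicate models only the one difference this message mentions.

```c
#include <stdbool.h>

/* Hypothetical flag values; not the kernel's VM_FAULT_* definitions. */
#define TOY_FAULT_OOM   0x1u
#define TOY_FAULT_RETRY 0x2u

/* SH-style rule: a pending SIGKILL plus either OOM or RETRY exits early. */
static bool early_exit_sh_style(bool sigkill_pending, unsigned fault)
{
    return sigkill_pending &&
           (fault & (TOY_FAULT_OOM | TOY_FAULT_RETRY)) != 0;
}

/* Hexagon-style rule as described: SIGKILL + OOM without RETRY does
 * not exit early, so RETRY is required here. */
static bool early_exit_hexagon_style(bool sigkill_pending, unsigned fault)
{
    return sigkill_pending && (fault & TOY_FAULT_RETRY) != 0;
}
```

    The predicates disagree only for a pending SIGKILL with OOM set and
    RETRY clear.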

    credits: Non executable stack support from Simon Spooner

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • * MMU I-TLB / D-TLB Miss Exceptions
    - Fast Path TLB Refill Handler
    - slowpath TLB creation via do_page_fault() -> update_mmu_cache()
    * Duplicate PD Exception Handler

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • ARC700 MMU provides for tagging TLB entries with an 8-bit ASID to avoid
    having to flush the TLB on every task switch.

    It also allows for a quick way to invalidate all the TLB entries for a
    task, which is useful for:
    * COW semantics during fork()
    * task exit()ing
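
    Why a fresh ASID invalidates a task's entries without touching the TLB
    can be seen in a toy model. Everything below is hypothetical (names,
    TLB size, allocation policy), and handling of the 8-bit ASID space
    wrapping around is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_TLB_ENTRIES 8

struct toy_tlb_entry {
    bool     valid;
    uint8_t  asid;   /* address space tag stored with the entry */
    uint32_t vpn;    /* virtual page number */
};

struct toy_tlb {
    struct toy_tlb_entry e[TOY_TLB_ENTRIES];
};

/* A lookup hits only if both the page number and the current ASID match. */
static bool toy_tlb_hit(const struct toy_tlb *t, uint8_t cur_asid,
                        uint32_t vpn)
{
    for (int i = 0; i < TOY_TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].asid == cur_asid &&
            t->e[i].vpn == vpn)
            return true;
    }
    return false;
}

/* Give a task a fresh ASID. Its old entries remain in the TLB but can
 * no longer match, which has the effect of invalidating them. */
static uint8_t toy_new_asid(uint8_t *next_asid)
{
    return (*next_asid)++;
}
```

    After a task is moved to a new ASID, lookups for its old translations
    miss even though no TLB entry was explicitly cleared.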

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • * ARC700 has VIPT L1 Caches
    * Caches don't snoop and are not coherent
    * Given the PAGE_SIZE and Cache associativity, we don't support aliasing
    D$ configurations (yet), but do allow aliasing I$ configs

    Signed-off-by: Vineet Gupta

    Vineet Gupta