04 Dec, 2011

1 commit


04 Nov, 2011

1 commit

  • An earlier Tilera compiler generated calls to an "__ll_mul"
    function for long long multiplication. Our libgcc supported that
    as an alias for the normal __muldi3 routine, so we made it available
    to kernel modules as well. However, for a while now the compiler
    has internally been generating only the standard __muldi3 symbol,
    and the version we are giving back to the community does not have
    the __ll_mul alias, so we are removing it from the kernel too.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     

13 Oct, 2011

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

13 May, 2011

1 commit

  • This support was partially present in the existing code (look for
    "__tilegx__" ifdefs) but with this change you can build a working
    kernel using the TILE-Gx toolchain and ARCH=tilegx.

    Most of these files are new, generally adding a foo_64.c file
    where previously there was just a foo_32.c file.

    The ARCH=tilegx directive redirects to arch/tile, not arch/tilegx,
    using the existing SRCARCH mechanism in the top-level Makefile.

    Changes to existing files:

    - and changed to factor the
    include of in the common header.

    - and arch/tile/kernel/compat.c changed to remove
    the "const" markers I had put on compat_sys_execve() when trying
    to match some recent similar changes to the non-compat execve.
    It turns out the compat version wasn't "upgraded" to use const.

    - and were
    previously included accidentally, with the 32-bit contents. Now
    they have the proper 64-bit contents.

    Finally, I had to hack the existing hacky drivers/input/input-compat.h
    to add yet another "#ifdef" for INPUT_COMPAT_TEST (same as x86_64).

    Signed-off-by: Chris Metcalf
    Acked-by: Dmitry Torokhov [drivers/input]

    Chris Metcalf
     

05 May, 2011

2 commits

  • Otherwise, it's possible to end up with the prefetcher pulling
    data into cache that the code believes has been flushed.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     
  • This semantic was already true for atomic operations within the kernel,
    and this change makes it true for the fast atomic syscalls (__NR_cmpxchg
    and __NR_atomic_update) as well. Previously, user-space had to use
    the fast atomic syscalls exclusively to update memory, since raw stores
    could lose a race with the atomic update code even when the atomic update
    hadn't actually modified the value.

    With this change, we no longer write back the value to memory if it
    hasn't changed. This allows certain types of idioms in user space to
    work as expected, e.g. "atomic exchange" to acquire a spinlock, followed
    by a raw store of zero to release the lock.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
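
The lock idiom mentioned in the commit above can be sketched in portable C11. This is only an illustration: `demo_lock_t`, `demo_trylock()`, `demo_lock()` and `demo_unlock()` are hypothetical names, and C11 atomics stand in for the tile fast atomic syscalls. A failed exchange stores 1 over 1, so the value is unchanged; with the semantics described above, such an update is not written back and therefore cannot overwrite a concurrent raw store of zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space lock word: 0 = free, 1 = held. */
typedef struct {
    atomic_int word;
} demo_lock_t;

/* One "atomic exchange" attempt: returns 1 if we took the lock. */
static int demo_trylock(demo_lock_t *l)
{
    return atomic_exchange_explicit(&l->word, 1, memory_order_acquire) == 0;
}

/* Acquire by repeating the exchange until the old value was 0. */
static void demo_lock(demo_lock_t *l)
{
    while (!demo_trylock(l))
        ;  /* spin */
}

/* Release with a plain store of zero, not a read-modify-write. */
static void demo_unlock(demo_lock_t *l)
{
    assert(atomic_load_explicit(&l->word, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->word, 0, memory_order_release);
}
```

A caller would bracket its critical section with `demo_lock(&l)` and `demo_unlock(&l)`.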
     

20 Mar, 2011

1 commit

  • Commit 8d7718aa082aaf30a0b4989e1f04858952f941bc changed "int"
    to "u32" in the prototypes but not the definition.
    I missed this when I saw the patch go by on LKML.

    We cast "u32 *" to "int *" since we are tying into the underlying
    atomics framework, and atomic_t uses int as its value type.

    Signed-off-by: Chris Metcalf
    Reviewed-by: Michel Lespinasse

    Chris Metcalf
     

11 Mar, 2011

3 commits

  • The first issue fixed in this patch is that pending rwlock write locks
    could lock out new readers; this could cause a deadlock if a read lock was
    held on cpu 1, a write lock was then attempted on cpu 2 and was pending,
    and cpu 1 was interrupted and attempted to re-acquire a read lock.
    The write lock code was modified to not lock out new readers.

    The second issue fixed is that there was a narrow race window where a tns
    instruction had been issued (setting the lock value to "1") and the store
    instruction to reset the lock value correctly had not yet been issued.
    In this case, if an interrupt occurred and the same cpu then tried to
    manipulate the lock, it would find the lock value set to "1" and spin
    forever, assuming some other cpu was partway through updating it. The fix
    is to enforce an interrupt critical section around the tns/store pair.

    In addition, this change now arranges to always validate that after
    a readlock we have not wrapped around the count of readers, which
    is only eight bits.

    Since these changes make the rwlock "fast path" code heavier weight,
    I decided to move all the rwlock code out of line, leaving only the
    conventional spinlock code with fastpath inlines. Since the read_lock
    and read_trylock implementations ended up very similar, I just expressed
    read_lock in terms of read_trylock.

    As part of this change I also eliminate support for the now-obsolete
    tns_atomic mode.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
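
A rough sketch of the "read_lock in terms of read_trylock" structure from the rwlock commit above, with the eight-bit reader-count check. It does not reproduce the tile implementation, which uses the tns instruction and an interrupt critical section; the word layout, the `DEMO_*` constants and the `demo_*` functions are assumptions made for illustration, and the sketch checks for wraparound before incrementing rather than after.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_WRITER  0x80000000u  /* assumed: top bit = held for write */
#define DEMO_RD_MASK 0xffu        /* assumed: low eight bits = reader count */

typedef struct {
    _Atomic uint32_t word;
} demo_rwlock_t;

/* Returns 1 on success, 0 if a writer currently holds the lock. */
static int demo_read_trylock(demo_rwlock_t *rw)
{
    uint32_t old = atomic_load_explicit(&rw->word, memory_order_relaxed);

    for (;;) {
        if (old & DEMO_WRITER)
            return 0;
        /* The reader count is only eight bits wide; refuse to wrap it. */
        assert((old & DEMO_RD_MASK) != DEMO_RD_MASK);
        if (atomic_compare_exchange_weak_explicit(&rw->word, &old, old + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return 1;
        /* 'old' was refreshed by the failed compare-exchange; retry. */
    }
}

/* read_lock is just read_trylock in a loop. */
static void demo_read_lock(demo_rwlock_t *rw)
{
    while (!demo_read_trylock(rw))
        ;  /* spin */
}

static void demo_read_unlock(demo_rwlock_t *rw)
{
    atomic_fetch_sub_explicit(&rw->word, 1, memory_order_release);
}
```

Only a held write lock turns readers away here, so a cpu that already holds a read lock can take it again while a writer is merely waiting.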
     
  • The Tilera architecture traditionally supports 64KB page sizes
    to improve TLB utilization and improve performance when the
    hardware is being used primarily to run a single application.

    For more generic server scenarios, it can be beneficial to run
    with 4KB page sizes, so this commit allows that to be specified
    (by modifying the arch/tile/include/hv/pagesize.h header).

    As part of this change, we also re-worked the PTE management
    slightly so that PTE writes all go through a __set_pte() function
    where we can do some additional validation. The set_pte_order()
    function was eliminated since the "order" argument wasn't being used.

    One bug uncovered was in the PCI DMA code, which wasn't properly
    flushing the specified range. This was benign with 64KB pages,
    but with 4KB pages we were getting some larger flushes wrong.

    The per-cpu memory reservation code also needed updating to
    conform with the newer percpu stuff; before it always chose 64KB,
    and that was always correct, but with 4KB granularity we now have
    to pay closer attention and reserve the amount of memory that will
    be requested when the percpu code starts allocating.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     
  • This is a grab bag of changes with no actual change to generated code.
    This includes whitespace and comment typos, plus a couple of stale
    comments being removed.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     

02 Mar, 2011

5 commits

  • This adds a grab bag of symbols that have been missing for
    various modules.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     
  • It now takes an additional argument so it can be used to
    flush-and-invalidate pages that are cached using hash-for-home
    as well as those that are cached with the coherence point on a single cpu.

    This allows it to be used more widely for changing the coherence
    point of arbitrary pages when necessary.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     
  • This avoids having to maintain an additional separate assembly
    file, and of course the inline is slightly more efficient as well.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     
  • The current implementations of __ndelay and __udelay call a hypervisor
    service to delay, but the hypervisor service isn't actually implemented
    very well, and the consensus is that Linux should handle figuring this
    out natively and not use a hypervisor service.

    By converting nanoseconds to cycles, and then spinning until the
    cycle counter reaches the desired cycle, we get several benefits:
    first, we are sensitive to the actual clock speed; second, we use
    less power by issuing a slow SPR read once every six cycles while
    we delay; and third, we properly handle the case of an interrupt by
    exiting at the target time rather than after some number of cycles.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
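
The delay approach in the commit above (convert nanoseconds to cycles, then spin until the cycle counter reaches the target) might look roughly like the following. The cycle counter is simulated so the sketch runs anywhere; `demo_read_cycles()`, `demo_ns_to_cycles()` and `demo_ndelay()` are hypothetical names, and the clock rate is passed in as a parameter rather than read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated cycle counter; real code would read a hardware register. */
static uint64_t demo_cycles;

/* Each read advances the simulated counter by six cycles. */
static uint64_t demo_read_cycles(void)
{
    demo_cycles += 6;
    return demo_cycles;
}

/* Convert nanoseconds to cycles at 'hz', rounding up. */
static uint64_t demo_ns_to_cycles(uint64_t ns, uint64_t hz)
{
    /* Bounds chosen so the 64-bit multiplication cannot overflow. */
    assert(ns <= 1000000000ull && hz <= 4000000000ull);
    return (ns * hz + 999999999ull) / 1000000000ull;
}

/* Spin until the counter reaches the target; returns the final count. */
static uint64_t demo_ndelay(uint64_t ns, uint64_t hz)
{
    uint64_t target = demo_read_cycles() + demo_ns_to_cycles(ns, hz);
    uint64_t now;

    while ((now = demo_read_cycles()) < target)
        ;  /* spin */
    return now;
}
```

Because the loop compares against an absolute target, time spent elsewhere (for example in an interrupt handler) still counts toward the delay, which is the third benefit the commit message lists.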
     
  • The convention changed to, e.g., ".data..page_aligned". This commit
    fixes the places in the tile architecture that were still using the
    old convention. One tile-specific section (.init.page) was dropped
    in favor of just using an "aligned" attribute.

    Sam Ravnborg pointed out __PAGE_ALIGNED_BSS, etc.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     

25 Nov, 2010

1 commit

  • This change fixes a bug where memchr() reads the first word
    of the source even if the length is zero. Ironically, the code
    was originally written with a test to avoid exactly this problem,
    but to make the code conform to Linux coding standards with all
    declarations preceding all statements, the first load from memory
    was moved up above that test as the initial value for a variable.

    The change just moves all the variable declarations to the top
    of the file, with no initializers, so that the test can also be
    at the top of the file.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
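
The ordering fixed by the memchr() commit above can be shown with a simplified byte-at-a-time version: declarations come first with no initializers that touch memory, then the zero-length test, and only then the first load. `demo_memchr()` is a hypothetical stand-in and does not use the word-at-a-time technique of the tile implementation.

```c
#include <assert.h>
#include <stddef.h>

static void *demo_memchr(const void *s, int c, size_t n)
{
    /* Declarations only; nothing here reads from the source buffer. */
    const unsigned char *p;
    unsigned char target;

    /* Zero-length test comes before any load from 's'. */
    if (n == 0)
        return NULL;

    /* A non-empty search needs a valid pointer. */
    assert(s != NULL);
    p = s;
    target = (unsigned char)c;

    do {
        if (*p == target)
            return (void *)p;
        p++;
    } while (--n != 0);

    return NULL;
}
```

With this ordering, a call such as `demo_memchr(NULL, 'a', 0)` returns NULL without dereferencing the pointer.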
     

15 Nov, 2010

1 commit

  • This avoids a deadlock in the IGMP code where one core gets a read
    lock, another core starts trying to get a write lock (thus blocking
    new readers), and then the first core tries to recursively re-acquire
    the read lock.

    We still try to preserve some degree of balance by giving priority
    to additional write lockers that come along while the lock is held
    for write, so they can all complete quickly and return the lock to
    the readers.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     

02 Nov, 2010

1 commit

  • This change makes KM_TYPE_NR independent of the actual deprecated
    list of km_type values, which are no longer used in tile code anywhere.
    For now we leave it set to 8, allowing that many nested mappings,
    and thus reserving 32MB of address space.

    A few remaining places using KM_* values were cleaned up as well.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
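
One way to reproduce the 32MB figure in the commit above is to assume the reservation is KM_TYPE_NR slots per cpu, one page each. The cpu count (64) and page size (64KB) used in the usage note are assumptions, not values stated in the commit message, and `demo_kmap_reserved_bytes()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_KM_TYPE_NR 8  /* nested mappings allowed, per the commit */

/* Address space reserved: slots per cpu * number of cpus * page size. */
static size_t demo_kmap_reserved_bytes(size_t nr_cpus, size_t page_size)
{
    assert(nr_cpus != 0 && page_size != 0);
    return (size_t)DEMO_KM_TYPE_NR * nr_cpus * page_size;
}
```

Under those assumptions, `demo_kmap_reserved_bytes(64, 64 * 1024)` gives 8 * 64 * 64KB = 32MB.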
     

16 Oct, 2010

1 commit

  • Our internal process shares memcpy, memset, etc., with libc, and
    we did some minor tweaking as part of moving from uclibc to glibc,
    which is now reflected in the kernel versions of these files.

    There are no semantic changes in this commit, just whitespace
    (memcpy_32.S now properly uses tabs), naming (memmove.c instead
    of memmove_32.c, since TILE-Gx shares the file with TILEPro),
    and a couple of other minor tweaks.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     

15 Oct, 2010

1 commit


06 Oct, 2010

1 commit


13 Aug, 2010

1 commit

  • This change rolls up random cleanups not representing any actual bugs.

    - Remove a stale CONFIG_ value from the default tile_defconfig
    - Remove unused tns_atomic_xxx() family of methods from
    - Optimize get_order() using Tile's "clz" instruction
    - Fix a bad hypervisor upcall name (not currently used in Linux anyway)
    - Use __copy_in_user_inatomic() name for consistency, and export it
    - Export some additional hypervisor driver I/O upcalls and some homecache calls
    - Remove the obfuscating MEMCPY_TEST_WH64 support code
    - Other stray comment cleanups, #if 0 removal, etc.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
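
The get_order() item in the list above can be illustrated with a count-leading-zeros builtin. This is a sketch, not the tile code: it uses the GCC/Clang builtin `__builtin_clzl()` in place of the Tile "clz" instruction, and `DEMO_PAGE_SHIFT` (64KB pages) and `demo_get_order()` are assumed names and values.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_PAGE_SHIFT 16  /* assumed: 64KB pages */

/* Smallest order such that (page size << order) >= size. */
static int demo_get_order(unsigned long size)
{
    assert(size != 0);

    /* Number of whole pages beyond the first, minus rounding. */
    size = (size - 1) >> DEMO_PAGE_SHIFT;
    if (size == 0)
        return 0;  /* fits in one page; clz of zero is undefined */

    /* Position of the highest set bit, plus one. */
    return (int)(sizeof(unsigned long) * CHAR_BIT) - __builtin_clzl(size);
}
```

The clz form replaces a shift-and-count loop with a single instruction plus a subtraction.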
     

07 Jul, 2010

3 commits

  • This commit is primarily changes caused by reviewing "sparse"
    and "checkpatch" output on our sources, so is somewhat noisy, since
    things like "printk() -> pr_err()" (or whatever) throughout the
    codebase tend to get tedious to read. Rather than trying to tease
    apart precisely which things changed due to which type of code
    review, this commit includes various cleanups in the code:

    - sparse: Add declarations in headers for globals.
    - sparse: Fix __user annotations.
    - sparse: Using gfp_t consistently instead of int.
    - sparse: removing functions not actually used.
    - checkpatch: Clean up printk() warnings by using pr_info(), etc.;
    also avoid partial-line printks except in bootup code.
    - checkpatch: Use exposed structs rather than typedefs.
    - checkpatch: Change some C99 comments to C89 comments.

    In addition, a couple of minor other changes are rolled in
    to this commit:

    - Add support for a "raise" instruction to cause SIGFPE, etc., to be raised.
    - Remove some compat code that is unnecessary when we fully eliminate
    some of the deprecated syscalls from the generic syscall ABI.
    - Update the tile_defconfig to reflect current config contents.

    Signed-off-by: Chris Metcalf
    Acked-by: Arnd Bergmann

    Chris Metcalf
     
  • This code is used in other places in our system than in Linux, so
    to share it we now implement it as an inline function in our low-level
    headers, and instantiate it in one file in Linux's arch/tile/lib.
    The file is now cacheflush.c and is C code rather than the strangely-named
    and assembler-implemented __invalidate_icache.S.

    Signed-off-by: Chris Metcalf
    Acked-by: Arnd Bergmann

    Chris Metcalf
     
  • This wasn't properly tested until the perf-event subsystem started
    to get brought up under the tile architecture.

    The bug caused bogus atomic64_cmpxchg() values to be returned,
    among other things.

    Signed-off-by: Chris Metcalf
    Acked-by: Arnd Bergmann

    Chris Metcalf
     

05 Jun, 2010

1 commit

  • This change is the core kernel support for TILEPro and TILE64 chips.
    No driver support (except the console driver) is included yet.

    This includes the relevant Linux headers in asm/; the low-level
    "Tile architecture" headers in arch/, which are
    shared with the hypervisor, etc., and are build-system agnostic;
    and the relevant hypervisor headers in hv/.

    Signed-off-by: Chris Metcalf
    Acked-by: Arnd Bergmann
    Acked-by: FUJITA Tomonori
    Reviewed-by: Paul Mundt

    Chris Metcalf