23 Jul, 2014

2 commits


03 Jun, 2014

1 commit


26 Mar, 2014

2 commits


07 Nov, 2013

1 commit


06 Nov, 2013

2 commits

  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • __get_cpu_var() is used for multiple purposes in the kernel source. One of them is
    address calculation via the form &__get_cpu_var(x). This calculates the address for
    the instance of the percpu variable of the current processor based on an offset.

    Other use cases are for storing and retrieving data from the current processor's percpu area.
    __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.

    __get_cpu_var() is defined as :

    #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

    __get_cpu_var() always only does an address determination. However, store and retrieve operations
    could use a segment prefix (or global register on other platforms) to avoid the address calculation.

    this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
    optimized assembly code to read and write per cpu variables.

    This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
    or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided
    and fewer registers are used when code is generated.

    At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.

    The patchset includes passes over all arches as well. Once these operations are used throughout,
    specialized macros can be defined in non-x86 arches as well in order to optimize per cpu access,
    for example by using a global register that may be set to the per cpu base.

    Transformations done to __get_cpu_var()

    1. Determine the address of the percpu instance of the current processor.

    DEFINE_PER_CPU(int, y);
    int *x = &__get_cpu_var(y);

    Converts to

    int *x = this_cpu_ptr(&y);

    2. Same as #1 but this time an array structure is involved.

    DEFINE_PER_CPU(int, y[20]);
    int *x = __get_cpu_var(y);

    Converts to

    int *x = this_cpu_ptr(y);

    3. Retrieve the content of the current processor's instance of a per cpu variable.

    DEFINE_PER_CPU(int, y);
    int x = __get_cpu_var(y);

    Converts to

    int x = __this_cpu_read(y);

    4. Retrieve the content of a percpu struct

    DEFINE_PER_CPU(struct mystruct, y);
    struct mystruct x = __get_cpu_var(y);

    Converts to

    memcpy(&x, this_cpu_ptr(&y), sizeof(x));

    5. Assignment to a per cpu variable

    DEFINE_PER_CPU(int, y);
    __get_cpu_var(y) = x;

    Converts to

    __this_cpu_write(y, x);

    6. Increment/Decrement etc of a per cpu variable

    DEFINE_PER_CPU(int, y);
    __get_cpu_var(y)++;

    Converts to

    __this_cpu_inc(y);

    Acked-by: Vineet Gupta
    Signed-off-by: Christoph Lameter

    Christoph Lameter
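
The six transformations above can be tried out in a small userspace model. This is only a sketch under stated assumptions: the per-cpu area is modeled as one struct copy per CPU, `pcpu`, `areas`, `current_cpu` and the `demo_*` helpers are hypothetical names, and the macros below are simplified stand-ins written for this example, not the kernel's definitions (which can use a segment prefix or a global register instead of plain pointer arithmetic). It relies on the GCC/Clang `__typeof__` extension.

```c
#include <string.h>

struct mystruct { int a; int b; };

/* Template instance: only its member offsets are used, never its contents. */
static struct percpu_model {
    int y;
    int arr[20];
    struct mystruct s;
} pcpu;

#define NR_CPUS 4                          /* hypothetical CPU count */
static struct percpu_model areas[NR_CPUS]; /* one copy of the area per CPU */
static int current_cpu;                    /* stand-in for the running CPU */

/* Address calculation: this CPU's base plus the variable's offset. */
#define this_cpu_ptr(ptr) \
    ((__typeof__((ptr) + 0))((char *)&areas[current_cpu] + \
                             ((char *)(ptr) - (char *)&pcpu)))
/* Simplified stand-ins: here they still go through this_cpu_ptr(). */
#define __this_cpu_read(var)       (*this_cpu_ptr(&(var)))
#define __this_cpu_write(var, val) (*this_cpu_ptr(&(var)) = (val))
#define __this_cpu_inc(var)        ((*this_cpu_ptr(&(var)))++)

int demo_scalar(int cpu)
{
    current_cpu = cpu;
    __this_cpu_write(pcpu.y, 41);            /* case 5: assignment */
    __this_cpu_inc(pcpu.y);                  /* case 6: increment */
    int *p = this_cpu_ptr(&pcpu.y);          /* case 1: address */
    return (*p == __this_cpu_read(pcpu.y))   /* case 3: read */
           ? *p : -1;
}

int demo_array(int cpu)
{
    current_cpu = cpu;
    int *x = this_cpu_ptr(pcpu.arr);         /* case 2: array, no '&' */
    x[3] = 7;
    return areas[cpu].arr[3];
}

struct mystruct demo_struct(int cpu)
{
    struct mystruct x;
    current_cpu = cpu;
    this_cpu_ptr(&pcpu.s)->a = cpu;
    this_cpu_ptr(&pcpu.s)->b = cpu * 10;
    memcpy(&x, this_cpu_ptr(&pcpu.s), sizeof(x)); /* case 4: struct copy */
    return x;
}

int read_y_on(int cpu)
{
    current_cpu = cpu;
    return __this_cpu_read(pcpu.y);
}
```

In this model a write made while `current_cpu` is 1 is visible only in CPU 1's copy; the other CPUs' copies keep their initial zero value.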
     

27 Sep, 2013

1 commit


27 Jun, 2013

1 commit

  • The __cpuinit type of throwaway sections might have made sense
    some time ago when RAM was more constrained, but now the savings
    do not offset the cost and complications. For example, the fix in
    commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
    shows the nasty type of bug that can be created with improper use
    of the various __init prefixes.

    After a discussion on LKML[1] it was decided that cpuinit should go
    the way of devinit and be phased out. Once all the users are gone,
    we can then finally remove the macros themselves from linux/init.h.

    Note that some harmless section mismatch warnings may result, since
    notify_cpu_starting() and cpu_up(), which are arch independent
    (kernel/cpu.c), are flagged as __cpuinit -- so if we remove the
    __cpuinit from arch specific callers, we will also get section
    mismatch warnings.
    As an intermediate step, we intend to turn the linux/init.h cpuinit
    content into no-ops as early as possible, since that will get rid
    of these warnings. In any case, they are temporary and harmless.

    This removes all the arch/arc uses of the __cpuinit macros from
    all C files. Currently arc does not have any __CPUINIT used in
    assembly files.

    [1] https://lkml.org/lkml/2013/5/20/589

    Cc: Vineet Gupta
    Signed-off-by: Paul Gortmaker
    Signed-off-by: Vineet Gupta

    Paul Gortmaker
     

22 Jun, 2013

1 commit


09 Apr, 2013

1 commit


16 Feb, 2013

2 commits

    The 64bit RTSC is not reliable, causing spurious "jumps" in the higher
    word, making Linux timekeeping go bonkers. So as of now just use the
    lower 32bit timestamp.

    A cleaner approach would have been removing RTSC support altogether, as the
    32bit RTSC is equivalent to the old TIMER1 based solution, but some customers
    can use the 32bit RTSC in SMP sync fashion (vs. TIMER1 which, being incore,
    can't be synced easily).

    A fallout of this is that sched_clock()'s hardware assisted version needs
    to go away since it can't use a 32bit wrapping counter - instead we use
    the generic "weak" jiffies based version.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
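
To illustrate why a wrapping 32bit counter is still usable for measuring intervals but awkward as an absolute timestamp, here is a minimal userspace sketch. The function names are hypothetical and it is not the ARC code; it only assumes a free-running 32bit counter that is sampled at least once per wrap period.

```c
#include <stdint.h>

/* Cycles elapsed between two samples of a 32-bit free-running counter.
 * The result is reduced modulo 2^32, so a single wrap between the two
 * samples is absorbed. */
uint32_t cycles_delta32(uint32_t last, uint32_t now)
{
    return (uint32_t)(now - last);
}

/* Fold a new sample into a 64-bit running total of cycles. Correct only
 * if the counter is sampled at least once per wrap period. */
uint64_t accumulate_cycles(uint64_t total, uint32_t last, uint32_t now)
{
    return total + cycles_delta32(last, now);
}

/* Using the raw 32-bit value directly as a timestamp: it appears to go
 * backwards whenever the counter wraps. */
int raw_value_went_backwards(uint32_t last, uint32_t now)
{
    return now < last;
}
```

For samples `0xFFFFFFF0` and `0x00000010` the delta is 0x20 cycles, while the raw values compare as if time had run backwards.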
     
    The original platform code organization followed a singleton design
    pattern - only one platform (and board thereof) would build at a time.

    Thus any platform/board specific code (e.g. irq init, early init ...)
    expected by ARC common code was exported as a well-defined set of APIs,
    with only ONE instance building ever.

    Now with the multiple-platform build requirement, that design no
    longer holds - multiple board specific calls need to build at the same
    time - so ARC common code can't use the API approach; it needs a
    callback based design where each board registers its specific set of
    functions, and at runtime, depending on board detection, the callbacks
    are used from the registry.

    This commit adds all the infrastructure, where board specific callbacks
    are specified as a "machine description".

    All the hooks are placed in the right spots, but no board callbacks are
    registered yet (with MACHINE_START/END constructs) so the hooks will not run.

    Next commit will actually convert the platform to this infrastructure.

    Signed-off-by: Vineet Gupta
    Cc: Arnd Bergmann
    Acked-by: Arnd Bergmann

    Vineet Gupta
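
The callback-registry idea described above can be sketched in userspace as follows. This is a simplified model under assumptions: the field names, `register_machine()`, `select_machine()` and the "board-a" entry are all hypothetical, and the explicit registration call stands in for the MACHINE_START/END constructs, whose actual mechanics are not shown in the commit text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical machine description: a name plus optional board hooks. */
struct machine_desc {
    const char *name;
    void (*init_early)(void);
    void (*init_irq)(void);
};

#define MAX_MACHINES 8
static const struct machine_desc *registry[MAX_MACHINES];
static size_t nr_machines;
static const struct machine_desc *machine;  /* chosen by board detection */

/* Each board adds its description; several can coexist in one build. */
int register_machine(const struct machine_desc *m)
{
    if (nr_machines == MAX_MACHINES)
        return -1;
    registry[nr_machines++] = m;
    return 0;
}

/* Runtime detection: pick the description matching the detected board. */
const struct machine_desc *select_machine(const char *detected)
{
    machine = NULL;
    for (size_t i = 0; i < nr_machines; i++)
        if (strcmp(registry[i]->name, detected) == 0)
            machine = registry[i];
    return machine;
}

/* Hook site in common code: a no-op unless a board supplied a callback. */
int arch_init_irq(void)
{
    if (machine && machine->init_irq) {
        machine->init_irq();
        return 1;
    }
    return 0;
}

/* Example board with only an irq hook. */
static int board_a_irq_calls;
static void board_a_init_irq(void) { board_a_irq_calls++; }
static const struct machine_desc board_a = {
    .name = "board-a",
    .init_irq = board_a_init_irq,
};
```

With nothing registered, `arch_init_irq()` does nothing, which mirrors the state this commit describes: hooks in place, no callbacks run.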
     

11 Feb, 2013

1 commit

    ARC700 includes 2 in-core 32bit timers, TIMER0 and TIMER1.
    Both have exactly the same capabilities.

    * programmable to count from TIMER_CNT to TIMER_LIMIT
    * for count 0 and LIMIT ~1, provides a free-running counter by
    auto-wrapping when limit is reached.
    * optionally interrupt when LIMIT is reached (oneshot event semantics)
    * rearming the interrupt provides periodic semantics
    * run at CPU clk

    ARC Linux uses TIMER0 for clockevent (periodic/oneshot) and TIMER1 for
    clocksource (free-running clock).

    Newer cores provide the RTSC insn, which gives a 64bit cpu clk snapshot
    and hence is more apt for clocksource when available.

    SMP poses a bit of a challenge for the global timekeeping clocksource /
    sched_clock() backend:
    - TIMER1 based local clocks are out-of-sync hence can't be used
      (thus we default to the jiffies based cs as well as sched_clock(),
      one/both of which a platform can override with its specific
      hardware assist)
    - RTSC is only allowed in SMP if it's cross-core-sync (Kconfig glue
      ensures that) and thus usable for both requirements.

    Signed-off-by: Vineet Gupta
    Cc: Thomas Gleixner

    Vineet Gupta
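
The timer behaviour listed above (count up to a limit, wrap, optionally interrupt) can be pictured with a small software model. This is a sketch only: the struct, its field names and the exact tick on which the wrap happens are assumptions made for illustration, not a description of the ARC700 hardware registers.

```c
#include <stdint.h>

/* Hypothetical software model of one up-counting timer. */
struct timer_model {
    uint32_t cnt;        /* current count */
    uint32_t limit;      /* wrap point */
    int irq_enabled;     /* raise an interrupt on reaching the limit? */
    unsigned irq_count;  /* interrupts raised so far (model bookkeeping) */
};

/* Advance the model by one clock tick. */
void timer_tick(struct timer_model *t)
{
    if (t->cnt == t->limit) {
        t->cnt = 0;                  /* auto-wrap at the limit */
        if (t->irq_enabled)
            t->irq_count++;          /* periodic event if left armed */
    } else {
        t->cnt++;
    }
}

/* Advance the model by n ticks. */
void timer_run(struct timer_model *t, unsigned n)
{
    while (n--)
        timer_tick(t);
}
```

A clockevent-style use sets a small limit with the interrupt enabled; a clocksource-style use sets the limit to the maximum 32bit value with the interrupt disabled, so the count simply free-runs and wraps.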