14 Nov, 2015
1 commit
-
Signed-off-by: Vineet Gupta
03 Nov, 2015
1 commit
-
Otherwise perf profiles don't charge time to memcpy
Signed-off-by: Vineet Gupta
29 Oct, 2015
1 commit
-
This is the first working implementation of 40-bit physical address
extension on ARCv2.
Signed-off-by: Alexey Brodkin
Signed-off-by: Vineet Gupta
28 Oct, 2015
19 commits
-
Signed-off-by: Vineet Gupta
-
That way a single flip of phys_addr_t to 64-bit ensures all places
dealing with physical addresses get correct data.
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
Implement kmap* API for ARC.

This enables
- permanent kernel maps (pkmaps): kmap() API
- fixmap: kmap_atomic()

We use a very simple/uniform approach for both (unlike some of the other
arches). So fixmap doesn't use the customary compile-time address stuff.
The important semantic is sleep'ability (pkmap) vs. not (fixmap), which
the API guarantees.

Note that this patch only enables highmem for subsequent PAE40 support,
as there is no real highmem for ARC in the pure 32-bit paradigm, as
explained below.

ARC has a 2:2 split of the 32-bit address space, with the lower half
being translated (virtual) while the upper half is untranslated
(0x8000_0000 to 0xFFFF_FFFF). The kernel itself is linked at the base of
the untranslated space (i.e. 0x8000_0000 onwards), which is mapped to,
say, DDR 0x0 by external Bus Glue logic (outside the core). So the
kernel can potentially access 1.75G worth of memory directly, w/o need
for highmem (the top 256M is taken by uncached peripheral space, from
0xF000_0000 to 0xFFFF_FFFF).

In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while
the logical/virtual addresses remain 32 bits. Thus highmem is required
for the kernel proper to be able to access these pages for its own
purposes (user space is agnostic to this anyway).

Signed-off-by: Alexey Brodkin
Signed-off-by: Vineet Gupta
-
Explicitly note that all memory added so far is low memory.
Nothing semantic.
Signed-off-by: Vineet Gupta
-
Before we plug in highmem support, some of the code needs to be ready for it:
- copy_user_highpage() needs to be using the kmap_atomic API
- mk_pte() can't assume page_address()
- do_page_fault() can't assume VMALLOC_END is the end of kernel vaddr space
Signed-off-by: Alexey Brodkin
Signed-off-by: Vineet Gupta
-
Signed-off-by: Alexey Brodkin
Signed-off-by: Vineet Gupta
-
- Move the verbosity knob from .data to .bss by using inverted logic
- No need to read out the PD1 descriptor
- Clip the non-pfn bits of PD0 to avoid clipping inside the loop
Signed-off-by: Vineet Gupta
-
Cc: #3.9+
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
With prev fixes, all cores now start via the common entry point @stext,
which already calls EARLY_CPU_SETUP for all cores - so no need to invoke
it again.
Signed-off-by: Vineet Gupta
-
MCIP now registers its own per-cpu setup routine (for the IPI IRQ
request) using smp_ops.init_irq_cpu(), so there is no need for platforms
to do that.
This now completely decouples platforms from MCIP.
Signed-off-by: Vineet Gupta
-
Note this is not part of the platform-owned static machine_desc, but
more of the device-owned plat_smp_ops (rather misnamed), which an IPI
provider or some such typically defines.
This will help us separate out the IPI registration from the
platform-specific init_cpu_smp() into the device-specific
init_irq_cpu().
Signed-off-by: Vineet Gupta
-
This better conveys that it is called for each cpu
Signed-off-by: Vineet Gupta
-
MCIP now registers its own probe callback with smp_ops.init_early_smp(),
which is called by ARC common code, so there is no need for platforms to
do that.
This decouples the platforms from MCIP and helps confine MCIP details
to its own file.
Signed-off-by: Vineet Gupta
-
This adds a platform-agnostic early SMP init hook, which is called on
the master core before calling setup_processor():

    setup_arch()
        smp_init_cpus()
            smp_ops.init_early_smp()
        ...
        setup_processor()

How this helps:
- Used for one-time init of certain SMP-centric IP blocks, before
  calling setup_processor() which probes various bits of the core,
  possibly including this block
- Currently platforms need to call this IP block init from their init
  routines, which doesn't make sense as this is specific to the ARC core
  and not the platform, and otherwise requires copy/paste in all of them
  (and hence is a possible point of failure), e.g. MCIP init is called
  from 2 platforms currently (axs10x and sim), which will go away once
  we have this.

This change only adds the hooks, but they are empty for now. The next
commit will populate them and remove the explicit init calls from
platforms.
Signed-off-by: Vineet Gupta
-
These are not in use for ARC platforms. Moreover, DT mechanisms exist to
probe them w/o explicit platform calls:
- clocksource drivers can use CLOCKSOURCE_OF_DECLARE()
- intc IRQCHIP_DECLARE() calls + cascading inside DT allow an external
  intc to be probed automatically
Signed-off-by: Vineet Gupta
-
The reason this was not done so far was lack of a genuine IPI_IRQ for
ARC700, as we don't have an SMP version of the core yet (which might
change soon thx to EZChip). Nevertheless, to increase the build
coverage, we need to allow CONFIG_SMP for ARC700 and still be able to
run it on a UP platform (nsim or AXS101) with a UP Device Tree
(SMP-on-UP).
The build itself requires some define for IPI_IRQ, and even a dummy
value is fine since that code won't run anyway.
Signed-off-by: Vineet Gupta
-
For Run-on-reset, non-masters need to spin wait. For Halt-on-reset they
can jump to the entry point directly.
Also while at it, made the reset vector handler "the" entry point for
the kernel, including host-debugger-based boot (which uses the ELF
header entry point).
Signed-off-by: Vineet Gupta
17 Oct, 2015
18 commits
-
For the non halt-on-reset case, all cores start off simultaneously in
@stext. Master core0 proceeds with kernel boot, while the others
spin-wait on @wake_flag being set by the master once it is ready. So NO
hardware assist is needed for the master to "kick" the others.
This patch moves this soft implementation out of mcip.c (as there is no
hardware assist) into common smp.c.
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
This frees up some bits to hold more high-level info, such as PAE being
present, w/o increasing the size of the already bloated cpuinfo struct.
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
It was generating warnings when called as write_aux_reg(x, paddr >> 32)
Signed-off-by: Vineet Gupta
-
This is done by improving the laddering logic!

Before:
    if Exception
        goto excep_or_pure_k_ret
    if !Interrupt(L2)
        goto l1_chk
    else
        INTERRUPT_EPILOGUE 2
l1_chk:
    if !Interrupt(L1) (i.e. pure kernel mode)
        goto excep_or_pure_k_ret
    else
        INTERRUPT_EPILOGUE 1
excep_or_pure_k_ret:
    EXCEPTION_EPILOGUE

Now:
    if !Interrupt(L1 or L2) (i.e. exception or pure kernel mode)
        goto excep_or_pure_k_ret
    ; guaranteed to be an interrupt
    if !Interrupt(L2)
        goto l1_ret
    else
        INTERRUPT_EPILOGUE 2
    ; by virtue of above, no need to chk for L1 active
l1_ret:
    INTERRUPT_EPILOGUE 1
excep_or_pure_k_ret:
    EXCEPTION_EPILOGUE

Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
The requirement is to
- Re-enable Exceptions (AE cleared)
- Re-enable Interrupts (E1/E2 set)

We need to wiggle these bits into ERSTATUS and call RTIE.
The prev version used the pre-exception STATUS32 as the starting point
for what goes into ERSTATUS. This required explicit fixups of the U/DE/L
bits.
Instead, use the current (in-exception) STATUS32 as the starting point.
Being in the exception handler, U/DE/L can be safely assumed to be
correct. Only AE/E1/E2 need to be fixed.
So the new implementation is slightly better:
- Avoids a read from memory
- Is 4 bytes smaller for the typical 1 level of intr configuration
- Depicts the semantics more clearly
Signed-off-by: Vineet Gupta
-
Historically this was done by the ARC IDE driver, which is long gone.
The IRQ core is pretty robust now and already checks if IRQs are enabled
in hard ISRs. Thus there is no point in checking this in arch code on
every call of irq enable.
Further, if some driver does do that - let it bring down the system so
we notice/fix this sooner rather than covering up for the sucker.
This makes local_irq_enable() - for the L1-only case at least - simple
enough that we can inline it.
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
Implement the TLB flush routine to evict a specific Super TLB entry,
vs. moving to a new ASID on every such flush.
Signed-off-by: Vineet Gupta
-
ARCHes with special requirements for evicting THP backing TLB entries
can implement this.
Otherwise also, it can help optimize TLB flush in the THP regime:
stock flush_tlb_range() typically has an optimization to nuke the entire
TLB if the flush span is greater than a certain threshold, which will
likely be true for a single huge page. Thus a single thp flush will
invalidate the entire TLB, which is not desirable.
e.g. see "arch/arc: flush_pmd_tlb_range"
Acked-by: Kirill A. Shutemov
Link: http://lkml.kernel.org/r/20151009100816.GC7873@node
Signed-off-by: Vineet Gupta
-
- pgtable-generic.c: Fold the individual #ifdefs for each helper into a
  top-level #ifdef. Makes the code more readable
- Converted the stub helpers for !THP to BUILD_BUG() vs. runtime BUG()
Acked-by: Kirill A. Shutemov
Link: http://lkml.kernel.org/r/20151009133450.GA8597@node
Signed-off-by: Vineet Gupta
-
This reduces/simplifies the diff for the next patch, which moves
THP-specific code.
No semantic changes!
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/1442918096-17454-9-git-send-email-vgupta@synopsys.com
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
MMUv4 in HS38x cores supports Super Pages, which are the basis for Linux
THP support.
Normal and Super pages can co-exist (of course not overlap) in the TLB,
with a new bit "SZ" in the TLB page descriptor to distinguish between
them. Super Page size is configurable in hardware (4K to 16M), but fixed
once the RTL is built.

The exact THP size a Linux configuration will support is a function of:
- MMU page size (typically 8K, RTL fixed)
- software page walker address split between PGD:PTE:PFN (typically
  11:8:13, but can be changed with 1 line)

So for the above defaults, the THP size supported is 8K * 256 = 2M.
The default page walker is 2 levels, PGD:PTE:PFN, which in the THP
regime reduces to 1 level (as the PTE is folded into the PGD and
canonically referred to as the PMD). Thus the thp PMD accessors are
implemented in terms of PTE (just like sparc).
Signed-off-by: Vineet Gupta