Doug / smarc-fsl-linux-kernel | Embedian Git Server

27 Sep, 2013

2 commits

6c00350b5 ARC: Workaround spinlock livelock in SMP SystemC simulation ... Browse Code »

Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.

The spinlock operation itself looks as follows:

mov reg, 1 ; 1=locked, 0=unlocked
retry:
EX reg, [lock] ; load existing, store 1, atomically
BREQ reg, 1, rety ; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.

Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:

core1 core2
-------- --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read) - LOCKED
3. spin unlock [ST 0] - UNLOCKED
spin lock [EX r=0,w=1] - LOCKED
-- resched core 1----

5. spin lock [EX r=1] - ALREADY-LOCKED

-- resched core 2----
6. rwlock(Write) - READER-LOCKED
7. spin unlock [ST 0]
8. rwlock failed, retry again

9. spin lock [EX r=0, w=1]
-- resched core 1----

10 spinlock locked in #9, retry #5
11. spin lock [EX gets 1]
-- resched core 2----
...
...

The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-09-27 18:58:48 +0800
0752adfda ARC: Fix 32-bit wrap around in access_ok() ... Browse Code »

Anton reported

| LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
| similarly in one testcase test_iov_invalid -> lvec->iov_base.
| Testcase expects errno EFAULT and return code -1,
| but it gets return code 1 and ERRNO is 0 what means success.

Essentially test case was passing a pointer of -1 which access_ok()
was not catching. It was doing [@addr + @sz
Signed-off-by: Vineet Gupta

Vineet Gupta
2013-09-27 18:58:47 +0800

12 Sep, 2013

1 commit

c3567f8a3 ARC: SMP failed to boot due to missing IVT setup ... Browse Code »

Commit 05b016ecf5e7a "ARC: Setup Vector Table Base in early boot" moved
the Interrupt vector Table setup out of arc_init_IRQ() which is called
for all CPUs, to entry point of boot cpu only, breaking booting of others.

Fix by adding the same to entry point of non-boot CPUs too.

read_arc_build_cfg_regs() printing IVT Base Register didn't help the
casue since it prints a synthetic value if zero which is totally bogus,
so fix that to print the exact Register.

[vgupta: Remove the now stale comment from header of arc_init_IRQ and
also added the commentary for halt-on-reset]

Cc: Gilad Ben-Yossef
Cc: Cc: #3.11
Signed-off-by: Noam Camus
Signed-off-by: Vineet Gupta
Signed-off-by: Linus Torvalds

Noam Camus
2013-09-12 22:40:08 +0800

05 Sep, 2013

3 commits

07b9b6514 ARC: fix new Section mismatches in build (post __cpuinit cleanup) ... Browse Code »

--------------->8--------------------
WARNING: vmlinux.o(.text+0x708): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_cache_bcr()

WARNING: vmlinux.o(.text+0x702): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_mmu_bcr()
--------------->8--------------------

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-09-05 21:49:06 +0800
7efd0da2d ARC: Fix __udelay calculation ... Browse Code »

Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.

Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.

Signed-off-by: Mischa Jonker
Signed-off-by: Vineet Gupta

Mischa Jonker
2013-09-05 13:01:12 +0800
6532b02fe ARC: Add read*_relaxed to asm/io.h ... Browse Code »

Some drivers require these, and ARC didn't had them yet.

Signed-off-by: Mischa Jonker
Signed-off-by: Vineet Gupta

Mischa Jonker
2013-09-05 13:01:11 +0800

31 Aug, 2013

5 commits

947bf103f ARC: [ASID] Track ASID allocation cycles/generations ... Browse Code »

This helps remove asid-to-mm reverse map

While mm->context.id contains the ASID assigned to a process, our ASID
allocator also used asid_mm_map[] reverse map. In a new allocation
cycle (mm->ASID >= @asid_cache), the Round Robin ASID allocator used this
to check if new @asid_cache belonged to some mm2 (from prev cycle).
If so, it could locate that mm using the ASID reverse map, and mark that
mm as unallocated ASID, to force it to refresh at the time of switch_mm()

However, for SMP, the reverse map has to be maintained per CPU, so
becomes 2 dimensional, hence got rid of it.

With reverse map gone, it is NOT possible to reach out to current
assignee. So we track the ASID allocation generation/cycle and
on every switch_mm(), check if the current generation of CPU ASID is
same as mm's ASID; If not it is refreshed.

(Based loosely on arch/sh implementation)

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-31 00:12:19 +0800
c60115537 ARC: [ASID] activate_mm() == switch_mm() ... Browse Code »

ASID allocation changes/2

Use the fact that switch_mm() and activate_mm() are exactly same code
now while acknowledging the semantical difference in comment

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-31 00:12:19 +0800
3daa48d1d ARC: [ASID] get_new_mmu_context() to conditionally allocate new ASID ... Browse Code »

ASID allocation changes/1

This patch does 2 things:

(1) get_new_mmu_context() NOW moves mm->ASID to a new value ONLY if it
was from a prev allocation cycle/generation OR if mm had no ASID
allocated (vs. before would unconditionally moving to a new ASID)

Callers desiring unconditional update of ASID, e.g.local_flush_tlb_mm()
(for parent's address space invalidation at fork) need to first force
the parent to an unallocated ASID.

(2) get_new_mmu_context() always sets the MMU PID reg with unchanged/new
ASID value.

The gains are:
- consolidation of all asid alloc logic into get_new_mmu_context()
- avoiding code duplication in switch_mm() for PID reg setting
- Enables future change to fold activate_mm() into switch_mm()

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-31 00:12:18 +0800
5bd87adf9 ARC: [ASID] Refactor the TLB paranoid debug code ... Browse Code »

-Asm code already has values of SW and HW ASID values, so they can be
passed to the printing routine.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-31 00:12:18 +0800
ade922f8e ARC: [ASID] Remove legacy/unused debug code ... Browse Code »

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-31 00:12:17 +0800

30 Aug, 2013

2 commits

483e9bcb0 ARC: MMUv4 preps/3 - Abstract out TLB Insert/Delete ... Browse Code »

This reorganizes the current TLB operations into psuedo-ops to better
pair with MMUv4's native Insert/Delete operations

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-30 12:52:48 +0800
d091fcb97 ARC: MMUv4 preps/2 - Reshuffle PTE bits ... Browse Code »

With previous commit freeing up PTE bits, reassign them so as to:

- Match the bit to H/w counterpart where possible
(e.g. MMUv2 GLOBAL/PRESENT, this avoids a shift in create_tlb())
- Avoid holes in _PAGE_xxx definitions

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-30 12:49:12 +0800

29 Aug, 2013

2 commits

64b703ef2 ARC: MMUv4 preps/1 - Fold PTE K/U access flags ... Browse Code »

The current ARC VM code has 13 flags in Page Table entry: some software
(accesed/dirty/non-linear-maps) and rest hardware specific. With 8k MMU
page, we need 19 bits for addressing page frame so remaining 13 bits is
just about enough to accomodate the current flags.

In MMUv4 there are 2 additional flags, SZ (normal or super page) and WT
(cache access mode write-thru) - and additionally PFN is 20 bits (vs. 19
before for 8k). Thus these can't be held in current PTE w/o making each
entry 64bit wide.

It seems there is some scope of compressing the current PTE flags (and
freeing up a few bits). Currently PTE contains fully orthogonal distinct
access permissions for kernel and user mode (Kr, Kw, Kx; Ur, Uw, Ux)
which can be folded into one set (R, W, X). The translation of 3 PTE
bits into 6 TLB bits (when programming the MMU) can be done based on
following pre-requites/assumptions:

1. For kernel-mode-only translations (vmalloc: 0x7000_0000 to
0x7FFF_FFFF), PTE additionally has PAGE_GLOBAL flag set (and user
space entries can never be global). Thus such a PTE can translate
to Kr, Kw, Kx (as appropriate) and zero for User mode counterparts.

2. For non global entries, the PTE flags can be used to create mirrored
K and U TLB bits. This is true after commit a950549c675f2c8c504
"ARC: copy_(to|from)_user() to honor usermode-access permissions"
which ensured that user-space translations _MUST_ have same access
permissions for both U/K mode accesses so that copy_{to,from}_user()
play fair with fault based CoW break and such...

There is no such thing as free lunch - the cost is slightly infalted
TLB-Miss Handlers.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-29 20:21:36 +0800
4b06ff35f ARC: Code cosmetics (Nothing semantical) ... Browse Code »

* reduce editor lines taken by pt_regs
* ARCompact ISA specific part of TLB Miss handlers clubbed together
* cleanup some comments

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-29 20:21:15 +0800

26 Aug, 2013

2 commits

fce16bc35 ARC: Entry Handler tweaks: Optimize away redundant IRQ_DISABLE_SAVE ... Browse Code »

In the exception return path, for both U/K cases, intr are already
disabled (for various existing reasons). So when we drop down to
@restore_regs, we need not redo that.

There was subtle issue - when intr were NOT being disabled for
ret-to-kernel-but-no-preemption case - now fixed by moving the
IRQ_DISABLE further up in @resume_kernel_mode.

So what do we gain:

* Shaves off a few insn in return path.

* Eliminates the need for IRQ_DISABLE_SAVE assembler macro for ARCv2
hence allows for entry code sharing.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-26 12:10:25 +0800
37f3ac498 ARC: Exception Handlers Code consolidation ... Browse Code »

After the recent cleanups, all the exception handlers now have same
boilerplate prologue code. Move that into common macro.

This reduces readability but helps greatly with sharing / duplicating
entry code with ARCv2 ISA where the handlers are pretty much the same,
just the entry prologue is different (due to hardware assist).

Also while at it, add the missing FAKE_RET_FROM_EXCPN calls in couple of
places to drop down to pure kernel mode (from exception mode) before
jumping off into "C" code.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-08-26 12:10:25 +0800

27 Jul, 2013

1 commit

4ffd9e2c4 ARC: SMP build breakage ... Browse Code »

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-07-27 06:34:22 +0800

04 Jul, 2013

1 commit

76d3f4c27 Merge tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc ... Browse Code »

Pull first batch of ARC changes from Vineet Gupta:
"There's a second bunch to follow next week - which depends on commits
on other trees (irq/net). I'd have preferred the accompanying ARC
change via respective trees, but it didn't workout somehow.

Highlights of changes:

- Continuation of ARC MM changes from 3.10 including

zero page optimization
Setting pagecache pages dirty by default
Non executable stack by default
Reducing dcache flushes for aliasing VIPT config

- Long overdue rework of pt_regs machinery - removing the unused word
gutters and adding ECR register to baseline (helps cleanup lot of
low level code)

- Support for ARC gcc 4.8

- Few other preventive fixes, cosmetics, usage of Kconfig helper..

The diffstat is larger than normal primarily because of arcregs.h
header split as well as beautification of macros in entry.h"

* tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (32 commits)
ARC: warn on improper stack unwind FDE entries
arc: delete __cpuinit usage from all arc files
ARC: [tlb-miss] Fix bug with CONFIG_ARC_DBG_TLB_MISS_COUNT
ARC: [tlb-miss] Extraneous PTE bit testing/setting
ARC: Adjustments for gcc 4.8
ARC: Setup Vector Table Base in early boot
ARC: Remove explicit passing around of ECR
ARC: pt_regs update #5: Use real ECR for pt_regs->event vs. synth values
ARC: stop using pt_regs->orig_r8
ARC: pt_regs update #4: r25 saved/restored unconditionally
ARC: K/U SP saved from one location in stack switching macro
ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption
ARC: Entry Handler tweaks: Avoid hardcoded LIMMS for ECR values
ARC: Increase readability of entry handlers
ARC: pt_regs update #3: Remove unused gutter at start of callee_regs
ARC: pt_regs update #2: Remove unused gutter at start of pt_regs
ARC: pt_regs update #1: Align pt_regs end with end of kernel stack page
ARC: pt_regs update #0: remove kernel stack canary
ARC: [mm] Remove @write argument to do_page_fault()
ARC: [mm] Make stack/heap Non-executable by default
...

Linus Torvalds
2013-07-04 02:09:27 +0800

29 Jun, 2013

1 commit

40d158e61 consolidate io_remap_pfn_range definitions ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:46:35 +0800

27 Jun, 2013

1 commit

ce7599567 arc: delete __cpuinit usage from all arc files ... Browse Code »

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.

This removes all the arch/arc uses of the __cpuinit macros from
all C files. Currently arc does not have any __CPUINIT used in
assembly files.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Vineet Gupta
Signed-off-by: Paul Gortmaker
Signed-off-by: Vineet Gupta

Paul Gortmaker
2013-06-27 17:07:58 +0800

26 Jun, 2013

2 commits

38a9ff6d2 ARC: Remove explicit passing around of ECR ... Browse Code »

With ECR now part of pt_regs

* No need to propagate from lowest asm handlers as arg
* No need to save it in tsk->thread.cause_code
* Avoid bit chopping to access the bit-fields

More code consolidation, cleanup

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-26 18:00:50 +0800
502a0c775 ARC: pt_regs update #5: Use real ECR for pt_regs->event vs. synth values ... Browse Code »

pt_regs->event was set with artificial values to identify the low level
system event (syscall trap / breakpoint trap / exceptions / interrupts)

With r8 saving out of the way, the full word can be used to save real
ECR (Exception Cause Register) which helps idenify the event naturally,
including additional info such as cause code, param.
Only for Interrupts, where ECR is not applicable, do we resort to
synthetic non ECR values.

SAVE_ALL_TRAP/EXCEPTIONS can now be merged as they both use ECR with
different runtime values.

The ptrace helpers now use the sub-fields of ECR to distinguish the
events (e.g. vector 0x25 is trap, param 0 is syscall...)

The following benefits will follow:

(1) This centralizes the location of where ECR is saved and will allow
the cleanup of task->thread.cause_code ECR placeholder which is set
in non-uniform way. Then ARC VM code can safely rely on it being
there for purpose of finer grained VM_EXEC dcache flush (based on
exec fault: I-TLB Miss)

(2) Further, ECR being passed around from low level handlers as arg can
be eliminated as it is part of standard reg-file in pt_regs

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-26 16:34:48 +0800

22 Jun, 2013

14 commits

352c1d95e ARC: stop using pt_regs->orig_r8 ... Browse Code »

Historically, pt_regs have had orig_r8, an overloaded container for
(1) backup copy of r8 (syscall number Trap Exceptions)
(2) additional system state: (syscall/Exception/Interrupt)

There is no point in keeping (1) since syscall number is never clobbered
in-place, in pt_regs, unlike r0 which duals as first syscall arg as well
as syscall return value and in case of syscall restart, the orig arg0
needs restoring (from orig_r0) after having been updated in-place with
syscall ret value.

This further paves way to convert (2) to contain ECR itself (rather than
current madeup values)

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:26 +0800
359105bdb ARC: pt_regs update #4: r25 saved/restored unconditionally ... Browse Code »

(This is a VERY IMP change for low level interrupt/exception handling)

-----------------------------------------------------------------------
WHAT
-----------------------------------------------------------------------
* User 25 now saved in pt_regs->user_r25 (vs. tsk->thread_info.user_r25)

* This allows Low level interrupt code to unconditionally save r25
(vs. the prev version which would only do it for U->K transition).
Ofcourse for nested interrupts, only the pt_regs->user_r25 of
bottom-most frame is useful.

* simplifies the interrupt prologue/epilogue

* Needed for ARCv2 ISA code and done here to keep design similar with
ARCompact event handling

-----------------------------------------------------------------------
WHY
-------------------------------------------------------------------------
With CONFIG_ARC_CURR_IN_REG, r25 is used to cache "current" task pointer
in kernel mode. So when entering kernel mode from User Mode
- user r25 is specially safe-kept (it being a callee reg is NOT part of
pt_regs which are saved by default on each interrupt/trap/exception)
- r25 loaded with current task pointer.

Further, if interrupt was taken in kernel mode, this is skipped since we
know that r25 already has valid "current" pointer.

With 2 level of interrupts in ARCompact ISA, detecting this is difficult
but still possible, since we could be in kernel mode but r25 not already saved
(in fact the stack itself might not have been switched).

A. User mode
B. L1 IRQ taken
C. L2 IRQ taken (while on 1st line of L1 ISR)

So in #C, although in kernel mode, r25 not saved (infact SP not
switched at all)

Given that ARcompact has manual stack switching, we could use a bit of
trickey - The low level code would make sure that SP is only set to kernel
mode value at the very end (after saving r25). So a non kernel mode SP,
even if in kernel mode, meant r25 was NOT saved.

The same paradigm won't work in ARCv2 ISA since SP is auto-switched so
it's setting can't be delayed/constrained.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:25 +0800
ba3558c77 ARC: K/U SP saved from one location in stack switching macro ... Browse Code »

This paves way for further simplifications.

There's an overhead of 1 insn for the non-common case of interrupt taken
from kernel mode.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:25 +0800
1898a959b ARC: Entry Handler tweaks: Avoid hardcoded LIMMS for ECR values ... Browse Code »

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:23 +0800
3ebedbb2f ARC: Increase readability of entry handlers ... Browse Code »

* use artificial PUSH/POP contructs for CORE Reg save/restore to stack
* use artificial PUSHAX/POPAX contructs for Auxiliary Space regs
* macro'ize multiple copies of callee-reg-save/restore (SAVE_R13_TO_R24)
* use BIC insn for inverse-and operation

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:23 +0800
16f9afe65 ARC: pt_regs update #3: Remove unused gutter at start of callee_regs ... Browse Code »

This is trickier than prev two:

* context switching code saves kernel mode callee regs in the format of
struct callee_regs thus needs adjustment. This also reduces the height
of topmost kernel stack frame by 1 word.

* Since kernel stack unwinder is sensitive to height of topmost kernel
stack frame, that needs a word of adjustment too.

ptrace needs a bit of updating since pt_regs now diverges from
user_regs_struct.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:22 +0800
2fa919045 ARC: pt_regs update #2: Remove unused gutter at start of pt_regs ... Browse Code »

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:22 +0800
283237a04 ARC: pt_regs update #1: Align pt_regs end with end of kernel stack page ... Browse Code »

Historically, pt_regs would end at offset of 1 word from end of stack
page.

----------------- -> START of page (task->stack)
| |
| thread_info |
-----------------
| |
^ ~ ~
| ~ ~
| | |
| | | End of page (START of kernel stack)

This required special "one-off" considerations in low level code.

The root cause is very likely assumption of "empty" SP by the original
ARC kernel hackers, despite ARC700 always been "full" SP.

So finally RIP one word gutter !

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:21 +0800
bed30976e ARC: pt_regs update #0: remove kernel stack canary ... Browse Code »

This stack slot is going to be used in subsequent commits

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:21 +0800
3abc94480 ARC: [mm] Make stack/heap Non-executable by default ... Browse Code »

1. For VM_EXEC based delayed dcache/icache flush, reduces the number of
flushes.

2. Makes this security feature ON by default rather than OFF before.

3. Applications can use mprotect() to selectively override this.

4. ELF binaries have a GNU_STACK segment which can easily override the
kernel default permissions.
For nested-functions/trampolines, gcc already auto-enables executable
stack in elf. Others needing this can use -Wl,-z,execstack option.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:20 +0800
2ed21dae0 ARC: [mm] Assume pagecache page dirty by default ... Browse Code »

Similar to ARM/SH

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 21:53:19 +0800
304991866 ARC: cache detection code bitrot ... Browse Code »

* Number of (i|d)cache ways can be retrieved from BCRs and hence no need
to cross check with with built-in constants
* Use of IS_ENABLED() to check for a Kconfig option
* is_not_cache_aligned() not used anymore

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 16:16:43 +0800
da1677b02 ARC: Disintegrate arcregs.h ... Browse Code »

* Move the various sub-system defines/types into relevant files/functions
(reduces compilation time)

* move CPU specific stuff out of asm/tlb.h into asm/mmu.h

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 16:16:42 +0800
8235703e1 ARC: Use kconfig helper IS_ENABLED() to get rid of defines.h ... Browse Code »

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-06-22 16:16:42 +0800

25 May, 2013

1 commit

7bb66f6e6 ARC: lazy dcache flush broke gdb in non-aliasing configs ... Browse Code »

gdbserver inserting a breakpoint ends up calling copy_user_page() for a
code page. The generic version of which (non-aliasing config) didn't set
the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
corresponding dynamic loader code page - causing garbade to be executed.

So now aliasing versions of copy_user_highpage()/clear_page() are made
default. There is no significant overhead since all of special alias
handling code is compiled out for non-aliasing build

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-05-25 16:45:55 +0800

23 May, 2013

2 commits

006dfb3c9 ARC: Use enough bits for determining page's cache color ... Browse Code »

The current code uses 2 bits for determining page's dcache color, thus
sorting pages into 4 bins, whereas the aliasing dcache really has 2 bins
(8k page, 64k dcache - 4 way-set-assoc).
This can cause extraneous flushes - e.g. color 0 and 2.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-05-23 16:55:09 +0800
3e87974de ARC: Brown paper bag bug in macro for checking cache color ... Browse Code »

The VM_EXEC check in update_mmu_cache() was getting optimized away
because of a stupid error in definition of macro addr_not_cache_congruent()

The intention was to have the equivalent of following:

if (a || (1 ? b : 0))

but we ended up with following:

if (a || 1 ? b : 0)

And because precedence of '||' is more that that of '?', gcc was optimizing
away evaluation of

Nasty Repercussions:
1. For non-aliasing configs it would mean some extraneous dcache flushes
for non-code pages if U/K mappings were not congruent.
2. For aliasing config, some needed dcache flush for code pages might
be missed if U/K mappings were congruent.

Signed-off-by: Vineet Gupta

Vineet Gupta
2013-05-23 16:54:52 +0800