11 Aug, 2015
4 commits
-
commit 80f420842ff42ad61f84584716d74ef635f13892 upstream.
ARCompact/ARCv2 ISA provide that any instructions which deals with
bitpos/count operand ASL, LSL, BSET, BCLR, BMSK .... will only consider
lower 5 bits. i.e. auto-clamp the pos to 0-31.ARC Linux bitops exploited this fact by NOT explicitly masking out upper
bits for @nr operand in general, saving a bunch of AND/BMSK instructions
in generated code around bitops.While this micro-optimization has worked well over years it is NOT safe
as shifting a number with a value, greater than native size is
"undefined" per "C" spec.So as it turns outm EZChip ran into this eventually, in their massive
muti-core SMP build with 64 cpus. There was a test_bit() inside a loop
from 63 to 0 and gcc was weirdly optimizing away the first iteration
(so it was really adhering to standard by implementing undefined behaviour
vs. removing all the iterations which were phony i.e. (1 << [63..32])| for i = 63 to 0
| X = ( 1 << i )
| if X == 0
| continueSo fix the code to do the explicit masking at the expense of generating
additional instructions. Fortunately, this can be mitigated to a large
extent as gcc has SHIFT_COUNT_TRUNCATED which allows combiner to fold
masking into shift operation itself. It is currently not enabled in ARC
gcc backend, but could be done after a bit of testing.Fixes STAR 9000866918 ("unsafe "undefined behavior" code in kernel")
Reported-by: Noam Camus
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman -
commit 04e2eee4b02edcafce96c9c37b31b1a3318291a4 upstream.
No semantical changes !
Acked-by: Peter Zijlstra (Intel)
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman -
commit f51e2f1911122879eefefa4c592dea8bf794b39c upstream.
Currently instruction_pointer() returns pt_regs->ret and so return value
is of type "long", which implicitly stands for "signed long".While that's perfectly fine when dealing with 32-bit values if return
value of instruction_pointer() gets assigned to 64-bit variable sign
extension may happen.And at least in one real use-case it happens already.
In perf_prepare_sample() return value of perf_instruction_pointer()
(which is an alias to instruction_pointer() in case of ARC) is assigned
to (struct perf_sample_data)->ip (which type is "u64").And what we see if instuction pointer points to user-space application
that in case of ARC lays below 0x8000_0000 "ip" gets set properly with
leading 32 zeros. But if instruction pointer points to kernel address
space that starts from 0x8000_0000 then "ip" is set with 32 leadig
"f"-s. I.e. id instruction_pointer() returns 0x8100_0000, "ip" will be
assigned with 0xffff_ffff__8100_0000. Which is obviously wrong.In particular that issuse broke output of perf, because perf was unable
to associate addresses like 0xffff_ffff__8100_0000 with anything from
/proc/kallsyms.That's what we used to see:
----------->8----------
6.27% ls [unknown] [k] 0xffffffff8046c5cc
2.96% ls libuClibc-0.9.34-git.so [.] memcpy
2.25% ls libuClibc-0.9.34-git.so [.] memset
1.66% ls [unknown] [k] 0xffffffff80666536
1.54% ls libuClibc-0.9.34-git.so [.] 0x000224d6
1.18% ls libuClibc-0.9.34-git.so [.] 0x00022472
----------->8----------With that change perf output looks much better now:
----------->8----------
8.21% ls [kernel.kallsyms] [k] memset
3.52% ls libuClibc-0.9.34-git.so [.] memcpy
2.11% ls libuClibc-0.9.34-git.so [.] malloc
1.88% ls libuClibc-0.9.34-git.so [.] memset
1.64% ls [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.41% ls [kernel.kallsyms] [k] __d_lookup_rcu
----------->8----------Signed-off-by: Alexey Brodkin
Cc: arc-linux-dev@synopsys.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman -
commit 97709069214eb75312c14946803b9da4d3814203 upstream.
ARC kernels have historically been built with -O3, despite top level
Makefile defaulting to -O2. This was facilitated by implicitly ordering
of arch makefile include AFTER top level assigned -O2.An upstream fix to top level a1c48bb160f ("Makefile: Fix unrecognized
cross-compiler command line options") changed the ordering, making ARC
-O3 defunct.Fix that by NOT relying on any ordering whatsoever and use the proper
arch override facility now present in kbuild (ARCH_*FLAGS)Depends-on: ("kbuild: Allow arch Makefiles to override {cpp,ld,c}flags")
Suggested-by: Michal Marek
Cc: Geert Uytterhoeven
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman
22 Jul, 2015
3 commits
-
commit 7002f77541f877a5590615ceb3da32b114f14b62 upstream.
static arc_pmu in the arch/arc/kernel/perf_event.c is not initialized as
it's shadowed by a local variable of the same name in the
arc_pmu_device_probe.Signed-off-by: Max Filippov
Fixes: 03c94fcf954d "ARC: perf: make @arc_pmu static global"
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman -
commit d57f727264f1425a94689bafc7e99e502cb135b5 upstream.
When auditing cmpxchg call sites, Chuck noted that gcc was optimizing
away some of the desired LDs.| do {
| new = old = *ipi_data_ptr;
| new |= 1U << msg;
| } while (cmpxchg(ipi_data_ptr, old, new) != old);was generating to below
| 8015cef8: ld r2,[r4,0]
Acked-by: Peter Zijlstra (Intel)
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman -
commit 2576c28e3f623ed401db7e6197241865328620ef upstream.
- arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers
Since ARCv2 only provides load/load, store/store and all/all, we need
the full barrier- LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified
values were lacking the explicit smp barriers.- Non LLOCK/SCOND varaints don't need the explicit barriers since that
is implicity provided by the spin locks used to implement the
critical section (the spin lock barriers in turn are also fixed in
this commit as explained aboveCc: Paul E. McKenney
Acked-by: Peter Zijlstra (Intel)
Signed-off-by: Vineet Gupta
Signed-off-by: Greg Kroah-Hartman
11 May, 2015
2 commits
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
10 May, 2015
1 commit
-
Fixes: f7d11e93ee97a locking,arch,arc: Fold atomic_ops
Cc: # 3.18
Signed-off-by: Vineet Gupta
24 Apr, 2015
1 commit
-
Pull ARC updates from Vineet Gupta:
- perf fixes/improvements
- misc cleanups
* tag 'arc-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: perf: don't add code for impossible case
ARC: perf: Rename DT binding to not confuse with power mgmt
ARC: perf: add user space attribution in callchains
ARC: perf: Add kernel callchain support
ARC: perf: support cache hit/miss ratio
ARC: perf: Add some comments/debug stuff
ARC: perf: make @arc_pmu static global
ARC: mem init spring cleaning - No functional changes
ARC: Fix RTT boot printing
ARC: fold __builtin_constant_p() into test_bit()
ARC: rename unhandled exception handler
ARC: cosmetic: Remove unused ECR bitfield masks
ARC: Fix WRITE_BCR
ARC: [nsimosci] Update defconfig
arc: copy_thread(): rename 'arg' argument to 'kthread_arg'
20 Apr, 2015
7 commits
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
The actual user space unwinding is more involved, so simply capture the
user space PCSigned-off-by: Vineet Gupta
-
Signed-off-by: Mischa Jonker
Signed-off-by: Vineet Gupta -
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
17 Apr, 2015
1 commit
-
print_task_path_n_nm() is local to this file, its only user being
show_regs(). Mark the function static and avoid the EXPORT_SYMBOL.Signed-off-by: Davidlohr Bueso
Acked-by: Vineet Gupta
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Apr, 2015
1 commit
-
Pull exec domain removal from Richard Weinberger:
"This series removes execution domain support from Linux.The idea behind exec domains was to support different ABIs. The
feature was never complete nor stable. Let's rip it out and make the
kernel signal handling code less complicated"* 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits)
arm64: Removed unused variable
sparc: Fix execution domain removal
Remove rest of exec domains.
arch: Remove exec_domain from remaining archs
arc: Remove signal translation and exec_domain
xtensa: Remove signal translation and exec_domain
xtensa: Autogenerate offsets in struct thread_info
x86: Remove signal translation and exec_domain
unicore32: Remove signal translation and exec_domain
um: Remove signal translation and exec_domain
tile: Remove signal translation and exec_domain
sparc: Remove signal translation and exec_domain
sh: Remove signal translation and exec_domain
s390: Remove signal translation and exec_domain
mn10300: Remove signal translation and exec_domain
microblaze: Remove signal translation and exec_domain
m68k: Remove signal translation and exec_domain
m32r: Remove signal translation and exec_domain
m32r: Autogenerate offsets in struct thread_info
frv: Remove signal translation and exec_domain
...
15 Apr, 2015
2 commits
-
Pull vfs update from Al Viro:
"Part one:- struct filename-related cleanups
- saner iov_iter_init() replacements (and switching the syscalls to
use of those)- ntfs switch to ->write_iter() (Anton)
- aio cleanups and splitting iocb into common and async parts
(Christoph)- assorted fixes (me, bfields, Andrew Elble)
There's a lot more, including the completion of switchover to
->{read,write}_iter(), d_inode/d_backing_inode annotations, f_flags
race fixes, etc, but that goes after #for-davem merge. David has
pulled it, and once it's in I'll send the next vfs pull request"* 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (35 commits)
sg_start_req(): use import_iovec()
sg_start_req(): make sure that there's not too many elements in iovec
blk_rq_map_user(): use import_single_range()
sg_io(): use import_iovec()
process_vm_access: switch to {compat_,}import_iovec()
switch keyctl_instantiate_key_common() to iov_iter
switch {compat_,}do_readv_writev() to {compat_,}import_iovec()
aio_setup_vectored_rw(): switch to {compat_,}import_iovec()
vmsplice_to_user(): switch to import_iovec()
kill aio_setup_single_vector()
aio: simplify arguments of aio_setup_..._rw()
aio: lift iov_iter_init() into aio_setup_..._rw()
lift iov_iter into {compat_,}do_readv_writev()
NFS: fix BUG() crash in notify_change() with patch to chown_common()
dcache: return -ESTALE not -EBUSY on distributed fs race
NTFS: Version 2.1.32 - Update file write from aio_write to write_iter.
VFS: Add iov_iter_fault_in_multipages_readable()
drop bogus check in file_open_root()
switch security_inode_getattr() to struct path *
constify tomoyo_realpath_from_path()
... -
Pull trivial tree from Jiri Kosina:
"Usual trivial tree updates. Nothing outstanding -- mostly printk()
and comment fixes and unused identifier removals"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
goldfish: goldfish_tty_probe() is not using 'i' any more
powerpc: Fix comment in smu.h
qla2xxx: Fix printks in ql_log message
lib: correct link to the original source for div64_u64
si2168, tda10071, m88ds3103: Fix firmware wording
usb: storage: Fix printk in isd200_log_config()
qla2xxx: Fix printk in qla25xx_setup_mode
init/main: fix reset_device comment
ipwireless: missing assignment
goldfish: remove unreachable line of code
coredump: Fix do_coredump() comment
stacktrace.h: remove duplicate declaration task_struct
smpboot.h: Remove unused function prototype
treewide: Fix typo in printk messages
treewide: Fix typo in printk messages
mod_devicetable: fix comment for match_flags
13 Apr, 2015
8 commits
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
This makes test_bit() more like its siblings *_bit() routines.
Also add some comments about the constant @nr micro-optimizationSigned-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
Signed-off-by: Vineet Gupta
-
* There was obvious bit rot due to lack of use
* Old naming was confusing since BCR are read onlySigned-off-by: Vineet Gupta
-
Signed-off-by: Mischa Jonker
-
As execution domain support is gone we can remove
signal translation from the signal code and remove
exec_domain from thread_info.Signed-off-by: Richard Weinberger
12 Apr, 2015
1 commit
-
flush_old_exec() has already done that. Back on 2011 a bunch of
instances like that had been kicked out, but that hadn't taken
care of then-out-of-tree architectures, obviously, and they served
as reinfection vector...Signed-off-by: Al Viro
31 Mar, 2015
1 commit
-
The 'arg' argument to copy_thread() is only ever used when forking a new
kernel thread. Hence, rename it to 'kthread_arg' for clarity.Signed-off-by: Alex Dowad
Signed-off-by: Vineet Gupta
26 Mar, 2015
2 commits
-
A malicious signal handler / restorer can DOS the system by fudging the
user regs saved on stack, causing weird things such as sigreturn returning
to user mode PC but cpu state still being kernel mode....Ensure that in sigreturn path status32 always has U bit; any other bogosity
(gargbage PC etc) will be taken care of by normal user mode exceptions mechanisms.Reproducer signal handler:
void handle_sig(int signo, siginfo_t *info, void *context)
{
ucontext_t *uc = context;
struct user_regs_struct *regs = &(uc->uc_mcontext.regs);regs->scratch.status32 = 0;
}Before the fix, kernel would go off to weeds like below:
--------->8-----------
[ARCLinux]$ ./signal-test
Path: /signal-test
CPU: 0 PID: 61 Comm: signal-test Not tainted 4.0.0-rc5+ #65
task: 8f177880 ti: 5ffe6000 task.ti: 8f15c000[ECR ]: 0x00220200 => Invalid Write @ 0x00000010 by insn @ 0x00010698
[EFA ]: 0x00000010
[BLINK ]: 0x2007c1ee
[ERET ]: 0x10698
[STAT32]: 0x00000000 : 8-----------Reported-by: Alexey Brodkin
Cc:
Signed-off-by: Vineet Gupta -
The regfile provided to SA_SIGINFO signal handler as ucontext was off by
one due to pt_regs gutter cleanups in 2013.Before handling signal, user pt_regs are copied onto user_regs_struct and copied
back later. Both structs are binary compatible. This was all fine until
commit 2fa919045b72 (ARC: pt_regs update #2) which removed the empty stack slot
at top of pt_regs (corresponding to first pad) and made the corresponding
fixup in struct user_regs_struct (the pad in there was moved out of
@scratch - not removed altogether as it is part of ptrace ABI)struct user_regs_struct {
+ long pad;
struct {
- long pad;
long bta, lp_start, lp_end,....
} scratch;
...
}This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
which is what this commit does.This problem was hidden for 2 years, because both save/restore, despite
using wrong location, were using the same location. Only an interim
inspection (reproducer below) exposed the issue.void handle_segv(int signo, siginfo_t *info, void *context)
{
ucontext_t *uc = context;
struct user_regs_struct *regs = &(uc->uc_mcontext.regs);printf("regs %x %x\n", scratch.r8, regs->scratch.r9);
}int main()
{
struct sigaction sa;sa.sa_sigaction = handle_segv;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sigaction(SIGSEGV, &sa, NULL);asm volatile(
"mov r7, 7 \n"
"mov r8, 8 \n"
"mov r9, 9 \n"
"mov r10, 10 \n"
:::"r7","r8","r9","r10");*((unsigned int*)0x10) = 0;
}Fixes: 2fa919045b72ec892e "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs"
CC:
Signed-off-by: Vineet Gupta
07 Mar, 2015
1 commit
-
This patch fix spelling typo in printk messages.
Signed-off-by: Masanari Iida
Acked-by: Randy Dunlap
Signed-off-by: Jiri Kosina
27 Feb, 2015
4 commits
-
The old implementation assumed that SP at the time of __switch_to() is
right above pt_regs which is almost certainly not the case as there will
be some stack build up between entry into kernel and leading up to
__switch_toSigned-off-by: Vineet Gupta
-
/proc//maps currently don't annotate stack vma with "[stack]"
This is because KSTK_ESP ie expected to return usermode SP of tsk while
currently it returns the kernel mode SP of a sleeping tsk.While the fix is trivial, we also need to adjust the ARC kernel stack
unwinder to not use KSTK_SP and friends any more.Cc:
Reported-and-suggested-by: Alexey Brodkin
Signed-off-by: Vineet Gupta -
Signed-off-by: Vineet Gupta
-
The arc unwinder can also be used for perf callchains.
Signed-off-by: Mischa Jonker
Signed-off-by: Vineet Gupta
19 Feb, 2015
1 commit
-
Pull dmaengine updates from Vinod Koul:
"This update brings:- the big cleanup up by Maxime for device control and slave
capabilities. This makes the API much cleaner.- new IMG MDC driver by Andrew
- new Renesas R-Car Gen2 DMA Controller driver by Laurent along with
bunch of fixes on rcar drivers- odd fixes and updates spread over driver"
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (130 commits)
dmaengine: pl330: add DMA_PAUSE feature
dmaengine: pl330: improve pl330_tx_status() function
dmaengine: rcar-dmac: Disable channel 0 when using IOMMU
dmaengine: rcar-dmac: Work around descriptor mode IOMMU errata
dmaengine: rcar-dmac: Allocate hardware descriptors with DMAC device
dmaengine: rcar-dmac: Fix oops due to unintialized list in error ISR
dmaengine: rcar-dmac: Fix spinlock issues in interrupt
dmaenegine: edma: fix sparse warnings
dmaengine: rcar-dmac: Fix uninitialized variable usage
dmaengine: shdmac: extend PM methods
dmaengine: shdmac: use SET_RUNTIME_PM_OPS()
dmaengine: pl330: fix bug that cause start the same descs in cyclic
dmaengine: at_xdmac: allow muliple dwidths when doing slave transfers
dmaengine: at_xdmac: simplify channel configuration stuff
dmaengine: at_xdmac: introduce save_cc field
dmaengine: at_xdmac: wait for in-progress transaction to complete after pausing a channel
ioat: fail self-test if wait_for_completion times out
dmaengine: dw: define DW_DMA_MAX_NR_MASTERS
dmaengine: dw: amend description of dma_dev field
dmatest: move src_off, dst_off, len inside loop
...