08 Jun, 2016
1 commit
-
This patch allows to build the whole kernel with GCC plugins. It was ported from
grsecurity/PaX. The infrastructure supports building out-of-tree modules and
building in a separate directory. Cross-compilation is supported too.
Currently the x86, arm, arm64 and uml architectures enable plugins.The directory of the gcc plugins is scripts/gcc-plugins. You can use a file or a directory
there. The plugins compile with these options:
* -fno-rtti: gcc is compiled with this option so the plugins must use it too
* -fno-exceptions: this is inherited from gcc too
* -fasynchronous-unwind-tables: this is inherited from gcc too
* -ggdb: it is useful for debugging a plugin (better backtrace on internal
errors)
* -Wno-narrowing: to suppress warnings from gcc headers (ipa-utils.h)
* -Wno-unused-variable: to suppress warnings from gcc headers (gcc_version
variable, plugin-version.h)The infrastructure introduces a new Makefile target called gcc-plugins. It
supports all gcc versions from 4.5 to 6.0. The scripts/gcc-plugin.sh script
chooses the proper host compiler (gcc-4.7 can be built by either gcc or g++).
This script also checks the availability of the included headers in
scripts/gcc-plugins/gcc-common.h.The gcc-common.h header contains frequently included headers for GCC plugins
and it has a compatibility layer for the supported gcc versions.The gcc-generate-*-pass.h headers automatically generate the registration
structures for GIMPLE, SIMPLE_IPA, IPA and RTL passes.Note that 'make clean' keeps the *.so files (only the distclean or mrproper
targets clean all) because they are needed for out-of-tree modules.Based on work created by the PaX Team.
Signed-off-by: Emese Revfy
Acked-by: Kees Cook
Signed-off-by: Michal Marek
28 May, 2016
1 commit
-
Pull UML updates from Richard Weinberger:
"This contains a nice FPU fixup from Eli Cooper for UML"* 'for-linus-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: add extended processor state save/restore support
um: extend fpstate to _xstate to support YMM registers
um: fix FPU state preservation around signal handlers
24 May, 2016
1 commit
-
This option was replaced by PAGE_COUNTER which is selected by MEMCG.
Signed-off-by: Konstantin Khlebnikov
Acked-by: Arnd Bergmann
Acked-by: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
22 May, 2016
2 commits
-
This patch extends save_fp_registers() and restore_fp_registers() to use
PTRACE_GETREGSET and PTRACE_SETREGSET with the XSTATE note type, adding
support for new processor state extensions between context switches.When the new ptrace requests are unavailable, it falls back to the old
PTRACE_GETFPREGS and PTRACE_SETFPREGS methods, which have been renamed to
save_i387_registers() and restore_i387_registers().Now these functions expect *fp_regs to have the space of an _xstate struct.
Thus, this also makes ptrace in UML responde to PTRACE_GETFPREGS/_SETFPREG
requests with a user_i387_struct (thus independent from HOST_FP_SIZE), and
by calling save_i387_registers() and restore_i387_registers() instead of
the extended save_fp_registers() and restore_fp_registers() functions.Signed-off-by: Eli Cooper
-
Extends fpstate to _xstate, in order to hold AVX/YMM registers.
To avoid oversized stack frame, the following functions have been
refactored by using malloc.
- sig_handler_common
- timer_real_alarm_handlerSigned-off-by: Eli Cooper
21 May, 2016
1 commit
-
Define HAVE_EXIT_THREAD for archs which want to do something in
exit_thread. For others, let's define exit_thread as an empty inline.This is a cleanup before we change the prototype of exit_thread to
accept a task parameter.[akpm@linux-foundation.org: fix mips]
Signed-off-by: Jiri Slaby
Cc: "David S. Miller"
Cc: "H. Peter Anvin"
Cc: "James E.J. Bottomley"
Cc: Aurelien Jacquiot
Cc: Benjamin Herrenschmidt
Cc: Catalin Marinas
Cc: Chen Liqin
Cc: Chris Metcalf
Cc: Chris Zankel
Cc: David Howells
Cc: Fenghua Yu
Cc: Geert Uytterhoeven
Cc: Guan Xuetao
Cc: Haavard Skinnemoen
Cc: Hans-Christian Egtvedt
Cc: Heiko Carstens
Cc: Helge Deller
Cc: Ingo Molnar
Cc: Ivan Kokshaysky
Cc: James Hogan
Cc: Jeff Dike
Cc: Jesper Nilsson
Cc: Jiri Slaby
Cc: Jonas Bonn
Cc: Koichi Yasutake
Cc: Lennox Wu
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Martin Schwidefsky
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Mikael Starvik
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ralf Baechle
Cc: Rich Felker
Cc: Richard Henderson
Cc: Richard Kuo
Cc: Richard Weinberger
Cc: Russell King
Cc: Steven Miao
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Will Deacon
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 May, 2016
1 commit
-
Pull networking updates from David Miller:
"Highlights:1) Support SPI based w5100 devices, from Akinobu Mita.
2) Partial Segmentation Offload, from Alexander Duyck.
3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.
4) Allow cls_flower stats offload, from Amir Vadai.
5) Implement bpf blinding, from Daniel Borkmann.
6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
actually using FASYNC these atomics are superfluous. From Eric
Dumazet.7) Run TCP more preemptibly, also from Eric Dumazet.
8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
driver, from Gal Pressman.9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.
10) Improve BPF usage documentation, from Jesper Dangaard Brouer.
11) Support tunneling offloads in qed, from Manish Chopra.
12) aRFS offloading in mlx5e, from Maor Gottlieb.
13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
Leitner.14) Add MSG_EOR support to TCP, this allows controlling packet
coalescing on application record boundaries for more accurate
socket timestamp sampling. From Martin KaFai Lau.15) Fix alignment of 64-bit netlink attributes across the board, from
Nicolas Dichtel.16) Per-vlan stats in bridging, from Nikolay Aleksandrov.
17) Several conversions of drivers to ethtool ksettings, from Philippe
Reynes.18) Checksum neutral ILA in ipv6, from Tom Herbert.
19) Factorize all of the various marvell dsa drivers into one, from
Vivien Didelot20) Add VF support to qed driver, from Yuval Mintz"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
Revert "phy dp83867: Make rgmii parameters optional"
r8169: default to 64-bit DMA on recent PCIe chips
phy dp83867: Make rgmii parameters optional
phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
bpf: arm64: remove callee-save registers use for tmp registers
asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
switchdev: pass pointer to fib_info instead of copy
net_sched: close another race condition in tcf_mirred_release()
tipc: fix nametable publication field in nl compat
drivers: net: Don't print unpopulated net_device name
qed: add support for dcbx.
ravb: Add missing free_irq() calls to ravb_close()
qed: Remove a stray tab
net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fec-mpc52xx: use phydev from struct net_device
bpf, doc: fix typo on bpf_asm descriptions
stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fs-enet: use phydev from struct net_device
...
05 May, 2016
1 commit
-
Replace all trans_start updates with netif_trans_update helper.
change was done via spatch:struct net_device *d;
@@
- d->trans_start = jiffies
+ netif_trans_update(d)Compile tested only.
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: linux-xtensa@linux-xtensa.org
Cc: linux1394-devel@lists.sourceforge.net
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: linux-scsi@vger.kernel.org
Cc: linux-can@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-hams@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: linux-bluetooth@vger.kernel.org
Signed-off-by: Florian Westphal
Acked-by: Felipe Balbi
Acked-by: Mugunthan V N
Acked-by: Antonio Quartulli
Signed-off-by: David S. Miller
13 Apr, 2016
1 commit
-
Signed-off-by: Jens Axboe
Reviewed-by: Christoph Hellwig
23 Mar, 2016
1 commit
-
This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:- The fs.suid_dumpable sysctl is set to 2.
- The kernel.core_pattern sysctl's value starts with "/". (Systems
where kernel.core_pattern starts with "|/" are not affected.)
- Unprivileged user namespace creation is permitted. (This is
true on Linux >=3.8, but some distributions disallow it by
default using a distro patch.)Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.Signed-off-by: Jann Horn
Acked-by: Kees Cook
Cc: Al Viro
Cc: "Eric W. Biederman"
Cc: Andy Lutomirski
Cc: Oleg Nesterov
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Mar, 2016
1 commit
-
Pull x86 protection key support from Ingo Molnar:
"This tree adds support for a new memory protection hardware feature
that is available in upcoming Intel CPUs: 'protection keys' (pkeys).There's a background article at LWN.net:
https://lwn.net/Articles/643797/
The gist is that protection keys allow the encoding of
user-controllable permission masks in the pte. So instead of having a
fixed protection mask in the pte (which needs a system call to change
and works on a per page basis), the user can map a (handful of)
protection mask variants and can change the masks runtime relatively
cheaply, without having to change every single page in the affected
virtual memory range.This allows the dynamic switching of the protection bits of large
amounts of virtual memory, via user-space instructions. It also
allows more precise control of MMU permission bits: for example the
executable bit is separate from the read bit (see more about that
below).This tree adds the MM infrastructure and low level x86 glue needed for
that, plus it adds a high level API to make use of protection keys -
if a user-space application calls:mmap(..., PROT_EXEC);
or
mprotect(ptr, sz, PROT_EXEC);
(note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
this special case, and will set a special protection key on this
memory range. It also sets the appropriate bits in the Protection
Keys User Rights (PKRU) register so that the memory becomes unreadable
and unwritable.So using protection keys the kernel is able to implement 'true'
PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
PROT_READ as well. Unreadable executable mappings have security
advantages: they cannot be read via information leaks to figure out
ASLR details, nor can they be scanned for ROP gadgets - and they
cannot be used by exploits for data purposes either.We know about no user-space code that relies on pure PROT_EXEC
mappings today, but binary loaders could start making use of this new
feature to map binaries and libraries in a more secure fashion.There is other pending pkeys work that offers more high level system
call APIs to manage protection keys - but those are not part of this
pull request.Right now there's a Kconfig that controls this feature
(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
(like most x86 CPU feature enablement code that has no runtime
overhead), but it's not user-configurable at the moment. If there's
any serious problem with this then we can make it configurable and/or
flip the default"* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
mm/core, x86/mm/pkeys: Add execute-only protection keys support
x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
x86/mm/pkeys: Allow kernel to modify user pkey rights register
x86/fpu: Allow setting of XSAVE state
x86/mm: Factor out LDT init from context init
mm/core, x86/mm/pkeys: Add arch_validate_pkey()
mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
x86/mm/pkeys: Add Kconfig prompt to existing config option
x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
x86/mm/pkeys: Dump PKRU with other kernel registers
mm/core, x86/mm/pkeys: Differentiate instruction fetches
x86/mm/pkeys: Optimize fault handling in access_error()
mm/core: Do not enforce PKEY permissions on remote mm access
um, pkeys: Add UML arch_*_access_permitted() methods
mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
x86/mm/gup: Simplify get_user_pages() PTE bit handling
...
18 Mar, 2016
1 commit
-
There are few things about *pte_alloc*() helpers worth cleaning up:
- 'vma' argument is unused, let's drop it;
- most __pte_alloc() callers do speculative check for pmd_none(),
before taking ptl: let's introduce pte_alloc() macro which does
the check.The only direct user of __pte_alloc left is userfaultfd, which has
different expectation about atomicity wrt pmd.- pte_alloc_map() and pte_alloc_map_lock() are redefined using
pte_alloc().[sudeep.holla@arm.com: fix build for arm64 hugetlbpage]
[sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more]
Signed-off-by: Kirill A. Shutemov
Cc: Dave Hansen
Signed-off-by: Sudeep Holla
Acked-by: Kirill A. Shutemov
Signed-off-by: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Mar, 2016
2 commits
-
...modules are using this symbol.
Export it like all other archs to.Signed-off-by: Richard Weinberger
-
Commit db2f24dc240856fb1d78005307f1523b7b3c121b
was plain wrong. I did not realize the we are
allowed to loop here.
In fact we have to loop and must not return to userspace
before all SIGSEGVs have been delivered.
Other archs do this directly in their entry code, UML
does it here.Reported-by: Al Viro
Signed-off-by: Richard Weinberger
19 Feb, 2016
1 commit
-
UML has a special mmu_context.h and needs updates whenever the generic one
is updated.Signed-off-by: Dave Hansen
Cc: Dave Hansen
Cc: Jeff Dike
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Richard Weinberger
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: user-mode-linux-user@lists.sourceforge.net
Link: http://lkml.kernel.org/r/20160218183557.AE1DB383@viggo.jf.intel.com
Signed-off-by: Ingo Molnar
06 Feb, 2016
1 commit
-
Commit 16da306849d0 ("um: kill pfn_t") introduced a compile warning for
defconfig (SUBARCH=i386):arch/um/kernel/skas/mmu.c:38:206:
warning: right shift count >= width of type [-Wshift-count-overflow]Aforementioned patch changes the definition of the phys_to_pfn() macro
from((pfn_t) ((p) >> PAGE_SHIFT))
to
((p) >> PAGE_SHIFT)
This effectively changes the phys_to_pfn() expansion's type from
unsigned long long to unsigned long.Through the callchain init_stub_pte() => mk_pte(), the expansion of
phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
pte_set_val(pte, phys, prot) macro, eventually leading to(pte).pte_high = (phys) >> 32;
This results in the warning from above.
Since UML only deals with 32 bit addresses, the upper 32 bits from
'phys' used to be always zero anyway. Also, all page protection flags
defined by UML don't use any bits beyond bit 9. Since the contents of a
PTE are defined within architecture scope only, the ->pte_high member
can be safely removed.Remove the ->pte_high member from struct pte_t.
Rename ->pte_low to ->pte.
Adapt the pte helper macros in arch/um/include/asm/page.h.Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped. This
write barrier doesn't seem to be paired with any read barrier though and
thus, was useless anyway.Fixes: 16da306849d0 ("um: kill pfn_t")
Signed-off-by: Nicolai Stange
Cc: Dan Williams
Cc: Richard Weinberger
Cc: Nicolai Stange
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Jan, 2016
1 commit
-
The core has developed a need for a "pfn_t" type [1]. Convert the usage
of pfn_t by usermode-linux to an unsigned long, and update pfn_to_phys()
to drop its expectation of a typed pfn.[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
Signed-off-by: Dan Williams
Cc: Dave Hansen
Cc: Jeff Dike
Cc: Richard Weinberger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Jan, 2016
1 commit
-
Pull misc vfs updates from Al Viro:
"All kinds of stuff. That probably should've been 5 or 6 separate
branches, but by the time I'd realized how large and mixed that bag
had become it had been too close to -final to play with rebasing.Some fs/namei.c cleanups there, memdup_user_nul() introduction and
switching open-coded instances, burying long-dead code, whack-a-mole
of various kinds, several new helpers for ->llseek(), assorted
cleanups and fixes from various people, etc.One piece probably deserves special mention - Neil's
lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
called without ->i_mutex and tries to avoid ever taking it. That, of
course, means that it's not useful for any directory modifications,
but things like getting inode attributes in nfds readdirplus are fine
with that. I really should've asked for moratorium on lookup-related
changes this cycle, but since I hadn't done that early enough... I
*am* asking for that for the coming cycle, though - I'm going to try
and get conversion of i_mutex to rwsem with ->lookup() done under lock
taken shared.There will be a patch closer to the end of the window, along the lines
of the one Linus had posted last May - mechanical conversion of
->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
inode_is_locked()/inode_lock_nested(). To quote Linus back then:-----
| This is an automated patch using
|
| sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
| sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
| sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
| sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
| sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
|
| with a very few manual fixups
-----I'm going to send that once the ->i_mutex-affecting stuff in -next
gets mostly merged (or when Linus says he's about to stop taking
merges)"* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
nfsd: don't hold i_mutex over userspace upcalls
fs:affs:Replace time_t with time64_t
fs/9p: use fscache mutex rather than spinlock
proc: add a reschedule point in proc_readfd_common()
logfs: constify logfs_block_ops structures
fcntl: allow to set O_DIRECT flag on pipe
fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
fs: xattr: Use kvfree()
[s390] page_to_phys() always returns a multiple of PAGE_SIZE
nbd: use ->compat_ioctl()
fs: use block_device name vsprintf helper
lib/vsprintf: add %*pg format specifier
fs: use gendisk->disk_name where possible
poll: plug an unused argument to do_poll
amdkfd: don't open-code memdup_user()
cdrom: don't open-code memdup_user()
rsxx: don't open-code memdup_user()
mtip32xx: don't open-code memdup_user()
[um] mconsole: don't open-code memdup_user_nul()
[um] hostaudio: don't open-code memdup_user()
...
11 Jan, 2016
9 commits
-
Open the memory mapped file with the O_TMPFILE flag when available.
Signed-off-by: Mickaël Salaün
Cc: Jeff Dike
Cc: Richard Weinberger
Acked-by: Tristan Schmelcher
Signed-off-by: Richard Weinberger -
Remove the insecure 0777 mode for temporary file to prohibit other users
to change the executable mapped code.An attacker could gain access to the mapped file descriptor from the
temporary file (before it is unlinked) in a read-only mode but it should
not be accessible in write mode to avoid arbitrary code execution.To not change the hostfs behavior, the temporary file creation
permission now depends on the current umask(2) and the implementation of
mkstemp(3).Signed-off-by: Mickaël Salaün
Cc: Jeff Dike
Cc: Richard Weinberger
Acked-by: Tristan Schmelcher
Signed-off-by: Richard Weinberger -
This brings SECCOMP_MODE_STRICT and SECCOMP_MODE_FILTER support through
prctl(2) and seccomp(2) to User-mode Linux for i386 and x86_64
subarchitectures.secure_computing() is called first in handle_syscall() so that the
syscall emulation will be aborted quickly if matching a seccomp rule.This is inspired from Meredydd Luff's patch
(https://gerrit.chromium.org/gerrit/21425).Signed-off-by: Mickaël Salaün
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Ingo Molnar
Cc: Kees Cook
Cc: Andy Lutomirski
Cc: Will Drewry
Cc: Chris Metcalf
Cc: Michael Ellerman
Cc: James Hogan
Cc: Meredydd Luff
Cc: David Drysdale
Signed-off-by: Richard Weinberger
Acked-by: Kees Cook -
Add subarchitecture-independent implementation of asm-generic/syscall.h
allowing access to user system call parameters and results:
* syscall_get_nr()
* syscall_rollback()
* syscall_get_error()
* syscall_get_return_value()
* syscall_set_return_value()
* syscall_get_arguments()
* syscall_set_arguments()
* syscall_get_arch() provided by arch/x86/um/asm/syscall.hThis provides the necessary syscall helpers needed by
HAVE_ARCH_SECCOMP_FILTER plus syscall_get_error().This is inspired from Meredydd Luff's patch
(https://gerrit.chromium.org/gerrit/21425).Signed-off-by: Mickaël Salaün
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: H. Peter Anvin
Cc: Kees Cook
Cc: Andy Lutomirski
Cc: Will Drewry
Cc: Meredydd Luff
Cc: David Drysdale
Signed-off-by: Richard Weinberger
Acked-by: Kees Cook -
This fix two related bugs:
* PTRACE_GETREGS doesn't get the right orig_ax (syscall) value
* PTRACE_SETREGS can't set the orig_ax value (erased by initial value)Get rid of the now useless and error-prone get_syscall().
Fix inconsistent behavior in the ptrace implementation for i386 when
updating orig_eax automatically update the syscall number as well. This
is now updated in handle_syscall().Signed-off-by: Mickaël Salaün
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Thomas Gleixner
Cc: Kees Cook
Cc: Andy Lutomirski
Cc: Will Drewry
Cc: Thomas Meyer
Cc: Nicolas Iooss
Cc: Anton Ivanov
Cc: Meredydd Luff
Cc: David Drysdale
Signed-off-by: Richard Weinberger
Acked-by: Kees Cook -
This decreases the number of syscalls per read/write by half.
Signed-off-by: Anton Ivanov
Signed-off-by: Richard Weinberger -
Software IRQ processing in generic architectures assumes that the
exit out of hard IRQ may have re-enabled interrupts (some
architectures may have an implicit EOI). It presumes them enabled
and toggles the flags once more just in case unless this is turned
off in the architecture specific hardirq.h by setting
__ARCH_IRQ_EXIT_IRQS_DISABLEDThis patch adds this to UML where due to the way IRQs are handled
it is an optimization (it works fine without it too).Signed-off-by: Anton Ivanov
Signed-off-by: Richard Weinberger -
The existing IRQ handler design in UML does not prevent reentrancy
This is mitigated by fd-enable/fd-disable semantics for the IO
portion of the UML subsystem. The timer, however, can and is
re-entered resulting in very deep stack usage and occasional
stack exhaustion.This patch prevents this by checking if there is a timer
interrupt in-flight before processing any pending timer interrupts.Signed-off-by: Anton Ivanov
Signed-off-by: Richard Weinberger -
I was seeing some really weird behaviour where piping UML's output
somewhere would cause output to get duplicated:$ ./vmlinux | head -n 40
Checking that ptrace can change system call numbers...Core dump limits :
soft - 0
hard - NONE
OK
Checking syscall emulation patch for ptrace...Core dump limits :
soft - 0
hard - NONE
OK
Checking advanced syscall emulation patch for ptrace...Core dump limits :
soft - 0
hard - NONE
OK
Core dump limits :
soft - 0
hard - NONEThis is because these tests do a fork() which duplicates the non-empty
stdout buffer, then glibc flushes the duplicated buffer as each child
exits.A simple workaround is to flush before forking.
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum
Signed-off-by: Richard Weinberger
04 Jan, 2016
2 commits
-
Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
09 Dec, 2015
3 commits
-
When using va_list ensure that va_start will be followed by va_end.
Signed-off-by: Geyslan G. Bem
Signed-off-by: Richard Weinberger -
On gcc Ubuntu 4.8.4-2ubuntu1~14.04, linking vmlinux fails with:
arch/um/os-Linux/built-in.o: In function `os_timer_create':
/android/kernel/android/arch/um/os-Linux/time.c:51: undefined reference to `timer_create'
arch/um/os-Linux/built-in.o: In function `os_timer_set_interval':
/android/kernel/android/arch/um/os-Linux/time.c:84: undefined reference to `timer_settime'
arch/um/os-Linux/built-in.o: In function `os_timer_remain':
/android/kernel/android/arch/um/os-Linux/time.c:109: undefined reference to `timer_gettime'
arch/um/os-Linux/built-in.o: In function `os_timer_one_shot':
/android/kernel/android/arch/um/os-Linux/time.c:132: undefined reference to `timer_settime'
arch/um/os-Linux/built-in.o: In function `os_timer_disable':
/android/kernel/android/arch/um/os-Linux/time.c:145: undefined reference to `timer_settime'This is because -lrt appears in the generated link commandline
after arch/um/os-Linux/built-in.o. Fix this by removing -lrt from
arch/um/Makefile and adding it to the UM-specific section of
scripts/link-vmlinux.sh.Signed-off-by: Lorenzo Colitti
Signed-off-by: Richard Weinberger -
If get_signal() returns us a signal to post
we must not call it again, otherwise the already
posted signal will be overridden.
Before commit a610d6e672d this was the case as we stopped
the while after a successful handle_signal().Cc: # 3.10-
Fixes: a610d6e672d ("pull clearing RESTORE_SIGMASK into block_sigmask()")
Signed-off-by: Richard Weinberger
07 Nov, 2015
6 commits
-
UML is using an obsolete itimer call for
all timers and "polls" for kernel space timer firing
in its userspace portion resulting in a long list
of bugs and incorrect behaviour(s). It also uses
ITIMER_VIRTUAL for its timer which results in the
timer being dependent on it running and the cpu
load.This patch fixes this by moving to posix high resolution
timers firing off CLOCK_MONOTONIC and relaying the timer
correctly to the UML userspace.Fixes:
- crashes when hosts suspends/resumes
- broken userspace timers - effecive ~40Hz instead
of what they should be. Note - this modifies skas behavior
by no longer setting an itimer per clone(). Timer events
are relayed instead.
- kernel network packet scheduling disciplines
- tcp behaviour especially under load
- various timer related corner casesFinally, overall responsiveness of userspace is better.
Signed-off-by: Thomas Meyer
Signed-off-by: Anton Ivanov
[rw: massaged commit message]
Signed-off-by: Richard Weinberger -
since GFP_KERNEL with GFP_ATOMIC while spinlock is held,
as code while holding a spinlock should be atomic.
GFP_KERNEL may sleep and can cause deadlock,
where as GFP_ATOMIC may fail but certainly avoids deadlockdex f70dd54..d898f6c 100644Signed-off-by: Saurabh Sengar
Signed-off-by: Richard Weinberger -
If UML runs on the host side out of memory, report this
condition more nicely.Signed-off-by: Richard Weinberger
-
We can use __NR_syscall_max.
Signed-off-by: Richard Weinberger
-
To support changing syscall numbers we have to store
it after syscall_trace_enter().Signed-off-by: Richard Weinberger
-
...such that processes within UML can do a ptrace(PTRACE_OLDSETOPTIONS, ...)
Signed-off-by: Richard Weinberger
20 Oct, 2015
2 commits
-
We have to exclude memory locations
-
If UML is executing a helper program it is using
waitpid() with the __WCLONE flag to wait for the program
as the helper is executed from a clone()'ed thread.
While using __WCLONE is perfectly fine for clone()'ed
childs it won't detect terminated childs if the helper
has issued an execve().We have to use __WALL to wait for both clone()'ed and
regular childs to detect the termination before and
after an execve().Reported-and-tested-by: Thomas Meyer
Signed-off-by: Richard Weinberger