Eric Lee / smarc-fsl-linux-kernel

09 Jan, 2021

1 commit

ab7709b55 exec: Transform exec_update_mutex into a rw_semaphore ... Browse Code »

[ Upstream commit f7cfd871ae0c5008d94b6f66834e7845caa93c15 ]

Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex. The problematic lock ordering found by lockdep
was:

perf_event_open (exec_update_mutex -> ovl_i_mutex)
chown (ovl_i_mutex -> sb_writes)
sendfile (sb_writes -> p->lock)
by reading from a proc file and writing to overlayfs
proc_pid_syscall (p->lock -> exec_update_mutex)

While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same. They are all readers. The only writer is
exec.

There is no reason for readers to block on each other. So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.

Cc: Jann Horn
Cc: Vasiliy Kulikov
Cc: Al Viro
Cc: Bernd Edlinger
Cc: Oleg Nesterov
Cc: Christopher Yeoh
Cc: Cyrill Gorcunov
Cc: Sargun Dhillon
Cc: Christian Brauner
Cc: Arnd Bergmann
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Fixes: eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman
Signed-off-by: Sasha Levin

Eric W. Biederman
2021-01-09 20:46:24 +0800

17 Oct, 2020

1 commit

96685f866 Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux ... Browse Code »

Pull powerpc updates from Michael Ellerman:

- A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting
it for powerpc, as well as a related fix for sparc.

- Remove support for PowerPC 601.

- Some fixes for watchpoints & addition of a new ptrace flag for
detecting ISA v3.1 (Power10) watchpoint features.

- A fix for kernels using 4K pages and the hash MMU on bare metal
Power9 systems with > 16TB of RAM, or RAM on the 2nd node.

- A basic idle driver for shallow stop states on Power10.

- Tweaks to our sched domains code to better inform the scheduler about
the hardware topology on Power9/10, where two SMT4 cores can be
presented by firmware as an SMT8 core.

- A series doing further reworks & cleanups of our EEH code.

- Addition of a filter for RTAS (firmware) calls done via sys_rtas(),
to prevent root from overwriting kernel memory.

- Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe
Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn
Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero,
Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad
Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca
Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas
Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro
Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang
Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha,
Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
Yingliang, zhengbin.

* tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits)
Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"
selftests/powerpc: Fix eeh-basic.sh exit codes
cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
powerpc/time: Make get_tb() common to PPC32 and PPC64
powerpc/time: Make get_tbl() common to PPC32 and PPC64
powerpc/time: Remove get_tbu()
powerpc/time: Avoid using get_tbl() and get_tbu() internally
powerpc/time: Make mftb() common to PPC32 and PPC64
powerpc/time: Rename mftbl() to mftb()
powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S
powerpc/32s: Rename head_32.S to head_book3s_32.S
powerpc/32s: Setup the early hash table at all time.
powerpc/time: Remove ifdef in get_dec() and set_dec()
powerpc: Remove get_tb_or_rtc()
powerpc: Remove __USE_RTC()
powerpc: Tidy up a bit after removal of PowerPC 601.
powerpc: Remove support for PowerPC 601
powerpc: Remove PowerPC 601
powerpc: Drop SYNC_601() ISYNC_601() and SYNC()
powerpc: Remove CONFIG_PPC601_SYNC_FIX
...

Linus Torvalds
2020-10-17 03:21:15 +0800

16 Oct, 2020

1 commit

726eb70e0 Merge tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc ... Browse Code »

Pull char/misc driver updates from Greg KH:
"Here is the big set of char, misc, and other assorted driver subsystem
patches for 5.10-rc1.

There's a lot of different things in here, all over the drivers/
directory. Some summaries:

- soundwire driver updates

- habanalabs driver updates

- extcon driver updates

- nitro_enclaves new driver

- fsl-mc driver and core updates

- mhi core and bus updates

- nvmem driver updates

- eeprom driver updates

- binder driver updates and fixes

- vbox minor bugfixes

- fsi driver updates

- w1 driver updates

- coresight driver updates

- interconnect driver updates

- misc driver updates

- other minor driver updates

All of these have been in linux-next for a while with no reported
issues"

* tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (396 commits)
binder: fix UAF when releasing todo list
docs: w1: w1_therm: Fix broken xref, mistakes, clarify text
misc: Kconfig: fix a HISI_HIKEY_USB dependency
LSM: Fix type of id parameter in kernel_post_load_data prototype
misc: Kconfig: add a new dependency for HISI_HIKEY_USB
firmware_loader: fix a kernel-doc markup
w1: w1_therm: make w1_poll_completion static
binder: simplify the return expression of binder_mmap
test_firmware: Test partial read support
firmware: Add request_partial_firmware_into_buf()
firmware: Store opt_flags in fw_priv
fs/kernel_file_read: Add "offset" arg for partial reads
IMA: Add support for file reads without contents
LSM: Add "contents" flag to kernel_read_file hook
module: Call security_kernel_post_load_data()
firmware_loader: Use security_post_load_data()
LSM: Introduce kernel_post_load_data() hook
fs/kernel_read_file: Add file_size output argument
fs/kernel_read_file: Switch buffer size arg to size_t
fs/kernel_read_file: Remove redundant size argument
...

Linus Torvalds
2020-10-16 01:01:51 +0800

05 Oct, 2020

3 commits

5287b07f6 fs/kernel_read_file: Split into separate source file ... Browse Code »

These routines are used in places outside of exec(2), so in preparation
for refactoring them, move them into a separate source file,
fs/kernel_read_file.c.

Signed-off-by: Kees Cook
Reviewed-by: Mimi Zohar
Reviewed-by: Luis Chamberlain
Acked-by: Scott Branden
Link: https://lore.kernel.org/r/20201002173828.2099543-5-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2020-10-05 19:34:18 +0800
b89999d00 fs/kernel_read_file: Split into separate include file ... Browse Code »

Move kernel_read_file* out of linux/fs.h to its own linux/kernel_read_file.h
include file. That header gets pulled in just about everywhere
and doesn't really need functions not related to the general fs interface.

Suggested-by: Christoph Hellwig
Signed-off-by: Scott Branden
Signed-off-by: Kees Cook
Reviewed-by: Christoph Hellwig
Reviewed-by: Mimi Zohar
Reviewed-by: Luis Chamberlain
Acked-by: Greg Kroah-Hartman
Acked-by: James Morris
Link: https://lore.kernel.org/r/20200706232309.12010-2-scott.branden@broadcom.com
Link: https://lore.kernel.org/r/20201002173828.2099543-4-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Scott Branden
2020-10-05 19:34:18 +0800
c307459b9 fs/kernel_read_file: Remove FIRMWARE_PREALLOC_BUFFER enum ... Browse Code »

FIRMWARE_PREALLOC_BUFFER is a "how", not a "what", and confuses the LSMs
that are interested in filtering between types of things. The "how"
should be an internal detail made uninteresting to the LSMs.

Fixes: a098ecd2fa7d ("firmware: support loading into a pre-allocated buffer")
Fixes: fd90bc559bfb ("ima: based on policy verify firmware signatures (pre-allocated buffer)")
Fixes: 4f0496d8ffa3 ("ima: based on policy warn about loading firmware (pre-allocated buffer)")
Signed-off-by: Kees Cook
Reviewed-by: Mimi Zohar
Reviewed-by: Luis Chamberlain
Acked-by: Scott Branden
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201002173828.2099543-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2020-10-05 19:34:18 +0800

01 Oct, 2020

1 commit

0f2122045 io_uring: don't rely on weak ->files references ... Browse Code »

Grab actual references to the files_struct. To avoid circular references
issues due to this, we add a per-task note that keeps track of what
io_uring contexts a task has used. When the tasks execs or exits its
assigned files, we cancel requests based on this tracking.

With that, we can grab proper references to the files table, and no
longer need to rely on stashing away ring_fd and ring_file to check
if the ring_fd may have been closed.

Cc: stable@vger.kernel.org # v5.5+
Reviewed-by: Pavel Begunkov
Signed-off-by: Jens Axboe

Jens Axboe
2020-10-01 10:32:32 +0800

16 Sep, 2020

1 commit

d53c3dfb2 mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race ... Browse Code »

Reading and modifying current->mm and current->active_mm and switching
mm should be done with irqs off, to prevent races seeing an intermediate
state.

This is similar to commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB
invalidate"). At exec-time when the new mm is activated, the old one
should usually be single-threaded and no longer used, unless something
else is holding an mm_users reference (which may be possible).

Absent other mm_users, there is also a race with preemption and lazy tlb
switching. Consider the kernel_execve case where the current thread is
using a lazy tlb active mm:

call_usermodehelper()
kernel_execve()
old_mm = current->mm;
active_mm = current->active_mm;
*** preempt *** --------------------> schedule()
prev->active_mm = NULL;
mmdrop(prev active_mm);
...
mm = mm;
current->active_mm = mm;
if (!old_mm)
mmdrop(active_mm);

If we switch back to the kernel thread from a different mm, there is a
double free of the old active_mm, and a missing free of the new one.

Closing this race only requires interrupts to be disabled while ->mm
and ->active_mm are being switched, but the TLB problem requires also
holding interrupts off over activate_mm. Unfortunately not all archs
can do that yet, e.g., arm defers the switch if irqs are disabled and
expects finish_arch_post_lock_switch() to be called to complete the
flush; um takes a blocking lock in activate_mm().

So as a first step, disable interrupts across the mm/active_mm updates
to close the lazy tlb preempt race, and provide an arch option to
extend that to activate_mm which allows architectures doing IPI based
TLB shootdowns to close the second race.

This is a bit ugly, but in the interest of fixing the bug and backporting
before all architectures are converted this is a compromise.

Signed-off-by: Nicholas Piggin
Acked-by: Peter Zijlstra (Intel)
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200914045219.3736466-2-npiggin@gmail.com

Nicholas Piggin
2020-09-16 10:24:31 +0800

13 Aug, 2020

5 commits

64019a2e4 mm/gup: remove task_struct pointer for all gup code ... Browse Code »

After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more. Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Reviewed-by: John Hubbard
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds

Peter Xu
2020-08-13 01:58:04 +0800
0fd338b2d exec: move path_noexec() check earlier ... Browse Code »

The path_noexec() check, like the regular file check, was happening too
late, letting LSMs see impossible execve()s. Check it earlier as well in
may_open() and collect the redundant fs/exec.c path_noexec() test under
the same robustness comment as the S_ISREG() check.

My notes on the call path, and related arguments, checks, etc:

do_open_execat()
struct open_flags open_exec_flags = {
.open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
.acc_mode = MAY_EXEC,
...
do_filp_open(dfd, filename, open_flags)
path_openat(nameidata, open_flags, flags)
file = alloc_empty_file(open_flags, current_cred());
do_open(nameidata, file, open_flags)
may_open(path, acc_mode, open_flag)
/* new location of MAY_EXEC vs path_noexec() test */
inode_permission(inode, MAY_OPEN | acc_mode)
security_inode_permission(inode, acc_mode)
vfs_open(path, file)
do_dentry_open(file, path->dentry->d_inode, open)
security_file_open(f)
open()
/* old location of path_noexec() test */

Signed-off-by: Kees Cook
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Cc: Aleksa Sarai
Cc: Christian Brauner
Cc: Dmitry Vyukov
Cc: Eric Biggers
Cc: Tetsuo Handa
Link: http://lkml.kernel.org/r/20200605160013.3954297-4-keescook@chromium.org
Signed-off-by: Linus Torvalds

Kees Cook
2020-08-13 01:58:01 +0800
633fb6ac3 exec: move S_ISREG() check earlier ... Browse Code »

The execve(2)/uselib(2) syscalls have always rejected non-regular files.
Recently, it was noticed that a deadlock was introduced when trying to
execute pipes, as the S_ISREG() test was happening too late. This was
fixed in commit 73601ea5b7b1 ("fs/open.c: allow opening only regular files
during execve()"), but it was added after inode_permission() had already
run, which meant LSMs could see bogus attempts to execute non-regular
files.

Move the test into the other inode type checks (which already look for
other pathological conditions[1]). Since there is no need to use
FMODE_EXEC while we still have access to "acc_mode", also switch the test
to MAY_EXEC.

Also include a comment with the redundant S_ISREG() checks at the end of
execve(2)/uselib(2) to note that they are present to avoid any mistakes.

My notes on the call path, and related arguments, checks, etc:

do_open_execat()
struct open_flags open_exec_flags = {
.open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
.acc_mode = MAY_EXEC,
...
do_filp_open(dfd, filename, open_flags)
path_openat(nameidata, open_flags, flags)
file = alloc_empty_file(open_flags, current_cred());
do_open(nameidata, file, open_flags)
may_open(path, acc_mode, open_flag)
/* new location of MAY_EXEC vs S_ISREG() test */
inode_permission(inode, MAY_OPEN | acc_mode)
security_inode_permission(inode, acc_mode)
vfs_open(path, file)
do_dentry_open(file, path->dentry->d_inode, open)
/* old location of FMODE_EXEC vs S_ISREG() test */
security_file_open(f)
open()

[1] https://lore.kernel.org/lkml/202006041910.9EF0C602@keescook/

Signed-off-by: Kees Cook
Signed-off-by: Andrew Morton
Cc: Aleksa Sarai
Cc: Alexander Viro
Cc: Christian Brauner
Cc: Dmitry Vyukov
Cc: Eric Biggers
Cc: Tetsuo Handa
Link: http://lkml.kernel.org/r/20200605160013.3954297-3-keescook@chromium.org
Signed-off-by: Linus Torvalds

Kees Cook
2020-08-13 01:58:01 +0800
db19c91c3 exec: change uselib(2) IS_SREG() failure to EACCES ... Browse Code »

Patch series "Relocate execve() sanity checks", v2.

While looking at the code paths for the proposed O_MAYEXEC flag, I saw
some things that looked like they should be fixed up.

exec: Change uselib(2) IS_SREG() failure to EACCES
This just regularizes the return code on uselib(2).

exec: Move S_ISREG() check earlier
This moves the S_ISREG() check even earlier than it was already.

exec: Move path_noexec() check earlier
This adds the path_noexec() check to the same place as the
S_ISREG() check.

This patch (of 3):

Change uselib(2)' S_ISREG() error return to EACCES instead of EINVAL so
the behavior matches execve(2), and the seemingly documented value. The
"not a regular file" failure mode of execve(2) is explicitly
documented[1], but it is not mentioned in uselib(2)[2] which does,
however, say that open(2) and mmap(2) errors may apply. The documentation
for open(2) does not include a "not a regular file" error[3], but mmap(2)
does[4], and it is EACCES.

[1] http://man7.org/linux/man-pages/man2/execve.2.html#ERRORS
[2] http://man7.org/linux/man-pages/man2/uselib.2.html#ERRORS
[3] http://man7.org/linux/man-pages/man2/open.2.html#ERRORS
[4] http://man7.org/linux/man-pages/man2/mmap.2.html#ERRORS

Signed-off-by: Kees Cook
Signed-off-by: Andrew Morton
Acked-by: Christian Brauner
Cc: Aleksa Sarai
Cc: Alexander Viro
Cc: Dmitry Vyukov
Cc: Eric Biggers
Cc: Tetsuo Handa
Link: http://lkml.kernel.org/r/20200605160013.3954297-1-keescook@chromium.org
Link: http://lkml.kernel.org/r/20200605160013.3954297-2-keescook@chromium.org
Signed-off-by: Linus Torvalds

Kees Cook
2020-08-13 01:58:01 +0800
fe8141759 exec: use force_uaccess_begin during exec and exit ... Browse Code »

Both exec and exit want to ensure that the uaccess routines actually do
access user pointers. Use the newly added force_uaccess_begin helper
instead of an open coded set_fs for that to prepare for kernel builds
where set_fs() does not exist.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Linus Torvalds
Cc: Nick Hu
Cc: Greentime Hu
Cc: Vincent Chen
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Cc: Geert Uytterhoeven
Link: http://lkml.kernel.org/r/20200710135706.537715-7-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-08-13 01:57:59 +0800

21 Jul, 2020

6 commits

be619f7f0 exec: Implement kernel_execve ... Browse Code »

To allow the kernel not to play games with set_fs to call exec
implement kernel_execve. The function kernel_execve takes pointers
into kernel memory and copies the values pointed to onto the new
userspace stack.

The calls with arguments from kernel space of do_execve are replaced
with calls to kernel_execve.

The calls do_execve and do_execveat are made static as there are now
no callers outside of exec.

The comments that mention do_execve are updated to refer to
kernel_execve or execve depending on the circumstances. In addition
to correcting the comments, this makes it easy to grep for do_execve
and verify it is not used.

Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de
Reviewed-by: Kees Cook
Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-21 21:24:52 +0800
d8b9cd549 exec: Factor bprm_stack_limits out of prepare_arg_pages ... Browse Code »

In preparation for implementiong kernel_execve (which will take kernel
pointers not userspace pointers) factor out bprm_stack_limits out of
prepare_arg_pages. This separates the counting which depends upon the
getting data from userspace from the calculations of the stack limits
which is usable in kernel_execve.

The remove prepare_args_pages and compute bprm->argc and bprm->envc
directly in do_execveat_common, before bprm_stack_limits is called.

Reviewed-by: Kees Cook
Reviewed-by: Christoph Hellwig
Link: https://lkml.kernel.org/r/87365u6x60.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-21 21:24:52 +0800
0c9cdff05 exec: Factor bprm_execve out of do_execve_common ... Browse Code »

Currently it is necessary for the usermode helper code and the code
that launches init to use set_fs so that pages coming from the kernel
look like they are coming from userspace.

To allow that usage of set_fs to be removed cleanly the argument
copying from userspace needs to happen earlier. Factor bprm_execve
out of do_execve_common to separate out the copying of arguments
to the newe stack, and the rest of exec.

In separating bprm_execve from do_execve_common the copying
of the arguments onto the new stack happens earlier.

As the copying of the arguments does not depend any security hooks,
files, the file table, current->in_execve, current->fs->in_exec,
bprm->unsafe, or creds this is safe.

Likewise the security hook security_creds_for_exec does not depend upon
preventing the argument copying from happening.

In addition to making it possible to implement kernel_execve that
performs the copying differently, this separation of bprm_execve from
do_execve_common makes for a nice separation of responsibilities making
the exec code easier to navigate.

Reviewed-by: Kees Cook
Reviewed-by: Christoph Hellwig
Link: https://lkml.kernel.org/r/878sfm6x6x.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-21 21:24:52 +0800
f18ac551e exec: Move bprm_mm_init into alloc_bprm ... Browse Code »

Currently it is necessary for the usermode helper code and the code that
launches init to use set_fs so that pages coming from the kernel look like
they are coming from userspace.

To allow that usage of set_fs to be removed cleanly the argument copying
from userspace needs to happen earlier. Move the allocation and
initialization of bprm->mm into alloc_bprm so that the bprm->mm is
available early to store the new user stack into. This is a prerequisite
for copying argv and envp into the new user stack early before ther rest of
exec.

To keep the things consistent the cleanup of bprm->mm is moved into
free_bprm. So that bprm->mm will be cleaned up whenever bprm->mm is
allocated and free_bprm are called.

Moving bprm_mm_init earlier is safe as it does not depend on any files,
current->in_execve, current->fs->in_exec, bprm->unsafe, or the if the file
table is shared. (AKA bprm_mm_init does not depend on any of the code that
happens between alloc_bprm and where it was previously called.)

This moves bprm->mm cleanup after current->fs->in_exec is set to 0. This
is safe because current->fs->in_exec is only used to preventy taking an
additional reference on the fs_struct.

This moves bprm->mm cleanup after current->in_execve is set to 0. This is
safe because current->in_execve is only used by the lsms (apparmor and
tomoyou) and always for LSM specific functions, never for anything to do
with the mm.

This adds bprm->mm cleanup into the successful return path. This is safe
because being on the successful return path implies that begin_new_exec
succeeded and set brpm->mm to NULL. As bprm->mm is NULL bprm cleanup I am
moving into free_bprm will do nothing.

Reviewed-by: Kees Cook
Reviewed-by: Christoph Hellwig
Link: https://lkml.kernel.org/r/87eepe6x7p.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-21 21:24:52 +0800
60d9ad1d1 exec: Move initialization of bprm->filename into alloc_bprm ... Browse Code »

Currently it is necessary for the usermode helper code and the code
that launches init to use set_fs so that pages coming from the kernel
look like they are coming from userspace.

To allow that usage of set_fs to be removed cleanly the argument
copying from userspace needs to happen earlier. Move the computation
of bprm->filename and possible allocation of a name in the case
of execveat into alloc_bprm to make that possible.

The exectuable name, the arguments, and the environment are
copied into the new usermode stack which is stored in bprm
until exec passes the point of no return.

As the executable name is copied first onto the usermode stack
it needs to be known. As there are no dependencies to computing
the executable name, compute it early in alloc_bprm.

As an implementation detail if the filename needs to be generated
because it embeds a file descriptor store that filename in a new field
bprm->fdpath, and free it in free_bprm. Previously this was done in
an independent variable pathbuf. I have renamed pathbuf fdpath
because fdpath is more suggestive of what kind of path is in the
variable. I moved fdpath into struct linux_binprm because it is
tightly tied to the other variables in struct linux_binprm, and as
such is needed to allow the call alloc_binprm to move.

Reviewed-by: Kees Cook
Reviewed-by: Christoph Hellwig
Link: https://lkml.kernel.org/r/87k0z66x8f.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-21 21:24:52 +0800
0a8f36eb4 exec: Factor out alloc_bprm ... Browse Code »

Currently it is necessary for the usermode helper code and the code
that launches init to use set_fs so that pages coming from the kernel
look like they are coming from userspace.

To allow that usage of set_fs to be removed cleanly the argument
copying from userspace needs to happen earlier. Move the allocation
of the bprm into it's own function (alloc_bprm) and move the call of
alloc_bprm before unshare_files so that bprm can ultimately be
allocated, the arguments can be placed on the new stack, and then the
bprm can be passed into the core of exec.

Neither the allocation of struct binprm nor the unsharing depend upon each
other so swapping the order in which they are called is trivially safe.

To keep things consistent the order of cleanup at the end of
do_execve_common swapped to match the order of initialization.

Reviewed-by: Kees Cook
Link: https://lkml.kernel.org/r/87pn8y6x9a.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-21 21:24:44 +0800

04 Jul, 2020

1 commit

25cf336de exec: Remove do_execve_file ... Browse Code »

Now that the last callser has been removed remove this code from exec.

For anyone thinking of resurrecing do_execve_file please note that
the code was buggy in several fundamental ways.

- It did not ensure the file it was passed was read-only and that
deny_write_access had been called on it. Which subtlely breaks
invaniants in exec.

- The caller of do_execve_file was expected to hold and put a
reference to the file, but an extra reference for use by exec was
not taken so that when exec put it's reference to the file an
underflow occured on the file reference count.

- The point of the interface was so that a pathname did not need to
exist. Which breaks pathname based LSMs.

Tetsuo Handa originally reported these issues[1]. While it was clear
that deny_write_access was missing the fundamental incompatibility
with the passed in O_RDWR filehandle was not immediately recognized.

All of these issues were fixed by modifying the usermode driver code
to have a path, so it did not need this hack.

Reported-by: Tetsuo Handa
[1] https://lore.kernel.org/linux-fsdevel/2a8775b4-1dd5-9d5c-aa42-9872445e0942@i-love.sakura.ne.jp/
v1: https://lkml.kernel.org/r/871rm2f0hi.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87lfk54p0m.fsf_-_@x220.int.ebiederm.org
Link: https://lkml.kernel.org/r/20200702164140.4468-10-ebiederm@xmission.com
Reviewed-by: Greg Kroah-Hartman
Acked-by: Alexei Starovoitov
Tested-by: Alexei Starovoitov
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-07-04 22:35:43 +0800

10 Jun, 2020

2 commits

c1e8d7c6a mmap locking API: convert mmap_sem comments ... Browse Code »

Convert comments that reference mmap_sem to reference mmap_lock instead.

[akpm@linux-foundation.org: fix up linux-next leftovers]
[akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
[akpm@linux-foundation.org: more linux-next fixups, per Michel]

Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Vlastimil Babka
Reviewed-by: Daniel Jordan
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Laurent Dufour
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
Signed-off-by: Linus Torvalds

Michel Lespinasse
2020-06-10 00:39:14 +0800
d8ed45c5d mmap locking API: use coccinelle to convert mmap_sem rwsem call sites ... Browse Code »

This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Daniel Jordan
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds

Michel Lespinasse
2020-06-10 00:39:14 +0800

09 Jun, 2020

2 commits

bce2b68b8 exec: use flush_icache_user_range in read_code ... Browse Code »

read_code operates on user addresses.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Link: http://lkml.kernel.org/r/20200515143646.3857579-27-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800
48304f799 exec: only build read_code when needed ... Browse Code »

Only build read_code when binary formats that use it are built into the
kernel.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Link: http://lkml.kernel.org/r/20200515143646.3857579-26-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:58 +0800

05 Jun, 2020

5 commits

886d7de63 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge yet more updates from Andrew Morton:

- More MM work. 100ish more to go. Mike Rapoport's "mm: remove
__ARCH_HAS_5LEVEL_HACK" series should fix the current ppc issue

- Various other little subsystems

* emailed patches from Andrew Morton : (127 commits)
lib/ubsan.c: fix gcc-10 warnings
tools/testing/selftests/vm: remove duplicate headers
selftests: vm: pkeys: fix multilib builds for x86
selftests: vm: pkeys: use the correct page size on powerpc
selftests/vm/pkeys: override access right definitions on powerpc
selftests/vm/pkeys: test correct behaviour of pkey-0
selftests/vm/pkeys: introduce a sub-page allocator
selftests/vm/pkeys: detect write violation on a mapped access-denied-key page
selftests/vm/pkeys: associate key on a mapped page and detect write violation
selftests/vm/pkeys: associate key on a mapped page and detect access violation
selftests/vm/pkeys: improve checks to determine pkey support
selftests/vm/pkeys: fix assertion in test_pkey_alloc_exhaust()
selftests/vm/pkeys: fix number of reserved powerpc pkeys
selftests/vm/pkeys: introduce powerpc support
selftests/vm/pkeys: introduce generic pkey abstractions
selftests: vm: pkeys: use the correct huge page size
selftests/vm/pkeys: fix alloc_random_pkey() to make it really random
selftests/vm/pkeys: fix assertion in pkey_disable_set/clear()
selftests/vm/pkeys: fix pkey_disable_clear()
selftests: vm: pkeys: add helpers for pkey bits
...

Linus Torvalds
2020-06-05 10:18:29 +0800
762a3af6f exec: open code copy_string_kernel ... Browse Code »

Currently copy_string_kernel is just a wrapper around copy_strings that
simplifies the calling conventions and uses set_fs to allow passing a
kernel pointer. But due to the fact the we only need to handle a single
kernel argument pointer, the logic can be sigificantly simplified while
getting rid of the set_fs.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Link: http://lkml.kernel.org/r/20200501104105.2621149-3-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-05 10:06:26 +0800
986db2d14 exec: simplify the copy_strings_kernel calling convention ... Browse Code »

copy_strings_kernel is always used with a single argument,
adjust the calling convention to that.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Alexander Viro
Link: http://lkml.kernel.org/r/20200501104105.2621149-2-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-05 10:06:26 +0800
15a2bc4db Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull execve updates from Eric Biederman:
"Last cycle for the Nth time I ran into bugs and quality of
implementation issues related to exec that could not be easily be
fixed because of the way exec is implemented. So I have been digging
into exec and cleanup up what I can.

I don't think I have exec sorted out enough to fix the issues I
started with but I have made some headway this cycle with 4 sets of
changes.

- promised cleanups after introducing exec_update_mutex

- trivial cleanups for exec

- control flow simplifications

- remove the recomputation of bprm->cred

The net result is code that is a bit easier to understand and work
with and a decrease in the number of lines of code (if you don't count
the added tests)"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (24 commits)
exec: Compute file based creds only once
exec: Add a per bprm->file version of per_clear
binfmt_elf_fdpic: fix execfd build regression
selftests/exec: Add binfmt_script regression test
exec: Remove recursion from search_binary_handler
exec: Generic execfd support
exec/binfmt_script: Don't modify bprm->buf and then return -ENOEXEC
exec: Move the call of prepare_binprm into search_binary_handler
exec: Allow load_misc_binary to call prepare_binprm unconditionally
exec: Convert security_bprm_set_creds into security_bprm_repopulate_creds
exec: Factor security_bprm_creds_for_exec out of security_bprm_set_creds
exec: Teach prepare_exec_creds how exec treats uids & gids
exec: Set the point of no return sooner
exec: Move handling of the point of no return to the top level
exec: Run sync_mm_rss before taking exec_update_mutex
exec: Fix spelling of search_binary_handler in a comment
exec: Move the comment from above de_thread to above unshare_sighand
exec: Rename flush_old_exec begin_new_exec
exec: Move most of setup_new_exec into flush_old_exec
exec: In setup_new_exec cache current in the local variable me
...

Linus Torvalds
2020-06-05 05:07:08 +0800
9ff725857 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull proc updates from Eric Biederman:
"This has four sets of changes:

- modernize proc to support multiple private instances

- ensure we see the exit of each process tid exactly

- remove has_group_leader_pid

- use pids not tasks in posix-cpu-timers lookup

Alexey updated proc so each mount of proc uses a new superblock. This
allows people to actually use mount options with proc with no fear of
messing up another mount of proc. Given the kernel's internal mounts
of proc for things like uml this was a real problem, and resulted in
Android's hidepid mount options being ignored and introducing security
issues.

The rest of the changes are small cleanups and fixes that came out of
my work to allow this change to proc. In essence it is swapping the
pids in de_thread during exec which removes a special case the code
had to handle. Then updating the code to stop handling that special
case"

* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: proc_pid_ns takes super_block as an argument
remove the no longer needed pid_alive() check in __task_pid_nr_ns()
posix-cpu-timers: Replace __get_task_for_clock with pid_for_clock
posix-cpu-timers: Replace cpu_timer_pid_type with clock_pid_type
posix-cpu-timers: Extend rcu_read_lock removing task_struct references
signal: Remove has_group_leader_pid
exec: Remove BUG_ON(has_group_leader_pid)
posix-cpu-timer: Unify the now redundant code in lookup_task
posix-cpu-timer: Tidy up group_leader logic in lookup_task
proc: Ensure we see the exit of each process tid exactly once
rculist: Add hlists_swap_heads_rcu
proc: Use PIDTYPE_TGID in next_tgid
Use proc_pid_ns() to get pid_namespace from the proc superblock
proc: use named enums for better readability
proc: use human-readable values for hidepid
docs: proc: add documentation for "hidepid=4" and "subset=pid" options and new mount behavior
proc: add option to mount only a pids subset
proc: instantiate only pids that we can ptrace on 'hidepid=4' mount option
proc: allow to mount many instances of proc in one pid namespace
proc: rename struct proc_fs_info to proc_fs_opts

Linus Torvalds
2020-06-05 04:54:34 +0800

30 May, 2020

2 commits

56305aa9b exec: Compute file based creds only once ... Browse Code »

Move the computation of creds from prepare_binfmt into begin_new_exec
so that the creds need only be computed once. This is just code
reorganization no semantic changes of any kind are made.

Moving the computation is safe. I have looked through the kernel and
verified none of the binfmts look at bprm->cred directly, and that
there are no helpers that look at bprm->cred indirectly. Which means
that it is not a problem to compute the bprm->cred later in the
execution flow as it is not used until it becomes current->cred.

A new function bprm_creds_from_file is added to contain the work that
needs to be done. bprm_creds_from_file first computes which file
bprm->executable or most likely bprm->file that the bprm->creds
will be computed from.

The funciton bprm_fill_uid is updated to receive the file instead of
accessing bprm->file. The now unnecessary work needed to reset the
bprm->cred->euid, and bprm->cred->egid is removed from brpm_fill_uid.
A small comment to document that bprm_fill_uid now only deals with the
work to handle suid and sgid files. The default case is already
heandled by prepare_exec_creds.

The function security_bprm_repopulate_creds is renamed
security_bprm_creds_from_file and now is explicitly passed the file
from which to compute the creds. The documentation of the
bprm_creds_from_file security hook is updated to explain when the hook
is called and what it needs to do. The file is passed from
cap_bprm_creds_from_file into get_file_caps so that the caps are
computed for the appropriate file. The now unnecessary work in
cap_bprm_creds_from_file to reset the ambient capabilites has been
removed. A small comment to document that the work of
cap_bprm_creds_from_file is to read capabilities from the files
secureity attribute and derive capabilities from the fact the
user had uid 0 has been added.

Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-30 11:00:54 +0800
a7868323c exec: Add a per bprm->file version of per_clear ... Browse Code »

There is a small bug in the code that recomputes parts of bprm->cred
for every bprm->file. The code never recomputes the part of
clear_dangerous_personality_flags it is responsible for.

Which means that in practice if someone creates a sgid script
the interpreter will not be able to use any of:
READ_IMPLIES_EXEC
ADDR_NO_RANDOMIZE
ADDR_COMPAT_LAYOUT
MMAP_PAGE_ZERO.

This accentially clearing of personality flags probably does
not matter in practice because no one has complained
but it does make the code more difficult to understand.

Further remaining bug compatible prevents the recomputation from being
removed and replaced by simply computing bprm->cred once from the
final bprm->file.

Making this change removes the last behavior difference between
computing bprm->creds from the final file and recomputing
bprm->cred several times. Which allows this behavior change
to be justified for it's own reasons, and for any but hunts
looking into why the behavior changed to wind up here instead
of in the code that will follow that computes bprm->cred
from the final bprm->file.

This small logic bug appears to have existed since the code
started clearing dangerous personality bits.

History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support")
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-30 10:06:48 +0800

21 May, 2020

6 commits

bc2bf338d exec: Remove recursion from search_binary_handler ... Browse Code »

Recursion in kernel code is generally a bad idea as it can overflow
the kernel stack. Recursion in exec also hides that the code is
looping and that the loop changes bprm->file.

Instead of recursing in search_binary_handler have the methods that
would recurse set bprm->interpreter and return 0. Modify exec_binprm
to loop when bprm->interpreter is set. Consolidate all of the
reassignments of bprm->file in that loop to make it clear what is
going on.

The structure of the new loop in exec_binprm is that all errors return
immediately, while successful completion (ret == 0 &&
!bprm->interpreter) just breaks out of the loop and runs what
exec_bprm has always run upon successful completion.

Fail if the an interpreter is being call after execfd has been set.
The code has never properly handled an interpreter being called with
execfd being set and with reassignments of bprm->file and the
assignment of bprm->executable in generic code it has finally become
possible to test and fail when if this problematic condition happens.

With the reassignments of bprm->file and the assignment of
bprm->executable moved into the generic code add a test to see if
bprm->executable is being reassigned.

In search_binary_handler remove the test for !bprm->file. With all
reassignments of bprm->file moved to exec_binprm bprm->file can never
be NULL in search_binary_handler.

Link: https://lkml.kernel.org/r/87sgfwyd84.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 23:16:57 +0800
b8a61c9e7 exec: Generic execfd support ... Browse Code »

Most of the support for passing the file descriptor of an executable
to an interpreter already lives in the generic code and in binfmt_elf.
Rework the fields in binfmt_elf that deal with executable file
descriptor passing to make executable file descriptor passing a first
class concept.

Move the fd_install from binfmt_misc into begin_new_exec after the new
creds have been installed. This means that accessing the file through
/proc//fd/N is able to see the creds for the new executable
before allowing access to the new executables files.

Performing the install of the executables file descriptor after
the point of no return also means that nothing special needs to
be done on error. The exiting of the process will close all
of it's open files.

Move the would_dump from binfmt_misc into begin_new_exec right
after would_dump is called on the bprm->file. This makes it
obvious this case exists and that no nesting of bprm->file is
currently supported.

In binfmt_misc the movement of fd_install into generic code means
that it's special error exit path is no longer needed.

Link: https://lkml.kernel.org/r/87y2poyd91.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 23:16:57 +0800
8b72ca900 exec: Move the call of prepare_binprm into search_binary_handler ... Browse Code »

The code in prepare_binary_handler needs to be run every time
search_binary_handler is called so move the call into search_binary_handler
itself to make the code simpler and easier to understand.

Link: https://lkml.kernel.org/r/87d070zrvx.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Reviewed-by: Kees Cook
Reviewed-by: James Morris
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 23:16:57 +0800
a16b3357b exec: Allow load_misc_binary to call prepare_binprm unconditionally ... Browse Code »

Add a flag preserve_creds that binfmt_misc can set to prevent
credentials from being updated. This allows binfmt_misc to always
call prepare_binprm. Allowing the credential computation logic to be
consolidated.

Not replacing the credentials with the interpreters credentials is
safe because because an open file descriptor to the executable is
passed to the interpreter. As the interpreter does not need to
reopen the executable it is guaranteed to see the same file that
exec sees.

Ref: c407c033de84 ("[PATCH] binfmt_misc: improve calculation of interpreter's credentials")
Link: https://lkml.kernel.org/r/87imgszrwo.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 23:16:57 +0800
112b71475 exec: Convert security_bprm_set_creds into security_bprm_repopulate_creds ... Browse Code »

Rename bprm->cap_elevated to bprm->active_secureexec and initialize it
in prepare_binprm instead of in cap_bprm_set_creds. Initializing
bprm->active_secureexec in prepare_binprm allows multiple
implementations of security_bprm_repopulate_creds to play nicely with
each other.

Rename security_bprm_set_creds to security_bprm_reopulate_creds to
emphasize that this path recomputes part of bprm->cred. This
recomputation avoids the time of check vs time of use problems that
are inherent in unix #! interpreters.

In short two renames and a move in the location of initializing
bprm->active_secureexec.

Link: https://lkml.kernel.org/r/87o8qkzrxp.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 23:16:50 +0800
b8bff5992 exec: Factor security_bprm_creds_for_exec out of security_bprm_set_creds ... Browse Code »

Today security_bprm_set_creds has several implementations:
apparmor_bprm_set_creds, cap_bprm_set_creds, selinux_bprm_set_creds,
smack_bprm_set_creds, and tomoyo_bprm_set_creds.

Except for cap_bprm_set_creds they all test bprm->called_set_creds and
return immediately if it is true. The function cap_bprm_set_creds
ignores bprm->calld_sed_creds entirely.

Create a new LSM hook security_bprm_creds_for_exec that is called just
before prepare_binprm in __do_execve_file, resulting in a LSM hook
that is called exactly once for the entire of exec. Modify the bits
of security_bprm_set_creds that only want to be called once per exec
into security_bprm_creds_for_exec, leaving only cap_bprm_set_creds
behind.

Remove bprm->called_set_creds all of it's former users have been moved
to security_bprm_creds_for_exec.

Add or upate comments a appropriate to bring them up to date and
to reflect this change.

Link: https://lkml.kernel.org/r/87v9kszrzh.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Acked-by: Casey Schaufler # For the LSM and Smack bits
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 03:45:31 +0800

18 May, 2020

1 commit

b127c16d0 Merge f87d1c955916 ("exec: Move would_dump into flush_old_exec") ... Browse Code »

The change to exec is relevant to the cleanup work I have been doing.

Merge it here so that I can build on top of it, and so hopefully
that other merge logic can pick up on this and see how to deal
with the conflict between that change and my exec cleanup work.

Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-18 20:12:43 +0800

17 May, 2020

1 commit

f87d1c955 exec: Move would_dump into flush_old_exec ... Browse Code »

I goofed when I added mm->user_ns support to would_dump. I missed the
fact that in the case of binfmt_loader, binfmt_em86, binfmt_misc, and
binfmt_script bprm->file is reassigned. Which made the move of
would_dump from setup_new_exec to __do_execve_file before exec_binprm
incorrect as it can result in would_dump running on the script instead
of the interpreter of the script.

The net result is that the code stopped making unreadable interpreters
undumpable. Which allows them to be ptraced and written to disk
without special permissions. Oops.

The move was necessary because the call in set_new_exec was after
bprm->mm was no longer valid.

To correct this mistake move the misplaced would_dump from
__do_execve_file into flos_old_exec, before exec_mmap is called.

I tested and confirmed that without this fix I can attach with gdb to
a script with an unreadable interpreter, and with this fix I can not.

Cc: stable@vger.kernel.org
Fixes: f84df2a6f268 ("exec: Ensure mm->user_ns contains the execed files")
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-17 23:48:24 +0800

12 May, 2020

1 commit

6834e0bb4 exec: Set the point of no return sooner ... Browse Code »

Make the code more robust by marking the point of no return sooner.
This ensures that future code changes don't need to worry about how
they return errors if they are past this point.

This results in no actual change in behavior as __do_execve_file does
not force SIGSEGV when there is a pending fatal signal pending past
the point of no return. Further the only error returns from de_thread
and exec_mmap that can occur result in fatal signals being pending.

Reviewed-by: Kees Cook
Link: https://lkml.kernel.org/r/87sgga5klu.fsf_-_@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-12 01:08:49 +0800