Eric Lee / smarc-fsl-linux-kernel

24 Mar, 2011

1 commit

2e1496707 userns: rename is_owner_or_cap to inode_owner_or_capable

And give it a kernel-doc comment.

[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Daniel Lezcano
Acked-by: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-03-24 10:47:13 +0800

15 Mar, 2011

1 commit

1abf0c718 New kind of open files - "location only".

New flag for open(2) - O_PATH. Semantics:
  * pathname is resolved, but the file itself is _NOT_ opened
    as far as filesystem is concerned.
  * almost all operations on the resulting descriptors shall
    fail with -EBADF. Exceptions are:
      1) operations on descriptors themselves (i.e.
         close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
         fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
         fcntl(fd, F_SETFD, ...))
      2) fcntl(fd, F_GETFL), for a common non-destructive way to
         check if descriptor is open
      3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
         points of pathname resolution
  * closing such descriptor does *NOT* affect dnotify or
    posix locks.
  * permissions are checked as usual along the way to file;
    no permission checks are applied to the file itself. Of course,
    giving such thing to syscall will result in permission checks (at
    the moment it means checking that starting point of ....at() is
    a directory and caller has exec permissions on it).
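
A minimal userspace sketch of the semantics listed above, assuming a Linux
system with O_PATH support and a searchable root directory. The function name
o_path_demo() is hypothetical and not part of the commit; it only exercises
the documented behaviour (I/O fails with EBADF, descriptor-level fcntl() calls
work, and the descriptor can serve as the "dfd" of openat()).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 0 if an O_PATH descriptor behaves as described above, or a
 * negative step number for the first expectation that does not hold.
 */
int o_path_demo(void)
{
    char c;
    int fd;
    int dfd = open("/", O_PATH | O_DIRECTORY);

    if (dfd < 0)
        return -1;

    /* I/O on the descriptor itself is refused with EBADF. */
    if (read(dfd, &c, 1) != -1 || errno != EBADF) {
        close(dfd);
        return -2;
    }

    /* Operations on the descriptor (not the file) still work. */
    if (fcntl(dfd, F_GETFD) < 0 || fcntl(dfd, F_GETFL) < 0) {
        close(dfd);
        return -3;
    }

    /* The descriptor is usable as a starting point for ...at() calls. */
    fd = openat(dfd, ".", O_RDONLY | O_DIRECTORY);
    if (fd < 0) {
        close(dfd);
        return -4;
    }

    close(fd);
    close(dfd);
    return 0;
}
```

Calling o_path_demo() from a test program should return 0 on kernels that
carry this change.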

fget() and fget_light() return NULL on such descriptors; use of
fget_raw() and fget_raw_light() is needed to get them. That protects
existing code from dealing with those things.

There are two things still missing (they come in the next commits):
one is handling of symlinks (right now we refuse to open them that
way; see the next commit for semantics related to those) and another
is descriptor passing via SCM_RIGHTS datagrams.

Signed-off-by: Al Viro

Al Viro
2011-03-15 14:21:45 +0800

03 Feb, 2011

1 commit

3cd90ea42 vfs: sparse: add __FMODE_EXEC

FMODE_EXEC is a constant of type fmode_t but was used with normal integer
constants. This results in the following warnings from sparse. Fix it using
the new macro __FMODE_EXEC.

fs/exec.c:116:58: warning: restricted fmode_t degrades to integer
fs/exec.c:689:58: warning: restricted fmode_t degrades to integer
fs/fcntl.c:777:9: warning: restricted fmode_t degrades to integer

Signed-off-by: Namhyung Kim
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2011-02-03 08:03:19 +0800

28 Oct, 2010

2 commits

55f335a88 fasync: Fix placement of FASYNC flag comment

In commit f7347ce4ee7c ("fasync: re-organize fasync entry insertion to
allow it under a spinlock") Arnd took an earlier patch of mine that had
the comment about the FASYNC flag above the wrong function.

When the fasync_add_entry() function was split to introduce the new
fasync_insert_entry() helper function, the code that actually cares
about the FASYNC bit moved to that new helper.

So just move the comment to the right point.

Signed-off-by: Linus Torvalds

Linus Torvalds
2010-10-28 09:17:02 +0800
f7347ce4e fasync: re-organize fasync entry insertion to allow it under a spinlock

You currently cannot use "fasync_helper()" in an atomic environment to
insert a new fasync entry, because it will need to allocate the new
"struct fasync_struct".

Yet fcntl_setlease() wants to call this under lock_flocks(), which is in
the process of being converted from the BKL to a spinlock.

In order to fix this, this abstracts out the actual fasync list
insertion and the fasync allocations into functions of their own, and
teaches fs/locks.c to pre-allocate the fasync_struct entry. That way
the actual list insertion can happen while holding the required
spinlock.
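
The pre-allocation pattern described above can be sketched in userspace as
follows. This is an illustration only, not the kernel code: struct entry,
entry_alloc(), entry_insert() and entry_add() are hypothetical names, and a
pthread mutex stands in for the spinlock. The allocation (which may block)
happens before the lock is taken, and the insertion under the lock never
allocates.

```c
#include <pthread.h>
#include <stdlib.h>

struct entry {
    int fd;
    struct entry *next;
};

static struct entry *list_head;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* May block, so it must be called before list_lock is taken. */
static struct entry *entry_alloc(void)
{
    return malloc(sizeof(struct entry));
}

/*
 * Called with list_lock held; never allocates. Returns the existing entry
 * if fd is already on the list, otherwise links in new_entry and returns NULL.
 */
static struct entry *entry_insert(int fd, struct entry *new_entry)
{
    struct entry *e;

    for (e = list_head; e != NULL; e = e->next)
        if (e->fd == fd)
            return e;

    new_entry->fd = fd;
    new_entry->next = list_head;
    list_head = new_entry;
    return NULL;
}

/* Returns 1 if an entry was added, 0 if fd was already present, -1 on OOM. */
int entry_add(int fd)
{
    struct entry *existing;
    struct entry *new_entry = entry_alloc();

    if (new_entry == NULL)
        return -1;

    pthread_mutex_lock(&list_lock);
    existing = entry_insert(fd, new_entry);
    pthread_mutex_unlock(&list_lock);

    if (existing != NULL) {
        /* The pre-allocated entry turned out to be unnecessary. */
        free(new_entry);
        return 0;
    }
    return 1;
}
```

The cost of this design is an occasional wasted allocation when the entry
already exists; the benefit is that nothing under the lock can sleep.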

Signed-off-by: Linus Torvalds
[bfields@redhat.com: rebase on top of my changes to Arnd's patch]
Tested-by: J. Bruce Fields
Signed-off-by: Arnd Bergmann

Linus Torvalds
2010-10-28 04:06:17 +0800

10 Sep, 2010

1 commit

3ab04d5cf vfs: take O_NONBLOCK out of the O_* uniqueness test

O_NONBLOCK on parisc has a dual value:

#define O_NONBLOCK 000200004 /* HPUX has separate NDELAY & NONBLOCK */

It is caught by the O_* bits uniqueness check and leads to a parisc
compile error. The fix would be to take O_NONBLOCK out.

Signed-off-by: Wu Fengguang
Signed-off-by: James Bottomley
Cc: Jamie Lokier
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

James Bottomley
2010-09-10 09:57:25 +0800

11 Aug, 2010

1 commit

454eedb89 vfs: O_* bit numbers uniqueness check

The O_* bit numbers are defined in 20+ arch/*, and can silently overlap.
Add a compile time check to ensure the uniqueness as suggested by David
Miller.
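
One way to express such a check in plain C11 is sketched below. It is not
necessarily how the kernel implements it: the DEMO_O_* values are made-up
flags, and the test relies on the identity a + b == (a | b) + (a & b), so the
sum of all flags equals their bitwise OR only when no two flags share a bit.
flags_are_disjoint() is a hypothetical runtime counterpart of the same test.

```c
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_O_APPEND   0x0400u
#define DEMO_O_NONBLOCK 0x0800u
#define DEMO_O_DSYNC    0x1000u
#define DEMO_O_DIRECT   0x4000u

/* Compile-time check: the build fails if any two flags overlap. */
_Static_assert((DEMO_O_APPEND + DEMO_O_NONBLOCK + DEMO_O_DSYNC + DEMO_O_DIRECT) ==
               (DEMO_O_APPEND | DEMO_O_NONBLOCK | DEMO_O_DSYNC | DEMO_O_DIRECT),
               "DEMO_O_* flag bits overlap");

/* Runtime version: returns 1 if no two flags share a bit, 0 otherwise. */
int flags_are_disjoint(const unsigned long *flags, size_t n)
{
    unsigned long seen = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (seen & flags[i])
            return 0;
        seen |= flags[i];
    }
    return 1;
}
```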

Signed-off-by: Wu Fengguang
Cc: David Miller
Cc: Stephen Rothwell
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Eric Paris
Cc: Roland Dreier
Cc: Jamie Lokier
Cc: Andreas Schwab
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2010-08-11 23:59:02 +0800

30 Jun, 2010

1 commit

f4985dc71 fs/fcntl.c:kill_fasync_rcu() fa_lock must be IRQ-safe

Fix a lockdep-splat-causing regression introduced by commit 989a2979205d
("fasync: RCU and fine grained locking").

kill_fasync() can be called from both process and hard-irq context, so
fa_lock must be taken with IRQs disabled.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230

Reported-by: Sergey Senozhatsky
Reported-by: Dominik Brodowski
Tested-by: Dominik Brodowski
Cc: Maciej Rutecki
Acked-by: Eric Dumazet
Cc: Paul E. McKenney
Cc: Lai Jiangshan
Cc: "David S. Miller"
Acked-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2010-06-30 06:29:32 +0800

05 Jun, 2010

1 commit

5b54470da fcntl: return -EFAULT if copy_to_user fails

copy_to_user() returns the number of bytes remaining, but we want to
return -EFAULT.
ret = fcntl(fd, F_SETOWN_EX, NULL);
With the original code ret would be 8 here.
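
The difference can be illustrated with a small userspace mock. Everything
here is hypothetical: mock_copy_to_user() only imitates the "returns the
number of bytes not copied" convention, and struct demo_owner is a stand-in
for an 8-byte argument structure.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for an 8-byte structure copied to user space. */
struct demo_owner {
    int type;
    int pid;
};

/* Mock of the copy_to_user() convention: returns bytes NOT copied. */
static unsigned long mock_copy_to_user(void *to, const void *from,
                                       unsigned long n)
{
    if (to == NULL)
        return n;           /* nothing could be copied */
    memcpy(to, from, n);
    return 0;
}

/* Buggy pattern: leaks the "bytes remaining" count as the return value. */
long getown_buggy(struct demo_owner *uarg)
{
    struct demo_owner owner = { 0, 1234 };

    return (long)mock_copy_to_user(uarg, &owner, sizeof(owner));
}

/* Fixed pattern: any short copy is reported as -EFAULT. */
long getown_fixed(struct demo_owner *uarg)
{
    struct demo_owner owner = { 0, 1234 };

    if (mock_copy_to_user(uarg, &owner, sizeof(owner)))
        return -EFAULT;
    return 0;
}
```

With a NULL destination, getown_buggy() returns the structure size while
getown_fixed() returns -EFAULT.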

V2: Takuya Yoshikawa pointed out a similar issue in f_getown_ex()

Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro

Dan Carpenter
2010-06-05 05:16:28 +0800

22 May, 2010

2 commits

ee9a3607f Merge branch 'master' into for-2.6.35

Conflicts:
fs/ext3/fsync.c

Signed-off-by: Jens Axboe

Jens Axboe
2010-05-22 03:27:26 +0800
35f3d14db pipe: add support for shrinking and growing pipes

This patch adds F_GETPIPE_SZ and F_SETPIPE_SZ fcntl() actions for
growing and shrinking the size of a pipe and adjusts pipe.c and splice.c
(and relay and network splice) usage to work with these larger (or smaller)
pipes.
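
A short userspace sketch of the two fcntl() actions, assuming a current Linux
system where the size argument is given in bytes and the kernel may round it
up. resize_pipe() is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Creates a pipe, asks the kernel to resize it to `requested` bytes and
 * returns the capacity the kernel actually reports, or -1 on failure.
 */
int resize_pipe(int requested)
{
    int fds[2];
    int capacity = -1;

    if (pipe(fds) < 0)
        return -1;

    if (fcntl(fds[1], F_SETPIPE_SZ, requested) >= 0)
        capacity = fcntl(fds[1], F_GETPIPE_SZ);

    close(fds[0]);
    close(fds[1]);
    return capacity;
}
```

The reported capacity can be larger than the request, so callers should
compare with >= rather than ==.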

Signed-off-by: Jens Axboe

Jens Axboe
2010-05-22 03:12:40 +0800

22 Apr, 2010

1 commit

989a29792 fasync: RCU and fine grained locking

kill_fasync() uses a central rwlock, candidate for RCU conversion, to
avoid cache line ping pongs on SMP.

fasync_remove_entry() and fasync_add_entry() can disable IRQs for a short
section instead of during the whole list scan.

Use a spinlock per fasync_struct to synchronize kill_fasync_rcu() and
fasync_{remove|add}_entry(). This spinlock is IRQ safe, so sock_fasync()
doesn't need its own implementation and can use fasync_helper(), to
reduce code size and complexity.

We can remove __kill_fasync() direct use in net/socket.c, and rename it
to kill_fasync_rcu().

Signed-off-by: Eric Dumazet
Cc: Paul E. McKenney
Cc: Lai Jiangshan
Signed-off-by: David S. Miller

Eric Dumazet
2010-04-22 07:19:29 +0800

07 Mar, 2010

1 commit

d554ed895 fs: use rlimit helpers

Make sure compiler won't do weird things with limits. E.g. fetching them
twice may return 2 different values after writable limits are implemented.

I.e. either use rlimit helpers added in commit 3e10e716abf3 ("resource:
add helpers for fetching rlimits") or ACCESS_ONCE if not applicable.
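
The "fetch once" idea can be sketched in userspace like this. It is only an
illustration of the pattern, not the kernel helpers: struct demo_limit,
DEMO_READ_ONCE() and clamp_to_limit() are hypothetical names, and the
volatile cast merely forces a single load of the shared value.

```c
/* Hypothetical limit that another thread may rewrite at any time. */
struct demo_limit {
    unsigned long cur;
};

/* Force exactly one load of a shared unsigned long. */
#define DEMO_READ_ONCE(x) (*(volatile unsigned long *)&(x))

/*
 * Clamps `want` to the current limit. The limit is read once into a local,
 * so the comparison and the returned value always agree with each other,
 * even if lim->cur changes concurrently.
 */
unsigned long clamp_to_limit(struct demo_limit *lim, unsigned long want)
{
    unsigned long snapshot = DEMO_READ_ONCE(lim->cur);

    if (want > snapshot)
        return snapshot;
    return want;
}
```

Reading lim->cur twice (once for the test, once for the result) would let the
two reads disagree once limits become writable from elsewhere.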

Signed-off-by: Jiri Slaby
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jiri Slaby
2010-03-07 03:26:29 +0800

08 Feb, 2010

1 commit

80e1e8239 Fix race in tty_fasync() properly

This reverts commit 703625118069 ("tty: fix race in tty_fasync") and
commit b04da8bfdfbb ("fnctl: f_modown should call write_lock_irqsave/
restore") that tried to fix up some of the fallout but was incomplete.

It turns out that we really cannot hold 'tty->ctrl_lock' over calling
__f_setown, because not only did that cause problems with interrupt
disables (which the second commit fixed), it also causes a potential
ABBA deadlock due to lock ordering.

Thanks to Tetsuo Handa for following up on the issue, and running
lockdep to show the problem. It goes roughly like this:

- f_getown gets filp->f_owner.lock for reading without interrupts
disabled, so an interrupt that happens while that lock is held can
cause a lockdep chain from f_owner.lock -> sighand->siglock.

- at the same time, the tty->ctrl_lock -> f_owner.lock chain that
commit 703625118069 introduced, together with the pre-existing
sighand->siglock -> tty->ctrl_lock chain means that we have a lock
dependency the other way too.

So instead of extending tty->ctrl_lock over the whole __f_setown() call,
we now just take a reference to the 'pid' structure while holding the
lock, and then release it after having done the __f_setown. That still
guarantees that 'struct pid' won't go away from under us, which is all
we really ever needed.

Reported-and-tested-by: Tetsuo Handa
Acked-by: Greg Kroah-Hartman
Acked-by: Américo Wang
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Linus Torvalds
2010-02-08 02:26:01 +0800

27 Jan, 2010

1 commit

b04da8bfd fnctl: f_modown should call write_lock_irqsave/restore

Commit 703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown()
should call write_lock_irqsave instead of just write_lock_irq, because
a caller could have a spinlock held and it would not be good to
re-enable interrupts.

Cc: Eric W. Biederman
Cc: Al Viro
Cc: Alan Cox
Cc: Tavis Ormandy
Cc: stable
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Linus Torvalds

Greg Kroah-Hartman
2010-01-27 09:25:38 +0800

17 Dec, 2009

1 commit

53281b6d3 fasync: split 'fasync_helper()' into separate add/remove functions

Yes, the add and remove cases do share the same basic loop and the
locking, but the compiler can inline and then CSE some of the end result
anyway. And splitting it up makes the code way easier to follow,
and makes it clearer exactly what the semantics are.

In particular, we must make sure that the FASYNC flag in file->f_flags
exactly matches the state of "is this file on any fasync list", since
not only is that flag visible to user space (F_GETFL), but we also use
that flag to check whether we need to remove any fasync entries on file
close.

We got that wrong for the case of a mixed use of file locking (which
tries to remove any fasync entries for file leases) and fasync.

Splitting the function up also makes it possible to do some future
optimizations without making the function even messier. In particular,
since the FASYNC flag has to match the state of "is this on a list", we
can do the following future optimizations:

- on remove, we don't even need to get the locks and traverse the list
if FASYNC isn't set, since we can know a priori that there is no
point (this is effectively the same optimization that we already do
in __fput() wrt removing fasync on file close)

- on add, we can use the FASYNC flag to decide whether we are changing
an existing entry or need to allocate a new one.

but this is just the cleanup + fix for the FASYNC flag.

Acked-by: Al Viro
Tested-by: Tavis Ormandy
Cc: Jeff Dike
Cc: Matt Mackall
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Linus Torvalds
2009-12-17 02:05:29 +0800

18 Nov, 2009

1 commit

978b4053a fcntl: rename F_OWNER_GID to F_OWNER_PGRP

This is for consistency with various ioctl() operations that include the
suffix "PGRP" in their names, and also for consistency with PRIO_PGRP,
used with setpriority() and getpriority(). Also, using PGRP instead of
GID avoids confusion with the common abbreviation of "group ID".

I'm fine with anything that makes it more consistent, and if PGRP is what
is the predominant abbreviation then I see no need to further confuse
matters by adding a third one.

Signed-off-by: Peter Zijlstra
Acked-by: Michael Kerrisk
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2009-11-18 09:40:33 +0800

24 Sep, 2009

2 commits

ba0a6c9f6 fcntl: add F_[SG]ETOWN_EX

In order to direct the SIGIO signal to a particular thread of a
multi-threaded application we cannot, like suggested by the manpage, put a
TID into the regular fcntl(F_SETOWN) call. It will still be sent to the
whole process of which that thread is part.

Since people do want to properly direct SIGIO we introduce F_SETOWN_EX.

The need to direct SIGIO comes from self-monitoring profiling such as with
perf-counters. Perf-counters uses SIGIO to notify that new sample data is
available. If the signal is delivered to the same task that generated the
new sample it can augment that data by inspecting the task's user-space
state right after it returns from the kernel. This is esp. convenient
for interpreted or virtual machine driven environments.

Both F_SETOWN_EX and F_GETOWN_EX take a pointer to a struct f_owner_ex
as argument:

struct f_owner_ex {
int type;
pid_t pid;
};

Where type is one of F_OWNER_TID, F_OWNER_PID or F_OWNER_GID.
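
A userspace sketch of the interface, assuming Linux with glibc's _GNU_SOURCE
definitions of struct f_owner_ex and F_OWNER_TID. Note that F_OWNER_GID was
renamed to F_OWNER_PGRP by commit 978b4053a listed above. The function name
owner_ex_roundtrip() is hypothetical; it directs ownership of a pipe's read
end to the calling thread and reads the setting back.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sets the calling thread as owner of a fresh pipe descriptor and checks
 * that F_GETOWN_EX reports the same thread. Returns 0 on success, -1 on
 * any failure.
 */
int owner_ex_roundtrip(void)
{
    struct f_owner_ex set, get;
    int fds[2];
    int ret = -1;

    if (pipe(fds) < 0)
        return -1;

    set.type = F_OWNER_TID;
    set.pid = (pid_t)syscall(SYS_gettid);   /* this thread, not the process */
    get.type = -1;
    get.pid = 0;

    if (fcntl(fds[0], F_SETOWN_EX, &set) == 0 &&
        fcntl(fds[0], F_GETOWN_EX, &get) == 0 &&
        get.type == F_OWNER_TID && get.pid == set.pid)
        ret = 0;

    close(fds[0]);
    close(fds[1]);
    return ret;
}
```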

Signed-off-by: Peter Zijlstra
Reviewed-by: Oleg Nesterov
Tested-by: stephane eranian
Cc: Michael Kerrisk
Cc: Roland McGrath
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2009-09-24 22:21:01 +0800
06f1631a1 signals: send_sigio: use do_send_sig_info() to avoid check_kill_permission()

group_send_sig_info()->check_kill_permission() assumes that current is the
sender and uses current_cred().

This is not true in send_sigio_to_task() case. From the security pov the
sender is not current, but the task which did fcntl(F_SETOWN), that is why
we have sigio_perm() which uses the right creds to check.

Fortunately, send_sigio() always sends either SEND_SIG_PRIV or
SI_FROMKERNEL() signal, so check_kill_permission() does nothing. But
still it would be tidier to avoid this bogus security check and save a
couple of cycles.

Signed-off-by: Oleg Nesterov
Cc: Peter Zijlstra
Cc: stephane eranian
Cc: Ingo Molnar
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-09-24 22:21:01 +0800

13 Jul, 2009

1 commit

405f55712 headers: smp_lock.h redux

* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-07-13 03:22:34 +0800

17 Jun, 2009

2 commits

8eeee4e2f send_sigio_to_task: sanitize the usage of fown->signum

send_sigio_to_task() reads fown->signum several times, we can race with
F_SETSIG which changes ->signum lockless. In theory, this can fool
security checks or we can call group_send_sig_info() with the wrong
->si_signo which does not match "int sig".

Change the code to cache ->signum.

Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-06-17 06:36:17 +0800
2f38d70fb shift current_cred() from __f_setown() to f_modown()

Shift current_cred() from __f_setown() to f_modown(). This reduces
the number of arguments and saves 48 bytes from fs/fcntl.o.

[ Note: this doesn't clear euid/uid when pid is set to NULL. But if
f_owner.pid == NULL we never use f_owner.uid/euid. Otherwise we'd
have a bug anyway: we must not send signals if pid was reset to NULL. ]

Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-06-17 05:19:00 +0800

12 May, 2009

1 commit

2b79bc4f7 dup2: Fix return value with oldfd == newfd and invalid fd

The return value of dup2 when oldfd == newfd and the fd isn't valid is
not getting properly sign extended. We end up with 4294967287 instead
of -EBADF.

I've reproduced this on SLE11 (2.6.27.21), openSUSE Factory
(2.6.29-rc5), and Ubuntu 9.04 (2.6.28).

This patch uses a signed int for the error value so it is properly
extended.
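
The arithmetic behind the bogus value can be reproduced in a few lines of
userspace C, assuming a 32-bit unsigned int and EBADF == 9 as on Linux. The
function names are hypothetical; long long is used so the result does not
depend on the width of long.

```c
#include <errno.h>

/* Error kept in an unsigned variable: widening zero-extends it. */
long long ret_via_unsigned(void)
{
    unsigned int err = -EBADF;   /* wraps to 4294967287 */

    return err;
}

/* Error kept in a signed variable: widening sign-extends it. */
long long ret_via_signed(void)
{
    int err = -EBADF;

    return err;
}
```

ret_via_unsigned() yields 2^32 - 9 = 4294967287, the value quoted above,
while ret_via_signed() yields -EBADF.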

Commit 6c5d0512a091480c9f981162227fdb1c9d70e555 introduced this
regression.

Reported-by: Jiri Dluhos
Signed-off-by: Jeff Mahoney
Signed-off-by: Linus Torvalds

Jeff Mahoney
2009-05-12 03:18:06 +0800

30 Mar, 2009

1 commit

4a6a44996 Fix a lockdep warning in fasync_helper()

Lockdep gripes if file->f_lock is taken in a no-IRQ situation, since that
is not always the case. We don't really want to disable IRQs for every
acquisition of f_lock; instead, just move it outside of fasync_lock.

Reported-by: Bartlomiej Zolnierkiewicz
Reported-by: Larry Finger
Reported-by: Wu Fengguang
Signed-off-by: Jonathan Corbet

Jonathan Corbet
2009-03-30 22:00:24 +0800

16 Mar, 2009

3 commits

60aa49243 Rationalize fasync return values

Most fasync implementations do something like:

return fasync_helper(...);

But fasync_helper() will return a positive value at times - a feature used
in at least one place. Thus, a number of other drivers do:

err = fasync_helper(...);
if (err < 0)
return err;
return 0;

In the interests of consistency and more concise code, it makes sense to
map positive return values onto zero where ->fasync() is called.
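
A sketch of that mapping, done once at the call site. Both functions are
hypothetical: mock_fasync_helper() only imitates a helper that returns a
positive value when it changed something, and normalize_fasync_result() is
the mapping itself.

```c
#include <errno.h>

/*
 * Mock helper: returns 1 when an entry was added, 0 when nothing changed,
 * and a negative errno value on failure (simulated with mode < 0).
 */
int mock_fasync_helper(int mode)
{
    if (mode < 0)
        return -ENOMEM;
    return mode > 0 ? 1 : 0;
}

/* Errors pass through unchanged; every non-negative result becomes 0. */
int normalize_fasync_result(int ret)
{
    return ret < 0 ? ret : 0;
}
```

With the mapping in the caller, an implementation can simply return the
helper's result without its own "if (err < 0)" block.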

Cc: Al Viro
Signed-off-by: Jonathan Corbet

Jonathan Corbet
2009-03-16 22:34:35 +0800
76398425b Move FASYNC bit handling to f_op->fasync()

Removing the BKL from FASYNC handling ran into the challenge of keeping the
setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
the underlying fasync() function. Andi Kleen suggested moving the handling
of that bit into fasync(); this patch does exactly that. As a result, we
have a couple of internal API changes: fasync() must now manage the FASYNC
bit, and it will be called without the BKL held.

As it happens, every fasync() implementation in the kernel with one
exception calls fasync_helper(). So, if we make fasync_helper() set the
FASYNC bit, we can avoid making any changes to the other fasync()
functions - as long as those functions, themselves, have proper locking.
Most fasync() implementations do nothing but call fasync_helper() - which
has its own lock - so they are easily verified as correct. The BKL had
already been pushed down into the rest.

The networking code has its own version of fasync_helper(), so that code
has been augmented with explicit FASYNC bit handling.

Cc: Al Viro
Cc: David Miller
Reviewed-by: Christoph Hellwig
Signed-off-by: Jonathan Corbet

Jonathan Corbet
2009-03-16 22:32:27 +0800
db1dd4d37 Use f_lock to protect f_flags

Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all. This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock. This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.

Reviewed-by: Christoph Hellwig
Cc: Al Viro
Signed-off-by: Jonathan Corbet

Jonathan Corbet
2009-03-16 22:32:27 +0800

14 Jan, 2009

1 commit

a26eab240 [CVE-2009-0029] System call wrappers part 15

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:24 +0800

25 Dec, 2008

1 commit

cbacc2c7f Merge branch 'next' into for-linus

James Morris
2008-12-25 08:40:09 +0800

06 Dec, 2008

1 commit

218d11a8b Fix a race condition in FASYNC handling

Changeset a238b790d5f99c7832f9b73ac8847025815b85f7 (Call fasync()
functions without the BKL) introduced a race which could leave
file->f_flags in a state inconsistent with what the underlying
driver/filesystem believes. Revert that change, and also fix the same
races in ioctl_fioasync() and ioctl_fionbio().

This is a minimal, short-term fix; the real fix will not involve the
BKL.

Reported-by: Oleg Nesterov
Cc: Andi Kleen
Cc: Al Viro
Cc: stable@kernel.org
Signed-off-by: Jonathan Corbet
Signed-off-by: Linus Torvalds

Jonathan Corbet
2008-12-06 07:35:10 +0800

14 Nov, 2008

4 commits

c69e8d9c0 CRED: Use RCU to access another task's creds and to release a task's own creds

Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:19 +0800
86a264abe CRED: Wrap current->cred and a few other accessors

Wrap current->cred and a few other accessors to hide their actual
implementation.

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:18 +0800
b6dff3ec5 CRED: Separate task security context from task_struct

Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:16 +0800
da9592ede CRED: Wrap task credential accesses in the filesystem subsystem

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells
Reviewed-by: James Morris
Acked-by: Serge Hallyn
Cc: Al Viro
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:05 +0800

01 Aug, 2008

2 commits

1b7e190b4 [PATCH] clean dup2() up a bit

Signed-off-by: Al Viro

Al Viro
2008-08-01 23:25:24 +0800
1027abe88 [PATCH] merge locate_fd() and get_unused_fd()

New primitive: alloc_fd(start, flags). get_unused_fd() and
get_unused_fd_flags() become wrappers on top of it.

Signed-off-by: Al Viro

Al Viro
2008-08-01 23:25:23 +0800

27 Jul, 2008

3 commits

4e1e018ec [PATCH] fix RLIM_NOFILE handling

* dup2() should return -EBADF on exceeded sysctl_nr_open
* dup() should *not* return -EINVAL even if you have rlimit set to 0;
it should get -EMFILE instead.

Check for orig_start exceeding rlimit taken to sys_fcntl().
Failing expand_files() in dup{2,3}() now gets -EMFILE remapped to -EBADF.
Consequently, remaining checks for rlimit are taken to expand_files().
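
The user-visible error codes can be observed with a sketch like the one
below, assuming a current Linux kernel and a process that may lower and then
restore its own soft RLIMIT_NOFILE. errno_under_zero_nofile() is a
hypothetical helper; it opens a pipe first, drops the soft limit to 0, and
reports which errno dup() or dup2() produces.

```c
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * With the soft RLIMIT_NOFILE temporarily set to 0, tries dup()
 * (use_dup2 == 0) or dup2() onto another open descriptor (use_dup2 != 0).
 * Returns the resulting errno, 0 if the call unexpectedly succeeded, or -1
 * if the setup itself failed.
 */
int errno_under_zero_nofile(int use_dup2)
{
    struct rlimit saved, zero;
    int fds[2];
    int result = -1;

    if (pipe(fds) < 0)
        return -1;

    if (getrlimit(RLIMIT_NOFILE, &saved) == 0) {
        zero = saved;
        zero.rlim_cur = 0;
        if (setrlimit(RLIMIT_NOFILE, &zero) == 0) {
            int d = use_dup2 ? dup2(fds[0], fds[1]) : dup(fds[0]);

            result = (d < 0) ? errno : 0;
            /* Only a descriptor newly created by dup() needs closing. */
            if (d >= 0 && !use_dup2)
                close(d);
            setrlimit(RLIMIT_NOFILE, &saved);   /* restore the soft limit */
        }
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On such a system dup() is expected to fail with EMFILE and dup2() with EBADF,
matching the two bullet points above.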

Signed-off-by: Al Viro

Al Viro
2008-07-27 08:53:45 +0800
6c5d0512a [PATCH] get rid of corner case in dup3() entirely

Since Ulrich is OK with getting rid of dup3(fd, fd, flags) completely,
to hell the damn thing goes. Corner case for dup2() is handled in
sys_dup2() (complete with -EBADF if dup2(fd, fd) is called with fd
that is not open), the rest is done in dup3().
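
The resulting behaviour can be checked from userspace with a sketch like
this, assuming Linux with glibc's dup3() wrapper, where the dup3(fd, fd, ...)
case is rejected with EINVAL. dup_same_fd_demo() is a hypothetical name.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 0 if dup2()/dup3() handle oldfd == newfd as described above,
 * -1 if the setup failed, -2 if any expectation did not hold.
 */
int dup_same_fd_demo(void)
{
    int fds[2];
    int fd;
    int ok = 1;

    if (pipe(fds) < 0)
        return -1;
    fd = fds[0];

    /* dup3() refuses oldfd == newfd outright. */
    if (dup3(fd, fd, 0) != -1 || errno != EINVAL)
        ok = 0;

    /* dup2() on a valid descriptor is a no-op that returns the descriptor. */
    if (dup2(fd, fd) != fd)
        ok = 0;

    close(fds[0]);
    close(fds[1]);

    /* Once the descriptor is closed, dup2(fd, fd) must fail with EBADF. */
    if (dup2(fd, fd) != -1 || errno != EBADF)
        ok = 0;

    return ok ? 0 : -2;
}
```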

Signed-off-by: Al Viro

Al Viro
2008-07-27 08:53:44 +0800
3c333937e [PATCH] dup3 fix

Al Viro noticed one corner case that the new dup3() code misses. The dup2()
function, as a special case, handles dup-ing to the same file
descriptor. In this case the current dup3() code does nothing at
all. I.e., it ignores the flags parameter. This shouldn't happen;
the close-on-exec flag should be set if requested.

In case the O_CLOEXEC bit in the flags parameter is not set the
dup3() function should behave in this respect identical to dup2().
This means dup3(fd, fd, 0) should not actively reset the c-o-e
flag.

The patch below implements this minor change.

[AV: credits to Artur Grabowski for bringing that up as potential subtle point
in dup2() behaviour]

Signed-off-by: Ulrich Drepper
Signed-off-by: Al Viro

Ulrich Drepper
2008-07-27 08:53:39 +0800

25 Jul, 2008

1 commit

336dd1f70 flag parameters: dup2

This patch adds the new dup3 syscall. It extends the old dup2 syscall by one
parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag
is added in this patch.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_dup3
# ifdef __x86_64__
#  define __NR_dup3 292
# elif defined __i386__
#  define __NR_dup3 330
# else
#  error "need __NR_dup3"
# endif
#endif

int
main (void)
{
  /* Duplicate stdout onto descriptor 4 without any flags.  */
  int fd = syscall (__NR_dup3, 1, 4, 0);
  if (fd == -1)
    {
      puts ("dup3(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("dup3(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  /* Same again, this time requesting close-on-exec.  */
  fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
  if (fd == -1)
    {
      puts ("dup3(O_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("dup3(O_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper
Acked-by: Davide Libenzi
Cc: Michael Kerrisk
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ulrich Drepper
2008-07-25 01:47:28 +0800