19 May, 2007
2 commits
-
The timerfd was using the unlocked waitqueue operations, but it was
using a different lock, so poll_wait() would race with it.This makes timerfd directly use the waitqueue lock.
Signed-off-by: Davide Libenzi
Signed-off-by: Linus Torvalds -
The eventfd was using the unlocked waitqueue operations, but it was
using a different lock, so poll_wait() would race with it.This makes eventfd directly use the waitqueue lock.
Signed-off-by: Davide Libenzi
Signed-off-by: Linus Torvalds
17 May, 2007
10 commits
-
grow_dev_page() simply passes GFP_NOFS to find_or_create_page. This means
the allocation of radix tree nodes is done with GFP_NOFS and the allocation
of a new page is done using GFP_NOFS.The mapping has a flags field that contains the necessary allocation flags
for the page cache allocation. These need to be consulted in order to get
DMA and HIGHMEM allocations etc right. And yes a blockdev could be
allowing Highmem allocations if its a ramdisk.Cc: Hugh Dickins
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
i_mutex on quota files is special. Unlike i_mutexes for other inodes it is
acquired under dqonoff_mutex. Tell lockdep about this lock ranking. Also
comment and code in quota_sync_sb() seem to be bogus (as i_mutex for quota
file can be acquired under dqonoff_mutex). Move truncate_inode_pages()
call under dqonoff_mutex and save some problems with races...Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use zero_user_page() instead of open-coding it.
Signed-off-by: Nate Diller
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make sysctl/kernel/core_pattern and fs/exec.c agree on maximum core
filename size and change it to 128, so that extensive patterns such as
'/local/cores/%e-%h-%s-%t-%p.core' won't result in truncated filename
generation.Signed-off-by: Dan Aloni
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Just thought this is easier to read.
Acked-by: Davide Libenzi
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
Signed-off-by: Christoph Lameter
Cc: David Howells
Cc: Jens Axboe
Cc: Steven French
Cc: Michael Halcrow
Cc: OGAWA Hirofumi
Cc: Miklos Szeredi
Cc: Steven Whitehouse
Cc: Roman Zippel
Cc: David Woodhouse
Cc: Dave Kleikamp
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Cc: Anton Altaparmakov
Cc: Mark Fasheh
Cc: Paul Mackerras
Cc: Christoph Hellwig
Cc: Jan Kara
Cc: David Chinner
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
afs_prepare_write() should not mark a page up to date if it only partially
fills it in, in expectation of the caller filling in the rest prior to calling
commit_write(). commit_write(), however, should mark the page up to date.Signed-off-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix AFS to write back dirty on unmounting. This didn't happen because
afs_super_ops.drop_inode was pointing to generic_delete_inode. Now this
pointer is left set to NULL so that the default behaviour occurs instead.Signed-off-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 May, 2007
1 commit
15 May, 2007
11 commits
-
Move the kfree() call inside the ep_free() function.
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fixes some epoll code comments.
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Changes the rwlock to a spinlock, and drops the use-count variable.
Operations are always bound by the mutex now, so the use-count is no more
needed. For the same reason, the rwlock can become a simple spinlock.Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fixes the epoll single pass code. During the unlocked event delivery (to
userspace) code, the poll callback can re-issue new events, and we must
receive them correctly. Since we loop in a lockless fashion, we want to be
O(nready), and we don't want to flash on/off the spinlock for every event, we
have the poll callback to use a secondary list to queue events while we're
inside the event delivery loop. The rw_semaphore has been turned into a
mutex. This patch also adds the wait-exclusive flag, as suggested by Davi
Arnaut.Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
- fs/lockd/xdr4.c:140:27: warning: incorrect type in argument 2 (different
explicit signedness)
- fs/lockd/xdr4.c:141:27: warning: incorrect type in argument 2 (different
explicit signedness)
- fs/lockd/xdr4.c:432:28: warning: incorrect type in argument 2 (different
explicit signedness)
- fs/lockd/xdr4.c:433:28: warning: incorrect type in argument 2 (different
explicit signedness)
- fs/lockd/xdr4.c:587:20: warning: symbol 'nlm_version4' was not declared.
Should it be static?Signed-off-by: Trond Myklebust
-
- fs/nfs/nfs4xdr.c:2499:42: warning: incorrect type in argument 2
(different signedness)
- fs/nfs/nfs4xdr.c:2658:49: warning: incorrect type in argument 4
(different explicit signedness)
- fs/nfs/nfs4xdr.c:2683:50: warning: incorrect type in argument 4
(different explicit signedness)
- fs/nfs/nfs4xdr.c:3063:68: warning: incorrect type in argument 4
(different explicit signedness)
- fs/nfs/nfs4xdr.c:3065:68: warning: incorrect type in argument 4
(different explicit signedness)- fs/nfs/callback_xdr.c:138:31: warning: incorrect type in argument 2
(different signedness)Signed-off-by: Trond Myklebust
-
- fs/nfs/dir.c:610:8: warning: symbol 'nfs_llseek_dir' was not declared.
Should it be static?
- fs/nfs/dir.c:636:5: warning: symbol 'nfs_fsync_dir' was not declared.
Should it be static?
- fs/nfs/write.c:925:19: warning: symbol 'req' shadows an earlier one
- fs/nfs/write.c:61:6: warning: symbol 'nfs_commit_rcu_free' was not
declared. Should it be static?
- fs/nfs/nfs4proc.c:793:5: warning: symbol 'nfs4_recover_expired_lease'
was not declared. Should it be static?Signed-off-by: Trond Myklebust
-
The XDR code should not depend on the physical allocation size of
structures like nfs4_stateid and nfs4_verifier since those may have to
change at some future date. We therefore replace all uses of
sizeof() with constants like NFS4_VERIFIER_SIZE and NFS4_STATEID_SIZE.This also has the side-effect of fixing some warnings of the type
format ‘%u’ expects type ‘unsigned int’, but argument X has type
‘long unsigned int’
on 64-bit systemsSigned-off-by: Trond Myklebust
-
Use zero_user_page() instead of the newly deprecated memclear_highpage_flush().
Signed-off-by: Nate Diller
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust -
reclaimer() calls allow_signal() which plays with parent process's ->sighand.
Signed-off-by: Oleg Nesterov
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust -
nlmsvc_timeout is already in units of HZ...
Signed-off-by: Trond Myklebust
13 May, 2007
1 commit
-
Use zero_user_page() instead of open-coding it.
[akpm@linux-foundation.org: kmap-type fixes]
Signed-off-by: Nate Diller
Acked-by: Anton Altaparmakov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 May, 2007
1 commit
-
* 'audit.b38' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] Abnormal End of Processes
[PATCH] match audit name data
[PATCH] complete message queue auditing
[PATCH] audit inode for all xattr syscalls
[PATCH] initialize name osid
[PATCH] audit signal recipients
[PATCH] add SIGNAL syscall class (v3)
[PATCH] auditing ptrace
11 May, 2007
14 commits
-
Re-arrange epoll code to avoid static functions pre-declarations, and apply
akpm-filter on it.Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Epoll is either compiled it, or not (if EMBEDDED). Remove the module code
and use fs_initcall().Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Cut out lots of code from epoll, by reusing the anonymous inode source
patch (fs/anon_inodes.c).Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is an example about how to add eventfd support to the current KAIO code,
in order to enable KAIO to post readiness events to a pollable fd (hence
compatible with POSIX select/poll). The KAIO code simply signals the eventfd
fd when events are ready, and this triggers a POLLIN in the fd. This patch
uses a reserved for future use member of the struct iocb to pass an eventfd
file descriptor, that KAIO will use to post events every time a request
completes. At that point, an aio_getevents() will return the completed result
to a struct io_event. I made a quick test program to verify the patch, and it
runs fine here:http://www.xmailserver.org/eventfd-aio-test.c
The test program uses poll(2), but it'd, of course, work with select and epoll
too.This can allow to schedule both block I/O and other poll-able devices
requests, and wait for results using select/poll/epoll. In a typical
scenario, an application would submit KAIO request using aio_submit(), and
will also use epoll_ctl() on the whole other class of devices (that with the
addition of signals, timers and user events, now it's pretty much complete),
and then would:epoll_wait(...);
for_each_event {
if (curr_event_is_kaiofd) {
aio_getevents();
dispatch_aio_events();
} else {
dispatch_epoll_event();
}
}Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is a very simple and light file descriptor, that can be used as event
wait/dispatch by userspace (both wait and dispatch) and by the kernel
(dispatch only). It can be used instead of pipe(2) in all cases where those
would simply be used to signal events. Their kernel overhead is much lower
than pipes, and they do not consume two fds. When used in the kernel, it can
offer an fd-bridge to enable, for example, functionalities like KAIO or
syslets/threadlets to signal to an fd the completion of certain operations.
But more in general, an eventfd can be used by the kernel to signal readiness,
in a POSIX poll/select way, of interfaces that would otherwise be incompatible
with it. The API is:int eventfd(unsigned int count);
The eventfd API accepts an initial "count" parameter, and returns an eventfd
fd. It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and write(2).The POLLIN flag is raised when the internal counter is greater than zero.
The POLLOUT flag is raised when at least a value of "1" can be written to the
internal counter.The POLLERR flag is raised when an overflow in the counter value is detected.
The write(2) operation can never overflow the counter, since it blocks (unless
O_NONBLOCK is set, in which case -EAGAIN is returned).But the eventfd_signal() function can do it, since it's supposed to not sleep
during its operation.The read(2) function reads the __u64 counter value, and reset the internal
value to zero. If the value read is equal to (__u64) -1, an overflow happened
on the internal counter (due to 2^64 eventfd_signal() posts that has never
been retired - unlickely, but possible).The write(2) call writes an __u64 count value, and adds it to the current
counter. The eventfd fd supports O_NONBLOCK also.On the kernel side, we have:
struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, unsigned int n);The eventfd_fget() should be called to get a struct file* from an eventfd fd
(this is an fget() + check of f_op being an eventfd fops pointer).The kernel can then call eventfd_signal() every time it wants to post an event
to userspace. The eventfd_signal() function can be called from any context.
An eventfd() simple test and bench is available here:http://www.xmailserver.org/eventfd-bench.c
This is the eventfd-based version of pipetest-4 (pipe(2) based):
http://www.xmailserver.org/pipetest-4.c
Not that performance matters much in the eventfd case, but eventfd-bench
shows almost as double as performance than pipetest-4.[akpm@linux-foundation.org: fix i386 build]
[akpm@linux-foundation.org: add sys_eventfd to sys_ni.c]
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch implements the necessary compat code for the timerfd system call.
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch introduces a new system call for timers events delivered though
file descriptors. This allows timer event to be used with standard POSIX
poll(2), select(2) and read(2). As a consequence of supporting the Linux
f_op->poll subsystem, they can be used with epoll(2) too.The system call is defined as:
int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr);
The "ufd" parameter allows for re-use (re-programming) of an existing timerfd
w/out going through the close/open cycle (same as signalfd). If "ufd" is -1,
s new file descriptor will be created, otherwise the existing "ufd" will be
re-programmed.The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME. The time
specified in the "utmr->it_value" parameter is the expiry time for the timer.If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute time,
otherwise it's a relative time.If the time specified in the "utmr->it_interval" is not zero (.tv_sec == 0,
tv_nsec == 0), this is the period at which the following ticks should be
generated.The "utmr->it_interval" should be set to zero if only one tick is requested.
Setting the "utmr->it_value" to zero will disable the timer, or will create a
timerfd without the timer enabled.The function returns the new (or same, in case "ufd" is a valid timerfd
descriptor) file, or -1 in case of error.As stated before, the timerfd file descriptor supports poll(2), select(2) and
epoll(2). When a timer event happened on the timerfd, a POLLIN mask will be
returned.The read(2) call can be used, and it will return a u32 variable holding the
number of "ticks" that happened on the interface since the last call to
read(2). The read(2) call supportes the O_NONBLOCK flag too, and EAGAIN will
be returned if no ticks happened.A quick test program, shows timerfd working correctly on my amd64 box:
http://www.xmailserver.org/timerfd-test.c
[akpm@linux-foundation.org: add sys_timerfd to sys_ni.c]
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch implements the necessary compat code for the signalfd system call.
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch series implements the new signalfd() system call.
I took part of the original Linus code (and you know how badly it can be
broken :), and I added even more breakage ;) Signals are fetched from the same
signal queue used by the process, so signalfd will compete with standard
kernel delivery in dequeue_signal(). If you want to reliably fetch signals on
the signalfd file, you need to block them with sigprocmask(SIG_BLOCK). This
seems to be working fine on my Dual Opteron machine. I made a quick test
program for it:http://www.xmailserver.org/signafd-test.c
The signalfd() system call implements signal delivery into a file descriptor
receiver. The signalfd file descriptor if created with the following API:int signalfd(int ufd, const sigset_t *mask, size_t masksize);
The "ufd" parameter allows to change an existing signalfd sigmask, w/out going
to close/create cycle (Linus idea). Use "ufd" == -1 if you want a brand new
signalfd file.The "mask" allows to specify the signal mask of signals that we are interested
in. The "masksize" parameter is the size of "mask".The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can use
used together with epoll(2) too.The read(2) system call will return a "struct signalfd_siginfo" structure in
the userspace supplied buffer. The return value is the number of bytes copied
in the supplied buffer, or -1 in case of error. The read(2) call can also
return 0, in case the sighand structure to which the signalfd was attached,
has been orphaned. The O_NONBLOCK flag is also supported, and read(2) will
return -EAGAIN in case no signal is available.If the size of the buffer passed to read(2) is lower than sizeof(struct
signalfd_siginfo), -EINVAL is returned. A read from the signalfd can also
return -ERESTARTSYS in case a signal hits the process. The format of the
struct signalfd_siginfo is, and the valid fields depends of the (->code &
__SI_MASK) value, in the same way a struct siginfo would:struct signalfd_siginfo {
__u32 signo; /* si_signo */
__s32 err; /* si_errno */
__s32 code; /* si_code */
__u32 pid; /* si_pid */
__u32 uid; /* si_uid */
__s32 fd; /* si_fd */
__u32 tid; /* si_fd */
__u32 band; /* si_band */
__u32 overrun; /* si_overrun */
__u32 trapno; /* si_trapno */
__s32 status; /* si_status */
__s32 svint; /* si_int */
__u64 svptr; /* si_ptr */
__u64 utime; /* si_utime */
__u64 stime; /* si_stime */
__u64 addr; /* si_addr */
};[akpm@linux-foundation.org: fix signalfd_copyinfo() on i386]
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch add an anonymous inode source, to be used for files that need
and inode only in order to create a file*. We do not care of having an
inode for each file, and we do not even care of having different names in
the associated dentries (dentry names will be same for classes of file*).
This allow code reuse, and will be used by epoll, signalfd and timerfd
(and whatever else there'll be).Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make autofs container-friendly by caching struct pid reference rather than
pid_t and using pid_nr() to retreive a task's pid_t.ChangeLog:
- Fix Eric Biederman's comments - Use find_get_pid() to hold a
reference to oz_pgrp and release while unmounting; separate out
changes to autofs and autofs4.
- Fix Cedric's comments: retain old prototype of parse_options()
and move necessary change to its caller.Signed-off-by: Sukadev Bhattiprolu
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc: Eric Biederman
Cc: containers@lists.osdl.org
Acked-by: Eric W. Biederman
Cc: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix coding style errors (extra spaces, long lines) in autofs and autofs4 files
being modified for container/pidspace issues.Signed-off-by: Sukadev Bhattiprolu
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc:
Cc: Eric W. Biederman
Cc: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
attach_pid() currently takes a pid_t and then uses find_pid() to find the
corresponding struct pid. Sometimes we already have the struct pid. We can
then skip find_pid() if attach_pid() were to take a struct pid parameter.Signed-off-by: Sukadev Bhattiprolu
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Clean up massive code duplication between mpage_writepages() and
generic_writepages().The new generic function, write_cache_pages() takes a function pointer
argument, which will be called for each page to be written.Maybe cifs_writepages() too can use this infrastructure, but I'm not
touching that with a ten-foot pole.The upcoming page writeback support in fuse will also want this.
Signed-off-by: Miklos Szeredi
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds