10 Jun, 2016

1 commit

  • timerfd gives processes a way to set wake alarms, but unlike timers made using
    timer_create, timerfds don't check whether the process has CAP_WAKE_ALARM
    before setting alarm-time timers. CAP_WAKE_ALARM is supposed to gate this
    behavior and so it makes sense that we should deny permission to create such
    timerfds if the process doesn't have this capability.

    Signed-off-by: Eric Caruso
    Cc: Todd Poynor
    Link: http://lkml.kernel.org/r/1465427339-96209-1-git-send-email-ejcaruso@chromium.org
    Signed-off-by: Thomas Gleixner

    Eric Caruso
     

17 Jan, 2016

1 commit

  • Helge reported that a relative timer can return a remaining time larger than
    the programmed relative time on parisc and other architectures which have
    CONFIG_TIME_LOW_RES set. This happens because we add a jiffie to the resulting
    expiry time to prevent short timeouts.

    Use the new function hrtimer_expires_remaining_adjusted() to calculate the
    remaining time. It takes that extra added time into account for relative
    timers.

    Reported-and-tested-by: Helge Deller
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160114164159.354500742@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

06 Nov, 2014

1 commit

  • seq_printf functions shouldn't really check the return value.
    Checking seq_has_overflowed() occasionally is used instead.

    Update vfs documentation.

    Link: http://lkml.kernel.org/p/e37e6e7b76acbdcc3bb4ab2a57c8f8ca1ae11b9a.1412031505.git.joe@perches.com

    Cc: David S. Miller
    Cc: Al Viro
    Signed-off-by: Joe Perches
    [ did a few clean ups ]
    Signed-off-by: Steven Rostedt

    Joe Perches
     

27 Aug, 2014

1 commit


24 Jul, 2014

1 commit

  • We have a few other use cases of ktime_get_monotonic_offset() which
    can be optimized with ktime_mono_to_real(). The timerfd code uses the
    offset only for comparison, so we can use ktime_mono_to_real(0) for
    this as well.

    Funny enough text size shrinks with that on ARM and x8664 !?

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     

18 Jul, 2014

2 commits

  • The read() of timerfd files allows to fetch the number of timer ticks
    while there is no way to set it back from userspace.

    To restore the timer's state as it was at checkpoint moment we need
    a path to bring @ticks back. Initially I thought about writing ticks
    back via write() interface but it seems such API is somehow obscure.

    Instead implement timerfd_ioctl() method with TFD_IOC_SET_TICKS
    command which allows to adjust @ticks into non-zero value waking
    up the waiters.

    I wrapped code with CONFIG_CHECKPOINT_RESTORE which can be
    dropped off if there users except c/r camp appear.

    v2 (by akpm@):
    - Use define timerfd_ioctl NULL for non c/r config

    v3:
    - Use copy_from_user for @ticks fetching since
    not all arch support get_user for 8 byte argument

    Signed-off-by: Cyrill Gorcunov
    Cc: Andrew Morton
    Cc: Michael Kerrisk
    Cc: Andrey Vagin
    Cc: Arnd Bergmann
    Cc: Christopher Covington
    Cc: Pavel Emelyanov
    Cc: Vladimir Davydov
    Link: http://lkml.kernel.org/r/20140715215703.285617923@openvz.org
    Signed-off-by: Thomas Gleixner

    Cyrill Gorcunov
     
  • For checkpoint/restore of timerfd files we need to know how exactly
    the timer were armed, to be able to recreate it on restore stage.
    Thus implement show_fdinfo method which provides enough information
    for that.

    One of significant changes I think is the addition of @settime_flags
    member. Currently there are two flags TFD_TIMER_ABSTIME and
    TFD_TIMER_CANCEL_ON_SET, and the second can be found from
    @might_cancel variable but in case if the flags will be extended
    in future we most probably will have to somehow remember them
    explicitly anyway so I guss doing that right now won't hurt.

    To not bloat the timerfd_ctx structure I've converted @expired
    to short integer and defined @settime_flags as short too.

    v2 (by avagin@, vdavydov@ and tglx@):

    - Add it_value/it_interval fields
    - Save flags being used in timerfd_setup in context

    v3 (by tglx@):
    - don't forget to use CONFIG_PROC_FS

    v4 (by akpm@):
    -Use define timerfd_show NULL for non c/r config

    Signed-off-by: Cyrill Gorcunov
    Cc: Andrew Morton
    Cc: Michael Kerrisk
    Cc: Andrey Vagin
    Cc: Pavel Emelyanov
    Cc: Vladimir Davydov
    Link: http://lkml.kernel.org/r/20140715215703.114365649@openvz.org
    Signed-off-by: Thomas Gleixner

    Cyrill Gorcunov
     

24 Jan, 2014

1 commit


30 May, 2013

1 commit

  • Add support for clocks CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM,
    thereby enabling wakeup alarm timers via file descriptors.

    Signed-off-by: Todd Poynor
    Signed-off-by: John Stultz

    Todd Poynor
     

02 Mar, 2013

1 commit


04 Feb, 2013

1 commit


27 Sep, 2012

2 commits


14 Jun, 2011

1 commit

  • Currently processes waiting with poll on cancelable timerfd timers are
    not woken up when the timers are canceled. When the system time is set
    the clock_was_set() function calls timerfd_clock_was_set() to cancel
    and wake up processes waiting on potential cancelable timerfd
    timers. However the wake up currently has no effect because in the
    case of timerfd_read it is dependent on ctx->ticks not being
    0. timerfd_poll also requires ctx->ticks being non zero. As a
    consequence processes waiting on cancelable timers only get woken up
    when the timers expire. This patch fixes this by incrementing
    ctx->ticks before calling wake_up.

    Signed-off-by: Max Asbock
    Cc: kay.sievers@vrfy.org
    Cc: virtuoso@slind.org
    Cc: johnstul
    Link: http://lkml.kernel.org/r/1307985512.4710.41.camel@w-amax.beaverton.ibm.com
    Signed-off-by: Thomas Gleixner

    Max Asbock
     

23 May, 2011

1 commit

  • Peter is concerned about the extra scan of CLOCK_REALTIME_COS in the
    timer interrupt. Yes, I did not think about it, because the solution
    was so elegant. I didn't like the extra list in timerfd when it was
    proposed some time ago, but with a rcu based list the list walk it's
    less horrible than the original global lock, which was held over the
    list iteration.

    Requested-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     

03 May, 2011

1 commit

  • Some applications must be aware of clock realtime being set
    backward. A simple example is a clock applet which arms a timer for
    the next minute display. If clock realtime is set backward then the
    applet displays a stale time for the amount of time which the clock
    was set backwards. Due to that applications poll the time because we
    don't have an interface.

    Extend the timerfd interface by adding a flag which puts the timer
    onto a different internal realtime clock. All timers on this clock are
    expired whenever the clock was set.

    The timerfd core records the monotonic offset when the timer is
    created. When the timer is armed, then the current offset is compared
    to the previous recorded offset. When it has changed, then
    timerfd_settime returns -ECANCELED. When a timer is read the offset is
    compared and if it changed -ECANCELED returned to user space. Periodic
    timers are not rearmed in the cancelation case.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Chris Friesen
    Tested-by: Kay Sievers
    Cc: "Kirill A. Shutemov"
    Cc: Peter Zijlstra
    Cc: Davide Libenzi
    Reviewed-by: Alexander Shishkin
    Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104271359580.3323%40ionos%3E
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

15 Oct, 2010

1 commit

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek.

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
     

21 May, 2010

1 commit

  • This patch modifies the fs/timerfd.c to use the newly created
    wait_event_interruptible_locked_irq() macro. This replaces an open
    code implementation with a single macro call.

    Signed-off-by: Michal Nazarewicz
    Cc: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Alexander Viro
    Cc: Thomas Gleixner
    Cc: Roland Dreier
    Cc: Tejun Heo
    Cc: Christoph Lameter
    Cc: Davide Libenzi
    Signed-off-by: Greg Kroah-Hartman

    Michal Nazarewicz
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

23 Dec, 2009

1 commit

  • It seems a couple places such as arch/ia64/kernel/perfmon.c and
    drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile()
    instead of a private pseudo-fs + alloc_file(), if only there were a way
    to get a read-only file. So provide this by having anon_inode_getfile()
    create a read-only file if we pass O_RDONLY in flags.

    Signed-off-by: Roland Dreier
    Signed-off-by: Al Viro

    Roland Dreier
     

19 Feb, 2009

1 commit

  • As requested by Michael, add a missing check for valid flags in
    timerfd_settime(), and make it return EINVAL in case some extra bits are
    set.

    Michael said:
    If this is to be any use to userland apps that want to check flag
    support (perhaps it is too late already), then the sooner we get it
    into the kernel the better: 2.6.29 would be good; earlier stables as
    well would be even better.

    [akpm@linux-foundation.org: remove unused TFD_FLAGS_SET]
    Acked-by: Michael Kerrisk
    Signed-off-by: Davide Libenzi
    Cc: [2.6.27.x, 2.6.28.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

14 Jan, 2009

2 commits


06 Sep, 2008

1 commit


25 Jul, 2008

4 commits

  • This patch adds test that ensure the boundary conditions for the various
    constants introduced in the previous patches is met. No code is generated.

    [akpm@linux-foundation.org: fix alpha]
    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • This patch adds support for the TFD_NONBLOCK flag to timerfd_create. The
    additional changes needed are minimal.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include
    #include
    #include
    #include
    #include

    #ifndef __NR_timerfd_create
    # ifdef __x86_64__
    # define __NR_timerfd_create 283
    # elif defined __i386__
    # define __NR_timerfd_create 322
    # else
    # error "need __NR_timerfd_create"
    # endif
    #endif

    #define TFD_NONBLOCK O_NONBLOCK

    int
    main (void)
    {
    int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
    if (fd == -1)
    {
    puts ("timerfd_create(0) failed");
    return 1;
    }
    int fl = fcntl (fd, F_GETFL);
    if (fl == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if (fl & O_NONBLOCK)
    {
    puts ("timerfd_create(0) set non-blocking mode");
    return 1;
    }
    close (fd);

    fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_NONBLOCK);
    if (fd == -1)
    {
    puts ("timerfd_create(TFD_NONBLOCK) failed");
    return 1;
    }
    fl = fcntl (fd, F_GETFL);
    if (fl == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if ((fl & O_NONBLOCK) == 0)
    {
    puts ("timerfd_create(TFD_NONBLOCK) set non-blocking mode");
    return 1;
    }
    close (fd);

    puts ("OK");

    return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • The timerfd_create syscall already has a flags parameter. It just is
    unused so far. This patch changes this by introducing the TFD_CLOEXEC
    flag to set the close-on-exec flag for the returned file descriptor.

    A new name TFD_CLOEXEC is introduced which in this implementation must
    have the same value as O_CLOEXEC.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include
    #include
    #include
    #include
    #include

    #ifndef __NR_timerfd_create
    # ifdef __x86_64__
    # define __NR_timerfd_create 283
    # elif defined __i386__
    # define __NR_timerfd_create 322
    # else
    # error "need __NR_timerfd_create"
    # endif
    #endif

    #define TFD_CLOEXEC O_CLOEXEC

    int
    main (void)
    {
    int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
    if (fd == -1)
    {
    puts ("timerfd_create(0) failed");
    return 1;
    }
    int coe = fcntl (fd, F_GETFD);
    if (coe == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if (coe & FD_CLOEXEC)
    {
    puts ("timerfd_create(0) set close-on-exec flag");
    return 1;
    }
    close (fd);

    fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC);
    if (fd == -1)
    {
    puts ("timerfd_create(TFD_CLOEXEC) failed");
    return 1;
    }
    coe = fcntl (fd, F_GETFD);
    if (coe == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if ((coe & FD_CLOEXEC) == 0)
    {
    puts ("timerfd_create(TFD_CLOEXEC) set close-on-exec flag");
    return 1;
    }
    close (fd);

    puts ("OK");

    return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • This patch just extends the anon_inode_getfd interface to take an additional
    parameter with a flag value. The flag value is passed on to
    get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag.

    No actual semantic changes here, the changed callers all pass 0 for now.

    [akpm@linux-foundation.org: KVM fix]
    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     

02 May, 2008

1 commit

  • a) none of the callers even looks at inode or file returned by anon_inode_getfd()
    b) any caller that would try to look at those would be racy, since by the time
    it returns we might have raced with close() from another thread and that
    file would be pining for fjords.

    Signed-off-by: Al Viro

    Al Viro
     

29 Apr, 2008

1 commit


06 Feb, 2008

1 commit

  • This is the new timerfd API as it is implemented by the following patch:

    int timerfd_create(int clockid, int flags);
    int timerfd_settime(int ufd, int flags,
    const struct itimerspec *utmr,
    struct itimerspec *otmr);
    int timerfd_gettime(int ufd, struct itimerspec *otmr);

    The timerfd_create() API creates an un-programmed timerfd fd. The "clockid"
    parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.

    The timerfd_settime() API give new settings by the timerfd fd, by optionally
    retrieving the previous expiration time (in case the "otmr" parameter is not
    NULL).

    The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
    is set in the "flags" parameter. Otherwise it's a relative time.

    The timerfd_gettime() API returns the next expiration time of the timer, or
    {0, 0} if the timerfd has not been set yet.

    Like the previous timerfd API implementation, read(2) and poll(2) are
    supported (with the same interface). Here's a simple test program I used to
    exercise the new timerfd APIs:

    http://www.xmailserver.org/timerfd-test2.c

    [akpm@linux-foundation.org: coding-style cleanups]
    [akpm@linux-foundation.org: fix ia64 build]
    [akpm@linux-foundation.org: fix m68k build]
    [akpm@linux-foundation.org: fix mips build]
    [akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
    [heiko.carstens@de.ibm.com: fix s390]
    [akpm@linux-foundation.org: fix powerpc build]
    [akpm@linux-foundation.org: fix sparc64 more]
    Signed-off-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: Thomas Gleixner
    Cc: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Cc: Michael Kerrisk
    Cc: Davide Libenzi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

27 Jul, 2007

1 commit

  • Davi fixed a missing cast in the __put_user(), that was making timerfd
    return a single byte instead of the full value.

    Talking with Michael about the timerfd man page, we think it'd be better to
    use a u64 for the returned value, to align it with the eventfd
    implementation.

    This is an ABI change. The timerfd code is new in 2.6.22 and if we merge this
    into 2.6.23 then we should also merge it into 2.6.22.x. That will leave a few
    early 2.6.22 kernels out in the wild which might misbehave when a future
    timerfd-enabled glibc is run on them.

    mtk says: The difference would be that read() will only return 4 bytes, while
    the application will expect 8. If the application is checking the size of
    returned value, as it should, then it will be able to detect the problem (it
    could even be sophisticated enough to know that if this is a 4-byte return,
    then it is running on an old 2.6.22 kernel). If the application is not
    checking the return from read(), then its 8-byte buffer will not be filled --
    the contents of the last 4 bytes will be undefined, so the u64 value as a
    whole will be junk.

    Signed-off-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: Davi Arnaut
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

19 May, 2007

1 commit

  • The timerfd was using the unlocked waitqueue operations, but it was
    using a different lock, so poll_wait() would race with it.

    This makes timerfd directly use the waitqueue lock.

    Signed-off-by: Davide Libenzi
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

11 May, 2007

1 commit

  • This patch introduces a new system call for timers events delivered though
    file descriptors. This allows timer event to be used with standard POSIX
    poll(2), select(2) and read(2). As a consequence of supporting the Linux
    f_op->poll subsystem, they can be used with epoll(2) too.

    The system call is defined as:

    int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr);

    The "ufd" parameter allows for re-use (re-programming) of an existing timerfd
    w/out going through the close/open cycle (same as signalfd). If "ufd" is -1,
    s new file descriptor will be created, otherwise the existing "ufd" will be
    re-programmed.

    The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME. The time
    specified in the "utmr->it_value" parameter is the expiry time for the timer.

    If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute time,
    otherwise it's a relative time.

    If the time specified in the "utmr->it_interval" is not zero (.tv_sec == 0,
    tv_nsec == 0), this is the period at which the following ticks should be
    generated.

    The "utmr->it_interval" should be set to zero if only one tick is requested.
    Setting the "utmr->it_value" to zero will disable the timer, or will create a
    timerfd without the timer enabled.

    The function returns the new (or same, in case "ufd" is a valid timerfd
    descriptor) file, or -1 in case of error.

    As stated before, the timerfd file descriptor supports poll(2), select(2) and
    epoll(2). When a timer event happened on the timerfd, a POLLIN mask will be
    returned.

    The read(2) call can be used, and it will return a u32 variable holding the
    number of "ticks" that happened on the interface since the last call to
    read(2). The read(2) call supportes the O_NONBLOCK flag too, and EAGAIN will
    be returned if no ticks happened.

    A quick test program, shows timerfd working correctly on my amd64 box:

    http://www.xmailserver.org/timerfd-test.c

    [akpm@linux-foundation.org: add sys_timerfd to sys_ni.c]
    Signed-off-by: Davide Libenzi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi