14 Jan, 2020

1 commit

  • timerfd_settime() accepts an absolute value of the expiration time if
    TFD_TIMER_ABSTIME is specified. This value is in the task's time namespace
    and has to be converted to the host's time namespace.

    Co-developed-by: Dmitry Safonov
    Signed-off-by: Andrei Vagin
    Signed-off-by: Dmitry Safonov
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/20191112012724.250792-14-dima@arista.com

    Andrei Vagin
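
The conversion described above can be sketched in plain C. This is an illustration only, not the kernel's code: the helper name `abs_expiry_to_host_ns` is hypothetical, times are plain nanosecond counts, and the sketch assumes that a namespace clock equals the host clock plus a per-namespace offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an absolute expiry time (nanoseconds) from
 * a task's time namespace to the host's time namespace.
 * Assumption: namespace_time = host_time + offset_ns.
 */
static int64_t abs_expiry_to_host_ns(int64_t ns_expiry, int64_t offset_ns)
{
    assert(ns_expiry >= 0);     /* absolute expiry times are non-negative */

    if (ns_expiry < offset_ns)
        return 0;               /* already in the past on the host clock */

    /* Subtract as unsigned so that a negative offset cannot overflow. */
    uint64_t host = (uint64_t)ns_expiry - (uint64_t)offset_ns;

    /* Saturate at the largest representable signed value. */
    return host > (uint64_t)INT64_MAX ? INT64_MAX : (int64_t)host;
}
```

With an offset of 2000 ns, an expiry of 5000 ns in the namespace corresponds to 3000 ns on the host, while an expiry that lies before the offset is treated as already expired.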
     

15 Nov, 2019

1 commit

  • timerfd_show() uses a 'struct itimerspec' internally, but that is
    deprecated because of the time_t overflow and a conflict with the glibc
    type of the same name that is now incompatible in user space.

    Use a pair of timespec64 variables instead as a simple replacement.

    This removes the last use of itimerspec from the kernel, allowing the
    removal of the definition from the uapi headers along with timespec and
    timeval later.

    Reviewed-by: Thomas Gleixner
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

02 Aug, 2019

1 commit

  • Use the hrtimer_cancel_wait_running() synchronization mechanism to prevent
    priority inversion and live locks on PREEMPT_RT.

    [ tglx: Split out of combo patch ]

    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20190730223828.600085866@linutronix.de

    Anna-Maria Gleixner
     

07 Feb, 2019

1 commit

  • A lot of system calls that pass a time_t somewhere have an implementation
    using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
    been reworked so that this implementation can now be used on 32-bit
    architectures as well.

    The missing step is to redefine them using the regular SYSCALL_DEFINEx()
    to get them out of the compat namespace and make it possible to build them
    on 32-bit architectures.

    Any system call that ends in 'time' gets a '32' suffix on its name for
    that version, while the others get a '_time32' suffix, to distinguish
    them from the normal version, which takes a 64-bit time argument in the
    future.

    In this step, only 64-bit architectures are changed; doing this rename
    first lets us avoid touching the 32-bit architectures twice.

    Acked-by: Catalin Marinas
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
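
The suffix rule from the commit message above can be written down as a small helper. This is a sketch for illustration; `time32_name` is a hypothetical function and not something the kernel provides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: derive the name of the 32-bit time_t variant of a
 * system call. Names ending in "time" get "32" appended, all others get
 * "_time32".
 */
static void time32_name(const char *name, char *out, size_t outlen)
{
    size_t len = strlen(name);
    int ends_in_time = len >= 4 && strcmp(name + len - 4, "time") == 0;

    assert(outlen > 0);
    snprintf(out, outlen, "%s%s", name, ends_in_time ? "32" : "_time32");
}
```

Under this rule `timerfd_settime` becomes `timerfd_settime32`, while a name such as `nanosleep` becomes `nanosleep_time32`.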
     

27 Aug, 2018

1 commit

  • Christoph Hellwig suggested a slightly different path for handling
    backwards compatibility with the 32-bit time_t based system calls:

    Rather than simply reusing the compat_sys_* entry points on 32-bit
    architectures unchanged, we get rid of those entry points and the
    compat_time types by renaming them to something that makes more sense
    on 32-bit architectures (which don't have a compat mode otherwise),
    and then share the entry points under the new name with the 64-bit
    architectures that use them for implementing the compatibility.

    The following types and interfaces are renamed here, and moved
    from linux/compat_time.h to linux/time32.h:

    old                         new
    ---                         ---
    compat_time_t               old_time32_t
    struct compat_timeval       struct old_timeval32
    struct compat_timespec      struct old_timespec32
    struct compat_itimerspec    struct old_itimerspec32
    ns_to_compat_timeval()      ns_to_old_timeval32()
    get_compat_itimerspec64()   get_old_itimerspec32()
    put_compat_itimerspec64()   put_old_itimerspec32()
    compat_get_timespec64()     get_old_timespec32()
    compat_put_timespec64()     put_old_timespec32()

    As we already have aliases in place, this patch addresses only the
    instances that are relevant to the system call interface in particular,
    not those that occur in device drivers and other modules. Those
    will get handled separately, while providing the 64-bit version
    of the respective interfaces.

    I'm not renaming the timex, rusage and itimerval structures, as we are
    still debating what the new interface will look like, and whether we
    will need a replacement at all.

    This also doesn't change the names of the syscall entry points, which can
    be done more easily when we actually switch over the 32-bit architectures
    to use them. At that point we need to change COMPAT_SYSCALL_DEFINEx to
    SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix.

    Suggested-by: Christoph Hellwig
    Link: https://lore.kernel.org/lkml/20180705222110.GA5698@infradead.org/
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

14 Aug, 2018

1 commit


06 Aug, 2018

1 commit


13 Jul, 2018

1 commit


29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

24 Jun, 2018

1 commit

  • timer_set/gettime and timerfd_set/get apis use struct itimerspec at the
    user interface layer. struct itimerspec is not y2038-safe. Change these
    interfaces to use y2038-safe struct __kernel_itimerspec instead. This will
    help define new syscalls when 32bit architectures select CONFIG_64BIT_TIME.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Thomas Gleixner
    Cc: arnd@arndb.de
    Cc: viro@zeniv.linux.org.uk
    Cc: linux-fsdevel@vger.kernel.org
    Cc: linux-api@vger.kernel.org
    Cc: y2038@lists.linaro.org
    Link: https://lkml.kernel.org/r/20180617051144.29756-4-deepa.kernel@gmail.com

    Deepa Dinamani
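
To make the y2038 problem mentioned above concrete, the following sketch contrasts a 32-bit and a 64-bit seconds field. The struct names and the `fits_in_time32` helper are hypothetical; the kernel's own type definitions differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layouts only, not the kernel's definitions. */
struct timespec32_sketch { int32_t tv_sec; int32_t tv_nsec; };
struct timespec64_sketch { int64_t tv_sec; int64_t tv_nsec; };

static_assert(sizeof(struct timespec64_sketch) == 16,
              "two 64-bit fields without padding");

/* True if a seconds-since-epoch value survives truncation to 32 bits. */
static bool fits_in_time32(int64_t secs)
{
    return secs >= INT32_MIN && secs <= INT32_MAX;
}
```

The last value that fits is 2147483647 seconds after the epoch, which falls in January 2038; one second later the 32-bit field can no longer represent the time.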
     

26 May, 2018

1 commit


12 Feb, 2018

1 commit

  • This is the mindless scripted replacement of kernel use of POLL*
    variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
    L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
    for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

    with de-mangling cleanups yet to come.

    NOTE! On almost all architectures, the EPOLL* constants have the same
    values as the POLL* constants do. But the keyword here is "almost".
    For various bad reasons they aren't the same, and epoll() doesn't
    actually work quite correctly in some cases due to this on Sparc et al.

    The next patch from Al will sort out the final differences, and we
    should be all done.

    Scripted-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
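
Whether the POLL* and EPOLL* constants agree on a given build target can be checked from user space. The sketch below is an assumption-laden illustration: it compares the values exposed by the C library headers `<poll.h>` and `<sys/epoll.h>`, and the names `flag_pairs` and `count_flag_mismatches` are hypothetical.

```c
#define _GNU_SOURCE             /* for POLLRDHUP and POLLMSG */
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/epoll.h>

struct flag_pair {
    const char *name;
    unsigned int poll_val;
    unsigned int epoll_val;
};

/* The flags the script above loops over, minus NVAL, which is left out here. */
static const struct flag_pair flag_pairs[] = {
    { "IN",     POLLIN,     EPOLLIN },
    { "OUT",    POLLOUT,    EPOLLOUT },
    { "PRI",    POLLPRI,    EPOLLPRI },
    { "ERR",    POLLERR,    EPOLLERR },
    { "HUP",    POLLHUP,    EPOLLHUP },
    { "RDNORM", POLLRDNORM, EPOLLRDNORM },
    { "RDBAND", POLLRDBAND, EPOLLRDBAND },
    { "WRNORM", POLLWRNORM, EPOLLWRNORM },
    { "WRBAND", POLLWRBAND, EPOLLWRBAND },
    { "MSG",    POLLMSG,    EPOLLMSG },
    { "RDHUP",  POLLRDHUP,  EPOLLRDHUP },
};

/* Number of flags whose POLL* and EPOLL* values differ on this target. */
static size_t count_flag_mismatches(void)
{
    size_t n = 0;

    for (size_t i = 0; i < sizeof(flag_pairs) / sizeof(flag_pairs[0]); i++)
        if (flag_pairs[i].poll_val != flag_pairs[i].epoll_val)
            n++;
    return n;
}
```

A result of zero means the two sets coincide for this architecture; a non-zero result identifies one of the "almost" cases the note above warns about.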
     

28 Nov, 2017

1 commit


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

30 Jun, 2017

1 commit

  • Usage of these apis and their compat versions makes
    the syscalls: timerfd_settime and timerfd_gettime and
    their compat implementations simpler.

    This patch also serves as a preparatory patch for changing
    syscalls to use new time_t data types to support the
    y2038 effort by isolating the processing of user pointers
    through these apis.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Al Viro

    Deepa Dinamani
     

01 Mar, 2017

1 commit

  • timerfd_create() and do_timerfd_settime() evaluate capable(CAP_WAKE_ALARM)
    unconditionally although CAP_WAKE_ALARM is only required for
    CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM.

    This can cause extraneous audit messages when using a LSM such as SELinux,
    incorrectly causes PF_SUPERPRIV to be set even when no privilege was
    exercised, and is inefficient.

    Flip the order of the tests in both functions so that we only call
    capable() if the capability is truly required for the operation.

    Signed-off-by: Stephen Smalley
    Cc: linux-security-module@vger.kernel.org
    Cc: selinux@tycho.nsa.gov
    Link: http://lkml.kernel.org/r/1487344439-22293-1-git-send-email-sds@tycho.nsa.gov
    Signed-off-by: Thomas Gleixner

    Stephen Smalley
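
The effect of flipping the order of the tests can be shown with a stand-in for capable(). This is a user-space sketch, not the kernel change itself: `capable_wake_alarm`, `is_alarm_clock`, `must_deny` and the `clk_id` enum are hypothetical names, and the stub simply counts how often it is called.

```c
#include <assert.h>
#include <stdbool.h>

enum clk_id { CLK_REALTIME, CLK_MONOTONIC, CLK_REALTIME_ALARM, CLK_BOOTTIME_ALARM };

static int capable_calls;       /* how often the stub below has run */

/* Stand-in for capable(CAP_WAKE_ALARM); always reports "not capable". */
static bool capable_wake_alarm(void)
{
    capable_calls++;
    return false;
}

static bool is_alarm_clock(enum clk_id c)
{
    return c == CLK_REALTIME_ALARM || c == CLK_BOOTTIME_ALARM;
}

/* True if creating a timer on clock c has to be refused. */
static bool must_deny(enum clk_id c)
{
    /*
     * Clock test first: && short-circuits, so the capability check only
     * runs when the clock actually requires the capability.
     */
    return is_alarm_clock(c) && !capable_wake_alarm();
}
```

For CLK_MONOTONIC the stub is never called, so no audit record or PF_SUPERPRIV side effect would be triggered in the real code path.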
     

10 Feb, 2017

1 commit

  • The handling of the might_cancel queueing is not properly protected, so
    parallel operations on the file descriptor can race with each other and
    lead to list corruptions or use after free.

    Protect the context for these operations with a separate lock.

    The wait queue lock cannot be reused for this because that would create a
    lock inversion scenario vs. the cancel lock. Replacing might_cancel with an
    atomic (atomic_t or atomic bit) does not help either because it still can
    race vs. the actual list operation.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Thomas Gleixner
    Cc: "linux-fsdevel@vger.kernel.org"
    Cc: syzkaller
    Cc: Al Viro
    Cc: linux-fsdevel@vger.kernel.org
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701311521430.3457@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

26 Dec, 2016

2 commits

  • ktime_set(S,N) was required for the timespec storage type and is still
    useful for situations where a Seconds and Nanoseconds part of a time value
    needs to be converted. For anything where the Seconds argument is 0, this
    is pointless and can be replaced with a simple assignment.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra

    Thomas Gleixner
     
  • ktime is a union because the initial implementation stored the time in
    scalar nanoseconds on 64-bit machines and in an endianness-optimized
    timespec variant on 32-bit machines. The Y2038 cleanup removed the timespec
    variant and switched everything to scalar nanoseconds. The union remained,
    but became completely pointless.

    Get rid of the union and just keep ktime_t as a simple typedef of type s64.

    The conversion was done with coccinelle and some manual mopping up.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra

    Thomas Gleixner
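
The two commits above fit together: once ktime_t is a plain signed 64-bit nanosecond count, a ktime_set() call with a seconds argument of 0 reduces to an assignment. The sketch below uses hypothetical `_sketch` names and leaves out the saturation that the kernel's ktime_set() applies to very large second values.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Sketch of ktime_t after the cleanup: a plain signed 64-bit count of ns. */
typedef int64_t ktime_sketch_t;

/* Simplified ktime_set(): combine seconds and nanoseconds into one scalar. */
static ktime_sketch_t ktime_set_sketch(int64_t secs, unsigned long nsecs)
{
    assert(nsecs < NSEC_PER_SEC_SKETCH);    /* nanoseconds part is sub-second */
    return secs * NSEC_PER_SEC_SKETCH + (int64_t)nsecs;
}
```

With this representation `ktime_set_sketch(0, 500)` and the plain value `500` are the same thing, which is why the helper is pointless whenever the seconds argument is 0.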
     

10 Jun, 2016

1 commit

  • timerfd gives processes a way to set wake alarms, but unlike timers made using
    timer_create, timerfds don't check whether the process has CAP_WAKE_ALARM
    before setting alarm-time timers. CAP_WAKE_ALARM is supposed to gate this
    behavior and so it makes sense that we should deny permission to create such
    timerfds if the process doesn't have this capability.

    Signed-off-by: Eric Caruso
    Cc: Todd Poynor
    Link: http://lkml.kernel.org/r/1465427339-96209-1-git-send-email-ejcaruso@chromium.org
    Signed-off-by: Thomas Gleixner

    Eric Caruso
     

17 Jan, 2016

1 commit

  • Helge reported that a relative timer can return a remaining time larger than
    the programmed relative time on parisc and other architectures which have
    CONFIG_TIME_LOW_RES set. This happens because we add a jiffy to the resulting
    expiry time to prevent short timeouts.

    Use the new function hrtimer_expires_remaining_adjusted() to calculate the
    remaining time. It takes that extra added time into account for relative
    timers.

    Reported-and-tested-by: Helge Deller
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160114164159.354500742@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
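
The property this fix restores can be observed from user space: right after arming a relative timer, the remaining time should not exceed the programmed time. The following is a minimal sketch using the timerfd calls from `<sys/timerfd.h>`; the helper name `remaining_not_above_programmed` is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Arms a relative one-shot timer for `secs` seconds and reads it back.
 * Returns 1 if the reported remaining time is not larger than `secs`,
 * 0 if it is larger, and -1 if a timerfd call failed.
 */
static int remaining_not_above_programmed(long secs)
{
    struct itimerspec its = { .it_value = { .tv_sec = secs } };
    struct itimerspec cur;
    int ok = -1;
    int fd = timerfd_create(CLOCK_MONOTONIC, 0);

    if (fd < 0)
        return -1;

    if (timerfd_settime(fd, 0, &its, NULL) == 0 &&
        timerfd_gettime(fd, &cur) == 0)
        ok = cur.it_value.tv_sec < secs ||
             (cur.it_value.tv_sec == secs && cur.it_value.tv_nsec == 0);

    close(fd);
    return ok;
}
```

On a kernel with the fix, calling the helper with, say, 10 seconds is expected to return 1 regardless of whether CONFIG_TIME_LOW_RES is set.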
     

06 Nov, 2014

1 commit

  • Callers of the seq_printf functions shouldn't really check the return
    value. An occasional check of seq_has_overflowed() is used instead.

    Update vfs documentation.

    Link: http://lkml.kernel.org/p/e37e6e7b76acbdcc3bb4ab2a57c8f8ca1ae11b9a.1412031505.git.joe@perches.com

    Cc: David S. Miller
    Cc: Al Viro
    Signed-off-by: Joe Perches
    [ did a few clean ups ]
    Signed-off-by: Steven Rostedt

    Joe Perches
     

27 Aug, 2014

1 commit


24 Jul, 2014

1 commit

  • We have a few other use cases of ktime_get_monotonic_offset() which
    can be optimized with ktime_mono_to_real(). The timerfd code uses the
    offset only for comparison, so we can use ktime_mono_to_real(0) for
    this as well.

    Funny enough text size shrinks with that on ARM and x86_64 !?

    Signed-off-by: Thomas Gleixner
    Signed-off-by: John Stultz

    Thomas Gleixner
     

18 Jul, 2014

2 commits

  • read() on a timerfd file fetches the number of timer ticks, but there
    is no way to set it back from userspace.

    To restore the timer's state as it was at the checkpoint moment we need
    a path to bring @ticks back. Initially I thought about writing ticks
    back via the write() interface, but such an API seems somewhat obscure.

    Instead implement a timerfd_ioctl() method with a TFD_IOC_SET_TICKS
    command which allows adjusting @ticks to a non-zero value, waking
    up the waiters.

    I wrapped the code with CONFIG_CHECKPOINT_RESTORE, which can be
    dropped if users other than the c/r camp appear.

    v2 (by akpm@):
    - Use define timerfd_ioctl NULL for non c/r config

    v3:
    - Use copy_from_user for @ticks fetching since
    not all arch support get_user for 8 byte argument

    Signed-off-by: Cyrill Gorcunov
    Cc: Andrew Morton
    Cc: Michael Kerrisk
    Cc: Andrey Vagin
    Cc: Arnd Bergmann
    Cc: Christopher Covington
    Cc: Pavel Emelyanov
    Cc: Vladimir Davydov
    Link: http://lkml.kernel.org/r/20140715215703.285617923@openvz.org
    Signed-off-by: Thomas Gleixner

    Cyrill Gorcunov
     
  • For checkpoint/restore of timerfd files we need to know exactly how
    the timer was armed, to be able to recreate it at the restore stage.
    Thus implement a show_fdinfo method which provides enough information
    for that.

    One of the significant changes, I think, is the addition of the
    @settime_flags member. Currently there are two flags, TFD_TIMER_ABSTIME
    and TFD_TIMER_CANCEL_ON_SET, and the second can be derived from the
    @might_cancel variable, but if the flags are extended in the future we
    will most probably have to remember them explicitly anyway, so I guess
    doing that right now won't hurt.

    To not bloat the timerfd_ctx structure I've converted @expired
    to short integer and defined @settime_flags as short too.

    v2 (by avagin@, vdavydov@ and tglx@):

    - Add it_value/it_interval fields
    - Save flags being used in timerfd_setup in context

    v3 (by tglx@):
    - don't forget to use CONFIG_PROC_FS

    v4 (by akpm@):
    - Use define timerfd_show NULL for non c/r config

    Signed-off-by: Cyrill Gorcunov
    Cc: Andrew Morton
    Cc: Michael Kerrisk
    Cc: Andrey Vagin
    Cc: Pavel Emelyanov
    Cc: Vladimir Davydov
    Link: http://lkml.kernel.org/r/20140715215703.114365649@openvz.org
    Signed-off-by: Thomas Gleixner

    Cyrill Gorcunov
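
The TFD_IOC_SET_TICKS command from the first of the two commits above can be exercised from user space roughly as follows. This is a sketch under assumptions: the fallback definition of the command mirrors the one in `<linux/timerfd.h>`, the kernel must be built with CONFIG_CHECKPOINT_RESTORE for the ioctl to exist, and `restore_ticks` is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

#ifndef TFD_IOC_SET_TICKS
/* Same encoding as in <linux/timerfd.h>; defined here if the libc lacks it. */
#define TFD_IOC_SET_TICKS _IOW('T', 0, uint64_t)
#endif

/*
 * Sets the tick counter of a fresh timerfd and reads it back.
 * Returns the value read, 0 if the ioctl is not available or rejected,
 * and -1 on any other failure.
 */
static int64_t restore_ticks(uint64_t ticks)
{
    uint64_t seen = 0;
    int64_t ret = -1;
    int fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);

    if (fd < 0)
        return -1;

    if (ioctl(fd, TFD_IOC_SET_TICKS, &ticks) != 0)
        ret = 0;                /* e.g. kernel built without c/r support */
    else if (read(fd, &seen, sizeof(seen)) == (ssize_t)sizeof(seen))
        ret = (int64_t)seen;    /* read() returns and clears the counter */

    close(fd);
    return ret;
}
```

A restore tool would issue the ioctl after recreating and re-arming the timer, so that a later read() in the restored process sees the same tick count as at checkpoint time.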
     

24 Jan, 2014

1 commit


30 May, 2013

1 commit

  • Add support for clocks CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM,
    thereby enabling wakeup alarm timers via file descriptors.

    Signed-off-by: Todd Poynor
    Signed-off-by: John Stultz

    Todd Poynor
     

02 Mar, 2013

1 commit


04 Feb, 2013

1 commit


27 Sep, 2012

2 commits


14 Jun, 2011

1 commit

  • Currently processes waiting with poll on cancelable timerfd timers are
    not woken up when the timers are canceled. When the system time is set
    the clock_was_set() function calls timerfd_clock_was_set() to cancel
    and wake up processes waiting on potentially cancelable timerfd
    timers. However the wake up currently has no effect because in the
    case of timerfd_read it is dependent on ctx->ticks not being
    0. timerfd_poll also requires ctx->ticks being non zero. As a
    consequence processes waiting on cancelable timers only get woken up
    when the timers expire. This patch fixes this by incrementing
    ctx->ticks before calling wake_up.

    Signed-off-by: Max Asbock
    Cc: kay.sievers@vrfy.org
    Cc: virtuoso@slind.org
    Cc: johnstul
    Link: http://lkml.kernel.org/r/1307985512.4710.41.camel@w-amax.beaverton.ibm.com
    Signed-off-by: Thomas Gleixner

    Max Asbock
     

23 May, 2011

1 commit

  • Peter is concerned about the extra scan of CLOCK_REALTIME_COS in the
    timer interrupt. Yes, I did not think about it, because the solution
    was so elegant. I didn't like the extra list in timerfd when it was
    proposed some time ago, but with an RCU-based list the list walk is
    less horrible than the original global lock, which was held over the
    list iteration.

    Requested-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     

03 May, 2011

1 commit

  • Some applications must be aware of clock realtime being set
    backward. A simple example is a clock applet which arms a timer for
    the next minute display. If clock realtime is set backward then the
    applet displays a stale time for the amount of time which the clock
    was set backwards. Due to that applications poll the time because we
    don't have an interface.

    Extend the timerfd interface by adding a flag which puts the timer
    onto a different internal realtime clock. All timers on this clock are
    expired whenever the clock was set.

    The timerfd core records the monotonic offset when the timer is
    created. When the timer is armed, then the current offset is compared
    to the previous recorded offset. When it has changed, then
    timerfd_settime returns -ECANCELED. When a timer is read the offset is
    compared and if it changed -ECANCELED is returned to user space. Periodic
    timers are not rearmed in the cancelation case.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Chris Friesen
    Tested-by: Kay Sievers
    Cc: "Kirill A. Shutemov"
    Cc: Peter Zijlstra
    Cc: Davide Libenzi
    Reviewed-by: Alexander Shishkin
    Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104271359580.3323%40ionos%3E
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
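
The clock applet use case above might look like the following from user space. This is a sketch under assumptions: it uses the TFD_TIMER_CANCEL_ON_SET flag of timerfd_settime(), which is the name given for this behaviour in the 18 Jul, 2014 entry rather than in this commit message, and the helper name `arm_next_minute_timer` is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

#ifndef TFD_TIMER_CANCEL_ON_SET
#define TFD_TIMER_CANCEL_ON_SET (1 << 1)    /* value from <linux/timerfd.h> */
#endif

/*
 * Arms an absolute CLOCK_REALTIME timer for the start of the next minute
 * that is cancelled when the realtime clock is set.
 * Returns the timerfd on success, -1 on failure.
 */
static int arm_next_minute_timer(void)
{
    struct timespec now;
    struct itimerspec its = { 0 };
    int fd;

    if (clock_gettime(CLOCK_REALTIME, &now) != 0)
        return -1;

    fd = timerfd_create(CLOCK_REALTIME, 0);
    if (fd < 0)
        return -1;

    /* Round up to the next full minute. */
    its.it_value.tv_sec = now.tv_sec - now.tv_sec % 60 + 60;

    if (timerfd_settime(fd, TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET,
                        &its, NULL) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would then read() the descriptor: a normal return means the minute boundary was reached, while a failure with errno set to ECANCELED means the clock was set and the timer has to be re-armed against the new time.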
     

15 Oct, 2010

1 commit

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek.

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
     

21 May, 2010

1 commit

  • This patch modifies the fs/timerfd.c to use the newly created
    wait_event_interruptible_locked_irq() macro. This replaces an open
    code implementation with a single macro call.

    Signed-off-by: Michal Nazarewicz
    Cc: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Alexander Viro
    Cc: Thomas Gleixner
    Cc: Roland Dreier
    Cc: Tejun Heo
    Cc: Christoph Lameter
    Cc: Davide Libenzi
    Signed-off-by: Greg Kroah-Hartman

    Michal Nazarewicz
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

23 Dec, 2009

1 commit

  • It seems a couple places such as arch/ia64/kernel/perfmon.c and
    drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile()
    instead of a private pseudo-fs + alloc_file(), if only there were a way
    to get a read-only file. So provide this by having anon_inode_getfile()
    create a read-only file if we pass O_RDONLY in flags.

    Signed-off-by: Roland Dreier
    Signed-off-by: Al Viro

    Roland Dreier
     

19 Feb, 2009

1 commit

  • As requested by Michael, add a missing check for valid flags in
    timerfd_settime(), and make it return EINVAL in case some extra bits are
    set.

    Michael said:
    If this is to be any use to userland apps that want to check flag
    support (perhaps it is too late already), then the sooner we get it
    into the kernel the better: 2.6.29 would be good; earlier stables as
    well would be even better.

    [akpm@linux-foundation.org: remove unused TFD_FLAGS_SET]
    Acked-by: Michael Kerrisk
    Signed-off-by: Davide Libenzi
    Cc: [2.6.27.x, 2.6.28.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
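
The flag check added by this commit lets user space probe for flag support. A minimal sketch, assuming that bit 30 is not a defined timerfd_settime() flag; the helper name `settime_rejects_unknown_flags` is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Calls timerfd_settime() with a flag bit that is assumed to be undefined.
 * Returns 1 if the call fails with EINVAL, 0 if it does not, and -1 if
 * the timerfd could not be created.
 */
static int settime_rejects_unknown_flags(void)
{
    struct itimerspec its = { .it_value = { .tv_sec = 1 } };
    int fd = timerfd_create(CLOCK_MONOTONIC, 0);
    int rejected;

    if (fd < 0)
        return -1;

    errno = 0;
    rejected = timerfd_settime(fd, 1 << 30, &its, NULL) == -1 &&
               errno == EINVAL;

    close(fd);
    return rejected;
}
```

On kernels that include this check the helper is expected to return 1; a return of 0 would indicate a kernel that silently ignores extra bits.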