05 Jan, 2020

1 commit

  • [ Upstream commit a33121e5487b424339636b25c35d3a180eaa5f5e ]

    In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
    device is removed, closing this file leads to a race. This reproduces
    easily in a kvm virtual machine:

    ts# cat openptp0.c
    int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
    ts# uname -r
    5.5.0-rc3-46cf053e
    ts# cat /proc/cmdline
    ... slub_debug=FZP
    ts# modprobe ptp_kvm
    ts# ./openptp0 &
    [1] 670
    opened /dev/ptp0, sleeping 10s...
    ts# rmmod ptp_kvm
    ts# ls /dev/ptp*
    ls: cannot access '/dev/ptp*': No such file or directory
    ts# ...woken up
    [ 48.010809] general protection fault: 0000 [#1] SMP
    [ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25
    [ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
    [ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80
    [ 48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
    [ 48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
    [ 48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
    [ 48.019470] ... ^^^ a slub poison
    [ 48.023854] Call Trace:
    [ 48.024050] __fput+0x21f/0x240
    [ 48.024288] task_work_run+0x79/0x90
    [ 48.024555] do_exit+0x2af/0xab0
    [ 48.024799] ? vfs_write+0x16a/0x190
    [ 48.025082] do_group_exit+0x35/0x90
    [ 48.025387] __x64_sys_exit_group+0xf/0x10
    [ 48.025737] do_syscall_64+0x3d/0x130
    [ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 48.026479] RIP: 0033:0x7f53b12082f6
    [ 48.026792] ...
    [ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
    [ 48.045001] Fixing recursive fault but reboot is needed!

    This happens in:

    static void __fput(struct file *file)
    { ...
    if (file->f_op->release)
    file->f_op->release(inode, file); <<< cdev is kfree'd here
    if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
    !(mode & FMODE_PATH))) {
    cdev_put(inode->i_cdev); <<< cdev fields are accessed here

    Namely:

    __fput()
    posix_clock_release()
    kref_put(&clk->kref, delete_clock) <<< the last reference
    delete_clock()
    delete_ptp_clock()
    kfree(ptp) <<< cdev is embedded in ptp
    cdev_put
    module_put(p->owner) <<< *p is kfree'd, bang!

    Here cdev is embedded in posix_clock which is embedded in ptp_clock.
    The race happens because ptp_clock's lifetime is controlled by two
    refcounts: kref and cdev.kobj in posix_clock. This is wrong.

    Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
    created especially for such cases. This way the parent device with its
    ptp_clock is not released until all references to the cdev are released.
    This adds a requirement that an initialized but not exposed struct
    device should be provided to posix_clock_register() by a caller instead
    of a simple dev_t.

    This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix
    the race between the release of watchdog_core_data and cdev"). See
    details of the implementation in the commit 233ed09d7fda ("chardev: add
    helper function to register char devs with a struct device").

    Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
    Analyzed-by: Stephen Johnston
    Analyzed-by: Vern Lovejoy
    Signed-off-by: Vladis Dronov
    Acked-by: Richard Cochran
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vladis Dronov
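
A toy userspace model of the lifetime rule described in the ptp fix above may help: once the character device pins its parent device, the parent can only be released after the last cdev reference is dropped. Every `toy_*` name here is hypothetical; this is not kernel code and it only imitates the reference counting idea behind `cdev_device_add()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct device that owns the ptp_clock. */
struct toy_device {
	int refs;
	bool released;	/* set instead of calling free(), so it can be inspected */
};

/* Hypothetical stand-in for the cdev; it pins its parent while alive. */
struct toy_cdev {
	int refs;
	struct toy_device *parent;
};

void toy_device_get(struct toy_device *dev)
{
	assert(!dev->released);
	dev->refs++;
}

void toy_device_put(struct toy_device *dev)
{
	assert(dev->refs > 0);
	if (--dev->refs == 0)
		dev->released = true;
}

/* Registering the cdev takes a reference on the parent device. */
void toy_cdev_add(struct toy_cdev *cdev, struct toy_device *parent)
{
	toy_device_get(parent);
	cdev->parent = parent;
	cdev->refs = 1;
}

void toy_cdev_get(struct toy_cdev *cdev)
{
	cdev->refs++;
}

/* The parent reference is dropped only with the last cdev reference. */
void toy_cdev_put(struct toy_cdev *cdev)
{
	assert(cdev->refs > 0);
	if (--cdev->refs == 0)
		toy_device_put(cdev->parent);
}
```

In this model an open file corresponds to an extra `toy_cdev_get()`. The driver can drop its own references at removal time, yet `released` stays false until the file is closed and the final `toy_cdev_put()` runs.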
     

07 Feb, 2019

1 commit

  • struct timex is not y2038 safe.
    Replace all uses of timex with y2038 safe __kernel_timex.

    Note that struct __kernel_timex is an ABI interface definition.
    We could define a new structure based on __kernel_timex that
    is only available internally instead. Right now, there isn't
    a strong motivation for this as the structure is isolated to
    a few defined struct timex interfaces and such a structure would
    be exactly the same as struct timex.

    The patch was generated by the following coccinelle script:

    virtual patch

    @depends on patch forall@
    identifier ts;
    expression e;
    @@
    (
    - struct timex ts;
    + struct __kernel_timex ts;
    |
    - struct timex ts = {};
    + struct __kernel_timex ts = {};
    |
    - struct timex ts = e;
    + struct __kernel_timex ts = e;
    |
    - struct timex *ts;
    + struct __kernel_timex *ts;
    |
    (memset \| copy_from_user \| copy_to_user \)(...,
    - sizeof(struct timex))
    + sizeof(struct __kernel_timex))
    )

    @depends on patch forall@
    identifier ts;
    identifier fn;
    @@
    fn(...,
    - struct timex *ts,
    + struct __kernel_timex *ts,
    ...) {
    ...
    }

    @depends on patch forall@
    identifier ts;
    identifier fn;
    @@
    fn(...,
    - struct timex *ts) {
    + struct __kernel_timex *ts) {
    ...
    }

    Signed-off-by: Deepa Dinamani
    Cc: linux-alpha@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
     

23 Nov, 2018

3 commits

  • The SPDX identifier defines the license of the file already. No need for
    the boilerplate.

    Signed-off-by: Thomas Gleixner
    Acked-by: Richard Cochran
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: Manfred Rudigier
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: "Paul E. McKenney"
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182253.385909804@linutronix.de

    Thomas Gleixner
     
  • Update the time(r) core files with the correct SPDX license
    identifier based on the license text in the file itself. The SPDX
    identifier is a legally binding shorthand, which can be used instead of the
    full boilerplate text.

    This work is based on a script and data from Philippe Ombredanne, Kate
    Stewart and myself. The data has been created with two independent license
    scanners and manual inspection.

    The following files do not contain any direct license information and have
    been omitted from the big initial SPDX changes:

    timeconst.bc: The .bc files were not touched
    time.c, timer.c, timekeeping.c: Licence was deduced from EXPORT_SYMBOL_GPL

    As those files do not contain direct license references they fall under the
    project license, i.e. GPL V2 only.

    Signed-off-by: Thomas Gleixner
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Russell King
    Cc: Richard Cochran
    Cc: Nicolas Pitre
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Link: https://lkml.kernel.org/r/20181031182252.879109557@linutronix.de

    Thomas Gleixner
     
  • Remove the pointless filenames in the top level comments. They have no
    value at all and just occupy space. While at it tidy up some of the
    comments and remove a stale one.

    Signed-off-by: Thomas Gleixner
    Acked-by: Nicolas Pitre
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: John Stultz
    Acked-by: Corey Minyard
    Cc: Peter Zijlstra
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Peter Anvin
    Cc: Russell King
    Cc: Richard Cochran
    Cc: "Paul E. McKenney"
    Cc: David Riley
    Cc: Colin Cross
    Cc: Mark Brown
    Link: https://lkml.kernel.org/r/20181031182252.794898238@linutronix.de

    Thomas Gleixner
     

12 Feb, 2018

1 commit

  • This is the mindless scripted replacement of kernel use of POLL*
    variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
    L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
    for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

    with de-mangling cleanups yet to come.

    NOTE! On almost all architectures, the EPOLL* constants have the same
    values as the POLL* constants do. But the keyword here is "almost".
    For various bad reasons they aren't the same, and epoll() doesn't
    actually work quite correctly in some cases due to this on Sparc et al.

    The next patch from Al will sort out the final differences, and we
    should be all done.

    Scripted-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

31 Jan, 2018

1 commit

  • Pull poll annotations from Al Viro:
    "This introduces a __bitwise type for POLL### bitmap, and propagates
    the annotations through the tree. Most of that stuff is as simple as
    'make ->poll() instances return __poll_t and do the same to local
    variables used to hold the future return value'.

    Some of the obvious brainos found in process are fixed (e.g. POLLIN
    misspelled as POLL_IN). At that point the amount of sparse warnings is
    low and most of them are for genuine bugs - e.g. ->poll() instance
    deciding to return -EINVAL instead of a bitmap. I hadn't touched those
    in this series - it's large enough as it is.

    Another problem it has caught was eventpoll() ABI mess; select.c and
    eventpoll.c assumed that corresponding POLL### and EPOLL### were
    equal. That's true for some, but not all of them - EPOLL### are
    arch-independent, but POLL### are not.

    The last commit in this series separates userland POLL### values from
    the (now arch-independent) kernel-side ones, converting between them
    in the few places where they are copied to/from userland. AFAICS, this
    is the least disruptive fix preserving poll(2) ABI and making epoll()
    work on all architectures.

    As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
    it will trigger only on what would've triggered EPOLLWRBAND on other
    architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
    at all on sparc. With this patch they should work consistently on all
    architectures"

    * 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
    make kernel-side POLL... arch-independent
    eventpoll: no need to mask the result of epi_item_poll() again
    eventpoll: constify struct epoll_event pointers
    debugging printk in sg_poll() uses %x to print POLL... bitmap
    annotate poll(2) guts
    9p: untangle ->poll() mess
    ->si_band gets POLL... bitmap stored into a user-visible long field
    ring_buffer_poll_wait() return value used as return value of ->poll()
    the rest of drivers/*: annotate ->poll() instances
    media: annotate ->poll() instances
    fs: annotate ->poll() instances
    ipc, kernel, mm: annotate ->poll() instances
    net: annotate ->poll() instances
    apparmor: annotate ->poll() instances
    tomoyo: annotate ->poll() instances
    sound: annotate ->poll() instances
    acpi: annotate ->poll() instances
    crypto: annotate ->poll() instances
    block: annotate ->poll() instances
    x86: annotate ->poll() instances
    ...

    Linus Torvalds
     

04 Jan, 2018

1 commit

  • Shifting a negative signed number is undefined behavior. The macros
    MAKE_PROCESS_CPUCLOCK and FD_TO_CLOCKID both contain the
    subexpression:

    (~(clockid_t) (pid) << 3)

    where clockid_t resolves to a signed int. After the bitwise complement
    the value can be negative, and left-shifting a negative value is
    undefined behavior.

    It was further suggested to make these macros into inline functions.

    Suggested-by: Thomas Gleixner
    Signed-off-by: Nick Desaulniers
    Signed-off-by: Thomas Gleixner
    Cc: Dimitri Sivanich
    Cc: Frederic Weisbecker
    Cc: Al Viro
    Cc: linux-kselftest@vger.kernel.org
    Cc: Shuah Khan
    Cc: Deepa Dinamani
    Link: https://lkml.kernel.org/r/1514517100-18051-1-git-send-email-nick.desaulniers@gmail.com

    Nick Desaulniers
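
A minimal sketch of the kind of inline-function replacement suggested above, assuming a 32-bit `int`. The names `toy_clockid_t`, `TOY_CPUCLOCK_SCHED` and `toy_make_process_cpuclock()` are illustrative stand-ins, not the kernel's definitions. The point is that the complement and the shift are done on an unsigned operand, so no negative signed value is ever shifted.

```c
#include <assert.h>

typedef int toy_clockid_t;	/* stand-in for a signed clockid_t */

#define TOY_CPUCLOCK_SCHED 2	/* assumed clock type value, for illustration */

/*
 * Build a per-process CPU clock id from a pid and a clock type.
 * The pid is unsigned, so ~pid and the left shift are well defined.
 * Converting the result back to a signed type is implementation-defined
 * (it wraps on common compilers) rather than undefined.
 */
toy_clockid_t toy_make_process_cpuclock(unsigned int pid, toy_clockid_t clock)
{
	assert(clock >= 0 && clock < 8);	/* clock type uses the low 3 bits */
	return (toy_clockid_t)(((~pid) << 3) | (unsigned int)clock);
}
```

For example, `toy_make_process_cpuclock(1, TOY_CPUCLOCK_SCHED)` yields the bit pattern 0xFFFFFFF2, which reads back as -14 when `int` is 32 bits wide.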
     

28 Nov, 2017

1 commit


04 Jun, 2017

2 commits

  • None of these declarations is required outside of kernel/time. Move them to
    an internal header.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20170530211656.394803853@linutronix.de

    Thomas Gleixner
     
  • The only user of this facility is ptp_clock, which does not implement any of
    those functions.

    Remove them to prevent accidental users. Especially the interval timer
    interfaces are now more or less impossible to implement because the
    necessary infrastructure has been confined to the core code. Aside from that
    it's really complex to make these callbacks implemented according to spec
    as the alarm timer implementation demonstrates. If at all then a nanosleep
    callback might be a reasonable extension. For now keep just what ptp_clock
    needs.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz
    Link: http://lkml.kernel.org/r/20170530211656.145036286@linutronix.de

    Thomas Gleixner
     

27 May, 2017

1 commit

  • There are no more modular users providing a posix clock. The register
    function is now pointless so the posix clock array can be initialized
    statically at compile time and the array including the various k_clock
    structs can be marked 'const'.

    Inspired by changes in the Grsecurity patch set, but done proper.

    [ tglx: Massaged changelog and fixed the POSIX_TIMER=n case ]

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Thomas Gleixner
    Cc: Mike Travis
    Cc: Dimitri Sivanich
    Link: http://lkml.kernel.org/r/20170526090311.3377-3-hch@lst.de

    Christoph Hellwig
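
The idea of a statically initialized, constant clock table can be sketched in plain C as follows. This is an illustration only: the `toy_*` types, the two clock ids and the 1 ns resolution are assumptions made for the example and do not mirror the kernel's `k_clock` layout.

```c
#include <assert.h>
#include <stddef.h>

enum toy_clock_id {
	TOY_CLOCK_REALTIME,
	TOY_CLOCK_MONOTONIC,
	TOY_MAX_CLOCKS
};

/* Hypothetical per-clock operations table. */
struct toy_k_clock {
	long (*resolution_ns)(void);
};

long toy_hires_resolution_ns(void)
{
	return 1;	/* assumed resolution, for illustration */
}

static const struct toy_k_clock toy_clock_realtime = {
	.resolution_ns = toy_hires_resolution_ns,
};

static const struct toy_k_clock toy_clock_monotonic = {
	.resolution_ns = toy_hires_resolution_ns,
};

/* Filled in at compile time; no register function is needed. */
static const struct toy_k_clock *const toy_posix_clocks[] = {
	[TOY_CLOCK_REALTIME]  = &toy_clock_realtime,
	[TOY_CLOCK_MONOTONIC] = &toy_clock_monotonic,
};

static_assert(sizeof(toy_posix_clocks) / sizeof(toy_posix_clocks[0]) ==
	      TOY_MAX_CLOCKS, "every clock id needs a table entry");

/* Bounds-checked lookup; returns NULL for unknown ids. */
const struct toy_k_clock *toy_clockid_to_kclock(int id)
{
	if (id < 0 || id >= TOY_MAX_CLOCKS)
		return NULL;
	return toy_posix_clocks[id];
}
```

Because both the table and the structs it points to are `const`, the compiler can place them in read-only data, which is what makes a runtime registration step unnecessary.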
     

15 Apr, 2017

5 commits

  • struct timespec is not y2038 safe on 32 bit machines. Replace uses of
    struct timespec with struct timespec64 in the kernel.

    struct itimerspec internally uses struct timespec. Use struct itimerspec64
    which uses struct timespec64.

    The syscall interfaces themselves will be changed in a separate series.

    Signed-off-by: Deepa Dinamani
    Cc: y2038@lists.linaro.org
    Cc: john.stultz@linaro.org
    Cc: arnd@arndb.de
    Link: http://lkml.kernel.org/r/1490555058-4603-7-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Thomas Gleixner

    Deepa Dinamani
     
  • struct timespec is not y2038 safe on 32 bit machines. Replace uses of
    struct timespec with struct timespec64 in the kernel.

    The syscall interfaces themselves will be changed in a separate series.

    Signed-off-by: Deepa Dinamani
    Cc: y2038@lists.linaro.org
    Cc: john.stultz@linaro.org
    Cc: arnd@arndb.de
    Link: http://lkml.kernel.org/r/1490555058-4603-6-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Thomas Gleixner

    Deepa Dinamani
     
  • struct timespec is not y2038 safe on 32 bit machines. Replace uses of
    struct timespec with struct timespec64 in the kernel. The syscall
    interfaces themselves will be changed in a separate series.

    The clock_getres() interface has also been changed to use timespec64 even
    though this particular interface is not affected by the y2038 problem. This
    helps verification for internal kernel code for y2038 readiness by getting
    rid of time_t/ timeval/ timespec completely.

    Signed-off-by: Deepa Dinamani
    Cc: y2038@lists.linaro.org
    Cc: john.stultz@linaro.org
    Cc: arnd@arndb.de
    Link: http://lkml.kernel.org/r/1490555058-4603-5-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Thomas Gleixner

    Deepa Dinamani
     
  • struct timespec is not y2038 safe on 32 bit machines. Replace uses of
    struct timespec with struct timespec64 in the kernel.

    The syscall interfaces themselves will be changed in a separate series.

    Signed-off-by: Deepa Dinamani
    Cc: y2038@lists.linaro.org
    Cc: john.stultz@linaro.org
    Cc: arnd@arndb.de
    Link: http://lkml.kernel.org/r/1490555058-4603-4-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Thomas Gleixner

    Deepa Dinamani
     
  • struct timespec is not y2038 safe on 32 bit machines.

    The posix clock APIs use struct timespec directly and through struct
    itimerspec.

    Replace the posix clock interfaces to use struct timespec64 and struct
    itimerspec64 instead. Also fix up their implementations accordingly.

    Note that the clock_getres() interface has also been changed to use
    timespec64 even though this particular interface is not affected by the
    y2038 problem. This helps verification for internal kernel code for y2038
    readiness by getting rid of time_t/ timeval/ timespec.

    Signed-off-by: Deepa Dinamani
    Cc: arnd@arndb.de
    Cc: y2038@lists.linaro.org
    Cc: netdev@vger.kernel.org
    Cc: Richard Cochran
    Cc: john.stultz@linaro.org
    Link: http://lkml.kernel.org/r/1490555058-4603-3-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Thomas Gleixner

    Deepa Dinamani
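
The y2038 problem behind the timespec64 conversions above can be shown with a few lines of userspace C. This is a sketch with made-up `toy_*` names: a signed 32-bit seconds counter tops out at INT32_MAX, which corresponds to 2038-01-19 03:14:07 UTC, while a 64-bit counter keeps going.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t toy_time32_t;	/* seconds as a 32-bit tv_sec would store them */
typedef int64_t toy_time64_t;	/* seconds as a 64-bit tv_sec stores them */

/* Returns 1 if advancing by delta seconds no longer fits in 32 bits. */
int toy_time32_add_overflows(toy_time32_t now, int32_t delta)
{
	assert(delta >= 0);
	return (int64_t)now + delta > INT32_MAX;
}

/* The same step is unproblematic with a 64-bit seconds field. */
toy_time64_t toy_time64_add(toy_time64_t now, int32_t delta)
{
	assert(delta >= 0);
	return now + delta;
}
```

Advancing the 32-bit counter by one second from INT32_MAX is reported as an overflow, whereas the 64-bit variant simply returns 2147483648.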
     

29 Dec, 2015

1 commit

  • The posix_clock_poll function is supposed to return a bit mask of
    POLLxxx values. However, in case the hardware has disappeared (due to
    hot plugging for example) this code returns -ENODEV in a futile
    attempt to throw an error at the file descriptor level. The kernel's
    file_operations interface does not accept such error codes from the
    poll method. Instead, this function ought to return POLLERR.

    The value -ENODEV does, in fact, contain the POLLERR bit (and almost
    all the other POLLxxx bits as well), but only by chance. This patch
    fixes code to return a proper bit mask.

    Credit goes to Markus Elfring for pointing out the suspicious
    signed/unsigned mismatch.

    Reported-by: Markus Elfring
    Signed-off-by: Richard Cochran
    Cc: John Stultz
    Cc: Julia Lawall
    Link: http://lkml.kernel.org/r/1450819198-17420-1-git-send-email-richardcochran@gmail.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Thomas Gleixner

    Richard Cochran
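
The remark above that -ENODEV only contains the POLLERR bit "by chance" can be checked from userspace. The sketch below assumes Linux values for ENODEV and the POLL* constants; both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/* What a caller sees when -err is pushed through an unsigned poll mask. */
unsigned int toy_errno_as_poll_mask(int err)
{
	assert(err > 0);
	return (unsigned int)-err;
}

/* The explicit mask the fix asks for when the device has disappeared. */
unsigned int toy_poll_mask_for_missing_device(void)
{
	return POLLERR;
}
```

On Linux, `toy_errno_as_poll_mask(ENODEV) & POLLERR` is non-zero, but the value also has many unrelated bits set, so returning POLLERR explicitly is the only form that states the intent.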
     

01 Nov, 2011

1 commit


18 Apr, 2011

1 commit

  • A dynamic posix clock is protected from asynchronous removal by a mutex.
    However, using a mutex has the unwanted effect that a long running clock
    operation in one process will unnecessarily block other processes.

    For example, one process might call read() to get an external time stamp
    coming in at one pulse per second. A second process calling clock_gettime
    would have to wait for almost a whole second.

    This patch fixes the issue by using a reader/writer semaphore instead of
    a mutex.

    Signed-off-by: Richard Cochran
    Cc: John Stultz
    Link: http://lkml.kernel.org/r/%3C20110330132421.GA31771%40riccoc20.at.omicron.at%3E
    Signed-off-by: Thomas Gleixner

    Richard Cochran
     

13 Mar, 2011

1 commit

  • pc_clock_settime() and pc_clock_adjtime() do not check whether the fd
    was opened in write mode, so a clock can be set with a read only fd.

    [ tglx: We deliberately do not return -EPERM as we want this to be
    distinguishable from the capability based permission check ]

    Signed-off-by: Torben Hohn
    LKML-Reference:
    Cc: Richard Cochran
    Cc: John Stultz
    Cc: Thomas Gleixner

    Torben Hohn
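
The access-mode check described above can be sketched in userspace terms with the `open(2)` flags. The function name is hypothetical, and returning -EBADF is an assumption of this example; the commit message only states that -EPERM is deliberately avoided.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Decide whether a descriptor opened with open_flags may be used to set
 * or adjust a clock. A read-only descriptor is rejected with an error
 * that differs from the -EPERM used for capability failures.
 */
int toy_check_clock_writable(int open_flags)
{
	int acc = open_flags & O_ACCMODE;

	assert(acc == O_RDONLY || acc == O_WRONLY || acc == O_RDWR);
	if (acc == O_RDONLY)
		return -EBADF;	/* assumed error code, see lead-in */
	return 0;
}
```

Keeping the two error codes distinct lets a caller tell "wrong kind of file descriptor" apart from "process lacks the privilege to set the time".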
     

02 Feb, 2011

1 commit

  • This patch adds support for adding and removing posix clocks. The
    clock lifetime cycle is patterned after usb devices. Each clock is
    represented by a standard character device. In addition, the driver
    may optionally implement custom character device operations.

    The posix clock and timer system calls listed below now work with
    dynamic posix clocks, as well as the traditional static clocks.
    The following system calls are affected:

    - clock_adjtime (brand new syscall)
    - clock_gettime
    - clock_getres
    - clock_settime
    - timer_create
    - timer_delete
    - timer_gettime
    - timer_settime

    [ tglx: Adapted to the posix-timer cleanup. Moved clock_posix_dynamic
    to posix-clock.c and made all referenced functions static ]

    Signed-off-by: Richard Cochran
    Acked-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Richard Cochran
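
Since each dynamic clock is a character device, userspace needs a way to turn an open file descriptor into a clock id for the system calls listed above. The sketch below shows one such encoding, modeled on the FD_TO_CLOCKID macro mentioned elsewhere in this history; the `toy_*` names and the tag value 3 are assumptions of this example, and a 32-bit `int` is assumed.

```c
#include <assert.h>

typedef int toy_clockid_t;	/* stand-in for a signed clockid_t */

#define TOY_CLOCKFD 3		/* assumed tag in the low 3 bits */

/* Encode a file descriptor as a (negative) dynamic clock id. */
toy_clockid_t toy_fd_to_clockid(unsigned int fd)
{
	/* Unsigned arithmetic keeps the complement and shift well defined. */
	return (toy_clockid_t)(((~fd) << 3) | TOY_CLOCKFD);
}

/* Recover the file descriptor from a dynamic clock id. */
unsigned int toy_clockid_to_fd(toy_clockid_t clk)
{
	assert(clk < 0);	/* dynamic ids are negative in this encoding */
	return (~(unsigned int)clk) >> 3;
}
```

With this encoding, descriptor 3 maps to clock id -29 and back again. A caller would open the clock's device node and pass the derived id to a call such as `clock_gettime()`.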