14 Jun, 2020

1 commit

  • …git/dhowells/linux-fs

    Pull notification queue from David Howells:
    "This adds a general notification queue concept and adds an event
    source for keys/keyrings, such as linking and unlinking keys and
    changing their attributes.

    Thanks to Debarshi Ray, we do have a pull request to use this to fix a
    problem with gnome-online-accounts - as mentioned last time:

    https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

    Without this, g-o-a has to constantly poll a keyring-based kerberos
    cache to find out if kinit has changed anything.

    [ There are other notifications pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

    LSM hooks are included:

    - A set of hooks is provided that allows an LSM to rule on whether or
    not a watch may be set. Each of these hooks takes a different
    "watched object" parameter, so they're not really shareable. The
    LSM should use current's credentials. [Wanted by SELinux & Smack]

    - A hook is provided to allow an LSM to rule on whether or not a
    particular message may be posted to a particular queue. This is
    given the credentials from the event generator (which may be the
    system) and the watch setter. [Wanted by Smack]

    I've provided SELinux and Smack with implementations of some of these
    hooks.

    WHY
    ===

    Key/keyring notifications are desirable because if you have your
    kerberos tickets in a file/directory, your Gnome desktop will monitor
    that using something like fanotify and tell you if your credentials
    cache changes.

    However, we also have the ability to cache your kerberos tickets in
    the session, user or persistent keyring so that it isn't left around
    on disk across a reboot or logout. Keyrings, however, cannot currently
    be monitored asynchronously, so the desktop has to poll for it - not
    so good on a laptop. This facility will allow the desktop to avoid the
    need to poll.

    DESIGN DECISIONS
    ================

    - The notification queue is built on top of a standard pipe. Messages
    are effectively spliced in. The pipe is opened with a special flag:

    pipe2(fds, O_NOTIFICATION_PIPE);

    The special flag has the same value as O_EXCL (which doesn't seem
    like it will ever be applicable in this context)[?]. It is given up
    front to make it a lot easier to prohibit splice&co from accessing
    the pipe.

    [?] Should this be done some other way? I'd rather not use up a new
    O_* flag if I can avoid it - should I add a pipe3() system call
    instead?

    The pipe is then configured:

    ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
    ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

    Messages are then read out of the pipe using read() (a usage sketch
    follows this list).

    - It should be possible to allow write() to insert data into the
    notification pipes too, but this is currently disabled as the
    kernel has to be able to insert messages into the pipe *without*
    holding pipe->mutex and the code to make this work needs careful
    auditing.

    - sendfile(), splice() and vmsplice() are disabled on notification
    pipes because of the pipe->mutex issue and also because they
    sometimes want to revert what they just did - but one or more
    notification messages might've been interleaved in the ring.

    - The kernel inserts messages with the wait queue spinlock held. This
    means that pipe_read() and pipe_write() have to take the spinlock
    to update the queue pointers.

    - Records in the buffer are binary, typed and have a length so that
    they can be of varying size.

    This allows multiple heterogeneous sources to share a common
    buffer; there are 16 million types available, of which I've used
    just a few, so there is scope for others to be used. Tags may be
    specified when a watchpoint is created to help distinguish the
    sources.

    - Records are filterable as types have up to 256 subtypes that can be
    individually filtered. Other filtration is also available.

    - Notification pipes don't interfere with each other; each may be
    bound to a different set of watches. Any particular notification
    will be copied to all the queues that are currently watching for it
    - and only those that are watching for it.

    - When recording a notification, the kernel will not sleep, but will
    rather mark a queue as having lost a message if there's
    insufficient space. read() will fabricate a loss notification
    message at an appropriate point later.

    - The notification pipe is created and then watchpoints are attached
    to it, using one of:

    keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
    watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
    watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

    where fd (fds[1] in the first example) indicates the queue and the
    final argument is a tag between 0 and 255.

    - Watches are removed if either the notification pipe is destroyed or
    the watched object is destroyed. In the latter case, a message will
    be generated indicating the enforced watch removal.
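
    To make the flow above concrete, a minimal consumer might look like the
    sketch below (error handling trimmed; it assumes the O_NOTIFICATION_PIPE
    and IOC_WATCH_QUEUE_SET_SIZE interfaces described above, plus the
    keyctl_watch_key() attachment from the keyutils pipe-watch branch):

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <keyutils.h>
    #include <linux/watch_queue.h>

    int main(void)
    {
            int fds[2];
            char buf[4096];

            /* Open the notification pipe and fix its queue depth. */
            pipe2(fds, O_NOTIFICATION_PIPE);
            ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256);

            /* Attach a watchpoint to the session keyring, tagged 0x01. */
            keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);

            /* Whole messages come out of the read side via read(). */
            for (;;) {
                    ssize_t n = read(fds[0], buf, sizeof(buf));
                    if (n <= 0)
                            break;
                    /* ... walk the buffer using each message's length ... */
            }
            return 0;
    }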

    Things I want to avoid:

    - Introducing features that make the core VFS dependent on the
    network stack or networking namespaces (ie. usage of netlink).

    - Dumping all this stuff into dmesg and having a daemon that sits
    there parsing the output and distributing it as this then puts the
    responsibility for security into userspace and makes handling
    namespaces tricky. Further, dmesg might not exist or might be
    inaccessible inside a container.

    - Letting users see events they shouldn't be able to see.

    TESTING AND MANPAGES
    ====================

    - The keyutils tree has a pipe-watch branch that has keyctl commands
    for making use of notifications. Proposed manual pages can also be
    found on this branch, though a couple of them really need to go to
    the main manpages repository instead.

    If the kernel supports the watching of keys, then running "make
    test" on that branch will cause the testing infrastructure to spawn
    a monitoring process on the side that monitors a notification pipe
    for all the key/keyring changes induced by the tests and they'll
    all be checked off to make sure they happened.

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

    - A test program is provided (samples/watch_queue/watch_test) that
    can be used to monitor for keyring, mount and superblock events.
    Information on the notifications is simply logged to stdout"

    * tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    smack: Implement the watch_key and post_notification hooks
    selinux: Implement the watch_key security hook
    keys: Make the KEY_NEED_* perms an enum rather than a mask
    pipe: Add notification lossage handling
    pipe: Allow buffers to be marked read-whole-or-error for notifications
    Add sample notification program
    watch_queue: Add a key/keyring notification facility
    security: Add hooks to rule on setting a watch
    pipe: Add general notification queue support
    pipe: Add O_NOTIFICATION_PIPE
    security: Add a hook for the point of notification insertion
    uapi: General notification queue definitions

    Linus Torvalds
     

04 Jun, 2020

1 commit

  • Pull splice updates from Al Viro:
    "Christoph's assorted splice cleanups"

    * 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: rename pipe_buf ->steal to ->try_steal
    fs: make the pipe_buf_operations ->confirm operation optional
    fs: make the pipe_buf_operations ->steal operation optional
    trace: remove tracing_pipe_buf_ops
    pipe: merge anon_pipe_buf*_ops
    fs: simplify do_splice_from
    fs: simplify do_splice_to

    Linus Torvalds
     

03 Jun, 2020

1 commit

  • Pull io_uring updates from Jens Axboe:
    "A relatively quiet round, mostly just fixes and code improvements. In
    particular:

    - Make statx just use the generic statx handler, instead of open
    coding it. We don't need that anymore, as we always call it async
    safe (Bijan)

    - Enable closing of the ring itself. Also fixes O_PATH closure (me)

    - Properly name completion members (me)

    - Batch reap of dead file registrations (me)

    - Allow IORING_OP_POLL with double waitqueues (me)

    - Add tee(2) support (Pavel)

    - Remove double off read (Pavel)

    - Fix overflow cancellations (Pavel)

    - Improve CQ timeouts (Pavel)

    - Async defer drain fixes (Pavel)

    - Add support for enabling/disabling notifications on a registered
    eventfd (Stefano)

    - Remove dead state parameter (Xiaoguang)

    - Disable SQPOLL submit on dying ctx (Xiaoguang)

    - Various code cleanups"

    * tag 'for-5.8/io_uring-2020-06-01' of git://git.kernel.dk/linux-block: (29 commits)
    io_uring: fix overflowed reqs cancellation
    io_uring: off timeouts based only on completions
    io_uring: move timeouts flushing to a helper
    statx: hide interfaces no longer used by io_uring
    io_uring: call statx directly
    statx: allow system call to be invoked from io_uring
    io_uring: add io_statx structure
    io_uring: get rid of manual punting in io_close
    io_uring: separate DRAIN flushing into a cold path
    io_uring: don't re-read sqe->off in timeout_prep()
    io_uring: simplify io_timeout locking
    io_uring: fix flush req->refs underflow
    io_uring: don't submit sqes when ctx->refs is dying
    io_uring: async task poll trigger cleanup
    io_uring: add tee(2) support
    splice: export do_tee()
    io_uring: don't repeat valid flag list
    io_uring: rename io_file_put()
    io_uring: remove req->needs_fixed_files
    io_uring: cleanup io_poll_remove_one() logic
    ...

    Linus Torvalds
     

21 May, 2020

7 commits

  • syzbot is reporting that splice()ing from a non-empty read side to an
    already-full write side causes an unkillable task, because opipe_prep()
    erroneously fails to invert the pipe_full() test.

    CPU: 0 PID: 9460 Comm: syz-executor.5 Not tainted 5.6.0-rc3-next-20200228-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:rol32 include/linux/bitops.h:105 [inline]
    RIP: 0010:iterate_chain_key kernel/locking/lockdep.c:369 [inline]
    RIP: 0010:__lock_acquire+0x6a3/0x5270 kernel/locking/lockdep.c:4178
    Call Trace:
    lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4720
    __mutex_lock_common kernel/locking/mutex.c:956 [inline]
    __mutex_lock+0x156/0x13c0 kernel/locking/mutex.c:1103
    pipe_lock_nested fs/pipe.c:66 [inline]
    pipe_double_lock+0x1a0/0x1e0 fs/pipe.c:104
    splice_pipe_to_pipe fs/splice.c:1562 [inline]
    do_splice+0x35f/0x1520 fs/splice.c:1141
    __do_sys_splice fs/splice.c:1447 [inline]
    __se_sys_splice fs/splice.c:1427 [inline]
    __x64_sys_splice+0x2b5/0x320 fs/splice.c:1427
    do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:295
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
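
    The fix itself is tiny; a hedged sketch of the corrected fast path in
    opipe_prep() (assuming the bug is simply the missing negation):

            /* Check pipe occupancy without the lock first; we only need
             * to wait when the output pipe really is full. */
            if (!pipe_full(pipe->head, pipe->tail, pipe->max_usage))
                    return 0;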

    Reported-by: syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com
    Link: https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d
    Fixes: 8cefc107ca54c8b0 ("pipe: Use head and tail pointers for the ring, not cursor and length")
    Cc: stable@vger.kernel.org # 5.5+
    Signed-off-by: Tetsuo Handa
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • Rename the pipe_buf ->steal operation to ->try_steal, and replace the
    arcane return-value convention with a simple bool where true means
    success and false means failure.

    [AV: braino fix folded in]
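
    For reference, the operation's shape after the rename might read (a
    sketch of the new signature in struct pipe_buf_operations):

            /* Previously: int (*steal)(...), 0 on success, 1 on failure. */
            bool (*try_steal)(struct pipe_inode_info *pipe,
                              struct pipe_buffer *buf);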

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Make the pipe_buf_operations ->confirm operation optional: just
    return 0 for success if it is not present.
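
    A sketch of how the caller side can absorb the now-optional hook
    (assuming the check lives in a pipe_buf_confirm() wrapper):

            static inline int pipe_buf_confirm(struct pipe_inode_info *pipe,
                                               struct pipe_buffer *buf)
            {
                    if (!buf->ops->confirm)
                            return 0;       /* no hook: always up to date */
                    return buf->ops->confirm(pipe, buf);
            }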

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Make the pipe_buf_operations ->steal operation optional: just return
    1 for failure if it is not present.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • All the op vectors are exactly the same, they are just used to encode
    packet or nomerge behavior. There already is a flag for the packet
    behavior, so just add a new one to allow for merging. Inverting it vs
    the previous nomerge special casing actually allows for much nicer code.
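
    A hedged sketch of the inversion in pipe_write()'s append path
    (assuming the new flag is spelled PIPE_BUF_FLAG_CAN_MERGE):

            /* Append to the most recent buffer only if it opted in. */
            if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
                offset + chars <= PAGE_SIZE) {
                    /* ... copy the new bytes into the existing page ... */
            }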

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • No need for a local function pointer when we can trivially branch on
    the presence of ->splice_write.
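
    That is, roughly (a sketch; the fallback name is assumed to be the
    existing default_file_splice_write()):

            static long do_splice_from(struct pipe_inode_info *pipe,
                                       struct file *out, loff_t *ppos,
                                       size_t len, unsigned int flags)
            {
                    if (out->f_op->splice_write)
                            return out->f_op->splice_write(pipe, out, ppos,
                                                           len, flags);
                    return default_file_splice_write(pipe, out, ppos, len,
                                                     flags);
            }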

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • No need for a local function pointer when we can trivially branch on
    the presence of ->splice_read.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

19 May, 2020

1 commit

  • Make it possible to have a general notification queue built on top of a
    standard pipe. Notifications are 'spliced' into the pipe and then read
    out. splice(), vmsplice() and sendfile() are forbidden on pipes used for
    notifications as post_one_notification() cannot take pipe->mutex. This
    means that notifications could be posted in between individual pipe
    buffers, making iov_iter_revert() difficult to effect.

    The way the notification queue is used is:

    (1) An application opens a pipe with a special flag and indicates the
    number of messages it wishes to be able to queue at once (this can
    only be set once):

    pipe2(fds, O_NOTIFICATION_PIPE);
    ioctl(fds[0], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);

    (2) The application then uses poll() and read() as normal to extract data
    from the pipe. read() will return multiple notifications if the
    buffer is big enough, but it will not split a notification across
    buffers - rather it will return a short read or EMSGSIZE.

    Notification messages include a length in the header so that the
    caller can split them up.

    Each message has a header that describes it:

    struct watch_notification {
            __u32 type:24;
            __u32 subtype:8;
            __u32 info;
    };

    The type indicates the source (eg. mount tree changes, superblock events,
    keyring changes, block layer events) and the subtype indicates the event
    type (eg. mount, unmount; EIO, EDQUOT; link, unlink). The info field
    indicates a number of things, including the entry length, an ID assigned to
    a watchpoint contributing to this buffer and type-specific flags.

    Supplementary data, such as the key ID that generated an event, can be
    attached in additional slots. The maximum message size is 127 bytes.
    Messages may not be padded or aligned, so there is no guarantee, for
    example, that the notification type will be on a 4-byte boundary.
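
    Putting the header and length rules together, a consumer might walk a
    read() buffer like this (a minimal sketch, assuming the entry length
    sits in the low bits of info behind a WATCH_INFO_LENGTH mask; handle()
    is a hypothetical callback):

            #include <linux/watch_queue.h>

            extern void handle(struct watch_notification *wn); /* hypothetical */

            static void process(unsigned char *buf, size_t n)
            {
                    size_t p = 0;

                    while (p < n) {
                            struct watch_notification *wn = (void *)(buf + p);
                            size_t len = wn->info & WATCH_INFO_LENGTH;

                            if (len == 0 || len > n - p)
                                    break;  /* malformed or truncated */
                            handle(wn);
                            p += len;       /* entries may be unaligned */
                    }
            }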

    Signed-off-by: David Howells

    David Howells
     

18 May, 2020

1 commit


07 May, 2020

1 commit

  • do_splice() is used by io_uring, as will be do_tee(). Move f_mode
    checks from sys_{splice,tee}() to do_{splice,tee}(), so they're
    enforced for io_uring as well.
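
    The moved check is essentially (a sketch of the relocated test):

            if (!(in->f_mode & FMODE_READ) ||
                !(out->f_mode & FMODE_WRITE))
                    return -EBADF;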

    Fixes: 7d67af2c0134 ("io_uring: add splice(2) support")
    Reported-by: Jann Horn
    Signed-off-by: Pavel Begunkov
    Signed-off-by: Jens Axboe

    Pavel Begunkov
     

03 Mar, 2020

1 commit


09 Feb, 2020

1 commit

  • This makes the pipe code use separate wait-queues and exclusive waiting
    for readers and writers, avoiding a nasty thundering herd problem when
    there are lots of readers waiting for data on a pipe (or, less commonly,
    lots of writers waiting for a pipe to have space).

    While this isn't a common occurrence in the traditional "use a pipe as a
    data transport" case, where you typically only have a single reader and
    a single writer process, there is one common special case: using a pipe
    as a source of "locking tokens" rather than for data communication.

    In particular, the GNU make jobserver code ends up using a pipe as a way
    to limit parallelism, where each job consumes a token by reading a byte
    from the jobserver pipe, and releases the token by writing a byte back
    to the pipe.

    This pattern is fairly traditional on Unix, and works very well, but
    will waste a lot of time waking up a lot of processes when only a single
    reader needs to be woken up when a writer releases a new token.

    A simplified test-case of just this pipe interaction is to create 64
    processes, and then pass a single token around between them (this
    test-case also intentionally passes another token that gets ignored to
    test the "wake up next" logic too, in case anybody wonders about it):

    #include <unistd.h>

    int main(int argc, char **argv)
    {
            int fd[2], counters[2];

            pipe(fd);
            counters[0] = 0;
            counters[1] = -1;
            write(fd[1], counters, sizeof(counters));

            /* 64 processes */
            fork(); fork(); fork(); fork(); fork(); fork();

            do {
                    int i;
                    read(fd[0], &i, sizeof(i));
                    if (i < 0)
                            continue;
                    counters[0] = i+1;
                    write(fd[1], counters, (1+(i & 1)) * sizeof(int));
            } while (counters[0] < 1000000);
            return 0;
    }

    and in a perfect world, passing that token around should only cause one
    context switch per transfer, when the writer of a token causes a
    directed wakeup of just a single reader.

    But with the "writer wakes all readers" model we traditionally had, on
    my test box the above case causes more than an order of magnitude more
    scheduling: instead of the expected ~1M context switches, "perf stat"
    shows

    231,852.37 msec task-clock # 15.857 CPUs utilized
    11,250,961 context-switches # 0.049 M/sec
    616,304 cpu-migrations # 0.003 M/sec
    1,648 page-faults # 0.007 K/sec
    1,097,903,998,514 cycles # 4.735 GHz
    120,781,778,352 instructions # 0.11 insn per cycle
    27,997,056,043 branches # 120.754 M/sec
    283,581,233 branch-misses # 1.01% of all branches

    14.621273891 seconds time elapsed

    0.018243000 seconds user
    3.611468000 seconds sys

    before this commit.

    After this commit, I get

    5,229.55 msec task-clock # 3.072 CPUs utilized
    1,212,233 context-switches # 0.232 M/sec
    103,951 cpu-migrations # 0.020 M/sec
    1,328 page-faults # 0.254 K/sec
    21,307,456,166 cycles # 4.074 GHz
    12,947,819,999 instructions # 0.61 insn per cycle
    2,881,985,678 branches # 551.096 M/sec
    64,267,015 branch-misses # 2.23% of all branches

    1.702148350 seconds time elapsed

    0.004868000 seconds user
    0.110786000 seconds sys

    instead. Much better.

    [ Note! This kernel improvement seems to be very good at triggering a
    race condition in the make jobserver (in GNU make 4.2.1) for me. It's
    a long known bug that was fixed back in June 2017 by GNU make commit
    b552b0525198 ("[SV 51159] Use a non-blocking read with pselect to
    avoid hangs.").

    But there wasn't a new release of GNU make until 4.3 on Jan 19 2020,
    so a number of distributions may still have the buggy version. Some
    have backported the fix to their 4.2.1 release, though, and even
    without the fix it's quite timing-dependent whether the bug actually
    is hit. ]

    Josh Triplett says:
    "I've been hammering on your pipe fix patch (switching to exclusive
    wait queues) for a month or so, on several different systems, and I've
    run into no issues with it. The patch *substantially* improves
    parallel build times on large (~100 CPU) systems, both with parallel
    make and with other things that use make's pipe-based jobserver.

    All current distributions (including stable and long-term stable
    distributions) have versions of GNU make that no longer have the
    jobserver bug"

    Tested-by: Josh Triplett
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

08 Dec, 2019

1 commit

  • This code is ancient, and goes back to when we only had a single page
    for the pipe buffers. The exact history is hidden in the mists of time
    (ie "before git", and in fact predates the BK repository too).

    At that long-ago point in time, it actually helped to try to merge big
    back-and-forth pipe reads and writes, and not limit pipe reads to the
    single pipe buffer in length just because that was all we had at a time.

    However, since then we've expanded the pipe buffers to multiple pages,
    and this logic really doesn't seem to make sense. And a lot of it is
    somewhat questionable (ie "hmm, the user asked for a non-blocking read,
    but we see that there's a writer pending, so let's wait anyway to get
    the extra data that the writer will have").

    But more importantly, it makes the "go to sleep" logic much less
    obvious, and considering the wakeup issues we've had, I want to make for
    less of those kinds of things.

    Cc: David Howells
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

07 Dec, 2019

1 commit

  • Similarly to commit 8f868d68d335 ("pipe: Fix missing mask update after
    pipe_wait()") this fixes a case where the pipe rewrite ended up caching
    the pipe state incorrectly over a pipe lock drop event.

    It wasn't quite as obvious, because you needed to splice data from a
    pipe to a file, which is a fairly unusual operation, but it's completely
    wrong.

    Make sure we load the pipe head/tail/size information only after we've
    waited for there to be data in the pipe.

    While in that file, also make one of the splice helper functions use the
    canonical argument order for pipe_empty(). That's syntactic - pipe
    emptiness is just that head and tail are equal, and thus mixing up head
    and tail doesn't really matter. It's still wrong, though.

    Reported-by: David Sterba
    Cc: David Howells
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 Dec, 2019

1 commit

  • …ux/kernel/git/dhowells/linux-fs

    Pull pipe rework from David Howells:
    "This is my set of preparatory patches for building a general
    notification queue on top of pipes. It makes a number of significant
    changes:

    - It removes the nr_exclusive argument from __wake_up_sync_key() as
    this is always 1. This prepares for the next step:

    - Adds wake_up_interruptible_sync_poll_locked() so that poll can be
    woken up from a function that's holding the poll waitqueue
    spinlock.

    - Change the pipe buffer ring to be managed in terms of unbounded
    head and tail indices rather than bounded index and length. This
    means that reading the pipe only needs to modify one index, not
    two.

    - A selection of helper functions are provided to query the state of
    the pipe buffer, plus a couple to apply updates to the pipe
    indices.

    - The pipe ring is allowed to have kernel-reserved slots. This allows
    many notification messages to be spliced in by the kernel without
    allowing userspace to pin too many pages if it writes to the same
    pipe.

    - Advance the head and tail indices inside the pipe waitqueue lock
    and use wake_up_interruptible_sync_poll_locked() to poke poll
    without having to take the lock twice.

    - Rearrange pipe_write() to preallocate the buffer it is going to
    write into and then drop the spinlock. This allows kernel
    notifications to then be added to the ring whilst it is filling the
    buffer it allocated. The read side is stalled because the pipe
    mutex is still held.

    - Don't wake up readers on a pipe if there was already data in it
    when we added more.

    - Don't wake up writers on a pipe if the ring wasn't full before we
    removed a buffer"

    * tag 'notifications-pipe-prep-20191115' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    pipe: Remove sync on wake_ups
    pipe: Increase the writer-wakeup threshold to reduce context-switch count
    pipe: Check for ring full inside of the spinlock in pipe_write()
    pipe: Remove redundant wakeup from pipe_write()
    pipe: Rearrange sequence in pipe_write() to preallocate slot
    pipe: Conditionalise wakeup in pipe_read()
    pipe: Advance tail pointer inside of wait spinlock in pipe_read()
    pipe: Allow pipes to have kernel-reserved slots
    pipe: Use head and tail pointers for the ring, not cursor and length
    Add wake_up_interruptible_sync_poll_locked()
    Remove the nr_exclusive argument from __wake_up_sync_key()
    pipe: Reduce #inclusion of pipe_fs_i.h

    Linus Torvalds
     

16 Nov, 2019

1 commit

  • Split pipe->ring_size into two numbers:

    (1) pipe->ring_size - indicates the hard size of the pipe ring.

    (2) pipe->max_usage - indicates the maximum number of pipe ring slots that
    userspace-orchestrated events can fill.

    This allows for a pipe that is both writable by the general kernel
    notification facility and by userspace, allowing plenty of ring space for
    notifications to be added whilst preventing userspace from being able to
    pin too much unswappable kernel space.
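
    A hedged sketch of how the two limits interact when deciding how much
    room userspace may see (modelled on the pipe_space_for_user() helper):

            static inline unsigned int pipe_space_for_user(unsigned int head,
                                                           unsigned int tail,
                                                           struct pipe_inode_info *pipe)
            {
                    unsigned int occupancy = head - tail;
                    unsigned int space;

                    if (occupancy >= pipe->max_usage)
                            return 0;       /* userspace's share is used up */
                    space = pipe->ring_size - occupancy;
                    if (space > pipe->max_usage)
                            space = pipe->max_usage;
                    return space;
            }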

    Signed-off-by: David Howells

    David Howells
     

31 Oct, 2019

1 commit

  • Convert pipes to use head and tail pointers for the buffer ring rather than
    pointer and length as the latter requires two atomic ops to update (or a
    combined op) whereas the former only requires one.

    (1) The head pointer is the point at which production occurs and points to
    the slot in which the next buffer will be placed. This is equivalent
    to pipe->curbuf + pipe->nrbufs.

    The head pointer belongs to the write-side.

    (2) The tail pointer is the point at which consumption occurs. It points
    to the next slot to be consumed. This is equivalent to pipe->curbuf.

    The tail pointer belongs to the read-side.

    (3) head and tail are allowed to run to UINT_MAX and wrap naturally. They
    are only masked off when the array is being accessed, e.g.:

    pipe->bufs[head & mask]

    This means that it is not necessary to have a dead slot in the ring as
    head == tail isn't ambiguous.

    (4) The ring is empty if "head == tail".

    A helper, pipe_empty(), is provided for this.

    (5) The occupancy of the ring is "head - tail".

    A helper, pipe_occupancy(), is provided for this.

    (6) The number of free slots in the ring is "pipe->ring_size - occupancy".

    A helper, pipe_space_for_user(), is provided to indicate how many slots
    userspace may use.

    (7) The ring is full if "head - tail >= pipe->ring_size".

    A helper, pipe_full(), is provided for this.
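
    Taken together, the helpers reduce to a few lines of unsigned
    arithmetic (a sketch consistent with points (3)-(7) above; wrap-around
    keeps the subtraction correct):

            static inline bool pipe_empty(unsigned int head, unsigned int tail)
            {
                    return head == tail;                    /* (4) */
            }

            static inline unsigned int pipe_occupancy(unsigned int head,
                                                      unsigned int tail)
            {
                    return head - tail;                     /* (5) */
            }

            static inline bool pipe_full(unsigned int head, unsigned int tail,
                                         unsigned int limit)
            {
                    return pipe_occupancy(head, tail) >= limit; /* (7) */
            }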

    Signed-off-by: David Howells

    David Howells
     

15 Oct, 2019

1 commit

  • Andreas Grünbacher reports that on the two filesystems that support
    iomap directio, it's possible for splice() to return -EAGAIN (instead of
    a short splice) if the pipe being written to has less space available in
    its pipe buffers than the length supplied by the calling process.

    Months ago we fixed splice_direct_to_actor to clamp the length of the
    read request to the size of the splice pipe. Do the same to do_splice.

    Fixes: 17614445576b6 ("splice: don't read more than available pipe space")
    Reported-by: syzbot+3c01db6025f26530cf8d@syzkaller.appspotmail.com
    Reported-by: Andreas Grünbacher
    Reviewed-by: Andreas Grünbacher
    Signed-off-by: Darrick J. Wong

    Darrick J. Wong
     

01 Jun, 2019

1 commit


21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

27 Apr, 2019

1 commit

  • Pull tracing fixes from Steven Rostedt:
    "Three tracing fixes:

    - Use "nosteal" for ring buffer splice pages

    - Memory leak fix in error path of trace_pid_write()

    - Fix preempt_enable_no_resched() (use preempt_enable()) in ring
    buffer code"

    * tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    trace: Fix preempt_enable_no_resched() abuse
    tracing: Fix a memory leak by early error exit in trace_pid_write()
    tracing: Fix buffer_ref pipe ops

    Linus Torvalds
     

26 Apr, 2019

1 commit

  • This fixes multiple issues in buffer_pipe_buf_ops:

    - The ->steal() handler must not return zero unless the pipe buffer has
    the only reference to the page. But generic_pipe_buf_steal() assumes
    that every reference to the pipe is tracked by the page's refcount,
    which isn't true for these buffers - buffer_pipe_buf_get(), which
    duplicates a buffer, doesn't touch the page's refcount.

    Fix it by using generic_pipe_buf_nosteal(), which refuses every
    attempted theft. It should be easy to actually support ->steal, but the
    only current users of pipe_buf_steal() are the virtio console and FUSE,
    and they also only use it as an optimization. So it's probably not worth
    the effort.

    - The ->get() and ->release() handlers can be invoked concurrently on pipe
    buffers backed by the same struct buffer_ref. Make them safe against
    concurrency by using refcount_t.

    - The pointers stored in ->private were only zeroed out when the last
    reference to the buffer_ref was dropped. As far as I know, this
    shouldn't be necessary anyway, but if we do it, let's always do it.
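
    A hedged sketch of the refcount_t conversion for the second point
    (field layout abbreviated; fragments show the two call sites):

            struct buffer_ref {
                    struct ring_buffer      *buffer;
                    void                    *page;
                    int                     cpu;
                    refcount_t              refcount;       /* was a bare int */
            };

            /* duplicating a buffer */
            refcount_inc(&ref->refcount);

            /* releasing one */
            if (refcount_dec_and_test(&ref->refcount)) {
                    /* last user: return the page to the ring buffer */
            }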

    Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com

    Cc: Ingo Molnar
    Cc: Masami Hiramatsu
    Cc: Al Viro
    Cc: stable@vger.kernel.org
    Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer")
    Signed-off-by: Jann Horn
    Signed-off-by: Steven Rostedt (VMware)

    Jann Horn
     

15 Apr, 2019

2 commits

  • Merge page ref overflow branch.

    Jann Horn reported that he can overflow the page ref count with
    sufficient memory (and a filesystem that is intentionally extremely
    slow).

    Admittedly it's not exactly easy. To have more than four billion
    references to a page requires a minimum of 32GB of kernel memory just
    for the pointers to the pages, much less any metadata to keep track of
    those pointers. Jann needed a total of 140GB of memory and a specially
    crafted filesystem that leaves all reads pending (in order to not ever
    free the page references and just keep adding more).

    Still, we have a fairly straightforward way to limit the two obvious
    user-controllable sources of page references: direct-IO like page
    references gotten through get_user_pages(), and the splice pipe page
    duplication. So let's just do that.

    * branch page-refs:
    fs: prevent page refcount overflow in pipe_buf_get
    mm: prevent get_user_pages() from overflowing page refcount
    mm: add 'try_get_page()' helper function
    mm: make page ref count overflow check tighter and more explicit

    Linus Torvalds
     
  • Change pipe_buf_get() to return a bool indicating whether it succeeded
    in raising the refcount of the page (if the thing in the pipe is a page).
    This removes another mechanism for overflowing the page refcount. All
    callers converted to handle a failure.
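
    The wrapper after the conversion is then roughly (a sketch;
    __must_check makes callers handle the new failure mode):

            static inline __must_check bool pipe_buf_get(struct pipe_inode_info *pipe,
                                                         struct pipe_buffer *buf)
            {
                    /* false if taking the reference would overflow */
                    return buf->ops->get(pipe, buf);
            }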

    Reported-by: Jann Horn
    Signed-off-by: Matthew Wilcox
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

13 Mar, 2019

1 commit

  • Pull misc vfs updates from Al Viro:
    "Assorted fixes (really no common topic here)"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: Make __vfs_write() static
    vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1
    pipe: stop using ->can_merge
    splice: don't merge into linked buffers
    fs: move generic stat response attr handling to vfs_getattr_nosec
    orangefs: don't reinitialize result_mask in ->getattr
    fs/devpts: always delete dcache dentry-s in dput()

    Linus Torvalds
     

05 Mar, 2019

2 commits

  • The current implementation of splice() and tee() ignores O_NONBLOCK set
    on pipe file descriptors and checks only the SPLICE_F_NONBLOCK flag for
    blocking on pipe arguments. This is inconsistent since splice()-ing
    from/to non-pipe file descriptors does take O_NONBLOCK into
    consideration.

    Fix this by promoting O_NONBLOCK, when set on a pipe, to
    SPLICE_F_NONBLOCK.

    Some context for how the current implementation of splice() leads to
    inconsistent behavior. In the ongoing work[1] to add VM tracing
    capability to trace-cmd we stream tracing data over named FIFOs or
    vsockets from guests back to the host.

    When we receive SIGINT from user to stop tracing, we set O_NONBLOCK on
    the input file descriptor and set SPLICE_F_NONBLOCK for the next call to
    splice(). If splice() was blocked waiting on data from the input FIFO,
    after SIGINT splice() restarts with the same arguments (no
    SPLICE_F_NONBLOCK) and blocks again instead of returning -EAGAIN when no
    data is available.

    This differs from the splice() behavior when reading from a vsocket or
    when we're doing a traditional read()/write() loop (trace-cmd's
    --nosplice argument).

    With this patch applied we get the same behavior in all situations after
    setting O_NONBLOCK which also matches the behavior of doing a
    read()/write() loop instead of splice().

    This change does have potential of breaking users who don't expect
    EAGAIN from splice() when SPLICE_F_NONBLOCK is not set. OTOH programs
    that set O_NONBLOCK and don't anticipate EAGAIN are arguably buggy[2].

    [1] https://github.com/skaslev/trace-cmd/tree/vsock
    [2] https://github.com/torvalds/linux/blob/d47e3da1759230e394096fd742aad423c291ba48/fs/read_write.c#L1425
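
    The promotion itself amounts to a couple of checks at the top of the
    splice/tee paths (a sketch; applied only when the descriptor really is
    a pipe):

            if (ipipe && (in->f_flags & O_NONBLOCK))
                    flags |= SPLICE_F_NONBLOCK;
            if (opipe && (out->f_flags & O_NONBLOCK))
                    flags |= SPLICE_F_NONBLOCK;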

    Signed-off-by: Slavomir Kaslev
    Reviewed-by: Steven Rostedt (VMware)
    Signed-off-by: Linus Torvalds

    Slavomir Kaslev
     
  • Every in-kernel use of get_ds() defined it to KERNEL_DS (either as
    an actual define, or as an inline function). It's an entirely
    historical artifact, and long long long ago used to actually read the
    segment selector value of '%ds' on x86.

    Which in the kernel is always KERNEL_DS.

    Inspired by a patch from Jann Horn that just did this for a very small
    subset of users (the ones in fs/), along with Al who suggested a script.
    I then just took it to the logical extreme and removed all the remaining
    gunk.

    Roughly scripted with

    git grep -l '(get_ds())' -- :^tools/ | xargs sed -i 's/(get_ds())/(KERNEL_DS)/'
    git grep -lw 'get_ds' -- :^tools/ | xargs sed -i '/^#define get_ds()/d'

    plus manual fixups to remove a few unusual usage patterns, the couple of
    inline function cases and to fix up a comment that had become stale.

    The 'get_ds()' function remains in an x86 kvm selftest, since in user
    space it actually does something relevant.

    Inspired-by: Jann Horn
    Inspired-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 Feb, 2019

2 commits

  • Al Viro pointed out that since there is only one pipe buffer type to which
    new data can be appended, it isn't necessary to have a ->can_merge field in
    struct pipe_buf_operations; we can just check for a magic type.

    Suggested-by: Al Viro
    Signed-off-by: Jann Horn
    Signed-off-by: Al Viro

    Jann Horn
     
  • Before this patch, it was possible for two pipes to affect each other after
    data had been transferred between them with tee():

    ============
    $ cat tee_test.c

    #define _GNU_SOURCE             /* for tee() */
    #include <err.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
            int pipe_a[2];
            if (pipe(pipe_a)) err(1, "pipe");
            int pipe_b[2];
            if (pipe(pipe_b)) err(1, "pipe");
            if (write(pipe_a[1], "abcd", 4) != 4) err(1, "write");
            if (tee(pipe_a[0], pipe_b[1], 2, 0) != 2) err(1, "tee");
            if (write(pipe_b[1], "xx", 2) != 2) err(1, "write");

            char buf[5];
            if (read(pipe_a[0], buf, 4) != 4) err(1, "read");
            buf[4] = 0;
            printf("got back: '%s'\n", buf);
    }
    $ gcc -o tee_test tee_test.c
    $ ./tee_test
    got back: 'abxx'
    $
    ============

    As suggested by Al Viro, fix it by creating a separate type for
    non-mergeable pipe buffers, then changing the types of buffers in
    splice_pipe_to_pipe() and link_pipe().
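
    A hedged sketch of the duplication sites after the fix (assuming a
    helper named pipe_buf_mark_unmergeable() that swaps in the
    non-mergeable buffer type):

            *obuf = *ibuf;
            /* Retype the copy so later writes cannot append to the page
             * it still shares with the source pipe. */
            pipe_buf_mark_unmergeable(obuf);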

    Cc:
    Fixes: 7c77f0b3f920 ("splice: implement pipe to pipe splicing")
    Fixes: 70524490ee2e ("[PATCH] splice: add support for sys_tee()")
    Suggested-by: Al Viro
    Signed-off-by: Jann Horn
    Signed-off-by: Al Viro

    Jann Horn
     

05 Dec, 2018

1 commit

  • In commit 4721a601099, we tried to fix a problem wherein directio reads
    into a splice pipe will bounce EFAULT/EAGAIN all the way out to
    userspace by simulating a zero-byte short read. This happens because
    some directio read implementations (xfs) will call
    bio_iov_iter_get_pages to grab pipe buffer pages and issue asynchronous
    reads, but as soon as we run out of pipe buffers that _get_pages call
    returns EFAULT, which the splice code translates to EAGAIN and bounces
    out to userspace.

    In that commit, the iomap code catches the EFAULT and simulates a
    zero-byte read, but that causes assertion errors on regular splice reads
    because xfs doesn't allow short directio reads.

    The brokenness is compounded by splice_direct_to_actor immediately
    bailing on do_splice_to returning <= 0 without ever calling the actor
    (which empties out the pipe), so if userspace calls back we'll EFAULT
    again on the full pipe, and nothing ever gets copied.

    Therefore, teach splice_direct_to_actor to clamp its requests to the
    amount of free space in the pipe and remove the simulated short read
    from the iomap directio code.

    Fixes: 4721a601099 ("iomap: dio data corruption and spurious errors when pipes fill")
    Reported-by: Murphy Zhou
    Ranted-by: Amir Goldstein
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Darrick J. Wong

    Darrick J. Wong
     

24 Oct, 2018

1 commit

  • In the iov_iter struct, separate the iterator type from the iterator
    direction and use accessor functions to access them in most places.

    Convert a bunch of places to use switch-statements to access them rather
    than chains of bitwise-AND statements. This makes it easier to add further
    iterator types. Also, this can be more efficient as, to implement a switch
    of small contiguous integers, the compiler can use ~50% fewer compare
    instructions than it has to use bitwise-and instructions.

    Further, cease passing the iterator type into the iterator setup function.
    The iterator function can set that itself. Only the direction is required.
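
    With the split, call sites can switch on the type cleanly (a sketch,
    assuming accessors named iov_iter_type() and iov_iter_rw()):

            switch (iov_iter_type(iter)) {
            case ITER_IOVEC:
            case ITER_KVEC:
                    /* ... gather from the segment array ... */
                    break;
            case ITER_BVEC:
                    /* ... walk bio_vec pages ... */
                    break;
            case ITER_PIPE:
                    /* only meaningful when iov_iter_rw(iter) == READ */
                    break;
            }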

    Signed-off-by: David Howells

    David Howells
     

16 Jun, 2018

1 commit


13 Jun, 2018

1 commit

  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a, b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

11 Jun, 2018

1 commit


03 Apr, 2018

1 commit

  • Using the fs-internal do_vmsplice() helper allows us to get rid of the
    fs-internal call to the sys_vmsplice() syscall.

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     

25 Oct, 2017

1 commit

  • …READ_ONCE()/WRITE_ONCE()

    Please do not apply this to mainline directly, instead please re-run the
    coccinelle script shown below and apply its output.

    For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
    preference to ACCESS_ONCE(), and new code is expected to use one of the
    former. So far, there's been no reason to change most existing uses of
    ACCESS_ONCE(), as these aren't harmful, and changing them results in
    churn.

    However, for some features, the read/write distinction is critical to
    correct operation. To distinguish these cases, separate read/write
    accessors must be used. This patch migrates (most) remaining
    ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
    coccinelle script:

    ----
    // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
    // WRITE_ONCE()

    // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

    virtual patch

    @ depends on patch @
    expression E1, E2;
    @@

    - ACCESS_ONCE(E1) = E2
    + WRITE_ONCE(E1, E2)

    @ depends on patch @
    expression E;
    @@

    - ACCESS_ONCE(E)
    + READ_ONCE(E)
    ----

    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: davem@davemloft.net
    Cc: linux-arch@vger.kernel.org
    Cc: mpe@ellerman.id.au
    Cc: shuah@kernel.org
    Cc: snitzer@redhat.com
    Cc: thor.thayer@linux.intel.com
    Cc: tj@kernel.org
    Cc: viro@zeniv.linux.org.uk
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Mark Rutland
     

05 Sep, 2017

1 commit