13 Dec, 2015

1 commit

  • A line discipline which does not define a receive_buf() method can
    cause a GPF if data is ever received [1]. Oddly, this was known
    to the author of n_tracesink in 2011, but never fixed.

    [1] GPF report
    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [< (null)>] (null)
    PGD 3752d067 PUD 37a7b067 PMD 0
    Oops: 0010 [#1] SMP KASAN
    Modules linked in:
    CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: events_unbound flush_to_ldisc
    task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000
    RIP: 0010:[] [< (null)>] (null)
    RSP: 0018:ffff88006db67b50 EFLAGS: 00010246
    RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102
    RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388
    RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0
    R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000
    R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8
    FS: 0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0
    Stack:
    ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000
    ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90
    ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078
    Call Trace:
    [] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
    [] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
    [] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
    [] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
    Code: Bad RIP value.
    RIP [< (null)>] (null)
    RSP
    CR2: 0000000000000000
    ---[ end trace a587f8947e54d6ea ]---

    Reported-by: Dmitry Vyukov
    Cc:
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
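    The failure mode above can be sketched in userspace. This is a
    hypothetical, simplified model (struct and function names below are
    stand-ins, not the kernel's struct tty_ldisc_ops or flush path):
    calling an undefined receive_buf() jumps through NULL, and a guard
    turns that into a harmless no-op.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Simplified stand-in for the kernel's ldisc method table. */
    struct ldisc_ops {
        int (*receive_buf)(const unsigned char *cp, size_t count);
    };

    static int receive_data(const struct ldisc_ops *ops,
                            const unsigned char *cp, size_t count)
    {
        (void)cp;
        (void)count;
        /* Guard: without this check, received data jumps through NULL. */
        if (!ops->receive_buf)
            return 0;                 /* drop the data, consume nothing */
        return ops->receive_buf(cp, count);
    }

    int main(void)
    {
        /* An n_tracesink-like ldisc that defines no receive_buf(). */
        struct ldisc_ops no_rx = { .receive_buf = NULL };
        unsigned char buf[4] = { 'a', 'b', 'c', 'd' };

        /* With the guard, this is a no-op instead of a GPF. */
        assert(receive_data(&no_rx, buf, sizeof(buf)) == 0);
        return 0;
    }
    ```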

18 Oct, 2015

2 commits

  • The commonly accepted wisdom that scheduling work on the same cpu
    that handled the interrupt i/o benefits from cache locality is only
    true if that cpu is idle (since bound kworkers often have the
    highest vruntime and thus the lowest priority).

    Measurements of scheduling via the unbound queue show worst-case
    latency responses lowered by up to 5x relative to the bound workqueue,
    without an increase in average latency or a loss in throughput.

    pty i/o test measurements show >3x (!) reduced total running time; tests
    previously taking ~8s now complete in
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
  • Introduce API functions to restart and cancel tty buffer work, rather
    than manipulate buffer work directly.

    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

05 Oct, 2015

3 commits

  • A race on buffer data happens when newly committed data is
    picked up by an old flush work in the following scenario:
    __tty_buffer_request_room does a plain write of tail->commit,
    with no barriers executed before that.
    At this point flush_to_ldisc reads this new value of commit,
    and reads buffer data, with no barriers in between.
    The committed buffer data is not necessarily visible to flush_to_ldisc.

    Similar bug happens when tty_schedule_flip commits data.

    Update commit with smp_store_release and read commit with
    smp_load_acquire, as it is commit that signals data readiness.
    This is orthogonal to the existing synchronization on tty_buffer.next,
    which is required to not dismiss a buffer with unconsumed data.

    The data race was found with KernelThreadSanitizer (KTSAN).

    Signed-off-by: Dmitry Vyukov
    Reviewed-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Vyukov
     
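    The release/acquire pairing described above can be demonstrated in
    userspace. This is a sketch, not kernel code: C11
    memory_order_release/acquire stands in for smp_store_release() and
    smp_load_acquire(), and the variable names are simplified.

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <stdatomic.h>

    static unsigned char data[16];
    static _Atomic int commit_idx;     /* stands in for tail->commit */

    static void *producer(void *arg)
    {
        (void)arg;
        data[0] = 0x5a;                /* plain write of buffer data */
        /* release store: data written above is ordered before this */
        atomic_store_explicit(&commit_idx, 1, memory_order_release);
        return NULL;
    }

    static void *consumer(void *arg)
    {
        /* acquire load: spin until commit signals data readiness */
        while (atomic_load_explicit(&commit_idx, memory_order_acquire) == 0)
            ;
        /* the acquire guarantees the committed data is now visible */
        *(int *)arg = data[0];
        return NULL;
    }

    static int run_once(void)
    {
        pthread_t p, c;
        int seen = 0;

        pthread_create(&c, NULL, consumer, &seen);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return seen;                   /* 0x5a if the ordering held */
    }

    int main(void)
    {
        assert(run_once() == 0x5a);
        return 0;
    }
    ```

    Compile with -pthread. Without the release/acquire pair (plain
    stores and loads), the consumer could observe commit == 1 yet read
    stale buffer data, which is exactly the race described above.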
  • tty_buffer_flush frees buffers that were not acquired.
    As a result, for example, a read of b->size in tty_buffer_free
    can return a garbage value, which will lead to a huge buffer
    hanging in the freelist. This is merely the most benign
    manifestation of freeing an object that was not acquired.
    If the object is passed to kfree, the heap can be corrupted.

    Acquire visibility over the buffer before freeing it.

    The data race was found with KernelThreadSanitizer (KTSAN).

    Signed-off-by: Dmitry Vyukov
    Reviewed-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Vyukov
     
  • flush_to_ldisc reads port->itty and checks that it is not NULL,
    while release_tty concurrently sets port->itty to NULL. It is
    possible that flush_to_ldisc loads port->itty once, ensures that
    it is not NULL, but then reloads it and uses the new value. The
    second load can return NULL, which will cause a crash.

    Use READ_ONCE to read port->itty.

    The data race was found with KernelThreadSanitizer (KTSAN).

    Signed-off-by: Dmitry Vyukov
    Reviewed-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Vyukov
     
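    The check-then-reload hazard and the single-load fix can be shown
    with a userspace sketch. READ_ONCE is approximated here with a
    volatile cast, which is essentially what the kernel macro does for
    word-sized loads; the struct and field names are illustrative.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Approximation of the kernel's READ_ONCE() for a pointer. */
    #define READ_ONCE_PTR(x) (*(void * volatile *)&(x))

    struct tty { int count; };

    static struct tty *itty;       /* stands in for port->itty */

    static int use_tty(void)
    {
        /* Load the shared pointer exactly once into a local ... */
        struct tty *t = READ_ONCE_PTR(itty);
        if (!t)
            return -1;
        /* ... and only ever use the checked local; never reload itty,
         * so a concurrent writer setting it to NULL cannot be observed
         * between the check and the use. */
        return t->count;
    }

    int main(void)
    {
        static struct tty t = { .count = 7 };
        itty = &t;
        assert(use_tty() == 7);
        itty = NULL;               /* models release_tty() */
        assert(use_tty() == -1);
        return 0;
    }
    ```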

24 Jul, 2015

2 commits


19 May, 2015

1 commit


11 May, 2015

2 commits

  • A read() from a pty master may mistakenly indicate EOF (errno == -EIO)
    after the pty slave has closed, even though input data remains to be read.
    For example,

    pty slave           |   input worker        |   pty master
                        |                       |
                        |                       | n_tty_read()
    pty_write()         |                       |   input avail? no
      add data          |                       |   sleep
      schedule worker ->|                       |   .
                        |-> flush_to_ldisc()    |   .
    pty_close()         |     fill read buffer  |   .
      wait for worker   |     wakeup reader  -->|   .
                        |     read buffer full? |-> input avail? yes

    Several conditions are required to trigger this race:
    1. the ldisc read buffer must become full so the input worker exits
    2. the read() count is >= 4096 so the ldisc read buffer
       is empty
    3. the subsequent read() occurs before the kicked worker has processed
       more input

    However, the underlying cause of the race is that data is pipelined, while
    tty state is not; ie., data already written by the pty slave end is not
    yet visible to the pty master end, but state changes by the pty slave end
    are visible to the pty master end immediately.

    Pipeline the TTY_OTHER_CLOSED state through input worker to the reader.
    1. Introduce TTY_OTHER_DONE which is set by the input worker when
    TTY_OTHER_CLOSED is set and either the input buffers are flushed or
    input processing has completed. Readers/polls are woken when
    TTY_OTHER_DONE is set.
    2. Reader/poll checks TTY_OTHER_DONE instead of TTY_OTHER_CLOSED.
    3. A new input worker is started from pty_close() after setting
    TTY_OTHER_CLOSED, which ensures the TTY_OTHER_DONE state will be
    set if the last input worker is already finished (or just about to
    exit).

    Remove tty_flush_to_ldisc(); no in-tree callers.

    Fixes: 52bce7f8d4fc ("pty, n_tty: Simplify input processing on final close")
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96311
    BugLink: http://bugs.launchpad.net/bugs/1429756
    Cc: # 3.19+
    Reported-by: Andy Whitcroft
    Reported-by: H.J. Lu
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
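    The three-step design above can be modeled with a toy,
    single-threaded state machine. This is a sketch only: the struct,
    fields, and helpers below are hypothetical simplifications (the real
    code uses tty flags bits and runs the input worker concurrently).

    ```c
    #include <assert.h>

    enum { OTHER_CLOSED = 1, OTHER_DONE = 2 };

    struct pty {
        int flags;
        int pending;    /* bytes the worker has yet to move to the reader */
        int readable;   /* bytes available to read() */
    };

    static void input_worker(struct pty *p)
    {
        p->readable += p->pending;        /* models flush_to_ldisc() */
        p->pending = 0;
        /* step 1: set DONE only once input processing has completed */
        if (p->flags & OTHER_CLOSED)
            p->flags |= OTHER_DONE;
    }

    static void slave_close(struct pty *p)
    {
        p->flags |= OTHER_CLOSED;
        input_worker(p);                  /* step 3: kick one more worker */
    }

    /* returns bytes read, or -1 for EOF (models -EIO) */
    static int master_read(struct pty *p)
    {
        if (p->readable) {
            int n = p->readable;
            p->readable = 0;
            return n;
        }
        /* step 2: the reader checks DONE, not CLOSED */
        return (p->flags & OTHER_DONE) ? -1 : 0;
    }

    int main(void)
    {
        struct pty p = { 0, 5, 0 };   /* slave wrote 5 bytes, not yet flushed */
        slave_close(&p);
        assert(master_read(&p) == 5); /* data still readable after close */
        assert(master_read(&p) == -1);/* EOF only after data is drained */
        return 0;
    }
    ```

    Had the reader checked OTHER_CLOSED directly (the old behavior), it
    would have reported EOF while the 5 bytes were still in flight.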
  • We should not be doing assignments within an if () block,
    so fix up the code to not do this.

    This change was created using Coccinelle.

    CC: Jiri Slaby
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
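    The style rule can be illustrated with a minimal before/after pair.
    The variable names and allocation here are hypothetical, not from
    the actual patch:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    static int before(void)
    {
        char *p;
        if ((p = malloc(8)) == NULL)   /* assignment buried in the condition */
            return -1;
        free(p);
        return 0;
    }

    static int after(void)
    {
        char *p = malloc(8);           /* assignment on its own line */

        if (p == NULL)
            return -1;
        free(p);
        return 0;
    }

    int main(void)
    {
        assert(before() == 0);
        assert(after() == 0);
        return 0;
    }
    ```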

03 Feb, 2015

1 commit

  • The pty driver does not clear its write buffer when commanded.
    This is to avoid an apparent deadlock between parallel flushes from
    both pty ends; specifically when handling either BRK or INTR input.
    However, parallel flushes from this source are not possible since
    the pty master can never be set to BRKINT or ISIG. Parallel flushes
    from other sources are possible but these do not threaten deadlocks.

    Annotate the tty buffer mutex for lockdep to represent the nested
    tty_buffer locking which occurs when the pty slave is processing input
    (its buffer mutex held) and receives INTR or BRK and acquires the
    linked tty buffer mutex via tty_buffer_flush().

    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

06 Nov, 2014

1 commit

  • tty_ldisc_flush() first clears the line discipline input buffer,
    then clears the tty flip buffers. However, this ordering allows
    data still in the tty flip buffers to reach the ldisc input buffer
    after that buffer has been cleared, but before the flip buffers
    have been cleared.

    Add an optional ldisc parameter to tty_buffer_flush() to allow
    tty_ldisc_flush() to pass the ldisc to clear.

    NB: Initially, the plan was to do this automatically in
    tty_buffer_flush(). However, an audit of the behavior of existing
    line disciplines showed that performing a ldisc buffer flush on
    ioctl(TCFLSH) was not always the outcome. For example, some line
    disciplines have flush_buffer() methods but not ioctl() methods,
    so a ->flush_buffer() command would be unexpected.

    Reviewed-by: Alan Cox
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

24 May, 2014

1 commit

  • This function is largely a duplicate of paste_selection() in
    drivers/tty/vt/selection.c, but with its own selection state. The
    speakup selection mechanism should really be merged with vt.

    For now, apply the changes from 'TTY: vt, fix paste_selection ldisc
    handling', 'tty: Make ldisc input flow control concurrency-friendly',
    and 'tty: Fix unsafe vt paste_selection()'.

    References: https://bugs.debian.org/735202
    References: https://bugs.debian.org/744015
    Reported-by: Paul Gevers
    Reported-and-tested-by: Jarek Czekalski
    Signed-off-by: Ben Hutchings
    Cc: # v3.8 but needs backporting for < 3.12
    Signed-off-by: Greg Kroah-Hartman

    Ben Hutchings
     

04 May, 2014

2 commits

  • Commit 6a20dbd6caa2358716136144bf524331d70b1e03,
    "tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc",
    correctly identifies an unsafe race condition between
    __tty_buffer_request_room() and flush_to_ldisc(), where the consumer
    flush_to_ldisc() prematurely advances the head before consuming the
    last of the data committed. For example:

    CPU 0                            |  CPU 1
    __tty_buffer_request_room        |  flush_to_ldisc
      ...                            |    ...
                                     |    count = head->commit - head->read
      n = tty_buffer_alloc()         |
      b->commit = b->used            |
      b->next = n                    |
                                     |    if (!count)                 /* T */
                                     |      if (head->next == NULL)   /* F */
                                     |        buf->head = head->next

    In this case, buf->head has been advanced but head->commit may have
    been updated with a new value.

    Instead of reintroducing an unnecessary lock, fix the race locklessly.
    Read the commit-next pair in the reverse order of writing, which guarantees
    the commit value read is the latest value written if the head is
    advancing.

    Reported-by: Manfred Schlaegl
    Cc: # 3.12.x+
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
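    The reverse-order read rule can be sketched in userspace. This is a
    simplified model (the struct is a stand-in for the kernel's struct
    tty_buffer, and C11 release/acquire stands in for the kernel's
    barriers): the producer writes commit before linking next, the
    consumer reads next before commit, so a visible next guarantees the
    commit value read is final.

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stddef.h>

    struct tbuf {
        int used;
        _Atomic int commit;
        struct tbuf *_Atomic next;
    };

    static struct tbuf head, tail;

    static void *producer(void *arg)
    {
        (void)arg;
        head.used = 256;
        /* write order: commit first ... */
        atomic_store_explicit(&head.commit, head.used, memory_order_release);
        /* ... then publish the new buffer */
        atomic_store_explicit(&head.next, &tail, memory_order_release);
        return NULL;
    }

    static void *consumer(void *arg)
    {
        struct tbuf *n;

        /* read order: next first ... */
        while (!(n = atomic_load_explicit(&head.next, memory_order_acquire)))
            ;
        /* ... then commit: since next was visible, commit is final and
         * no committed data is skipped when the head advances */
        *(int *)arg = atomic_load_explicit(&head.commit, memory_order_relaxed);
        return NULL;
    }

    static int run_once(void)
    {
        pthread_t p, c;
        int commit_seen = -1;

        pthread_create(&c, NULL, consumer, &commit_seen);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return commit_seen;
    }

    int main(void)
    {
        assert(run_once() == 256);
        return 0;
    }
    ```

    Compile with -pthread. Reading the pair in the same order as the
    writes (commit, then next) is what permitted the stale-commit race
    shown in the diagram above.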
  • This reverts commit 6a20dbd6caa2358716136144bf524331d70b1e03.

    Although the commit correctly identifies an unsafe race condition
    between __tty_buffer_request_room() and flush_to_ldisc(), the commit
    fixes the race with an unnecessary spinlock in a lockless algorithm.

    The follow-on commit, "tty: Fix lockless tty buffer race" fixes
    the race locklessly.

    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

25 Apr, 2014

1 commit

  • The race was introduced during development of linux-3.11 by
    e8437d7ecbc50198705331449367d401ebb3181f and
    e9975fdec0138f1b2a85b9624e41660abd9865d4.
    Originally it was found and reproduced on linux-3.12.15 and
    linux-3.12.15-rt25, by sending 500 byte blocks at 115kbaud to the
    target uart in a loop with 100 milliseconds delay.

    In short:
    1. The consumer (flush_to_ldisc) is about to remove the head tty_buffer.
    2. The producer adds a number of bytes, so that a new tty_buffer must
       be allocated and added by __tty_buffer_request_room.
    3. The consumer removes the head tty_buffer element, without handling
       newly committed data.

    Detailed example:
    * Initial buffer:
      * Head, Tail -> 0: used=250; commit=250; read=240; next=NULL
    * Consumer: ''flush_to_ldisc''
      * consumed 10 Byte
      * buffer:
        * Head, Tail -> 0: used=250; commit=250; read=250; next=NULL
    {{{
    count = head->commit - head->read;  // count = 0
    if (!count) {                       // enter
        // INTERRUPTED BY PRODUCER ->
        if (head->next == NULL)
            break;
        buf->head = head->next;
        tty_buffer_free(port, head);
        continue;
    }
    }}}
    * Producer: tty_insert_flip_... 10 bytes + tty_flip_buffer_push
      * buffer:
        * Head, Tail -> 0: used=250; commit=250; read=250; next=NULL
      * added 6 bytes: head-element filled to maximum.
        * buffer:
          * Head, Tail -> 0: used=256; commit=250; read=250; next=NULL
      * added 4 bytes: __tty_buffer_request_room is called
        * buffer:
          * Head -> 0: used=256; commit=256; read=250; next=1
          * Tail -> 1: used=4; commit=0; read=250 next=NULL
      * push (tty_flip_buffer_push)
        * buffer:
          * Head -> 0: used=256; commit=256; read=250; next=1
          * Tail -> 1: used=4; commit=4; read=250 next=NULL
    * Consumer
    {{{
    count = head->commit - head->read;
    if (!count) {
        // INTERRUPTED BY PRODUCER <-
        if (head->next == NULL)         // -> no break
            break;
        buf->head = head->next;
        tty_buffer_free(port, head);
        // ERROR: tty_buffer head freed -> 6 bytes lost
        continue;
    }
    }}}

    This patch reintroduces a spin_lock to protect this case. Perhaps later
    a lock-less solution could be found.

    Signed-off-by: Manfred Schlaegl
    Cc: stable # 3.11
    Signed-off-by: Greg Kroah-Hartman

    Manfred Schlaegl
     

01 Mar, 2014

1 commit

  • The user-settable knob, low_latency, has been the source of
    several BUG reports which stem from flush_to_ldisc() running
    in interrupt context. Since 3.12, which added several sleeping
    locks (termios_rwsem and buf->lock) to the input processing path,
    the frequency of these BUG reports has increased.

    Note that changes in 3.12 did not introduce this regression;
    sleeping locks were first added to the input processing path
    with the removal of the BKL from N_TTY in commit
    a88a69c91256418c5907c2f1f8a0ec0a36f9e6cc,
    'n_tty: Fix loss of echoed characters and remove bkl from n_tty'
    and later in commit 38db89799bdf11625a831c5af33938dcb11908b6,
    'tty: throttling race fix'. Since those changes, executing
    flush_to_ldisc() in interrupt context (ie, low_latency set) is unsafe.

    However, since most devices do not validate whether the low_latency
    setting is appropriate for the context (process or interrupt) in
    which they receive data, some reports are due to misconfiguration.
    Further, serial dma devices for which dma fails resort to
    interrupt receiving as a backup, without resetting low_latency.

    Historically, low_latency was used to force wake-up the reading
    process rather than wait for the next scheduler tick. The
    effect was to trim multiple milliseconds of latency from
    when the process would receive new data.

    Recent tests [1] have shown that the reading process now receives
    data with only tens of microseconds of latency without low_latency set.

    Remove the low_latency rx steering from tty_flip_buffer_push();
    however, leave the knob as an optional hint to drivers that can
    tune their rx fifos and the like. Clean up stale code comments
    regarding low_latency.

    [1] https://lkml.org/lkml/2014/2/20/434

    "Yay.. thats an annoying historical pain in the butt gone."
    -- Alan Cox

    Reported-by: Beat Bolli
    Reported-by: Pavel Roskin
    Acked-by: David Sterba
    Cc: Grant Edwards
    Cc: Stanislaw Gruszka
    Cc: Hal Murray
    Cc: # 3.12.x+
    Signed-off-by: Peter Hurley
    Signed-off-by: Alan Cox
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

08 Jan, 2014

1 commit


10 Dec, 2013

1 commit

  • tty flip buffers use GFP_ATOMIC allocations for received data
    which is to be processed by the line discipline. For each byte
    received, an extra byte is used to indicate the error status of
    that byte.

    Instead, if the received data is error-free, encode the entire
    buffer without status bytes.

    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
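    The space optimization above can be sketched as follows. The struct
    and field names are illustrative, not the kernel's (the real code
    keys this off tty_buffer flags and TTYB_NORMAL): a per-byte status
    array is materialized only once some byte actually carries an error.

    ```c
    #include <assert.h>
    #include <string.h>

    struct flip_buf {
        unsigned char data[64];
        unsigned char flags[64];   /* valid only when has_flags != 0 */
        int has_flags;
        int used;
    };

    static void insert_byte(struct flip_buf *b, unsigned char ch,
                            unsigned char flag)
    {
        if (flag != 0 && !b->has_flags) {
            /* first errored byte: materialize the status array and
             * backfill the earlier bytes as "normal" (0) */
            memset(b->flags, 0, (size_t)b->used);
            b->has_flags = 1;
        }
        if (b->has_flags)
            b->flags[b->used] = flag;
        b->data[b->used++] = ch;
    }

    int main(void)
    {
        struct flip_buf b = { .used = 0, .has_flags = 0 };

        insert_byte(&b, 'a', 0);
        insert_byte(&b, 'b', 0);
        assert(!b.has_flags);           /* error-free: no status bytes */

        insert_byte(&b, 'c', 1);        /* e.g. a framing error */
        assert(b.has_flags);
        assert(b.flags[0] == 0 && b.flags[2] == 1);
        return 0;
    }
    ```

    An error-free buffer thus costs one byte per received byte instead
    of two, halving the GFP_ATOMIC allocation in the common case.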

09 Dec, 2013

5 commits


24 Jul, 2013

15 commits