21 Dec, 2011

1 commit

  • -> #2 (&tty->write_wait){-.-...}:

    is a lot more informative than:

    -> #2 (key#19){-.....}:
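
    A sketch of how init_waitqueue_head() can pass the waitqueue's name
    down to lockdep (an assumption about the shape of the patch, modelled
    on the usual lockdep annotation pattern; the helper name and exact
    signature may differ):

    #define init_waitqueue_head(q)                                  \
            do {                                                    \
                    static struct lock_class_key __key;             \
                                                                    \
                    __init_waitqueue_head((q), #q, &__key);         \
            } while (0)

    void __init_waitqueue_head(wait_queue_head_t *q, const char *name,
                               struct lock_class_key *key)
    {
            spin_lock_init(&q->lock);
            /* #q stringifies the call site, e.g. "&tty->write_wait" */
            lockdep_set_class_and_name(&q->lock, key, name);
            INIT_LIST_HEAD(&q->task_list);
    }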

    Signed-off-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/n/tip-8zpopbny51023rdb0qq67eye@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Oct, 2010

1 commit

  • The "flags" member of "struct wait_queue_t" is used in several places in
    the kernel code without beeing initialized by init_wait(). "flags" is
    used in bitwise operations.

    If "flags" not initialized then unexpected behaviour may take place.
    Incorrect flags might used later in code.

    Added initialization of "wait_queue_t.flags" with zero value into
    "init_wait".

    Signed-off-by: Evgeny Kuznetsov
    [ The bit we care about does end up being initialized by both
    prepare_to_wait() and add_to_wait_queue(), so this doesn't seem to
    cause actual bugs, but is definitely the right thing to do -Linus ]
    Signed-off-by: Linus Torvalds

    Evgeny Kuznetsov
     

21 May, 2010

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (229 commits)
    USB: remove unused usb_buffer_alloc and usb_buffer_free macros
    usb: musb: update gfp/slab.h includes
    USB: ftdi_sio: fix legacy SIO-device header
    USB: kl5usb105: reimplement using generic framework
    USB: kl5usb105: minor clean ups
    USB: kl5usb105: fix memory leak
    USB: io_ti: use kfifo to implement write buffering
    USB: io_ti: remove unsused private counter
    USB: ti_usb: use kfifo to implement write buffering
    USB: ir-usb: fix incorrect write-buffer length
    USB: aircable: fix incorrect write-buffer length
    USB: safe_serial: straighten out read processing
    USB: safe_serial: reimplement read using generic framework
    USB: safe_serial: reimplement write using generic framework
    usb-storage: always print quirks
    USB: usb-storage: trivial debug improvements
    USB: oti6858: use port write fifo
    USB: oti6858: use kfifo to implement write buffering
    USB: cypress_m8: use kfifo to implement write buffering
    USB: cypress_m8: remove unused drain define
    ...

    Fix up conflicts (due to usb_buffer_alloc/free renaming) in
    drivers/input/tablet/acecad.c
    drivers/input/tablet/kbtab.c
    drivers/input/tablet/wacom_sys.c
    drivers/media/video/gspca/gspca.c
    sound/usb/usbaudio.c

    Linus Torvalds
     
  • New wait_event_interruptible{,_exclusive}_locked{,_irq} macros added.
    They work just like the versions without the _locked* suffix but
    require the wait queue's lock to be held. Also, __wake_up_locked() is
    now exported so that it can be paired with the above macros.

    The use case for this new facility is when one uses the wait queue's
    lock to protect a data structure. This may be advantageous if the
    structure needs to be protected by a spinlock anyway. In particular,
    with an additional spinlock the following code has to be used to wait
    for a condition:

    spin_lock(&data.lock);
    ...
    for (ret = 0; !ret && !(condition); ) {
            spin_unlock(&data.lock);
            ret = wait_event_interruptible(data.wqh, (condition));
            spin_lock(&data.lock);
    }
    ...
    spin_unlock(&data.lock);

    This looks bizarre; moreover, wait_event_interruptible() takes the
    wait queue's lock anyway, so there is an unlock+lock sequence where it
    could be avoided.

    To avoid those problems and benefit from wait queue's lock, a code
    similar to the following should be used:

    /* Waiting */
    spin_lock(&data.wqh.lock);
    ...
    ret = wait_event_interruptible_locked(data.wqh, (condition));
    ...
    spin_unlock(&data.wqh.lock);

    /* Waiting exclusively */
    spin_lock(&data.wqh.lock);
    ...
    ret = wait_event_interruptible_exclusive_locked(data.wqh, (condition));
    ...
    spin_unlock(&data.wqh.lock);

    /* Waking up */
    spin_lock(&data.wqh.lock);
    ...
    wake_up_locked(&data.wqh);
    ...
    spin_unlock(&data.wqh.lock);

    When spin_lock_irq() is used, the matching versions of the macros
    (*_locked_irq()) need to be used.
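
    For example (a sketch assuming the same hypothetical data.wqh, now
    guarded with interrupts disabled):

    spin_lock_irq(&data.wqh.lock);
    ...
    ret = wait_event_interruptible_locked_irq(data.wqh, (condition));
    ...
    spin_unlock_irq(&data.wqh.lock);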

    Signed-off-by: Michal Nazarewicz
    Cc: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Takashi Iwai
    Cc: David Howells
    Cc: Andreas Herrmann
    Cc: Thomas Gleixner
    Cc: Mike Galbraith
    Signed-off-by: Greg Kroah-Hartman

    Michal Nazarewicz
     

11 May, 2010

1 commit

  • epoll should not touch flags in wait_queue_t. This patch introduces a new
    function, __add_wait_queue_exclusive(), for users who use a wait queue
    as a LIFO queue.

    __add_wait_queue_tail_exclusive() is introduced too, replacing
    add_wait_queue_exclusive_locked(). remove_wait_queue_locked() is
    removed, as it is a duplicate of __remove_wait_queue(), disliked by
    users, and has fewer users.
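
    A sketch of the two helpers (they follow the usual kernel pattern of
    setting WQ_FLAG_EXCLUSIVE before linking the entry; the exact bodies
    are an assumption):

    static inline void
    __add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t *wait)
    {
            wait->flags |= WQ_FLAG_EXCLUSIVE;
            __add_wait_queue(q, wait);              /* head: LIFO wakeup order */
    }

    static inline void
    __add_wait_queue_tail_exclusive(wait_queue_head_t *q, wait_queue_t *wait)
    {
            wait->flags |= WQ_FLAG_EXCLUSIVE;
            __add_wait_queue_tail(q, wait);         /* tail: FIFO wakeup order */
    }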

    Signed-off-by: Changli Gao
    Signed-off-by: Peter Zijlstra
    Cc: Alexander Viro
    Cc: Paul Menage
    Cc: Li Zefan
    Cc: Davide Libenzi
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Changli Gao
     

15 Sep, 2009

1 commit

  • In order to extend the functions to have more than 1 flag (sync),
    rename the argument to flags, and explicitly define a WF_ space for
    individual flags.
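
    A sketch of what the WF_ space can look like (WF_SYNC corresponds to
    the existing sync flag; any further names and the exact values are
    assumptions):

    #define WF_SYNC         0x01    /* waker goes to sleep after wakeup */
    #define WF_FORK         0x02    /* child wakeup after fork */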

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

10 Aug, 2009

1 commit

  • Give waitqueue spinlocks their own lockdep classes when they
    are initialised from init_waitqueue_head(). This means that
    struct wait_queue::func functions can operate on other waitqueues.

    This is used by CacheFiles to catch the page from a backing fs
    being unlocked and to wake up another thread to take a copy of
    it.
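
    A sketch of how init_waitqueue_head() can do this (each expansion of
    the macro gets its own static lock_class_key, hence its own lockdep
    class per initialisation site; the helper name is an assumption):

    #define init_waitqueue_head(q)                          \
            do {                                            \
                    static struct lock_class_key __key;     \
                                                            \
                    __init_waitqueue_head((q), &__key);     \
            } while (0)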

    Signed-off-by: Peter Zijlstra
    Signed-off-by: David Howells
    Tested-by: Takashi Iwai
    Cc: linux-cachefs@redhat.com
    Cc: torvalds@osdl.org
    Cc: akpm@linux-foundation.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 May, 2009

1 commit


28 Apr, 2009

1 commit

  • In 2.6.25 we added UDP memory accounting.

    Unfortunately this added a penalty when a frame is transmitted, since
    at TX completion time we have to call sock_wfree() to perform the
    necessary memory accounting. This calls sock_def_write_space() and
    ultimately the scheduler if any thread is waiting on the socket.
    Threads waiting for an incoming frame were scheduled, then had to
    sleep again as the event was meaningless.

    (All threads waiting on a socket use the same sk_sleep anchor.)

    This adds a lot of extra wakeups and increases latencies, as noted
    by Christoph Lameter, and slows down the softirq handler.

    Reference: http://marc.info/?l=linux-netdev&m=124060437012283&w=2

    Fortunately, Davide Libenzi recently added the concept of keyed
    wakeups to the kernel, and in particular for sockets (see commit
    37e5540b3c9d838eb20f2ca8ea2eb8072271e403
    "epoll keyed wakeups: make sockets use keyed wakeups").

    Davide's goal was to optimize epoll, but this new wakeup
    infrastructure can help non-epoll users as well, if they care to set
    up an appropriate handler.

    This patch introduces a new DEFINE_WAIT_FUNC() helper and uses it
    in wait_for_packet(), so that only a relevant event can wake up a
    thread blocked in this function, as sketched below.
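
    A sketch of the helper and of a wake function that filters on the
    wakeup key (the macro follows the existing DEFINE_WAIT() pattern; the
    receiver_wake_function() body shown here is an assumption based on
    the description):

    #define DEFINE_WAIT_FUNC(name, function)                            \
            wait_queue_t name = {                                       \
                    .private        = current,                          \
                    .func           = function,                         \
                    .task_list      = LIST_HEAD_INIT((name).task_list), \
            }

    static int receiver_wake_function(wait_queue_t *wait, unsigned mode,
                                      int sync, void *key)
    {
            unsigned long bits = (unsigned long)key;

            /* Ignore keyed wakeups that carry no input event. */
            if (bits && !(bits & (POLLIN | POLLERR)))
                    return 0;
            return autoremove_wake_function(wait, mode, sync, key);
    }

    /* in wait_for_packet(): */
    DEFINE_WAIT_FUNC(wait, receiver_wake_function);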

    The trace of function calls from bnx2 TX completion bnx2_poll_work()
    is:

    __kfree_skb()
     skb_release_head_state()
      sock_wfree()
       sock_def_write_space()
        __wake_up_sync_key()
         __wake_up_common()
          receiver_wake_function() : stops here since the thread is
                                     waiting for an INPUT

    Reported-by: Christoph Lameter
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Apr, 2009

1 commit

  • '777c6c5 wait: prevent exclusive waiter starvation' made
    __wake_up_common() global so it could be used from
    abort_exclusive_wait().

    It was needed to do a wake-up with the waitqueue lock held while
    passing down a key to the wake-up function.

    Since '4ede816 epoll keyed wakeups: add __wake_up_locked_key() and
    __wake_up_sync_key()' there is an appropriate wrapper for this case:
    __wake_up_locked_key().

    Use it here and make __wake_up_common() private to the scheduler
    again.
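
    A sketch of the wrapper being used (assuming __wake_up_locked_key()
    is, like __wake_up_locked(), a thin forward to __wake_up_common()
    with the caller holding the waitqueue lock):

    void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode,
                              void *key)
    {
            __wake_up_common(q, mode, 1, 0, key);
    }

    /* in abort_exclusive_wait(), with q->lock held: */
    __wake_up_locked_key(q, mode, key);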

    Signed-off-by: Johannes Weiner
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Johannes Weiner
     

01 Apr, 2009

2 commits

  • Introduce new wakeup macros that allow passing an event mask to the
    wakeup targets. They exactly mimic their non-_poll() counterparts,
    with the added event-mask-passing capability. I added only the ones
    currently requested, avoiding the _nr() and _all() variants for the
    moment.
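
    A sketch of what the _poll() variants can look like (smuggling the
    event mask through the existing void *key argument; the exact macro
    list and bodies are assumptions):

    #define wake_up_poll(x, m)                                      \
            __wake_up(x, TASK_NORMAL, 1, (void *) (m))
    #define wake_up_interruptible_poll(x, m)                        \
            __wake_up(x, TASK_INTERRUPTIBLE, 1, (void *) (m))
    #define wake_up_interruptible_sync_poll(x, m)                   \
            __wake_up_sync_key((x), TASK_INTERRUPTIBLE, 1, (void *) (m))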

    Signed-off-by: Davide Libenzi
    Cc: Alan Cox
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: William Lee Irwin III
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     
  • This patchset introduces wakeup hints for some of the most popular (from
    epoll POV) devices, so that epoll code can avoid spurious wakeups on its
    waiters.

    The problem with epoll is that the callback-based wakeups do not, at
    the moment, carry any information about the events the wakeup is
    related to. So the only choice epoll has (not being able to call
    f_op->poll() from inside the callback) is to add the file* to a
    ready-list and resolve the real events later on, at epoll_wait() (or
    its own f_op->poll()) time. This can cause spurious wakeups, since
    the wake_up() itself might be for an event the caller is not
    interested in.

    The rate of these spurious wakeups can be pretty high when many
    network sockets are being monitored.

    By allowing devices to report the events the wakeups refer to (at least
    the two major classes - POLLIN/POLLOUT), we are able to spare useless
    wakeups by proper handling inside the epoll's poll callback.

    Epoll will in any case have to call f_op->poll() on the file* later
    on, since the change needed to have the full event set sent via
    wakeup is too invasive for the way our f_op->poll() system works (the
    full event set is calculated inside the poll function; there are too
    many of them to even start considering the change, and poll/select
    would need changes too).

    Epoll is changed in a way that both devices which send event hints
    and the ones that don't are correctly handled. The former will gain
    some efficiency, though.

    As a general rule, devices should add an event mask, by using the
    key-aware wakeup macros, when waking up their poll wait queues. I
    tested this (together with the epoll poll fix patch Andrew has in
    -mm) and wakeups for the supported devices are correctly filtered.

    Test program available here:

    http://www.xmailserver.org/epoll_test.c

    This patch:

    Nothing revolutionary here. Just using the available "key" that our
    wakeup core already supports. The __wake_up_locked_key() was a
    no-brainer, since both __wake_up_locked() and __wake_up_locked_key()
    are thin wrappers around __wake_up_common().

    The __wake_up_sync() function had a body, so the choice was between
    borrowing the body for __wake_up_sync_key() and calling it from
    __wake_up_sync(), or making an inline and calling it from both. I
    chose the former since on most archs it all resolves to "mov $0, REG;
    jmp ADDR".
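
    A sketch of the resulting pair under that choice (the old
    __wake_up_sync() becomes the NULL-key call into the keyed version;
    the exact body is an assumption):

    void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode,
                            int nr_exclusive, void *key)
    {
            unsigned long flags;
            int sync = 1;

            if (unlikely(!q))
                    return;

            if (unlikely(!nr_exclusive))
                    sync = 0;

            spin_lock_irqsave(&q->lock, flags);
            __wake_up_common(q, mode, nr_exclusive, sync, key);
            spin_unlock_irqrestore(&q->lock, flags);
    }

    void __wake_up_sync(wait_queue_head_t *q, unsigned int mode,
                        int nr_exclusive)
    {
            __wake_up_sync_key(q, mode, nr_exclusive, NULL);
    }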

    Signed-off-by: Davide Libenzi
    Cc: Alan Cox
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: William Lee Irwin III
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

06 Feb, 2009

1 commit

  • With exclusive waiters, every process woken up through the wait queue must
    ensure that the next waiter down the line is woken when it has finished.

    Interruptible waiters don't do that when aborting due to a signal.
    And if an aborting waiter is concurrently woken up through the
    waitqueue, no one will ever wake up the next waiter.

    This has been observed with __wait_on_bit_lock() used by
    lock_page_killable(): the first contender on the queue was aborting
    when the actual lock holder woke it up concurrently. The aborted
    contender didn't acquire the lock and therefore never did an unlock
    followed by waking up the next waiter.

    Add abort_exclusive_wait() which removes the process' wait descriptor from
    the waitqueue, iff still queued, or wakes up the next waiter otherwise.
    It does so under the waitqueue lock. Racing with a wake up means the
    aborting process is either already woken (removed from the queue) and will
    wake up the next waiter, or it will remove itself from the queue and the
    concurrent wake up will apply to the next waiter after it.

    Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
    __wait_on_bit_lock() when they were interrupted by other means than a wake
    up through the queue.
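
    A sketch of the function as described (take the waitqueue lock,
    dequeue if still queued, otherwise pass the wakeup on; the exact body
    is an assumption):

    void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
                              unsigned int mode, void *key)
    {
            unsigned long flags;

            __set_current_state(TASK_RUNNING);
            spin_lock_irqsave(&q->lock, flags);
            if (!list_empty(&wait->task_list))
                    list_del_init(&wait->task_list);
            else if (waitqueue_active(q))
                    /* we were woken; pass the wakeup on to the next waiter */
                    __wake_up_locked_key(q, mode, key);
            spin_unlock_irqrestore(&q->lock, flags);
    }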

    [akpm@linux-foundation.org: coding-style fixes]
    Reported-by: Chris Mason
    Signed-off-by: Johannes Weiner
    Mentored-by: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Matthew Wilcox
    Cc: Chuck Lever
    Cc: Nick Piggin
    Cc: Ingo Molnar
    Cc: ["after some testing"]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

17 Oct, 2008

1 commit

  • is_sync_wait() is used to distinguish between sync and async waits.
    Basically sync waits are the ones initialized with init_waitqueue_entry()
    and async ones with init_waitqueue_func_entry(). The sync/async
    distinction is used only in prepare_to_wait[_exclusive]() and its only
    function is to skip setting the current task state if the wait is async.
    This has a few problems.

    * No one uses it. None of the func_entry users use the
    prepare_to_wait() functions, so the code path never gets executed.

    * The distinction is bogus. Maybe it made sense back when func_entry
    was used only by aio, but it's now also used by epoll and in the
    future possibly by 9p and poll/select.

    * Taking @state as an argument and silently ignoring it depending on
    how @wait is initialized is just a bad, error-prone API.

    * It prevents func_entry waits from using wait->private for no good
    reason.

    This patch kills is_sync_wait() and the associated code paths from
    prepare_to_wait[_exclusive](). As there was no user of these code paths,
    this patch doesn't cause any behavior difference.
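
    For reference, the two initializers whose distinction is_sync_wait()
    tried to capture (the bodies follow the usual pattern; the flags
    handling is an assumption):

    static inline void
    init_waitqueue_entry(wait_queue_t *q, struct task_struct *p)
    {
            q->flags = 0;
            q->private = p;                 /* "sync": a sleeping task */
            q->func = default_wake_function;
    }

    static inline void
    init_waitqueue_func_entry(wait_queue_t *q, wait_queue_func_t func)
    {
            q->flags = 0;
            q->private = NULL;              /* "async": a callback */
            q->func = func;
    }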

    Signed-off-by: Tejun Heo
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

14 Feb, 2008

1 commit


06 Feb, 2008

1 commit

  • On Sat, 2008-01-05 at 13:35 -0800, Davide Libenzi wrote:

    > I remember I talked with Arjan about this time ago. Basically, since 1)
    > you can drop an epoll fd inside another epoll fd 2) callback-based wakeups
    > are used, you can see a wake_up() from inside another wake_up(), but they
    > will never refer to the same lock instance.
    > Think about:
    >
    > dfd = socket(...);
    > efd1 = epoll_create();
    > efd2 = epoll_create();
    > epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
    > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    >
    > When a packet arrives to the device underneath "dfd", the net code will
    > issue a wake_up() on its poll wake list. Epoll (efd1) has installed a
    > callback wakeup entry on that queue, and the wake_up() performed by the
    > "dfd" net code will end up in ep_poll_callback(). At this point epoll
    > (efd1) notices that it may have some event ready, so it needs to wake up
    > the waiters on its poll wait list (efd2). So it calls ep_poll_safewake()
    > that ends up in another wake_up(), after having checked about the
    > recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to
    > avoid stack blasting. Never hit the same queue, to avoid loops like:
    >
    > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    > epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
    > epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
    > epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);
    >
    > The code "if (tncur->wq == wq || ..." prevents re-entering the same
    > queue/lock.

    Since the epoll code is very careful not to nest same-instance locks,
    allow the recursion.
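
    A sketch of how such an annotation can look (a lockdep-only wrapper
    that takes the waitqueue lock with an explicit nesting subclass; the
    macro name and shape are assumptions):

    #ifdef CONFIG_LOCKDEP
    /*
     * wake_up_nested() for use when wake_up() is called recursively,
     * e.g. epoll waking a waitqueue from within another wakeup.
     */
    #define wake_up_nested(x, s)                                    \
    do {                                                            \
            unsigned long flags;                                    \
                                                                    \
            spin_lock_irqsave_nested(&(x)->lock, flags, (s));       \
            wake_up_locked(x);                                      \
            spin_unlock_irqrestore(&(x)->lock, flags);              \
    } while (0)
    #else
    #define wake_up_nested(x, s)    wake_up(x)
    #endif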

    Signed-off-by: Peter Zijlstra
    Tested-by: Stefan Richter
    Acked-by: Davide Libenzi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

07 Dec, 2007

2 commits


10 Jul, 2007

1 commit


31 Oct, 2006

1 commit

  • kernel: INFO: trying to register non-static key.
    kernel: the code is fine but needs lockdep annotation.
    kernel: turning off the locking correctness validator.
    kernel: [] show_trace_log_lvl+0x58/0x16a
    kernel: [] show_trace+0xd/0x10
    kernel: [] dump_stack+0x19/0x1b
    kernel: [] __lock_acquire+0xf0/0x90d
    kernel: [] lock_acquire+0x4b/0x6b
    kernel: [] _spin_lock_irqsave+0x22/0x32
    kernel: [] prepare_to_wait+0x17/0x4b
    kernel: [] lpfc_do_work+0xdd/0xcc2 [lpfc]
    kernel: [] kthread+0xc3/0xf2
    kernel: [] kernel_thread_helper+0x5/0xb

    Another case of non-static lockdep keys; duplicate the paradigm set by
    DECLARE_COMPLETION_ONSTACK and introduce DECLARE_WAIT_QUEUE_HEAD_ONSTACK.
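
    A sketch of the macro, following the DECLARE_COMPLETION_ONSTACK
    pattern (run init_waitqueue_head() at runtime so the on-stack head
    gets a proper lockdep key; the exact body is an assumption):

    #ifdef CONFIG_LOCKDEP
    # define __WAIT_QUEUE_HEAD_INIT_ONSTACK(name)           \
            ({ init_waitqueue_head(&name); name; })
    # define DECLARE_WAIT_QUEUE_HEAD_ONSTACK(name)          \
            wait_queue_head_t name = __WAIT_QUEUE_HEAD_INIT_ONSTACK(name)
    #else
    # define DECLARE_WAIT_QUEUE_HEAD_ONSTACK(name)          \
            DECLARE_WAIT_QUEUE_HEAD(name)
    #endif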

    Signed-off-by: Peter Zijlstra
    Cc: Greg KH
    Cc: Markus Lidel
    Acked-by: Ingo Molnar
    Cc: Arjan van de Ven
    Cc: James Bottomley
    Cc: Marcel Holtmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

11 Jul, 2006

1 commit

  • allyesconfig vmlinux size delta:

        text    data     bss      dec  filename
    20736884 6073834 3075176 29885894  vmlinux.before
    20721009 6073966 3075176 29870151  vmlinux.after

    ~18 bytes per callsite, 15K of text size (~0.1%) saved.

    (as an added bonus this also removes a lockdep annotation.)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

04 Jul, 2006

2 commits


26 Apr, 2006

1 commit


07 Nov, 2005

1 commit

  • Fix more include file problems that surfaced since I submitted the
    previous fix-missing-includes.patch. This should now make it possible
    to stop including sched.h from module.h, which is done by a follow-up
    patch.

    Signed-off-by: Tim Schmielau
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tim Schmielau
     

24 Jun, 2005

1 commit

  • In the upcoming aio_down patch, it is useful to store a private data
    pointer in the kiocb's wait_queue. Since we provide our own wake up
    function and do not require the task_struct pointer, it makes sense to
    convert the task pointer into a generic private pointer.
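
    A sketch of the resulting structure (the WQ_FLAG_EXCLUSIVE value
    matches its long-standing definition; the comment marks the changed
    member):

    struct __wait_queue {
            unsigned int flags;
    #define WQ_FLAG_EXCLUSIVE       0x01
            void *private;          /* was: struct task_struct *task */
            wait_queue_func_t func;
            struct list_head task_list;
    };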

    Signed-off-by: Benjamin LaHaise
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin LaHaise
     

25 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds