30 Jul, 2013

5 commits

  • sock_aio_dtor() is dead code - and stuff that does need to do cleanup
    can simply do it before calling aio_complete().

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Cc: Theodore Ts'o
    Signed-off-by: Benjamin LaHaise

    Kent Overstreet
     
  • The kiocb refcount is only needed for cancellation - to ensure a kiocb
    isn't freed while a ki_cancel callback is running. But if we restrict
    ki_cancel callbacks to not block (which they currently don't), we can
    simply drop the refcount.

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Cc: Theodore Ts'o
    Signed-off-by: Benjamin LaHaise

    Kent Overstreet
     
  • The old aio retry infrastructure needed to save the various arguments
    to aio operations. But with the retry infrastructure gone, we can trim
    struct kiocb quite a bit.

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Cc: Theodore Ts'o
    Signed-off-by: Benjamin LaHaise

    Kent Overstreet
     
  • This code doesn't serve any purpose anymore, since the aio retry
    infrastructure has been removed.

    This change should be safe because aio_read/write are also used for
    synchronous IO, and called from do_sync_read()/do_sync_write() - and
    there's no looping done in the sync case (the read and write syscalls).

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Signed-off-by: Benjamin LaHaise

    Kent Overstreet
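
    For context, the synchronous wrappers of that era were thin. Below is a
    rough, illustrative reconstruction of a do_sync_read()-style path (field
    and helper names from memory, not the exact kernel source); note the
    single call into ->aio_read(), with no looping:

    static ssize_t do_sync_read(struct file *filp, char __user *buf,
                                size_t len, loff_t *ppos)
    {
            struct iovec iov = { .iov_base = buf, .iov_len = len };
            struct kiocb kiocb;
            ssize_t ret;

            init_sync_kiocb(&kiocb, filp);
            kiocb.ki_pos = *ppos;
            kiocb.ki_nbytes = len;

            /* One shot into ->aio_read(); the sync case never loops. */
            ret = filp->f_op->aio_read(&kiocb, &iov, 1, kiocb.ki_pos);
            if (ret == -EIOCBQUEUED)
                    ret = wait_on_sync_kiocb(&kiocb);
            *ppos = kiocb.ki_pos;
            return ret;
    }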
     
  • Originally, io_cancel() was documented to return the io_event if
    cancellation succeeded - the io_event wouldn't be delivered via the ring
    buffer like it normally would.

    But this isn't what the implementation was actually doing; the only
    driver implementing cancellation, the usb gadget code, never returned an
    io_event in its cancel function. And aio_complete() was recently changed
    to no longer suppress event delivery if the kiocb had been cancelled.

    This gets rid of the unused io_event argument to kiocb_cancel() and
    kiocb->ki_cancel(), and changes io_cancel() to return -EINPROGRESS if
    kiocb->ki_cancel() returned success.

    Also tweak the refcounting in kiocb_cancel() to make more sense.

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Signed-off-by: Benjamin LaHaise

    Kent Overstreet
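
    From userspace the change is mostly invisible; the sketch below (raw
    syscalls, error handling omitted, file path arbitrary) shows where the
    result now comes from. Ordinary file AIO has no cancel callback, so
    io_cancel() typically fails with EINVAL; a driver that does implement
    cancellation now reports EINPROGRESS, and the io_event is still reaped
    through io_getevents() rather than being returned by io_cancel():

    #include <errno.h>
    #include <fcntl.h>
    #include <linux/aio_abi.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
            aio_context_t ctx = 0;
            struct iocb cb, *cbs[1] = { &cb };
            struct io_event ev;
            char buf[4096];
            int fd = open("/etc/hostname", O_RDONLY);

            syscall(SYS_io_setup, 8, &ctx);

            memset(&cb, 0, sizeof(cb));
            cb.aio_lio_opcode = IOCB_CMD_PREAD;
            cb.aio_fildes     = fd;
            cb.aio_buf        = (uint64_t)(uintptr_t)buf;
            cb.aio_nbytes     = sizeof(buf);
            syscall(SYS_io_submit, ctx, 1, cbs);

            /* Regular file AIO is not cancellable, so expect EINVAL here; a
             * driver with a real cancel method would yield EINPROGRESS and
             * the completion would still show up in io_getevents() below. */
            if (syscall(SYS_io_cancel, ctx, &cb, &ev) < 0)
                    printf("io_cancel: %s\n", strerror(errno));

            syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL);
            printf("completion res=%lld\n", (long long)ev.res);

            syscall(SYS_io_destroy, ctx);
            return 0;
    }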
     

08 May, 2013

9 commits

  • Thanks to Zach Brown's work to rip out the retry infrastructure, we don't
    need this anymore - ki_retry was only called right after the kiocb was
    initialized.

    This also refactors and trims some duplicated code, as well as cleaning up
    the refcounting/error handling a bit.

    [akpm@linux-foundation.org: use fmode_t in aio_run_iocb()]
    [akpm@linux-foundation.org: fix file_start_write/file_end_write tests]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • ki_key wasn't actually used for anything previously - it was always 0.
    Drop it to trim struct kiocb a bit.

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • Previously, allocating a kiocb required touching quite a few global
    (well, per kioctx) cachelines... so batching up allocation to amortize
    those was worthwhile. But we've gotten rid of some of those, and in
    another couple of patches kiocb allocation won't require writing to any
    shared cachelines, so that means we can just rip this code out.

    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • Cancelling kiocbs requires adding them to a per kioctx linked list,
    which is one of the few things we need to take the kioctx lock for in
    the fast path. But most kiocbs can't be cancelled - so if we just do
    this lazily, we can avoid quite a bit of locking overhead.

    While we're at it, instead of using a flag bit, switch to using ki_cancel
    itself to indicate that a kiocb has been cancelled/completed. This lets
    us get rid of ki_flags entirely.

    [akpm@linux-foundation.org: remove buggy BUG()]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
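
    The trick of letting the callback pointer double as the
    cancelled/completed state can be shown in plain C11. This is a toy model
    of the idea only, not the kernel code; every name below (struct req,
    REQ_CANCELLED, req_mark_cancelled) is made up for illustration:

    #include <stdatomic.h>
    #include <stdio.h>

    typedef void (*cancel_fn)(void *req);

    static void cancelled_sentinel(void *req) { (void)req; }
    #define REQ_CANCELLED cancelled_sentinel   /* "already cancelled/completed" */

    struct req {
            _Atomic(cancel_fn) cancel;         /* NULL = not cancellable */
    };

    /* Atomically claim the right to cancel: returns the callback to run if
     * we won the race, NULL if the request wasn't cancellable or was already
     * cancelled/completed by someone else. */
    static cancel_fn req_mark_cancelled(struct req *r)
    {
            cancel_fn old = atomic_load(&r->cancel);

            do {
                    if (old == NULL || old == REQ_CANCELLED)
                            return NULL;
            } while (!atomic_compare_exchange_weak(&r->cancel, &old,
                                                   REQ_CANCELLED));
            return old;
    }

    static void demo_cancel(void *req) { printf("cancelling %p\n", req); }

    int main(void)
    {
            struct req r = { demo_cancel };
            cancel_fn fn = req_mark_cancelled(&r);

            if (fn)
                    fn(&r);                    /* the callback runs exactly once */
            if (!req_mark_cancelled(&r))
                    puts("second cancel attempt is a no-op");
            return 0;
    }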
     
  • Freeing a kiocb needed to touch the kioctx for three things:

    * Pulling it off the reqs_active list
    * Decrementing reqs_active
    * Issuing a wakeup, if the kioctx was in the process of being freed.

    This patch moves these to aio_complete(), for a couple reasons:

    * aio_complete() already has to issue the wakeup, so if we drop the
    kioctx refcount before aio_complete does its wakeup we don't have to
    do it twice.
    * aio_complete currently has to take the kioctx lock, so it makes sense
    for it to pull the kiocb off the reqs_active list too.
    * A later patch is going to change reqs_active to include unreaped
    completions - this will mean allocating a kiocb doesn't have to look
    at the ringbuffer. So taking the decrement of reqs_active out of
    kiocb_free() is useful prep work for that patch.

    This doesn't really affect cancellation, since existing (usb) code that
    implements a cancel function still calls aio_complete() - we just have
    to make sure that aio_complete does the necessary teardown for cancelled
    kiocbs.

    It does affect code paths where we free kiocbs that were never
    submitted; they need to decrement reqs_active and pull the kiocb off the
    reqs_active list. This occurs in two places: kiocb_batch_free(), which
    is going away in a later patch, and the error path in io_submit_one.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Cc: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • Nothing used the return value, and it probably wasn't possible to use it
    safely for the locked versions (aio_complete(), aio_put_req()). Just
    kill it.

    Signed-off-by: Kent Overstreet
    Acked-by: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • This removes the retry-based AIO infrastructure now that nothing in tree
    is using it.

    We want to remove retry-based AIO because it is fundamentally unsafe.
    It retries IO submission from a kernel thread that has only assumed the
    mm of the submitting task. All other task_struct references in the IO
    submission path will see the kernel thread, not the submitting task.
    This design flaw means that nothing of any meaningful complexity can use
    retry-based AIO.

    This removes all the code and data associated with the retry machinery.
    The most significant benefit of this is the removal of the locking
    around the unused run list in the submission path.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Kent Overstreet
    Signed-off-by: Zach Brown
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zach Brown
     
  • Signed-off-by: Zach Brown
    Signed-off-by: Kent Overstreet
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zach Brown
     

31 Jul, 2012

1 commit

  • Convert init_sync_kiocb() from a nasty macro into a nice C function. The
    struct assignment trick takes care of zeroing all unmentioned fields.
    Shrinks fs/read_write.o's .text from 9857 bytes to 9714.

    Also demacroize is_sync_kiocb() and aio_ring_avail(). The latter fixes an
    arg-referenced-multiple-times hand grenade.

    Cc: Junxiao Bi
    Cc: Mark Fasheh
    Acked-by: Jeff Moyer
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
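
    A small userspace illustration (not the kernel code) of both points: a
    struct assignment zeroes every field the initializer doesn't mention, and
    a macro that names its argument twice evaluates side effects twice, which
    the inline-function form avoids:

    #include <assert.h>
    #include <stdio.h>

    struct kiocb_like {
            void *filp;
            long  pos;
            long  nbytes;
            void *private;                        /* never mentioned below */
    };

    static inline void init_like(struct kiocb_like *k, void *filp)
    {
            /* Struct assignment: every unmentioned field becomes zero. */
            *k = (struct kiocb_like) { .filp = filp };
    }

    struct ring { int head, tail; };

    /* The hazard a macro version carries: 'r' is evaluated twice. */
    #define RING_AVAIL_MACRO(r)   ((r)->tail - (r)->head)

    static inline int ring_avail(struct ring *r)   /* evaluated exactly once */
    {
            return r->tail - r->head;
    }

    static struct ring the_ring = { 0, 3 };
    static int calls;
    static struct ring *next_ring(void) { calls++; return &the_ring; }

    int main(void)
    {
            struct kiocb_like k = { .private = (void *)1 };

            init_like(&k, stdin);
            assert(k.private == NULL);             /* zeroed by the assignment */

            calls = 0;
            (void)RING_AVAIL_MACRO(next_ring());   /* next_ring() runs twice */
            printf("macro: argument evaluated %d times\n", calls);

            calls = 0;
            (void)ring_avail(next_ring());         /* next_ring() runs once */
            printf("function: argument evaluated %d times\n", calls);
            return 0;
    }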
     

05 Jul, 2012

1 commit

  • Ocfs2 uses kiocb.*private as a flag word of unsigned long size. In
    commit a11f7e6 ("ocfs2: serialize unaligned aio"), this flag is used
    to serialize unaligned aio. As *private is not initialized in
    init_sync_kiocb() of do_sync_write(), the unaligned io flag may be
    unexpectedly set in an aligned dio. This causes
    OCFS2_I(inode)->ip_unaligned_aio to be decremented to -1 in
    ocfs2_dio_end_io(), so the following unaligned dio will hang forever
    at ocfs2_aiodio_wait() in ocfs2_file_aio_write().

    Signed-off-by: Junxiao Bi
    Cc: stable@vger.kernel.org
    Acked-by: Jeff Moyer
    Signed-off-by: Joel Becker

    Junxiao Bi
     

03 Nov, 2011

1 commit

  • In testing aio on a fast storage device, I found that the context lock
    takes up a fair amount of cpu time in the I/O submission path. The reason
    is that we take it for every I/O submitted (see __aio_get_req). Since we
    know how many I/Os are passed to io_submit, we can preallocate the kiocbs
    in batches, reducing the number of times we take and release the lock.

    In my testing, I was able to reduce the amount of time spent in
    _raw_spin_lock_irq by .56% (average of 3 runs). The command I used to
    test this was:

    aio-stress -O -o 2 -o 3 -r 8 -d 128 -b 32 -i 32 -s 16384

    I also tested the patch with various numbers of events passed to
    io_submit, and I ran the xfstests aio group of tests to ensure I didn't
    break anything.

    Signed-off-by: Jeff Moyer
    Cc: Daniel Ehrenberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
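
    The amortization itself is simple; here is a toy userspace sketch of the
    idea (not the kernel's kiocb batching code, and all names are made up):
    take the shared lock once per io_submit() batch instead of once per
    request:

    #include <pthread.h>

    struct ctx {
            pthread_mutex_t lock;
            int free_slots;
    };

    /* One lock round trip per request (the old behaviour). */
    static int get_req_one(struct ctx *c)
    {
            int got = 0;

            pthread_mutex_lock(&c->lock);
            if (c->free_slots > 0) {
                    c->free_slots--;
                    got = 1;
            }
            pthread_mutex_unlock(&c->lock);
            return got;
    }

    /* One lock round trip for a whole batch of requests. */
    static int get_req_batch(struct ctx *c, int want)
    {
            int got;

            pthread_mutex_lock(&c->lock);
            got = want < c->free_slots ? want : c->free_slots;
            c->free_slots -= got;
            pthread_mutex_unlock(&c->lock);
            return got;            /* caller hands these out lock-free */
    }

    int main(void)
    {
            struct ctx c = { PTHREAD_MUTEX_INITIALIZER, 128 };

            int one   = get_req_one(&c);        /* 1 request, 1 lock acquisition   */
            int batch = get_req_batch(&c, 32);  /* 32 requests, 1 lock acquisition */
            return !(one == 1 && batch == 32);
    }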
     

27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>.

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

28 May, 2010

1 commit

  • The aio compat code was not converting the struct iovecs from 32bit to
    64bit pointers, causing either EINVAL to be returned from io_getevents, or
    EFAULT as the result of the I/O. This patch passes a compat flag to
    io_submit to signal that pointer conversion is necessary for a given iocb
    array.

    A variant of this was tested by Michael Tokarev. I have also updated the
    libaio test harness to exercise this code path with good success.
    Further, I grabbed a copy of ltp and ran the
    testcases/kernel/syscall/readv and writev tests there (compiled with -m32
    on my 64bit system). All seems happy, but extra eyes on this would be
    welcome.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix CONFIG_COMPAT=n build]
    Signed-off-by: Jeff Moyer
    Reported-by: Michael Tokarev
    Cc: Zach Brown
    Cc: [2.6.35.1]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
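
    The missing step was essentially a widening copy. Below is a hedged,
    self-contained sketch of that kind of conversion (types and helper names
    are illustrative, not the kernel's compat code):

    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/uio.h>

    /* What a 32-bit process actually lays out in memory: a 32-bit pointer
     * and a 32-bit length per vector. */
    struct compat_iovec32 {
            uint32_t iov_base;
            uint32_t iov_len;
    };

    /* Widen the compat array into native struct iovec before issuing I/O;
     * skipping this is what produced the EINVAL/EFAULT failures above. */
    static struct iovec *widen_iovecs(const struct compat_iovec32 *c32,
                                      unsigned int nr)
    {
            struct iovec *iov = calloc(nr, sizeof(*iov));
            unsigned int i;

            if (!iov)
                    return NULL;
            for (i = 0; i < nr; i++) {
                    iov[i].iov_base = (void *)(uintptr_t)c32[i].iov_base;
                    iov[i].iov_len  = c32[i].iov_len;
            }
            return iov;
    }

    int main(void)
    {
            struct compat_iovec32 c32[2] = { { 0x1000, 512 }, { 0x2000, 512 } };
            struct iovec *iov = widen_iovecs(c32, 2);

            free(iov);
            return 0;
    }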
     

16 Dec, 2009

1 commit

  • Don't know the reason, but it appears the ki_wait field of iocb never gets used.

    Signed-off-by: Shaohua Li
    Cc: Jeff Moyer
    Cc: Benjamin LaHaise
    Cc: Zach Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     

20 Sep, 2009

1 commit


01 Jul, 2009

1 commit

  • Change the eventfd interface to de-couple the eventfd memory context from
    the file pointer instance.

    Without such a change, there is no clean, race-free way to handle the
    POLLHUP event sent when the last instance of the file* goes away. Also,
    the internal eventfd APIs now use the eventfd context instead of the
    file*.

    This patch is required by KVM's IRQfd code, which is still under
    development.

    Signed-off-by: Davide Libenzi
    Cc: Gregory Haskins
    Cc: Rusty Russell
    Cc: Benjamin LaHaise
    Cc: Avi Kivity
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

29 Dec, 2008

1 commit

  • The mm->ioctx_list is currently protected by a reader-writer lock,
    so we always grab that lock on the read side for doing ioctx
    lookups. As the workload is extremely reader biased, turn this into
    an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
    the rwlock and use a spinlock for providing update side exclusion.

    There's usually only 1 entry on this list, so it doesn't make sense
    to look into fancier data structures.

    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jens Axboe
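
    The resulting pattern - lockless readers under rcu_read_lock(), a
    spinlock only for update-side exclusion - looks roughly like the sketch
    below. This is an illustrative kernel-style fragment, not the actual
    lookup_ioctx(); the names are made up:

    #include <linux/list.h>
    #include <linux/rculist.h>
    #include <linux/rcupdate.h>
    #include <linux/spinlock.h>

    struct my_ioctx {
            unsigned long    user_id;
            struct list_head list;
            struct rcu_head  rcu;
    };

    static LIST_HEAD(my_ioctx_list);
    static DEFINE_SPINLOCK(my_ioctx_list_lock);    /* writers only */

    static struct my_ioctx *my_lookup_ioctx(unsigned long ctx_id)
    {
            struct my_ioctx *ctx, *ret = NULL;

            rcu_read_lock();                       /* no lock shared with writers */
            list_for_each_entry_rcu(ctx, &my_ioctx_list, list) {
                    if (ctx->user_id == ctx_id) {
                            ret = ctx;             /* caller needs its own ref scheme */
                            break;
                    }
            }
            rcu_read_unlock();
            return ret;
    }

    static void my_ioctx_add(struct my_ioctx *ctx)
    {
            spin_lock(&my_ioctx_list_lock);
            list_add_rcu(&ctx->list, &my_ioctx_list);
            spin_unlock(&my_ioctx_list_lock);
    }

    static void my_ioctx_del(struct my_ioctx *ctx)
    {
            spin_lock(&my_ioctx_list_lock);
            list_del_rcu(&ctx->list);
            spin_unlock(&my_ioctx_list_lock);
            /* free only after a grace period, e.g. kfree_rcu(ctx, rcu) */
    }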
     

17 Oct, 2008

1 commit

  • This patch adds a CONFIG_AIO option which allows removing support
    for asynchronous I/O operations, which are not necessarily used by
    applications, particularly on embedded devices. As this is a
    size-reduction option, it depends on CONFIG_EMBEDDED. It saves
    ~7 kilobytes of kernel code/data:

       text    data     bss      dec     hex  filename
    1115067  119180  217088  1451335  162547  vmlinux
    1108025  119048  217088  1444161  160941  vmlinux.new
      -7042    -132       0    -7174   -1C06  +/-

    This patch was originally written by Matt Mackall, and is part of the
    Linux Tiny project.

    [randy.dunlap@oracle.com: build fix]
    Signed-off-by: Thomas Petazzoni
    Cc: Benjamin LaHaise
    Cc: Zach Brown
    Signed-off-by: Matt Mackall
    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Petazzoni
     

27 Jul, 2008

1 commit


29 Apr, 2008

1 commit

  • Make the following needlessly global functions static:

    - __put_ioctx()
    - lookup_ioctx()
    - io_submit_one()

    Signed-off-by: Adrian Bunk
    Cc: Zach Brown
    Cc: Benjamin LaHaise
    Cc: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

19 Feb, 2008

1 commit

  • Commit b2e895dbd80c420bfc0937c3729b4afe073b3848 #if 0'ed this code stating:

    [PATCH] revert blockdev direct io back to 2.6.19 version

    Andrew Vasquez is reporting as-iosched oopses and a 65% throughput
    slowdown due to the recent special-casing of direct-io against
    blockdevs. We don't know why either of these things are occurring.

    The patch minimally reverts us back to the 2.6.19 code for a 2.6.20
    release.

    It has since been dead code, and unless someone wants to revive it now
    it's time to remove it.

    This patch also makes bio_release_pages() static again and removes the
    ki_bio_count member from struct kiocb, reverting changes that had been
    done for this dead code.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     

14 Feb, 2008

1 commit


19 Oct, 2007

1 commit

  • Hell knows what happened in commit 63b05203af57e7de4f3bb63b8b81d43bc196d32b
    during 2.6.9 development. The commit introduced an io_wait field which
    remained write-only then and still remains write-only.

    Also garbage collect macros which "use" io_wait.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

20 Jul, 2007

1 commit

  • Fix type issue reported by latest 'sparse': kiocb.ki_flags should be
    "unsigned long" (not "long"), to match bitop type signature.

    Signed-off-by: David Brownell
    Signed-off-by: Benjamin LaHaise
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Brownell
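
    Outside the kernel the same constraint is easy to see: bit helpers
    address the flags word as an array of unsigned long, so the storage must
    have exactly that type on both 32- and 64-bit builds. A toy illustration
    (names made up):

    #include <stdio.h>

    #define BITS_PER_UL (8 * sizeof(unsigned long))

    static inline void set_bit_ul(unsigned int nr, unsigned long *addr)
    {
            addr[nr / BITS_PER_UL] |= 1UL << (nr % BITS_PER_UL);
    }

    static inline int test_bit_ul(unsigned int nr, const unsigned long *addr)
    {
            return (addr[nr / BITS_PER_UL] >> (nr % BITS_PER_UL)) & 1;
    }

    int main(void)
    {
            unsigned long ki_flags = 0;   /* must be unsigned long, not long */

            set_bit_ul(0, &ki_flags);     /* e.g. a "kicked" style flag bit  */
            printf("bit 0 set: %d\n", test_bit_ul(0, &ki_flags));
            return 0;
    }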
     

11 May, 2007

1 commit

  • This is an example of how to add eventfd support to the current KAIO code,
    in order to enable KAIO to post readiness events to a pollable fd (hence
    compatible with POSIX select/poll). The KAIO code simply signals the eventfd
    fd when events are ready, and this triggers a POLLIN in the fd. This patch
    uses a reserved for future use member of the struct iocb to pass an eventfd
    file descriptor, that KAIO will use to post events every time a request
    completes. At that point, an aio_getevents() will return the completed result
    to a struct io_event. I made a quick test program to verify the patch, and it
    runs fine here:

    http://www.xmailserver.org/eventfd-aio-test.c

    The test program uses poll(2), but it'd, of course, work with select and epoll
    too.

    This allows scheduling both block I/O and other pollable device
    requests, and waiting for results using select/poll/epoll. In a typical
    scenario, an application would submit KAIO requests using aio_submit(),
    would also use epoll_ctl() on the whole other class of devices (which,
    with the addition of signals, timers and user events, is now pretty much
    complete), and then would:

    epoll_wait(...);
    for_each_event {
            if (curr_event_is_kaiofd) {
                    aio_getevents();
                    dispatch_aio_events();
            } else {
                    dispatch_epoll_event();
            }
    }

    Signed-off-by: Davide Libenzi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
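
    The interface that eventually landed exposes this through the
    IOCB_FLAG_RESFD flag and the aio_resfd field of struct iocb. Below is a
    minimal, compilable userspace sketch of the pattern on a current kernel
    (raw syscalls, error handling trimmed, file path arbitrary): the
    completion is signalled on an eventfd, which an epoll loop can then mix
    with any other fds:

    #include <fcntl.h>
    #include <linux/aio_abi.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/epoll.h>
    #include <sys/eventfd.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
            aio_context_t ctx = 0;
            struct iocb cb = { 0 }, *cbs[1] = { &cb };
            struct io_event ev;
            struct epoll_event epev = { .events = EPOLLIN };
            char buf[4096];
            uint64_t completions;
            int efd  = eventfd(0, 0);
            int epfd = epoll_create1(0);
            int fd   = open("/etc/hostname", O_RDONLY);

            syscall(SYS_io_setup, 128, &ctx);

            cb.aio_lio_opcode = IOCB_CMD_PREAD;
            cb.aio_fildes     = fd;
            cb.aio_buf        = (uint64_t)(uintptr_t)buf;
            cb.aio_nbytes     = sizeof(buf);
            cb.aio_flags      = IOCB_FLAG_RESFD;  /* post completion to the eventfd */
            cb.aio_resfd      = efd;

            epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &epev);
            syscall(SYS_io_submit, ctx, 1, cbs);

            epoll_wait(epfd, &epev, 1, -1);       /* wakes when the AIO completes */
            read(efd, &completions, sizeof(completions));
            syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL);
            printf("%llu completion(s), res=%lld\n",
                   (unsigned long long)completions, (long long)ev.res);

            syscall(SYS_io_destroy, ctx);
            return 0;
    }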
     

10 May, 2007

1 commit

  • Stick an unlikely() around is_aio(): I assert that most IO is synchronous.

    Cc: Suparna Bhattacharya
    Cc: Ingo Molnar
    Cc: Benjamin LaHaise
    Cc: Zach Brown
    Cc: Ulrich Drepper
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
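
    For reference, this is all the annotation does: it is a GCC-style branch
    prediction hint, so the common synchronous case stays the straight-line
    path. A minimal userspace analogue:

    #include <stdio.h>

    #define unlikely(x)  __builtin_expect(!!(x), 0)

    static int is_async(int flags) { return flags & 1; }

    static long do_io(int flags)
    {
            if (unlikely(is_async(flags)))
                    return -1;       /* rare: queue it and return early */
            return 0;                /* common: synchronous fast path   */
    }

    int main(void)
    {
            printf("%ld\n", do_io(0));
            return 0;
    }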
     

14 Dec, 2006

1 commit

  • Implement block device specific .direct_IO method instead of going through
    generic direct_io_worker for block device.

    direct_io_worker() is fairly complex because it needs to handle O_DIRECT on
    file system, where it needs to perform block allocation, hole detection,
    extending the file on write, and tons of other corner cases. The end result is
    that it takes tons of CPU time to submit an I/O.

    For block device, the block allocation is much simpler and a tight triple
    loop can be written to iterate each iovec and each page within the iovec in
    order to construct/prepare bio structure and then subsequently submit it to
    the block layer. This significantly speeds up O_DIRECT on block devices.

    [akpm@osdl.org: small speedup]
    Signed-off-by: Ken Chen
    Cc: Christoph Hellwig
    Cc: Zach Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen, Kenneth W
     

08 Dec, 2006

1 commit

  • Remove the ki_retried member from struct kiocb. I think the idea was
    bounced around a while back, but Arnaldo pointed out another reason that we
    should dig it up when he pointed out that the last cacheline of struct
    kiocb only contains 4 bytes. By removing the debugging member, we save
    more than 8 bytes on 64 bit machines.

    Signed-off-by: Benjamin LaHaise
    Acked-by: Ken Chen
    Acked-by: Zach Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin LaHaise
     

22 Nov, 2006

1 commit

  • Separate delayable work items from non-delayable work items by splitting them
    into a separate structure (delayed_work), which incorporates a work_struct and
    the timer_list removed from work_struct.

    The work_struct struct is huge, and this limits its usefulness. On a 64-bit
    architecture it's nearly 100 bytes in size. This reduces that by half for the
    non-delayable type of event.

    Signed-Off-By: David Howells

    David Howells
     

01 Oct, 2006

4 commits


09 Jan, 2006

1 commit

  • Reorder members of the kiocb structure to make sync kiocb setup faster. By
    setting the elements sequentially, the write combining buffers on the CPU
    are able to combine the writes into a single burst, which results in fewer
    cache cycles being consumed, freeing them up for other code. This results
    in a 10-20KB/s[*] increase on the bw_unix part of LMbench on my test
    system.

    * The improvement varies based on what other patches are in the system,
    as there are a number of bottlenecks, so this number is not absolutely
    accurate.

    Signed-off-by: Benjamin LaHaise
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin LaHaise