08 Dec, 2006

40 commits

  • It is possible to have tasklets get scheduled before softirqd has had a chance
    to spawn on all CPUs. This is totally harmless; after success during action
    CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes
    softirqd on the appropriate CPU to process the already pending tasklets. So
    there is no danger of having a missed wakeup for any tasklets that were
    already pending.

    In particular, i386 is affected by this during startup, and it is visible when
    using a very large initrd; during the time it takes for the initrd to be
    decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also
    possible that resending of a hardware IRQ via a softirq triggers the same bug.

    Because of different timing conditions, this shows up in all emulators and
    virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is
    also possible to trigger on native hardware with a large enough initrd,
    although I don't have a reliable case demonstrating that.

    Signed-off-by: Zachary Amsden
    Cc:
    Cc: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zachary Amsden
     
  • When the kernel is compiled with an old version of autofs (CONFIG_AUTOFS_FS)
    and a new automount daemon (observed at least with 5.x.x) is started, the
    kernel correctly reports the version mismatch between kernel and userland
    daemon, but then fails to handle the error correctly:

    autofs: kernel does not match daemon version
    =====================================
    [ BUG: bad unlock balance detected! ]
    -------------------------------------
    automount/4199 is trying to release lock (&type->s_umount_key) at:
    [] get_sb_nodev+0x76/0xa4
    but there are no more locks to release!

    other info that might help us debug this:
    no locks held by automount/4199.

    stack backtrace:
    [] dump_trace+0x68/0x1b2
    [] show_trace_log_lvl+0x18/0x2c
    [] show_trace+0xf/0x11
    [] dump_stack+0x12/0x14
    [] print_unlock_inbalance_bug+0xe7/0xf3
    [] lock_release+0x8d/0x164
    [] up_write+0x14/0x27
    [] get_sb_nodev+0x76/0xa4
    [] vfs_kern_mount+0x83/0xf6
    [] do_kern_mount+0x2d/0x3e
    [] do_mount+0x607/0x67a
    [] sys_mount+0x72/0xa4
    [] sysenter_past_esp+0x5f/0x99
    DWARF2 unwinder stuck at sysenter_past_esp+0x5f/0x99
    Leftover inexact backtrace:
    =======================

    and then a deadlock follows.

    The problem: autofs_fill_super() returns EINVAL to get_sb_nodev(), but
    before that, it calls kill_anon_super() to destroy the superblock which
    won't be needed. This is however way too soon to call kill_anon_super(),
    because get_sb_nodev() has to perform its own cleanup of the superblock
    first (deactivate_super(), etc.). The correct time to call
    kill_anon_super() is in the autofs_kill_sb() callback, which is called by
    deactivate_super() at proper time, when the superblock is ready to be
    killed.
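
    The ordering described above can be sketched in plain userspace C. This is
    only an illustration: every *_sketch name below is hypothetical, and the
    real VFS functions do considerably more than count calls.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical superblock stand-in that counts teardown calls. */
struct sb_sketch { int kill_calls; };

static void kill_anon_super_sketch(struct sb_sketch *sb)
{
	sb->kill_calls++;
}

/* fill_super analogue: report the version mismatch and nothing more.
 * Tearing the superblock down here is the bug described above. */
static int fill_super_sketch(struct sb_sketch *sb, int version_ok)
{
	(void)sb;
	return version_ok ? 0 : -EINVAL;
}

/* kill_sb callback analogue: the one place teardown happens. */
static void kill_sb_sketch(struct sb_sketch *sb)
{
	kill_anon_super_sketch(sb);
}

/* get_sb_nodev() analogue: on failure the deactivate step runs the
 * kill_sb callback, so teardown happens exactly once. */
static int get_sb_sketch(struct sb_sketch *sb, int version_ok)
{
	int err = fill_super_sketch(sb, version_ok);

	if (err) {
		kill_sb_sketch(sb);   /* stands in for deactivate_super() */
		return err;
	}
	return 0;
}
```

    Calling get_sb_sketch() with version_ok == 0 returns -EINVAL and leaves
    kill_calls at exactly 1, which is the balance the lock checker expects.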

    I can see the same faulty codepath in autofs4 as well. This patch fixes
    both filesystems in the same way: it postpones kill_anon_super() until
    deactivate_super() signals the proper time by calling the kill_sb()
    callback.

    [raven@themaw.net: update comment]
    Signed-off-by: Jiri Kosina
    Acked-by: Ian Kent
    Cc:
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     
  • The i2c and hwmon trees have moved to a new location.

    The lm-sensors project moved to a new home as well.

    Signed-off-by: Jean Delvare
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jean Delvare
     
  • Add suspend/resume methods to drivers/serial/8250_pnp.c. Tested on a
    P4/HT 16550A box, ttyS0 login survives across suspend to ram.

    [akpm@osdl.org: cleanups]
    Signed-off-by: Mike Galbraith
    Cc: "Rafael J. Wysocki"
    Cc: Pavel Machek
    Cc: Russell King
    Cc: Adam Belay
    Cc: Bjorn Helgaas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Galbraith
     
  • Some people want to use ide_cd for CD-ROM but still dynamically load
    ide-scsi for things like tape drives. If you compile in the CD driver this
    works out, but if you want them modular you need an option to ensure that
    the right thing happens whichever module loads first.

    This replaces the original draft patch, which leaked a SCSI host reference.

    [akpm@osdl.org: add MODULE_PARM_DESC]
    Signed-off-by: Alan Cox
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • Make the locking self-test failures (of 'FAILURE' type) easier to debug by
    printing more information.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Some have reported a chain-table overflow - double its size.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • CONFIG_W1_SLAVE_DS2433_CRC can be used directly, there's no reason for the
    indirection of defining a different variable in the Makefile.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Evgeniy Polyakov
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Polyakov
     
  • Port fix to the off-by-one in find_next_usable_block's memscan from ext2 to
    ext4; but it didn't cause a serious problem for ext4 because the additional
    ext4_test_allocatable check rescued it from the error.

    [akpm@osdl.org: build fix]
    Signed-off-by: Mingming Cao
    Signed-off-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • ext4_new_blocks has a nice io_error label for setting -EIO, so goto that in
    the one place that doesn't already use it.
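
    A minimal sketch of the single-error-label pattern, assuming a
    hypothetical function with two failure paths (new_block_sketch and its
    parameters are made up for illustration and are not the ext4 code):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical allocator skeleton: every I/O failure funnels through
 * one label, so the error code is set in exactly one place. */
static int new_block_sketch(int bitmap_read_ok, int desc_read_ok, int *errp)
{
	if (!bitmap_read_ok)
		goto io_error;
	if (!desc_read_ok)
		goto io_error;   /* a second, illustrative failure path */

	*errp = 0;
	return 1;                /* pretend block number */

io_error:
	*errp = -EIO;
	return 0;
}
```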

    Signed-off-by: Mingming Cao
    Signed-off-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The reservations tree is an rb_tree not a list, so it's less confusing to use
    rb_entry() than list_entry() - though they're both just container_of().
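
    To illustrate why the two are interchangeable, here is a userspace sketch
    with a simplified container_of(). The *_sketch types are hypothetical
    stand-ins, not the kernel's rb_node or reservation structures.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): recover the
 * enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Both accessors expand to the same thing; only the name differs. */
#define rb_entry(ptr, type, member)   container_of(ptr, type, member)
#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Hypothetical, simplified node type for illustration only. */
struct rb_node_sketch { struct rb_node_sketch *left, *right; };

struct rsv_node_sketch {
	unsigned long start, end;
	struct rb_node_sketch node;   /* embedded tree linkage */
};

/* Map an embedded tree node back to its reservation. */
static struct rsv_node_sketch *node_to_rsv(struct rb_node_sketch *n)
{
	return rb_entry(n, struct rsv_node_sketch, node);
}
```

    Using rb_entry() at the call site documents that the member is tree
    linkage, even though the generated code is identical.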

    Signed-off-by: Mingming Cao
    Signed-off-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • rsv_end is the last block within the reservation, so alloc_new_reservation
    should accept start_block == rsv_end as success.
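
    The boundary condition can be sketched as follows, assuming a
    hypothetical window type with inclusive bounds (this is not the ext4
    structure or the patched function):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical window type; both bounds are inclusive, as the
 * message states for rsv_end. */
struct rsv_range_sketch {
	unsigned long rsv_start;
	unsigned long rsv_end;   /* last block inside the reservation */
};

/* True when start_block still lies inside the window. Using "<"
 * for the upper bound would wrongly reject the final block. */
static bool start_block_in_window(const struct rsv_range_sketch *rsv,
				  unsigned long start_block)
{
	return start_block >= rsv->rsv_start && start_block <= rsv->rsv_end;
}
```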

    Signed-off-by: Mingming Cao
    Signed-off-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • grp_goal 0 is a genuine goal (unlike -1), so ext4_try_to_allocate_with_rsv
    should treat it as such.
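
    A small sketch of the distinction, assuming -1 is the only "no goal"
    sentinel; the helper name and the long type are illustrative choices,
    not taken from the ext4 source:

```c
#include <assert.h>
#include <stdbool.h>

/* -1 is the "no goal" sentinel described above; 0 is a valid
 * block offset within the group. */
#define NO_GOAL_SKETCH (-1L)

/* Testing "goal > 0" would silently discard a request for block 0. */
static bool has_goal(long grp_goal)
{
	return grp_goal >= 0;
}
```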

    Signed-off-by: Mingming Cao
    Signed-off-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • ext4_new_blocks should reset the reservation window size to 0 when squeezing
    the last blocks out of an almost full filesystem, so the retry doesn't skip
    any groups with less than half that free, reporting ENOSPC too soon.

    Signed-off-by: Mingming Cao
    Signed-off-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • In the current jbd code, if a buffer on the BJ_SyncData list is dirty and not
    locked, the buffer is refiled to the BJ_Locked list, submitted for IO, and
    waited on for IO completion.

    But the fsstress test exposed a case where a buffer had already been
    submitted for IO just before the buffer_dirty(bh) check, and IO completion
    was not waited for on that buffer.

    The following patch solves this problem. If a buffer is assumed to have been
    submitted for IO before the buffer_dirty(bh) check and to be still being
    written to disk, the buffer is refiled to the BJ_Locked list.

    Signed-off-by: Hisashi Hifumi
    Cc: Jan Kara
    Cc: "Stephen C. Tweedie"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hisashi Hifumi
     
  • Always build hweight8/16/32/64() functions into the kernel so that loadable
    modules may use them.
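
    For readers unfamiliar with hweight: it returns the number of set bits in
    a word. The sketch below is one common way to compute it in portable C;
    it is not claimed to match the kernel's generic implementation, and the
    *_sketch names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hamming weight (number of set bits), computed by summing bit
 * counts in progressively wider fields. */
static unsigned int hweight32_sketch(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);                 /* 2-bit sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (w * 0x01010101u) >> 24;                   /* add the bytes */
}

static unsigned int hweight16_sketch(uint16_t w)
{
	return hweight32_sketch(w);
}

static unsigned int hweight8_sketch(uint8_t w)
{
	return hweight32_sketch(w);
}

static unsigned int hweight64_sketch(uint64_t w)
{
	return hweight32_sketch((uint32_t)(w >> 32)) +
	       hweight32_sketch((uint32_t)w);
}
```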

    I didn't remove GENERIC_HWEIGHT since ALPHA_EV67, ia64, and some variants
    of UltraSparc(64) provide their own hweight functions.

    Fixes config/build problems with NTFS=m and JOYSTICK_ANALOG=m.

    Kernel: arch/x86_64/boot/bzImage is ready (#19)
    Building modules, stage 2.
    MODPOST 94 modules
    WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
    WARNING: "hweight16" [drivers/input/joystick/analog.ko] undefined!
    WARNING: "hweight8" [drivers/input/joystick/analog.ko] undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Remove the carta_random32.h header file. The carta_random32() function was
    put in and then removed in favor of random32(). In the removal process, the
    header file was forgotten.

    Signed-off-by: Stephane Eranian
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephane Eranian
     
  • Add kernel .config file to REPORTING-BUGS.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • According to the datasheet rs5c372 supports three different methods for
    reading register values. Change from method #1 to method #3, since method #3
    is the only one that works on Thecus N2100 board with this RTC.

    Signed-off-by: Riku Voipio
    Cc: Alessandro Zummo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Riku Voipio
     
  • We add a save link for O_DIRECT writes to protect i_size against crashes
    before we actually finish the I/O. If we hit -ENOSPC in
    aops->prepare_write(), we do a truncate() to release the blocks which
    might have been initialized. The truncate then adds another save link
    for the same inode, causing a reiserfs panic for having multiple save links
    for the same inode.

    Signed-off-by: Vladimir V. Saveliev
    Signed-off-by: Amit Arora
    Signed-off-by: Suzuki K P
    Cc: Jeff Mahoney
    Cc: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir V. Saveliev
     
  • At present show_state prints a header that does not match the output of
    show_task, as follows:

    -
    sibling
    task PC pid father child younger older
    init S 00000000 0 1 0 2 (NOTLB)
    -

    This patch corrects the output of show_state so that the header is
    aligned with the data, ala:

    -
    free sibling
    task PC stack pid father child younger older
    init S 00000000 0 1 0 2 (NOTLB)
    -

    Signed-off-by: Chris Caputo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Caputo
     
  • In the functions do_proc_dointvec() and do_proc_doulongvec_minmax(),
    there seems to be a bug in the string length calculation if the string
    contains a negative integer.

    The console log given below demonstrates the bug. Setting negative values
    may not be the right thing to do for the "console log level", but the test
    (given below) can still be used to demonstrate the bug in the code.

    # echo "-1 -1 -1 -123456" > /proc/sys/kernel/printk
    # cat /proc/sys/kernel/printk
    -1 -1 -1 -1234
    #
    # echo "-1 -1 -1 123456" > /proc/sys/kernel/printk
    # cat /proc/sys/kernel/printk
    -1 -1 -1 1234
    #

    (akpm: the bug is that 123456 gets truncated)

    It works as expected if the string contains only positive integers:

    # echo "1 2 3 4" > /proc/sys/kernel/printk
    # cat /proc/sys/kernel/printk
    1 2 3 4
    #

    The patch given below fixes the issue.

    Signed-off-by: Praveen BP
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    BP, Praveen
     
  • synclink_gt initialization forgot to unregister the PCI driver on the error
    path.

    Signed-off-by: Akinobu Mita
    Cc: Paul Fulghum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Check the return value of platform_device_register_simple().

    Cc: David Brownell
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Check register_filesystem() and kern_mount() return values.

    Cc: Ingo Molnar
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Replace kmalloc+memset with kzalloc
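
    The same transformation, sketched in userspace with malloc()/calloc()
    standing in for kmalloc()/kzalloc(). The structure and function names are
    hypothetical and only illustrate the before and after shapes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct foo_sketch { int id; char name[16]; };

/* Before: allocate, then clear in a separate step. */
static struct foo_sketch *alloc_two_step(void)
{
	struct foo_sketch *p = malloc(sizeof(*p));

	if (p)
		memset(p, 0, sizeof(*p));
	return p;
}

/* After: one call returns zeroed memory (calloc stands in for kzalloc). */
static struct foo_sketch *alloc_zeroed(void)
{
	return calloc(1, sizeof(struct foo_sketch));
}
```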

    Signed-off-by: Yan Burman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yan Burman
     
  • This is a set of fixes, mostly to make the driver actually work:

    1. Actually select the line for setting parameters and receiver
    disable/enable.
    2. Select the line for receive and transmit interrupt handling correctly.
    3. Report the transmitter empty state correctly.
    4. Set the I/O type of ports correctly.
    5. Perform polled transmission correctly.
    6. Don't fix the console line at ttyS3.
    7. Magic SysRq support.
    8. Various small bits here and there.

    Tested with a DECstation 2100 (thanks Flo for making this possible).

    [akpm@osdl.org: fix typo]
    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maciej W. Rozycki
     
  • Provide a common interface for all the subsystems to lock and unlock their
    per-subsystem hotcpu mutexes.

    When CONFIG_HOTPLUG_CPU is not set, these operations would be no-ops.
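
    A minimal sketch of the "inlines that compile away" idea. The function
    and type names are hypothetical, and a counter stands in for a real mutex
    so the example stays self-contained.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-subsystem mutex: a counter that
 * records how many times it is held. */
struct hotcpu_mutex_sketch { int held; };

#ifdef CONFIG_HOTPLUG_CPU
static inline void lock_hotcpu_sketch(struct hotcpu_mutex_sketch *m)
{
	m->held++;
}

static inline void unlock_hotcpu_sketch(struct hotcpu_mutex_sketch *m)
{
	m->held--;
}
#else
/* Without CPU hotplug there is nothing to serialise against, so
 * both operations do nothing. */
static inline void lock_hotcpu_sketch(struct hotcpu_mutex_sketch *m)
{
	(void)m;
}

static inline void unlock_hotcpu_sketch(struct hotcpu_mutex_sketch *m)
{
	(void)m;
}
#endif
```

    Inline functions keep type checking of the argument in both
    configurations, which plain empty macros would not.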

    [akpm@osdl.org: macros -> inlines]
    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gautham R Shenoy
     
  • Pass struct dev pointer to dma_cache_sync()

    dma_cache_sync() is ill-designed in that it does not have a struct device
    pointer argument which makes proper support for systems that consist of a
    mix of coherent and non-coherent DMA devices hard. Change dma_cache_sync
    to take a struct device pointer as first argument and fix all its callers
    to pass it.
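
    Why the device pointer helps can be sketched like this. The device type,
    its dma_coherent flag and the simplified argument list are assumptions
    made for illustration; the real function takes further arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device stand-in: one flag says whether this device's
 * DMA is cache coherent, one counter records performed syncs. */
struct device_sketch {
	bool dma_coherent;
	unsigned long syncs;
};

/* With the device passed in, the sync can be skipped per device
 * rather than decided once for the whole system. */
static void dma_cache_sync_sketch(struct device_sketch *dev,
				  void *vaddr, size_t size)
{
	(void)vaddr;
	(void)size;
	if (dev->dma_coherent)
		return;          /* hardware keeps caches coherent */
	dev->syncs++;            /* placeholder for a real cache flush */
}
```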

    Signed-off-by: Ralf Baechle
    Cc: James Bottomley
    Cc: "David S. Miller"
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ralf Baechle
     
  • dma_is_consistent() is ill-designed in that it does not have a struct
    device pointer argument which makes proper support for systems that consist
    of a mix of coherent and non-coherent DMA devices hard. Change
    dma_is_consistent to take a struct device pointer as first argument and fix
    the sole caller to pass it.

    Signed-off-by: Ralf Baechle
    Cc: James Bottomley
    Cc: "David S. Miller"
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ralf Baechle
     
  • On 32-bit SMP platforms, the 64-bit i_size is protected by a seqcount
    (i_size_seqcount).

    When i_size is read or written, i_size_seqcount is read/written as well, so
    it makes sense to group these two fields together in the same cache line.

    This patch moves i_size_seqcount next to i_size, and also moves i_version
    so that offsetof(struct inode, i_size) is 0x40 instead of 0x3c (on
    32-bit platforms).

    On 64-bit platforms, i_size_seqcount doesn't exist, and moving the
    'long i_version' field should not introduce a new hole because of padding.
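
    The layout arithmetic can be checked with a small sketch. It assumes a
    64-byte cache line and uses a hypothetical structure whose 0x40 bytes of
    padding stand in for the fields that precede i_size; it is not struct
    inode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SKETCH 64u   /* assumed line size */

/* Hypothetical layout: 0x40 bytes of padding stand in for the
 * fields that precede i_size on a 32-bit build. */
struct inode_sketch {
	unsigned char before[0x40];
	int64_t i_size;
	uint32_t i_size_seqcount;
};

/* True when the byte range [off, off + len) lies inside one line. */
static int within_one_line(size_t off, size_t len)
{
	return off / CACHE_LINE_SKETCH == (off + len - 1) / CACHE_LINE_SKETCH;
}

/* True when i_size and its seqcount fall in the same cache line. */
static int size_and_seq_share_line(void)
{
	size_t start = offsetof(struct inode_sketch, i_size);
	size_t end = offsetof(struct inode_sketch, i_size_seqcount) +
		     sizeof(uint32_t);

	return within_one_line(start, end - start);
}
```

    Under the 64-byte assumption, an 8-byte field at offset 0x3c would
    straddle two lines, while the same field at 0x40 starts a fresh line.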

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • Be more careful about function pointer args:
    look for "(...*" instead of just "(".

    This line in include/linux/input.h fools the current kernel-doc script
    into deciding that this is a function pointer:

    unsigned long ffbit[NBITS(FF_MAX)];

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Burman Yan
     
  • We currently insert socket dentries into the global dentry hashtable. This
    is suboptimal because there is currently no way these entries can be used
    for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism). Inserting
    them in dentry hashtable slows dcache lookups.

    To let __dpath() still work correctly (i.e. not adding a " (deleted)" after
    the dentry name), we do:

    - Right after d_alloc(), pretend they are hashed by clearing the
    DCACHE_UNHASHED bit.

    - Call d_instantiate() instead of d_add() : dentry is not inserted in
    hash table.

    __dpath() & friends work as intended during dentry lifetime.

    - At dismantle time, once dput() must clear the dentry, set the
    DCACHE_UNHASHED bit again inside the custom d_delete() function provided by
    socket code, so that dput() can just kill_it.
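
    The three steps above can be sketched with a flag word. Everything here
    is a simplified stand-in: the structure, the flag value and the helper
    names are hypothetical, and it is an assumption of the sketch that a
    freshly allocated dentry starts with the unhashed bit set.

```c
#include <assert.h>

#define DCACHE_UNHASHED_SKETCH 0x0010u   /* flag value is illustrative */

struct dentry_sketch {
	unsigned int d_flags;
	int in_hash_table;   /* 1 only if really inserted in the hash */
};

/* d_alloc() analogue: assumed to start out flagged as unhashed. */
static void dentry_init(struct dentry_sketch *d)
{
	d->d_flags = DCACHE_UNHASHED_SKETCH;
	d->in_hash_table = 0;
}

/* Step 1: pretend to be hashed without touching the hash table. */
static void pretend_hashed(struct dentry_sketch *d)
{
	d->d_flags &= ~DCACHE_UNHASHED_SKETCH;
}

/* What a path printer would test before appending " (deleted)". */
static int looks_deleted(const struct dentry_sketch *d)
{
	return (d->d_flags & DCACHE_UNHASHED_SKETCH) != 0;
}

/* Step 3: custom d_delete() analogue; restores the flag so the
 * final put can free the dentry straight away. */
static int delete_sketch(struct dentry_sketch *d)
{
	d->d_flags |= DCACHE_UNHASHED_SKETCH;
	return 1;
}
```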

    Signed-off-by: Eric Dumazet
    Cc: Al Viro
    Acked-by: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • Some dentries don't need to be globally visible in dentry hashtable.
    (pipes & sockets)

    Such dentries don't need to wait for an RCU grace period at delete time.
    Being able to free them immediately permits better CPU cache use (hot cache).

    This patch combined with (dont insert pipe dentries into dentry_hashtable)
    reduced time of { pipe(p); close(p[0]); close(p[1]);} on my UP machine (1.6
    GHz Pentium-M) from 3.23 us to 2.86 us (But this patch does not depend on
    other patches, only bench results)

    Signed-off-by: Eric Dumazet
    Cc: Al Viro
    Cc: Maneesh Soni
    Cc: "Paul E. McKenney"
    Cc: Dipankar Sarma
    Acked-by: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • We currently insert pipe dentries into the global dentry hashtable. This
    is suboptimal because there is currently no way these entries can be used
    for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism). Inserting
    them in dentry hashtable slows dcache lookups.

    To let __dpath() still work correctly (i.e. not adding a " (deleted)" after
    the dentry name), we do:

    - Right after d_alloc(), pretend they are hashed by clearing the
    DCACHE_UNHASHED bit.

    - Call d_instantiate() instead of d_add() : dentry is not inserted in
    hash table.

    __dpath() & friends work as intended during dentry lifetime.

    - At dismantle time, once dput() must clear the dentry, set the
    DCACHE_UNHASHED bit again inside the custom d_delete() function provided by
    pipe code, so that dput() can just kill_it.

    This patch, combined with (avoid RCU for never hashed dentries) reduced
    time of { pipe(p); close(p[0]); close(p[1]);} on my UP machine (1.6GHz
    Pentium-M) from 3.23 us to 2.86 us (But this patch does not depend on other
    patches, only bench results)

    Signed-off-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • On some workloads (for example when a lot of close() syscalls are done), RCU
    qlen can be quite large, and RCU heads are no longer in the CPU cache when
    rcu_do_batch() is called.

    This patch adds a prefetch() in rcu_do_batch() to give CPU a hint to bring
    back cache lines containing 'struct rcu_head's.

    Most list manipulation macros include prefetch(), but open-coded loops do not
    (at least with current C compilers :) )
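
    The shape of such a loop can be sketched in userspace. The structure and
    function names are hypothetical, and the GCC/Clang builtin
    __builtin_prefetch() stands in for the kernel's prefetch(); this is not
    the patched rcu_do_batch().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued callback. */
struct rcu_head_sketch {
	struct rcu_head_sketch *next;
	void (*func)(struct rcu_head_sketch *head);
};

static int callbacks_run;

/* Example callback that only counts invocations. */
static void count_cb(struct rcu_head_sketch *head)
{
	(void)head;
	callbacks_run++;
}

/* Walk the callback list, hinting the CPU to fetch the next node
 * while the current callback runs. */
static void do_batch_sketch(struct rcu_head_sketch *list)
{
	while (list) {
		struct rcu_head_sketch *next = list->next;

		__builtin_prefetch(next);   /* hint only; a NULL pointer is harmless */
		list->func(list);
		list = next;
	}
}
```

    Reading list->next before invoking the callback also matters for
    correctness, since a callback may free its own node.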

    I got a nice speedup on a trivial benchmark (3.48 us per iteration instead
    of 3.95 us on a 1.6 GHz Pentium-M)

    while (1) { pipe(p); close(p[0]); close(p[1]);}

    Signed-off-by: Eric Dumazet
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • Remove the videodev chapter from the kernel-api book. It's done much better
    in the videobook kernel-doc.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Correct lots of typos, kernel-doc warnings, & kernel-doc usage in fusion and
    i2o drivers.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Add Fusion and I2O message-based device interfaces to kernel-api book.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap