17 Oct, 2007

40 commits

  • Various architectures may call bust_spinlocks() recursively; the function
    itself, however, doesn't appear to be meant to be called in this manner.
    Nevertheless, this doesn't appear to be a problem as long as
    bust_spinlocks(0) doesn't get called twice in a row (otherwise,
    unblank_screen() may enter the scheduler). However, at least on i386 die()
    has been capable of returning (and on other architectures this should
    really be that way, too) when notify_die() returns NOTIFY_STOP.

    Short of getting a reply to a respective query, this patch makes
    bust_spinlocks() increment/decrement oops_in_progress, and wake klogd only
    when the count drops back to zero.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
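
The counting scheme described in this commit message can be sketched in user space as follows. This is only an illustration under stated assumptions: `bust_spinlocks_sketch()`, `wake_klogd_stub()` and `klogd_wakeups` are hypothetical stand-ins, not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state (assumptions). */
static int oops_in_progress;  /* nesting depth of "busted" sections */
static int klogd_wakeups;     /* how many times the wake-up step ran */

/* Stub standing in for whatever wakes klogd in the real kernel. */
static void wake_klogd_stub(void)
{
	klogd_wakeups++;
}

/* Nested calls are tolerated: only the outermost "off" call wakes klogd. */
static void bust_spinlocks_sketch(bool yes)
{
	if (yes) {
		oops_in_progress++;
	} else {
		assert(oops_in_progress > 0);   /* unbalanced "off" call */
		if (--oops_in_progress == 0)
			wake_klogd_stub();
	}
}
```

With two nested "on" calls, the first "off" call leaves the counter at one and wakes nothing; the second brings it back to zero and triggers the single wake-up.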
     
  • To go along with the existing "roundup_pow_of_two" routine, add one for
    rounding down since that operation appears to crop up on a regular basis in
    the source tree.

    [m.kozlowski@tuxland.pl: fix unbalanced parentheses]
    Signed-off-by: Robert P. J. Day
    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
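
For readers unfamiliar with the operation, rounding down to a power of two can be sketched like this. The helper name `rounddown_pow2_sketch()` and the bit-smearing approach are assumptions made for illustration; the commit message does not say how the kernel routine is implemented.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): returns the largest power of
 * two that is less than or equal to n. The result is undefined for 0,
 * so that case is rejected up front.
 */
static uint64_t rounddown_pow2_sketch(uint64_t n)
{
	assert(n != 0);

	/* Copy the highest set bit into every lower bit position. */
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	n |= n >> 32;

	/* n is now 2^(k+1) - 1; subtracting the lower half leaves 2^k. */
	return n - (n >> 1);
}
```

For example, 1000 rounds down to 512, while an exact power of two such as 8 is returned unchanged.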
     
  • Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     
  • Implement sending of quota messages via netlink interface. The advantage
    is that in userspace we can better decide what to do with the message - for
    example display a dialogue in your X session or just write the message to
    the console. As a bonus, we can get rid of problems with console locking
    deep inside filesystem code once we remove the old printing mechanism.

    Signed-off-by: Jan Kara
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Since the "ramdisk" kernel parameter has been officially deprecated
    since at least 2.6.18, might as well finally get rid of it.

    Signed-off-by: Robert P. J. Day
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     
  • initrd/initramfs/ramdisk docs:
    - fix typos/spellos/grammar
    - clarify RAM disk config location
    - correct cpio option

    Acked-by: Bryan O'Sullivan
    Acked-by: Rob Landley
    Cc: Werner Almesberger
    Cc: H. Peter Anvin
    Signed-off-by: Randy Dunlap
    Acked-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Signed-off-by: Mathieu Desnoyers
    Cc: Grant Grundler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • local_t is a variant of atomic_t and has related ops to match.
    Add reference for local_t documentation to atomic_ops.txt.

    Signed-off-by: Grant Grundler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Grant Grundler
     
  • Grant Grundler was asking for more detail about correct usage of local
    atomic operations and suggested adding the resulting summary to
    local_ops.txt.

    "Please add a bit more detail. If DaveM is correct (he normally is), then
    there must be limits on how the local_t can be used in the kernel process
    and interrupt contexts. I'd like those rules spelled out very clearly
    since it's easy to get wrong and tracking down such a bug is quite
    painful."

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Grant Grundler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • Since CONFIG_RAMFS is currently hard-selected to "y", and since
    Documentation/filesystems/ramfs-rootfs-initramfs.txt reads as follows:

    "The amount of code required to implement ramfs is tiny, because all the
    work is done by the existing Linux caching infrastructure. Basically,
    you're mounting the disk cache as a filesystem. Because of this, ramfs is
    not an optional component removable via menuconfig, since there would be
    negligible space savings."

    It seems pointless to leave this as a Kconfig entry.

    Signed-off-by: Robert P. J. Day
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     
  • The Coverity checker spotted that we have already oops'ed if "disk"
    was NULL.

    Since "disk" being NULL seems impossible at this point this patch
    removes the NULL check.

    Signed-off-by: Adrian Bunk
    Acked-by: Mike Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • {,un}register_timer_hook() is the API that should be used.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • kernel/sys_ni.c can't #include <linux/syscalls.h> due to cond_syscall(),
    but let's tell gcc to not warn with -Wmissing-prototypes.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • The Coverity checker spotted that we'd have already oops'ed if "tty"
    was NULL.

    Since "tty" can't be NULL when we reach this line of code this patch
    removes the NULL check.

    Signed-off-by: Adrian Bunk
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • All asm/ipc.h files do only #include <asm-generic/ipc.h>.

    This patch therefore removes all include/asm-*/ipc.h files and moves the
    contents of include/asm-generic/ipc.h to include/linux/ipc.h.

    Signed-off-by: Adrian Bunk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • mm.h doesn't directly use anything from mutex.h and backing-dev.h, so
    remove them and add them back to the files which need them.

    Cross-compile tested on many configs and archs.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This fixes a problem with the way cciss was filling out the "errors" field
    of the request structure upon completion of requests. Previously, it just
    put a 1 or a 0 in there and used the negation of this as the uptodate
    parameter to one of the functions in the block layer, being a block device.
    For the SG_IO ioctl, this was not sufficient, and we noticed that, for
    example, sg_turs from sg3_utils did not correctly detect problems due to
    cciss having set rq->errors incorrectly.

    Signed-off-by: Stephen M. Cameron
    Acked-by: Mike Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steve Cameron
     
  • Allow NBD I/O to be cancelled when a network outage occurs. Previously, I/O
    would just hang, and if enough I/O was hung in nbd, the system (at least
    user-level) would completely hang until a TCP timeout (default, 15 minutes)
    occurred.

    The patch introduces a new ioctl NBD_SET_TIMEOUT that allows a transmit
    timeout value (in seconds) to be specified. Any network send that exceeds the
    timeout will be cancelled and the nbd connection will be shut down. I've
    tested with various timeout values and 6 seconds seems to be a good choice for
    the timeout. If the NBD_SET_TIMEOUT ioctl is not called, you get the old (I/O
    hang) behavior.

    Signed-off-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
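
From user space, setting the timeout might look like the sketch below. It assumes that `NBD_SET_TIMEOUT` is exported through `<linux/nbd.h>` and that the timeout is passed by value as the ioctl argument; the wrapper name `nbd_set_timeout_sketch()` is hypothetical.

```c
#include <sys/ioctl.h>
#include <linux/nbd.h>   /* NBD_SET_TIMEOUT (assumed to be exported here) */

/*
 * Hypothetical wrapper: ask the nbd driver to give up on network sends
 * that take longer than `seconds`. nbd_fd is an open descriptor for an
 * nbd device such as /dev/nbd0.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int nbd_set_timeout_sketch(int nbd_fd, unsigned long seconds)
{
	return ioctl(nbd_fd, NBD_SET_TIMEOUT, seconds);
}
```

A client would call this once on the device descriptor before starting I/O, for instance with the 6 second value suggested in the commit message. Skipping the call keeps the old hanging behavior.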
     
  • This fixes errors with utilities (such as LVM's vgscan) that try to scan all
    devices. Previously this would generate read errors when uninitialized nbd
    devices were scanned:

    # vgscan
    Reading all physical volumes. This may take a while...
    /dev/nbd0: read failed after 0 of 1024 at 0: Input/output error
    /dev/nbd0: read failed after 0 of 1024 at 509804544: Input/output error
    /dev/nbd0: read failed after 0 of 2048 at 0: Input/output error
    /dev/nbd1: read failed after 0 of 1024 at 509804544: Input/output error
    /dev/nbd1: read failed after 0 of 2048 at 0: Input/output error

    From now on, uninitialized nbd devices will have size zero, which
    prevents these errors.

    Signed-off-by: Paul Clements
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Clements
     
  • I would suggest this change to make CodingStyle properly reflect the style
    used by the kernel, rather than the current wording, which is wishful
    thinking and misleading, and comes from the same school of thought that
    gets off on prescriptive grammar, Latin and comp.std.c.

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • AUTO_DMA and FLOPPY_MOTOR_MASK in include/asm-*/floppy.h are dead symbols -
    remove them.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • The floppy driver is already written to be able to operate in virtual DMA
    mode. Thus it can easily be adjusted to tolerate failure from
    fd_request_dma() as long as virtual DMA mode is not disallowed.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • Kconfig.preempt is not included on some archs (for example, m68k). On those
    archs, the Kconfig machinery complains that KVM selects an undefined symbol
    PREEMPT_NOTIFIERS (which lives in Kconfig.preempt).

    So move the offending symbol into a Kconfig file which is included by
    everyone.

    Cc: Roman Zippel
    Cc: Geert Uytterhoeven
    Signed-off-by: Avi Kivity
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Avi Kivity
     
  • No longer used. TTY_FLIPBUF_SIZE will also go soon but needs a couple of
    other cleanups first.

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • robust_list, compat_robust_list, pi_state_list, pi_state_cache are
    really used if futexes are on.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Add a prefix "VMCOREINFO_" to the vmcoreinfo macros. The old vmcoreinfo
    macros were defined with the generic names SYMBOL/SIZE/OFFSET/LENGTH/CONFIG,
    which are impossible to grep for, so these names should be changed. The
    discussion is here:
    http://www.ussg.iu.edu/hypermail/linux/kernel/0709.1/0415.html

    Signed-off-by: Ken'ichi Ohmichi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken'ichi Ohmichi
     
  • Signed-off-by: Ken'ichi Ohmichi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken'ichi Ohmichi
     
  • [2/3] Add nodemask_t's size and NR_FREE_PAGES's value to vmcoreinfo_data.
    The dump filtering command 'makedumpfile' (v1.1.6 or earlier) assumed
    these values, which was bad for reliability. makedumpfile v1.2.0 needs
    the actual values, so this patch makes the kernel output them.
    makedumpfile site:
    https://sourceforge.net/projects/makedumpfile/

    Signed-off-by: Ken'ichi Ohmichi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken'ichi Ohmichi
     
  • [1/3] Cleanup the coding style according to Andrew's comments:
    http://lists.infradead.org/pipermail/kexec/2007-August/000522.html
    - vmcoreinfo_append_str() should have suitable __attribute__s so that
    the compiler can check its use.
    - vmcoreinfo_max_size should have size_t.
    - Use get_seconds() instead of xtime.tv_sec.
    - Use init_uts_ns.name.release instead of UTS_RELEASE.

    Signed-off-by: Ken'ichi Ohmichi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken'ichi Ohmichi
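
The first cleanup item, giving a printf-style helper attributes the compiler can check, can be illustrated with the GCC/Clang `format` attribute. This is a user-space sketch only: `append_str_sketch()`, `note_buf` and `note_len` are hypothetical names, and the real vmcoreinfo_append_str() may differ in every detail.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-size note buffer (assumption for this sketch). */
static char   note_buf[256];
static size_t note_len;

/*
 * format(printf, 1, 2): argument 1 is the format string and the
 * variadic arguments start at position 2, so the compiler can warn
 * when a call's arguments do not match its conversion specifiers.
 */
__attribute__((format(printf, 1, 2)))
static void append_str_sketch(const char *fmt, ...)
{
	size_t room = sizeof(note_buf) - note_len;  /* always >= 1 */
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(note_buf + note_len, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return;  /* encoding error: leave the buffer unchanged */

	/* vsnprintf reports the untruncated length; clamp to what fit. */
	note_len += ((size_t)n < room) ? (size_t)n : room - 1;
}
```

With the attribute in place, a call that passes an integer where the format string says `%s` produces a compile-time warning under `-Wformat` instead of misbehaving at run time.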
     
  • This patch set removes the requirement that makedumpfile users install a
    vmlinux file (including the debugging information) on each system.

    The makedumpfile command is the dump filtering feature for kdump. It creates
    a small dumpfile by filtering out pages that are unnecessary for analysis.
    To identify unnecessary pages, it needs a vmlinux file that includes the
    debugging information. These days the debugging package is huge, and it is
    hard to install it on each system.

    To solve the problem, kdump developers discussed it on lkml and kexec-ml.
    As a result, we concluded that the information necessary for dump filtering
    (called "vmcoreinfo") should be embedded into the first kernel file and
    accessed through /proc/vmcore from the second kernel.
    (http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.0/1806.html)

    Dan Aloni created the patch set for the above implementation.
    (http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.1/1053.html)

    And I updated it for multi architectures and memory models.
    (http://lists.infradead.org/pipermail/kexec/2007-August/000479.html)

    Signed-off-by: Dan Aloni
    Signed-off-by: Ken'ichi Ohmichi
    Signed-off-by: Bernhard Walle
    Signed-off-by: Daisuke Nishimura
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken'ichi Ohmichi
     
  • Fix this lot:

    fs/binfmt_flat.c: In function `decompress_exec':
    fs/binfmt_flat.c:293: warning: label `out' defined but not used
    fs/binfmt_flat.c: In function `load_flat_file':
    fs/binfmt_flat.c:462: warning: unsigned int format, long int arg (arg 3)
    fs/binfmt_flat.c:462: warning: unsigned int format, long int arg (arg 4)
    fs/binfmt_flat.c:518: warning: comparison of distinct pointer types lacks a cast
    fs/binfmt_flat.c:549: warning: passing arg 1 of `ksize' makes pointer from integer without a cast
    fs/binfmt_flat.c:601: warning: passing arg 1 of `ksize' makes pointer from integer without a cast
    fs/binfmt_flat.c: In function `load_flat_binary':
    fs/binfmt_flat.c:116: warning: 'dummy' might be used uninitialized in this function

    Acked-by: Greg Ungerer
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Simply fill out the bits in checkstack.pl for Blackfin. I thought I already
    sent this, but I don't see it in -mm anywhere ...

    Signed-off-by: Mike Frysinger
    Cc: Bryan Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     
  • do_sigaction() returns -ERESTARTNOINTR if signal_pending(). The comment says:

    * If there might be a fatal signal pending on multiple
    * threads, make sure we take it before changing the action.

    I think this is not needed. We should only worry about the SIGNAL_GROUP_EXIT
    case, but it implies a pending SIGKILL which can't be cleared by do_sigaction.

    Kill this special case.

    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • de_thread() yields waiting for ->group_leader to be a zombie. This deadlocks
    if an rt-prio execer shares the same cpu with ->group_leader. Change the code
    to use ->group_exit_task/notify_count mechanics.

    This patch certainly uglifies the code, perhaps someone can suggest something
    better.

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Now that we don't pre-allocate the new ->sighand, we can kill the first fast
    path; it doesn't make sense any longer. At best, it can save one
    "list_empty()" check, but it leads to code duplication.

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • de_thread() pre-allocates newsighand to make sure that exec() can't fail after
    killing all sub-threads. Imho, this buys nothing, but complicates the code:

    - this is (mostly) needed to handle CLONE_SIGHAND without CLONE_THREAD
    tasks, which is a very unlikely (if ever used) case

    - unless we already have some serious problems, GFP_KERNEL allocation
    should not fail

    - ENOMEM still can happen after de_thread(), ->sighand is not the last
    object we have to allocate

    Change the code to allocate the new ->sighand on demand.

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • There is no reason to do recalc_sigpending() after changing ->sighand.
    To begin with, recalc_sigpending() does not take ->sighand into account.

    This means we don't need to take newsighand->siglock while changing sighands.
    rcu_assign_pointer() provides a necessary barrier, and if another process
    reads the new ->sighand it should either take tasklist_lock or it should use
    lock_task_sighand() which has a corresponding smp_read_barrier_depends().

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Fix f_version type: should be u64 instead of long

    There is a type inconsistency between struct inode i_version and struct file
    f_version.

    fs.h:

    struct inode
    u64 i_version;

    and

    struct file
    unsigned long f_version;

    Users do:

    fs/ext3/dir.c:

    if (filp->f_version != inode->i_version) {

    So why isn't f_version a u64? It becomes a problem if versions get
    higher than 2^32 and we are on an architecture where longs are 32 bits.

    This patch changes the f_version type to u64, and updates the users accordingly.

    It applies to 2.6.23-rc2-mm2.

    Signed-off-by: Mathieu Desnoyers
    Cc: Martin Bligh
    Cc: "Randy.Dunlap"
    Cc: Al Viro
    Cc:
    Cc: Mark Fasheh
    Cc: Christoph Hellwig
    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
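
The mismatch described above can be reproduced in user space with fixed-width types. In this sketch `uint32_t` stands in for an `unsigned long` on a 32-bit architecture, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models the old layout on a 32-bit architecture: the cached copy of
 * the version is only 32 bits wide, so the upper half is lost when it
 * is stored.
 */
static bool versions_match_32(uint64_t i_version)
{
	uint32_t f_version = (uint32_t)i_version;  /* truncating store */

	/* f_version is widened back to 64 bits for the comparison. */
	return f_version == i_version;
}

/* Models the fixed layout: both fields are 64 bits wide. */
static bool versions_match_64(uint64_t i_version)
{
	uint64_t f_version = i_version;

	return f_version == i_version;
}
```

Once the version exceeds 32 bits, the truncated copy never compares equal to the inode's value, so a check like the one in fs/ext3/dir.c would see a change on every call. With matching 64-bit types the comparison behaves as intended.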
     
  • Some months back I proposed changing the schedule() call in
    read_events to an io_schedule():
    http://osdir.com/ml/linux.kernel.aio.general/2006-10/msg00024.html
    This was rejected as there are AIO operations that do not initiate
    disk I/O. I've had another look at the problem, and the only AIO
    operation that will not initiate disk I/O is IOCB_CMD_NOOP. However,
    this command isn't even wired up!

    Given that it doesn't work, and hasn't for *years*, I'm going to
    suggest again that we do proper I/O accounting when using AIO.

    Signed-off-by: Jeff Moyer
    Acked-by: Zach Brown
    Cc: Benjamin LaHaise
    Cc: Suparna Bhattacharya
    Cc: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     
  • Repost of http://lkml.org/lkml/2007/8/10/472 made available by request.

    The locking used by get_random_bytes() can conflict with the
    preempt_disable() and synchronize_sched() form of RCU. This patch changes
    rcutorture's RNG to gather entropy from the new cpu_clock() interface
    (relying on interrupts, preemption, daemons, and rcutorture's reader
    thread's rock-bottom scheduling priority to provide useful entropy), and
    also adds an EXPORT_SYMBOL_GPL() to make that interface available to GPLed
    kernel modules such as rcutorture.

    Passes several hours of rcutorture.

    [ego@in.ibm.com: Use raw_smp_processor_id() in rcu_random()]
    Signed-off-by: Paul E. McKenney
    Cc: Ingo Molnar
    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney