30 Sep, 2006

18 commits

  • The clock_nanosleep() function does not return the time remaining when the
    sleep is interrupted by a signal.

    This patch creates a new call out, compat_clock_nanosleep_restart(), which
    handles returning the remaining time after a sleep is interrupted. This
    patch revives clock_nanosleep_restart(). It is now accessed via the new
    call out. The compat_clock_nanosleep_restart() is used for compatibility
    access.

    Since this is implemented in compatibility mode the normal path is
    virtually unaffected - no real performance impact.

    Signed-off-by: Toyo Abe
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toyo Abe
     
  • Spawing ksoftirqd, migration, or watchdog, and calling init_timers_cpu()
    may fail with small memory. If it happens in initcalls, kernel NULL
    pointer dereference happens later. This patch makes crash happen
    immediately in such cases. It seems a bit better than getting kernel NULL
    pointer dereference later.

    Cc: Ingo Molnar
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Both __kfifo_put() and __kfifo_get() have header comments stating that if
    there is but one concurrent reader and one concurrent writer, locking is not
    necessary. This is almost the case, but a couple of memory barriers are
    needed. Another option would be to change the header comments to remove the
    bit about locking not being needed, and to change the those callers who
    currently don't use locking to add the required locking. The attachment
    analyzes this approach, but the patch below seems simpler.

    Signed-off-by: Paul E. McKenney
    Cc: Stelian Pop
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • Lets do the same thing we do for oopses - print out the version in the
    report. It's an extra line of output though. We could tack it on the end
    of the INFO: lines, but that screws up Ingo's pretty output.

    Signed-off-by: Dave Jones
    Cc: Ingo Molnar
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jones
     
  • This is an updated version of Eric Biederman's is_init() patch.
    (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and
    replaces a few more instances of ->pid == 1 with is_init().

    Further, is_init() checks pid and thus removes dependency on Eric's other
    patches for now.

    Eric's original description:

    There are a lot of places in the kernel where we test for init
    because we give it special properties. Most significantly init
    must not die. This results in code all over the kernel test
    ->pid == 1.

    Introduce is_init to capture this case.

    With multiple pid spaces for all of the cases affected we are
    looking for only the first process on the system, not some other
    process that has pid == 1.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Dave Hansen
    Cc: Serge Hallyn
    Cc: Cedric Le Goater
    Cc:
    Acked-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
     
  • Fixed race on put_files_struct on exec with proc. Restoring files on
    current on error path may lead to proc having a pointer to already kfree-d
    files_struct.

    ->files changing at exit.c and khtread.c are safe as exit_files() makes all
    things under lock.

    Found during OpenVZ stress testing.

    [akpm@osdl.org: add export]
    Signed-off-by: Pavel Emelianov
    Signed-off-by: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Korotaev
     
  • Fix "variable defined but not used" compiler warning in unwind.c when
    CONFIG_MODULES is not set.

    Signed-off-by: Chuck Ebbert
    Cc: Jan Beulich
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chuck Ebbert
     
  • Some of the kerneldoc comments in this file are ignored since the lead-in
    is malformed, using either "/*" or "/***" instead of "/**".

    [rdunlap@xenotime.net: kerneldoc fixes]
    Signed-off-by: Rolf Eike Beer
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
     
  • Oleg brought up some interesting points about grabbing the pi_lock for some
    protections. In this discussion, I realized that there are some places
    that the pi_lock is being grabbed when it really wasn't necessary. Also
    this patch does a little bit of clean up.

    This patch basically does three things:

    1) renames the "boost" variable to "chain_walk". Since it is used in
    the debugging case when it isn't going to be boosted. It better
    describes what the test is going to do if it succeeds.

    2) moves get_task_struct to just before the unlocking of the wait_lock.
    This removes duplicate code, and makes it a little easier to read. The
    owner wont go away while either the pi_lock or the wait_lock are held.

    3) removes the pi_locking and owner blocked checking completely from the
    debugging case. This is because the grabbing the lock and doing the
    check, then releasing the lock is just so full of races. It's just as
    good to go ahead and call the pi_chain_walk function, since after
    releasing the lock the owner can then block anyway, and we would have
    missed that. For the debug case, we really do want to do the chain walk
    to test for deadlocks anyway.

    [oleg@tv-sign.ru: more of the same]
    Signed-of-by: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Oleg Nesterov
    Cc: Esben Nielsen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven Rostedt
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • lock_timer_base acquires a lock and returns with that lock held. Add a
    lock annotation to this function so that sparse can check callers for lock
    pairing, and so that sparse will not complain about this function since it
    intentionally uses the lock in this manner.

    Signed-off-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     
  • Initialize module_subsys earlier (or at least earlier than devices) since
    it could be used very early in the boot process if kmod loads a module
    before the device initcalls. Otherwise, kmod will crash in
    kernel/module.c:mod_sysfs_setup() since the kset in module_subsys is not
    initialized yet.

    I only noticed this problem because occasionally, kmod loads the modules
    for my SCSI and Ethernet adapters very early, during the boot process
    itself. I don't quite understand why it loads them sometimes and doesn't
    load them other times. Or who is telling kmod to do so. Can someone
    explain?

    Signed-off-by: Mark Huang
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Huang
     
  • rcu_torture_read_lock and rcu_bh_torture_read_lock acquire locks without
    releasing them, and the matching functions rcu_torture_read_unlock and
    rcu_bh_torture_read_unlock get called with the corresponding locks held and
    release them. Add lock annotations to these four functions so that sparse
    can check callers for lock pairing, and so that sparse will not complain
    about these functions since they intentionally use locks in this manner.

    Signed-off-by: Josh Triplett
    Acked-by: Paul McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     
  • Signed-off-by: Yoichi Yuasa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoichi Yuasa
     
  • Signed-off-by: Yoichi Yuasa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoichi Yuasa
     
  • Add relay interface support to DocBook/kernel-api.tmpl. Fix typos etc. in
    relay.c and relayfs.txt.

    Signed-off-by: Randy Dunlap
    Acked-by: Tom Zanussi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Check driver layer return values in kernel/params.c

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • If the cpu has the lock held for write, is interrupted, and the interrupt
    handler calls read_trylock(), it's an instant deadlock.

    Now, Dave Miller has subsequently pointed out that we don't have any
    situations where this can occur. Nevertheless, we should delete
    generic__raw_read_lock (and its associated EXPORT to make Arjan happy) so that
    nobody thinks they can use it.

    Acked-by: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

29 Sep, 2006

1 commit

  • * 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (130 commits)
    [ARM] 3856/1: Add clocksource for Intel IXP4xx platforms
    [ARM] 3855/1: Add generic time support
    [ARM] 3873/1: S3C24XX: Add irq_chip names
    [ARM] 3872/1: S3C24XX: Apply consistant tabbing to irq_chips
    [ARM] 3871/1: S3C24XX: Fix ordering of EINT4..23
    [ARM] nommu: confirms the CR_V bit in nommu mode
    [ARM] nommu: abort handler fixup for !CPU_CP15_MMU cores.
    [ARM] 3870/1: AT91: Start removing static memory mappings
    [ARM] 3869/1: AT91: NAND support for DK and KB9202 boards
    [ARM] 3868/1: AT91 hardware header update
    [ARM] 3867/1: AT91 GPIO update
    [ARM] 3866/1: AT91 clock update
    [ARM] 3865/1: AT91RM9200 header updates
    [ARM] 3862/2: S3C2410 - add basic power management support for AML M5900 series
    [ARM] kthread: switch arch/arm/kernel/apm.c
    [ARM] Off-by-one in arch/arm/common/icst*
    [ARM] 3864/1: Refactore sharpsl_pm
    [ARM] 3863/1: Add Locomo SPI Device
    [ARM] 3847/2: Convert LOMOMO to use struct device for GPIOs
    [ARM] Use CPU_CACHE_* where possible in asm/cacheflush.h
    ...

    Linus Torvalds
     

28 Sep, 2006

1 commit


27 Sep, 2006

10 commits

  • With the patches flying between Oleg and myself somehow this temporary
    debug code got left in pid.c. It was never intended to make it to the
    stable kernel.

    Signed-off-by: Eric W. Biederman
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • In de_thread we move pids from one process to another, a rather ugly case.
    The function transfer_pid makes it clear what we are doing, and makes the
    action atomic. This is useful we ever want to atomically traverse the
    process group and session lists, in a rcu safe manner.

    Even if the atomic properties this change should be a win as transfer_pid
    should be less code to execute than executing both attach_pid and
    detach_pid, and this should make de_thread slightly smaller as only a
    single function call needs to be emitted. The only downside is that the
    code might be slower to execute as the odds are against transfer_pid being
    in cache.

    Signed-off-by: Eric W. Biederman
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Since sys_sysctl is deprecated start allow it to be compiled out. This
    should catch any remaining user space code that cares, and paves the way
    for further sysctl cleanups.

    [akpm@osdl.org: If sys_sysctl() is not compiled-in, emit a warning]
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This eliminates the i_blksize field from struct inode. Filesystems that want
    to provide a per-inode st_blksize can do so by providing their own getattr
    routine instead of using the generic_fillattr() function.

    Note that some filesystems were providing pretty much random (and incorrect)
    values for i_blksize.

    [bunk@stusta.de: cleanup]
    [akpm@osdl.org: generic_fillattr() fix]
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     
  • The following patches reduce the size of the VFS inode structure by 28 bytes
    on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction
    in the inode size on a UP kernel that is configured in a production mode
    (i.e., with no spinlock or other debugging functions enabled; if you want to
    save memory taken up by in-core inodes, the first thing you should do is
    disable the debugging options; they are responsible for a huge amount of bloat
    in the VFS inode structure).

    This patch:

    The filesystem or device-specific pointer in the inode is inside a union,
    which is pretty pointless given that all 30+ users of this field have been
    using the void pointer. Get rid of the union and rename it to i_private, with
    a comment to explain who is allowed to use the void pointer. This is just a
    cleanup, but it allows us to reuse the union 'u' for something something where
    the union will actually be used.

    [judith@osdl.org: powerpc build fix]
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Judith Lebzelter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     
  • Move the fallback arch_vma_name() to a sensible place (kernel/signal.c).

    Currently it's in fs/proc/task_mmu.c, a file that is dependent on both
    CONFIG_PROC_FS and CONFIG_MMU being enabled, but it's used from
    kernel/signal.c from where it is called unconditionally.

    [akpm@osdl.org: build fix]
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Check that access_process_vm() is accessing a valid mapping in the target
    process.

    This limits ptrace() accesses and accesses through /proc//maps to only
    those regions actually mapped by a program.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • If you have two resources which aree exactly the same size,
    insert_resource() currently inserts the new one below the existing one.
    This is wrong because there's no way to insert a resource of the same size
    above an existing one.

    I took this opportunity to rewrite the initial loop to be a for-loop
    instead of a goto-loop and fix the documentation.

    Signed-off-by: Matthew Wilcox
    Cc: Ivan Kokshaysky
    Cc: Dominik Brodowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     
  • * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
    [PATCH] Don't set calgary iommu as default y
    [PATCH] i386/x86-64: New Intel feature flags
    [PATCH] x86: Add a cumulative thermal throttle event counter.
    [PATCH] i386: Make the jiffies compares use the 64bit safe macros.
    [PATCH] x86: Refactor thermal throttle processing
    [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
    [PATCH] Fix unwinder warning in traps.c
    [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
    [PATCH] x86: Move direct PCI scanning functions out of line
    [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
    [PATCH] Don't leak NT bit into next task
    [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
    [PATCH] Fix some broken white space in ia32_signal.c
    [PATCH] Initialize argument registers for 32bit signal handlers.
    [PATCH] Remove all traces of signal number conversion
    [PATCH] Don't synchronize time reading on single core AMD systems
    [PATCH] Remove outdated comment in x86-64 mmconfig code
    [PATCH] Use string instructions for Core2 copy/clear
    [PATCH] x86: - restore i8259A eoi status on resume
    [PATCH] i386: Split multi-line printk in oops output.
    ...

    Linus Torvalds
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits)
    Driver core: Don't call put methods while holding a spinlock
    Driver core: Remove unneeded routines from driver core
    Driver core: Fix potential deadlock in driver core
    PCI: enable driver multi-threaded probe
    Driver Core: add ability for drivers to do a threaded probe
    sysfs: add proper sysfs_init() prototype
    drivers/base: check errors
    drivers/base: Platform notify needs to occur before drivers attach to the device
    v4l-dev2: handle __must_check
    add CONFIG_ENABLE_MUST_CHECK
    add __must_check to device management code
    Driver core: fixed add_bind_files() definition
    Driver core: fix comments in drivers/base/power/resume.c
    sysfs_remove_bin_file: no return value, dump_stack on error
    kobject: must_check fixes
    Driver core: add ability for devices to create and remove bin files
    Class: add support for class interfaces for devices
    Driver core: create devices/virtual/ tree
    Driver core: add device_rename function
    Driver core: add ability for classes to handle devices properly
    ...

    Linus Torvalds
     

26 Sep, 2006

10 commits

  • Add the pm_trace attribute in /sys/power which has to be explicitly set to
    one to really enable the "PM tracing" code compiled in when CONFIG_PM_TRACE
    is set (which modifies the machine's CMOS clock in unpredictable ways).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Change suspend_console() so that it waits for all consoles to flush the
    remaining messages and make it possible to switch the console suspending off
    with the help of a Kconfig option.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Stefan Seyfried
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Make swsusp use memory bitmaps to store its internal information during the
    resume phase of the suspend-resume cycle.

    If the pfns of saveable pages are saved during the suspend phase instead of
    the kernel virtual addresses of these pages, we can use them during the resume
    phase directly to set the corresponding bits in a memory bitmap. Then, this
    bitmap is used to mark the page frames corresponding to the pages that were
    saveable before the suspend (aka "unsafe" page frames).

    Next, we allocate as many page frames as needed to store the entire suspend
    image and make sure that there will be some extra free "safe" page frames for
    the list of PBEs constructed later. Subsequently, the image is loaded and, if
    possible, the data loaded from it are written into their "original" page
    frames (ie. the ones they had occupied before the suspend).

    The image data that cannot be written into their "original" page frames are
    loaded into "safe" page frames and their "original" kernel virtual addresses,
    as well as the addresses of the "safe" pages containing their copies, are
    stored in a list of PBEs. Finally, the list of PBEs is used to copy the
    remaining image data into their "original" page frames (this is done
    atomically, by the architecture-dependent parts of swsusp).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Introduce the memory bitmap data structure and make swsusp use in the suspend
    phase.

    The current swsusp's internal data structure is not very efficient from the
    memory usage point of view, so it seems reasonable to replace it with a data
    structure that will require less memory, such as a pair of bitmaps.

    The idea is to use bitmaps that may be allocated as sets of individual pages,
    so that we can avoid making allocations of order greater than 0. For this
    reason the memory bitmap structure consists of several linked lists of objects
    that contain pointers to memory pages with the actual bitmap data. Still, for
    a typical system all of these lists fit in a single page, so it's reasonable
    to introduce an additional mechanism allowing us to allocate all of them
    efficiently without sacrificing the generality of the design. This is done
    with the help of the chain_allocator structure and associated functions.

    We need to use two memory bitmaps during the suspend phase of the
    suspend-resume cycle. One of them is necessary for marking the saveable
    pages, and the second is used to mark the pages in which to store the copies
    of them (aka image pages).

    First, the bitmaps are created and we allocate as many image pages as needed
    (the corresponding bits in the second bitmap are set as soon as the pages are
    allocated). Second, the bits corresponding to the saveable pages are set in
    the first bitmap and the saveable pages are copied to the image pages.
    Finally, the first bitmap is used to save the kernel virtual addresses of the
    saveable pages and the second one is used to save the contents of the image
    pages.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Introduce some constants that hopefully will help improve the readability of
    code in kernel/power/snapshot.c.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • The name of the pagedir_nosave variable does not make sense any more, so it
    seems reasonable to change it to something more meaningful.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Get rid of the FIXME in kernel/power/snapshot.c#alloc_pagedir() and
    simplify the functions called by it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Move some functions in kernel/power/snapshot.c to a better place (in the
    same file) and introduce free_image_page() (will be necessary in the
    future).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Clean up mm/page_alloc.c#mark_free_pages() and make it avoid clearing
    PageNosaveFree for PageNosave pages. This allows us to get rid of an ugly
    hack in kernel/power/snapshot.c#copy_data_pages().

    Additionally, the page-copying loop in copy_data_pages() is moved to an
    inline function.

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • The current suspend code has to be run on one CPU, so we use the CPU
    hotplug to take the non-boot CPUs offline on SMP machines. However, we
    should also make sure that these CPUs will not be enabled by someone else
    after we have disabled them.

    The functions disable_nonboot_cpus() and enable_nonboot_cpus() are moved to
    kernel/cpu.c, because they now refer to some stuff in there that should
    better be static. Also it's better if disable_nonboot_cpus() returns an
    error instead of panicking if something goes wrong, and
    enable_nonboot_cpus() has no reason to panic(), because the CPUs may have
    been enabled by the userland before it tries to take them online.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki