01 May, 2007

5 commits

  • This patch changes the docs and behaviour from "all states valid" to "no
    states valid" if no .valid callback is assigned. Users of pm_ops that only
    need mem sleep can assign pm_valid_only_mem without any overhead, others
    will require more elaborate callbacks.

    Now that all users of pm_ops have a .valid callback this is a safe thing to
    do and prevents things from getting messy again as they were before.

    Signed-off-by: Johannes Berg
    Acked-by: Pavel Machek
    Looks-okay-to: Rafael J. Wysocki
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
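
The new default can be pictured with a small user-space sketch. The enum values, the reduced ops structure, and the helper names below are hypothetical stand-ins rather than the kernel's definitions; only the behaviour (no .valid callback means no state is valid, and a generic callback accepts just "mem") is taken from the commit text.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's sleep state constants. */
typedef enum {
    SLEEP_STANDBY = 1,
    SLEEP_MEM = 3,
    SLEEP_DISK = 4
} sleep_state_t;

/* Hypothetical, reduced ops table: only the .valid hook is modelled. */
struct pm_ops_sketch {
    int (*valid)(sleep_state_t state);
};

/* Generic callback in the spirit of pm_valid_only_mem: accept only "mem". */
static int valid_only_mem(sleep_state_t state)
{
    return state == SLEEP_MEM;
}

/* New default: a missing ops table or .valid callback rejects every state. */
static int state_is_valid(const struct pm_ops_sketch *ops, sleep_state_t state)
{
    if (ops == NULL || ops->valid == NULL)
        return 0;
    return ops->valid(state) != 0;
}
```

A platform that supports more than "mem" would supply its own callback instead of the generic one.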
     
  • Almost all users of pm_ops support only mem sleep, but they neither check
    the state in .valid nor reject other states in .prepare. Users can
    therefore be confused when they check /sys/power/state, especially when new
    states are added (these would then result in s-t-r although they're
    supposed to be something different).

    This patch implements a generic pm_valid_only_mem function that is then
    exported for users and puts it to use in almost all existing pm_ops.

    Signed-off-by: Johannes Berg
    Cc: David Brownell
    Acked-by: Pavel Machek
    Cc: linux-pm@lists.linux-foundation.org
    Cc: Len Brown
    Acked-by: Russell King
    Cc: Greg KH
    Cc: "Rafael J. Wysocki"
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     
  • This patch removes the firmware disk suspend mode, which is the wrong
    approach: it is supposed to be used for implementing firmware-based disk
    suspend but cannot actually be used for that.

    Signed-off-by: Johannes Berg
    Acked-by: Pavel Machek
    Cc: David Brownell
    Cc: Len Brown
    Acked-by: Russell King
    Cc: Greg KH
    Cc: "Rafael J. Wysocki"
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     
  • This patch series cleans up some misconceptions about pm_ops. Some users of
    the pm_ops structure attempt to use it to stop the user from entering suspend
    to disk. This, however, is not possible, since the user can always use
    "shutdown" in /sys/power/disk and then the pm_ops are never invoked. Also,
    platforms that don't support suspend to disk simply should not allow
    configuring SOFTWARE_SUSPEND (read the help text on it: it only selects
    suspend to disk and nothing else; all the other stuff depends on PM).

    The pm_ops structure is actually intended to provide a way to enter
    platform-defined sleep states (currently supported states are "standby" and
    "mem" (suspend to ram)) and additionally (if SOFTWARE_SUSPEND is configured)
    allows a platform to support a platform specific way to enter low-power mode
    once everything has been saved to disk. This is currently only used by ACPI
    (S4).

    This patch:

    The pm_ops.pm_disk_mode is used in totally bogus ways since nobody really
    seems to understand what it actually does.

    This patch clarifies the pm_disk_mode description.

    It also removes all the arm and sh users that think they can veto suspend to
    disk via pm_ops. They cannot, since the user can always do echo shutdown >
    /sys/power/disk; they need to find a better way involving Kconfig or such.

    ACPI is the only user left with a non-zero pm_disk_mode.

    The patch also sets the default mode to shutdown again, but when a new pm_ops
    is registered its pm_disk_mode is selected as default. That way the default
    stays for ACPI, where it is apparently required.

    Signed-off-by: Johannes Berg
    Cc: David Brownell
    Acked-by: Pavel Machek
    Cc: Len Brown
    Acked-by: Russell King
    Cc: Greg KH
    Cc: "Rafael J. Wysocki"
    Acked-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     
  • Today's print_symbol function dumps a kernel symbol with printk. This
    patch extends the functionality of kallsyms.c so that the symbol lookup
    function may be used without the printk. This is useful for modules that
    want to dump symbols elsewhere, for example, to debugfs. I intend to use
    the new function call in the GFS2 file system (which will be a separate
    patch).

    [akpm@linux-foundation.org: build fix]
    [clameter@sgi.com: sprint_symbol should return length of string like sprintf]
    Signed-off-by: Robert Peterson
    Cc: Rusty Russell
    Cc: Roman Zippel
    Cc: "Randy.Dunlap"
    Cc: Sam Ravnborg
    Acked-by: Paulo Marques
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Peterson
     

29 Apr, 2007

1 commit

  • Both old-IDE and libata should be able to handle all controllers and
    devices found using normal resource reservation methods.

    This eliminates the awful, low-performing split-driver configuration
    where old-IDE drove the PATA portion of a PCI device, in PIO-only mode,
    and libata drove the SATA portion of the /same/ PCI device, in DMA mode.
    Typically vendors would ship a SATA hard drive / PATA optical
    configuration, which would lend itself to slow (PIO-only) CD-ROM
    performance.

    For Intel users running in combined mode, it is now wholly dependent on
    your driver choice (potentially link order, if you compile both drivers
    in) whether old-IDE or libata will drive your hardware.

    In either case, you will get full performance from both SATA and PATA
    ports now, without having to pass a kernel command line parameter.

    Signed-off-by: Jeff Garzik

    Jeff Garzik
     

28 Apr, 2007

5 commits

  • Fix miscellaneous networking compilation errors.

    (*) Export ktime_add_ns() for modules.

    (*) wext_proc_init() should have an ANSI declaration.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (46 commits)
    dev_dbg: check dev_dbg() arguments
    drivers/base/attribute_container.c: use mutex instead of binary semaphore
    mod_sysfs_setup() doesn't return errno when kobject_add_dir() failure occurs
    s2ram: add arch irq disable/enable hooks
    define platform wakeup hook, use in pci_enable_wake()
    security: prevent permission checking of file removal via sysfs_remove_group()
    device_schedule_callback() needs a module reference
    s390: cio: Delay uevents for subchannels
    sysfs: bin.c printk fix
    Driver core: use mutex instead of semaphore in DMA pool handler
    driver core: bus_add_driver should return an error if no bus
    debugfs: Add debugfs_create_u64()
    the overdue removal of the mount/umount uevents
    kobject: Comment and warning fixes to kobject.c
    Driver core: warn when userspace writes to the uevent file in a non-supported way
    Driver core: make uevent-environment available in uevent-file
    kobject core: remove rwsem from struct subsystem
    qeth: Remove usage of subsys.rwsem
    PHY: remove rwsem use from phy core
    IEEE1394: remove rwsem use from ieee1394 core
    ...

    Linus Torvalds
     
  • mod_sysfs_setup() doesn't return an errno when kobject_add_dir() for the
    module "holders" directory fails. So the caller of mod_sysfs_setup() keeps
    going and oopses.

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Akinobu Mita
     
  • After some more discussion this patch replaces it:

    From: Johannes Berg
    Subject: suspend: add arch irq disable/enable hooks

    For powermac, we need to do some things between suspending devices and
    device_power_off, for example setting the decrementer. This patch
    allows architectures to define arch_s2ram_{en,dis}able_irqs in their
    asm/suspend.h to have control over this step.

    Signed-off-by: Johannes Berg
    Acked-by: Pavel Machek
    Cc: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
     
  • show_state() (SysRq-T) developed the buggy habit of not showing
    TASK_RUNNING tasks. This was due to the mistaken belief that state_filter
    == -1 would be a pass-through filter - while in reality it did not let
    TASK_RUNNING == 0 p->state values through.

    Fix this by restoring the original '!state_filter means all tasks'
    special-case I had in the original version. Test-built and test-booted on
    i686, SysRq-T now works as intended.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
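
The filter problem is easy to reproduce outside the kernel. In the sketch below, the fact that TASK_RUNNING is 0 comes from the commit text; the other state values and the function names are illustrative assumptions, not the kernel's code.

```c
/* TASK_RUNNING is 0 as stated in the commit; the other states are
 * individual bits. The non-zero values here are illustrative only. */
#define ST_RUNNING          0UL
#define ST_INTERRUPTIBLE    1UL
#define ST_UNINTERRUPTIBLE  2UL

/* Buggy variant: a pure mask test. A state of 0 never matches, even
 * when the filter is all ones (-1). */
static int shown_buggy(unsigned long state, unsigned long filter)
{
    return (state & filter) != 0;
}

/* Fixed variant: a zero filter is the special case meaning "all tasks". */
static int shown_fixed(unsigned long state, unsigned long filter)
{
    return filter == 0 || (state & filter) != 0;
}
```

With an all-ones filter the buggy variant hides running tasks, while the fixed variant shows every task when the filter is 0 and still honours a specific mask otherwise.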
     

27 Apr, 2007

1 commit


26 Apr, 2007

6 commits

  • Switch cb_lock to mutex and allow netlink kernel users to override it
    with a subsystem specific mutex for consistent locking in dump callbacks.
    All netlink_dump_start users have been audited not to rely on any
    side-effects of the previously used spinlock.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Change tcp_probe to use ktime (needed to add one export).
    Add option to only get events when cwnd changes - from Doug Leith

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
    number of direct accesses to skb->data and for consistency with all the other
    cast skb member helpers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
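
A cast helper of the kind described might look like the following user-space sketch. The structure layouts and the name msg_hdr are hypothetical simplifications chosen for illustration; only the idea of wrapping the "(struct nlmsghdr *)skb->data" cast in one accessor comes from the commit text.

```c
#include <stdint.h>

/* Hypothetical, simplified message header. */
struct msg_header {
    uint32_t len;
    uint16_t type;
    uint16_t flags;
};

/* Hypothetical, reduced buffer: only the data pointer is modelled. */
struct buf_sketch {
    unsigned char *data;
};

/* One accessor replaces the open-coded cast at every call site. */
static inline struct msg_header *msg_hdr(const struct buf_sketch *skb)
{
    return (struct msg_header *)skb->data;
}
```

Call sites then read msg_hdr(skb)->len instead of repeating the cast, which keeps direct accesses to the data pointer in one place.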
     
  • So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
    on 64bit architectures, allowing us to combine the 4 bytes hole left by the
    layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
    64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
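
To illustrate why offsets simplify such calculations, here is a minimal sketch. The reduced structure and helper names are assumptions made for the example and do not reproduce struct sk_buff; the point is only that 4-byte offsets from a common base can be subtracted directly and turned into pointers on demand.

```c
#include <stdint.h>

/* Hypothetical, reduced buffer: header positions are stored as 32-bit
 * offsets from head rather than as 8-byte pointers. */
struct buf_sketch {
    unsigned char *head;
    uint32_t transport_header;
    uint32_t network_header;
};

/* Convert an offset to a pointer only where a pointer is really needed. */
static inline unsigned char *transport_ptr(const struct buf_sketch *b)
{
    return b->head + b->transport_header;
}

/* Lengths fall out of plain offset arithmetic, no pointer conversion. */
static inline uint32_t network_header_len(const struct buf_sketch *b)
{
    return b->transport_header - b->network_header;
}
```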
     
  • Get rid of the manual clock source selection mess and use ktime. Also
    use a scalar representation, which allows cleaning up pkt_sched.h a bit
    more and results in fewer ktime_to_ns() calls in most cases.

    The PSCHED_US2JIFFIE/PSCHED_JIFFIE2US macros are implemented quite
    inefficiently by this patch; following patches will convert all qdiscs
    to hrtimers and get rid of them entirely.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • We currently use a special structure (struct skb_timeval) and plain
    'struct timeval' to store packet timestamps in sk_buffs and struct
    sock.

    This has some drawbacks:
    - Fixed resolution of one microsecond.
    - Waste of space on 64bit platforms where sizeof(struct timeval)=16

    I suggest using ktime_t that is a nice abstraction of high resolution
    time services, currently capable of nanosecond resolution.

    As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
    an 8 byte shrink of this structure on 64bit architectures. Some other
    structures also benefit from this size reduction (struct ipq in
    ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)

    Once this ktime infrastructure is adopted, we can more easily provide
    nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
    SO_TIMESTAMPNS/SCM_TIMESTAMPNS)

    Note: this patch includes a bug correction in
    compat_sock_get_timestamp() where a "err = 0;" was missing (so this
    syscall returned -ENOENT instead of 0)

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: John find
    Signed-off-by: David S. Miller

    Eric Dumazet
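
The size and resolution argument can be checked with a small sketch. The types below are user-space stand-ins chosen for the example (a 16-byte seconds/microseconds pair and an 8-byte nanosecond scalar); they are assumptions for illustration and not the kernel's struct timeval or ktime_t definitions.

```c
#include <stdint.h>

/* Stand-in for a 64-bit struct timeval: two 8-byte fields, 16 bytes. */
struct tv64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* Stand-in for a scalar nanosecond timestamp: 8 bytes. */
typedef int64_t stamp_ns_t;

/* Seconds/microseconds to nanoseconds. */
static stamp_ns_t tv_to_ns(struct tv64 tv)
{
    return tv.tv_sec * 1000000000LL + tv.tv_usec * 1000LL;
}

/* Nanoseconds back to seconds/microseconds, for non-negative stamps.
 * Anything below one microsecond is lost in this direction. */
static struct tv64 ns_to_tv(stamp_ns_t ns)
{
    struct tv64 tv;

    tv.tv_sec = ns / 1000000000LL;
    tv.tv_usec = (ns % 1000000000LL) / 1000LL;
    return tv;
}
```

The scalar form halves the storage and keeps nanosecond detail that the microsecond pair has to truncate.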
     

24 Apr, 2007

1 commit

  • The commit 34f5a39899f3f3e815da64f48ddb72942d86c366 restricted reading
    of the tainted value. The attached patch changes this back to a
    write-only check and restores the read behaviour of older versions.

    Signed-off-by: Bastian Blank
    Cc: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bastian Blank
     

13 Apr, 2007

1 commit


08 Apr, 2007

4 commits

  • Getting rid of the p->children printout in show_task() left behind an
    unused variable.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The p->parent PID printout gives us all the information about the
    task tree that we need - the eldest_child()/older_sibling()/
    younger_sibling() printouts are mostly historic and I do not
    remember ever having used those fields. (IMO in fact they confuse
    the SysRq-T output.) So remove them.

    This code has sentimental value though: those fields and
    printouts are among the oldest ones still surviving from
    Linux v0.95's kernel/sched.c:

        if (p->p_ysptr || p->p_osptr)
                printk(" Younger sib=%d, older sib=%d\n\r",
                        p->p_ysptr ? p->p_ysptr->pid : -1,
                        p->p_osptr ? p->p_osptr->pid : -1);
        else
                printk("\n\r");

    written 15 years ago, in early 1992.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus 'snif' Torvalds

    Ingo Molnar
     
  • devres should be deallocated with devres_free() not kfree(). This bug
    corrupts slab on IRQ request failure. Fix it.

    Signed-off-by: Tejun Heo
    Cc: Andrew Morton
    Cc: Greg KH
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Soeren Sonnenburg reported that upon resume he is getting
    this backtrace:

    [] smp_apic_timer_interrupt+0x57/0x90
    [] retrigger_next_event+0x0/0xb0
    [] apic_timer_interrupt+0x28/0x30
    [] retrigger_next_event+0x0/0xb0
    [] __kfifo_put+0x8/0x90
    [] on_each_cpu+0x35/0x60
    [] clock_was_set+0x18/0x20
    [] timekeeping_resume+0x7c/0xa0
    [] __sysdev_resume+0x11/0x80
    [] sysdev_resume+0x47/0x80
    [] device_power_up+0x5/0x10

    it turns out that on resume we mistakenly re-enable interrupts too
    early. Do the timer retrigger only on the current CPU.

    Signed-off-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: Soeren Sonnenburg
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

05 Apr, 2007

3 commits

  • In debugging a problem w/ the -rt tree, I noticed that on systems that mark
    the tsc as unstable before it is registered, the TSC would still be
    selected and used for a short period of time. Digging in, it looks to be a
    result of the mix of the clocksource list changes and my clocksource
    initialization changes.

    With the -rt tree, using a bad TSC, even for a short period of time, can
    result in a hang at boot. I was not able to reproduce this hang w/
    mainline, but I'm not completely certain that someone won't trip on it.

    This patch resolves the issue by initializing the jiffies clocksource
    earlier so a bad TSC won't get selected just because nothing else is yet
    registered.

    Signed-off-by: John Stultz
    Acked-by: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Fix a bug in the swsusp's memory shrinker that causes some systems using
    highmem to refuse to suspend to disk if image_size is set above 1/2 of
    available RAM.

    Special thanks to Jiri Slaby for reporting the problem and assistance in
    debugging it.

    Signed-off-by: Rafael J. Wysocki
    Cc: Jiri Slaby
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • This patch adds 2 missing symbol exports: jiffies_to_timeval() and
    timeval_to_jiffies(). The (not yet merged) dm-raid4-5 module will need
    them, and they used to be indirectly exported by virtue of being inline
    functions.

    Commit 8b9365d753d9870bb6451504c13570b81923228f ("[PATCH] Uninline
    jiffies.h functions") uninlined them, and thus modules now need them
    explicitly exported to use them.

    Signed-off-by: Thomas Bittermann
    Acked-by: Andrew Morton
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: john stultz
    Signed-off-by: Linus Torvalds

    Thomas Bittermann
     

03 Apr, 2007

2 commits

  • Fix the regression resulting from the recent change of suspend code
    ordering that causes systems based on Intel x86 CPUs using the microcode
    driver to hang during the resume.

    The problem occurs since the microcode driver uses request_firmware() in
    its CPU hotplug notifier, which is called after tasks have been frozen and
    hangs. It can be fixed by telling the microcode driver to use the
    microcode stored in memory during the resume instead of trying to load it
    from disk.

    Signed-off-by: Rafael J. Wysocki
    Adrian Bunk
    Cc: Tigran Aivazian
    Cc: Pavel Machek
    Cc: Maxim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Built-in drivers had broken sysfs links that caused bootup hangs for
    certain driver unregistration sequences.

    Signed-off-by: Ingo Molnar
    Acked-by: Kay Sievers
    Signed-off-by: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kay Sievers
     

29 Mar, 2007

2 commits

  • In commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1 when converting the
    orphaned process group handling to use struct pid I made a small
    mistake. I accidentally replaced an == with a !=.

    Besides just being a dumb thing to do, apparently this has a bad side
    effect. The improper orphaned process group detection causes kwin to
    die after a suspend/resume cycle.

    I'm amazed this patch has been around as long as it has without anyone
    else noticing something funny going on.

    And the following people deserve credit for spotting and helping
    to reproduce this.

    Thanks to: Sid Boyce
    Thanks to: "Michael Wu"

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • hrtimer_start() incorrectly set the 'reprogram' flag to enqueue_hrtimer(),
    which should only be 1 if the hrtimer is queued to the current CPU.

    Doing otherwise could result in a reprogramming of the current CPU's
    clockevents device, with a timer that is not queued to it - resulting in a
    bogus next expiry value.

    Signed-off-by: Ingo Molnar
    Cc: Michal Piotrowski
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

28 Mar, 2007

2 commits

  • This reverts commit 94985134b7b46848267ed6b734320db01c974e72 and
    instead removes the WARN_ON() that caused that commit in the first
    place.

    The problem is that we call disable_nonboot_cpus() in swsusp before
    powering down the system in order to avoid triggering the WARN_ON()
    in arch/x86_64/kernel/acpi/sleep.c:init_low_mapping() and this doesn't
    work well on Thomas' system.

    So instead, remove the WARN_ON() in arch/x86_64/kernel/acpi/sleep.c:
    init_low_mapping(), which triggers every time during the suspend to disk
    in the platform mode, as the potential problem it is related to doesn't
    seem to occur in practice.

    [ I think we might want to disallow the case of multiple users of that
    mm, or something. Normally, playing with the current process page
    tables on the current CPU should be fine as long as we don't have
    other threads using those tables at the same time..

    Anyway, not pretty, but better than the warning or the lockup - Linus ]

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • I've been seeing some odd NTP behavior recently on a few boxes and
    finally narrowed it down to time_offset overflowing when converted to
    SHIFT_UPDATE units (which was a side effect from my HZfreeNTP patch).

    This patch converts time_offset from a long to a s64 which resolves the
    issue.

    [tglx@linutronix.de: signedness fixes]
    Signed-off-by: John Stultz
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
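
The overflow can be sketched in user space under stated assumptions: a 32-bit long is modelled with int32_t, and the shift amount below is a hypothetical value picked for the example, not the kernel's SHIFT_UPDATE. The helpers show how a value that fits comfortably in 32 bits stops fitting once it is scaled, and why a 64-bit type avoids the problem.

```c
#include <stdint.h>

/* Hypothetical shift amount for illustration only. */
#define SKETCH_SHIFT 12

/* Scale in 64 bits; well defined for the whole int32_t input range. */
static int64_t scale64(int32_t offset)
{
    return (int64_t)offset * (1LL << SKETCH_SHIFT);
}

/* Returns 1 if the scaled value would not fit in a 32-bit signed type. */
static int overflows32(int32_t offset)
{
    int64_t wide = scale64(offset);

    return wide > INT32_MAX || wide < INT32_MIN;
}
```

With these assumed numbers, an offset of 1000000 scales to 4096000000, which is already beyond what a signed 32-bit variable can hold.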
     

27 Mar, 2007

1 commit


26 Mar, 2007

2 commits

  • The watchdog implementation unintentionally excludes low res / non
    continuous clocksources from being selected as a watchdog
    reference.

    Allow using jiffies/PIT as a watchdog reference as long as no better
    clocksource is available. This is necessary to detect TSC breakage on
    systems which have no pmtimer/hpet.

    The main goal of the initial patch (preventing a switch to highres/nohz
    when no reliable fallback clocksource is available) is still guaranteed
    by the checks in clocksource_watchdog().

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • The rework of next_timer_interrupt() fixed the timer wheel bugs, but
    introduced a rounding error versus the next hrtimer event. This is caused
    by the conversion of the hrtimer internal representation to relative
    jiffies.

    This causes bug #8100:
    http://bugzilla.kernel.org/show_bug.cgi?id=8100

    next_timer_interrupt() returns "now" in such a case and causes the code
    in tick_nohz_stop_sched_tick() to trigger the timer softirq, which is
    bogus as no timer is due for expiry. This results in endless context
    switching between idle and ksoftirqd until a timer is due for expiry.

    Modify the hrtimer evaluation so that it returns now + 1 when the
    conversion results in a delta < 1 jiffy.

    It's confirmed to resolve bug #8100.

    Reported-by: Emil Karlson
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
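
The rounding fix can be sketched as follows. The nanoseconds-per-jiffy constant assumes HZ=1000 and the function name is hypothetical; the only behaviour taken from the commit text is that a delta which rounds down to less than one jiffy yields now + 1 rather than now.

```c
/* Assumes HZ=1000, i.e. one jiffy is 1,000,000 ns (illustrative only). */
#define NS_PER_JIFFY_SKETCH 1000000ULL

/* Convert the time until the next hrtimer event into an absolute jiffy
 * value. Integer division rounds down, so a pending event less than one
 * jiffy away would map to "now"; clamp it to now + 1 instead. */
static unsigned long next_hrtimer_jiffies(unsigned long now,
                                          unsigned long long delta_ns)
{
    unsigned long long delta = delta_ns / NS_PER_JIFFY_SKETCH;

    if (delta < 1)
        delta = 1;
    return now + (unsigned long)delta;
}
```

Without the clamp, a timer half a jiffy away would be reported as due immediately, which is the condition that kept triggering the timer softirq.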
     

24 Mar, 2007

1 commit


23 Mar, 2007

3 commits

  • lockdep's data shouldn't be used when debug_locks == 0 because it's not
    updated after this, so it's more misleading than helpful.

    PS: probably lockdep's current-> fields should be reset after it turns
    debug_locks off: so, after printing a bug report, but before return from
    exported functions, but there are really a lot of these possibilities (e.g.
    after DEBUG_LOCKS_WARN_ON), so something could be missed. (Of course
    direct use of these fields isn't recommended either.)

    Reported-by: Folkert van Heusden
    Inspired-by: Oleg Nesterov
    Signed-off-by: Jarek Poplawski
    Acked-by: Peter Zijlstra
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarek Poplawski
     
  • It causes extra moon icons blinking on x60, and breaks at least two other
    systems.

    During resume, we do not know that "reboot"/"shutdown" method was used, so
    we assume "platform" and call BIOS, anyway...

    This is 2.6.21 material, and should fix 2 or 3 regressions from 2.6.20.

    Signed-off-by: Pavel Machek
    Acked-by: "Rafael J. Wysocki"
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • The SNAPSHOT_S2RAM ioctl does not disable the nonboot CPUs before entering
    the suspend, although it should do this.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki