28 Apr, 2007

4 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (46 commits)
    dev_dbg: check dev_dbg() arguments
    drivers/base/attribute_container.c: use mutex instead of binary semaphore
    mod_sysfs_setup() doesn't return errno when kobject_add_dir() failure occurs
    s2ram: add arch irq disable/enable hooks
    define platform wakeup hook, use in pci_enable_wake()
    security: prevent permission checking of file removal via sysfs_remove_group()
    device_schedule_callback() needs a module reference
    s390: cio: Delay uevents for subchannels
    sysfs: bin.c printk fix
    Driver core: use mutex instead of semaphore in DMA pool handler
    driver core: bus_add_driver should return an error if no bus
    debugfs: Add debugfs_create_u64()
    the overdue removal of the mount/umount uevents
    kobject: Comment and warning fixes to kobject.c
    Driver core: warn when userspace writes to the uevent file in a non-supported way
    Driver core: make uevent-environment available in uevent-file
    kobject core: remove rwsem from struct subsystem
    qeth: Remove usage of subsys.rwsem
    PHY: remove rwsem use from phy core
    IEEE1394: remove rwsem use from ieee1394 core
    ...

    Linus Torvalds
     
  • mod_sysfs_setup() doesn't return an errno when kobject_add_dir() for module
    "holders" directory fails. So caller of mod_sysfs_setup() will keep going
    and get oops.

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Akinobu Mita
     
  • After some more discussion this patch replaces it:

    From: Johannes Berg
    Subject: suspend: add arch irq disable/enable hooks

    For powermac, we need to do some things between suspending devices and
    device_power_off, for example setting the decrementer. This patch
    allows architectures to define arch_s2ram_{en,dis}able_irqs in their
    asm/suspend.h to have control over this step.

    Signed-off-by: Johannes Berg
    Acked-by: Pavel Machek
    Cc: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
     
  • show_state() (SysRq-T) developed the buggy habit of not showing
    TASK_RUNNING tasks. This was due to the mistaken belief that state_filter
    == -1 would be a pass-through filter - while in reality it did not let
    TASK_RUNNING == 0 p->state values through.

    Fix this by restoring the original '!state_filter means all tasks'
    special-case i had in the original version. Test-built and test-booted on
    i686, SysRq-T now works as intended.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
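
The filter behaviour described in this commit can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the helper names are hypothetical, and only TASK_RUNNING == 0 is taken from the message above; the other state values are assumed for the example.

```c
#include <stdbool.h>

/* TASK_RUNNING == 0 comes from the commit message; the other two
 * values are illustrative bit flags chosen for this sketch. */
#define TASK_RUNNING          0L
#define TASK_INTERRUPTIBLE    1L
#define TASK_UNINTERRUPTIBLE  2L

/* Mask-only test (hypothetical helper): a task is shown only when its
 * state shares at least one bit with the filter. State 0 never does. */
static bool shown_mask_only(long state, unsigned long state_filter)
{
    return ((unsigned long)state & state_filter) != 0;
}

/* Test with the special case restored (hypothetical helper): a zero
 * filter means "show all tasks", so state 0 gets through as well. */
static bool shown_with_special_case(long state, unsigned long state_filter)
{
    return !state_filter || ((unsigned long)state & state_filter) != 0;
}
```

With an all-ones filter the mask-only test still rejects a task in state 0, because `0 & anything` is 0. That is why -1 cannot act as a pass-through value and a separate "no filter" case is needed.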
     

27 Apr, 2007

1 commit


26 Apr, 2007

6 commits

  • Switch cb_lock to mutex and allow netlink kernel users to override it
    with a subsystem specific mutex for consistent locking in dump callbacks.
    All netlink_dump_start users have been audited not to rely on any
    side-effects of the previously used spinlock.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Change tcp_probe to use ktime (needed to add one export).
    Add option to only get events when cwnd changes - from Doug Leith

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
    number of direct accesses to skb->data and for consistency with all the other
    cast skb member helpers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
    on 64bit architectures, allowing us to combine the 4 bytes hole left by the
    layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
    64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
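
The idea of storing a position as an offset from the buffer head rather than as a pointer can be sketched as follows. The struct and helpers are hypothetical miniatures for illustration and do not reproduce struct sk_buff or its accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of a packet buffer. Each header position is a
 * 4-byte offset from head instead of an 8-byte pointer on 64-bit. */
struct mini_buf {
    unsigned char *head;        /* start of the allocated buffer */
    uint32_t network_header;    /* offset of the network header from head */
    uint32_t transport_header;  /* offset of the transport header from head */
};

/* Convert an offset back to a pointer only when a pointer is needed. */
static unsigned char *mini_transport_ptr(const struct mini_buf *b)
{
    return b->head + b->transport_header;
}

/* Length arithmetic works directly on the offsets, with no pointer
 * conversion: the network header ends where the transport header starts. */
static uint32_t mini_network_header_len(const struct mini_buf *b)
{
    assert(b->transport_header >= b->network_header);
    return b->transport_header - b->network_header;
}
```

For a buffer with the network header at offset 14 and the transport header at offset 34, `mini_network_header_len()` yields 20 without touching `head`.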
     
  • Get rid of the manual clock source selection mess and use ktime. Also
    use a scalar representation, which allows cleaning up pkt_sched.h a bit
    more and results in fewer ktime_to_ns() calls in most cases.

    The PSCHED_US2JIFFIE/PSCHED_JIFFIE2US macros are implemented quite
    inefficiently by this patch; following patches will convert all qdiscs
    to hrtimers and get rid of them entirely.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • We currently use a special structure (struct skb_timeval) and plain
    'struct timeval' to store packet timestamps in sk_buffs and struct
    sock.

    This has some drawbacks:
    - Fixed resolution of one microsecond.
    - Waste of space on 64bit platforms where sizeof(struct timeval)=16

    I suggest using ktime_t that is a nice abstraction of high resolution
    time services, currently capable of nanosecond resolution.

    As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
    a 8 byte shrink of this structure on 64bit architectures. Some other
    structures also benefit from this size reduction (struct ipq in
    ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)

    Once this ktime infrastructure is adopted, we can more easily provide
    nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
    SO_TIMESTAMPNS/SCM_TIMESTAMPNS)

    Note : this patch includes a bug correction in
    compat_sock_get_timestamp() where a "err = 0;" was missing (so this
    syscall returned -ENOENT instead of 0)

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: John find
    Signed-off-by: David S. Miller

    Eric Dumazet
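
The size and resolution argument above can be sketched in userspace C. The types below are hypothetical stand-ins: `struct tv64` models a timeval with two 64-bit fields, and `stamp_ns_t` models a scalar nanosecond timestamp; neither is the kernel's actual definition.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a timeval with 64-bit fields: 16 bytes, microsecond steps. */
struct tv64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* Stand-in for a scalar timestamp: 8 bytes, nanosecond steps. */
typedef int64_t stamp_ns_t;

/* Widen a seconds/microseconds pair into nanoseconds. */
static stamp_ns_t tv_to_ns(struct tv64 tv)
{
    return tv.tv_sec * INT64_C(1000000000) + tv.tv_usec * INT64_C(1000);
}

/* Narrow nanoseconds back to seconds/microseconds; sub-microsecond
 * detail is dropped. This sketch handles non-negative values only. */
static struct tv64 ns_to_tv(stamp_ns_t ns)
{
    struct tv64 tv;

    assert(ns >= 0);
    tv.tv_sec = ns / INT64_C(1000000000);
    tv.tv_usec = (ns % INT64_C(1000000000)) / INT64_C(1000);
    return tv;
}
```

Converting to nanoseconds loses nothing, while converting back truncates to whole microseconds. That truncation is the fixed-resolution drawback listed in the message.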
     

24 Apr, 2007

1 commit

  • The commit 34f5a39899f3f3e815da64f48ddb72942d86c366 restricted reading
    of the tainted value. The attached patch changes this back to a
    write-only check and restores the read behaviour of older versions.

    Signed-off-by: Bastian Blank
    Cc: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bastian Blank
     

13 Apr, 2007

1 commit


08 Apr, 2007

4 commits

  • Getting rid of the p->children printout in show_task() left behind an
    unused variable.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • the p->parent PID printout gives us all the information about the
    task tree that we need - the eldest_child()/older_sibling()/
    younger_sibling() printouts are mostly historic and i do not
    remember ever having used those fields. (IMO in fact they confuse
    the SysRq-T output.) So remove them.

    This code has sentimental value though, those fields and
    printouts are one of the oldest ones still surviving from
    Linux v0.95's kernel/sched.c:

    if (p->p_ysptr || p->p_osptr)
            printk(" Younger sib=%d, older sib=%d\n\r",
                    p->p_ysptr ? p->p_ysptr->pid : -1,
                    p->p_osptr ? p->p_osptr->pid : -1);
    else
            printk("\n\r");

    written 15 years ago, in early 1992.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus 'snif' Torvalds

    Ingo Molnar
     
  • devres should be deallocated with devres_free() not kfree(). This bug
    corrupts slab on IRQ request failure. Fix it.

    Signed-off-by: Tejun Heo
    Cc: Andrew Morton
    Cc: Greg KH
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Soeren Sonnenburg reported that upon resume he is getting
    this backtrace:

    [] smp_apic_timer_interrupt+0x57/0x90
    [] retrigger_next_event+0x0/0xb0
    [] apic_timer_interrupt+0x28/0x30
    [] retrigger_next_event+0x0/0xb0
    [] __kfifo_put+0x8/0x90
    [] on_each_cpu+0x35/0x60
    [] clock_was_set+0x18/0x20
    [] timekeeping_resume+0x7c/0xa0
    [] __sysdev_resume+0x11/0x80
    [] sysdev_resume+0x47/0x80
    [] device_power_up+0x5/0x10

    it turns out that on resume we mistakenly re-enable interrupts too
    early. Do the timer retrigger only on the current CPU.

    Signed-off-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: Soeren Sonnenburg
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

05 Apr, 2007

3 commits

  • In debugging a problem w/ the -rt tree, I noticed that on systems that mark
    the tsc as unstable before it is registered, the TSC would still be
    selected and used for a short period of time. Digging in it looks to be a
    result of the mix of the clocksource list changes and my clocksource
    initialization changes.

    With the -rt tree, using a bad TSC, even for a short period of time can
    result in a hang at boot. I was not able to reproduce this hang w/
    mainline, but I'm not completely certain that someone won't trip on it.

    This patch resolves the issue by initializing the jiffies clocksource
    earlier so a bad TSC won't get selected just because nothing else is yet
    registered.

    Signed-off-by: John Stultz
    Acked-by: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Fix a bug in the swsusp's memory shrinker that causes some systems using
    highmem to refuse to suspend to disk if image_size is set above 1/2 of
    available RAM.

    Special thanks to Jiri Slaby for reporting the problem and assistance in
    debugging it.

    Signed-off-by: Rafael J. Wysocki
    Cc: Jiri Slaby
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • This patch adds 2 missing symbol exports: jiffies_to_timeval() and
    timeval_to_jiffies(). The (not yet merged) dm-raid4-5 module will need
    them, and they used to be indirectly exported by virtue of being inline
    functions.

    Commit 8b9365d753d9870bb6451504c13570b81923228f ("[PATCH] Uninline
    jiffies.h functions") uninlined them, and thus modules now need them
    explicitly exported to use them.

    Signed-off-by: Thomas Bittermann
    Acked-by: Andrew Morton
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: john stultz
    Signed-off-by: Linus Torvalds

    Thomas Bittermann
     

03 Apr, 2007

2 commits

  • Fix the regression resulting from the recent change of suspend code
    ordering that causes systems based on Intel x86 CPUs using the microcode
    driver to hang during the resume.

    The problem occurs since the microcode driver uses request_firmware() in
    its CPU hotplug notifier, which is called after tasks have been frozen and
    hangs. It can be fixed by telling the microcode driver to use the
    microcode stored in memory during the resume instead of trying to load it
    from disk.

    Signed-off-by: Rafael J. Wysocki
    Adrian Bunk
    Cc: Tigran Aivazian
    Cc: Pavel Machek
    Cc: Maxim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • built-in drivers had broken sysfs links that caused bootup hangs for
    certain driver unregistry sequences.

    Signed-off-by: Ingo Molnar
    Acked-by: Kay Sievers
    Signed-off-by: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kay Sievers
     

29 Mar, 2007

2 commits

  • In commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1 when converting the
    orphaned process group handling to use struct pid I made a small
    mistake. I accidentally replaced an == with a !=.

    Besides just being a dumb thing to do apparently this has a bad side
    effect. The improper orphaned process group detection causes kwin to
    die after a suspend/resume cycle.

    I'm amazed this patch has been around as long as it has without anyone
    else noticing something funny going on.

    And the following people deserve credit for spotting and helping
    to reproduce this.

    Thanks to: Sid Boyce
    Thanks to: "Michael Wu"

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • hrtimer_start() incorrectly set the 'reprogram' flag to enqueue_hrtimer(),
    which should only be 1 if the hrtimer is queued to the current CPU.

    Doing otherwise could result in a reprogramming of the current CPU's
    clockevents device, with a timer that is not queued to it - resulting in a
    bogus next expiry value.

    Signed-off-by: Ingo Molnar
    Cc: Michal Piotrowski
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

28 Mar, 2007

2 commits

  • This reverts commit 94985134b7b46848267ed6b734320db01c974e72 and
    instead removes the WARN_ON() that caused that commit in the first
    place.

    The problem is that we call disable_nonboot_cpus() in swsusp before
    powering down the system in order to avoid triggering the WARN_ON()
    in arch/x86_64/kernel/acpi/sleep.c:init_low_mapping() and this doesn't
    work well on Thomas' system.

    So instead, remove the WARN_ON() in arch/x86_64/kernel/acpi/sleep.c:
    init_low_mapping(), which triggers every time during the suspend to disk
    in the platform mode, as the potential problem it is related to doesn't
    seem to occur in practice.

    [ I think we might want to disallow the case of multiple users of that
    mm, or something. Normally, playing with the current process page
    tables on the current CPU should be fine as long as we don't have
    other threads using those tables at the same time..

    Anyway, not pretty, but better than the warning or the lockup - Linus ]

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • I've been seeing some odd NTP behavior recently on a few boxes and
    finally narrowed it down to time_offset overflowing when converted to
    SHIFT_UPDATE units (which was a side effect from my HZfreeNTP patch).

    This patch converts time_offset from a long to a s64 which resolves the
    issue.

    [tglx@linutronix.de: signedness fixes]
    Signed-off-by: John Stultz
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
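
Why widening the variable helps can be sketched with fixed-width types. This assumes a platform where long is 32 bits, and it uses an illustrative shift of 12 that is not claimed to be the kernel's SHIFT_UPDATE value; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SHIFT 12  /* assumed shift for this sketch only */

/* Scale an offset into shifted units using 64-bit arithmetic, which has
 * room for the extra DEMO_SHIFT bits. */
static int64_t scale_s64(int32_t offset)
{
    return (int64_t)offset * (INT64_C(1) << DEMO_SHIFT);
}

/* Report whether the scaled value would still fit in a signed 32-bit
 * variable, i.e. whether a 32-bit long could have held it. */
static bool scaled_fits_s32(int32_t offset)
{
    int64_t scaled = scale_s64(offset);

    return scaled >= INT32_MIN && scaled <= INT32_MAX;
}
```

With this shift, an offset of 500000 scales to 2048000000 and still fits in 32 bits, while 600000 scales to 2457600000 and does not. A 64-bit variable holds both.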
     

27 Mar, 2007

1 commit


26 Mar, 2007

2 commits

  • The watchdog implementation excludes low res / non continuous
    clocksources from being selected as a watchdog reference
    unintentionally.

    Allow using jiffies/PIT as a watchdog reference as long as no better
    clocksource is available. This is necessary to detect TSC breakage on
    systems, which have no pmtimer/hpet.

    The main goal of the initial patch (preventing a switch to highres/nohz
    when no reliable fallback clocksource is available) is still guaranteed
    by the checks in clocksource_watchdog().

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • The rework of next_timer_interrupt() fixed the timer wheel bugs, but
    introduced a rounding error versus the next hrtimer event. This is caused
    by the conversion of the hrtimer internal representation to relative
    jiffies.

    This causes bug #8100:
    http://bugzilla.kernel.org/show_bug.cgi?id=8100

    next_timer_interrupt() returns "now" in such a case and causes the code
    in tick_nohz_stop_sched_tick() to trigger the timer softirq, which is
    bogus as no timer is due for expiry. This results in an endless context
    switching between idle and ksoftirqd until a timer is due for expiry.

    Modify the hrtimer evaluation so that it returns now + 1 when the
    conversion results in a delta < 1 jiffy.

    It's confirmed to resolve bug #8100

    Reported-by: Emil Karlson
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
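
The rounding problem and the "now + 1" fix can be sketched as below. The function name and the 4 ms jiffy length (HZ=250) are assumptions for illustration. Treating a non-positive delta the same way as a sub-jiffy delta is a choice made in this sketch, not something stated in the message.

```c
#include <stdint.h>

/* Assumed jiffy length for this sketch: 4 ms, as with HZ=250. */
#define DEMO_NSEC_PER_JIFFY INT64_C(4000000)

/* Return the absolute jiffy at which the next hrtimer event is reported.
 * Integer division rounds a sub-jiffy delta down to 0, which would report
 * "now"; the clamp below reports now + 1 instead. */
static uint64_t next_event_jiffy(uint64_t now, int64_t delta_ns)
{
    uint64_t delta = 0;

    if (delta_ns > 0)
        delta = (uint64_t)(delta_ns / DEMO_NSEC_PER_JIFFY);
    if (delta < 1)
        delta = 1;  /* never report "now" for an event that is not due */
    return now + delta;
}
```

A delta of 1.5 ms at jiffy 1000 gives 1001 rather than 1000, so the caller does not raise the timer softirq while nothing is due.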
     

24 Mar, 2007

1 commit


23 Mar, 2007

3 commits

  • lockdep's data shouldn't be used when debug_locks == 0 because it's not
    updated after this, so it's more misleading than helpful.

    PS: probably lockdep's current-> fields should be reset after it turns
    debug_locks off: so, after printing a bug report, but before return from
    exported functions, but there are really a lot of these possibilities (e.g.
    after DEBUG_LOCKS_WARN_ON), so, something could be missed. (Of course
    direct use of these fields isn't recommended either.)

    Reported-by: Folkert van Heusden
    Inspired-by: Oleg Nesterov
    Signed-off-by: Jarek Poplawski
    Acked-by: Peter Zijlstra
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarek Poplawski
     
  • It causes extra moon icons blinking on x60, and breaks at least two other
    systems.

    During resume, we do not know that "reboot"/"shutdown" method was used, so
    we assume "platform" and call BIOS, anyway...

    This is 2.6.21 material, and should fix 2 or 3 regressions from 2.6.20.

    Signed-off-by: Pavel Machek
    Acked-by: "Rafael J. Wysocki"
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • The SNAPSHOT_S2RAM ioctl does not disable the nonboot CPUs before entering
    the suspend, although it should do this.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

17 Mar, 2007

7 commits

  • I finally found a dual core box, which survives suspend/resume without
    crashing in the middle of nowhere. Sigh, I never figured out from the
    code and the bug reports what's going on.

    The observed hangs are caused by a stale state transition of the clock
    event devices, which keeps the RCU synchronization away from completion,
    when the non boot CPU is brought back up.

    The suspend/resume in oneshot mode needs similar care to the
    periodic mode during suspend to RAM. My assumption that the state
    transitions during the different shutdown/bringups of s2disk would go
    through the periodic boot phase and then switch over to highres or
    nohz mode was simply wrong.

    Add the appropriate suspend / resume handling for the non periodic
    modes.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Fixes a bogus lockdep warning which causes lockdep to disable itself.

    Acked-by: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zilvinas Valinskas
     
  • Testing of -rt by IBM uncovered a locking bug in wake_futex_pi(): the PI
    state needs to be locked before we access it.

    Signed-off-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Cc: Chuck Ebbert
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • When the console is in VT_AUTO+KD_GRAPHICS mode, switching to the
    SUSPEND_CONSOLE fails, resulting in vt_waitactive() waiting indefinitely or
    until the task is interrupted. This patch tests if a console switch can
    occur in set_console() and returns early if a console switch is not
    possible.

    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Andrew Johnson
    Acked-by: Pavel Machek
    Cc: "Antonino A. Daplas"
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Johnson
     
  • commit f4304ab21513b834c8fe3403927c60c2b81a72d7 (HZ free NTP) moved the
    access to wall_to_monotonic in hrtimer_get_softirq_time() out of the
    xtime_lock protection.

    Move it back into the seq_lock section.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • hrtimer_forward() does not check for the possible overflow of
    timer->expires. This can happen on 64 bit machines with large interval
    values and results currently in an endless loop in the softirq because the
    expiry value becomes negative and therefore the timer is expired all the
    time.

    Check for this condition and set the expiry value to the max. expiry time
    in the future. The fix should be applied to stable kernel series as well.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
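
The overflow guard can be sketched as a saturating addition on a 64-bit nanosecond expiry. The helper name is hypothetical and INT64_MAX stands in for "the max. expiry time in the future". Unlike the description above, which detects a negative result after the addition, this sketch checks before adding, because signed overflow is undefined in standard C.

```c
#include <assert.h>
#include <stdint.h>

/* Advance an expiry by one interval. If the sum would exceed INT64_MAX,
 * clamp to INT64_MAX instead of wrapping to a negative value that would
 * look permanently expired. Both arguments must be non-negative. */
static int64_t forward_expiry(int64_t expires_ns, int64_t interval_ns)
{
    assert(expires_ns >= 0 && interval_ns >= 0);

    if (interval_ns > INT64_MAX - expires_ns)
        return INT64_MAX;
    return expires_ns + interval_ns;
}
```

A normal call such as `forward_expiry(100, 50)` returns 150. A call whose sum would overflow returns INT64_MAX, so the timer stays far in the future rather than firing in a loop.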
     
  • Prevent the WARN_ON() in arch/x86_64/kernel/acpi/sleep.c:init_low_mapping()
    from triggering by disabling nonboot CPUs before we finally enter the
    platform suspend.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki