06 Jan, 2007

1 commit

  • In the kernels later than 2.6.19 there is a regression that makes swsusp
    fail if the resume device is not explicitly specified.

    It can be fixed by adding an additional parameter to
    mm/swapfile.c:swap_type_of() allowing us to pass the (struct block_device
    *) corresponding to the first available swap back to the caller.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

16 Dec, 2006

1 commit


14 Dec, 2006

3 commits

  • Deprecate the old "legacy" PM API, and more importantly default it to "n".
    Virtually nothing in-tree uses it any more.

    Signed-off-by: David Brownell
    Signed-off-by: Greg Kroah-Hartman

    David Brownell
     
  • Currently, to tell a task that it should go to the refrigerator, we set the
    PF_FREEZE flag for it and send a fake signal to it. Unfortunately there
    are two SMP-related problems with this approach. First, a task running on
    another CPU may be updating its flags while the freezer attempts to set
    PF_FREEZE for it and this may leave the task's flags in an inconsistent
    state. Second, there is a potential race between freeze_process() and
    refrigerator() in which freeze_process() running on one CPU is reading a
    task's PF_FREEZE flag while refrigerator() running on another CPU has just
    set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it. If
    the refrigerator wins the race, freeze_process() will state that PF_FREEZE
    hasn't been set for the task and will set it unnecessarily, so the task
    will go to the refrigerator once again after it's been thawed.

    To solve first of these problems we need to stop using PF_FREEZE to tell
    tasks that they should go to the refrigerator. Instead, we can introduce a
    special TIF_*** flag and use it for this purpose, since it is allowed to
    change the other tasks' TIF_*** flags and there are special calls for it.

    To avoid the freeze_process()-refrigerator() race we can make
    freeze_process() to always check the task's PF_FROZEN flag after it's read
    its "freeze" flag. We should also make sure that refrigerator() will
    always reset the task's "freeze" flag after it's set PF_FROZEN for it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Russell King
    Cc: David Howells
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Currently, if a task is stopped (ie. it's in the TASK_STOPPED state), it
    is considered by the freezer as unfreezeable. However, there may be a race
    between the freezer and the delivery of the continuation signal to the task
    resulting in the task running after we have finished freezing the other
    tasks. This, in turn, may lead to undesirable effects up to and including
    data corruption.

    To prevent this from happening we first need to make the freezer consider
    stopped tasks as freezeable. For this purpose we need to make freezeable()
    stop returning 0 for these tasks and we need to force them to enter the
    refrigerator. However, if there's no continuation signal in the meantime,
    the stopped tasks should remain stopped after all processes have been
    thawed, so we need to send an additional SIGSTOP to each of them before
    waking it up.

    Also, a stopped task that has just been woken up should first check if
    there's a freezing request for it and go to the refrigerator if that's the
    case.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

08 Dec, 2006

25 commits

  • - move some file_operations structs into the .rodata section

    - move static strings from policy_types[] array into the .rodata section

    - fix generic seq_operations usages, so that those structs may be defined
    as "const" as well

    [akpm@osdl.org: couple of fixes]
    Signed-off-by: Helge Deller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Helge Deller
     
  • The new shared APM emulation just like its ARM and MIPS predecessors uses
    pm_suspend() which was only exported on SH. Move export to close to it's
    definition where it really should be anyway.

    Signed-off-by: Ralf Baechle
    Cc: Russell King
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ralf Baechle
     
  • Cleanup write-only variable, suggested by D Binderman.

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • The 'testproc' swsusp debug mode thaws tasks twice in a row, which is _very_
    confusing. Fix that.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Move all labels in the swsusp code to the second column, so that they won't
    fool diff -p.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Fix coding style in suspend.c.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Move the loop from freeze_processes() to a separate function and call it
    independently for user space processes and kernel threads so that the order
    of freezing tasks is clearly visible.

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Cc: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Move the loop from thaw_processes() to a separate function and call it
    independently for kernel threads and user space processes so that the order
    of thawing tasks is clearly visible.

    Drop thaw_kernel_threads() which is never used.

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Cc: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • The power management semaphore is only used as mutex, so convert it.

    [akpm@osdl.org: fix rotten bug]
    Signed-off-by: Stephen Hemminger
    Acked-by: Ingo Molnar
    Acked-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     
  • Fix http://bugzilla.kernel.org/show_bug.cgi?id=7534

    Fix the freezing of processes so that it won't fail if there is a traced
    process the parent of which has been stopped.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: maurice barnum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Make swsusp measure and print the time needed to shrink memory during the
    suspend.

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Cc: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Make swsusp support i386 systems with PAE or without PSE.

    This is done by creating temporary page tables located in resume-safe page
    frames before the suspend image is restored in the same way as x86_64 does
    it.

    Signed-off-by: Rafael J. Wysocki
    Cc: Andi Kleen
    Cc: Dave Jones
    Cc: Nigel Cunningham
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Modify process thawing so that we can thaw kernel space without thawing
    userspace, and thaw kernelspace first. This will be useful in later
    patches, where I intend to get swsusp thawing kernel threads only before
    seeking to free memory.

    Signed-off-by: Nigel Cunningham
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nigel Cunningham
     
  • Minor whitespace and formatting modifications for the freezer.

    Signed-off-by: Nigel Cunningham
    Acked-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nigel Cunningham
     
  • The freezer currently prints an '=' for every process that is frozen. This
    is pretty pointless, as the equals sign says nothing about which process is
    frozen, and makes logs look messier (especially if there were a large
    number of processes running). All we really need to know is that we
    started trying to freeze processes and what processes (if any) failed to
    freeze, or that we succeeded.

    Signed-off-by: Nigel Cunningham
    Acked-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nigel Cunningham
     
  • Move process freezing functions from include/linux/sched.h to freezer.h, so
    that modifications to the freezer or the kernel configuration don't require
    recompiling just about everything.

    [akpm@osdl.org: fix ueagle driver]
    Signed-off-by: Nigel Cunningham
    Cc: "Rafael J. Wysocki"
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nigel Cunningham
     
  • At some point after 2.6.13, in-kernel software suspend got "incomplete" for
    the so-called "platform" mode. pm_ops->prepare() is never called. A
    visible sign of this is the "moon" light on thinkpads not flashing during
    suspend. Fix by readding the pm_ops->prepare call during suspend.

    Signed-off-by: Stefan Seyfried
    Acked-by: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefan Seyfried
     
  • swsusp uses GFP_ATOMIC, but it can afford to use __GFP_WAIT, which will
    permit it to reclaim clean pagecache instead of emitting scary
    page-allocation-failure messages.

    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Currently swsusp saves the contents of highmem pages by copying them to the
    normal zone which is quite inefficient (eg. it requires two normal pages
    to be used for saving one highmem page). This may be improved by using
    highmem for saving the contents of saveable highmem pages.

    Namely, during the suspend phase of the suspend-resume cycle we try to
    allocate as many free highmem pages as there are saveable highmem pages.
    If there are not enough highmem image pages to store the contents of all of
    the saveable highmem pages, some of them will be stored in the "normal"
    memory. Next, we allocate as many free "normal" pages as needed to store
    the (remaining) image data. We use a memory bitmap to mark the allocated
    free pages (ie. highmem as well as "normal" image pages).

    Now, we use another memory bitmap to mark all of the saveable pages
    (highmem as well as "normal") and the contents of the saveable pages are
    copied into the image pages. Then, the second bitmap is used to save the
    pfns corresponding to the saveable pages and the first one is used to save
    their data.

    During the resume phase the pfns of the pages that were saveable during the
    suspend are loaded from the image and used to mark the "unsafe" page
    frames. Next, we try to allocate as many free highmem page frames as to
    load all of the image data that had been in the highmem before the suspend
    and we allocate so many free "normal" page frames that the total number of
    allocated free pages (highmem and "normal") is equal to the size of the
    image. While doing this we have to make sure that there will be some extra
    free "normal" and "safe" page frames for two lists of PBEs constructed
    later.

    Now, the image data are loaded, if possible, into their "original" page
    frames. The image data that cannot be written into their "original" page
    frames are loaded into "safe" page frames and their "original" kernel
    virtual addresses, as well as the addresses of the "safe" pages containing
    their copies, are stored in one of two lists of PBEs.

    One list of PBEs is for the copies of "normal" suspend pages (ie. "normal"
    pages that were saveable during the suspend) and it is used in the same way
    as previously (ie. by the architecture-dependent parts of swsusp). The
    other list of PBEs is for the copies of highmem suspend pages. The pages
    in this list are restored (in a reversible way) right before the
    arch-dependent code is called.

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • To be able to use swap files as suspend storage from the userland suspend
    tools we need an additional ioctl() that will allow us to provide the kernel
    with both the swap header's offset and the identification of the resume
    partition.

    The new ioctl() should be regarded as a replacement for the
    SNAPSHOT_SET_SWAP_FILE ioctl() that from now on will be considered as
    obsolete, but has to stay for backwards compatibility of the interface.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Add the kernel command line parameter "resume_offset=" allowing us to specify
    the offset, in units, from the beginning of the partition pointed
    to by the "resume=" parameter at which the swap header is located.

    This offset can be determined, for example, by an application using the FIBMAP
    ioctl to obtain the swap header's block number for given file.

    [akpm@osdl.org: we don't know what type sector_t is]
    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Make swsusp use block device offsets instead of swap offsets to identify swap
    locations and make it use the same code paths for writing as well as for
    reading data.

    This allows us to use the same code for handling swap files and swap
    partitions and to simplify the code, eg. by dropping rw_swap_page_sync().

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Rearrange the code in kernel/power/swap.c so that the next patch is more
    readable.

    [This patch only moves the existing code.]

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • The Linux kernel handles swap files almost in the same way as it handles swap
    partitions and there are only two differences between these two types of swap
    areas:

    (1) swap files need not be contiguous,

    (2) the header of a swap file is not in the first block of the partition
    that holds it. From the swsusp's point of view (1) is not a problem,
    because it is already taken care of by the swap-handling code, but (2) has
    to be taken into consideration.

    In principle the location of a swap file's header may be determined with the
    help of appropriate filesystem driver. Unfortunately, however, it requires
    the filesystem holding the swap file to be mounted, and if this filesystem is
    journaled, it cannot be mounted during a resume from disk. For this reason we
    need some other means by which swap areas can be identified.

    For example, to identify a swap area we can use the partition that holds the
    area and the offset from the beginning of this partition at which the swap
    header is located.

    The following patch allows swsusp to identify swap areas this way. It changes
    swap_type_of() so that it takes an additional argument representing an offset
    of the swap header within the partition represented by its first argument.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Add an ioctl to the userspace swsusp code that enables the usage of the
    pmops->prepare, pmops->enter and pmops->finish methods (the in-kernel
    suspend knows these as "platform method"). These are needed on many
    machines to (among others) speed up resuming by letting the BIOS skip some
    steps or let my hp nx5000 recognise the correct ac_adapter state after
    resume again.

    It also ensures on many machines, that changed hardware (unplugged AC
    adapters) gets correctly detected and that kacpid does not run wild after
    resume.

    Signed-off-by: Stefan Seyfried
    Cc: "Rafael J. Wysocki"
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefan Seyfried
     

22 Nov, 2006

1 commit

  • Pass the work_struct pointer to the work function rather than context data.
    The work function can use container_of() to work out the data.

    For the cases where the container of the work_struct may go away the moment the
    pending bit is cleared, it is made possible to defer the release of the
    structure by deferring the clearing of the pending bit.

    To make this work, an extra flag is introduced into the management side of the
    work_struct. This governs auto-release of the structure upon execution.

    Ordinarily, the work queue executor would release the work_struct for further
    scheduling or deallocation by clearing the pending bit prior to jumping to the
    work function. This means that, unless the driver makes some guarantee itself
    that the work_struct won't go away, the work function may not access anything
    else in the work_struct or its container lest they be deallocated.. This is a
    problem if the auxiliary data is taken away (as done by the last patch).

    However, if the pending bit is *not* cleared before jumping to the work
    function, then the work function *may* access the work_struct and its container
    with no problems. But then the work function must itself release the
    work_struct by calling work_release().

    In most cases, automatic release is fine, so this is the default. Special
    initiators exist for the non-auto-release case (ending in _NAR).

    Signed-Off-By: David Howells

    David Howells
     

04 Nov, 2006

1 commit

  • Add a swsusp debugging mode. This does everything that's needed for a suspend
    except for actually suspending. So we can look in the log messages and work
    out a) what code is being slow and b) which drivers are misbehaving.

    (1)
    # echo testproc > /sys/power/disk
    # echo disk > /sys/power/state

    This should turn off the non-boot CPU, freeze all processes, wait for 5
    seconds and then thaw the processes and the CPU.

    (2)
    # echo test > /sys/power/disk
    # echo disk > /sys/power/state

    This should turn off the non-boot CPU, freeze all processes, shrink
    memory, suspend all devices, wait for 5 seconds, resume the devices etc.

    Cc: Pavel Machek
    Cc: Stefan Seyfried
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

02 Nov, 2006

1 commit

  • It has been reported that on some systems the functionality after a resume
    from disk is limited if the system is simply powered off during the suspend
    instead of using the ACPI S4 suspend (aka platform mode).

    Unfortunately the default is currently to power off the system during the
    suspend so the users of these systems experience problems after the resume
    if they don't switch to the platform mode explicitly. This patch makes swsusp
    use the platform mode by default to avoid such situations.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Stefan Seyfried
    Acked-by: Pavel Machek
    Signed-off-by: Len Brown

    Rafael J. Wysocki
     

17 Oct, 2006

1 commit

  • My fancy new swsusp IO code had a big memory leak. It's somewhat invisible
    because the whole mem_map[] gets overwritten after resume, but it can cause us
    to get low on memory during the actual suspend process.

    Cc: "Rafael J. Wysocki"
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

12 Oct, 2006

1 commit

  • Add suspend_console() and resume_console() to the suspend-to-disk code paths
    so that the users of netconsole can use swsusp with it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

08 Oct, 2006

1 commit


05 Oct, 2006

1 commit

  • Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
    of passing regs around manually through all ~1800 interrupt handlers in the
    Linux kernel.

    The regs pointer is used in few places, but it potentially costs both stack
    space and code to pass it around. On the FRV arch, removing the regs parameter
    from all the genirq function results in a 20% speed up of the IRQ exit path
    (ie: from leaving timer_interrupt() to leaving do_IRQ()).

    Where appropriate, an arch may override the generic storage facility and do
    something different with the variable. On FRV, for instance, the address is
    maintained in GR28 at all times inside the kernel as part of general exception
    handling.

    Having looked over the code, it appears that the parameter may be handed down
    through up to twenty or so layers of functions. Consider a USB character
    device attached to a USB hub, attached to a USB controller that posts its
    interrupts through a cascaded auxiliary interrupt controller. A character
    device driver may want to pass regs to the sysrq handler through the input
    layer which adds another few layers of parameter passing.

    I've build this code with allyesconfig for x86_64 and i386. I've runtested the
    main part of the code on FRV and i386, though I can't test most of the drivers.
    I've also done partial conversion for powerpc and MIPS - these at least compile
    with minimal configurations.

    This will affect all archs. Mostly the changes should be relatively easy.
    Take do_IRQ(), store the regs pointer at the beginning, saving the old one:

    struct pt_regs *old_regs = set_irq_regs(regs);

    And put the old one back at the end:

    set_irq_regs(old_regs);

    Don't pass regs through to generic_handle_irq() or __do_IRQ().

    In timer_interrupt(), this sort of change will be necessary:

    - update_process_times(user_mode(regs));
    - profile_tick(CPU_PROFILING, regs);
    + update_process_times(user_mode(get_irq_regs()));
    + profile_tick(CPU_PROFILING);

    I'd like to move update_process_times()'s use of get_irq_regs() into itself,
    except that i386, alone of the archs, uses something other than user_mode().

    Some notes on the interrupt handling in the drivers:

    (*) input_dev() is now gone entirely. The regs pointer is no longer stored in
    the input_dev struct.

    (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
    something different depending on whether it's been supplied with a regs
    pointer or not.

    (*) Various IRQ handler function pointers have been moved to type
    irq_handler_t.

    Signed-Off-By: David Howells
    (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)

    David Howells
     

02 Oct, 2006

1 commit

  • In some places, particularly drivers and __init code, the init utsns is the
    appropriate one to use. This patch replaces those with a the init_utsname
    helper.

    Changes: Removed several uses of init_utsname(). Hope I picked all the
    right ones in net/ipv4/ipconfig.c. These are now changed to
    utsname() (the per-process namespace utsname) in the previous
    patch (2/7)

    [akpm@osdl.org: CIFS fix]
    Signed-off-by: Serge E. Hallyn
    Cc: Kirill Korotaev
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Cc: Andrey Savochkin
    Cc: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     

27 Sep, 2006

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits)
    Driver core: Don't call put methods while holding a spinlock
    Driver core: Remove unneeded routines from driver core
    Driver core: Fix potential deadlock in driver core
    PCI: enable driver multi-threaded probe
    Driver Core: add ability for drivers to do a threaded probe
    sysfs: add proper sysfs_init() prototype
    drivers/base: check errors
    drivers/base: Platform notify needs to occur before drivers attach to the device
    v4l-dev2: handle __must_check
    add CONFIG_ENABLE_MUST_CHECK
    add __must_check to device management code
    Driver core: fixed add_bind_files() definition
    Driver core: fix comments in drivers/base/power/resume.c
    sysfs_remove_bin_file: no return value, dump_stack on error
    kobject: must_check fixes
    Driver core: add ability for devices to create and remove bin files
    Class: add support for class interfaces for devices
    Driver core: create devices/virtual/ tree
    Driver core: add device_rename function
    Driver core: add ability for classes to handle devices properly
    ...

    Linus Torvalds
     

26 Sep, 2006

1 commit

  • Add the pm_trace attribute in /sys/power which has to be explicitly set to
    one to really enable the "PM tracing" code compiled in when CONFIG_PM_TRACE
    is set (which modifies the machine's CMOS clock in unpredictable ways).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki