07 Jan, 2006

13 commits

  • There are some more places where the use of cputime_t instead of an integer
    type and the associated macros is necessary for the virtual cputime accounting
    on s390. Affected are the s390 specific appldata code and BSD process
    accounting.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     
  • This makes the swsusp_info structure become the header of the image in the
    literal sense (ie. it is saved to the swap and read before any other image
    data with the help of the swsusp's swap map structure, so generally it is
    treated in the same way as the rest of the image).

    The main thing it does is to make swsusp_header contain the offset of the swap
    map used to track the image data pages rather than the offset of swsusp_info.
    Simultaneously, swsusp_info becomes the first image page written to the swap.

    The other changes are generally consequences of the above with a few
    exceptions (there's some consolidation in the image reading part as a few
    functions turn into trivial wrappers around something else).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • This changes the handling of swap partitions by swsusp to avoid locking of the
    swap devices that are not used for suspend and, consequently, simplifies the
    code.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Make the suspend image size limit tunable via /sys/power/image_size.

    It is necessary for systems on which there is a limited amount of swap
    available for suspend. It can also be useful for optimizing performance of
    swsusp on systems with 1 GB of RAM or more.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Limit the size of the suspend image to approx. 500 MB, which should
    improve the overall performance of swsusp on systems with more than 1 GB of
    RAM.

    It introduces the constant IMAGE_SIZE that can be set to the preferred size
    of the image (in MB) and modifies the memory-shrinking part of swsusp to
    take this constant into account (500 is the default value of IMAGE_SIZE).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
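
A minimal sketch of the size check described above, assuming 4 KiB pages. Only the constant IMAGE_SIZE and its default of 500 come from the commit text; `SKETCH_PAGE_SIZE`, `image_size_limit_pages()` and `pages_to_free()` are hypothetical names used for illustration, not the patch's code.

```c
/* Preferred image size in MB; 500 is the default named in the commit text. */
#define IMAGE_SIZE 500UL

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Convert the MB limit into a number of pages. */
static unsigned long image_size_limit_pages(void)
{
    return (IMAGE_SIZE * 1024UL * 1024UL) / SKETCH_PAGE_SIZE;
}

/*
 * Given the number of pages the image would currently occupy, return how
 * many pages still have to be freed to get under the limit (0 if none).
 */
static unsigned long pages_to_free(unsigned long image_pages)
{
    unsigned long limit = image_size_limit_pages();

    return image_pages > limit ? image_pages - limit : 0;
}
```

With these assumptions the limit works out to 128000 pages, so an image of 130000 pages would need 2000 more pages freed.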
     
  • These two prototypes are already present in sched.h; remove the duplicate
    versions.

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • This patch fixes a problem with the function enough_free_mem() used by
    swsusp to verify if there is a sufficient number of memory pages available
    to it to create and save the suspend image.

    Namely, enough_free_mem() uses nr_free_pages() to obtain the number of free
    memory pages, which is incorrect, because this function returns the total
    number of free pages, including free highmem pages, and the highmem pages
    cannot be used by swsusp for storing the image data.

    The patch makes enough_free_mem() avoid counting the free highmem
    pages as available to swsusp.

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
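
The idea can be sketched as follows, under the assumption that free pages are tracked per zone and each zone knows whether it is highmem. `struct zone_sketch`, `count_usable_free_pages()` and `enough_free_mem_sketch()` are hypothetical stand-ins, not the kernel's types or the patched function.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a memory zone. */
struct zone_sketch {
    unsigned long free_pages;
    int is_highmem;   /* nonzero for highmem zones */
};

/* Sum free pages, skipping highmem zones that cannot hold image data. */
static unsigned long count_usable_free_pages(const struct zone_sketch *zones,
                                             size_t nr_zones)
{
    unsigned long total = 0;

    for (size_t i = 0; i < nr_zones; i++)
        if (!zones[i].is_highmem)
            total += zones[i].free_pages;
    return total;
}

/* Nonzero if the usable (non-highmem) free pages cover the request. */
static int enough_free_mem_sketch(const struct zone_sketch *zones,
                                  size_t nr_zones, unsigned long needed)
{
    return count_usable_free_pages(zones, nr_zones) >= needed;
}
```

Counting every zone, as a total-free-pages query does, would overstate what is available whenever a highmem zone has free pages.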
     
  • This patch makes swsusp free only as much memory as needed to complete the
    suspend and not as much as possible.  In most cases this should speed
    up the suspend and make the system much more responsive after resume,
    especially if a GUI (e.g. X Windows) is used.

    If needed, the old behavior (i.e. freeing as much memory as possible during
    suspend) can be restored by unsetting FAST_FREE in power.h.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • This patch introduces the swap map structure that can be used by swsusp for
    keeping track of data pages written to the swap.  The structure itself is
    described in a comment within the patch.

    The overall idea is to reduce the amount of metadata written to the swap and
    to write and read the image pages sequentially, in a file-like way. This
    makes the swap-handling part of swsusp fairly independent of its
    snapshot-handling part and will hopefully allow us to completely separate
    these two parts in the future.

    This patch is needed to remove the suspend image size limit imposed by the
    limited size of the swsusp_info structure, which is essential for x86-64
    systems with more than 512 MB of RAM.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
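
The actual structure is described in the patch itself and is not reproduced here. As a rough in-memory sketch of the general idea (batches of swap offsets chained together so pages can be appended and walked sequentially), something like the following could be imagined. Every name below, and the tiny `MAP_ENTRIES` value, is an assumption made for illustration.

```c
#include <stdlib.h>

#define MAP_ENTRIES 4   /* hypothetical; kept tiny so the chaining is visible */

/* One "map page": a batch of swap offsets plus a link to the next batch. */
struct map_page_sketch {
    unsigned long entries[MAP_ENTRIES];
    unsigned int used;
    struct map_page_sketch *next;
};

struct swap_map_sketch {
    struct map_page_sketch *head;
    struct map_page_sketch *tail;
};

/* Record the offset of the next data page; 0 on success, -1 on failure. */
static int swap_map_append(struct swap_map_sketch *map, unsigned long offset)
{
    if (!map->tail || map->tail->used == MAP_ENTRIES) {
        struct map_page_sketch *p = calloc(1, sizeof(*p));

        if (!p)
            return -1;
        if (map->tail)
            map->tail->next = p;
        else
            map->head = p;
        map->tail = p;
    }
    map->tail->entries[map->tail->used++] = offset;
    return 0;
}

/* Count recorded offsets by walking the chain in write order. */
static unsigned long swap_map_count(const struct swap_map_sketch *map)
{
    unsigned long n = 0;

    for (const struct map_page_sketch *p = map->head; p; p = p->next)
        n += p->used;
    return n;
}

/* Fetch the index-th recorded offset; 0 on success, -1 if out of range. */
static int swap_map_get(const struct swap_map_sketch *map,
                        unsigned long index, unsigned long *out)
{
    for (const struct map_page_sketch *p = map->head; p; p = p->next) {
        if (index < p->used) {
            *out = p->entries[index];
            return 0;
        }
        index -= p->used;
    }
    return -1;
}

/* Release all map pages and reset the map to empty. */
static void swap_map_free(struct swap_map_sketch *map)
{
    struct map_page_sketch *p = map->head;

    while (p) {
        struct map_page_sketch *next = p->next;

        free(p);
        p = next;
    }
    map->head = map->tail = NULL;
}
```

Because the map grows by chaining new batches, the number of pages it can track is not bounded by the size of any single fixed structure, which is the property the commit message relies on.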
     
  • This patch removes the image encryption from swsusp. The encryption is used
    only as a substitute for zeroing the image after resume, to prevent someone
    from reading confidential data from it later, and it does not protect the
    image from being read by an unauthorized person before resume. The
    functionality it provides really belongs in user space and will possibly be
    reimplemented after the swap-handling functionality of swsusp is moved to
    user space.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Thanks to Christoph for doing most of the work.

    This allows automatic SMP IRQ affinity assignment other than default "all
    interrupts on all CPUs" which is rather expensive. This might be useful if
    the hardware can be programmed to distribute interrupts among different
    CPUs, like Alpha does.

    Signed-off-by: Ivan Kokshaysky
    Cc: Christoph Hellwig
    Cc: Richard Henderson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky
     
  • Make the futex code compilable and usable on NOMMU by making the attempt to
    handle page faults conditional on CONFIG_MMU. If this is not enabled, then
    we can assume that EFAULT returned from futex_atomic_op_inuser() is not
    recoverable, and that the address lies outside of valid memory.

    handle_mm_fault() is made to BUG if called on NOMMU without attempting to
    invoke the actual handler (__handle_mm_fault).

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • - This function returns -EINVAL all the time. Fix.

    - Decruftify it a bit too.

    - Writing to it doesn't seem to do what it's supposed to do.

    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

05 Jan, 2006

6 commits

  • Trivial manual merge fixup for usb_find_interface clashes.

    Linus Torvalds
     
  • Linus Torvalds
     
  • lib/lib.a(kobject_uevent.o)(.text+0x25f): In function `kobject_uevent':
    : undefined reference to `__alloc_skb'
    lib/lib.a(kobject_uevent.o)(.text+0x2a1): In function `kobject_uevent':
    : undefined reference to `skb_over_panic'
    lib/lib.a(kobject_uevent.o)(.text+0x31d): In function `kobject_uevent':
    : undefined reference to `skb_over_panic'
    lib/lib.a(kobject_uevent.o)(.text+0x356): In function `kobject_uevent':
    : undefined reference to `netlink_broadcast'
    lib/lib.a(kobject_uevent.o)(.init.text+0x9): In function `kobject_uevent_init':
    : undefined reference to `netlink_kernel_create'
    make: *** [.tmp_vmlinux1] Error 1

    Netlink is unconditionally enabled if CONFIG_NET, so that's OK.

    kobject_uevent.o is compiled even if !CONFIG_HOTPLUG, which is lazy.

    Let's compound the sin.

    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    akpm@osdl.org
     
  • Leave the overloaded "hotplug" word to subsystems which are handling
    real devices. The driver core does not "plug" anything, it just exports
    the state to userspace and generates events.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     
  • This deprecates the /proc/sys/kernel/hotplug file, as all
    this stuff should be in /sys some day, right? :)
    In /sys/kernel/ we now have uevent_seqnum and uevent_helper.
    The seqnum is no longer used by udev, as the version for this
    kernel depends on netlink, whose events never arrive
    out of order.

    Recent udev versions disable the /sbin/hotplug helper with
    an init script, because it leads to OOM on big boxes by running
    hundreds of shells in parallel. It should be done now by:
    echo "" > /sys/kernel/uevent_helper

    (Note that "-n" does not work, because neither proc nor sysfs
    supports truncate().)

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     
  • It makes zero sense to have hotplug, but not the netlink
    events enabled today. Remove this option and merge the
    kobject_uevent.h header into the kobject.h header file.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

03 Jan, 2006

2 commits


01 Jan, 2006

1 commit

  • This is a slightly more complete fix for the previous minimal sysctl
    string fix. It always terminates the returned string with a NUL, even
    if the full result wouldn't fit in the user-supplied buffer.

    The returned length is the full untruncated length, so that you can
    tell when truncation has occurred.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
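
The behavior described above (always terminate, report the untruncated length) can be sketched in plain userspace C. `copy_string_terminated()` is a hypothetical helper, not the sysctl code, and whether the kernel's reported length counts the terminator is not stated in the text; this sketch reports `strlen()` of the source.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into buf without ever writing past buf[bufsize - 1].
 * The result is always NUL-terminated when bufsize > 0.
 * Returns the full, untruncated length of src, so a return value
 * >= bufsize tells the caller that truncation occurred.
 */
static size_t copy_string_terminated(char *buf, size_t bufsize,
                                     const char *src)
{
    size_t full = strlen(src);

    if (bufsize > 0) {
        size_t n = full < bufsize - 1 ? full : bufsize - 1;

        memcpy(buf, src, n);
        buf[n] = '\0';   /* terminator stays inside the buffer */
    }
    return full;
}
```

A caller with a 4-byte buffer copying "hello" gets back "hel" and a return value of 5, which is larger than the buffer and therefore signals truncation.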
     

31 Dec, 2005

2 commits

  • For the sysctl syscall, if the user wants to get the old value of a
    sysctl entry and set a new value for it in the same syscall, the old
    value is always overwritten by the new value if the sysctl entry is of
    string type and its strategy is set to sysctl_string. The cause is that
    the strategy is run twice in that case, because the general strategy
    sysctl_string always returns 0 on success.

    Strategy routines such as sysctl_jiffies and sysctl_jiffies_ms return 1
    because they read and write the sysctl entry themselves.

    The strategy routine sysctl_string returns 0 although it actually reads
    and writes the sysctl entry.

    According to my analysis, a strategy routine that reads and writes the
    entry should return 1, and one that only performs some necessary checks
    without reading or writing (for example sysctl_intvec) should return 0.

    Signed-off-by: Yi Yang
    Signed-off-by: Linus Torvalds

    Yi Yang
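
A small userspace model of the return-value convention discussed above, assuming a dispatcher that falls through to generic read/write handling when the strategy returns 0. All names here (`struct entry_sketch`, `generic_rw()`, `do_sysctl_sketch()` and the two strategies) are hypothetical; the sketch only illustrates why a read-and-write strategy returning 0 loses the old value.

```c
#include <stddef.h>
#include <string.h>

struct entry_sketch {
    char data[32];
};

typedef int (*strategy_fn)(struct entry_sketch *e, char *oldval,
                           size_t oldlen, const char *newval);

/* Generic handling: report the current value, then store the new one. */
static void generic_rw(struct entry_sketch *e, char *oldval,
                       size_t oldlen, const char *newval)
{
    if (oldval && oldlen > 0) {
        strncpy(oldval, e->data, oldlen - 1);
        oldval[oldlen - 1] = '\0';
    }
    if (newval) {
        strncpy(e->data, newval, sizeof(e->data) - 1);
        e->data[sizeof(e->data) - 1] = '\0';
    }
}

/* Reads and writes, but claims "nothing done" by returning 0. */
static int strategy_returns_0(struct entry_sketch *e, char *oldval,
                              size_t oldlen, const char *newval)
{
    generic_rw(e, oldval, oldlen, newval);
    return 0;
}

/* Reads and writes, and says so by returning 1. */
static int strategy_returns_1(struct entry_sketch *e, char *oldval,
                              size_t oldlen, const char *newval)
{
    generic_rw(e, oldval, oldlen, newval);
    return 1;
}

/* Dispatcher: <0 is an error, >0 means done, 0 means "do the default". */
static int do_sysctl_sketch(struct entry_sketch *e, strategy_fn strategy,
                            char *oldval, size_t oldlen, const char *newval)
{
    int rc = strategy(e, oldval, oldlen, newval);

    if (rc < 0)
        return rc;
    if (rc > 0)
        return 0;
    generic_rw(e, oldval, oldlen, newval);   /* second pass over the entry */
    return 0;
}
```

With `strategy_returns_0`, the second pass re-reads the entry after the new value has already been stored, so the caller sees the new value where the old one was expected. With `strategy_returns_1` the old value is reported correctly.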
     
  • If the string was too long to fit in the user-supplied buffer,
    the sysctl layer would zero-terminate it by writing past the
    end of the buffer. Don't do that.

    Noticed by Yi Yang

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

25 Dec, 2005

1 commit


21 Dec, 2005

1 commit

  • All the work was done to set up the file and maintain the file handles but
    the access functions were zeroed out due to the #ifdef. Removing the
    #ifdef allows full access to all the parameters when CONFIG_MODULES=n.

    akpm: put it back again, but use CONFIG_SYSFS instead.

    Signed-off-by: Jason Wessel
    Signed-off-by: Andrew Morton
    Signed-off-by: Adrian Bunk
    Signed-off-by: Linus Torvalds

    Jason Wessel
     

13 Dec, 2005

9 commits

  • When multiple probes are registered at the same address and a probe is hit
    recursively (a probe triggered from within a probe handler), we skip calling
    the pre_handlers and just increment the nmissed field.

    This patch makes sure the list is walked in the multiple-probe case.
    Without it, the nmissed counts are incorrect when multiple probes share
    an address.

    Signed-off-by: Anil S Keshavamurthy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Keshavamurthy Anil S
     
  • For kprobes, the critical path is the path from the debug break exception
    handler until control reaches the kprobes exception code. No probes can be
    supported in this path, as we would end up in recursion.

    This patch prevents this by moving the function below to the safe __kprobes
    section, in which no probes can be inserted.

    Signed-off-by: Anil S Keshavamurthy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Keshavamurthy Anil S
     
  • kauditd was causing suspends to fail because it refused to freeze. Adding
    a try_to_freeze() to its sleep loop solves the issue.

    Signed-off-by: Pierre Ossman
    Acked-by: Pavel Machek
    Cc: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Ossman
     
  • When registering multiple kprobes at the same address, we leave a small
    window where the kprobe hlist will not contain a reference to the
    registered kprobe, potentially leading to a system crash if the breakpoint
    is hit on another processor.

    The patch below now atomically replaces the old kprobe with the new
    kprobe in the hash list.

    Signed-off-by: Anil S Keshavamurthy
    Acked-by: Ananth N Mavinakayanahalli
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Keshavamurthy Anil S
     
  • There are several functions that might seem appropriate for a timestamp:

    get_cycles()
    current_kernel_time()
    do_gettimeofday()

    Each has problems with combinations of SMP-safety, low resolution, and
    monotonicity. This patch adds a new function that returns a monotonic SMP-safe
    timestamp with nanosecond resolution where available.

    Changes:
    Split timestamp into separate patch
    Moved to kernel/time.c
    Renamed to getnstimestamp
    Fixed unintended-pointer-arithmetic bug

    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Helsley
     
  • Accessing nohz_cpu_mask before incrementing rcp->cur is racy. It can cause
    tickless idle CPUs to be included in rsp->cpumask, which will extend
    grace periods unnecessarily.

    Fix this race. It has been tested using extensions to the RCU torture module
    that force various CPUs to become idle.

    Signed-off-by: Srivatsa Vaddagiri
    Cc: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Srivatsa Vaddagiri
     
  • While testing with the RCU torture module, I hit an oops in rcu_do_batch,
    which was trying to process a callback of a module that had just been removed.
    This is because we weren't waiting long enough for all callbacks to fire.

    Signed-off-by: Srivatsa Vaddagiri
    Cc: Dipankar Sarma
    Acked-by: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Srivatsa Vaddagiri
     
  • This introduces a new interface, rcu_barrier(), which waits until all
    the RCU callbacks queued before this call have completed.

    Reiser4 needs this because we do more than just free a memory object
    in our RCU callback: we also remove it from the list hanging off the
    super-block. This means that before freeing the reiser4-specific portion
    of the super-block (during umount) we have to wait until all pending RCU
    callbacks are executed.

    The only change reiser4 made to the original patch is exporting
    rcu_barrier().

    Cc: Hans Reiser
    Cc: Vladimir V. Saveliev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     
  • When kprobes are inserted on or removed from a module, the module must be
    ref counted so that it cannot be unloaded while probes are registered on
    it.

    Without this patch, the probed module is free to unload, and when the
    probing module unregisters the probe, the kprobes code might crash while
    trying to replace the original instruction.

    Signed-off-by: Anil S Keshavamurthy
    Signed-off-by: Mao Bibo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mao, Bibo
     

30 Nov, 2005

2 commits

  • Fix swsusp on machines not supporting S4. With recent changes, it is not
    possible to trigger it using the /sys filesystem. Swsusp does not really need
    any support from low-level code; it is possible to reboot or halt at the
    end of suspend.

    Signed-off-by: Pavel Machek
    Cc: "Brown, Len"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • set_page_dirty() will not cope with being handed a page * which is part of
    a compound page, but not the master page in that compound page. This case
    can occur via access_process_vm() if you attempt to write to another
    process's hugepage memory area using ptrace() (causing an oops or hang).

    This patch fixes the bug by only calling set_page_dirty() from
    access_process_vm() if the page is not a compound page. We already use a
    similar fix in bio_set_pages_dirty() for the case of direct io to
    hugepages.

    Signed-off-by: David Gibson
    Acked-by: William Irwin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Gibson
     

29 Nov, 2005

3 commits

  • Move the cpuset_fork() call below the write_unlock_irq call in
    kernel/fork.c copy_process().

    Since the cpuset-dual-semaphore-locking-overhaul.patch, the cpuset_fork()
    routine acquires task_lock(), so cannot be called while holding the
    tasklist_lock for write.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Tracked this down on an Ultra Enterprise 3000. It's a 6-way machine. Odd
    thing about this machine (and it's good for finding bugs like this) is that
    the CPU id's are not 0 based. For instance, on my machine the CPU's are
    6/7/10/11/14/15.

    This caused some NULL pointer dereference in kernel/workqueue.c because for
    single_threaded workqueue's, it hardcoded the cpu to 0.

    I changed the 0's to any_online_cpu(cpu_online_mask), which cpumask.h
    claims is "First cpu in mask". So this fits the same usage.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Collins
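
To illustrate why a hardcoded CPU 0 fails on such a machine, here is a small sketch that picks the first CPU present in a bitmask instead. `first_cpu_in_mask()`, `CPU_BIT()` and the 32-bit mask are assumptions for illustration and do not reproduce the kernel's cpumask API.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 32u   /* hypothetical upper bound for this sketch */

/* Bit for a given CPU id in a 32-bit mask. */
#define CPU_BIT(id) (UINT32_C(1) << (id))

/*
 * Return the lowest CPU id set in the mask, or NR_CPUS_SKETCH if the
 * mask is empty. On a machine whose CPUs are 6/7/10/11/14/15 this is 6,
 * whereas a hardcoded 0 would name a CPU that does not exist.
 */
static unsigned int first_cpu_in_mask(uint32_t mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        if (mask & CPU_BIT(cpu))
            return cpu;
    return NR_CPUS_SKETCH;
}
```

Any per-CPU lookup indexed by the returned id then refers to a CPU that is actually online, rather than to slot 0.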
     
  • fix 32bit overflow in timespec_to_sample()

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
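
The commit message does not show the patch, so the following is only a generic sketch of this class of bug: converting seconds to nanoseconds with 32-bit arithmetic wraps after roughly 4.29 seconds, and widening one operand before the multiply avoids it. The function names and `NSEC_PER_SEC_SKETCH` are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000u

/* 32-bit arithmetic: the product wraps modulo 2^32 before it is widened. */
static uint64_t to_ns_32bit_math(uint32_t sec, uint32_t nsec)
{
    uint32_t wrapped = sec * (uint32_t)NSEC_PER_SEC_SKETCH + nsec;

    return wrapped;
}

/* Widen first, then multiply: the full 64-bit result is kept. */
static uint64_t to_ns_64bit_math(uint32_t sec, uint32_t nsec)
{
    return (uint64_t)sec * NSEC_PER_SEC_SKETCH + nsec;
}
```

For 4 seconds both versions agree, but for 5 seconds plus 1 nanosecond the 32-bit version yields 705032705 instead of 5000000001.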