08 Feb, 2006

6 commits

  • If the namespace structure is being shared, allocate a new one and copy
    information from the current, shared, structure.

    Signed-off-by: Janak Desai
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Michael Kerrisk
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    JANAK DESAI
     
  • If the filesystem structure is being shared, allocate a new one and copy
    information from the current, shared, structure.

    Signed-off-by: Janak Desai
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Michael Kerrisk
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    JANAK DESAI
     
  • The sys_unshare system call handler accepts the same flags as the clone
    system call, checks constraints on each flag, and invokes the corresponding
    unshare functions to disassociate the respective process context if it was
    being shared with another task.

    Here is a link to a program for testing the unshare system call.

    http://prdownloads.sourceforge.net/audit/unshare_test.c?download

    Please note that because of a problem in rmdir associated with bind mounts and
    clone with CLONE_NEWNS, the test fails while trying to remove the temporary
    test directory. You can remove that directory by running rmdir twice from the
    command line. The first attempt fails with EBUSY, but the second succeeds.
    I have reported the problem to Ram Pai and Al Viro with a small program
    which reproduces it. Al told us yesterday that he will be looking at the
    problem soon. I have tried multiple rmdirs from the unshare_test program
    itself, but for some reason that does not work. Running rmdir twice from
    the command line does seem to remove the directory.

    Signed-off-by: Janak Desai
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Michael Kerrisk
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    JANAK DESAI
     
  • Fix compilation problem in PM headers.

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • - Remove unneeded bio_get() which would cause a bio leak

    - Writing doesn't dirty pages. Reading dirties pages.

    - We should dirty the pages after the IO completion, not before

    (Busy-waiting for disk I/O completion isn't very polite.)

    Signed-off-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
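
The namespace and filesystem unshare commits in this group both describe the same pattern: if the structure is shared, allocate a new one and copy the current contents into it. The following is a minimal userspace sketch of that pattern, not the kernel code. The type `struct shared_ctx`, its fields, and the function `unshare_ctx()` are hypothetical names chosen for illustration, and a plain integer reference count stands in for whatever the kernel uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-task structure that may be shared,
 * such as the filesystem or namespace context. Not a kernel type. */
struct shared_ctx {
    int  refcount;   /* number of tasks pointing at this structure */
    char cwd[64];    /* illustrative payload */
};

/* Give the caller a private copy if the structure is shared.
 * Returns 0 on success (including the nothing-to-do case) and -1 on
 * allocation failure, in which case *ctxp is left untouched. */
static int unshare_ctx(struct shared_ctx **ctxp)
{
    struct shared_ctx *old = *ctxp;
    struct shared_ctx *copy;

    if (old->refcount <= 1)
        return 0;                      /* not shared: nothing to do */

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;

    memcpy(copy, old, sizeof(*copy));  /* copy the current contents */
    copy->refcount = 1;                /* the copy has a single user */
    old->refcount--;                   /* caller drops its shared reference */
    *ctxp = copy;
    return 0;
}
```

A caller would pass the address of its own pointer, for example `unshare_ctx(&my_ctx)`, and keep using `my_ctx` afterwards. Calling it again on an already private structure is a no-op.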
     

06 Feb, 2006

3 commits

  • It may suck something awful, but it shouldn't taint the kernel.

    Signed-off-by: Dave Jones
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jones
     
  • migration_cost prints after every CPU hotplug event. Make it print only
    once at boot.

    Signed-off-by: Chuck Ebbert
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chuck Ebbert
     
  • percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
    cpudata, instead of allocating memory only for possible cpus.

    As a preparation for changing that, we need to convert various 0 -> NR_CPUS
    loops to use for_each_cpu().

    (The above only applies to users of asm-generic/percpu.h. powerpc has gone it
    alone and is presently only allocating memory for present CPUs, so it's
    currently corrupting memory).

    Signed-off-by: Eric Dumazet
    Cc: "David S. Miller"
    Cc: James Bottomley
    Acked-by: Ingo Molnar
    Cc: Jens Axboe
    Cc: Anton Blanchard
    Acked-by: William Irwin
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
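
The percpu_data commit above explains why 0 -> NR_CPUS loops must become for_each_cpu() loops once storage exists only for possible CPUs. The sketch below illustrates the idea in userspace under stated assumptions: `cpu_mask_t`, `for_each_possible()`, and `sum_counters()` are hypothetical names, the mask is a single unsigned long, and NULL slots model CPUs for which no memory was allocated. It is not the kernel's for_each_cpu() implementation.

```c
#define NR_CPUS 8   /* compile-time maximum; illustrative value */

/* Hypothetical stand-in for the kernel's possible-CPU mask. */
typedef unsigned long cpu_mask_t;

#define cpu_is_possible(cpu, mask) ((((mask) >> (cpu)) & 1UL) != 0)

/* Visit only the CPUs whose bit is set in the mask. */
#define for_each_possible(cpu, mask)              \
    for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++)     \
        if (cpu_is_possible((cpu), (mask)))

/* Sum per-CPU counters. Slots for CPUs outside the mask may be NULL,
 * so a plain 0..NR_CPUS loop would dereference an invalid pointer. */
static long sum_counters(long *const slots[NR_CPUS], cpu_mask_t possible)
{
    long total = 0;
    int cpu;

    for_each_possible(cpu, possible)
        total += *slots[cpu];

    return total;
}
```

With a mask of 0x5 only slots 0 and 2 are read, so the remaining entries can stay unallocated.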
     

04 Feb, 2006

6 commits

  • Five callsites. I dunno how all this crap got back in there :(

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • kernel/cpuset.c:644:38: warning: non-ANSI function declaration of function 'cpuset_update_task_memory_state'

    Signed-off-by: Randy Dunlap
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • - In case of a negative nsec value the result of the division must be
    normalized.

    - Remove inline from an exported function.

    Signed-off-by: George Anzinger
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    George Anzinger
     
  • When one module exports a function symbol and another module uses that
    symbol then kallsyms shows the symbol twice. Once from the consumer with a
    type of 'U' and once from the provider with a type of 't' or 'T'. On most
    architectures, both entries have the same address so it does not matter
    which one is returned by kallsyms_lookup_name(). But on architectures with
    function descriptors, the 'U' entry points to the descriptor, not to the
    code body, which is not what we want.

    IA64 # grep -w qla2x00_remove_one /proc/kallsyms
    a000000208c25ef8 U qla2x00_remove_one [qla2300]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Keith Owens
     
  • When two function-return probes are inserted, the first on kfree()[1] and
    the second on, say, sys_link()[2], and [2] is later unregistered, we have a
    deadlock: kfree() is called with the kretprobe_lock held, and the
    function-return probe on kfree() will also try to grab the same lock.

    However, we can move the kfree() during unregistration to outside the
    spinlock, as we are sure that no instances from the free list will be used
    after synchronize_sched() returns during the unregistration process.
    Thanks to Masami Hiramatsu for spotting this.

    Signed-off-by: Ananth N Mavinakayanahalli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ananth N Mavinakayanahalli
     
  • kernel/kprobes.c:353: warning: 'pre_handler_kretprobe' defined but not used

    Signed-off-by: Adrian Bunk
    Acked-by: Ananth N Mavinakayanahalli
    Acked-by: "Keshavamurthy, Anil S"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
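
George Anzinger's commit in this group notes that the result of the division must be normalized when the nsec value is negative. A minimal sketch of why: C integer division truncates toward zero, so a negative nanosecond count leaves a negative remainder, which has to be folded back into the range [0, 1e9). The names `struct ts` and `ns_to_ts()` are hypothetical and this is not the patched kernel function.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical seconds/nanoseconds pair for this sketch. */
struct ts {
    int64_t sec;
    long    nsec;   /* always in [0, NSEC_PER_SEC) after conversion */
};

/* Split a signed nanosecond count into seconds and nanoseconds. */
static struct ts ns_to_ts(int64_t nsec)
{
    struct ts t;

    t.sec  = nsec / NSEC_PER_SEC;          /* truncates toward zero */
    t.nsec = (long)(nsec % NSEC_PER_SEC);  /* may be negative here */

    if (t.nsec < 0) {                      /* normalize the remainder */
        t.sec--;
        t.nsec += NSEC_PER_SEC;
    }
    return t;
}
```

For example, -1500000000 ns first splits into -1 s and -500000000 ns, and normalization turns that into -2 s and 500000000 ns, which represents the same instant.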
     

02 Feb, 2006

14 commits

  • Linus Torvalds
     
  • Currently the zone_reclaim code has a fixed window of 30 seconds of off node
    allocations should a local zone have no unused pagecache pages left. Reclaim
    will be attempted again after this timeout period to avoid repeated useless
    scans for memory. This is also useful to establish sufficiently large off
    node allocation chunks to relieve the local node.

    It may be beneficial to adjust that time period for some special situations.
    For example if memory use was exceeding node capacity one may want to give up
    for longer periods of time. If memory spikes intermittently then one may want
    to shorten the time period to reduce the number of off node allocations.

    This patch allows just that....

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • - If we only reclaim nr_pages then it's okay to stay on node.
    Switch from > to >= for the comparison.

    - vm_table[] entry for zone_reclaim_mode is a bit screwed up.

    - Add empty lines around shrink_zone to show that this is the
    central function to be called.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Prevent the kernel from setting the log level to 10 unconditionally during
    suspend/resume which was needed in the past for debugging, but generally is
    undesirable.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Change sched_getaffinity() so that it returns a bitmap that indicates the
    legally schedulable cpus that a task is allowed to run on.

    Without this patch, if CONFIG_HOTPLUG_CPU is enabled, sched_getaffinity()
    unconditionally returns (at least on IA64) a mask with NR_CPUS bits set.
    This conveys no useful information except for a kernel compile option.

    This fixes a breakage we observed running recent kernels. We have MPI jobs
    that use sched_getaffinity() to determine where to place their threads.
    Placing them on nonexistent cpus is problematic :-)

    Signed-off-by: Jack Steiner
    Acked-by: Ingo Molnar
    Cc: Nathan Lynch
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jack Steiner
     
  • This function is neither used nor has any real contents.

    Signed-off-by: Adrian Bunk
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • The expiry time for relative timers with SIGEV_NONE set was never
    updated to the correct value.

    Pointed out by George Anzinger.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • At some point we added credits to people who actively helped to bring
    k/hr-timers along. This was lost in the big code revamp. Add it back.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Clean up the interface to hrtimers by changing the init code to pass the mode
    as well as the clock. This allows the init code to select the correct base and
    eliminates extra timer re-init code in posix-timers. We also simplify the
    restart interface that nanosleep uses.

    Signed-off-by: George Anzinger
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    George Anzinger
     
  • From: Steven Rostedt <rostedt@goodmis.org>

    CPU0 expires a posix-timer and runs the callback function. The signal is
    queued.

    After releasing the posix-timer lock and before returning to hrtimer_run_queue
    CPU0 gets interrupted. CPU1 delivers the queued signal and rearms the timer.
    CPU0 comes back to hrtimer_run_queue and sets the timer state to expired.

    The next modification of the timer can result in an oops, because the state
    information is wrong.

    Keep track of state = RUNNING and check if the state has been changed in the
    return path of hrtimer_run_queue. In case the state has been changed, ignore
    a restart request and do not touch the state variable.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    akpm@osdl.org
     
  • This resolves bugzilla bug#5617. The old value of the timer was read after the
    timer was cancelled, so the remaining time was always zero.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Fixup the conversion of posix-timers to hrtimers.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • The itimer conversion removed the locking which protects the timer and
    variables in the shared signal structure. Steven Rostedt found the problem in
    the latest -rt patches.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Make swsusp use bytes as the image size units, which is needed for future
    compatibility.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
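
The posix-timer race described in Steven Rostedt's commit in this group is fixed by marking the timer RUNNING and checking, on the return path, whether that state was changed by someone else. The sketch below simulates this in a single thread, assuming hypothetical names throughout (`struct timer`, `run_timer()`, and the two sample callbacks). It has no locking and is not the hrtimer code. The callback `rearmed_elsewhere()` stands in for another CPU delivering the signal and rearming the timer while the callback path is still in flight.

```c
enum timer_state { TIMER_INACTIVE, TIMER_PENDING, TIMER_RUNNING, TIMER_EXPIRED };
enum restart     { NORESTART, RESTART };

/* Hypothetical timer object for this sketch. */
struct timer {
    enum timer_state state;
    enum restart (*fn)(struct timer *);
};

/* Sample callback: ordinary expiry, nobody touches the timer. */
static enum restart plain_expiry(struct timer *t)
{
    (void)t;
    return NORESTART;
}

/* Sample callback: models a second CPU rearming the timer before the
 * expiry path has finished. */
static enum restart rearmed_elsewhere(struct timer *t)
{
    t->state = TIMER_PENDING;
    return NORESTART;
}

/* Expire one timer. The state is written back only if it is still
 * RUNNING; otherwise the restart request is ignored and the state set
 * by the other party is left alone. */
static void run_timer(struct timer *t)
{
    enum restart r;

    t->state = TIMER_RUNNING;
    r = t->fn(t);

    if (t->state != TIMER_RUNNING)
        return;

    t->state = (r == RESTART) ? TIMER_PENDING : TIMER_EXPIRED;
}
```

Without the `t->state != TIMER_RUNNING` check, the second scenario would end with a rearmed timer marked as expired, which is the inconsistent state the commit message describes.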
     

01 Feb, 2006

5 commits


25 Jan, 2006

1 commit


19 Jan, 2006

4 commits

  • EDAC requires a way to scrub memory if an ECC error is found and the chipset
    does not do the work automatically. That means rewriting memory locations
    atomically with respect to all CPUs _and_ bus masters. That means we can't
    use atomic_add(foo, 0) as it gets optimised for non-SMP.

    This adds a function to include/asm-foo/atomic.h for the platforms currently
    supported which implements a scrub of a mapped block.

    It also adjusts the include order in a few other files where atomic.h is
    included before types.h, as this now causes an error because atomic_scrub
    uses u32.

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • The TIF_RESTORE_SIGMASK flag allows us to have a generic implementation of
    sys_rt_sigsuspend() instead of duplicating it for each architecture. This
    provides such an implementation and makes arch/powerpc use it.

    It also tidies up the ppc32 sys_sigsuspend() to use TIF_RESTORE_SIGMASK.

    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Woodhouse
     
  • Currently, a negative policy argument passed into the
    'sys_sched_setscheduler()' system call returns success. However,
    the manpage for 'sys_sched_setscheduler' says:

    EINVAL The scheduling policy is not one of the recognized policies, or the
    parameter p does not make sense for the policy.

    Signed-off-by: Jason Baron
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Baron
     
  • proc support for zone reclaim

    This patch creates a proc entry /proc/sys/vm/zone_reclaim_mode that may be
    used to override the zone reclaim mode determined automatically at bootup.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
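
Jason Baron's commit in this group concerns a negative policy being accepted by sys_sched_setscheduler() where the manpage promises EINVAL. One way such a bug can arise, shown here purely as an assumption of this sketch, is an internal helper that treats a negative policy as "keep the current policy" for the benefit of other callers. Rejecting negative values at the system call entry point restores the documented behaviour. All names and policy numbers below are hypothetical and this is not the kernel source.

```c
#include <errno.h>

/* Illustrative policy numbers; the values are assumptions of this sketch. */
enum { POLICY_NORMAL = 0, POLICY_FIFO = 1, POLICY_RR = 2 };

/* Hypothetical task descriptor. */
struct task {
    int policy;
    int priority;
};

/* Hypothetical internal helper shared by several callers. A negative
 * policy means "keep the task's current policy". */
static int set_scheduler_internal(struct task *t, int policy, int priority)
{
    if (policy < 0)
        policy = t->policy;
    else if (policy > POLICY_RR)
        return -EINVAL;

    t->policy = policy;
    t->priority = priority;
    return 0;
}

/* Hypothetical system call entry point: negative policies coming from
 * user space are rejected before the helper can reinterpret them. */
static int sys_set_scheduler(struct task *t, int policy, int priority)
{
    if (policy < 0)
        return -EINVAL;

    return set_scheduler_internal(t, policy, priority);
}
```

The internal helper keeps its convenience for trusted callers, while the entry point enforces the contract that user space sees.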
     

17 Jan, 2006

1 commit

  • fix the following sparse warning:

    kernel/hrtimer.c:665:34: warning: incorrect type in argument 2 (different address spaces)
    kernel/hrtimer.c:665:34: expected void const *from
    kernel/hrtimer.c:665:34: got struct timespec [noderef] *
    kernel/hrtimer.c:664:2: warning: dereference of noderef expression

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar