03 Nov, 2009

8 commits

  • * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
    PM: Remove some debug messages producing too much noise
    PM: Fix warning on suspend errors
    PM / Hibernate: Add newline to load_image() fail path
    PM / Hibernate: Fix error handling in save_image()
    PM / Hibernate: Fix blkdev refleaks
    PM / yenta: Split resume into early and late parts (rev. 4)

    Linus Torvalds
     
  • nr_processes() returns the sum of the per cpu counter process_counts for
    all online CPUs. This counter is incremented for the current CPU on
    fork() and decremented for the current CPU on exit(). Since a process
    does not necessarily fork and exit on the same CPU the process_count for
    an individual CPU can be either positive or negative and effectively has
    no meaning in isolation.

    Therefore calculating the sum of process_counts over only the online
    CPUs omits the processes which were started or stopped on any CPU which
    has since been unplugged. Only the sum of process_counts across all
    possible CPUs has meaning.

    The only caller of nr_processes() is proc_root_getattr() which
    calculates the number of links to /proc as
    stat->nlink = proc_root.nlink + nr_processes();

    You don't have to be all that unlucky for nr_processes() to return a
    negative value, leading to a negative number of links (or rather, an
    apparently enormous number of links). If this happens then you can get
    failures where things like "ls /proc" start to fail because they got an
    -EOVERFLOW from some stat() call.

    Example with some debugging inserted to show what goes on:
    # ps haux|wc -l
    nr_processes: CPU0: 90
    nr_processes: CPU1: 1030
    nr_processes: CPU2: -900
    nr_processes: CPU3: -136
    nr_processes: TOTAL: 84
    proc_root_getattr. nlink 12 + nr_processes() 84 = 96
    84
    # echo 0 >/sys/devices/system/cpu/cpu1/online
    # ps haux|wc -l
    nr_processes: CPU0: 85
    nr_processes: CPU2: -901
    nr_processes: CPU3: -137
    nr_processes: TOTAL: -953
    proc_root_getattr. nlink 12 + nr_processes() -953 = -941
    75
    # stat /proc/
    nr_processes: CPU0: 84
    nr_processes: CPU2: -901
    nr_processes: CPU3: -137
    nr_processes: TOTAL: -954
    proc_root_getattr. nlink 12 + nr_processes() -954 = -942
    File: `/proc/'
    Size: 0 Blocks: 0 IO Block: 1024 directory
    Device: 3h/3d Inode: 1 Links: 4294966354
    Access: (0555/dr-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
    Access: 2009-11-03 09:06:55.000000000 +0000
    Modify: 2009-11-03 09:06:55.000000000 +0000
    Change: 2009-11-03 09:06:55.000000000 +0000

    I'm not 100% convinced that the per_cpu regions remain valid for offline
    CPUs, although my testing suggests that they do. If not then I think the
    correct solution would be to aggregate the process_count for a given CPU
    into a global base value in cpu_down().

    This bug appears to pre-date the transition to git and it looks like it
    may even have been present in linux-2.6.0-test7-bk3 since it looks like
    the code Rusty patched in http://lwn.net/Articles/64773/ was already
    wrong.

    Signed-off-by: Ian Campbell
    Cc: Andrew Morton
    Cc: Rusty Russell
    Signed-off-by: Linus Torvalds

    Ian Campbell
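
The arithmetic behind the nr_processes() report above can be reproduced in userspace. This is a minimal sketch, not the kernel code: the function names and the fixed four-CPU array are made up for illustration, the counter values come from the last debug snapshot in the log, and CPU1 is assumed to have kept its last visible value of 1030 after being unplugged.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical: four possible CPUs, as in the log */

/* Sum the per-CPU fork/exit deltas, optionally skipping offline CPUs. */
int sum_process_counts(const int counts[NR_CPUS_SKETCH],
                       const bool online[NR_CPUS_SKETCH],
                       bool online_only)
{
    int total = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        if (online_only && !online[cpu])
            continue;   /* old behaviour: an unplugged CPU's delta is lost */
        total += counts[cpu];
    }
    return total;
}

/* Link count as it appears once stored in an unsigned 32-bit field. */
uint32_t root_nlink(int base_nlink, int nr_processes)
{
    return (uint32_t)(base_nlink + nr_processes);
}
```

With counts {84, 1030, -901, -137} and CPU1 offline, the online-only sum is -954 and root_nlink(12, -954) wraps to 4294966354, the value stat printed. Summing over all four CPUs gives a small positive number again.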
     
  • Finish the line with \n when load_image fails in the middle of loading.

    Signed-off-by: Jiri Slaby
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Jiri Slaby
     
  • There are too many retval variables in save_image(). As a result, the
    error return value from snapshot_read_next() may be ignored and only
    part of the snapshot written (apparently successfully).

    Remove 'error' variable, invert the condition in the do-while loop
    and convert the loop to use only 'ret' variable.

    Switch the rest of the function to consider only 'ret'.

    Also make sure the printed line ends with \n if an error occurs.

    Signed-off-by: Jiri Slaby
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Jiri Slaby
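
The single-return-variable idea from the save_image() commit above can be illustrated with a small loop. This is only a sketch under assumptions: fake_snapshot, fake_read_next() and save_pages() are hypothetical stand-ins, the loop shape is not the kernel's do-while, and -5 is used as a literal error code to keep the example self-contained.

```c
/* Hypothetical page source standing in for the snapshot reader. */
struct fake_snapshot {
    int pages_left;     /* pages still to hand out */
    int fail_after;     /* fail once this many pages were read; <0: never */
    int pages_read;
};

/* Returns >0 while data is available, 0 at the end, <0 on error. */
int fake_read_next(struct fake_snapshot *snap)
{
    if (snap->fail_after >= 0 && snap->pages_read >= snap->fail_after)
        return -5;      /* literal error code, chosen for illustration */
    if (snap->pages_left == 0)
        return 0;
    snap->pages_left--;
    snap->pages_read++;
    return 1;
}

/* One 'ret' carries every outcome, so a read error cannot be masked. */
int save_pages(struct fake_snapshot *snap, int *pages_written)
{
    int ret;

    *pages_written = 0;
    while (1) {
        ret = fake_read_next(snap);
        if (ret <= 0)
            break;              /* 0: finished, <0: error */
        (*pages_written)++;     /* stand-in for writing the page out */
    }
    return ret;
}
```

Because the value that ends the loop is the value returned, a failing read surfaces as a negative result instead of being overwritten by a second variable.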
     
  • While cruising through the swsusp code I found a few blkdev reference
    leaks of resume_bdev.

    swsusp_read: remove blkdev_put altogether. Some fail paths do
      not do that.
    swsusp_check: make sure we always put a reference on fail paths.
    software_resume: all fail paths between swsusp_check and swsusp_read
      omit swsusp_close. Add it in those cases. And since
      swsusp_read doesn't drop the reference anymore, do
      it here unconditionally.

    [rjw: Fixed a small coding style issue.]

    Signed-off-by: Jiri Slaby
    Signed-off-by: Rafael J. Wysocki

    Jiri Slaby
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    futex: Fix spurious wakeup for requeue_pi really

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf tools: Remove -Wcast-align
    perf tools: Fix compatibility with libelf 0.8 and autodetect
    perf events: Don't generate events for the idle task when exclude_idle is set
    perf events: Fix swevent hrtimer sampling by keeping track of remaining time when enabling/disabling swevent hrtimers

    Linus Torvalds
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing: Remove cpu arg from the rb_time_stamp() function
    tracing: Fix comment typo and documentation example
    tracing: Fix trace_seq_printf() return value
    tracing: Update *ppos instead of filp->f_pos

    Linus Torvalds
     

30 Oct, 2009

2 commits


29 Oct, 2009

11 commits

  • * 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6:
    HWPOISON: fix invalid page count in printk output
    HWPOISON: Allow schedule_on_each_cpu() from keventd
    HWPOISON: fix /proc/meminfo alignment
    HWPOISON: fix oops on ksm pages
    HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page
    HWPOISON: return early on non-LRU pages
    HWPOISON: Add brief hwpoison description to Documentation
    HWPOISON: Clean up PR_MCE_KILL interface

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    futex: Move drop_futex_key_refs out of spinlock'ed region
    rcu: Fix TREE_PREEMPT_RCU CPU_HOTPLUG bad-luck hang
    rcu: Stopgap fix for synchronize_rcu_expedited() for TREE_PREEMPT_RCU
    rcu: Prevent RCU IPI storms in presence of high call_rcu() load
    futex: Check for NULL keys in match_futex
    futex: Handle spurious wake up

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf timechart: Improve the visual appearance of scheduler delays
    perf timechart: Fix the wakeup-arrows that point to non-visible processes
    perf top: Fix --delay_secs 0 division by zero
    perf tools: Bump version to 0.0.2
    perf_event: Adjust frequency and unthrottle for non-group-leader events

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Do less agressive buddy clearing
    sched: Disable SD_PREFER_LOCAL for MC/CPU domains

    Linus Torvalds
     
  • Having ->procname but not ->proc_handler is valid when PROC_SYSCTL=n,
    people use such combination to reduce ifdefs with non-standard handlers.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14408

    Signed-off-by: Alexey Dobriyan
    Reported-by: Peter Teoh
    Cc: "Eric W. Biederman"
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • cgroup_write_X64() and cgroup_write_string() ignore the return value of
    strstrip(). This causes a small inconsistency in behavior.

    example:
    =========================
    # cd /mnt/cgroup/hoge
    # cat memory.swappiness
    60
    # echo "59 " > memory.swappiness
    # cat memory.swappiness
    59
    # echo " 58" > memory.swappiness
    bash: echo: write error: Invalid argument

    This patch fixes it.

    Cc: Li Zefan
    Acked-by: Paul Menage
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
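
The asymmetry in the cgroup example above follows from how a strstrip()-style helper works: trailing whitespace is removed in place, but leading whitespace is only skipped in the returned pointer. The sketch below is a userspace approximation; strstrip_sketch() and parse_u64_strict() are illustrative names, and the strict digits-only parser is an assumption standing in for whatever parsing the cgroup code does.

```c
#include <ctype.h>
#include <string.h>

/*
 * Userspace stand-in for a strstrip()-like helper: trims trailing
 * whitespace in place and RETURNS a pointer past any leading whitespace.
 */
char *strstrip_sketch(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    while (*s && isspace((unsigned char)*s))
        s++;
    return s;
}

/* Strict parser: digits only, no whitespace skipping. 0 on success, -1 on error. */
int parse_u64_strict(const char *s, unsigned long long *out)
{
    unsigned long long val = 0;

    if (*s == '\0')
        return -1;
    for (; *s; s++) {
        if (!isdigit((unsigned char)*s))
            return -1;
        val = val * 10 + (unsigned long long)(*s - '0');
    }
    *out = val;
    return 0;
}
```

Calling strstrip_sketch(buf) and then parsing buf works for "59 " but fails for " 58"; parsing the returned pointer works for both.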
     
  • Since commit 02b51df1b07b4e9ca823c89284e704cadb323cd1 (proc connector: add
    event for process becoming session leader) we have the following warning:

    Badness at kernel/softirq.c:143
    [...]
    Krnl PSW : 0404c00180000000 00000000001481d4 (local_bh_enable+0xb0/0xe0)
    [...]
    Call Trace:
    ([] 0x13fe04100)
    [] sk_filter+0x9a/0xd0
    [] netlink_broadcast+0x2c0/0x53c
    [] cn_netlink_send+0x272/0x2b0
    [] proc_sid_connector+0xc4/0xd4
    [] __set_special_pids+0x58/0x90
    [] sys_setsid+0xb4/0xd8
    [] sysc_noemu+0x10/0x16
    [] 0x41616cb266

    The warning is
    ---> WARN_ON_ONCE(in_irq() || irqs_disabled());

    The network code must not be called with disabled interrupts but
    sys_setsid holds the tasklist_lock with spinlock_irq while calling the
    connector.

    After a discussion we agreed that we can move proc_sid_connector from
    __set_special_pids to sys_setsid.

    We also agreed that it is sufficient to change the check from
    task_session(curr) != pid into err > 0, since if we don't change the
    session, this means we were already the leader and return -EPERM.

    One last thing:
    There is also daemonize(), and some people might want to get a
    notification in that case. Since daemonize() is only needed if a user
    space does kernel_thread this does not look important (and there seems
    to be no consensus if this connector should be called in daemonize). If
    we really want this, we can add proc_sid_connector to daemonize() in an
    additional patch (Scott?)

    Signed-off-by: Christian Borntraeger
    Cc: Scott James Remnant
    Cc: Matt Helsley
    Cc: David S. Miller
    Acked-by: Oleg Nesterov
    Acked-by: Evgeniy Polyakov
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Borntraeger
     
  • We create a dummy struct kernel_param on the stack for parsing each
    array element, but we didn't initialize the flags word. This matters
    for arrays of type "bool", where the flag indicates if it really is
    an array of bools or unsigned int (old-style).

    Reported-by: Takashi Iwai
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Rusty Russell
     
  • kp->arg is always true: it's the contents of that pointer we care about.

    Reported-by: Takashi Iwai
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Rusty Russell
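
The "kp->arg is always true" remark above is the classic pointer-versus-pointee mix-up. A minimal sketch, using a hypothetical cut-down struct rather than the real struct kernel_param:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct kernel_param. */
struct param_sketch {
    void *arg;  /* points at the variable that holds the char * value */
};

/* Tests the pointer to the variable, which is never NULL here. */
bool has_string_buggy(const struct param_sketch *kp)
{
    return kp->arg != NULL;
}

/* Tests the string pointer stored in that variable. */
bool has_string_fixed(const struct param_sketch *kp)
{
    return *(char **)kp->arg != NULL;
}
```

With a char * variable that is still NULL, the buggy check reports a value while the fixed check does not.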
     
  • e180a6b7759a "param: fix charp parameters set via sysfs" fixed the case
    where charp parameters written via sysfs were freed, leaving drivers
    accessing random memory.

    Unfortunately, storing a flag in the kparam struct was a bad idea: it's
    rodata so setting it causes an oops on some archs. But that's not all:

    1) module_param_array() on charp doesn't work reliably, since we use an
    uninitialized temporary struct kernel_param.
    2) there's a fundamental race if a module uses this parameter and then
    it's changed: they will still access the old, freed, memory.

    The simplest fix (ie. for 2.6.32) is to never free the memory. This
    prevents all these problems, at cost of a memory leak. In practice, there
    are only 18 places where a charp is writable via sysfs, and all are
    root-only writable.

    Reported-by: Takashi Iwai
    Cc: Sitsofe Wheeler
    Cc: Frederic Weisbecker
    Cc: Christof Schmitt
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Rusty Russell
     
  • The requeue_pi path doesn't use unqueue_me() (and the racy lock_ptr ==
    NULL test) nor does it use the wake_list of futex_wake(), which were
    the reason for commit 41890f2 (futex: Handle spurious wake up).

    See debugging discussion on LKML Message-ID:

    The changes in this fix to the wait_requeue_pi path were considered to
    be a likely unnecessary, but harmless safety net. But it turns out that
    due to the fact that for unknown $@#!*( reasons EWOULDBLOCK is defined
    as EAGAIN we built an endless loop in the code path which correctly
    returns EWOULDBLOCK.

    Spurious wakeups in wait_requeue_pi code path are unlikely so we do
    the easy solution and return EWOULDBLOCK^WEAGAIN to user space and let
    it deal with the spurious wakeup.

    Cc: Darren Hart
    Cc: Peter Zijlstra
    Cc: Eric Dumazet
    Cc: John Stultz
    Cc: Dinakar Guniguntala
    LKML-Reference:
    Cc: stable@kernel.org
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
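
The endless loop described in the requeue_pi commit above comes down to two error names sharing one value. The sketch below shows the effect with a capped loop; the function names are hypothetical, and it assumes a Linux-style errno.h where EWOULDBLOCK and EAGAIN are equal (POSIX allows them to differ).

```c
#include <errno.h>
#include <stdbool.h>

/* A caller that retries whenever the callee reports -EAGAIN. */
bool caller_would_retry(int ret)
{
    return ret == -EAGAIN;
}

/* Bounded loop: how many retries happen if the callee keeps returning 'ret'. */
int retries_before_giving_up(int ret, int max_retries)
{
    int retries = 0;

    while (caller_would_retry(ret) && retries < max_retries)
        retries++;      /* an uncapped version of this loop never ends */
    return retries;
}
```

A callee that legitimately returns -EWOULDBLOCK is indistinguishable from one asking for a retry, so only the cap stops the loop here.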
     

28 Oct, 2009

1 commit

  • Commit 34d76c41 introduced the percpu array update_shares_data, whose size
    is proportional to NR_CPUS. Unfortunately this blows up ia64 for large
    NR_CPUS configurations, as ia64 allows only 64k for the .percpu section.

    Fix this by allocating the array dynamically and keeping only a percpu
    pointer to it.

    The per-cpu handling doesn't impose a significant performance penalty on
    the potentially contended path in tg_shares_up().

    Before:

    ...
    ffffffff8104337c: 65 48 8b 14 25 20 cd mov %gs:0xcd20,%rdx
    ffffffff81043383: 00 00
    ffffffff81043385: 48 c7 c0 00 e1 00 00 mov $0xe100,%rax
    ffffffff8104338c: 48 c7 45 a0 00 00 00 movq $0x0,-0x60(%rbp)
    ffffffff81043393: 00
    ffffffff81043394: 48 c7 45 a8 00 00 00 movq $0x0,-0x58(%rbp)
    ffffffff8104339b: 00
    ffffffff8104339c: 48 01 d0 add %rdx,%rax
    ffffffff8104339f: 49 8d 94 24 08 01 00 lea 0x108(%r12),%rdx
    ffffffff810433a6: 00
    ffffffff810433a7: b9 ff ff ff ff mov $0xffffffff,%ecx
    ffffffff810433ac: 48 89 45 b0 mov %rax,-0x50(%rbp)
    ffffffff810433b0: bb 00 04 00 00 mov $0x400,%ebx
    ffffffff810433b5: 48 89 55 c0 mov %rdx,-0x40(%rbp)
    ...

    After:

    ...
    ffffffff8104337c: 65 8b 04 25 28 cd 00 mov %gs:0xcd28,%eax
    ffffffff81043383: 00
    ffffffff81043384: 48 98 cltq
    ffffffff81043386: 49 8d bc 24 08 01 00 lea 0x108(%r12),%rdi
    ffffffff8104338d: 00
    ffffffff8104338e: 48 8b 15 d3 7f 76 00 mov 0x767fd3(%rip),%rdx # ffffffff817ab368
    ffffffff81043395: 48 8b 34 c5 00 ee 6d mov -0x7e921200(,%rax,8),%rsi
    ffffffff8104339c: 81
    ffffffff8104339d: 48 c7 45 a0 00 00 00 movq $0x0,-0x60(%rbp)
    ffffffff810433a4: 00
    ffffffff810433a5: b9 ff ff ff ff mov $0xffffffff,%ecx
    ffffffff810433aa: 48 89 7d c0 mov %rdi,-0x40(%rbp)
    ffffffff810433ae: 48 c7 45 a8 00 00 00 movq $0x0,-0x58(%rbp)
    ffffffff810433b5: 00
    ffffffff810433b6: bb 00 04 00 00 mov $0x400,%ebx
    ffffffff810433bb: 48 01 f2 add %rsi,%rdx
    ffffffff810433be: 48 89 55 b0 mov %rdx,-0x50(%rbp)
    ...

    Signed-off-by: Jiri Kosina
    Acked-by: Ingo Molnar
    Signed-off-by: Tejun Heo

    Jiri Kosina
     

24 Oct, 2009

4 commits

  • The cpu argument is not used inside the rb_time_stamp() function.
    Plus fix a typo.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Trivial patch to fix a documentation example and to fix a
    comment.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • trace_seq_printf() return value is a little ambiguous. It
    currently returns the length of the space available in the
    buffer. printf usually returns the amount written. This is not
    adequate here, because:

    trace_seq_printf(s, "");

    is perfectly legal, and returning 0 would indicate that it
    failed.

    We can always see the amount written by looking at the before
    and after values of s->len. This is not quite the same use as
    printf. We only care if the string was successfully written to
    the buffer or not.

    Make trace_seq_printf() return 0 if the trace oversizes the
    buffer's free space, 1 otherwise.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jiri Olsa
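
The return convention described in the trace_seq_printf() commit above (1 if the text fit, 0 if not) can be sketched in userspace on top of vsnprintf(). The struct, the 32-byte buffer size and the function name below are assumptions for illustration, not the kernel's trace_seq.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define SEQ_SKETCH_SIZE 32      /* hypothetical; deliberately tiny */

struct seq_sketch {
    char buf[SEQ_SKETCH_SIZE];
    size_t len;                 /* bytes used, excluding the terminator */
};

/* Returns 1 if the whole string fit, 0 otherwise (nothing is kept then). */
int seq_sketch_printf(struct seq_sketch *s, const char *fmt, ...)
{
    size_t room = SEQ_SKETCH_SIZE - 1 - s->len;
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(s->buf + s->len, room + 1, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n > room) {
        s->buf[s->len] = '\0';  /* drop the truncated output */
        return 0;
    }
    s->len += (size_t)n;
    return 1;
}
```

Writing an empty string succeeds and returns 1 while leaving len unchanged, which is the case a "bytes written" return value could not distinguish from failure.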
     
  • Instead of directly updating filp->f_pos we should update the *ppos
    argument. The filp->f_pos gets updated within the file_pos_write()
    function called from sys_write().

    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jiri Olsa
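
Why a write handler should advance *ppos rather than filp->f_pos can be seen from the caller's pattern of reading the position, passing a pointer to a local copy, and writing that copy back. The following is a simplified userspace model with hypothetical names, not the VFS code.

```c
#include <stddef.h>

struct file_sketch {
    long long f_pos;
};

typedef long (*write_handler)(struct file_sketch *f, size_t count,
                              long long *ppos);

/* Buggy handler: touches f_pos directly and leaves *ppos alone. */
long handler_buggy(struct file_sketch *f, size_t count, long long *ppos)
{
    (void)ppos;
    f->f_pos += (long long)count;
    return (long)count;
}

/* Fixed handler: advances the position through *ppos only. */
long handler_fixed(struct file_sketch *f, size_t count, long long *ppos)
{
    (void)f;
    *ppos += (long long)count;
    return (long)count;
}

/* Mimics the read-position, call, write-position-back pattern of the caller. */
long write_sketch(struct file_sketch *f, size_t count, write_handler handler)
{
    long long pos = f->f_pos;
    long ret = handler(f, count, &pos);

    f->f_pos = pos;     /* the caller's copy wins */
    return ret;
}
```

In this model the buggy handler's update is overwritten when the caller stores its local copy back, so the file position never moves.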
     

23 Oct, 2009

2 commits

  • Getting samples for the idle task is often not interesting, so
    don't generate them when exclude_idle is set for the event in
    question.

    Signed-off-by: Søren Sandmann Pedersen
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Soeren Sandmann
     
  • …n enabling/disabling swevent hrtimers

    Make the hrtimer based events work for sysprof.

    Whenever a swevent is scheduled out, the hrtimer is canceled.
    When it is scheduled back in, the timer is restarted. This
    happens every scheduler tick, which means the timer never
    expired because it was getting repeatedly restarted over and
    over with the same period.

    To fix that, save the remaining time when disabling; when
    reenabling, use that saved time as the period instead of the
    user-specified sampling period.

    Also, move the starting and stopping of the hrtimers to helper
    functions instead of duplicating the code.

    Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>
    LKML-Reference: <ye8vdi7mluz.fsf@camel16.daimi.au.dk>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Soeren Sandmann
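
The effect of saving the remaining time can be checked with a tiny simulation. This sketch assumes abstract time units and a timer that is stopped and restarted at every tick; count_expiries() and its parameters are illustrative and not taken from the perf code.

```c
#include <stdbool.h>

/*
 * Count timer expiries over 'nticks' scheduler ticks of length 'tick'.
 * keep_remaining == false models the old behaviour: every restart uses
 * the full sampling period again.
 */
int count_expiries(long period, long tick, int nticks, bool keep_remaining)
{
    long remaining = period;
    int fired = 0;

    for (int i = 0; i < nticks; i++) {
        long left = tick;               /* time the event stays scheduled in */

        while (left >= remaining) {     /* timer expires within this slice */
            left -= remaining;
            remaining = period;
            fired++;
        }
        remaining -= left;
        if (!keep_remaining)
            remaining = period;         /* restart with the full period */
    }
    return fired;
}
```

With a period of 10 and a tick of 4, ten ticks yield four expiries when the remaining time is carried over and none when the timer is restarted from scratch each time.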
     

22 Oct, 2009

1 commit

  • Increase TEST_SUSPEND_SECONDS to 10 so the warning in
    suspend_test_finish() doesn't annoy the users of slower systems so much.

    Also, make the warning print the suspend-resume cycle time, so that we
    know why the warning actually triggered.

    Patch prepared during the hacking session at the Kernel Summit in Tokyo.

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

19 Oct, 2009

1 commit

  • Right now when calling schedule_on_each_cpu() from keventd there
    is a deadlock because it tries to schedule a work item on the current CPU
    too. This happens via lru_add_drain_all() in hwpoison.

    Just call the function for the current CPU in this case. This is actually
    faster too.

    Debugging with Fengguang Wu & Max Asbock

    Signed-off-by: Andi Kleen

    Andi Kleen
     

16 Oct, 2009

3 commits

  • When requeuing tasks from one futex to another, the reference held
    by the requeued task to the original futex location needs to be
    dropped eventually.

    Dropping the reference may ultimately lead to a call to
    "iput_final" and subsequently call into filesystem-specific code,
    which may be non-atomic.

    It is therefore safer to defer this drop operation until after the
    futex_hash_bucket spinlock has been dropped.

    Originally-From: Helge Bahmann
    Signed-off-by: Darren Hart
    Cc:
    Cc: Peter Zijlstra
    Cc: Eric Dumazet
    Cc: Dinakar Guniguntala
    Cc: John Stultz
    Cc: Sven-Thorsten Dietrich
    Cc: John Kacur
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Darren Hart
     
  • * branch 'tty-fixes'
    tty: use the new 'flush_delayed_work()' helper to do ldisc flush
    workqueue: add 'flush_delayed_work()' to run and wait for delayed work
    tty: Make flush_to_ldisc() locking more robust

    Linus Torvalds
     
  • If the following sequence of events occurs, then
    TREE_PREEMPT_RCU will hang waiting for a grace period to
    complete, eventually OOMing the system:

    o A TREE_PREEMPT_RCU build of the kernel is booted on a system
    with more than 64 physical CPUs present (32 on a 32-bit system).
    Alternatively, a TREE_PREEMPT_RCU build of the kernel is booted
    with RCU_FANOUT set to a sufficiently small value that the
    physical CPUs populate two or more leaf rcu_node structures.

    o A task is preempted in an RCU read-side critical section
    while running on a CPU corresponding to a given leaf rcu_node
    structure.

    o All CPUs corresponding to this same leaf rcu_node structure
    record quiescent states for the current grace period.

    o All of these same CPUs go offline (hence the need for enough
    physical CPUs to populate more than one leaf rcu_node structure).
    This causes the preempted task to be moved to the root rcu_node
    structure.

    At this point, there is nothing left to cause the quiescent
    state to be propagated up the rcu_node tree, so the current
    grace period never completes.

    The simplest fix, especially after considering the deadlock
    possibilities, is to detect this situation when the last CPU is
    offlined, and to set that CPU's ->qsmask bit in its leaf
    rcu_node structure. This will cause the next invocation of
    force_quiescent_state() to end the grace period.

    Without this fix, this hang can be triggered in an hour or so on
    some machines with rcutorture and random CPU onlining/offlining.
    With this fix, these same machines pass a full 10 hours of this
    sort of abuse.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

15 Oct, 2009

7 commits

  • For the short term, map synchronize_rcu_expedited() to
    synchronize_rcu() for TREE_PREEMPT_RCU and to
    synchronize_sched_expedited() for TREE_RCU.

    Longer term, there needs to be a real expedited grace period for
    TREE_PREEMPT_RCU, but candidate patches to date are considerably
    more complex and intrusive.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • As the number of callbacks on a given CPU rises, invoke
    force_quiescent_state() only every blimit number of callbacks
    (defaults to 10,000), and even then only if no other CPU has
    invoked force_quiescent_state() in the meantime.

    This should fix the performance regression reported by Nick.

    Reported-by: Nick Piggin
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
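
The throttling rule from the call_rcu() load commit above (force only every so many callbacks, and only if nobody else forced in the meantime) can be modelled with two counters. All struct and field names below are hypothetical, and the threshold is passed in as a parameter rather than tied to any kernel tunable.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; field names are illustrative. */
struct cpu_cb_state {
    long qlen;              /* callbacks currently queued on this CPU */
    long qlen_last_check;   /* qlen when forcing was last considered */
    long force_snap;        /* global force count seen at that time */
};

/* Hypothetical global state shared by all CPUs. */
struct global_state {
    long n_force;           /* times any CPU forced quiescent states */
};

/* Queue one callback; returns true if this call forced quiescent states. */
bool enqueue_callback(struct cpu_cb_state *cpu, struct global_state *g,
                      long threshold)
{
    bool forced = false;

    cpu->qlen++;
    if (cpu->qlen > cpu->qlen_last_check + threshold) {
        if (g->n_force == cpu->force_snap) {
            g->n_force++;   /* stand-in for force_quiescent_state() */
            forced = true;
        }
        cpu->force_snap = g->n_force;
        cpu->qlen_last_check = cpu->qlen;
    }
    return forced;
}
```

Under this model a CPU that keeps queueing callbacks triggers at most one forcing attempt per threshold-sized batch, and skips it entirely if another CPU has bumped the global count since the last check.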
     
  • * branch 'tty-fixes':
    tty: use the new 'flush_delayed_work()' helper to do ldisc flush
    workqueue: add 'flush_delayed_work()' to run and wait for delayed work
    Make flush_to_ldisc properly handle parallel calls

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    oprofile: warn on freeing event buffer too early
    oprofile: fix race condition in event_buffer free
    lockdep: Use cpu_clock() for lockstat

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Fix missing kernel-doc notation
    Revert "x86, timers: Check for pending timers after (device) interrupts"
    sched: Update the clock of runqueue select_task_rq() selected

    Linus Torvalds
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing/filters: Fix memory leak when setting a filter
    tracing: fix trace_vprintk call

    Linus Torvalds
     
  • It basically turns a delayed work into an immediate work, and then waits
    for it to finish, thus allowing you to force (and wait for) an immediate
    flush of a delayed work.

    We'll want to use this in the tty layer to clean up tty_flush_to_ldisc().

    Acked-by: Oleg Nesterov
    [ Fixed to use 'del_timer_sync()' as noted by Oleg ]
    Signed-off-by: Linus Torvalds

    Linus Torvalds
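
The "turn delayed work into immediate work, then wait for it" idea above can be modelled without any kernel machinery. This is a single-threaded sketch with hypothetical names; cancelling the timer is reduced to clearing a flag, and waiting is reduced to running the function synchronously.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a delayed work item. */
struct delayed_sketch {
    bool timer_pending;         /* delay has not elapsed yet */
    bool queued;                /* ready to run */
    void (*fn)(void *arg);
    void *arg;
};

/* Example work function: counts how often it ran. */
void count_run(void *arg)
{
    (*(int *)arg)++;
}

/* Cancel the delay, treat the work as immediate, and run it to completion. */
void flush_delayed_sketch(struct delayed_sketch *dw)
{
    if (dw->timer_pending) {
        dw->timer_pending = false;  /* stand-in for cancelling the timer */
        dw->queued = true;          /* delayed work becomes immediate work */
    }
    if (dw->queued) {
        dw->queued = false;
        dw->fn(dw->arg);            /* "wait" by running it synchronously */
    }
}
```

Flushing a pending item runs it exactly once; flushing again is a no-op because nothing is pending or queued any more.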