18 Jul, 2007

14 commits

  • Rather than using a tri-state integer for the wait flag in
    call_usermodehelper_exec, define a proper enum, and use that. I've
    preserved the integer values so that any callers I've missed should
    still work OK.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: James Bottomley
    Cc: Randy Dunlap
    Cc: Christoph Hellwig
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Johannes Berg
    Cc: Ralf Baechle
    Cc: Bjorn Helgaas
    Cc: Joel Becker
    Cc: Tony Luck
    Cc: Kay Sievers
    Cc: Srivatsa Vaddagiri
    Cc: Oleg Nesterov
    Cc: David Howells

    Jeremy Fitzhardinge
     
  • Various pieces of code around the kernel want to be able to trigger an
    orderly poweroff. This pulls them together into a single
    implementation.

    By default the poweroff command is /sbin/poweroff, but it can be set
    via sysctl: kernel/poweroff_cmd. This is split at whitespace, so it
    can include command-line arguments.

    This patch replaces four other instances of invoking either "poweroff"
    or "shutdown -h now": two sbus drivers, and acpi thermal
    management.

    sparc64 has its own "powerd"; still need to determine whether it should
    be replaced by orderly_poweroff().

    Signed-off-by: Jeremy Fitzhardinge
    Acked-by: Len Brown
    Signed-off-by: Chris Wright
    Cc: Andrew Morton
    Cc: Randy Dunlap
    Cc: Andi Kleen
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: David S. Miller

    Jeremy Fitzhardinge
     
  • Rather than having hundreds of variations of call_usermodehelper for
    various pieces of usermode state which could be set up, split the
    info allocation and initialization from the actual process execution.

    This means the general pattern becomes:
    info = call_usermodehelper_setup(path, argv, envp); /* basic state */
    call_usermodehelper_(info, stuff...); /* extra state */
    call_usermodehelper_exec(info, wait); /* run process and free info */

    This patch introduces wrappers for all the existing calling styles for
    call_usermodehelper_*, but folds their implementations into one.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Andi Kleen
    Cc: Rusty Russell
    Cc: David Howells
    Cc: Bj?rn Steinbrink
    Cc: Randy Dunlap

    Jeremy Fitzhardinge
     
  • Kill this warning...

    kernel/auditfilter.c: In function ‘audit_receive_filter’:
    kernel/auditfilter.c:1213: warning: ‘ndw’ may be used uninitialized in this function
    kernel/auditfilter.c:1213: warning: ‘ndp’ may be used uninitialized in this function

    ...with a simplification of the code. audit_put_nd() can accept NULL
    arguments, just like kfree(). It is cleaner to init two existing vars
    to NULL, remove the redundant test variable 'putnd_needed' branches, and call
    audit_put_nd() directly.

    As a desired side effect, the warning goes away.

    Signed-off-by: Jeff Garzik

    Jeff Garzik
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (80 commits)
    KVM: Use CPU_DYING for disabling virtualization
    KVM: Tune hotplug/suspend IPIs
    KVM: Keep track of which cpus have virtualization enabled
    SMP: Allow smp_call_function_single() to current cpu
    i386: Allow smp_call_function_single() to current cpu
    x86_64: Allow smp_call_function_single() to current cpu
    HOTPLUG: Adapt thermal throttle to CPU_DYING
    HOTPLUG: Adapt cpuset hotplug callback to CPU_DYING
    HOTPLUG: Add CPU_DYING notifier
    KVM: Clean up #includes
    KVM: Remove kvmfs in favor of the anonymous inodes source
    KVM: SVM: Reliably detect if SVM was disabled by BIOS
    KVM: VMX: Remove unnecessary code in vmx_tlb_flush()
    KVM: MMU: Fix Wrong tlb flush order
    KVM: VMX: Reinitialize the real-mode tss when entering real mode
    KVM: Avoid useless memory write when possible
    KVM: Fix x86 emulator writeback
    KVM: Add support for in-kernel pio handlers
    KVM: VMX: Fix interrupt checking on lightweight exit
    KVM: Adds support for in-kernel mmio handlers
    ...

    Linus Torvalds
     
  • Pointed out by Michal Schmidt .

    The bug was introduced in 2.6.22 by me.

    cleanup_workqueue_thread() does flush_cpu_workqueue(cwq) in a loop until
    ->worklist becomes empty. This is live-lockable, a re-niced caller can get
    CPU after wake_up() and insert a new barrier before the lower-priority
    cwq->thread has a chance to clear ->current_work.

    Change cleanup_workqueue_thread() to do flush_cpu_workqueue(cwq) only once.
    We can rely on the fact that run_workqueue() won't return until it flushes
    all works. So it is safe to call kthread_stop() after that, the "should
    stop" request won't be noticed until run_workqueue() returns.

    Signed-off-by: Oleg Nesterov
    Cc: Michal Schmidt
    Cc: Srivatsa Vaddagiri
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • KSYM_NAME_LEN is peculiar in that it does not include the space for the
    trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when allocating
    buffer. This is nonsense and error-prone. Moreover, when the caller
    forgets that it's very likely to subtly bite back by corrupting the stack
    because the last position of the buffer is always cleared to zero.

    This patch increments KSYM_NAME_LEN by one and updates code accordingly.

    * off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro
    is fixed.

    * Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together,
    MODULE_NAME_LEN was treated as if it didn't include space for the
    trailing '\0'. Fix it.

    Signed-off-by: Tejun Heo
    Acked-by: Paulo Marques
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Add a proper prototype for proc_nr_files() in include/linux/fs.h

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Identical implementations of PTRACE_POKEDATA go into generic_ptrace_pokedata()
    function.

    AFAICS, fix bug on xtensa where successful PTRACE_POKEDATA will nevertheless
    return EPERM.

    Signed-off-by: Alexey Dobriyan
    Cc: Christoph Hellwig
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Identical implementations of PTRACE_PEEKDATA go into generic_ptrace_peekdata()
    function.

    Signed-off-by: Alexey Dobriyan
    Cc: Christoph Hellwig
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • If the kernel OOPSed or BUGed then it probably should be considered as
    tainted. Thus, all subsequent OOPSes and SysRq dumps will report the
    tainted kernel. This saves a lot of time explaining oddities in the
    calltraces.

    Signed-off-by: Pavel Emelianov
    Acked-by: Randy Dunlap
    Cc:
    Signed-off-by: Andrew Morton
    [ Added parisc patch from Matthew Wilson -Linus ]
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     
  • Currently, the freezer treats all tasks as freezable, except for the kernel
    threads that explicitly set the PF_NOFREEZE flag for themselves. This
    approach is problematic, since it requires every kernel thread to either
    set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
    care for the freezing of tasks at all.

    It seems better to only require the kernel threads that want to or need to
    be frozen to use some freezer-related code and to remove any
    freezer-related code from the other (nonfreezable) kernel threads, which is
    done in this patch.

    The patch causes all kernel threads to be nonfreezable by default (ie. to
    have PF_NOFREEZE set by default) and introduces the set_freezable()
    function that should be called by the freezable kernel threads in order to
    unset PF_NOFREEZE. It also makes all of the currently freezable kernel
    threads call set_freezable(), so it shouldn't cause any (intentional)
    change of behaviour to appear. Additionally, it updates documentation to
    describe the freezing of tasks more accurately.

    [akpm@linux-foundation.org: build fixes]
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Nigel Cunningham
    Cc: Pavel Machek
    Cc: Oleg Nesterov
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing
    variant in the past. But with __GFP_ZERO it is possible now to do zeroing
    while allocating.

    Use __GFP_ZERO to remove the explicit clearing of memory via memset whereever
    we can.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Huge pages are not movable so are not allocated from ZONE_MOVABLE. However,
    as ZONE_MOVABLE will always have pages that can be migrated or reclaimed, it
    can be used to satisfy hugepage allocations even when the system has been
    running a long time. This allows an administrator to resize the hugepage pool
    at runtime depending on the size of ZONE_MOVABLE.

    This patch adds a new sysctl called hugepages_treat_as_movable. When a
    non-zero value is written to it, future allocations for the huge page pool
    will use ZONE_MOVABLE. Despite huge pages being non-movable, we do not
    introduce additional external fragmentation of note as huge pages are always
    the largest contiguous block we care about.

    [akpm@linux-foundation.org: various fixes]
    Signed-off-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

17 Jul, 2007

26 commits

  • All of the clockevent notifiers expect a pointer to
    an "unsigned int" cpu argument, but hrtimer_cpu_notify()
    passes in a pointer to a long.

    [ Discussed with and ok by Thomas Gleixner ]

    Signed-off-by: David S. Miller
    Signed-off-by: Linus Torvalds

    David Miller
     
  • Randy Dunlap noticed that the recent comment clarifications from Andrew
    had somehow gotten duplicated. Quoth Andrew: "hm, that could have been
    some late-night reject-fixing."

    Fix it up.

    Cc: From: Andrew Morton
    Cc: Randy Dunlap
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
    [PATCH] sched: fix up fs/proc/array.c whitespace problems
    [PATCH] sched: prettify prio_to_wmult[]
    [PATCH] sched: document prio_to_wmult[]
    [PATCH] sched: improve weight-array comments
    [PATCH] sched: remove dead code from task_stime()

    Fixed up trivial conflict in fs/proc/array.c

    Linus Torvalds
     
  • kernel/printk.c: document possible deadlock against scheduler

    The printk's comment states that it can be called from every context,
    which might lead to false illusion that it could be called from everywhere
    without any restrictions.

    This is however not true - a call to printk() could deadlock if called from
    scheduler code (namely from schedule(), wake_up(), etc) on runqueue lock
    when it tries to wake up klogd. Document this.

    Signed-off-by: Jiri Kosina
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     
  • Now we always use stop_machine for module insertion or deletion, we no
    longer need the modlist_lock: merely disabling preemption is sufficient to
    block against list manipulation. This avoids deadlock on OOPSen where we
    can potentially grab the lock twice.

    Bug: 8695
    Signed-off-by: Rusty Russell
    Cc: Ingo Molnar
    Cc: Tobias Oed
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rusty Russell
     
  • Change cancel_work_sync() and cancel_delayed_work_sync() to return a boolean
    indicating whether the work was actually cancelled. A zero return value means
    that the work was not pending/queued.

    Without that kind of change it is not possible to avoid flush_workqueue()
    sometimes, see the next patch as an example.

    Also, this patch unifies both functions and kills the (unlikely) busy-wait
    loop.

    Signed-off-by: Oleg Nesterov
    Acked-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Imho, the current naming of cancel_xxx workqueue functions is very confusing.

    cancel_delayed_work()
    cancel_rearming_delayed_work()
    cancel_rearming_delayed_workqueue() // obsolete

    cancel_work_sync()

    This looks as if the first 2 functions differ in "type" of their argument
    which is not true any longer, nowadays the difference is the behaviour.

    The semantics of cancel_rearming_delayed_work(dwork) was changed
    significantly, it doesn't require that dwork rearms itself, and cancels dwork
    synchronously.

    Rename it to cancel_delayed_work_sync(). This matches cancel_delayed_work()
    and cancel_work_sync(). Re-create cancel_rearming_delayed_work() as a simple
    inline obsolete wrapper, like cancel_rearming_delayed_workqueue().

    Signed-off-by: Oleg Nesterov
    Acked-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Replace (n & (n-1)) with is_power_of_2()

    Signed-off-by: vignesh babu
    Acked-by: Stelian Pop
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    vignesh babu
     
  • This follows a suggestion from Chuck Ebbert on how to make seccomp
    absolutely zerocost in schedule too. The only remaining footprint of
    seccomp is in terms of the bzImage size that becomes a few bytes (perhaps
    even a few kbytes) larger, measure it if you care in the embedded.

    Signed-off-by: Andrea Arcangeli
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • This reduces the memory footprint and it enforces that only the current
    task can enable seccomp on itself (this is a requirement for a
    strightforward [modulo preempt ;) ] TIF_NOTSC implementation).

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Remove pointless `else'.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Hopefully this will help people to understand the new regime.

    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • The recent PRIVATE and REQUEUE_PI changes to the futex code made it hard to
    read. Tidy it up.

    Signed-off-by: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Improve performance of sys_time(). sys_time() returns time in seconds, but
    it does so by calling do_gettimeofday() and then returning the tv_sec
    portion of the GTOD time. But the data structure "xtime", which is updated
    by every timer/scheduler tick, already offers HZ granularity time.

    The patch improves the sysbench OLTP macrobenchmark significantly:

    2.6.22-rc6:

    #threads
    1: transactions: 3733 (373.21 per sec.)
    2: transactions: 6676 (667.46 per sec.)
    3: transactions: 6957 (695.50 per sec.)
    4: transactions: 7055 (705.48 per sec.)
    5: transactions: 6596 (659.33 per sec.)

    2.6.22-rc6 + sys_time.patch:

    1: transactions: 4005 (400.47 per sec.)
    2: transactions: 7379 (737.77 per sec.)
    3: transactions: 7347 (734.49 per sec.)
    4: transactions: 7468 (746.65 per sec.)
    5: transactions: 7428 (742.47 per sec.)

    Mixed API uses of gettimeofday() and time() are guaranteed to be coherent
    via the use of a at-most-once-per-second slowpath that updates xtime.

    [akpm@linux-foundation.org: build fixes]
    Signed-off-by: Ingo Molnar
    Cc: John Stultz
    Cc: Thomas Gleixner
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • While working on unshare support for the network namespace I noticed we
    were putting clone flags in an int. Which is weird because the syscall
    uses unsigned long and we at least need an unsigned to properly hold all of
    the unshare flags.

    So to make the code consistent, this patch updates the code to use
    unsigned long instead of int for the clone flags in those places
    where we get it wrong today.

    Signed-off-by: Eric W. Biederman
    Acked-by: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • OpenVZ Linux kernel team has discovered the problem with 32bit quota tools
    working on 64bit architectures. In 2.6.10 kernel sys32_quotactl() function
    was replaced by sys_quotactl() with the comment "sys_quotactl seems to be
    32/64bit clean, enable it for 32bit" However this isn't right. Look at
    if_dqblk structure:

    struct if_dqblk {
    __u64 dqb_bhardlimit;
    __u64 dqb_bsoftlimit;
    __u64 dqb_curspace;
    __u64 dqb_ihardlimit;
    __u64 dqb_isoftlimit;
    __u64 dqb_curinodes;
    __u64 dqb_btime;
    __u64 dqb_itime;
    __u32 dqb_valid;
    };

    For 32 bit quota tools sizeof(if_dqblk) == 0x44.
    But for 64 bit kernel its size is 0x48, 'cause of alignment!
    Thus we got a problem. Attached patch reintroduce sys32_quotactl() function,
    that handles this and related situations.

    [michal.k.k.piotrowski@gmail.com: build fix]
    [akpm@linux-foundation.org: Make it link with CONFIG_QUOTA=n]
    Signed-off-by: Vasily Tarasov
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Jan Kara
    Cc:
    Signed-off-by: Michal Piotrowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasily Tarasov
     
  • Fix parameter name in audit_core_dumps for kerneldoc.

    Signed-off-by: Henrik Kretzschmar
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henrik Kretzschmar
     
  • It should improve performance in some scenarii where a lot of
    these nsproxy objects are created by unsharing namespaces. This is
    a typical use of virtual servers that are being created or entered.

    This is also a good tool to find leaks and gather statistics on
    namespace usage.

    Signed-off-by: Cedric Le Goater
    Cc: Herbert Poetzl
    Cc: Pavel Emelianov
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
     
  • dup_mnt_ns() and clone_uts_ns() return NULL on failure. This is wrong,
    create_new_namespaces() uses ERR_PTR() to catch an error. This means that the
    subsequent create_new_namespaces() will hit BUG_ON() in copy_mnt_ns() or
    copy_utsname().

    Modify create_new_namespaces() to also use the errors returned by the
    copy_*_ns routines and not to systematically return ENOMEM.

    [oleg@tv-sign.ru: better changelog]
    Signed-off-by: Cedric Le Goater
    Cc: Serge E. Hallyn
    Cc: Badari Pulavarty
    Cc: Pavel Emelianov
    Cc: Herbert Poetzl
    Cc: Eric W. Biederman
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
     
  • This patch enables the unshare of user namespaces.

    It adds a new clone flag CLONE_NEWUSER and implements copy_user_ns() which
    resets the current user_struct and adds a new root user (uid == 0)

    For now, unsharing the user namespace allows a process to reset its
    user_struct accounting and uid 0 in the new user namespace should be contained
    using appropriate means, for instance selinux

    The plan, when the full support is complete (all uid checks covered), is to
    keep the original user's rights in the original namespace, and let a process
    become uid 0 in the new namespace, with full capabilities to the new
    namespace.

    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Cedric Le Goater
    Acked-by: Pavel Emelianov
    Cc: Herbert Poetzl
    Cc: Kirill Korotaev
    Cc: Eric W. Biederman
    Cc: Chris Wright
    Cc: Stephen Smalley
    Cc: James Morris
    Cc: Andrew Morgan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     
  • Basically, it will allow a process to unshare its user_struct table,
    resetting at the same time its own user_struct and all the associated
    accounting.

    A new root user (uid == 0) is added to the user namespace upon creation.
    Such root users have full privileges and it seems that theses privileges
    should be controlled through some means (process capabilities ?)

    The unshare is not included in this patch.

    Changes since [try #4]:
    - Updated get_user_ns and put_user_ns to accept NULL, and
    get_user_ns to return the namespace.

    Changes since [try #3]:
    - moved struct user_namespace to files user_namespace.{c,h}

    Changes since [try #2]:
    - removed struct user_namespace* argument from find_user()

    Changes since [try #1]:
    - removed struct user_namespace* argument from find_user()
    - added a root_user per user namespace

    Signed-off-by: Cedric Le Goater
    Signed-off-by: Serge E. Hallyn
    Acked-by: Pavel Emelianov
    Cc: Herbert Poetzl
    Cc: Kirill Korotaev
    Cc: Eric W. Biederman
    Cc: Chris Wright
    Cc: Stephen Smalley
    Cc: James Morris
    Cc: Andrew Morgan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
     
  • CONFIG_UTS_NS and CONFIG_IPC_NS have very little value as they only
    deactivate the unshare of the uts and ipc namespaces and do not improve
    performance.

    Signed-off-by: Cedric Le Goater
    Acked-by: "Serge E. Hallyn"
    Cc: Eric W. Biederman
    Cc: Herbert Poetzl
    Cc: Pavel Emelianov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
     
  • Add TTY input auditing, used to audit system administrator's actions. This is
    required by various security standards such as DCID 6/3 and PCI to provide
    non-repudiation of administrator's actions and to allow a review of past
    actions if the administrator seems to overstep their duties or if the system
    becomes misconfigured for unknown reasons. These requirements do not make it
    necessary to audit TTY output as well.

    Compared to an user-space keylogger, this approach records TTY input using the
    audit subsystem, correlated with other audit events, and it is completely
    transparent to the user-space application (e.g. the console ioctls still
    work).

    TTY input auditing works on a higher level than auditing all system calls
    within the session, which would produce an overwhelming amount of mostly
    useless audit events.

    Add an "audit_tty" attribute, inherited across fork (). Data read from TTYs
    by process with the attribute is sent to the audit subsystem by the kernel.
    The audit netlink interface is extended to allow modifying the audit_tty
    attribute, and to allow sending explanatory audit events from user-space (for
    example, a shell might send an event containing the final command, after the
    interactive command-line editing and history expansion is performed, which
    might be difficult to decipher from the TTY input alone).

    Because the "audit_tty" attribute is inherited across fork (), it would be set
    e.g. for sshd restarted within an audited session. To prevent this, the
    audit_tty attribute is cleared when a process with no open TTY file
    descriptors (e.g. after daemon startup) opens a TTY.

    See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a
    more detailed rationale document for an older version of this patch.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Miloslav Trmac
    Cc: Al Viro
    Cc: Alan Cox
    Cc: Paul Fulghum
    Cc: Casey Schaufler
    Cc: Steve Grubb
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miloslav Trmac
     
  • Currently we handle spurious IRQ activity based upon seeing a lot of
    invalid interrupts, and we clear things back on the base of lots of valid
    interrupts.

    Unfortunately in some cases you get legitimate invalid interrupts caused by
    timing asynchronicity between the PCI bus and the APIC bus when disabling
    interrupts and pulling other tricks. In this case although the spurious
    IRQs are not a problem our unhandled counters didn't clear and they act as
    a slow running timebomb. (This is effectively what the serial port/tty
    problem that was fixed by clearing counters when registering a handler
    showed up)

    It's easy enough to add a second parameter - time. This means that if we
    see a regular stream of harmless spurious interrupts which are not harming
    processing we don't go off and do something stupid like disable the IRQ
    after a month of running. OTOH lockups and performance killers show up a
    lot more than 10/second

    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Alan Cox
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • Make available to the user the following task and process performance
    statistics:

    * Involuntary Context Switches (task_struct->nivcsw)
    * Voluntary Context Switches (task_struct->nvcsw)

    Statistics information is available from:
    1. taskstats interface (Documentation/accounting/)
    2. /proc/PID/status (task only).

    This data is useful for detecting hyperactivity patterns between processes.

    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Maxim Uvarov
    Cc: Shailabh Nagar
    Cc: Balbir Singh
    Cc: Jay Lan
    Cc: Jonathan Lim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maxim Uvarov
     
  • I forgot to remove capability.h from mm.h while removing sched.h! This
    patch remedies that, because the only inline function which was using
    CAP_something was made out of line.

    Cross-compile tested without regressions on:

    all powerpc defconfigs
    all mips defconfigs
    all m68k defconfigs
    all arm defconfigs
    all ia64 defconfigs

    alpha alpha-allnoconfig alpha-defconfig alpha-up
    arm
    i386 i386-allnoconfig i386-defconfig i386-up
    ia64 ia64-allnoconfig ia64-defconfig ia64-up
    m68k
    mips
    parisc parisc-allnoconfig parisc-defconfig parisc-up
    powerpc powerpc-up
    s390 s390-allnoconfig s390-defconfig s390-up
    sparc sparc-allnoconfig sparc-defconfig sparc-up
    sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up
    um-x86_64
    x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up

    as well as my two usual configs.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan