27 May, 2011

7 commits

  • profile_hits() has a common check for prof_on and prof_buffer regardless
    of SMP or !SMP. So, remove some duplicate code by splitting profile_hits
    into two.

    [akpm@linux-foundation.org: make do_profile_hits static]
    Signed-off-by: Rakib Mullick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rakib Mullick
     
  • Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
    This was because exe_file was needed only for /proc//exe. Since we
    will need the exe_file functionality also for core dumps (so core name can
    contain full binary path), built this functionality always into the
    kernel.

    To achieve that move that out of proc FS to the kernel/ where in fact it
    should belong. By doing that we can make dup_mm_exe_file static. Also we
    can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.

    Signed-off-by: Jiri Slaby
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby
     
  • The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and
    leads to some problems:

    * cgroup creation is out-of-control
    * cgroup name can conflict when pids are looping
    * it is not possible to have a single process handling a lot of
    namespaces without falling in a exponential creation time
    * we may want to create a namespace without creating a cgroup

    The ns_cgroup was replaced by a compatibility flag 'clone_children',
    where a newly created cgroup will copy the parent cgroup values.
    The userspace has to manually create a cgroup and add a task to
    the 'tasks' file.

    This patch removes the ns_cgroup as suggested in the following thread:

    https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html

    The 'cgroup_clone' function is removed because it is no longer used.

    This is a userspace-visible change. Commit 45531757b45c ("cgroup: notify
    ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a
    printk warning users that the feature is planned for removal. Since that
    time we have heard from XXX users who were affected by this.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Serge E. Hallyn
    Cc: Eric W. Biederman
    Cc: Jamal Hadi Salim
    Reviewed-by: Li Zefan
    Acked-by: Paul Menage
    Acked-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Lezcano
     
  • Convert cgroup_attach_proc to use flex_array.

    The cgroup_attach_proc implementation requires a pre-allocated array to
    store task pointers to atomically move a thread-group, but asking for a
    monolithic array with kmalloc() may be unreliable for very large groups.
    Using flex_array provides the same functionality with less risk of
    failure.

    This is a post-patch for cgroup-procs-write.patch.

    Signed-off-by: Ben Blum
    Cc: "Eric W. Biederman"
    Cc: Li Zefan
    Cc: Matt Helsley
    Reviewed-by: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Blum
     
  • Make procs file writable to move all threads by tgid at once.

    Add functionality that enables users to move all threads in a threadgroup
    at once to a cgroup by writing the tgid to the 'cgroup.procs' file. This
    current implementation makes use of a per-threadgroup rwsem that's taken
    for reading in the fork() path to prevent newly forking threads within the
    threadgroup from "escaping" while the move is in progress.

    Signed-off-by: Ben Blum
    Cc: "Eric W. Biederman"
    Cc: Li Zefan
    Cc: Matt Helsley
    Reviewed-by: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Blum
     
  • Add cgroup subsystem callbacks for per-thread attachment in atomic contexts

    Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
    for cgroups's subsystem interface. Unlike can_attach and attach, these
    are for per-thread operations, to be called potentially many times when
    attaching an entire threadgroup.

    Also, the old "bool threadgroup" interface is removed, as replaced by
    this. All subsystems are modified for the new interface - of note is
    cpuset, which requires from/to nodemasks for attach to be globally scoped
    (though per-cpuset would work too) to persist from its pre_attach to
    attach_task and attach.

    This is a pre-patch for cgroup-procs-writable.patch.

    Signed-off-by: Ben Blum
    Cc: "Eric W. Biederman"
    Cc: Li Zefan
    Cc: Matt Helsley
    Reviewed-by: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Blum
     
  • Adds functionality to read/write lock CLONE_THREAD fork()ing per-threadgroup

    Add an rwsem that lives in a threadgroup's signal_struct that's taken for
    reading in the fork path, under CONFIG_CGROUPS. If another part of the
    kernel later wants to use such a locking mechanism, the CONFIG_CGROUPS
    ifdefs should be changed to a higher-up flag that CGROUPS and the other
    system would both depend on.

    This is a pre-patch for cgroup-procs-write.patch.

    Signed-off-by: Ben Blum
    Cc: "Eric W. Biederman"
    Cc: Li Zefan
    Cc: Matt Helsley
    Reviewed-by: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Blum
     

26 May, 2011

7 commits

  • commit 4b06042(bitmap, irq: add smp_affinity_list interface to
    /proc/irq) causes the following warning:

    [ 274.239500] WARNING: at fs/proc/generic.c:850 remove_proc_entry+0x24c/0x27a()
    [ 274.251761] remove_proc_entry: removing non-empty directory 'irq/184',
    leaking at least 'smp_affinity_list'

    Remove the new file in the exit path.

    Signed-off-by: Yinghai Lu
    Cc: Mike Travis
    Link: http://lkml.kernel.org/r/4DDDE094.6050505@kernel.org
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
    net: fix get_net_ns_by_fd for !CONFIG_NET_NS
    ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
    ns: Declare sys_setns in syscalls.h
    net: Allow setting the network namespace by fd
    ns proc: Add support for the ipc namespace
    ns proc: Add support for the uts namespace
    ns proc: Add support for the network namespace.
    ns: Introduce the setns syscall
    ns: proc files for namespace naming policy.

    Linus Torvalds
     
  • * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc:
    signal: sys_pause() should check signal_pending()
    ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (26 commits)
    arch/tile: prefer "tilepro" as the name of the 32-bit architecture
    compat: include aio_abi.h for aio_context_t
    arch/tile: cleanups for tilegx compat mode
    arch/tile: allocate PCI IRQs later in boot
    arch/tile: support signal "exception-trace" hook
    arch/tile: use better definitions of xchg() and cmpxchg()
    include/linux/compat.h: coding-style fixes
    tile: add an RTC driver for the Tilera hypervisor
    arch/tile: finish enabling support for TILE-Gx 64-bit chip
    compat: fixes to allow working with tile arch
    arch/tile: update defconfig file to something more useful
    tile: do_hardwall_trap: do not play with task->sighand
    tile: replace mm->cpu_vm_mask with mm_cpumask()
    tile,mn10300: add device parameter to dma_cache_sync()
    audit: support the "standard"
    arch/tile: clarify flush_buffer()/finv_buffer() function names
    arch/tile: kernel-related cleanups from removing static page size
    arch/tile: various header improvements for building drivers
    arch/tile: disable GX prefetcher during cache flush
    arch/tile: tolerate disabling CONFIG_BLK_DEV_INITRD
    ...

    Linus Torvalds
     
  • commit 9ec2690758a5 ("timerfd: Manage cancelable timers in timerfd")
    introduced a CONFIG_HIGHRES_TIMERS (should be CONFIG_HIGH_RES_TIMERS)
    typo, which caused applications depending on CLOCK_REALTIME timers to
    become sluggy due to the fact that the time base of the realtime
    timers was not updated when the wall clock time was set.

    This causes anything from 100% CPU use for some applications to odd
    delays and hickups.

    Reported-bisected-and-tested-by: Anca Emanuel
    Tested-by: Linus Torvalds
    Fatfingered-by: Thomas Gleixner
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • ERESTART* is always wrong without TIF_SIGPENDING. Teach sys_pause()
    to handle the spurious wakeup correctly.

    Signed-off-by: Oleg Nesterov

    Oleg Nesterov
     
  • It is not clear why ptrace_resume() does wake_up_process(). Unless the
    caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use
    wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do
    not need the extra and potentionally spurious wakeup.

    If the caller is PTRACE_KILL, wake_up_process() is even more wrong.
    The tracee can sleep in any state in any place, and if we have a buggy
    code which doesn't handle a spurious wakeup correctly PTRACE_KILL can
    be used to exploit it. For example:

    int main(void)
    {
    int child, status;

    child = fork();
    if (!child) {
    int ret;

    assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);

    ret = pause();
    printf("pause: %d %m\n", ret);

    return 0x23;
    }

    sleep(1);
    assert(ptrace(PTRACE_KILL, child, 0,0) == 0);

    assert(child == wait(&status));
    printf("wait: %x\n", status);

    return 0;
    }

    prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the
    userland. In this case sys_pause() is buggy as well and should be
    fixed.

    I do not know what was the original rationality behind PTRACE_KILL.
    The man page is simply wrong and afaics it was always wrong. Imho
    it should be deprecated, or may be it should do send_sig(SIGKILL)
    as Denys suggests, but in any case I do not think that the current
    behaviour was intentional.

    Note: there is another problem, ptrace_resume() changes ->exit_code
    and this can race with SIGKILL too. Eventually we should change ptrace
    to not use ->exit_code.

    Signed-off-by: Oleg Nesterov

    Oleg Nesterov
     

25 May, 2011

9 commits

  • …el/git/tip/linux-2.6-tip

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    posix-timers: RCU conversion

    Linus Torvalds
     
  • On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
    the static log buffer overflows before the larger one specified by the
    log_buf_len param is allocated. Minimize the overflow by allocating the
    new log buffer as soon as possible.

    On kernels without memblock, a later call to setup_log_buf from
    kernel/init.c is the fallback.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
    Signed-off-by: Mike Travis
    Cc: Yinghai Lu
    Cc: "H. Peter Anvin"
    Cc: Jack Steiner
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Travis
     
  • Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the
    cpu count is large.

    Setting smp affinity to cpus 256 to 263 would be:

    echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity

    instead of:

    echo 256-263 > smp_affinity_list

    Think about what it looks like for cpus around say, 4088 to 4095.

    We already have many alternate "list" interfaces:

    /sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
    /sys/devices/system/cpu/cpuX/topology/thread_siblings_list
    /sys/devices/system/cpu/cpuX/topology/core_siblings_list
    /sys/devices/system/node/nodeX/cpulist
    /sys/devices/pci***/***/local_cpulist

    Add a companion interface, smp_affinity_list to use cpu lists instead of
    cpu maps. This conforms to other companion interfaces where both a map
    and a list interface exists.

    This required adding a bitmap_parselist_user() function in a manner
    similar to the bitmap_parse_user() function.

    [akpm@linux-foundation.org: make __bitmap_parselist() static]
    Signed-off-by: Mike Travis
    Cc: Thomas Gleixner
    Cc: Jack Steiner
    Cc: Lee Schermerhorn
    Cc: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Travis
     
  • cpumask_t is very big struct and cpu_vm_mask is placed wrong position.
    It might lead to reduce cache hit ratio.

    This patch has two change.
    1) Move the place of cpumask into last of mm_struct. Because usually cpumask
    is accessed only front bits when the system has cpu-hotplug capability
    2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory
    footprint if cpumask_size() will use nr_cpumask_bits properly in future.

    In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var.
    It may help to detect out of tree cpu_vm_mask users.

    This patch has no functional change.

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: KOSAKI Motohiro
    Cc: David Howells
    Cc: Koichi Yasutake
    Cc: Hugh Dickins
    Cc: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Straightforward conversion of i_mmap_lock to a mutex.

    Signed-off-by: Peter Zijlstra
    Acked-by: Hugh Dickins
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Hugh says:
    "The only significant loser, I think, would be page reclaim (when
    concurrent with truncation): could spin for a long time waiting for
    the i_mmap_mutex it expects would soon be dropped? "

    Counter points:
    - cpu contention makes the spin stop (need_resched())
    - zap pages should be freeing pages at a higher rate than reclaim
    ever can

    I think the simplification of the truncate code is definitely worth it.

    Effectively reverts: 2aa15890f3c ("mm: prevent concurrent
    unmap_mapping_range() on the same inode") and takes out the code that
    caused its problem.

    Signed-off-by: Peter Zijlstra
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: Hugh Dickins
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • In order to convert i_mmap_lock to a mutex we need a mutex equivalent to
    spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation.

    As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of
    the locking pattern where an outer lock serializes the acquisition order
    of nested locks. That is, if every time you lock multiple locks A, say A1
    and A2 you first acquire N, the order of acquiring A1 and A2 is
    irrelevant.

    Signed-off-by: Peter Zijlstra
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (43 commits)
    TOMOYO: Fix wrong domainname validation.
    SELINUX: add /sys/fs/selinux mount point to put selinuxfs
    CRED: Fix load_flat_shared_library() to initialise bprm correctly
    SELinux: introduce path_has_perm
    flex_array: allow 0 length elements
    flex_arrays: allow zero length flex arrays
    flex_array: flex_array_prealloc takes a number of elements, not an end
    SELinux: pass last path component in may_create
    SELinux: put name based create rules in a hashtable
    SELinux: generic hashtab entry counter
    SELinux: calculate and print hashtab stats with a generic function
    SELinux: skip filename trans rules if ttype does not match parent dir
    SELinux: rename filename_compute_type argument to *type instead of *con
    SELinux: fix comment to state filename_compute_type takes an objname not a qstr
    SMACK: smack_file_lock can use the struct path
    LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH
    LSM: split LSM_AUDIT_DATA_FS into _PATH and _INODE
    SELINUX: Make selinux cache VFS RCU walks safe
    SECURITY: Move exec_permission RCU checks into security modules
    SELinux: security_read_policy should take a size_t not ssize_t
    ...

    Linus Torvalds
     
  • * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: Unify input section names
    percpu: Avoid extra NOP in percpu_cmpxchg16b_double
    percpu: Cast away printk format warning
    percpu: Always align percpu output section to PAGE_SIZE

    Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun

    Linus Torvalds
     

24 May, 2011

10 commits

  • James Morris
     
  • Ben Nagy reported a scalability problem with KVM/QEMU that hit very hard
    a single spinlock (idr_lock) in posix-timers code, on its 48 core
    machine.

    Even on a 16 cpu machine (2x4x2), a single test can show 98% of cpu time
    used in ticket_spin_lock, from lock_timer

    Ref: http://www.spinics.net/lists/kvm/msg51526.html

    Switching to RCU is quite easy, IDR being already RCU ready. idr_lock
    should be locked only for an insert/delete, not a lookup.

    Benchmark on a 2x4x2 machine, 16 processes calling timer_gettime().

    Before :

    real 1m18.669s
    user 0m1.346s
    sys 1m17.180s

    After :

    real 0m3.296s
    user 0m1.366s
    sys 0m1.926s

    Reported-by: Ben Nagy
    Signed-off-by: Eric Dumazet
    Tested-by: Ben Nagy
    Cc: Oleg Nesterov
    Cc: Avi Kivity
    Cc: John Stultz
    Cc: Richard Cochran
    Cc: Paul E. McKenney
    Cc: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Eric Dumazet
     
  • Tejun Heo
     
  • …l/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf tools: Fix sample type size calculation in 32 bits archs
    profile: Use vzalloc() rather than vmalloc() & memset()

    Linus Torvalds
     
  • We try to enforce it by using -Wstrict-prototypes, but apparently they
    sometimes get through. Introduced by 4eec42f39204 ("watchdog: Change
    the default timeout and configure nmi watchdog period based").

    Reported-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Increase SCHED_LOAD_SCALE resolution
    sched: Introduce SCHED_POWER_SCALE to scale cpu_power calculations
    sched: Cleanup set_load_weight()

    Linus Torvalds
     
  • * 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits)
    staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_
    staging:iio: Trivial kconfig reorganization and uniformity improvements.
    staging:iio:documenation partial update.
    staging:iio: use pollfunc allocation helpers in remaining drivers.
    staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out.
    staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev.
    staging:iio:meter:ade7758: Use private data space from iio_allocate_device
    staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value.
    staging:iio: ring core cleanups + check if read_last available in lis3l02dq
    staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name.
    staging:iio: poll func allocation clean up.
    staging:iio:ad7780 trivial unused header cleanup.
    staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes
    staging:iio:adc:AD7780: Convert to new channel registration method
    staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv()
    staging:iio:adc: AD7606: Consitently use indio_dev
    staging:iio: Rip out helper for software rings.
    staging:iio:adc:AD7298: Use private data space from iio_allocate_device
    staging:iio: rationalization of different buffer implementation hooks.
    staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev.
    ...

    Fix up trivial conflicts in
    - drivers/staging/intel_sst/intelmid.c: patches applied in both branches
    - drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling
    - drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion

    Linus Torvalds
     
  • * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    hrtimers: Reorder clock bases
    hrtimers: Avoid touching inactive timer bases
    hrtimers: Make struct hrtimer_cpu_base layout less stupid
    timerfd: Manage cancelable timers in timerfd
    clockevents: Move C3 stop test outside lock
    alarmtimer: Drop device refcount after rtc_open()
    alarmtimer: Check return value of class_find_device()
    timerfd: Allow timers to be cancelled when clock was set
    hrtimers: Prepare for cancel on clock was set timers

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf tools: Fix sample size bit operations
    perf tools: Fix ommitted mmap data update on remap
    watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
    watchdog: Disable watchdog when thresh is zero
    watchdog: Only disable/enable watchdog if neccessary
    watchdog: Fix rounding bug in get_sample_period()
    perf tools: Propagate event parse error handling
    perf tools: Robustify dynamic sample content fetch
    perf tools: Pre-check sample size before parsing
    perf tools: Move evlist sample helpers to evlist area
    perf tools: Remove junk code in mmap size handling
    perf tools: Check we are able to read the event size on mmap

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    b43: fix comment typo reqest -> request
    Haavard Skinnemoen has left Atmel
    cris: typo in mach-fs Makefile
    Kconfig: fix copy/paste-ism for dell-wmi-aio driver
    doc: timers-howto: fix a typo ("unsgined")
    perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
    md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
    treewide: fix a few typos in comments
    regulator: change debug statement be consistent with the style of the rest
    Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
    audit: acquire creds selectively to reduce atomic op overhead
    rtlwifi: don't touch with treewide double semicolon removal
    treewide: cleanup continuations and remove logging message whitespace
    ath9k_hw: don't touch with treewide double semicolon removal
    include/linux/leds-regulator.h: fix syntax in example code
    tty: fix typo in descripton of tty_termios_encode_baud_rate
    xtensa: remove obsolete BKL kernel option from defconfig
    m68k: fix comment typo 'occcured'
    arch:Kconfig.locks Remove unused config option.
    treewide: remove extra semicolons
    ...

    Linus Torvalds
     

23 May, 2011

7 commits

  • Merge reason: this commit was queued up quite some time ago but was
    forgotten about.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The ordering of the clock bases is historical due to the
    CLOCK_REALTIME and CLOCK_MONOTONIC constants. Now the hrtimer bases
    have their own enumeration due to the gap between CLOCK_MONOTONIC and
    CLOCK_BOOTTIME. So we can be more clever as most timers end up on the
    CLOCK_MONOTONIC base due to the virtue of POSIX declaring that
    relative CLOCK_REALTIME timers are not affected by time changes. In
    desktop environments this is slowly changing as applications switch to
    absolute timers, but I've observed empty CLOCK_REALTIME bases often
    enough. There is no performance penalty or overhead when
    CLOCK_REALTIME timers are active, but in case they are not we don't
    skip over a full cache line.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     
  • Instead of iterating over all possible timer bases avoid it by marking
    the active bases in the cpu base.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     
  • Peter is concerned about the extra scan of CLOCK_REALTIME_COS in the
    timer interrupt. Yes, I did not think about it, because the solution
    was so elegant. I didn't like the extra list in timerfd when it was
    proposed some time ago, but with a rcu based list the list walk it's
    less horrible than the original global lock, which was held over the
    list iteration.

    Requested-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Peter Zijlstra

    Thomas Gleixner
     
  • Before the conversion of the NMI watchdog to perf event, the
    watchdog timeout was 5 seconds. Now it is 60 seconds. For my
    particular application, netbooks, 5 seconds was a better
    timeout. With a short timeout, we catch faults earlier and are
    able to send back a panic. With a 60 second timeout, the user is
    unlikely to wait and will instead hit the power button, causing
    us to lose the panic info.

    This change configures the NMI period to watchdog_thresh and
    sets the softlockup_thresh to watchdog_thresh * 2. In addition,
    watchdog_thresh was reduced to 10 seconds as suggested by Ingo
    Molnar.

    Signed-off-by: Mandeep Singh Baines
    Cc: Marcin Slusarz
    Cc: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/1306127423-3347-4-git-send-email-msb@chromium.org
    Signed-off-by: Ingo Molnar
    LKML-Reference:

    Mandeep Singh Baines
     
  • This restores the previous behavior of softlock_thresh.

    Currently, setting watchdog_thresh to zero causes the watchdog
    kthreads to consume a lot of CPU.

    In addition, the logic of proc_dowatchdog_thresh and
    proc_dowatchdog_enabled has been factored into proc_dowatchdog.

    Signed-off-by: Mandeep Singh Baines
    Cc: Marcin Slusarz
    Cc: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/1306127423-3347-3-git-send-email-msb@chromium.org
    Signed-off-by: Ingo Molnar
    LKML-Reference:

    Mandeep Singh Baines
     
  • Don't take any action on an unsuccessful write to /proc.

    Signed-off-by: Mandeep Singh Baines
    Cc: Marcin Slusarz
    Cc: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/1306127423-3347-2-git-send-email-msb@chromium.org
    Signed-off-by: Ingo Molnar

    Mandeep Singh Baines