27 May, 2011
7 commits
-
profile_hits() has a common check for prof_on and prof_buffer regardless
of SMP or !SMP. So, remove some duplicate code by splitting profile_hits
into two.[akpm@linux-foundation.org: make do_profile_hits static]
Signed-off-by: Rakib Mullick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
This was because exe_file was needed only for /proc//exe. Since we
will need the exe_file functionality also for core dumps (so core name can
contain full binary path), built this functionality always into the
kernel.To achieve that move that out of proc FS to the kernel/ where in fact it
should belong. By doing that we can make dup_mm_exe_file static. Also we
can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.Signed-off-by: Jiri Slaby
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and
leads to some problems:* cgroup creation is out-of-control
* cgroup name can conflict when pids are looping
* it is not possible to have a single process handling a lot of
namespaces without falling in a exponential creation time
* we may want to create a namespace without creating a cgroupThe ns_cgroup was replaced by a compatibility flag 'clone_children',
where a newly created cgroup will copy the parent cgroup values.
The userspace has to manually create a cgroup and add a task to
the 'tasks' file.This patch removes the ns_cgroup as suggested in the following thread:
https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html
The 'cgroup_clone' function is removed because it is no longer used.
This is a userspace-visible change. Commit 45531757b45c ("cgroup: notify
ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a
printk warning users that the feature is planned for removal. Since that
time we have heard from XXX users who were affected by this.Signed-off-by: Daniel Lezcano
Signed-off-by: Serge E. Hallyn
Cc: Eric W. Biederman
Cc: Jamal Hadi Salim
Reviewed-by: Li Zefan
Acked-by: Paul Menage
Acked-by: Matt Helsley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Convert cgroup_attach_proc to use flex_array.
The cgroup_attach_proc implementation requires a pre-allocated array to
store task pointers to atomically move a thread-group, but asking for a
monolithic array with kmalloc() may be unreliable for very large groups.
Using flex_array provides the same functionality with less risk of
failure.This is a post-patch for cgroup-procs-write.patch.
Signed-off-by: Ben Blum
Cc: "Eric W. Biederman"
Cc: Li Zefan
Cc: Matt Helsley
Reviewed-by: Paul Menage
Cc: Oleg Nesterov
Cc: David Rientjes
Cc: Miao Xie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make procs file writable to move all threads by tgid at once.
Add functionality that enables users to move all threads in a threadgroup
at once to a cgroup by writing the tgid to the 'cgroup.procs' file. This
current implementation makes use of a per-threadgroup rwsem that's taken
for reading in the fork() path to prevent newly forking threads within the
threadgroup from "escaping" while the move is in progress.Signed-off-by: Ben Blum
Cc: "Eric W. Biederman"
Cc: Li Zefan
Cc: Matt Helsley
Reviewed-by: Paul Menage
Cc: Oleg Nesterov
Cc: David Rientjes
Cc: Miao Xie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add cgroup subsystem callbacks for per-thread attachment in atomic contexts
Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
for cgroups's subsystem interface. Unlike can_attach and attach, these
are for per-thread operations, to be called potentially many times when
attaching an entire threadgroup.Also, the old "bool threadgroup" interface is removed, as replaced by
this. All subsystems are modified for the new interface - of note is
cpuset, which requires from/to nodemasks for attach to be globally scoped
(though per-cpuset would work too) to persist from its pre_attach to
attach_task and attach.This is a pre-patch for cgroup-procs-writable.patch.
Signed-off-by: Ben Blum
Cc: "Eric W. Biederman"
Cc: Li Zefan
Cc: Matt Helsley
Reviewed-by: Paul Menage
Cc: Oleg Nesterov
Cc: David Rientjes
Cc: Miao Xie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Adds functionality to read/write lock CLONE_THREAD fork()ing per-threadgroup
Add an rwsem that lives in a threadgroup's signal_struct that's taken for
reading in the fork path, under CONFIG_CGROUPS. If another part of the
kernel later wants to use such a locking mechanism, the CONFIG_CGROUPS
ifdefs should be changed to a higher-up flag that CGROUPS and the other
system would both depend on.This is a pre-patch for cgroup-procs-write.patch.
Signed-off-by: Ben Blum
Cc: "Eric W. Biederman"
Cc: Li Zefan
Cc: Matt Helsley
Reviewed-by: Paul Menage
Cc: Oleg Nesterov
Cc: David Rientjes
Cc: Miao Xie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 May, 2011
7 commits
-
commit 4b06042(bitmap, irq: add smp_affinity_list interface to
/proc/irq) causes the following warning:[ 274.239500] WARNING: at fs/proc/generic.c:850 remove_proc_entry+0x24c/0x27a()
[ 274.251761] remove_proc_entry: removing non-empty directory 'irq/184',
leaking at least 'smp_affinity_list'Remove the new file in the exit path.
Signed-off-by: Yinghai Lu
Cc: Mike Travis
Link: http://lkml.kernel.org/r/4DDDE094.6050505@kernel.org
Signed-off-by: Thomas Gleixner -
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
net: fix get_net_ns_by_fd for !CONFIG_NET_NS
ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
ns: Declare sys_setns in syscalls.h
net: Allow setting the network namespace by fd
ns proc: Add support for the ipc namespace
ns proc: Add support for the uts namespace
ns proc: Add support for the network namespace.
ns: Introduce the setns syscall
ns: proc files for namespace naming policy. -
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc:
signal: sys_pause() should check signal_pending()
ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread -
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (26 commits)
arch/tile: prefer "tilepro" as the name of the 32-bit architecture
compat: include aio_abi.h for aio_context_t
arch/tile: cleanups for tilegx compat mode
arch/tile: allocate PCI IRQs later in boot
arch/tile: support signal "exception-trace" hook
arch/tile: use better definitions of xchg() and cmpxchg()
include/linux/compat.h: coding-style fixes
tile: add an RTC driver for the Tilera hypervisor
arch/tile: finish enabling support for TILE-Gx 64-bit chip
compat: fixes to allow working with tile arch
arch/tile: update defconfig file to something more useful
tile: do_hardwall_trap: do not play with task->sighand
tile: replace mm->cpu_vm_mask with mm_cpumask()
tile,mn10300: add device parameter to dma_cache_sync()
audit: support the "standard"
arch/tile: clarify flush_buffer()/finv_buffer() function names
arch/tile: kernel-related cleanups from removing static page size
arch/tile: various header improvements for building drivers
arch/tile: disable GX prefetcher during cache flush
arch/tile: tolerate disabling CONFIG_BLK_DEV_INITRD
... -
commit 9ec2690758a5 ("timerfd: Manage cancelable timers in timerfd")
introduced a CONFIG_HIGHRES_TIMERS (should be CONFIG_HIGH_RES_TIMERS)
typo, which caused applications depending on CLOCK_REALTIME timers to
become sluggy due to the fact that the time base of the realtime
timers was not updated when the wall clock time was set.This causes anything from 100% CPU use for some applications to odd
delays and hickups.Reported-bisected-and-tested-by: Anca Emanuel
Tested-by: Linus Torvalds
Fatfingered-by: Thomas Gleixner
Signed-off-by: Thomas Gleixner
Signed-off-by: Linus Torvalds -
ERESTART* is always wrong without TIF_SIGPENDING. Teach sys_pause()
to handle the spurious wakeup correctly.Signed-off-by: Oleg Nesterov
-
It is not clear why ptrace_resume() does wake_up_process(). Unless the
caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use
wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do
not need the extra and potentionally spurious wakeup.If the caller is PTRACE_KILL, wake_up_process() is even more wrong.
The tracee can sleep in any state in any place, and if we have a buggy
code which doesn't handle a spurious wakeup correctly PTRACE_KILL can
be used to exploit it. For example:int main(void)
{
int child, status;child = fork();
if (!child) {
int ret;assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
ret = pause();
printf("pause: %d %m\n", ret);return 0x23;
}sleep(1);
assert(ptrace(PTRACE_KILL, child, 0,0) == 0);assert(child == wait(&status));
printf("wait: %x\n", status);return 0;
}prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the
userland. In this case sys_pause() is buggy as well and should be
fixed.I do not know what was the original rationality behind PTRACE_KILL.
The man page is simply wrong and afaics it was always wrong. Imho
it should be deprecated, or may be it should do send_sig(SIGKILL)
as Denys suggests, but in any case I do not think that the current
behaviour was intentional.Note: there is another problem, ptrace_resume() changes ->exit_code
and this can race with SIGKILL too. Eventually we should change ptrace
to not use ->exit_code.Signed-off-by: Oleg Nesterov
25 May, 2011
9 commits
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
posix-timers: RCU conversion -
On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
the static log buffer overflows before the larger one specified by the
log_buf_len param is allocated. Minimize the overflow by allocating the
new log buffer as soon as possible.On kernels without memblock, a later call to setup_log_buf from
kernel/init.c is the fallback.[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
Signed-off-by: Mike Travis
Cc: Yinghai Lu
Cc: "H. Peter Anvin"
Cc: Jack Steiner
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the
cpu count is large.Setting smp affinity to cpus 256 to 263 would be:
echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity
instead of:
echo 256-263 > smp_affinity_list
Think about what it looks like for cpus around say, 4088 to 4095.
We already have many alternate "list" interfaces:
/sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
/sys/devices/system/cpu/cpuX/topology/thread_siblings_list
/sys/devices/system/cpu/cpuX/topology/core_siblings_list
/sys/devices/system/node/nodeX/cpulist
/sys/devices/pci***/***/local_cpulistAdd a companion interface, smp_affinity_list to use cpu lists instead of
cpu maps. This conforms to other companion interfaces where both a map
and a list interface exists.This required adding a bitmap_parselist_user() function in a manner
similar to the bitmap_parse_user() function.[akpm@linux-foundation.org: make __bitmap_parselist() static]
Signed-off-by: Mike Travis
Cc: Thomas Gleixner
Cc: Jack Steiner
Cc: Lee Schermerhorn
Cc: Andy Shevchenko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
cpumask_t is very big struct and cpu_vm_mask is placed wrong position.
It might lead to reduce cache hit ratio.This patch has two change.
1) Move the place of cpumask into last of mm_struct. Because usually cpumask
is accessed only front bits when the system has cpu-hotplug capability
2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory
footprint if cpumask_size() will use nr_cpumask_bits properly in future.In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var.
It may help to detect out of tree cpu_vm_mask users.This patch has no functional change.
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro
Cc: David Howells
Cc: Koichi Yasutake
Cc: Hugh Dickins
Cc: Chris Metcalf
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Straightforward conversion of i_mmap_lock to a mutex.
Signed-off-by: Peter Zijlstra
Acked-by: Hugh Dickins
Cc: Benjamin Herrenschmidt
Cc: David Miller
Cc: Martin Schwidefsky
Cc: Russell King
Cc: Paul Mundt
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Tony Luck
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: KOSAKI Motohiro
Cc: Nick Piggin
Cc: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Hugh says:
"The only significant loser, I think, would be page reclaim (when
concurrent with truncation): could spin for a long time waiting for
the i_mmap_mutex it expects would soon be dropped? "Counter points:
- cpu contention makes the spin stop (need_resched())
- zap pages should be freeing pages at a higher rate than reclaim
ever canI think the simplification of the truncate code is definitely worth it.
Effectively reverts: 2aa15890f3c ("mm: prevent concurrent
unmap_mapping_range() on the same inode") and takes out the code that
caused its problem.Signed-off-by: Peter Zijlstra
Reviewed-by: KAMEZAWA Hiroyuki
Cc: Hugh Dickins
Cc: Benjamin Herrenschmidt
Cc: David Miller
Cc: Martin Schwidefsky
Cc: Russell King
Cc: Paul Mundt
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Tony Luck
Cc: Mel Gorman
Cc: KOSAKI Motohiro
Cc: Nick Piggin
Cc: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In order to convert i_mmap_lock to a mutex we need a mutex equivalent to
spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation.As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of
the locking pattern where an outer lock serializes the acquisition order
of nested locks. That is, if every time you lock multiple locks A, say A1
and A2 you first acquire N, the order of acquiring A1 and A2 is
irrelevant.Signed-off-by: Peter Zijlstra
Cc: Benjamin Herrenschmidt
Cc: David Miller
Cc: Martin Schwidefsky
Cc: Russell King
Cc: Paul Mundt
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Tony Luck
Cc: KAMEZAWA Hiroyuki
Cc: Hugh Dickins
Cc: Mel Gorman
Cc: KOSAKI Motohiro
Cc: Nick Piggin
Cc: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
…s/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (43 commits)
TOMOYO: Fix wrong domainname validation.
SELINUX: add /sys/fs/selinux mount point to put selinuxfs
CRED: Fix load_flat_shared_library() to initialise bprm correctly
SELinux: introduce path_has_perm
flex_array: allow 0 length elements
flex_arrays: allow zero length flex arrays
flex_array: flex_array_prealloc takes a number of elements, not an end
SELinux: pass last path component in may_create
SELinux: put name based create rules in a hashtable
SELinux: generic hashtab entry counter
SELinux: calculate and print hashtab stats with a generic function
SELinux: skip filename trans rules if ttype does not match parent dir
SELinux: rename filename_compute_type argument to *type instead of *con
SELinux: fix comment to state filename_compute_type takes an objname not a qstr
SMACK: smack_file_lock can use the struct path
LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH
LSM: split LSM_AUDIT_DATA_FS into _PATH and _INODE
SELINUX: Make selinux cache VFS RCU walks safe
SECURITY: Move exec_permission RCU checks into security modules
SELinux: security_read_policy should take a size_t not ssize_t
... -
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Unify input section names
percpu: Avoid extra NOP in percpu_cmpxchg16b_double
percpu: Cast away printk format warning
percpu: Always align percpu output section to PAGE_SIZEFix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
24 May, 2011
10 commits
-
Ben Nagy reported a scalability problem with KVM/QEMU that hit very hard
a single spinlock (idr_lock) in posix-timers code, on its 48 core
machine.Even on a 16 cpu machine (2x4x2), a single test can show 98% of cpu time
used in ticket_spin_lock, from lock_timerRef: http://www.spinics.net/lists/kvm/msg51526.html
Switching to RCU is quite easy, IDR being already RCU ready. idr_lock
should be locked only for an insert/delete, not a lookup.Benchmark on a 2x4x2 machine, 16 processes calling timer_gettime().
Before :
real 1m18.669s
user 0m1.346s
sys 1m17.180sAfter :
real 0m3.296s
user 0m1.366s
sys 0m1.926sReported-by: Ben Nagy
Signed-off-by: Eric Dumazet
Tested-by: Ben Nagy
Cc: Oleg Nesterov
Cc: Avi Kivity
Cc: John Stultz
Cc: Richard Cochran
Cc: Paul E. McKenney
Cc: Andrew Morton
Signed-off-by: Thomas Gleixner -
…l/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix sample type size calculation in 32 bits archs
profile: Use vzalloc() rather than vmalloc() & memset() -
We try to enforce it by using -Wstrict-prototypes, but apparently they
sometimes get through. Introduced by 4eec42f39204 ("watchdog: Change
the default timeout and configure nmi watchdog period based").Reported-by: Stephen Rothwell
Signed-off-by: Linus Torvalds -
…/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Increase SCHED_LOAD_SCALE resolution
sched: Introduce SCHED_POWER_SCALE to scale cpu_power calculations
sched: Cleanup set_load_weight() -
* 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits)
staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_
staging:iio: Trivial kconfig reorganization and uniformity improvements.
staging:iio:documenation partial update.
staging:iio: use pollfunc allocation helpers in remaining drivers.
staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out.
staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev.
staging:iio:meter:ade7758: Use private data space from iio_allocate_device
staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value.
staging:iio: ring core cleanups + check if read_last available in lis3l02dq
staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name.
staging:iio: poll func allocation clean up.
staging:iio:ad7780 trivial unused header cleanup.
staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes
staging:iio:adc:AD7780: Convert to new channel registration method
staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv()
staging:iio:adc: AD7606: Consitently use indio_dev
staging:iio: Rip out helper for software rings.
staging:iio:adc:AD7298: Use private data space from iio_allocate_device
staging:iio: rationalization of different buffer implementation hooks.
staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev.
...Fix up trivial conflicts in
- drivers/staging/intel_sst/intelmid.c: patches applied in both branches
- drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling
- drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion -
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: Reorder clock bases
hrtimers: Avoid touching inactive timer bases
hrtimers: Make struct hrtimer_cpu_base layout less stupid
timerfd: Manage cancelable timers in timerfd
clockevents: Move C3 stop test outside lock
alarmtimer: Drop device refcount after rtc_open()
alarmtimer: Check return value of class_find_device()
timerfd: Allow timers to be cancelled when clock was set
hrtimers: Prepare for cancel on clock was set timers -
…l/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix sample size bit operations
perf tools: Fix ommitted mmap data update on remap
watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
watchdog: Disable watchdog when thresh is zero
watchdog: Only disable/enable watchdog if neccessary
watchdog: Fix rounding bug in get_sample_period()
perf tools: Propagate event parse error handling
perf tools: Robustify dynamic sample content fetch
perf tools: Pre-check sample size before parsing
perf tools: Move evlist sample helpers to evlist area
perf tools: Remove junk code in mmap size handling
perf tools: Check we are able to read the event size on mmap -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
b43: fix comment typo reqest -> request
Haavard Skinnemoen has left Atmel
cris: typo in mach-fs Makefile
Kconfig: fix copy/paste-ism for dell-wmi-aio driver
doc: timers-howto: fix a typo ("unsgined")
perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
treewide: fix a few typos in comments
regulator: change debug statement be consistent with the style of the rest
Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
audit: acquire creds selectively to reduce atomic op overhead
rtlwifi: don't touch with treewide double semicolon removal
treewide: cleanup continuations and remove logging message whitespace
ath9k_hw: don't touch with treewide double semicolon removal
include/linux/leds-regulator.h: fix syntax in example code
tty: fix typo in descripton of tty_termios_encode_baud_rate
xtensa: remove obsolete BKL kernel option from defconfig
m68k: fix comment typo 'occcured'
arch:Kconfig.locks Remove unused config option.
treewide: remove extra semicolons
...
23 May, 2011
7 commits
-
Merge reason: this commit was queued up quite some time ago but was
forgotten about.Signed-off-by: Ingo Molnar
-
The ordering of the clock bases is historical due to the
CLOCK_REALTIME and CLOCK_MONOTONIC constants. Now the hrtimer bases
have their own enumeration due to the gap between CLOCK_MONOTONIC and
CLOCK_BOOTTIME. So we can be more clever as most timers end up on the
CLOCK_MONOTONIC base due to the virtue of POSIX declaring that
relative CLOCK_REALTIME timers are not affected by time changes. In
desktop environments this is slowly changing as applications switch to
absolute timers, but I've observed empty CLOCK_REALTIME bases often
enough. There is no performance penalty or overhead when
CLOCK_REALTIME timers are active, but in case they are not we don't
skip over a full cache line.Signed-off-by: Thomas Gleixner
Reviewed-by: Peter Zijlstra -
Instead of iterating over all possible timer bases avoid it by marking
the active bases in the cpu base.Signed-off-by: Thomas Gleixner
Reviewed-by: Peter Zijlstra -
Peter is concerned about the extra scan of CLOCK_REALTIME_COS in the
timer interrupt. Yes, I did not think about it, because the solution
was so elegant. I didn't like the extra list in timerfd when it was
proposed some time ago, but with a rcu based list the list walk it's
less horrible than the original global lock, which was held over the
list iteration.Requested-by: Peter Zijlstra
Signed-off-by: Thomas Gleixner
Reviewed-by: Peter Zijlstra -
Before the conversion of the NMI watchdog to perf event, the
watchdog timeout was 5 seconds. Now it is 60 seconds. For my
particular application, netbooks, 5 seconds was a better
timeout. With a short timeout, we catch faults earlier and are
able to send back a panic. With a 60 second timeout, the user is
unlikely to wait and will instead hit the power button, causing
us to lose the panic info.This change configures the NMI period to watchdog_thresh and
sets the softlockup_thresh to watchdog_thresh * 2. In addition,
watchdog_thresh was reduced to 10 seconds as suggested by Ingo
Molnar.Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-4-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar
LKML-Reference: -
This restores the previous behavior of softlock_thresh.
Currently, setting watchdog_thresh to zero causes the watchdog
kthreads to consume a lot of CPU.In addition, the logic of proc_dowatchdog_thresh and
proc_dowatchdog_enabled has been factored into proc_dowatchdog.Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-3-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar
LKML-Reference: -
Don't take any action on an unsuccessful write to /proc.
Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-2-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar