24 Sep, 2009

4 commits

  • * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits)
    HWPOISON: Enable error_remove_page on btrfs
    HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs
    HWPOISON: Add madvise() based injector for hardware poisoned pages v4
    HWPOISON: Enable error_remove_page for NFS
    HWPOISON: Enable .remove_error_page for migration aware file systems
    HWPOISON: The high level memory error handler in the VM v7
    HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process
    HWPOISON: shmem: call set_page_dirty() with locked page
    HWPOISON: Define a new error_remove_page address space op for async truncation
    HWPOISON: Add invalidate_inode_page
    HWPOISON: Refactor truncate to allow direct truncating of page v2
    HWPOISON: check and isolate corrupted free pages v2
    HWPOISON: Handle hardware poisoned pages in try_to_unmap
    HWPOISON: Use bitmask/action code for try_to_unmap behaviour
    HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2
    HWPOISON: Add poison check to page fault handling
    HWPOISON: Add basic support for poisoned pages in fault handler v3
    HWPOISON: Add new SIGBUS error codes for hardware poison signals
    HWPOISON: Add support for poison swap entries v2
    HWPOISON: Export some rmap vma locking to outside world
    ...

    Linus Torvalds
     
  • It's unused.

    It isn't needed -- read or write flag is already passed and sysctl
    shouldn't care about the rest.

    It _was_ used in two places at arch/frv for some reason.

    Signed-off-by: Alexey Dobriyan
    Cc: David Howells
    Cc: "Eric W. Biederman"
    Cc: Al Viro
    Cc: Ralf Baechle
    Cc: Martin Schwidefsky
    Cc: Ingo Molnar
    Cc: "David S. Miller"
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Introduce core pipe limiting sysctl.

    Since we can dump cores to pipe, rather than directly to the filesystem,
    we create a condition in which a user can create a very high load on the
    system simply by running bad applications.

    If the pipe reader specified in core_pattern is poorly written, we can
    have lots of ourstandig resources and processes in the system.

    This sysctl introduces an ability to limit that resource consumption.
    core_pipe_limit defines how many in-flight dumps may be run in parallel,
    dumps beyond this value are skipped and a note is made in the kernel log.
    A special value of 0 in core_pipe_limit denotes unlimited core dumps may
    be handled (this is the default value).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Neil Horman
    Reported-by: Earl Chew
    Cc: Oleg Nesterov
    Cc: Andi Kleen
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Horman
     
  • * remove asm/atomic.h inclusion from linux/utsname.h --
    not needed after kref conversion
    * remove linux/utsname.h inclusion from files which do not need it

    NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
    due to some personality stuff it _is_ needed -- cowardly leave ELF-related
    headers and files alone.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

23 Sep, 2009

1 commit

  • When syslog is not possible, at the same time there's no serial/net
    console available, it will be hard to read the printk messages. For
    example oops/panic/warning messages in shutdown phase.

    Add a printk delay feature, we can make each printk message delay some
    milliseconds.

    Setting the delay by proc/sysctl interface: /proc/sys/kernel/printk_delay

    The value range from 0 - 10000, default value is 0

    [akpm@linux-foundation.org: fix a few things]
    Signed-off-by: Dave Young
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Young
     

21 Sep, 2009

1 commit

  • Bye-bye Performance Counters, welcome Performance Events!

    In the past few months the perfcounters subsystem has grown out its
    initial role of counting hardware events, and has become (and is
    becoming) a much broader generic event enumeration, reporting, logging,
    monitoring, analysis facility.

    Naming its core object 'perf_counter' and naming the subsystem
    'perfcounters' has become more and more of a misnomer. With pending
    code like hw-breakpoints support the 'counter' name is less and
    less appropriate.

    All in one, we've decided to rename the subsystem to 'performance
    events' and to propagate this rename through all fields, variables
    and API names. (in an ABI compatible fashion)

    The word 'event' is also a bit shorter than 'counter' - which makes
    it slightly more convenient to write/handle as well.

    Thanks goes to Stephane Eranian who first observed this misnomer and
    suggested a rename.

    User-space tooling and ABI compatibility is not affected - this patch
    should be function-invariant. (Also, defconfigs were not touched to
    keep the size down.)

    This patch has been generated via the following script:

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

    sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

    for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
    done

    FILES=$(find . -name perf_event.*)

    sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

    ... to keep it as correct as possible. This script can also be
    used by anyone who has pending perfcounters patches - it converts
    a Linux kernel tree over to the new naming. We tried to time this
    change to the point in time where the amount of pending patches
    is the smallest: the end of the merge window.

    Namespace clashes were fixed up in a preparatory patch - and some
    stylistic fallout will be fixed up in a subsequent patch.

    ( NOTE: 'counters' are still the proper terminology when we deal
    with hardware registers - and these sed scripts are a bit
    over-eager in renaming them. I've undone some of that, but
    in case there's something left where 'counter' would be
    better than 'event' we can undo that on an individual basis
    instead of touching an otherwise nicely automated patch. )

    Suggested-by: Stephane Eranian
    Acked-by: Peter Zijlstra
    Acked-by: Paul Mackerras
    Reviewed-by: Arjan van de Ven
    Cc: Mike Galbraith
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Kyle McMartin
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

16 Sep, 2009

2 commits

  • Add the high level memory handler that poisons pages
    that got corrupted by hardware (typically by a two bit flip in a DIMM
    or a cache) on the Linux level. The goal is to prevent everyone
    from accessing these pages in the future.

    This done at the VM level by marking a page hwpoisoned
    and doing the appropriate action based on the type of page
    it is.

    The code that does this is portable and lives in mm/memory-failure.c

    To quote the overview comment:

    High level machine check handler. Handles pages reported by the
    hardware as being corrupted usually due to a 2bit ECC memory or cache
    failure.

    This focuses on pages detected as corrupted in the background.
    When the current CPU tries to consume corruption the currently
    running process can just be killed directly instead. This implies
    that if the error cannot be handled for some reason it's safe to
    just ignore it because no corruption has been consumed yet. Instead
    when that happens another machine check will happen.

    Handles page cache pages in various states. The tricky part
    here is that we can access any page asynchronous to other VM
    users, because memory failures could happen anytime and anywhere,
    possibly violating some of their assumptions. This is why this code
    has to be extremely careful. Generally it tries to use normal locking
    rules, as in get the standard locks, even if that means the
    error handling takes potentially a long time.

    Some of the operations here are somewhat inefficient and have non
    linear algorithmic complexity, because the data structures have not
    been optimized for this case. This is in particular the case
    for the mapping from a vma to a process. Since this case is expected
    to be rare we hope we can get away with this.

    There are in principle two strategies to kill processes on poison:
    - just unmap the data and wait for an actual reference before
    killing
    - kill as soon as corruption is detected.
    Both have advantages and disadvantages and should be used
    in different situations. Right now both are implemented and can
    be switched with a new sysctl vm.memory_failure_early_kill
    The default is early kill.

    The patch does some rmap data structure walking on its own to collect
    processes to kill. This is unusual because normally all rmap data structure
    knowledge is in rmap.c only. I put it here for now to keep
    everything together and rmap knowledge has been seeping out anyways

    Includes contributions from Johannes Weiner, Chris Mason, Fengguang Wu,
    Nick Piggin (who did a lot of great work) and others.

    Cc: npiggin@suse.de
    Cc: riel@redhat.com
    Signed-off-by: Andi Kleen
    Acked-by: Rik van Riel
    Reviewed-by: Hidehiro Kawai

    Andi Kleen
     
  • kernel/built-in.o:(.data+0x17b0): undefined reference to `blk_iopoll_enabled'

    Since the extern declaration makes the compile work, but the actual
    symbol is missing when block/blk-iopoll.o isn't linked in.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

15 Sep, 2009

1 commit

  • * 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits)
    block: use blkdev_issue_discard in blk_ioctl_discard
    Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads
    block: don't assume device has a request list backing in nr_requests store
    block: Optimal I/O limit wrapper
    cfq: choose a new next_req when a request is dispatched
    Seperate read and write statistics of in_flight requests
    aoe: end barrier bios with EOPNOTSUPP
    block: trace bio queueing trial only when it occurs
    block: enable rq CPU completion affinity by default
    cfq: fix the log message after dispatched a request
    block: use printk_once
    cciss: memory leak in cciss_init_one()
    splice: update mtime and atime on files
    block: make blk_iopoll_prep_sched() follow normal 0/1 return convention
    cfq-iosched: get rid of must_alloc flag
    block: use interrupts disabled version of raise_softirq_irqoff()
    block: fix comment in blk-iopoll.c
    block: adjust default budget for blk-iopoll
    block: fix long lines in block/blk-iopoll.c
    block: add blk-iopoll, a NAPI like approach for block devices
    ...

    Linus Torvalds
     

12 Sep, 2009

1 commit

  • …/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits)
    sched: Fix sched::sched_stat_wait tracepoint field
    sched: Disable NEW_FAIR_SLEEPERS for now
    sched: Keep kthreads at default priority
    sched: Re-tune the scheduler latency defaults to decrease worst-case latencies
    sched: Turn off child_runs_first
    sched: Ensure that a child can't gain time over it's parent after fork()
    sched: enable SD_WAKE_IDLE
    sched: Deal with low-load in wake_affine()
    sched: Remove short cut from select_task_rq_fair()
    sched: Turn on SD_BALANCE_NEWIDLE
    sched: Clean up topology.h
    sched: Fix dynamic power-balancing crash
    sched: Remove reciprocal for cpu_power
    sched: Try to deal with low capacity, fix update_sd_power_savings_stats()
    sched: Try to deal with low capacity
    sched: Scale down cpu_power due to RT tasks
    sched: Implement dynamic cpu_power
    sched: Add smt_gain
    sched: Update the cpu_power sum during load-balance
    sched: Add SD_PREFER_SIBLING
    ...

    Linus Torvalds
     

11 Sep, 2009

1 commit

  • This borrows some code from NAPI and implements a polled completion
    mode for block devices. The idea is the same as NAPI - instead of
    doing the command completion when the irq occurs, schedule a dedicated
    softirq in the hopes that we will complete more IO when the iopoll
    handler is invoked. Devices have a budget of commands assigned, and will
    stay in polled mode as long as they continue to consume their budget
    from the iopoll softirq handler. If they do not, the device is set back
    to interrupt completion mode.

    This patch holds the core bits for blk-iopoll, device driver support
    sold separately.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

09 Sep, 2009

1 commit

  • Set child_runs_first default to off.

    It hurts 'optimal' make -j workloads as make jobs
    get preempted by child tasks, reducing parallelism.

    Note, this patch might make existing races in user
    applications more prominent than before - so breakages
    might be bisected to this commit.

    Child-runs-first is broken on SMP to begin with, and we
    already had it off briefly in v2.6.23 so most of the
    offenders ought to be fixed. Would be nice not to revert
    this commit but fix those apps finally ...

    Signed-off-by: Mike Galbraith
    Acked-by: Peter Zijlstra
    LKML-Reference:
    [ made the sysctl independent of CONFIG_SCHED_DEBUG, in case
    people want to work around broken apps. ]
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     

07 Sep, 2009

1 commit


04 Sep, 2009

1 commit

  • Keep an average on the amount of time spend on RT tasks and use
    that fraction to scale down the cpu_power for regular tasks.

    Signed-off-by: Peter Zijlstra
    Tested-by: Andreas Herrmann
    Acked-by: Andreas Herrmann
    Acked-by: Gautham R Shenoy
    Cc: Balbir Singh
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

17 Aug, 2009

1 commit

  • Currently SELinux enforcement of controls on the ability to map low memory
    is determined by the mmap_min_addr tunable. This patch causes SELinux to
    ignore the tunable and instead use a seperate Kconfig option specific to how
    much space the LSM should protect.

    The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
    permissions will always protect the amount of low memory designated by
    CONFIG_LSM_MMAP_MIN_ADDR.

    This allows users who need to disable the mmap_min_addr controls (usual reason
    being they run WINE as a non-root user) to do so and still have SELinux
    controls preventing confined domains (like a web server) from being able to
    map some area of low memory.

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     

29 Jun, 2009

1 commit

  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, delay: tsc based udelay should have rdtsc_barrier
    x86, setup: correct include file in <asm/boot.h>
    x86, setup: Fix typo "CONFIG_x86_64" in <asm/boot.h>
    x86, mce: percpu mcheck_timer should be pinned
    x86: Add sysctl to allow panic on IOCK NMI error
    x86: Fix uv bau sending buffer initialization
    x86, mce: Fix mce resume on 32bit
    x86: Move init_gbpages() to setup_arch()
    x86: ensure percpu lpage doesn't consume too much vmalloc space
    x86: implement percpu_alloc kernel parameter
    x86: fix pageattr handling for lpage percpu allocator and re-enable it
    x86: reorganize cpa_process_alias()
    x86: prepare setup_pcpu_lpage() for pageattr fix
    x86: rename remap percpu first chunk allocator to lpage
    x86: fix duplicate free in setup_pcpu_remap() failure path
    percpu: fix too lazy vunmap cache flushing
    x86: Set cpu_llc_id on AMD CPUs

    Linus Torvalds
     

26 Jun, 2009

1 commit

  • This patch introduces a new sysctl:

    /proc/sys/kernel/panic_on_io_nmi

    which defaults to 0 (off).

    When enabled, the kernel panics when the kernel receives an NMI
    caused by an IO error.

    The IO error triggered NMI indicates a serious system
    condition, which could result in IO data corruption. Rather
    than contiuing, panicing and dumping might be a better choice,
    so one can figure out what's causing the IO error.

    This could be especially important to companies running IO
    intensive applications where corruption must be avoided, e.g. a
    bank's databases.

    [ SuSE has been shipping it for a while, it was done at the
    request of a large database vendor, for their users. ]

    Signed-off-by: Kurt Garloff
    Signed-off-by: Roberto Angelino
    Signed-off-by: Greg Kroah-Hartman
    Cc: "Eric W. Biederman"
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kurt Garloff
     

23 Jun, 2009

1 commit

  • Poornima Nayek reported:

    | Timer migration interface /proc/sys/kernel/timer_migration in
    | 2.6.30-git9 accepts any numerical value as input.
    |
    | Steps to reproduce:
    | 1. echo -6666666 > /proc/sys/kernel/timer_migration
    | 2. cat /proc/sys/kernel/timer_migration
    | -6666666
    |
    | 1. echo 44444444444444444444444444444444444444444444444444444444444 > /proc/sys/kernel/timer_migration
    | 2. cat /proc/sys/kernel/timer_migration
    | -1357789412
    |
    | Expected behavior: Should 'echo: write error: Invalid argument' while
    | setting any value other then 0 & 1

    Restrict valid values to 0 and 1.

    Reported-by: Poornima Nayak
    Tested-by: Poornima Nayak
    Signed-off-by: Arun R Bharadwaj
    Cc: poornima nayak
    Cc: Arun Bharadwaj
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arun R Bharadwaj
     

19 Jun, 2009

1 commit

  • Remoce the unused variable 'val' from __do_proc_dointvec()

    The integer has been declared and used as 'val = -val' and there is no
    reference to it anywhere.

    Signed-off-by: Sukanto Ghosh
    Cc: Jaswinder Singh Rajput
    Cc: Sukanto Ghosh
    Cc: Jiri Kosina
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukanto Ghosh
     

17 Jun, 2009

3 commits

  • * akpm: (182 commits)
    fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
    fbdev: *bfin*: fix __dev{init,exit} markings
    fbdev: *bfin*: drop unnecessary calls to memset
    fbdev: bfin-t350mcqb-fb: drop unused local variables
    fbdev: blackfin has __raw I/O accessors, so use them in fb.h
    fbdev: s1d13xxxfb: add accelerated bitblt functions
    tcx: use standard fields for framebuffer physical address and length
    fbdev: add support for handoff from firmware to hw framebuffers
    intelfb: fix a bug when changing video timing
    fbdev: use framebuffer_release() for freeing fb_info structures
    radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
    s3c-fb: CPUFREQ frequency scaling support
    s3c-fb: fix resource releasing on error during probing
    carminefb: fix possible access beyond end of carmine_modedb[]
    acornfb: remove fb_mmap function
    mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
    mb862xxfb: restrict compliation of platform driver to PPC
    Samsung SoC Framebuffer driver: add Alpha Channel support
    atmel-lcdc: fix pixclock upper bound detection
    offb: use framebuffer_alloc() to allocate fb_info struct
    ...

    Manually fix up conflicts due to kmemcheck in mm/slab.c

    Linus Torvalds
     
  • Currently, nobody wants to turn UNEVICTABLE_LRU off. Thus this
    configurability is unnecessary.

    Signed-off-by: KOSAKI Motohiro
    Cc: Johannes Weiner
    Cc: Andi Kleen
    Acked-by: Minchan Kim
    Cc: David Woodhouse
    Cc: Matt Mackall
    Cc: Rik van Riel
    Cc: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits)
    signal: fix __send_signal() false positive kmemcheck warning
    fs: fix do_mount_root() false positive kmemcheck warning
    fs: introduce __getname_gfp()
    trace: annotate bitfields in struct ring_buffer_event
    net: annotate struct sock bitfield
    c2port: annotate bitfield for kmemcheck
    net: annotate inet_timewait_sock bitfields
    ieee1394/csr1212: fix false positive kmemcheck report
    ieee1394: annotate bitfield
    net: annotate bitfields in struct inet_sock
    net: use kmemcheck bitfields API for skbuff
    kmemcheck: introduce bitfield API
    kmemcheck: add opcode self-testing at boot
    x86: unify pte_hidden
    x86: make _PAGE_HIDDEN conditional
    kmemcheck: make kconfig accessible for other architectures
    kmemcheck: enable in the x86 Kconfig
    kmemcheck: add hooks for the page allocator
    kmemcheck: add hooks for page- and sg-dma-mappings
    kmemcheck: don't track page tables
    ...

    Linus Torvalds
     

16 Jun, 2009

1 commit

  • …kernel/git/tip/linux-2.6-tip

    * 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    timers: Logic to move non pinned timers
    timers: /proc/sys sysctl hook to enable timer migration
    timers: Identifying the existing pinned timers
    timers: Framework for identifying pinned timers
    timers: allow deferrable timers for intervals tv2-tv5 to be deferred

    Fix up conflicts in kernel/sched.c and kernel/timer.c manually

    Linus Torvalds
     

13 Jun, 2009

1 commit

  • General description: kmemcheck is a patch to the linux kernel that
    detects use of uninitialized memory. It does this by trapping every
    read and write to memory that was allocated dynamically (e.g. using
    kmalloc()). If a memory address is read that has not previously been
    written to, a message is printed to the kernel log.

    Thanks to Andi Kleen for the set_memory_4k() solution.

    Andrew Morton suggested documenting the shadow member of struct page.

    Signed-off-by: Vegard Nossum
    Signed-off-by: Pekka Enberg

    [export kmemcheck_mark_initialized]
    [build fix for setup_max_cpus]
    Signed-off-by: Ingo Molnar

    [rebased for mainline inclusion]
    Signed-off-by: Vegard Nossum

    Vegard Nossum
     

12 Jun, 2009

1 commit

  • …el/git/tip/linux-2.6-tip

    * 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (574 commits)
    perf_counter: Turn off by default
    perf_counter: Add counter->id to the throttle event
    perf_counter: Better align code
    perf_counter: Rename L2 to LL cache
    perf_counter: Standardize event names
    perf_counter: Rename enums
    perf_counter tools: Clean up u64 usage
    perf_counter: Rename perf_counter_limit sysctl
    perf_counter: More paranoia settings
    perf_counter: powerpc: Implement generalized cache events for POWER processors
    perf_counters: powerpc: Add support for POWER7 processors
    perf_counter: Accurate period data
    perf_counter: Introduce struct for sample data
    perf_counter tools: Normalize data using per sample period data
    perf_counter: Annotate exit ctx recursion
    perf_counter tools: Propagate signals properly
    perf_counter tools: Small frequency related fixes
    perf_counter: More aggressive frequency adjustment
    perf_counter/x86: Fix the model number of Intel Core2 processors
    perf_counter, x86: Correct some event and umask values for Intel processors
    ...

    Linus Torvalds
     

11 Jun, 2009

5 commits

  • Conflicts:
    arch/x86/kernel/irqinit.c
    arch/x86/kernel/irqinit_64.c
    arch/x86/kernel/traps.c
    arch/x86/mm/fault.c
    include/linux/sched.h
    kernel/exit.c

    Ingo Molnar
     
  • Rename perf_counter_limit to perf_counter_max_sample_rate and
    prohibit creation of counters with a known higher sample
    frequency.

    Signed-off-by: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Rename the perf_counter_priv knob to perf_counter_paranoia (because
    priv can be read as private, as opposed to privileged) and provide
    one more level:

    0 - permissive
    1 - restrict cpu counters to privilidged contexts
    2 - restrict kernel-mode code counting and profiling

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • James Morris
     
  • …/git/tip/linux-2.6-tip

    * 'x86-kbuild-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (46 commits)
    x86, boot: add new generated files to the appropriate .gitignore files
    x86, boot: correct the calculation of ZO_INIT_SIZE
    x86-64: align __PHYSICAL_START, remove __KERNEL_ALIGN
    x86, boot: correct sanity checks in boot/compressed/misc.c
    x86: add extension fields for bootloader type and version
    x86, defconfig: update kernel position parameters
    x86, defconfig: update to current, no material changes
    x86: make CONFIG_RELOCATABLE the default
    x86: default CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN to 16 MB
    x86: document new bzImage fields
    x86, boot: make kernel_alignment adjustable; new bzImage fields
    x86, boot: remove dead code from boot/compressed/head_*.S
    x86, boot: use LOAD_PHYSICAL_ADDR on 64 bits
    x86, boot: make symbols from the main vmlinux available
    x86, boot: determine compressed code offset at compile time
    x86, boot: use appropriate rep string for move and clear
    x86, boot: zero EFLAGS on 32 bits
    x86, boot: set up the decompression stack as early as possible
    x86, boot: straighten out ranges to copy/zero in compressed/head*.S
    x86, boot: stylistic cleanups for boot/compressed/head_64.S
    ...

    Fixed trivial conflict in arch/x86/configs/x86_64_defconfig manually

    Linus Torvalds
     

04 Jun, 2009

1 commit


26 May, 2009

1 commit

  • Introduce a generic per counter interrupt throttle.

    This uses the perf_counter_overflow() quick disable to throttle a specific
    counter when its going too fast when a pmu->unthrottle() method is provided
    which can undo the quick disable.

    Power needs to implement both the quick disable and the unthrottle method.

    Signed-off-by: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Arnaldo Carvalho de Melo
    Cc: John Kacur
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

22 May, 2009

1 commit


18 May, 2009

1 commit


15 May, 2009

1 commit

  • This reverts commit fafd688e4c0c34da0f3de909881117d374e4c7af.

    Work is progressing to switch away from pdflush as the process backing
    for flushing out dirty data. So it seems pointless to add more knobs
    to control pdflush threads. The original author of the patch did not
    have any specific use cases for adding the knobs, so we can easily
    revert this before 2.6.30 to avoid having to maintain this API
    forever.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

13 May, 2009

1 commit

  • * Arun R Bharadwaj [2009-04-16 12:11:36]:

    This patch creates the /proc/sys sysctl interface at
    /proc/sys/kernel/timer_migration

    Timer migration is enabled by default.

    To disable timer migration, when CONFIG_SCHED_DEBUG = y,

    echo 0 > /proc/sys/kernel/timer_migration

    Signed-off-by: Arun R Bharadwaj
    Signed-off-by: Thomas Gleixner

    Arun R Bharadwaj
     

12 May, 2009

1 commit

  • A long ago, in days of yore, it all began with a god named Thor.
    There were vikings and boats and some plans for a Linux kernel
    header. Unfortunately, a single 8-bit field was used for bootloader
    type and version. This has generally worked without *too* much pain,
    but we're getting close to flat running out of ID fields.

    Add extension fields for both type and version. The type will be
    extended if it the old field is 0xE; the version is a simple MSB
    extension.

    Keep /proc/sys/kernel/bootloader_type containing
    (type << 4) + (ver & 0xf) for backwards compatiblity, but also add
    /proc/sys/kernel/bootloader_version which contains the full version
    number.

    [ Impact: new feature to support more bootloaders ]

    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     

08 May, 2009

1 commit


06 May, 2009

1 commit

  • Provide a threshold to relax the mlock accounting, increasing usability.

    Each counter gets perf_counter_mlock_kb for free.

    [ Impact: allow more mmap buffering ]

    Signed-off-by: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Corey Ashford
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

03 May, 2009

1 commit

  • Avoid setting less than two pages for vm_dirty_bytes: this is necessary to
    avoid potential division by 0 (like the following) in get_dirty_limits().

    [ 49.951610] divide error: 0000 [#1] PREEMPT SMP
    [ 49.952195] last sysfs file: /sys/devices/pci0000:00/0000:00:01.1/host0/target0:0:0/0:0:0:0/block/sda/uevent
    [ 49.952195] CPU 1
    [ 49.952195] Modules linked in: pcspkr
    [ 49.952195] Pid: 3064, comm: dd Not tainted 2.6.30-rc3 #1
    [ 49.952195] RIP: 0010:[] [] get_dirty_limits+0xe9/0x2c0
    [ 49.952195] RSP: 0018:ffff88001de03a98 EFLAGS: 00010202
    [ 49.952195] RAX: 00000000000000c0 RBX: ffff88001de03b80 RCX: 28f5c28f5c28f5c3
    [ 49.952195] RDX: 0000000000000000 RSI: 00000000000000c0 RDI: 0000000000000000
    [ 49.952195] RBP: ffff88001de03ae8 R08: 0000000000000000 R09: 0000000000000000
    [ 49.952195] R10: ffff88001ddda9a0 R11: 0000000000000001 R12: 0000000000000001
    [ 49.952195] R13: ffff88001fbc8218 R14: ffff88001de03b70 R15: ffff88001de03b78
    [ 49.952195] FS: 00007fe9a435b6f0(0000) GS:ffff8800025d9000(0000) knlGS:0000000000000000
    [ 49.952195] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 49.952195] CR2: 00007fe9a39ab000 CR3: 000000001de38000 CR4: 00000000000006e0
    [ 49.952195] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 49.952195] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 49.952195] Process dd (pid: 3064, threadinfo ffff88001de02000, task ffff88001ddda250)
    [ 49.952195] Stack:
    [ 49.952195] ffff88001fa0de00 ffff88001f2dbd70 ffff88001f9fe800 000080b900000000
    [ 49.952195] 00000000000000c0 ffff8800027a6100 0000000000000400 ffff88001fbc8218
    [ 49.952195] 0000000000000000 0000000000000600 ffff88001de03bb8 ffffffff802d3ed7
    [ 49.952195] Call Trace:
    [ 49.952195] [] balance_dirty_pages_ratelimited_nr+0x1d7/0x3f0
    [ 49.952195] [] ? ext3_writeback_write_end+0x9e/0x120
    [ 49.952195] [] generic_file_buffered_write+0x12f/0x330
    [ 49.952195] [] __generic_file_aio_write_nolock+0x26d/0x460
    [ 49.952195] [] ? generic_file_aio_write+0x52/0xd0
    [ 49.952195] [] generic_file_aio_write+0x69/0xd0
    [ 49.952195] [] ext3_file_write+0x26/0xc0
    [ 49.952195] [] do_sync_write+0xf1/0x140
    [ 49.952195] [] ? get_lock_stats+0x2a/0x60
    [ 49.952195] [] ? autoremove_wake_function+0x0/0x40
    [ 49.952195] [] vfs_write+0xcb/0x190
    [ 49.952195] [] sys_write+0x50/0x90
    [ 49.952195] [] system_call_fastpath+0x16/0x1b
    [ 49.952195] Code: 00 00 00 2b 05 09 1c 17 01 48 89 c6 49 0f af f4 48 c1 ee 02 48 89 f0 48 f7 e1 48 89 d6 31 d2 48 c1 ee 02 48 0f af 75 d0 48 89 f0 f7 f7 41 8b 95 ac 01 00 00 48 89 c7 49 0f af d4 48 c1 ea 02
    [ 49.952195] RIP [] get_dirty_limits+0xe9/0x2c0
    [ 49.952195] RSP
    [ 50.096523] ---[ end trace 008d7aa02f244d7b ]---

    Signed-off-by: Andrea Righi
    Cc: Peter Zijlstra
    Cc: David Rientjes
    Cc: Dave Chinner
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Righi