13 Jul, 2009

2 commits

  • * 'kmemleak' of git://linux-arm.org/linux-2.6:
    kmemleak: Remove alloc_bootmem annotations introduced in the past
    kmemleak: Add callbacks to the bootmem allocator
    kmemleak: Allow partial freeing of memory blocks
    kmemleak: Trace the kmalloc_large* functions in slub
    kmemleak: Scan objects allocated during a scanning episode
    kmemleak: Do not acquire scan_mutex in kmemleak_open()
    kmemleak: Remove the reported leaks number limitation
    kmemleak: Add more cond_resched() calls in the scanning thread
    kmemleak: Renice the scanning thread to +10

    Linus Torvalds
     
  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
    It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

11 Jul, 2009

3 commits

  • …el/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    dma-debug: Fix the overlap() function to be correct and readable
    oprofile: reset bt_lost_no_mapping with other stats
    x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
    signals: declare sys_rt_tgsigqueueinfo in syscalls.h
    rcu: Mark Hierarchical RCU no longer experimental
    dma-debug: Put all hash-chain locks into the same lock class
    dma-debug: fix off-by-one error in overlap function

    Linus Torvalds
     
  • Optimize cond_resched() by removing one conditional.

    Currently cond_resched() checks system_state ==
    SYSTEM_RUNNING in order to avoid scheduling before the
    scheduler is running.

    We can, however, as suggested by Matt, use
    PREEMPT_ACTIVE to accomplish the very same.

    Suggested-by: Matt Mackall
    Signed-off-by: Peter Zijlstra
    Acked-by: Matt Mackall
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing: Fix trace_print_seq()
    kprobes: No need to unlock kprobe_insn_mutex
    tracing/fastboot: Document the need of initcall_debug
    trace_export: Repair missed fields
    tracing: Fix stack tracer sysctl handling

    Linus Torvalds
     

10 Jul, 2009

1 commit


09 Jul, 2009

2 commits

  • Commit 5fd29d6ccbc98884569d6f3105aeca70858b3e0f ("printk: clean up
    handling of log-levels and newlines") changed printk semantics. printk
    lines with multiple KERN_<level> prefixes are no longer emitted as
    before the patch.

    <level> is now included in the output on each additional use.

    Remove all uses of multiple KERN_<level>s in formats.

    Signed-off-by: Joe Perches
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Fix various silly problems wrt mnt_namespace.h:

    - exit_mnt_ns() isn't used, remove it
    - with that done, sched.h and nsproxy.h inclusions aren't needed
    - mount.h inclusion was needed for vfsmount_lock, but no longer
    - remove mnt_namespace.h inclusion from files which don't use anything
    from mnt_namespace.h

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

07 Jul, 2009

2 commits

  • do_execve() and ptrace_attach() return -EINTR if
    mutex_lock_interruptible(->cred_guard_mutex) fails.

    This is not right; change the code to return -ERESTARTNOINTR.

    Perhaps we should also change proc_pid_attr_write().

    Signed-off-by: Oleg Nesterov
    Cc: David Howells
    Acked-by: Roland McGrath
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • These warnings were observed on MIPS32 using 2.6.31-rc1 and gcc-4.2.0:

    mm/page_alloc.c: In function 'alloc_pages_exact':
    mm/page_alloc.c:1986: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast

    drivers/usb/mon/mon_bin.c: In function 'mon_alloc_buff':
    drivers/usb/mon/mon_bin.c:1264: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast

    [akpm@linux-foundation.org: fix kernel/perf_counter.c too]
    Signed-off-by: Kevin Cernekee
    Cc: Andi Kleen
    Cc: Ralf Baechle
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kevin Cernekee
     

02 Jul, 2009

1 commit

  • We lose output if trace_seq->buffer[0] is 0, because seq_puts()
    calculates the copy length with strlen(). Use seq_write() instead of
    seq_puts().

    For example, after a reboot:

    # echo kmemtrace > current_tracer
    # echo 0 > options/kmem_minimalistic
    # cat trace
    # tracer: kmemtrace
    #
    #

    Nothing is exported, because the first byte of trace_seq->buffer[ ]
    is KMEMTRACE_USER_ALLOC.

    (The value of KMEMTRACE_USER_ALLOC is zero; see
    kmemtrace_print_alloc_user() in kernel/trace/kmemtrace.c.)

    Signed-off-by: Xiao Guangrong
    Acked-by: Frederic Weisbecker
    Acked-by: Pekka Enberg
    Acked-by: Eduard - Gabriel Munteanu
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Xiao Guangrong
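
The truncation described in this commit can be illustrated in user space. The following is a minimal sketch, not the kernel code: copy_by_strlen() and copy_by_len() are hypothetical helpers that stand in for a strlen()-based copy and an explicit-length copy respectively.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the length comes from the first NUL byte, as with strlen(). */
static size_t copy_by_strlen(char *dst, const char *src)
{
    size_t n = strlen(src);

    memcpy(dst, src, n);
    return n;
}

/* Hypothetical helper: the caller supplies the length, so zero bytes survive. */
static size_t copy_by_len(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
    return len;
}
```

With a binary record whose first byte is zero, the strlen()-based copy transfers nothing, while the explicit-length copy transfers the whole record.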
     

01 Jul, 2009

5 commits

  • Remove needless kprobe_insn_mutex unlocking during the safety check
    in garbage collection. If someone releases a dirty slot during the
    safety check (which ensures that other CPUs aren't executing any of
    the dirty slots), the safety check must fail. So we need to hold the
    mutex while checking safety.

    Signed-off-by: Masami Hiramatsu
    Cc: Ananth N Mavinakayanahalli
    Cc: Jim Keniston
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • * 'kmemleak' of git://linux-arm.org/linux-2.6:
    kmemleak: Inform kmemleak about pid_hash
    kmemleak: Do not warn if an unknown object is freed
    kmemleak: Do not report new leaked objects if the scanning was stopped
    kmemleak: Slightly change the policy on newly allocated objects
    kmemleak: Do not trigger a scan when reading the debug/kmemleak file
    kmemleak: Simplify the reports logged by the scanning thread
    kmemleak: Enable task stacks scanning by default
    kmemleak: Allow the early log buffer to be configurable.

    Linus Torvalds
     
  • …x/kernel/git/tip/linux-2.6-tip

    * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (47 commits)
    perf report: Add --symbols parameter
    perf report: Add --comms parameter
    perf report: Add --dsos parameter
    perf_counter tools: Adjust only prelinked symbol's addresses
    perf_counter: Provide a way to enable counters on exec
    perf_counter tools: Reduce perf stat measurement overhead/skew
    perf stat: Use percentages for scaling output
    perf_counter, x86: Update x86_pmu after WARN()
    perf stat: Micro-optimize the code: memcpy is only required if no event is selected and !null_run
    perf stat: Improve output
    perf stat: Fix multi-run stats
    perf stat: Add -n/--null option to run without counters
    perf_counter tools: Remove dead code
    perf_counter: Complete counter swap
    perf report: Print sorted callchains per histogram entries
    perf_counter tools: Prepare a small callchain framework
    perf record: Fix unhandled io return value
    perf_counter tools: Add alias for 'l1d' and 'l1i'
    perf-report: Add bare minimum PERF_EVENT_READ parsing
    perf-report: Add modes for inherited stats and no-samples
    ...

    Linus Torvalds
     
  • The file opened in acct_on and freshly stored in the ns->bacct struct can
    be closed in acct_file_reopen by a concurrent call after we release
    acct_lock and before we call mntput(file->f_path.mnt).

    Record file->f_path.mnt in a local variable and use this variable only.

    Signed-off-by: Renaud Lottiaux
    Signed-off-by: Louis Rilling
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Renaud Lottiaux
     
  • When the 32-bit signed quantities get assigned to the u64 resource_size_t,
    they are incorrectly sign-extended.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13253
    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9905

    Signed-off-by: Zhang Rui
    Reported-by: Leann Ogasawara
    Cc: Pierre Ossman
    Reported-by:
    Tested-by:
    Cc:
    Cc: Jesse Barnes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhang Rui
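
The sign-extension effect named in this commit can be reproduced in plain C. This is a sketch under stated assumptions: the resource_size_t typedef below is a local stand-in for a 64-bit resource_size_t, and from_signed() and from_unsigned() are hypothetical names. The zero-extending variant is one possible remedy, not necessarily the change the commit made.

```c
#include <stdint.h>

/* Local stand-in for a 64-bit resource_size_t (assumption for this sketch). */
typedef uint64_t resource_size_t;

/* Problem pattern: a signed 32-bit value is widened directly, so it is sign-extended. */
static resource_size_t from_signed(int32_t addr)
{
    return (resource_size_t)addr;
}

/* One possible remedy: convert to unsigned 32-bit first, so it is zero-extended. */
static resource_size_t from_unsigned(int32_t addr)
{
    return (resource_size_t)(uint32_t)addr;
}
```

Any value with the top bit set, such as the 32-bit pattern 0xfed00000, gains 32 leading one bits in the first variant and keeps its intended value in the second.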
     

30 Jun, 2009

2 commits


29 Jun, 2009

4 commits

  • To use boot tracer, one should pass initcall_debug as well as
    ftrace=initcall to the command line.

    Signed-off-by: Li Zefan
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, delay: tsc based udelay should have rdtsc_barrier
    x86, setup: correct include file in <asm/boot.h>
    x86, setup: Fix typo "CONFIG_x86_64" in <asm/boot.h>
    x86, mce: percpu mcheck_timer should be pinned
    x86: Add sysctl to allow panic on IOCK NMI error
    x86: Fix uv bau sending buffer initialization
    x86, mce: Fix mce resume on 32bit
    x86: Move init_gbpages() to setup_arch()
    x86: ensure percpu lpage doesn't consume too much vmalloc space
    x86: implement percpu_alloc kernel parameter
    x86: fix pageattr handling for lpage percpu allocator and re-enable it
    x86: reorganize cpa_process_alias()
    x86: prepare setup_pcpu_lpage() for pageattr fix
    x86: rename remap percpu first chunk allocator to lpage
    x86: fix duplicate free in setup_pcpu_remap() failure path
    percpu: fix too lazy vunmap cache flushing
    x86: Set cpu_llc_id on AMD CPUs

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    timer stats: Optimize by adding quick check to avoid function calls
    timers: Fix timer_migration interface which accepts any number as input

    Linus Torvalds
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    ftrace: Fix the output of profile
    ring-buffer: Make it generally available
    ftrace: Remove duplicate newline
    tracing: Fix trace_buf_size boot option
    ftrace: Fix t_hash_start()
    ftrace: Don't manipulate @pos in t_start()
    ftrace: Don't increment @pos in g_start()
    tracing: Reset iterator in t_start()
    trace_stat: Don't increment @pos in seq start()
    tracing_bprintk: Don't increment @pos in t_start()
    tracing/events: Don't increment @pos in s_start()

    Linus Torvalds
     

27 Jun, 2009

2 commits

  • Some fields of struct ftrace_graph_ret are missing
    when it is exported to user space.

    Signed-off-by: Lai Jiangshan
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     
  • This made my machine completely frozen:

    # echo 1 > /proc/sys/kernel/stack_tracer_enabled
    # echo 2 > /proc/sys/kernel/stack_tracer_enabled

    The cause is that register_ftrace_function() was called twice.

    Also fix the ftrace_enabled sysctl, though nothing bad seemed to
    happen when I tested it.

    Signed-off-by: Li Zefan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
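
The double registration described in this commit arises when every non-zero write is treated as a fresh "enable". A minimal user-space sketch of guarding against that follows. All names here (set_enabled(), register_tracer(), unregister_tracer(), registrations) are hypothetical and only model the idea of acting on state transitions; this is not the kernel's sysctl handler.

```c
#include <stdbool.h>

static int registrations;   /* how many times the tracer is currently registered */
static bool last_enabled;   /* the last state that was actually acted upon */

static void register_tracer(void)   { registrations++; }
static void unregister_tracer(void) { registrations--; }

/* Apply a value written by the user; any non-zero value means "enabled". */
static void set_enabled(int value)
{
    bool enabled = (value != 0);

    if (enabled == last_enabled)
        return;             /* no state change, so do not register again */

    last_enabled = enabled;
    if (enabled)
        register_tracer();
    else
        unregister_tracer();
}
```

Writing 1 and then 2 leaves exactly one registration in place, because both values normalize to the same boolean state.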
     

26 Jun, 2009

8 commits

  • Complete the counter swap by indeed switching the times too and
    updating the userpage after modifying the counter values.

    Signed-off-by: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The first entry of the ftrace profile was always skipped when
    reading trace_stat/functionX.

    Signed-off-by: Li Zefan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • This patch introduces a new sysctl:

    /proc/sys/kernel/panic_on_io_nmi

    which defaults to 0 (off).

    When enabled, the kernel panics when the kernel receives an NMI
    caused by an IO error.

    An NMI triggered by an IO error indicates a serious system
    condition, which could result in IO data corruption. Rather
    than continuing, panicking and dumping might be a better choice,
    so one can figure out what's causing the IO error.

    This could be especially important to companies running IO
    intensive applications where corruption must be avoided, e.g. a
    bank's databases.

    [ SuSE has been shipping it for a while, it was done at the
    request of a large database vendor, for their users. ]

    Signed-off-by: Kurt Garloff
    Signed-off-by: Roberto Angelino
    Signed-off-by: Greg Kroah-Hartman
    Cc: "Eric W. Biederman"
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kurt Garloff
     
  • The PERF_EVENT_READ implementation made me realize we don't
    actually need the sample_type in the output sample, since
    we already have that in the perf_counter_attr information.

    Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the
    event->type overloading, and simply put counter overflow
    samples in a PERF_EVENT_SAMPLE type.

    This also fixes the issue that event->type was only 32-bit
    and sample_type had 64 usable bits.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • With the introduction of PERF_EVENT_READ we have the
    possibility to provide accurate counter values for
    individual tasks in a task hierarchy.

    However, due to the lazy context switching used for similar
    counter contexts our current per task counts are way off.

    In order to maintain some of the lazy switch benefits we
    don't disable it out-right, but simply iterate the active
    counters and flip the values between the contexts.

    This only reads the counters but does not need to reprogram
    the full PMU.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Provide a read() like event which can be used to log the
    counter value at specific sites such as child->parent
    folding on exit.

    In order to be useful, we log the counter parent ID, not the
    actual counter ID, since userspace can only relate parent
    IDs to perf_counter_attr constructs.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Update the mmap control page with the needed information to
    use the userspace RDPMC instruction for self monitoring.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add the needed time scale to the self-profile mmap information.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

25 Jun, 2009

6 commits

  • Yanmin noticed that fault_in_user_writeable() requests 4 pages instead
    of one.

    That's the result of blindly trusting Linus' proposal :) I even looked
    up the prototype to verify the correctness: the argument in question
    is confusingly enough named "len" while in reality it means number of
    pages.

    Pointed-out-by: Yanmin Zhang
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • In hunting down the cause for the hwlat_detector ring buffer spew in
    my failed -next builds it became obvious that folks are now treating
    ring_buffer as something that is generic independent of tracing and thus,
    suitable for public driver consumption.

    Given that there are only a few minor areas in ring_buffer that have any
    reliance on CONFIG_TRACING or CONFIG_FUNCTION_TRACER, provide stubs for
    those and make it generally available.

    Signed-off-by: Paul Mundt
    Cc: Jon Masters
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Mundt
     
  • Before:
    # echo 'sys_open:traceon:' > set_ftrace_filter
    # echo 'sys_close:traceoff:5' > set_ftrace_filter
    # cat set_ftrace_filter
    #### all functions enabled ####
    sys_open:traceon:unlimited

    sys_close:traceoff:count=0

    After:
    # cat set_ftrace_filter
    #### all functions enabled ####
    sys_open:traceon:unlimited
    sys_close:traceoff:count=0

    Signed-off-by: Li Zefan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • …/{vfs-2.6,audit-current}

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    another race fix in jfs_check_acl()
    Get "no acls for this inode" right, fix shmem breakage
    inline functions left without protection of ifdef (acl)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
    audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL

    Linus Torvalds
     
  • Even though one cannot make use of the audit watch code without
    CONFIG_AUDIT_SYSCALL the spaghetti nature of the audit code means that
    the audit rule filtering requires that it at least be compiled.

    Thus build the audit_watch code when we build auditfilter like it was
    before cfcad62c74abfef83762dc05a556d21bdf3980a2

    Clearly this is a point of potential future cleanup..

    Reported-by: Frans Pop
    Signed-off-by: Eric Paris
    Signed-off-by: Al Viro

    Eric Paris
     
  • commit 64d1304a64 (futex: setup writeable mapping for futex ops which
    modify user space data) did address only half of the problem of write
    access faults.

    The patch was made on two wrong assumptions:

    1) access_ok(VERIFY_WRITE,...) would actually check write access.

    On x86 it does _NOT_. It's a pure address range check.

    2) a RW mapped region can not go away under us.

    That's wrong as well. Nobody can prevent another thread from calling
    mprotect(PROT_READ) on the region where the futex resides. If that
    call hits between the get_user_pages_fast() verification and the
    actual write access in the atomic region we are toast again.

    The solution is to not rely on access_ok and get_user() for any write
    access related fault on private and shared futexes. Instead we need to
    fault it in with verification of write access.

    There is no generic non-destructive write mechanism which would fault
    the user page in through a #PF, but as we already know that we will
    fault we can as well call get_user_pages() directly and avoid the #PF
    overhead.

    If get_user_pages() returns -EFAULT we know that we can not fix it
    anymore and need to bail out to user space.

    Remove a bunch of confusing comments on this issue as well.

    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    Thomas Gleixner
     

24 Jun, 2009

2 commits

  • Removes the warnings about Hierarchical RCU being experimental,
    given that it has gone through almost six months of being the
    default RCU in mainline for the x86 with very little trouble.

    This makes hierarchical-RCU bootup look less scary.

    Signed-off-by: Paul E. McKenney
    Cc: akpm@linux-foundation.org
    Cc: niv@us.ibm.com
    Cc: dvhltc@us.ibm.com
    Cc: dipankar@in.ibm.com
    Cc: dhowells@redhat.com
    Cc: lethal@linux-sh.org
    Cc: kernel@wantstofly.org
    Cc: cl@linux-foundation.org
    Cc: schamp@sgi.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • We should be able to specify [KMG] when setting trace_buf_size
    boot option, as documented in kernel-parameters.txt

    Signed-off-by: Li Zefan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
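
Accepting a [KMG] suffix on a size value can be sketched in user space as follows. This is an illustration only: parse_size() is a hypothetical name, and the code is modeled loosely on the behavior of the kernel's memparse() helper rather than copied from it.

```c
#include <stdlib.h>

/*
 * Hypothetical user-space sketch: parse a number with an optional
 * K, M or G suffix (powers of 1024). Not a kernel function.
 */
static unsigned long long parse_size(const char *s)
{
    char *end;
    unsigned long long val = strtoull(s, &end, 0);

    switch (*end) {
    case 'G': case 'g':
        val <<= 10;
        /* fall through */
    case 'M': case 'm':
        val <<= 10;
        /* fall through */
    case 'K': case 'k':
        val <<= 10;
        break;
    default:
        break;              /* no suffix: plain byte count */
    }
    return val;
}
```

Each suffix multiplies by 1024 once more than the one below it, which is why the cases fall through in G, M, K order.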