02 Jun, 2012

1 commit

  • Pull vfs changes from Al Viro.
    "A lot of misc stuff. The obvious groups:
    * Miklos' atomic_open series; kills the damn abuse of
    ->d_revalidate() by NFS, which was the major stumbling block for
    all work in that area.
    * ripping security_file_mmap() and dealing with deadlocks in the
    area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
    general.
    * ->encode_fh() switched to saner API; insane fake dentry in
    mm/cleancache.c gone.
    * assorted annotations in fs (endianness, __user)
    * parts of Artem's ->s_dirty work (jffs2 and reiserfs parts)
    * ->update_time() work from Josef.
    * other bits and pieces all over the place.

    Normally it would've been in two or three pull requests, but
    signal.git stuff had eaten a lot of time during this cycle ;-/"

    Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
    'truncate_range' inode method was removed by the VM changes, the VFS
    update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
    to sparse fix added twice, with other changes nearby).

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
    nfs: don't open in ->d_revalidate
    vfs: retry last component if opening stale dentry
    vfs: nameidata_to_filp(): don't throw away file on error
    vfs: nameidata_to_filp(): inline __dentry_open()
    vfs: do_dentry_open(): don't put filp
    vfs: split __dentry_open()
    vfs: do_last() common post lookup
    vfs: do_last(): add audit_inode before open
    vfs: do_last(): only return EISDIR for O_CREAT
    vfs: do_last(): check LOOKUP_DIRECTORY
    vfs: do_last(): make ENOENT exit RCU safe
    vfs: make follow_link check RCU safe
    vfs: do_last(): use inode variable
    vfs: do_last(): inline walk_component()
    vfs: do_last(): make exit RCU safe
    vfs: split do_lookup()
    Btrfs: move over to use ->update_time
    fs: introduce inode operation ->update_time
    reiserfs: get rid of resierfs_sync_super
    reiserfs: mark the superblock as dirty a bit later
    ...

    Linus Torvalds
     

01 Jun, 2012

2 commits

  • Pull second pile of signal handling patches from Al Viro:
    "This one is just task_work_add() series + remaining prereqs for it.

    There probably will be another pull request from that tree this
    cycle - at least for helpers, to get them out of the way for per-arch
    fixes remaining in the tree."

    Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
    had brought in commit 97fd75b7b8e0 ("kernel/irq/manage.c: use the
    pr_foo() infrastructure to prefix printks") which changed one of the
    pr_err() calls that this merge moves around.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    keys: kill task_struct->replacement_session_keyring
    keys: kill the dummy key_replace_session_keyring()
    keys: change keyctl_session_to_parent() to use task_work_add()
    genirq: reimplement exit_irq_thread() hook via task_work_add()
    task_work_add: generic process-context callbacks
    avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
    parisc: need to check NOTIFY_RESUME when exiting from syscall
    move key_repace_session_keyring() into tracehook_notify_resume()
    TIF_NOTIFY_RESUME is defined on all targets now

    Linus Torvalds
     
  • While doing checkpoint-restore in user space, one needs to determine
    whether various kernel objects (like mm_struct-s or file_struct-s) are
    shared between tasks, and then restore this state.

    The 2nd step can be solved by using the appropriate CLONE_ flags and the
    unshare syscall, while there is currently no way to solve the 1st one.

    One way of checking whether two tasks share e.g. an mm_struct is to
    expose some mm_struct ID of a task in its proc file, but showing such
    info is considered to be not that good for security reasons.

    Thus, after some debate, we came to the conclusion that a so-called
    'comparison' syscall might be the best candidate. So here it is --
    __NR_kcmp.

    It takes up to 5 arguments - the pids of the two tasks (whose
    characteristics should be compared), the comparison type and (in case of
    comparison of files) two file descriptors.

    Lookups for pids are done in the caller's PID namespace only.

    At the moment only x86 is supported and tested.
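
    As a rough illustration of the five-argument call described above, a
    user-space wrapper might look like the sketch below. It assumes a Linux
    system whose headers provide SYS_kcmp and KCMP_FILES (taken from the
    current <linux/kcmp.h>, not from this commit's text); the wrapper name
    tasks_share_fd_table() is made up for the example.

```c
#define _GNU_SOURCE
#include <linux/kcmp.h>   /* KCMP_FILES and the other comparison types */
#include <sys/syscall.h>  /* SYS_kcmp */
#include <sys/types.h>    /* pid_t */
#include <unistd.h>       /* syscall() */

/*
 * Hypothetical helper: ask the kernel whether two tasks share one
 * file-descriptor table.  Returns 0 when the objects are the same, a
 * positive value when they differ, and -1 with errno set on failure
 * (for example when the running kernel was built without kcmp support).
 * The two trailing arguments are only meaningful for per-file
 * comparisons, so they are passed as 0 here.
 */
static int tasks_share_fd_table(pid_t pid1, pid_t pid2)
{
    return (int)syscall(SYS_kcmp, pid1, pid2, KCMP_FILES, 0UL, 0UL);
}
```

    Comparing a task with itself is a simple sanity check: the call should
    report the objects as identical, or fail cleanly if the syscall is not
    available.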

    [akpm@linux-foundation.org: fix up selftests, warnings]
    [akpm@linux-foundation.org: include errno.h]
    [akpm@linux-foundation.org: tweak comment text]
    Signed-off-by: Cyrill Gorcunov
    Acked-by: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Cc: Andrey Vagin
    Cc: KOSAKI Motohiro
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: Glauber Costa
    Cc: Andi Kleen
    Cc: Tejun Heo
    Cc: Matt Helsley
    Cc: Pekka Enberg
    Cc: Eric Dumazet
    Cc: Vasiliy Kulikov
    Cc: Alexey Dobriyan
    Cc: Valdis.Kletnieks@vt.edu
    Cc: Michal Marek
    Cc: Frederic Weisbecker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     

30 May, 2012

1 commit

  • lglocks and brlocks are currently generated with some complicated macros
    in lglock.h. But there's no reason to not just use common utility
    functions and put all the data into a common data structure.

    Since there are at least two users it makes sense to share this code in a
    library. This is also easier to maintain than a macro forest.

    This will also make it possible later to allocate lglocks dynamically and
    to use them in modules (both would still need some additional, but
    now straightforward, code)

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Andi Kleen
    Cc: Al Viro
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Rusty Russell
    Signed-off-by: Al Viro

    Andi Kleen
     

24 May, 2012

1 commit

  • Provide a simple mechanism that allows running code in the (nonatomic)
    context of an arbitrary task.

    The caller does task_work_add(task, task_work) and this task executes
    task_work->func() either from do_notify_resume() or from do_exit(). The
    callback can rely on PF_EXITING to detect the latter case.

    "struct task_work" can be embedded in another struct, still it has "void
    *data" to handle the most common/simple case.
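
    The queue-then-run pattern described above can be modelled in user space
    roughly as follows. This is only a sketch: struct task, its "exiting"
    field (standing in for PF_EXITING), the plain singly linked list, the
    absence of locking, and the refusal to queue work once exit has begun
    are all assumptions of the example, not the kernel implementation.

```c
#include <stddef.h>

struct task_work;
typedef void (*task_work_func_t)(struct task_work *);

/* Model of the work item: a callback plus an opaque data pointer. */
struct task_work {
    struct task_work *next;   /* the kernel links these with an hlist */
    task_work_func_t func;
    void *data;
};

/* Hypothetical stand-in for task_struct. */
struct task {
    struct task_work *works;  /* pending callbacks */
    int exiting;              /* models PF_EXITING */
};

/* Queue a callback to run later in the task's own context. */
static int task_work_add(struct task *t, struct task_work *w)
{
    if (t->exiting)
        return -1;            /* assumption: no new work once exiting */
    w->next = t->works;
    t->works = w;
    return 0;
}

/* Called by the task itself at a safe point; runs and clears the list. */
static void task_work_run(struct task *t)
{
    struct task_work *w = t->works;

    t->works = NULL;
    while (w) {
        struct task_work *next = w->next;  /* func may reuse or free w */
        w->func(w);
        w = next;
    }
}

/* Example callback for the simple case: bump a counter found via ->data. */
static void bump_counter(struct task_work *w)
{
    ++*(int *)w->data;
}
```

    Nothing runs at task_work_add() time; the callback only fires when the
    owning task reaches task_work_run(), which is the point of the design.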

    This allows us to kill the ->replacement_session_keyring hack, and
    potentially this can have more users.

    Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
    tracehook_notify_resume() and do_exit(). But at the same time we can
    remove the "replacement_session_keyring != NULL" checks from
    arch/*/signal.c and exit_creds().

    Note: task_work_add/task_work_run abuses ->pi_lock. This is only because
    this lock is already used by lookup_pi_state() to synchronize with
    do_exit() setting PF_EXITING. Fortunately the scope of this lock in
    task_work.c is really tiny, and the code is unlikely anyway.

    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Cc: Thomas Gleixner
    Cc: Richard Kuo
    Cc: Linus Torvalds
    Cc: Alexander Gordeev
    Cc: Chris Zankel
    Cc: David Smith
    Cc: "Frank Ch. Eigler"
    Cc: Geert Uytterhoeven
    Cc: Larry Woodman
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Oleg Nesterov
     

26 Apr, 2012

1 commit

  • Start a new file, which will hold SMP and CPU hotplug related generic
    infrastructure.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc: Paul E. McKenney
    Cc: Srivatsa S. Bhat
    Cc: Matt Turner
    Cc: Russell King
    Cc: Mike Frysinger
    Cc: Jesper Nilsson
    Cc: Richard Kuo
    Cc: Tony Luck
    Cc: Hirokazu Takata
    Cc: Ralf Baechle
    Cc: David Howells
    Cc: James E.J. Bottomley
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: David S. Miller
    Cc: Chris Metcalf
    Cc: Richard Weinberger
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/r/20120420124557.035417523@linutronix.de

    Thomas Gleixner
     

25 Jan, 2012

1 commit

  • Move the core sysctl code from kernel/sysctl.c and kernel/sysctl_check.c
    into fs/proc/proc_sysctl.c.

    Currently sysctl maintenance is hampered by the sysctl implementation
    being split across 3 files with artificial layering between them.
    Consolidate the entire sysctl implementation into 1 file so that
    it is easier to see what is going on and hopefully allowing for
    simpler maintenance.

    For functions that are now only used in fs/proc/proc_sysctl.c remove
    their declarations from sysctl.h and make them static in fs/proc/proc_sysctl.c

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

14 Jan, 2012

1 commit

  • After commit 1eb208aea3179dd2fc0cdeea45ef869d75b4fe70, "PM: Make
    CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME)", the
    files under kernel/power are not built unless CONFIG_PM_SLEEP or
    CONFIG_PM_RUNTIME is set. In particular, this causes
    kernel/power/poweroff.c to be omitted, even though it should be
    compiled, because CONFIG_MAGIC_SYSRQ is set.

    Fix the problem by causing kernel/power/Makefile to be processed
    for CONFIG_PM unset too.

    Reported-and-tested-by: Phil Oester
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

17 Nov, 2011

2 commits


29 Oct, 2011

1 commit

  • …git-cur/linux-2.6-arm

    * 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: (178 commits)
    ARM: 7139/1: fix compilation with CONFIG_ARM_ATAG_DTB_COMPAT and large TEXT_OFFSET
    ARM: gic, local timers: use the request_percpu_irq() interface
    ARM: gic: consolidate PPI handling
    ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H
    ARM: mach-s5p64x0: remove mach/memory.h
    ARM: mach-s3c64xx: remove mach/memory.h
    ARM: plat-mxc: remove mach/memory.h
    ARM: mach-prima2: remove mach/memory.h
    ARM: mach-zynq: remove mach/memory.h
    ARM: mach-bcmring: remove mach/memory.h
    ARM: mach-davinci: remove mach/memory.h
    ARM: mach-pxa: remove mach/memory.h
    ARM: mach-ixp4xx: remove mach/memory.h
    ARM: mach-h720x: remove mach/memory.h
    ARM: mach-vt8500: remove mach/memory.h
    ARM: mach-s5pc100: remove mach/memory.h
    ARM: mach-tegra: remove mach/memory.h
    ARM: plat-tcc: remove mach/memory.h
    ARM: mach-mmp: remove mach/memory.h
    ARM: mach-cns3xxx: remove mach/memory.h
    ...

    Fix up mostly pretty trivial conflicts in:
    - arch/arm/Kconfig
    - arch/arm/include/asm/localtimer.h
    - arch/arm/kernel/Makefile
    - arch/arm/mach-shmobile/board-ap4evb.c
    - arch/arm/mach-u300/core.c
    - arch/arm/mm/dma-mapping.c
    - arch/arm/mm/proc-v7.S
    - arch/arm/plat-omap/Kconfig
    largely due to some CONFIG option renaming (ie CONFIG_PM_SLEEP ->
    CONFIG_ARM_CPU_SUSPEND for the arm-specific suspend code etc) and
    addition of NEED_MACH_MEMORY_H next to HAVE_IDE.

    Linus Torvalds
     

23 Sep, 2011

1 commit

  • During some CPU power modes entered during idle, hotplug and
    suspend, peripherals located in the CPU power domain, such as
    the GIC, localtimers, and VFP, may be powered down. Add a
    notifier chain that allows drivers for those peripherals to
    be notified before and after they may be reset.

    Notified drivers can include the VFP co-processor, the interrupt
    controller and its PM extensions, and local CPU timers, whose context
    save/restore shouldn't be interrupted. Hence the CPU PM event APIs must
    be called with interrupts disabled.

    Signed-off-by: Colin Cross
    Signed-off-by: Santosh Shilimkar
    Reviewed-by: Kevin Hilman
    Tested-and-Acked-by: Shawn Guo
    Tested-by: Kevin Hilman
    Tested-by: Vishwanath BS

    Colin Cross
     

25 Aug, 2011

1 commit


06 Aug, 2011

1 commit

  • In the course of testing jump labels for use with the CFS
    bandwidth controller, Paul Turner discovered that using jump
    labels reduced the branch count and the instruction count, but
    did not reduce the cycle count or wall time.

    I noticed that having the jump_label.o included in the kernel
    but not used in any way still caused this increase in cycle
    count and wall time. Thus, I moved jump_label.o in the
    kernel/Makefile, thus changing the link order, and presumably
    moving it out of hot icache areas. This brought down the cycle
    count/time as expected.

    In addition to Paul's testing, I've tested the patch using a
    single 'static_branch()' in the getppid() path, and basically
    running tight loops of calls to getppid(). Here are my results
    for the branch disabled case:

    With jump labels turned on (CONFIG_JUMP_LABEL), branch disabled:

    Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):

    3,969,510,217 instructions # 0.864 IPC ( +-0.000% )
    4,592,334,954 cycles ( +- 0.046% )
    751,634,470 branches ( +- 0.000% )

    1.722635797 seconds time elapsed ( +- 0.046% )

    Jump labels turned off (CONFIG_JUMP_LABEL not set), branch
    disabled:

    Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):

    4,009,611,846 instructions # 0.867 IPC ( +-0.000% )
    4,622,210,580 cycles ( +- 0.012% )
    771,662,904 branches ( +- 0.000% )

    1.734341454 seconds time elapsed ( +- 0.022% )

    Signed-off-by: Jason Baron
    Cc: rth@redhat.com
    Cc: a.p.zijlstra@chello.nl
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/20110805204040.GG2522@redhat.com
    Signed-off-by: Ingo Molnar
    Tested-by: Paul Turner

    Jason Baron
     

20 Jul, 2011

1 commit

  • When IKCONFIG is built-in, make oldconfig will cause the kernel to be
    relinked even if .config didn't change. This happens because of a
    config_data.gz dependency on .config. This patch changes the if_changed
    to a filechk so that config_data.h is only rebuilt when the contents
    have actually changed.

    Signed-off-by: Peter Foley
    Signed-off-by: Michal Marek

    Peter Foley
     

27 May, 2011

1 commit

  • The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and
    leads to some problems:

    * cgroup creation is out-of-control
    * cgroup name can conflict when pids are looping
    * it is not possible to have a single process handling a lot of
    namespaces without falling into an exponential creation time
    * we may want to create a namespace without creating a cgroup

    The ns_cgroup was replaced by a compatibility flag 'clone_children',
    where a newly created cgroup will copy the parent cgroup values.
    The userspace has to manually create a cgroup and add a task to
    the 'tasks' file.

    This patch removes the ns_cgroup as suggested in the following thread:

    https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html

    The 'cgroup_clone' function is removed because it is no longer used.

    This is a userspace-visible change. Commit 45531757b45c ("cgroup: notify
    ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a
    printk warning users that the feature is planned for removal. Since that
    time we have heard from XXX users who were affected by this.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Serge E. Hallyn
    Cc: Eric W. Biederman
    Cc: Jamal Hadi Salim
    Reviewed-by: Li Zefan
    Acked-by: Paul Menage
    Acked-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Lezcano
     

03 May, 2011

2 commits

  • As part of the events subsystem unification, relocate hw_breakpoint.c
    into its new destination.

    Cc: Frederic Weisbecker
    Signed-off-by: Borislav Petkov

    Borislav Petkov
     
  • mv kernel/perf_event.c -> kernel/events/core.c. From there, all further
    sensible splitting can happen. The idea is that due to perf_event.c
    becoming pretty sizable and with the advent of the marriage with ftrace,
    splitting functionality into its logical parts should help speeding up
    the unification and to manage the complexity of the subsystem.

    Signed-off-by: Borislav Petkov

    Borislav Petkov
     

24 Mar, 2011

1 commit

  • …p_elfcorehdr and saved_max_pfn

    The Xen PV drivers in a crashed HVM guest can not connect to the dom0
    backend drivers because both frontend and backend drivers are still in
    connected state. To run the connection reset function only in case of a
    crashdump, the is_kdump_kernel() function needs to be available for the PV
    driver modules.

    Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
    kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
    usable for modules.

    Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup()
    to early_param(). It adds an address range check from x86 also on ia64
    and powerpc.

    [akpm@linux-foundation.org: additional #includes]
    [akpm@linux-foundation.org: remove elfcorehdr_addr export]
    [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
    Signed-off-by: Olaf Hering <olaf@aepfle.de>
    Cc: Russell King <rmk@arm.linux.org.uk>
    Cc: "Luck, Tony" <tony.luck@intel.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mundt <lethal@linux-sh.org>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Olaf Hering
     

14 Jan, 2011

1 commit

  • An arch which needs USE_GENERIC_SMP_HELPERS has to select
    USE_GENERIC_SMP_HELPERS itself, rather than leaving the choice to the
    user, since such arches don't provide their own implementations.

    Also, move on_each_cpu() to kernel/smp.c, it is strange to put it in
    kernel/softirq.c.

    For arch which doesn't use USE_GENERIC_SMP_HELPERS, e.g. blackfin, only
    on_each_cpu() is compiled.

    Signed-off-by: Amerigo Wang
    Cc: David Howells
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Yinghai Lu
    Cc: Peter Zijlstra
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amerigo Wang
     

12 Jan, 2011

1 commit

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
    perf session: Fix infinite loop in __perf_session__process_events
    perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1)
    perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() fail
    perf tools: Emit clearer message for sys_perf_event_open ENOENT return
    perf stat: better error message for unsupported events
    perf sched: Fix allocation result check
    perf, x86: P4 PMU - Fix unflagged overflows handling
    dynamic debug: Fix build issue with older gcc
    tracing: Fix TRACE_EVENT power tracepoint creation
    tracing: Fix preempt count leak
    tracepoint: Add __rcu annotation
    tracing: remove duplicate null-pointer check in skb tracepoint
    tracing/trivial: Add missing comma in TRACE_EVENT comment
    tracing: Include module.h in define_trace.h
    x86: Save rbp in pt_regs on irq entry
    x86, dumpstack: Fix unused variable warning
    x86, NMI: Clean-up default_do_nmi()
    x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPU
    x86, NMI: Remove DIE_NMI_IPI
    x86, NMI: Add priorities to handlers
    ...

    Linus Torvalds
     

08 Jan, 2011

1 commit


15 Dec, 2010

1 commit

  • If you try to build a kernel with KCONFIG_CONFIG set (to a value
    not equal to .config) and that config sets CONFIG_IKCONFIG then the
    build will fail with:

    make[1]: *** No rule to make target `.config', needed by \
    `kernel/config_data.gz'. Stop.

    because the kernel/Makefile contains a direct reference to .config.

    This issue has been present since the introduction of KCONFIG_CONFIG
    in 14cdd3c402bf7c66f0bcd76e290f0770a54a4b21.

    Signed-off-by: Ben Gardiner
    CC: Roman Zippel
    CC: Michal Marek
    Reviewed-by: Michal Marek
    Signed-off-by: Michal Marek

    Ben Gardiner
     

22 Oct, 2010

2 commits

  • …nel/git/tip/linux-2.6-tip

    * 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
    x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S
    xen: Cope with unmapped pages when initializing kernel pagetable
    memblock, bootmem: Round pfn properly for memory and reserved regions
    memblock: Annotate memblock functions with __init_memblock
    memblock: Allow memblock_init to be called early
    memblock/arm: Fix memblock_region_is_memory() typo
    x86, memblock: Remove __memblock_x86_find_in_range_size()
    memblock: Fix wraparound in find_region()
    x86-32, memblock: Make add_highpages honor early reserved ranges
    x86, memblock: Fix crashkernel allocation
    arm, memblock: Fix the sparsemem build
    memblock: Fix section mismatch warnings
    powerpc, memblock: Fix memblock API change fallout
    memblock, microblaze: Fix memblock API change fallout
    x86: Remove old bootmem code
    x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve
    x86: Remove not used early_res code
    x86, memblock: Replace e820_/_early string with memblock_
    x86: Use memblock to replace early_res
    x86, memblock: Use memblock_debug to control debug message print out
    ...

    Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (163 commits)
    tracing: Fix compile issue for trace_sched_wakeup.c
    [S390] hardirq: remove pointless header file includes
    [IA64] Move local_softirq_pending() definition
    perf, powerpc: Fix power_pmu_event_init to not use event->ctx
    ftrace: Remove recursion between recordmcount and scripts/mod/empty
    jump_label: Add COND_STMT(), reducer wrappery
    perf: Optimize sw events
    perf: Use jump_labels to optimize the scheduler hooks
    jump_label: Add atomic_t interface
    jump_label: Use more consistent naming
    perf, hw_breakpoint: Fix crash in hw_breakpoint creation
    perf: Find task before event alloc
    perf: Fix task refcount bugs
    perf: Fix group moving
    irq_work: Add generic hardirq context callbacks
    perf_events: Fix transaction recovery in group_sched_in()
    perf_events: Fix bogus AMD64 generic TLB events
    perf_events: Fix bogus context time tracking
    tracing: Remove parent recording in latency tracer graph options
    tracing: Use one prologue for the preempt irqs off tracer function tracers
    ...

    Linus Torvalds
     

19 Oct, 2010

1 commit

  • Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- such as waking up a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this have to make do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

    Signed-off-by: Peter Zijlstra
    Acked-by: Kyle McMartin
    Acked-by: Martin Schwidefsky
    [ various fixes ]
    Signed-off-by: Huang Ying
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

23 Sep, 2010

1 commit

  • Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
    assembly gcc mechanism, we can now branch to labels from an 'asm goto'
    statement. This allows us to create a 'no-op' fastpath, which can subsequently
    be patched with a jump to the slowpath code. This is useful for code which
    might be rarely used, but which we'd like to be able to call, if needed.
    Tracepoints are the current use case that these are being implemented for.
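
    For readers unfamiliar with 'asm goto', the shape of such a site can be
    sketched in plain user-space C as below. This is not the kernel's macro:
    the function name tracepoint_site() is invented, "nop" is assumed to be
    a valid mnemonic on the target (it is on x86 and ARM), and nothing here
    records the site's address or patches it at run time, so the fast path
    is always taken.

```c
/*
 * Minimal 'asm goto' sketch.  The asm statement emits a single no-op and
 * names a C label as a possible branch target.  The compiler must keep the
 * slowpath code reachable, but as emitted the instruction just falls
 * through.  A jump-label implementation would additionally remember where
 * this no-op lives so it can later be overwritten with a jump to the label.
 */
static int tracepoint_site(void)
{
    __asm__ goto("nop" : : : : slowpath);

    return 0;   /* fastpath: the no-op fell through */

slowpath:
    return 1;   /* slowpath: only reachable if the no-op were patched */
}
```

    Compared with testing a flag, the fast path here costs no load and no
    conditional branch, which is the property the commit is after.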

    Acked-by: David S. Miller
    Signed-off-by: Jason Baron
    LKML-Reference:

    [ cleaned up some formatting ]

    Signed-off-by: Steven Rostedt

    Jason Baron
     

31 Aug, 2010

1 commit


28 Aug, 2010

1 commit


20 Aug, 2010

1 commit

  • Implement a small-memory-footprint uniprocessor-only implementation of
    preemptible RCU. This implementation uses but a single blocked-tasks
    list rather than the combinatorial number used per leaf rcu_node by
    TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
    processing. This version also takes advantage of uniprocessor execution
    to accelerate grace periods in the case where there are no readers.

    The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.

    This implementation is a step towards having RCU implementation driven
    off of the SMP and PREEMPT kernel configuration variables, which can
    happen once this implementation has accumulated sufficient experience.

    Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
    suggested by Steve Rostedt in order to avoid the compiler-reordering
    issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).

    As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
    savings compared to CONFIG_TREE_PREEMPT_RCU. Of course, for non-real-time
    workloads, CONFIG_TINY_RCU is even better.

    CONFIG_TREE_PREEMPT_RCU

       text    data     bss     dec filename
         13       0       0      13 kernel/rcupdate.o
       6170     825      28    7023 kernel/rcutree.o
                               ----
                               7026 Total

    CONFIG_TINY_PREEMPT_RCU

       text    data     bss     dec filename
         13       0       0      13 kernel/rcupdate.o
       2081      81       8    2170 kernel/rcutiny.o
                               ----
                               2183 Total

    CONFIG_TINY_RCU (non-preemptible)

       text    data     bss     dec filename
         13       0       0      13 kernel/rcupdate.o
        719      25       0     744 kernel/rcutiny.o
                               ----
                                757 Total

    Requested-by: Loïc Minier
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

11 Aug, 2010

1 commit

  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     

08 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits)
    workqueue: mark init_workqueues() as early_initcall()
    workqueue: explain for_each_*cwq_cpu() iterators
    fscache: fix build on !CONFIG_SYSCTL
    slow-work: kill it
    gfs2: use workqueue instead of slow-work
    drm: use workqueue instead of slow-work
    cifs: use workqueue instead of slow-work
    fscache: drop references to slow-work
    fscache: convert operation to use workqueue instead of slow-work
    fscache: convert object to use workqueue instead of slow-work
    workqueue: fix how cpu number is stored in work->data
    workqueue: fix mayday_mask handling on UP
    workqueue: fix build problem on !CONFIG_SMP
    workqueue: fix locking in retry path of maybe_create_worker()
    async: use workqueue for worker pool
    workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead
    workqueue: implement unbound workqueue
    workqueue: prepare for WQ_UNBOUND implementation
    libata: take advantage of cmwq and remove concurrency limitations
    workqueue: fix worker management invocation without pending works
    ...

    Fixed up conflicts in fs/cifs/* as per Tejun. Other trivial conflicts in
    include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c

    Linus Torvalds
     

05 Aug, 2010

1 commit


28 Jul, 2010

1 commit

  • Audit watch should depend on CONFIG_AUDIT_SYSCALL and should select
    FSNOTIFY. This splits the spaghetti-like mixing of audit_watch and
    audit_filter code so they can be configured separately.

    Signed-off-by: Eric Paris

    Eric Paris
     

23 Jul, 2010

1 commit


21 May, 2010

1 commit


13 May, 2010

2 commits

  • The new nmi_watchdog (which uses the perf event subsystem) is very
    similar in structure to the softlockup detector. Using Ingo's
    suggestion, I combined the two functionalities into one file:
    kernel/watchdog.c.

    Now both the nmi_watchdog (or hardlockup detector) and softlockup
    detector sit on top of the perf event subsystem, which is run every
    60 seconds or so to see if there are any lockups.

    To detect hardlockups, cpus not responding to interrupts, I
    implemented an hrtimer that runs 5 times for every perf event
    overflow event. If that stops counting on a cpu, then the cpu is
    most likely in trouble.

    To detect softlockups, tasks not yielding to the scheduler, I used the
    previous kthread idea that now gets kicked every time the hrtimer fires.
    If the kthread isn't being scheduled neither is anyone else and the
    warning is printed to the console.

    I tested this on x86_64 and both the softlockup and hardlockup paths
    work.

    V2:
    - cleaned up the Kconfig and softlockup combination
    - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
    - separated out the softlockup case from perf event subsystem
    - re-arranged the enabling/disabling nmi watchdog from proc space
    - added cpumasks for hardlockup failure cases
    - removed fallback to soft events if no PMU exists for hard events

    V3:
    - comment cleanups
    - drop support for older softlockup code
    - per_cpu cleanups
    - completely remove software clock base hardlockup detector
    - use per_cpu masking on hard/soft lockup detection
    - #ifdef cleanups
    - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
    - documentation additions

    V4:
    - documentation fixes
    - convert per_cpu to __get_cpu_var
    - powerpc compile fixes

    V5:
    - split apart warn flags for hard and soft lockups

    TODO:
    - figure out how to make an arch-agnostic clock2cycles call
    (if possible) to feed into perf events as a sample period

    [fweisbec: merged conflict patch]

    Signed-off-by: Don Zickus
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: Randy Dunlap
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Don Zickus
     
  • Merge reason: catch up with latest softlockup detector changes.

    Frederic Weisbecker
     

08 May, 2010

1 commit

  • When !CONFIG_SMP, cpu_stop functions weren't defined at all which
    could lead to build failures if UP code uses cpu_stop facility. Add
    dummy cpu_stop implementation for UP. The waiting variants execute
    the work function directly with preempt disabled and
    stop_one_cpu_nowait() schedules a workqueue work.

    Makefile and ifdefs around stop_machine implementation are updated to
    accommodate CONFIG_SMP && !CONFIG_STOP_MACHINE case.

    Signed-off-by: Tejun Heo
    Reported-by: Ingo Molnar

    Tejun Heo
     

07 Mar, 2010

1 commit

  • elf_core_dump() and elf_fdpic_core_dump() use #ifdef and the corresponding
    macros to hide multi-line logic inside functions. This patch removes the
    #ifdefs and replaces ELF_CORE_EXTRA_* with corresponding functions. For
    architectures not implementing ELF_CORE_EXTRA_*, we use weak functions in
    order to limit the scope of the modification.
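
    The weak-function technique mentioned above can be sketched as follows.
    It assumes a GCC-compatible compiler; the int return type and the
    total_phdrs() caller are simplifications made up for this example rather
    than the signatures used in the patch.

```c
/*
 * Generic fallback, marked weak.  An architecture that has extra core-dump
 * segments supplies its own non-weak definition with the same name, and the
 * linker picks that one instead.  Everyone else gets this default, so the
 * caller needs no #ifdef.
 */
__attribute__((weak)) int elf_core_extra_phdrs(void)
{
    return 0;
}

/* Hypothetical caller: count program headers without caring about arch. */
static int total_phdrs(int regular_segments)
{
    return regular_segments + elf_core_extra_phdrs();
}
```

    With only the weak default linked in, the extra count is zero and the
    caller behaves exactly as the old "macro not defined" branch did.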

    This cleanup is for my next patches, but I think it is worth doing
    regardless of my final purpose.

    Signed-off-by: Daisuke HATAYAMA
    Cc: "Luck, Tony"
    Cc: Jeff Dike
    Cc: David Howells
    Cc: Greg Ungerer
    Cc: Roland McGrath
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Cc: Alexander Viro
    Cc: Andi Kleen
    Cc: Alan Cox
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daisuke HATAYAMA