22 Aug, 2015

1 commit


18 Aug, 2015

1 commit


15 Aug, 2015

2 commits


14 Aug, 2015

1 commit


13 Aug, 2015

1 commit


12 Aug, 2015

2 commits

  • I ran the perf fuzzer, which triggered some WARN()s which are due to
    trying to stop/restart an event on the wrong CPU.

    Use the normal IPI pattern to ensure we run the code on the correct CPU.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Vince Weaver
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: bad7192b842c ("perf: Fix PERF_EVENT_IOC_PERIOD to force-reset the period")
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • If rb->aux_refcount is decremented to zero before rb->refcount,
    __rb_free_aux() may be called twice resulting in a double free of
    rb->aux_pages. Fix this by adding a check to __rb_free_aux().

    Signed-off-by: Ben Hutchings
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Fixes: 57ffc5ca679f ("perf: Fix AUX buffer refcounting")
    Link: http://lkml.kernel.org/r/1437953468.12842.17.camel@decadent.org.uk
    Signed-off-by: Ingo Molnar

    Ben Hutchings
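
    The guard described in this commit can be illustrated with a minimal
    user-space sketch. The struct ring_buf type, its fields and the
    rb_alloc_aux()/rb_free_aux() helpers below are hypothetical stand-ins,
    not the kernel's code; the free_calls counter exists only to make the
    effect observable. The point is that the free routine checks whether
    there is still anything to free, so a second call becomes a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ring buffer; not the kernel's struct. */
struct ring_buf {
    void **aux_pages;   /* array of page pointers, NULL once freed */
    int aux_nr_pages;   /* number of valid entries in aux_pages */
    int free_calls;     /* counts real frees, for illustration only */
};

/* Allocate nr dummy "pages". Returns 0 on success, -1 on failure. */
int rb_alloc_aux(struct ring_buf *rb, int nr)
{
    assert(nr > 0);
    rb->free_calls = 0;
    rb->aux_nr_pages = 0;
    rb->aux_pages = calloc((size_t)nr, sizeof(*rb->aux_pages));
    if (!rb->aux_pages)
        return -1;
    for (int i = 0; i < nr; i++)
        rb->aux_pages[i] = malloc(64); /* free(NULL) is harmless if this fails */
    rb->aux_nr_pages = nr;
    return 0;
}

/* Safe to call more than once: the check turns repeat calls into no-ops. */
void rb_free_aux(struct ring_buf *rb)
{
    if (!rb->aux_nr_pages)      /* already freed, or never allocated */
        return;
    for (int i = 0; i < rb->aux_nr_pages; i++)
        free(rb->aux_pages[i]);
    free(rb->aux_pages);
    rb->aux_pages = NULL;
    rb->aux_nr_pages = 0;       /* marks the buffer as released */
    rb->free_calls++;
}
```

    Calling rb_free_aux() twice on the same buffer releases the pages only
    once, whichever of the two reference counts happens to drop first.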
     

10 Aug, 2015

5 commits

  • The comment says it's using trialcs->mems_allowed as a temp variable but
    it didn't match the code. Change the code to match the comment.

    This fixes an issue when writing in cpuset.mems when a sub-directory
    exists: we need to write several times for the information to persist:

    | root@alban:/sys/fs/cgroup/cpuset# mkdir footest9
    | root@alban:/sys/fs/cgroup/cpuset# cd footest9
    | root@alban:/sys/fs/cgroup/cpuset/footest9# mkdir aa
    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
    |
    | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > cpuset.mems
    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
    |
    | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > cpuset.mems
    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
    | 0
    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat aa/cpuset.mems
    |
    | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > aa/cpuset.mems
    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat aa/cpuset.mems
    | 0
    | root@alban:/sys/fs/cgroup/cpuset/footest9#

    This should help to fix the following issue in Docker:
    https://github.com/opencontainers/runc/issues/133
    In some conditions, a Docker container needs to be started twice in
    order to work.

    Signed-off-by: Alban Crequy
    Tested-by: Iago López Galeiras
    Cc: # 3.17+
    Acked-by: Li Zefan
    Signed-off-by: Tejun Heo

    Alban Crequy
     
  • According to the perf_event_map_fd and index, the function
    bpf_perf_event_read() can convert the corresponding map
    value to the pointer to struct perf_event and return the
    Hardware PMU counter value.

    Signed-off-by: Kaixu Xia
    Signed-off-by: David S. Miller

    Kaixu Xia
     
  • Introduce a new bpf map type 'BPF_MAP_TYPE_PERF_EVENT_ARRAY'.
    This map only stores the pointer to struct perf_event. The
    user space event FDs from perf_event_open() syscall are converted
    to the pointer to struct perf_event and stored in map.

    Signed-off-by: Kaixu Xia
    Signed-off-by: David S. Miller

    Kaixu Xia
     
  • All the map backends are of generic nature. In order to avoid
    adding much special code into the eBPF core, rewrite part of
    the bpf_prog_array map code and make it more generic. So the
    new perf_event_array map type can reuse most of code with
    bpf_prog_array map and add fewer lines of special code.

    Signed-off-by: Wang Nan
    Signed-off-by: Kaixu Xia
    Signed-off-by: David S. Miller

    Wang Nan
     
  • This patch adds three core perf APIs:
    - perf_event_attrs(): export the struct perf_event_attr from struct
    perf_event;
    - perf_event_get(): get the struct perf_event from the given fd;
    - perf_event_read_local(): read the events counters active on the
    current CPU;
    These APIs are needed when accessing events counters in eBPF programs.

    The API perf_event_read_local() comes from Peter and I add the
    corresponding SOB.

    Signed-off-by: Kaixu Xia
    Signed-off-by: Peter Zijlstra
    Signed-off-by: David S. Miller

    Kaixu Xia
     

07 Aug, 2015

3 commits

  • The s-Par visornic driver, currently in staging, processes a queue being
    serviced by an s-Par service partition. We can get a message that
    something has happened with the service partition; when that happens, we
    must not access the channel until we get a message that the service
    partition is back again.

    The visornic driver has a thread for processing the channel. When we get
    the message, we need to be able to park the thread and then resume it
    when the problem clears.

    We can do this with kthread_park() and kthread_unpark(), but they are not
    exported from the kernel. This patch exports the needed functions.

    Signed-off-by: David Kershner
    Acked-by: Ingo Molnar
    Acked-by: Neil Horman
    Acked-by: Thomas Gleixner
    Cc: Richard Weinberger
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Kershner
     
  • This function may copy the si_addr_lsb, si_lower and si_upper fields to
    user mode when they haven't been initialized, which can leak kernel
    stack data to user mode.

    Just checking the value of si_code is insufficient because the same
    si_code value is shared between multiple signals. This is solved by
    checking the value of si_signo in addition to si_code.

    Signed-off-by: Amanieu d'Antras
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amanieu d'Antras
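
    Why si_code alone is not enough can be shown with a small sketch. The
    DEMO_* constants and both helper functions are illustrative names
    defined locally for this example; they are not the kernel's code. The
    sketch assumes two signals that reuse the same numeric si_code, which is
    the situation the commit message describes.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Illustrative si_code values, defined locally so the sketch does not
 * depend on what a given libc exposes. Codes are only unique per signal. */
enum { DEMO_SEGV_BNDERR = 3, DEMO_BUS_OBJERR = 3 };

static_assert(DEMO_SEGV_BNDERR == DEMO_BUS_OBJERR,
              "the sketch relies on two signals sharing one si_code value");

/* Insufficient variant: decides on si_code alone, so a SIGBUS carrying
 * the colliding code is treated as if the bounds fields were valid. */
bool has_bounds_fields_code_only(int si_code)
{
    return si_code == DEMO_SEGV_BNDERR;
}

/* Stricter variant: si_signo disambiguates the shared si_code value. */
bool has_bounds_fields(int si_signo, int si_code)
{
    return si_signo == SIGSEGV && si_code == DEMO_SEGV_BNDERR;
}
```

    With the stricter check, fields such as si_lower and si_upper would
    only be copied out for the one signal that actually initializes them.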
     
  • This function can leak kernel stack data when the user siginfo_t has a
    positive si_code value. The top 16 bits of si_code describe which fields
    in the siginfo_t union are active, but they are treated inconsistently
    between copy_siginfo_from_user32, copy_siginfo_to_user32 and
    copy_siginfo_to_user.

    copy_siginfo_from_user32 is called from rt_sigqueueinfo and
    rt_tgsigqueueinfo in which the user has full control over the top 16 bits
    of si_code.

    This fixes the following information leaks:
    x86: 8 bytes leaked when sending a signal from a 32-bit process to
    itself. This leak grows to 16 bytes if the process uses x32.
    (si_code = __SI_CHLD)
    x86: 100 bytes leaked when sending a signal from a 32-bit process to
    a 64-bit process. (si_code = -1)
    sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
    64-bit process. (si_code = any)

    parisc and s390 have similar bugs, but they are not vulnerable because
    rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
    to a different process. These bugs are also fixed for consistency.

    Signed-off-by: Amanieu d'Antras
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Cc: Russell King
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Chris Metcalf
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amanieu d'Antras
     

04 Aug, 2015

1 commit

  • Vince reported that the fasync signal stuff doesn't work properly for
    inherited events. So fix that.

    Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
    which upon the demise of the file descriptor ensures the allocation is
    freed and state is updated.

    Now for perf, we can have the events stick around for a while after the
    original FD is dead because of references from child events. So we
    cannot copy the fasync pointer around. We can however consistently use
    the parent's fasync, as that will be updated.

    Reported-and-Tested-by: Vince Weaver
    Signed-off-by: Peter Zijlstra (Intel)
    Cc:
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: eranian@google.com
    Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

01 Aug, 2015

1 commit


29 Jul, 2015

1 commit

  • We don't actually hold the module_mutex when calling find_module_all
    from module_kallsyms_lookup_name: that's because it's used by the oops
    code and we don't want to deadlock.

    However, access to the list read-only is safe if preempt is disabled,
    so we can weaken the assertion. Keep a strong version for external
    callers though.

    Fixes: 0be964be0d45 ("module: Sanitize RCU usage and locking")
    Reported-by: He Kuang
    Cc: stable@kernel.org
    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Rusty Russell
     

27 Jul, 2015

3 commits

  • A recent fix to the shadow timestamp inadvertently broke the running time
    accounting.

    We must not update the running timestamp if we fail to schedule the
    event, because the event will not have run. This can (and did) result in
    negative total runtime because the stopped timestamp was before the
    running timestamp (we 'started' but never stopped the event -- because
    it never really started we didn't have to stop it either).

    Reported-and-Tested-by: Vince Weaver
    Fixes: 72f669c0086f ("perf: Update shadow timestamp before add event")
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org # 4.1
    Cc: Shaohua Li
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     
  • mov %rsp, %r1 ; r1 = rsp
    add $-8, %r1 ; r1 = rsp - 8
    store_q $123, -8(%rsp) ; *(u64*)r1 = 123
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alex Gartrell
     
  • Pull x86 fixes from Thomas Gleixner:
    "This update contains:

    - the manual revert of the SYSCALL32 changes which caused a
    regression

    - a fix for the MPX vma handling

    - three fixes for the ioremap 'is ram' checks.

    - PAT warning fixes

    - a trivial fix for the size calculation of TLB tracepoints

    - handle old EFI structures gracefully

    This also contains a PAT fix from Jan plus a revert thereof. Toshi
    explained why the code is correct"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm/pat: Revert 'Adjust default caching mode translation tables'
    x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commit
    x86/mm: Fix newly introduced printk format warnings
    mm: Fix bugs in region_is_ram()
    x86/mm: Remove region_is_ram() call from ioremap
    x86/mm: Move warning from __ioremap_check_ram() to the call site
    x86/mm/pat, drivers/media/ivtv: Move the PAT warning and replace WARN() with pr_warn()
    x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn()
    x86/mm/pat: Adjust default caching mode translation tables
    x86/fpu: Disable dependent CPU features on "noxsave"
    x86/mpx: Do not set ->vm_ops on MPX VMAs
    x86/mm: Add parenthesis for TLB tracepoint size calculation
    efi: Handle memory error structures produced based on old versions of standard

    Linus Torvalds
     

26 Jul, 2015

1 commit

  • Pull ftrace fix from Steven Rostedt:
    "Back in 3.16 the ftrace code was redesigned and cleaned up to remove
    the double iteration list (one for registered ftrace ops, and one for
    registered "global" ops), to just use one list. That simplified the
    code but also broke the function tracing filtering on pid.

    This updates the code to handle the filtering again with the new
    logic"

    * tag 'trace-v4.2-rc2-fix3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    ftrace: Fix breakage of set_ftrace_pid

    Linus Torvalds
     

25 Jul, 2015

1 commit

  • Commit 4104d326b670 ("ftrace: Remove global function list and call function
    directly") simplified the ftrace code by removing the global_ops list with a
    new design. But this cleanup also broke the filtering of PIDs that are added
    to the set_ftrace_pid file.

    Add back the proper hooks to have pid filtering working once again.

    Cc: stable@vger.kernel.org # 3.16+
    Reported-by: Matt Fleming
    Reported-by: Richard Weinberger
    Tested-by: Matt Fleming
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

23 Jul, 2015

1 commit


22 Jul, 2015

1 commit

  • region_is_ram() looks up the iomem_resource table to check if
    a target range is in RAM. However, it always returns with -1
    due to invalid range checks. It always breaks the loop at the
    first entry of the table.

    Another issue is that it compares p->flags and flags, but it always
    fails. flags is declared as int, which makes it a negative value
    with IORESOURCE_BUSY (0x80000000) set while p->flags is unsigned long.

    Fix the range check and flags so that region_is_ram() works as
    advertised.

    Signed-off-by: Toshi Kani
    Reviewed-by: Dan Williams
    Cc: Mike Travis
    Cc: Luis R. Rodriguez
    Cc: Andrew Morton
    Cc: Roland Dreier
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/1437088996-28511-4-git-send-email-toshi.kani@hp.com
    Signed-off-by: Thomas Gleixner

    Toshi Kani
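
    The flags half of this bug is a plain C integer-conversion pitfall and
    can be reproduced outside the kernel. In the sketch below the DEMO_*
    constants and function names are made up for illustration, and uint64_t
    stands in for unsigned long on a 64-bit kernel. It also assumes a 32-bit
    int and the usual wrap-around behaviour when an out-of-range value is
    converted to int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IORESOURCE_MEM  0x00000200u
#define DEMO_IORESOURCE_BUSY 0x80000000u

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

/* Buggy variant: the wanted flags live in an int, so bit 31 makes the
 * value negative. Converting it to a 64-bit unsigned type (which the
 * comparison does implicitly in the original) sign-extends it, so it
 * can never equal a resource's flags word. */
bool flags_match_buggy(uint64_t res_flags)
{
    int flags = (int)(DEMO_IORESOURCE_MEM | DEMO_IORESOURCE_BUSY);
    return res_flags == (uint64_t)flags;   /* 0xffffffff80000200 */
}

/* Fixed variant: keep the wanted flags in the same unsigned type. */
bool flags_match_fixed(uint64_t res_flags)
{
    uint64_t flags = DEMO_IORESOURCE_MEM | DEMO_IORESOURCE_BUSY;
    return res_flags == flags;             /* 0x0000000080000200 */
}
```

    A resource whose flags are exactly MEM | BUSY is rejected by the buggy
    comparison and accepted by the fixed one.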
     

21 Jul, 2015

2 commits

  • Enabling locking-selftest in a VM guest may cause the following
    kernel panic:

    kernel BUG at .../kernel/locking/qspinlock_paravirt.h:137!

    This is due to the fact that the pvqspinlock unlock function is
    expecting either a _Q_LOCKED_VAL or _Q_SLOW_VAL in the lock
    byte. This patch prevents that bug report by ignoring it when
    debug_locks_silent is set. Otherwise, a warning will be printed
    if it contains an unexpected value.

    With this patch applied, the kernel locking-selftest completed
    without any noise.

    Tested-by: Masami Hiramatsu
    Signed-off-by: Waiman Long
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1436663959-53092-1-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Improve accuracy of timing in test_bpf and add two stress tests:
    - {skb->data[0], get_smp_processor_id} repeated 2k times
    - {skb->data[0], vlan_push} x 68 followed by {skb->data[0], vlan_pop} x 68

    The first test is useful for testing the performance of the JIT
    implementation of BPF_LD_ABS together with BPF_CALL instructions.
    The second test stresses the skb_vlan_push/pop logic together with
    skb->data access via the BPF_LD_ABS insn, which checks that re-caching of
    skb->data is done correctly.

    In order to call bpf_skb_vlan_push() from test_bpf.ko, three
    EXPORT_SYMBOL_GPL() exports have to be added.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

19 Jul, 2015

4 commits

  • Pull x86 fixes from Ingo Molnar:
    "Two families of fixes:

    - Fix an FPU context related boot crash on newer x86 hardware with
    larger context sizes than what most people test. To fix this
    without ugly kludges or extensive reverts we had to touch core task
    allocator, to allow x86 to determine the task size dynamically, at
    boot time.

    I've tested it on a number of x86 platforms, and I cross-built it
    to a handful of architectures:

    (warns) (warns)
    testing x86-64: -git: pass ( 0), -tip: pass ( 0)
    testing x86-32: -git: pass ( 0), -tip: pass ( 0)
    testing arm: -git: pass ( 1359), -tip: pass ( 1359)
    testing cris: -git: pass ( 1031), -tip: pass ( 1031)
    testing m32r: -git: pass ( 1135), -tip: pass ( 1135)
    testing m68k: -git: pass ( 1471), -tip: pass ( 1471)
    testing mips: -git: pass ( 1162), -tip: pass ( 1162)
    testing mn10300: -git: pass ( 1058), -tip: pass ( 1058)
    testing parisc: -git: pass ( 1846), -tip: pass ( 1846)
    testing sparc: -git: pass ( 1185), -tip: pass ( 1185)

    ... so I hope the cross-arch impact is 'none', as intended.

    (by Dave Hansen)

    - Fix various NMI handling related bugs unearthed by the big asm code
    rewrite and generally make the NMI code more robust and more
    maintainable while at it. These changes are a bit late in the
    cycle, I hope they are still acceptable.

    (by Andy Lutomirski)"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86
    x86/fpu, sched: Dynamically allocate 'struct fpu'
    x86/entry/64, x86/nmi/64: Add CONFIG_DEBUG_ENTRY NMI testing code
    x86/nmi/64: Make the "NMI executing" variable more consistent
    x86/nmi/64: Minor asm simplification
    x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection
    x86/nmi/64: Reorder nested NMI checks
    x86/nmi/64: Improve nested NMI comments
    x86/nmi/64: Switch stacks on userspace NMI entry
    x86/nmi/64: Remove asm code that saves CR2
    x86/nmi: Enable nested do_nmi() handling for 64-bit kernels

    Linus Torvalds
     
  • Pull timer fix from Ingo Molnar:
    "Fix for a misplaced export that can cause build failures in certain
    (rare) Kconfig situations"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tick: Move the export of tick_broadcast_oneshot_control to the proper place

    Linus Torvalds
     
  • Pull scheduler fix from Ingo Molnar:
    "A oneliner rq throttling fix"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/fair: Test list head instead of list entry in throttle_cfs_rq()

    Linus Torvalds
     
  • Pull irq fixes from Ingo Molnar:
    "Misc irq fixes:

    - two driver fixes
    - a Xen regression fix
    - a nested irq thread crash fix"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/gicv3-its: Fix mapping of LPIs to collections
    genirq: Prevent resend to interrupts marked IRQ_NESTED_THREAD
    genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now
    gpio/davinci: Fix race in installing chained irq handler

    Linus Torvalds
     

18 Jul, 2015

2 commits

  • Don't burden architectures without dynamic task_struct sizing
    with the overhead of dynamic sizing.

    Also optimize the x86 code a bit by caching task_struct_size.

    Acked-and-Tested-by: Dave Hansen
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1437128892-9831-3-git-send-email-mingo@kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The FPU rewrite removed the dynamic allocations of 'struct fpu'.
    But, this potentially wastes massive amounts of memory (2k per
    task on systems that do not have AVX-512 for instance).

    Instead of having a separate slab, this patch just appends the
    space that we need to the 'task_struct' which we dynamically
    allocate already. This saves us from doing an extra slab
    allocation at fork().

    The only real downside here is that we have to stick everything
    at the end of the task_struct. But, I think the
    BUILD_BUG_ON()s I stuck in there should keep that from being too
    fragile.

    Signed-off-by: Dave Hansen
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1437128892-9831-2-git-send-email-mingo@kernel.org
    Signed-off-by: Ingo Molnar

    Dave Hansen
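
    The idea of appending a variably sized FPU area to the task structure
    can be sketched in user space as follows. struct demo_task, its fields
    and both helpers are hypothetical names, not the kernel's definitions.
    The sketch uses a C99 flexible array member, which the compiler itself
    requires to be the last member; the commit relies on BUILD_BUG_ON()
    checks for the equivalent guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a task structure; field names are invented. */
struct demo_task {
    int pid;
    size_t fpu_size;            /* bytes actually reserved for fpu_state */
    unsigned char fpu_state[];  /* flexible array member: must come last */
};

/* Size of one task object when the FPU context needs fpu_size bytes.
 * In the scheme described above this would be determined once at boot. */
size_t demo_task_size(size_t fpu_size)
{
    return offsetof(struct demo_task, fpu_state) + fpu_size;
}

/* One allocation covers both the fixed part and the FPU tail. */
struct demo_task *demo_task_alloc(int pid, size_t fpu_size)
{
    assert(fpu_size > 0);
    struct demo_task *t = malloc(demo_task_size(fpu_size));
    if (!t)
        return NULL;
    t->pid = pid;
    t->fpu_size = fpu_size;
    memset(t->fpu_state, 0, fpu_size);
    return t;
}
```

    A machine with a small FPU context then pays only for the bytes it
    needs, and there is still a single allocation per task.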
     

17 Jul, 2015

1 commit

  • The resend mechanism happily calls the interrupt handler of interrupts
    which are marked IRQ_NESTED_THREAD from softirq context. This can
    result in crashes because the interrupt handler is not the proper way
    to invoke the device handlers. They must be invoked via
    handle_nested_irq.

    Prevent the resend even if the interrupt has no valid parent irq
    set. It's better to have a lost interrupt than a crashing machine.

    Reported-by: Uwe Kleine-König
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
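
    The decision this commit describes can be modelled with a small sketch.
    struct demo_irq_desc and demo_resend_target() are hypothetical, and the
    sketch assumes that a nested interrupt with a valid parent is redirected
    to that parent, which the message implies but does not spell out. A
    return value of 0 means "do not resend" here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified interrupt descriptor. */
struct demo_irq_desc {
    bool nested_thread;       /* stands in for IRQ_NESTED_THREAD */
    unsigned int irq;         /* this interrupt's number */
    unsigned int parent_irq;  /* 0 means "no parent set" in this sketch */
};

/* Returns the irq number to resend, or 0 if the resend must be dropped. */
unsigned int demo_resend_target(const struct demo_irq_desc *desc)
{
    assert(desc != NULL);
    if (desc->nested_thread) {
        /* Never run a nested-thread handler directly: either go through
         * the parent or lose the interrupt rather than risk a crash. */
        if (!desc->parent_irq)
            return 0;
        return desc->parent_irq;
    }
    return desc->irq;
}
```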
     

16 Jul, 2015

1 commit

  • Pull tracing fix from Steven Rostedt:
    "Fengguang Wu discovered a crash that happened to be because of the
    branch tracer (traces unlikely and likely branches) when enabled with
    certain debug options.

    What happened was that various debug options like lockdep and
    DEBUG_PREEMPT can cause parts of the branch tracer to recurse outside
    its recursion protection. In fact, part of its recursion protection
    used these features that caused the lockup. This cleans up the code a
    little and makes the recursion protection a bit more robust"

    * tag 'trace-v4.2-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Have branch tracer use recursive field of task struct

    Linus Torvalds
     

15 Jul, 2015

1 commit

  • Boris reported that the sparse_irq protection around __cpu_up() in the
    generic code causes a regression on Xen. Xen allocates interrupts and
    some more in the xen_cpu_up() function, so it deadlocks on the
    sparse_irq_lock.

    There is no simple fix for this and we really should have the
    protection for all architectures, but for now the only solution is to
    move it to x86 where actual wreckage due to the lack of protection has
    been observed.

    Reported-and-tested-by: Boris Ostrovsky
    Fixes: a89941816726 'hotplug: Prevent alloc/free of irq descriptors during cpu up/down'
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: xiao jin
    Cc: Joerg Roedel
    Cc: Borislav Petkov
    Cc: Yanmin Zhang
    Cc: xen-devel

    Thomas Gleixner
     

14 Jul, 2015

3 commits