24 Nov, 2016

1 commit

  • For large values of "mult" and long uptimes, the intermediate
    result of "cycles * mult" can overflow 64 bits. For example,
    the tile platform calls clocksource_cyc2ns with a 1.2 GHz clock;
    we have mult = 853, and after 208.5 days, we overflow 64 bits.

    Since clocksource_cyc2ns() is intended to be used for relative
    cycle counts, not absolute cycle counts, performance is more
    important than accepting a wider range of cycle values. So,
    just use mult_frac() directly in tile's sched_clock().

    Commit 4cecf6d401a0 ("sched, x86: Avoid unnecessary overflow
    in sched_clock") by Salman Qazi results in essentially the same
    generated code for x86 as this change does for tile. In fact,
    a follow-on change by Salman introduced mult_frac() and switched
    to using it, so the C code was largely identical at that point too.

    Peter Zijlstra then added mul_u64_u32_shr() and switched x86
    to use it. This is, in principle, better; by optimizing the
    64x64->64 multiplies to be 32x32->64 multiplies we can potentially
    save some time. However, the compiler pipelines the 64x64->64
    multiplies pretty well, and the conditional branch in the generic
    mul_u64_u32_shr() causes some bubbles in execution, with the
    result that it's pretty much a wash. If tilegx provided its own
    implementation of mul_u64_u32_shr() without the conditional branch,
    we could potentially save 3 cycles, but that seems like small gain
    for a fair amount of additional build scaffolding; no other platform
    currently provides a mul_u64_u32_shr() override, and tile doesn't
    currently have a suitable header to put the override in.

    Additionally, gcc currently has an optimization bug that prevents
    it from recognizing the opportunity to use a 32x32->64 multiply,
    and so the result would be no better than the existing mult_frac()
    until such time as the compiler is fixed.

    For now, just using mult_frac() seems like the right answer.

    Cc: stable@kernel.org [v3.4+]
    Signed-off-by: Chris Metcalf

    Chris Metcalf
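
    A quick way to see the arithmetic above is a small userspace demo; this is
    a sketch only (the mult_frac() shape mirrors the kernel macro, and the
    shift value of 10 is an assumption inferred from mult = 853 at 1.2 GHz):

        #include <stdint.h>
        #include <stdio.h>

        /* Same shape as the kernel's mult_frac(): split x by the divisor so
         * no intermediate product needs more than 64 bits. */
        #define mult_frac(x, numer, denom) ({                    \
                typeof(x) quot = (x) / (denom);                  \
                typeof(x) rem  = (x) % (denom);                  \
                (quot * (numer)) + ((rem * (numer)) / (denom));  \
        })

        int main(void)
        {
                /* ~209 days of a 1.2 GHz counter, just past the ~208.5-day
                 * overflow point quoted above. */
                uint64_t cycles = 1200000000ULL * 86400 * 209;
                uint64_t mult = 853, shift = 10;

                uint64_t naive = (cycles * mult) >> shift;  /* cycles * mult wraps */
                uint64_t safe  = mult_frac(cycles, mult, 1ULL << shift);

                printf("naive=%llu safe=%llu\n",
                       (unsigned long long)naive, (unsigned long long)safe);
                return 0;
        }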
     

15 Nov, 2016

1 commit

  • The tile architecture already marks RO_DATA as read-only in
    the kernel, so grouping RO_AFTER_INIT_DATA with RO_DATA, as is
    done by default, means the kernel faults in init when it tries
    to write to RO_AFTER_INIT_DATA. For now, just arrange that
    __ro_after_init is handled like __write_once, i.e. __read_mostly.

    Reviewed-by: Kees Cook
    Signed-off-by: Chris Metcalf

    Chris Metcalf
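
    A minimal sketch of the workaround described above: the arch header
    defines __ro_after_init as __read_mostly, while a generic fallback (in the
    style of linux/cache.h) keeps the real read-only section for everyone
    else. The exact header locations are assumptions, not the literal patch:

        /* arch header, tile-style workaround: data stays writable, it is
         * merely grouped with the cache-friendly __read_mostly data. */
        #define __ro_after_init __read_mostly

        /* generic fallback for arches whose RO_DATA is not already
         * write-protected before init completes: */
        #ifndef __ro_after_init
        #define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
        #endif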
     

12 Oct, 2016

1 commit

  • Currently, all callers to randomize_range() set the length to 0 and
    calculate end by adding a constant to the start address. We can simplify
    the API to remove a bunch of needless checks and variables.

    Use the new randomize_addr(start, range) call to set the requested
    address.

    Link: http://lkml.kernel.org/r/20160803233913.32511-6-jason@lakedaemon.net
    Signed-off-by: Jason Cooper
    Acked-by: Kees Cook
    Cc: "Theodore Ts'o"
    Cc: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Cooper
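
    A hedged before/after sketch of the call-site simplification described
    above; the old randomize_range() signature is shown from memory and the
    variables addr/start/range are illustrative:

        /* before: length forced to 0, end computed by every caller */
        addr = randomize_range(start, start + range, 0);
        if (!addr)
                addr = start;

        /* after: one call expresses the same request */
        addr = randomize_addr(start, range);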
     

08 Oct, 2016

3 commits

  • When doing an nmi backtrace of many cores, most of which are idle, the
    output is a little overwhelming and very uninformative. Suppress
    messages for cpus that are idling when they are interrupted and just
    emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

    We do this by grouping all the cpuidle code together into a new
    .cpuidle.text section, and then checking the address of the interrupted
    PC to see if it lies within that section.

    This commit suitably tags x86 and tile idle routines, and only adds in
    the minimal framework for other architectures.

    Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Peter Zijlstra (Intel)
    Tested-by: Daniel Thompson [arm]
    Tested-by: Petr Mladek
    Cc: Aaron Tomlin
    Cc: Peter Zijlstra (Intel)
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
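
    A minimal sketch of the PC check described above, assuming the linker
    script exports bounds for the new section; the helper name here is
    illustrative, not necessarily the one used in the patch:

        extern char __cpuidle_text_start[], __cpuidle_text_end[];

        /* True if the interrupted PC lies inside the grouped .cpuidle.text
         * section, i.e. the cpu was idling when the backtrace NMI arrived. */
        static bool pc_in_cpuidle_text(unsigned long pc)
        {
                return pc >= (unsigned long)__cpuidle_text_start &&
                       pc <  (unsigned long)__cpuidle_text_end;
        }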
     
  • Previously tile was rolling its own method of capturing backtrace data
    in the NMI handlers, but it was relying on running printk() from the NMI
    handler, which is not always safe. So adopt the nmi_backtrace model
    (with the new cpumask extension) instead.

    So that we can call the nmi_backtrace code directly from the NMI handler,
    move the nmi_enter()/exit() into the top-level tile NMI handler.

    The semantics of the routine change slightly since it is now synchronous
    with the remote cores completing the backtraces. Previously it was
    asynchronous, but with protection to avoid starting a new remote
    backtrace if the old one was still in progress.

    Link: http://lkml.kernel.org/r/1472487169-14923-4-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Cc: Daniel Thompson [arm]
    Cc: Petr Mladek
    Cc: Aaron Tomlin
    Cc: Peter Zijlstra (Intel)
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     
  • This came to light when implementing native 64-bit atomics for ARCv2.

    The atomic64 self-test code uses CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
    to check whether atomic64_dec_if_positive() is available. It seems it
    was needed when not every arch defined it. However, as of the current code
    the Kconfig option seems needless:

    - for CONFIG_GENERIC_ATOMIC64 it is auto-enabled in lib/Kconfig and a
    generic definition of the API is present in lib/atomic64.c
    - arches with native 64-bit atomics select it in arch/*/Kconfig and
    define the API in their headers

    So I see no point in keeping the Kconfig option.

    Compile tested for:
    - blackfin (CONFIG_GENERIC_ATOMIC64)
    - x86 (!CONFIG_GENERIC_ATOMIC64)
    - ia64

    Link: http://lkml.kernel.org/r/1473703083-8625-3-git-send-email-vgupta@synopsys.com
    Signed-off-by: Vineet Gupta
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Ralf Baechle
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Vineet Gupta
    Cc: Zhaoxiu Zeng
    Cc: Linus Walleij
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Herbert Xu
    Cc: Ming Lin
    Cc: Arnd Bergmann
    Cc: Geert Uytterhoeven
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Andi Kleen
    Cc: Boqun Feng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vineet Gupta
     

31 Aug, 2016

1 commit

  • There are three usercopy warnings which are currently being silenced for
    gcc 4.6 and newer:

    1) "copy_from_user() buffer size is too small" compile warning/error

    This is a static warning which happens when object size and copy size
    are both const, and copy size > object size. I didn't see any false
    positives for this one. So the function warning attribute seems to
    be working fine here.

    Note this scenario is always a bug and so I think it should be
    changed to *always* be an error, regardless of
    CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

    2) "copy_from_user() buffer size is not provably correct" compile warning

    This is another static warning which happens when I enable
    __compiletime_object_size() for new compilers (and
    CONFIG_DEBUG_STRICT_USER_COPY_CHECKS). It happens when object size
    is const, but copy size is *not*. In this case there's no way to
    compare the two at build time, so it gives the warning. (Note the
    warning is a byproduct of the fact that gcc has no way of knowing
    whether the overflow function will be called, so the call isn't dead
    code and the warning attribute is activated.)

    So this warning seems to only indicate "this is an unusual pattern,
    maybe you should check it out" rather than "this is a bug".

    I get 102(!) of these warnings with allyesconfig and the
    __compiletime_object_size() gcc check removed. I don't know if there
    are any real bugs hiding in there, but from looking at a small
    sample, I didn't see any. According to Kees, it does sometimes find
    real bugs. But the false positive rate seems high.

    3) "Buffer overflow detected" runtime warning

    This is a runtime warning where object size is const, and copy size >
    object size.

    All three warnings (both static and runtime) were completely disabled
    for gcc 4.6 with the following commit:

    2fb0815c9ee6 ("gcc4: disable __compiletime_object_size for GCC 4.6+")

    That commit mistakenly assumed that the false positives were caused by a
    gcc bug in __compiletime_object_size(). But in fact,
    __compiletime_object_size() seems to be working fine. The false
    positives were instead triggered by #2 above. (Though I don't have an
    explanation for why the warnings supposedly only started showing up in
    gcc 4.6.)

    So remove warning #2 to get rid of all the false positives, and re-enable
    warnings #1 and #3 by reverting the above commit.

    Furthermore, since #1 is a real bug which is detected at compile time,
    upgrade it to always be an error.

    Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
    needed.

    Signed-off-by: Josh Poimboeuf
    Cc: Kees Cook
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H . Peter Anvin"
    Cc: Andy Lutomirski
    Cc: Steven Rostedt
    Cc: Brian Gerst
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Byungchul Park
    Cc: Nilay Vaish
    Signed-off-by: Linus Torvalds

    Josh Poimboeuf
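
    A simplified sketch of where the three warnings come from; the helper
    names are illustrative, not the actual uaccess code:

        /* never defined: referencing them triggers the function attribute */
        extern void bad_copy_error(void)
                __attribute__((error("copy_from_user() buffer size is too small")));
        extern void bad_copy_warn(void)
                __attribute__((warning("copy size is not provably correct")));

        static inline unsigned long
        checked_copy_from_user(void *to, const void __user *from, unsigned long n)
        {
                size_t sz = __builtin_object_size(to, 0); /* (size_t)-1 if unknown */

                if (sz != (size_t)-1) {
                        if (__builtin_constant_p(n) && n > sz)
                                bad_copy_error();  /* warning 1: always a real bug */
                        else if (!__builtin_constant_p(n))
                                bad_copy_warn();   /* warning 2: the false-positive
                                                      source removed by this patch */
                }
                return copy_from_user(to, from, n); /* warning 3 is the runtime
                                                       check in the real code */
        }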
     

24 Aug, 2016

1 commit

  • Storing this value will help prevent unwinders from getting out of sync
    with the function graph tracer ret_stack. Now instead of needing a
    stateful iterator, they can compare the return address pointer to find
    the right ret_stack entry.

    Note that an array of 50 ftrace_ret_stack structs is allocated for every
    task. So when an arch implements this, it will add either 200 or 400
    bytes of memory usage per task (depending on whether it's a 32-bit or
    64-bit platform).

    Signed-off-by: Josh Poimboeuf
    Acked-by: Steven Rostedt
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Byungchul Park
    Cc: Denys Vlasenko
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Nilay Vaish
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
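
    A sketch of the bookkeeping described above; the struct layout is
    simplified and the config symbol is taken from the surrounding series,
    so treat the details as illustrative:

        #define FTRACE_RETFUNC_DEPTH 50

        struct ftrace_ret_stack {
                unsigned long ret;       /* saved (original) return address */
                unsigned long func;      /* traced function entry           */
                unsigned long long calltime;
        #ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
                unsigned long *retp;     /* where 'ret' was found on the stack,
                                            so unwinders can match entries   */
        #endif
        };

        /* per-task cost when an arch stores retp:
         *   50 entries * sizeof(long *) = 200 bytes (32-bit) or 400 (64-bit) */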
     

04 Aug, 2016

2 commits

  • Previously, all the __exit sections were just dropped by the link phase.
    However, if there are static_key (jump label) constructs in __exit
    sections that are not modules, the link fails with the message:

    `.exit.text' referenced in section `__jump_table' of xxx.o:
    defined in discarded section `.exit.text' of xxx.o

    Support this usage by keeping the .exit.text sections in the final image
    if JUMP_LABEL is defined, then discarding them once initialization is
    complete.

    Link: http://lkml.kernel.org/r/bfd7c107c610c30e992868ebfe2a5d796a097464.1467837322.git.jbaron@akamai.com
    Signed-off-by: Jason Baron
    Signed-off-by: Chris Metcalf
    Cc: "David S. Miller"
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Joe Perches
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
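
    A hedged linker-script sketch of the arrangement described above, in the
    vmlinux.lds.h preprocessor style; the macro name is illustrative and this
    is not the literal patch:

        #ifdef CONFIG_JUMP_LABEL
        /* keep .exit.text in the image (placed with the init sections, so
         * the memory is still given back once initialization completes) */
        #define EXIT_TEXT_DISCARD
        #else
        /* no jump labels can point here, safe to drop at link time */
        #define EXIT_TEXT_DISCARD *(.exit.text)
        #endif

        /DISCARD/ : {
                EXIT_TEXT_DISCARD
        }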
     
  • The dma-mapping core and the implementations do not change the DMA
    attributes passed by pointer. Thus the pointer can point to const data.
    However the attributes do not have to be a bitfield. Instead unsigned
    long will do fine:

    1. This is just simpler. Both in terms of reading the code and setting
    attributes. Instead of initializing local attributes on the stack
    and passing pointer to it to dma_set_attr(), just set the bits.

    2. It brings safety and const-correctness checking because the
    attributes are passed by value.

    Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
    Signed-off-by: Krzysztof Kozlowski
    Acked-by: Vineet Gupta
    Acked-by: Robin Murphy
    Acked-by: Hans-Christian Noren Egtvedt
    Acked-by: Mark Salter [c6x]
    Acked-by: Jesper Nilsson [cris]
    Acked-by: Daniel Vetter [drm]
    Reviewed-by: Bart Van Assche
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Fabien Dessenne [bdisp]
    Reviewed-by: Marek Szyprowski [vb2-core]
    Acked-by: David Vrabel [xen]
    Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Richard Kuo [hexagon]
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Gerald Schaefer [s390]
    Acked-by: Bjorn Andersson
    Acked-by: Hans-Christian Noren Egtvedt [avr32]
    Acked-by: Vineet Gupta [arc]
    Acked-by: Robin Murphy [arm64 and dma-iommu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krzysztof Kozlowski
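
    A before/after sketch of a caller, illustrating point 1 above; the
    device, size and attribute chosen are representative only:

        void *vaddr;
        dma_addr_t dma_handle;

        /* before: attributes were a struct built on the stack */
        DEFINE_DMA_ATTRS(attrs);
        dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
        vaddr = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL, &attrs);

        /* after: attributes are just bits in an unsigned long */
        vaddr = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL,
                                DMA_ATTR_WRITE_COMBINE);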
     

03 Aug, 2016

1 commit

  • In general, there's no need for the "restore sigmask" flag to live in
    ti->flags. alpha, ia64, microblaze, powerpc, sh, sparc (64-bit only),
    tile, and x86 use essentially identical alternative implementations,
    placing the flag in ti->status.

    Replace those optimized implementations with an equally good common
    implementation that stores it in a bitfield in struct task_struct and
    drop the custom implementations.

    Additional architectures can opt in by removing their
    TIF_RESTORE_SIGMASK defines.

    Link: http://lkml.kernel.org/r/8a14321d64a28e40adfddc90e18a96c086a6d6f9.1468522723.git.luto@kernel.org
    Signed-off-by: Andy Lutomirski
    Tested-by: Michael Ellerman [powerpc]
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Michal Simek
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Yoshinori Sato
    Cc: Rich Felker
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dmitry Safonov
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
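
    A minimal sketch of the common implementation described above; it is
    simplified (the real helpers also interact with TIF_SIGPENDING):

        struct task_struct {
                /* ... */
                unsigned restore_sigmask:1;  /* replaces per-arch ti->status flag */
        };

        static inline void set_restore_sigmask(void)
        {
                current->restore_sigmask = true;
        }

        static inline bool test_and_clear_restore_sigmask(void)
        {
                if (!current->restore_sigmask)
                        return false;
                current->restore_sigmask = false;
                return true;
        }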
     

30 Jul, 2016

1 commit

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - TPM core and driver updates/fixes
    - IPv6 security labeling (CALIPSO)
    - Lots of Apparmor fixes
    - Seccomp: remove 2-phase API, close hole where ptrace can change
    syscall #"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
    apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
    tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
    tpm: Factor out common startup code
    tpm: use devm_add_action_or_reset
    tpm2_i2c_nuvoton: add irq validity check
    tpm: read burstcount from TPM_STS in one 32-bit transaction
    tpm: fix byte-order for the value read by tpm2_get_tpm_pt
    tpm_tis_core: convert max timeouts from msec to jiffies
    apparmor: fix arg_size computation for when setprocattr is null terminated
    apparmor: fix oops, validate buffer size in apparmor_setprocattr()
    apparmor: do not expose kernel stack
    apparmor: fix module parameters can be changed after policy is locked
    apparmor: fix oops in profile_unpack() when policy_db is not present
    apparmor: don't check for vmalloc_addr if kvzalloc() failed
    apparmor: add missing id bounds check on dfa verification
    apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
    apparmor: use list_next_entry instead of list_entry_next
    apparmor: fix refcount race when finding a child profile
    apparmor: fix ref count leak when profile sha1 hash is read
    apparmor: check that xindex is in trans_table bounds
    ...

    Linus Torvalds
     

29 Jul, 2016

3 commits

  • There are now a number of accounting oddities such as mapped file pages
    being accounted for on the node while the total number of file pages is
    accounted on the zone. This can be coped with to some extent but it's
    confusing, so this patch moves the relevant file-based accounting. Due to
    throttling logic in the page allocator for reliable OOM detection, it is
    still necessary to track dirty and writeback pages on a per-zone basis.

    [mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
    Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
    Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Hillf Danton
    Acked-by: Johannes Weiner
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Reclaim makes decisions based on the number of pages that are mapped but
    it's mixing node and zone information. Account NR_FILE_MAPPED and
    NR_ANON_PAGES pages on the node.

    Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Hillf Danton
    Acked-by: Johannes Weiner
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • This moves the LRU lists from the zone to the node and related data such
    as counters, tracing, congestion tracking and writeback tracking.

    Unfortunately, due to reclaim and compaction retry logic, it is
    necessary to account for the number of LRU pages on both the zone and the
    node. Most reclaim logic is based on the node counters but the retry
    logic uses the zone counters which do not distinguish inactive and
    active sizes. It would be possible to leave the LRU counters on a
    per-zone basis but it's a heavier calculation across multiple cache
    lines that is much more frequent than the retry checks.

    Other than the LRU counters, this is mostly a mechanical patch but note
    that it introduces a number of anomalies. For example, the scans are
    per-zone but using per-node counters. We also mark a node as congested
    when a zone is congested. This causes weird problems that are fixed
    later but is easier to review.

    In the event that there is excessive overhead on 32-bit systems due to
    the nodes being on LRU then there are two potential solutions

    1. Long-term isolation of highmem pages when reclaim is lowmem

    When pages are skipped, they are immediately added back onto the LRU
    list. If lowmem reclaim persisted for long periods of time, the same
    highmem pages get continually scanned. The idea would be that lowmem
    keeps those pages on a separate list until a reclaim for highmem pages
    arrives that splices the highmem pages back onto the LRU. It potentially
    could be implemented similar to the UNEVICTABLE list.

    That would reduce the skip rate, with the potential corner case being that
    highmem pages have to be scanned and reclaimed to free lowmem slab pages.

    2. Linear scan lowmem pages if the initial LRU shrink fails

    This will break LRU ordering but may be preferable and faster during
    memory pressure than skipping LRU pages.

    Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Johannes Weiner
    Acked-by: Vlastimil Babka
    Cc: Hillf Danton
    Cc: Joonsoo Kim
    Cc: Michal Hocko
    Cc: Minchan Kim
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

26 Jul, 2016

2 commits

  • Pull locking updates from Ingo Molnar:
    "The locking tree was busier in this cycle than the usual pattern - a
    couple of major projects happened to coincide.

    The main changes are:

    - implement the atomic_fetch_{add,sub,and,or,xor}() API natively
    across all SMP architectures (Peter Zijlstra)

    - add atomic_fetch_{inc/dec}() as well, using the generic primitives
    (Davidlohr Bueso)

    - optimize various aspects of rwsems (Jason Low, Davidlohr Bueso,
    Waiman Long)

    - optimize smp_cond_load_acquire() on arm64 and implement LSE based
    atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
    on arm64 (Will Deacon)

    - introduce smp_acquire__after_ctrl_dep() and fix various barrier
    mis-uses and bugs (Peter Zijlstra)

    - after discovering ancient spin_unlock_wait() barrier bugs in its
    implementation and usage, strengthen its semantics and update/fix
    usage sites (Peter Zijlstra)

    - optimize mutex_trylock() fastpath (Peter Zijlstra)

    - ... misc fixes and cleanups"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
    locking/atomic: Introduce inc/dec variants for the atomic_fetch_$op() API
    locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
    locking/static_keys: Fix non static symbol Sparse warning
    locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()
    locking/atomic, arch/tile: Fix tilepro build
    locking/atomic, arch/m68k: Remove comment
    locking/atomic, arch/arc: Fix build
    locking/Documentation: Clarify limited control-dependency scope
    locking/atomic, arch/rwsem: Employ atomic_long_fetch_add()
    locking/atomic, arch/qrwlock: Employ atomic_fetch_add_acquire()
    locking/atomic, arch/mips: Convert to _relaxed atomics
    locking/atomic, arch/alpha: Convert to _relaxed atomics
    locking/atomic: Remove the deprecated atomic_{set,clear}_mask() functions
    locking/atomic: Remove linux/atomic.h:atomic_fetch_or()
    locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
    locking/atomic: Fix atomic64_relaxed() bits
    locking/atomic, arch/xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
    locking/atomic, arch/x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
    locking/atomic, arch/tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
    locking/atomic, arch/sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
    ...

    Linus Torvalds
     
  • AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
    NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
    for tile at all even though ARCH_DLINFO will contain one NEW_AUX_ENT for
    the VDSO address.

    This shouldn't be a problem as AT_VECTOR_SIZE_BASE includes space for
    AT_BASE_PLATFORM which tile doesn't use, but let's define it now and add
    the comment above ARCH_DLINFO as found in several other architectures to
    remind future modifiers of ARCH_DLINFO to keep AT_VECTOR_SIZE_ARCH up to
    date.

    Fixes: 4a556f4f56da ("tile: implement gettimeofday() via vDSO")
    Signed-off-by: James Hogan
    Cc: Chris Metcalf
    Signed-off-by: Chris Metcalf

    James Hogan
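
    A sketch of the definition plus reminder comment the entry above
    describes; the vdso expression is illustrative rather than the exact
    tile code:

        /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
        #define ARCH_DLINFO \
                NEW_AUX_ENT(AT_SYSINFO_EHDR, \
                            (unsigned long)current->mm->context.vdso_base)

        #define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO above */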
     

25 Jun, 2016

4 commits

  • Merge misc fixes from Andrew Morton:
    "Two weeks worth of fixes here"

    * emailed patches from Andrew Morton : (41 commits)
    init/main.c: fix initcall_blacklisted on ia64, ppc64 and parisc64
    autofs: don't get stuck in a loop if vfs_write() returns an error
    mm/page_owner: avoid null pointer dereference
    tools/vm/slabinfo: fix spelling mistake: "Ocurrences" -> "Occurrences"
    fs/nilfs2: fix potential underflow in call to crc32_le
    oom, suspend: fix oom_reaper vs. oom_killer_disable race
    ocfs2: disable BUG assertions in reading blocks
    mm, compaction: abort free scanner if split fails
    mm: prevent KASAN false positives in kmemleak
    mm/hugetlb: clear compound_mapcount when freeing gigantic pages
    mm/swap.c: flush lru pvecs on compound page arrival
    memcg: css_alloc should return an ERR_PTR value on error
    memcg: mem_cgroup_migrate() may be called with irq disabled
    hugetlb: fix nr_pmds accounting with shared page tables
    Revert "mm: disable fault around on emulated access bit architecture"
    Revert "mm: make faultaround produce old ptes"
    mailmap: add Boris Brezillon's email
    mailmap: add Antoine Tenart's email
    mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask
    mm: mempool: kasan: don't poot mempool objects in quarantine
    ...

    Linus Torvalds
     
  • __GFP_REPEAT has a rather weak semantic but since it has been introduced
    around 2.6.12 it has been ignored for low order allocations.

    pgtable_alloc_one uses __GFP_REPEAT flag for L2_USER_PGTABLE_ORDER but
    the order is either 0 or 3 if L2_KERNEL_PGTABLE_SHIFT for HPAGE_SHIFT.
    This means that this flag has never been actually useful here because it
    has always been used only for PAGE_ALLOC_COSTLY requests.

    Link: http://lkml.kernel.org/r/1464599699-30131-16-git-send-email-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Chris Metcalf [for tile]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • We've had the thread info allocated together with the thread stack for
    most architectures for a long time (since the thread_info was split off
    from the task struct), but that is about to change.

    But the patches that move the thread info to be off-stack (and a part of
    the task struct instead) made it clear how confused the allocator and
    freeing functions are.

    Because the common case was that we share an allocation with the thread
    stack and the thread_info, the two pointers were identical. That
    identity then meant that we would have things like

    ti = alloc_thread_info_node(tsk, node);
    ...
    tsk->stack = ti;

    which certainly _worked_ (since stack and thread_info have the same
    value), but is rather confusing: why are we assigning a thread_info to
    the stack? And if we move the thread_info away, the "confusing" code
    just gets to be entirely bogus.

    So remove all this confusion, and make it clear that we are doing the
    stack allocation by renaming and clarifying the function names to be
    about the stack. The fact that the thread_info then shares the
    allocation is an implementation detail, and not really about the
    allocation itself.

    This is a pure renaming and type fix: we pass in the same pointer, it's
    just that we clarify what the pointer means.

    The ia64 code that actually only has one single allocation (for all of
    task_struct, thread_info and kernel thread stack) now looks a bit odd,
    but since "tsk->stack" is actually not even used there, that oddity
    doesn't matter. It would be a separate thing to clean that up, I
    intentionally left the ia64 changes as a pure brute-force renaming and
    type change.

    Acked-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The tip gcc includes an optimization mode that converts
    64-bit divides into 128-bit multiplies using __multi3.
    Export the symbol so that modules can find it. We just
    export unconditionally without worrying about the gcc
    version, since the symbol has been in libgcc forever and
    the function is less than 300 bytes.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
     

24 Jun, 2016

1 commit

  • The tilepro change wasn't ever compiled it seems (the 0day built bot
    also doesn't have a toolchain for it).

    Make it work.

    The thing that makes the patch bigger than desired is namespace
    collision with the C11 __atomic builtin functions. So rename the
    tilepro functions to __atomic32.

    Reported-by: Sudip Mukherjee
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Chris Metcalf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephen Rothwell
    Cc: Thomas Gleixner
    Fixes: 1af5de9af138 ("locking/atomic, arch/tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()")
    Link: http://lkml.kernel.org/r/20160622091649.GB30154@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

16 Jun, 2016

3 commits

  • Since all architectures have this implemented now natively, remove this
    dead code.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Implement FETCH-OP atomic primitives, these are very similar to the
    existing OP-RETURN primitives we already have, except they return the
    value of the atomic variable _before_ modification.

    This is especially useful for irreversible operations -- such as
    bitops (because it becomes impossible to reconstruct the state prior
    to modification).

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Chris Metcalf
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
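
    A small illustration of the semantic difference described above, using
    the generic atomic API with arbitrary values:

        atomic_t v = ATOMIC_INIT(4);

        int after  = atomic_add_return(2, &v);  /* returns 6: value AFTER the add */
        int before = atomic_fetch_add(2, &v);   /* returns 6: value BEFORE this add;
                                                   v now holds 8 */

        /* for irreversible ops such as bit-clearing, only the fetched
         * pre-modification value tells you which bits were actually set */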
     
  • The glibc __LONG_LONG_PAIR passes arguments in a different
    order for BE platforms vs LE platforms. Adjust the expectations
    in the kernel 32-bit syscall routines for BE to match.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
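
    For reference, the glibc macro in question selects the word order by
    endianness; roughly (quoted from memory, not from this patch):

        #if __BYTE_ORDER == __LITTLE_ENDIAN
        # define __LONG_LONG_PAIR(HI, LO) LO, HI   /* low word passed first  */
        #else                                      /* __BIG_ENDIAN */
        # define __LONG_LONG_PAIR(HI, LO) HI, LO   /* high word passed first */
        #endif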
     

14 Jun, 2016

2 commits

  • This patch updates/fixes all spin_unlock_wait() implementations.

    The update is in semantics; where it previously was only a control
    dependency, we now upgrade to a full load-acquire to match the
    store-release from the spin_unlock() we waited on. This ensures that
    when spin_unlock_wait() returns, we're guaranteed to observe the full
    critical section we waited on.

    This fixes a number of spin_unlock_wait() users that (not
    unreasonably) rely on this.

    I also fixed a number of ticket lock versions to only wait on the
    current lock holder, instead of for a full unlock, as this is
    sufficient.

    Furthermore; again for ticket locks; I added an smp_rmb() in between
    the initial ticket load and the spin loop testing the current value
    because I could not convince myself the address dependency is
    sufficient, esp. if the loads are of different sizes.

    I'm more than happy to remove this smp_rmb() again if people are
    certain the address dependency does indeed work as expected.

    Note: PPC32 will be fixed independently

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: chris@zankel.net
    Cc: cmetcalf@mellanox.com
    Cc: davem@davemloft.net
    Cc: dhowells@redhat.com
    Cc: james.hogan@imgtec.com
    Cc: jejb@parisc-linux.org
    Cc: linux@armlinux.org.uk
    Cc: mpe@ellerman.id.au
    Cc: ralf@linux-mips.org
    Cc: realmz6@gmail.com
    Cc: rkuo@codeaurora.org
    Cc: rth@twiddle.net
    Cc: schwidefsky@de.ibm.com
    Cc: tony.luck@intel.com
    Cc: vgupta@synopsys.com
    Cc: ysato@users.sourceforge.jp
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Since TILE doesn't do read speculation, its control dependencies also
    guarantee LOAD->LOAD order and we don't need the additional RMB
    otherwise required to provide ACQUIRE semantics.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Chris Metcalf
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

26 May, 2016

1 commit

  • Pull perf updates from Ingo Molnar:
    "Mostly tooling and PMU driver fixes, but also a number of late updates
    such as the reworking of the call-chain size limiting logic to make
    call-graph recording more robust, plus tooling side changes for the
    new 'backwards ring-buffer' extension to the perf ring-buffer"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
    perf record: Read from backward ring buffer
    perf record: Rename variable to make code clear
    perf record: Prevent reading invalid data in record__mmap_read
    perf evlist: Add API to pause/resume
    perf trace: Use the ptr->name beautifier as default for "filename" args
    perf trace: Use the fd->name beautifier as default for "fd" args
    perf report: Add srcline_from/to branch sort keys
    perf evsel: Record fd into perf_mmap
    perf evsel: Add overwrite attribute and check write_backward
    perf tools: Set buildid dir under symfs when --symfs is provided
    perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
    perf annotate: Sort list of recognised instructions
    perf annotate: Fix identification of ARM blt and bls instructions
    perf tools: Fix usage of max_stack sysctl
    perf callchain: Stop validating callchains by the max_stack sysctl
    perf trace: Fix exit_group() formatting
    perf top: Use machine->kptr_restrict_warned
    perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
    perf machine: Do not bail out if not managing to read ref reloc symbol
    perf/x86/intel/p4: Trival indentation fix, remove space
    ...

    Linus Torvalds
     

24 May, 2016

4 commits

  • Merge yet more updates from Andrew Morton:

    - Oleg's "wait/ptrace: assume __WALL if the child is traced". It's a
    kernel-based workaround for existing userspace issues.

    - A few hotfixes

    - befs cleanups

    - nilfs2 updates

    - sys_wait() changes

    - kexec updates

    - kdump

    - scripts/gdb updates

    - the last of the MM queue

    - a few other misc things

    * emailed patches from Andrew Morton : (84 commits)
    kgdb: depends on VT
    drm/amdgpu: make amdgpu_mn_get wait for mmap_sem killable
    drm/radeon: make radeon_mn_get wait for mmap_sem killable
    drm/i915: make i915_gem_mmap_ioctl wait for mmap_sem killable
    uprobes: wait for mmap_sem for write killable
    prctl: make PR_SET_THP_DISABLE wait for mmap_sem killable
    exec: make exec path waiting for mmap_sem killable
    aio: make aio_setup_ring killable
    coredump: make coredump_wait wait for mmap_sem for write killable
    vdso: make arch_setup_additional_pages wait for mmap_sem for write killable
    ipc, shm: make shmem attach/detach wait for mmap_sem killable
    mm, fork: make dup_mmap wait for mmap_sem for write killable
    mm, proc: make clear_refs killable
    mm: make vm_brk killable
    mm, elf: handle vm_brk error
    mm, aout: handle vm_brk failures
    mm: make vm_munmap killable
    mm: make vm_mmap killable
    mm: make mmap_sem for write waits killable for mm syscalls
    MAINTAINERS: add co-maintainer for scripts/gdb
    ...

    Linus Torvalds
     
  • Pull arch/tile updates from Chris Metcalf:
    "This is an even quieter cycle than usual"

    * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
    Fix typo
    Fix typo
    Fix typo
    tile: sort the "select" lines in the TILE/TILEGX configs
    tile: clarify barrier semantics of atomic_add_return
    tile/defconfigs: Remove CONFIG_IPV6_PRIVACY

    Linus Torvalds
     
  • This option was replaced by PAGE_COUNTER which is selected by MEMCG.

    Signed-off-by: Konstantin Khlebnikov
    Acked-by: Arnd Bergmann
    Acked-by: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Signed-off-by: Andrea Gelmini
    Signed-off-by: Chris Metcalf

    Andrea Gelmini