26 Mar, 2016

1 commit

  • Implement the stack depot and provide CONFIG_STACKDEPOT. Stack depot
    will allow KASAN to store allocation/deallocation stack traces for memory
    chunks. The stack traces are stored in a hash table and referenced by
    handles which reside in the kasan_alloc_meta and kasan_free_meta
    structures in the allocated memory chunks.

    IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
    duplication.

    Right now stackdepot support is only enabled in the SLAB allocator. Once
    KASAN features in SLAB are on par with those in SLUB we can switch SLUB
    to stackdepot as well, thus removing the dependency on SLUB stack
    bookkeeping, which wastes a lot of memory.

    This patch is based on the "mm: kasan: stack depots" patch originally
    prepared by Dmitry Chernenkov.

    Joonsoo has said that he plans to reuse the stackdepot code for the
    mm/page_owner.c debugging facility.

    [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
    [aryabinin@virtuozzo.com: comment style fixes]
    Signed-off-by: Alexander Potapenko
    Signed-off-by: Andrey Ryabinin
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Steven Rostedt
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
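
    The idea described in the commit above (identical stack traces stored
    once in a hash table and referenced by small handles) can be sketched
    in user-space C. This is only an illustration under stated assumptions:
    the names depot_save() and depot_fetch(), the fixed table sizes and the
    FNV-1a hash are hypothetical and are not taken from the kernel
    implementation.

```c
#include <stdint.h>
#include <string.h>

#define DEPOT_BUCKETS    64   /* hypothetical hash table size */
#define DEPOT_MAX_TRACES 128  /* hypothetical capacity */
#define DEPOT_MAX_FRAMES 16   /* hypothetical maximum trace depth */

typedef uint32_t depot_handle_t;  /* 0 means "no trace" */

struct depot_entry {
    uint32_t hash;
    uint32_t nr_frames;
    uintptr_t frames[DEPOT_MAX_FRAMES];
    depot_handle_t next;          /* next entry in the same bucket, 0 ends the chain */
};

/* Must be zero-initialized before first use. */
struct depot {
    struct depot_entry entries[DEPOT_MAX_TRACES + 1]; /* slot 0 is unused */
    depot_handle_t buckets[DEPOT_BUCKETS];
    uint32_t used;
};

/* FNV-1a over the bytes of every frame address. */
static uint32_t hash_frames(const uintptr_t *frames, uint32_t n)
{
    uint32_t h = 2166136261u;

    for (uint32_t i = 0; i < n; i++) {
        uintptr_t v = frames[i];

        for (size_t b = 0; b < sizeof(v); b++) {
            h ^= (uint32_t)(v & 0xff);
            h *= 16777619u;
            v >>= 8;
        }
    }
    return h;
}

/* Store a trace, or return the handle of an identical trace saved earlier. */
depot_handle_t depot_save(struct depot *d, const uintptr_t *frames, uint32_t n)
{
    if (n == 0 || n > DEPOT_MAX_FRAMES)
        return 0;

    uint32_t h = hash_frames(frames, n);
    depot_handle_t *bucket = &d->buckets[h % DEPOT_BUCKETS];

    for (depot_handle_t i = *bucket; i != 0; i = d->entries[i].next) {
        const struct depot_entry *e = &d->entries[i];

        if (e->hash == h && e->nr_frames == n &&
            memcmp(e->frames, frames, n * sizeof(*frames)) == 0)
            return i;             /* duplicate: reuse the existing handle */
    }

    if (d->used == DEPOT_MAX_TRACES)
        return 0;                 /* depot is full */

    depot_handle_t handle = ++d->used;
    struct depot_entry *e = &d->entries[handle];

    e->hash = h;
    e->nr_frames = n;
    memcpy(e->frames, frames, n * sizeof(*frames));
    e->next = *bucket;            /* push onto the bucket chain */
    *bucket = handle;
    return handle;
}

/* Look a trace up by handle; returns NULL for an invalid handle. */
const struct depot_entry *depot_fetch(const struct depot *d, depot_handle_t handle)
{
    if (handle == 0 || handle > d->used)
        return NULL;
    return &d->entries[handle];
}
```

    An object's metadata then only needs to hold a 32-bit handle per trace,
    and saving the same trace twice costs no additional storage.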
     

23 Mar, 2016

1 commit

  • kcov provides code coverage collection for coverage-guided fuzzing
    (randomized testing). Coverage-guided fuzzing is a testing technique
    that uses coverage feedback to determine new interesting inputs to a
    system. A notable user-space example is AFL
    (http://lcamtuf.coredump.cx/afl/). However, this technique is not
    widely used for kernel testing due to missing compiler and kernel
    support.

    kcov does not aim to collect as much coverage as possible. It aims to
    collect more or less stable coverage that is a function of syscall inputs.
    To achieve this goal it does not collect coverage in soft/hard
    interrupts, and instrumentation of some inherently non-deterministic or
    non-interesting parts of the kernel is disabled (e.g. scheduler, locking).

    Currently there is a single coverage collection mode (tracing), but the
    API anticipates additional collection modes. Initially I also
    implemented a second mode which exposes coverage in a fixed-size hash
    table of counters (what Quentin used in his original patch). I've
    dropped the second mode for simplicity.

    This patch adds the necessary support on the kernel side. The complementary
    compiler support was added in gcc revision 231296.

    We've used this support to build the syzkaller system call fuzzer, which has
    found 90 kernel bugs in just 2 months:

    https://github.com/google/syzkaller/wiki/Found-Bugs

    We've also found 30+ bugs in our internal systems with syzkaller.
    Another (yet unexplored) direction where kcov coverage would greatly
    help is more traditional "blob mutation". For example, mounting a
    random blob as a filesystem, or receiving a random blob over wire.

    Why not gcov? A typical fuzzing loop looks as follows: (1) reset
    coverage, (2) execute a bit of code, (3) collect coverage, repeat. A
    typical coverage can be just a dozen basic blocks (e.g. for an invalid
    input). In such a context gcov becomes prohibitively expensive, as the
    reset/collect coverage steps depend on the total number of basic
    blocks/edges in the program (in the case of the kernel it is about 2M).
    The cost of kcov depends only on the number of executed basic
    blocks/edges. On top of that, the kernel requires per-thread coverage
    because there are always background threads and unrelated processes that
    also produce coverage. With inlined gcov instrumentation per-thread
    coverage is not possible.

    kcov exposes kernel PCs and control flow to user-space which is
    insecure. But debugfs should not be mapped as user accessible.

    Based on a patch by Quentin Casasnovas.

    [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
    [akpm@linux-foundation.org: unbreak allmodconfig]
    [akpm@linux-foundation.org: follow x86 Makefile layout standards]
    Signed-off-by: Dmitry Vyukov
    Reviewed-by: Kees Cook
    Cc: syzkaller
    Cc: Vegard Nossum
    Cc: Catalin Marinas
    Cc: Tavis Ormandy
    Cc: Will Deacon
    Cc: Quentin Casasnovas
    Cc: Kostya Serebryany
    Cc: Eric Dumazet
    Cc: Alexander Potapenko
    Cc: Kees Cook
    Cc: Bjorn Helgaas
    Cc: Sasha Levin
    Cc: David Drysdale
    Cc: Ard Biesheuvel
    Cc: Andrey Ryabinin
    Cc: Kirill A. Shutemov
    Cc: Jiri Slaby
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
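
    The cost argument in the commit above (reset and collect depend only on
    the number of executed blocks, not on program size) can be illustrated
    with a small user-space simulation of a trace buffer. This is a sketch,
    not the kernel code: it assumes a layout in which the first word of the
    buffer counts the recorded PCs, and the names cover_reset(),
    cover_trace_pc() and cover_collect() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* area[0] holds the number of recorded PCs, area[1..] holds the PCs. */
struct cover_buf {
    uint64_t *area;
    size_t size;      /* total number of words in area, including area[0] */
};

/* Step (1) of the fuzzing loop: O(1), independent of program size. */
void cover_reset(struct cover_buf *c)
{
    c->area[0] = 0;
}

/* Called once per executed basic block; drops PCs when the buffer is full. */
void cover_trace_pc(struct cover_buf *c, uint64_t pc)
{
    uint64_t pos = c->area[0] + 1;

    if (pos < c->size) {
        c->area[pos] = pc;
        c->area[0] = pos;
    }
}

/* Step (3): copies out only the PCs that were actually executed. */
size_t cover_collect(const struct cover_buf *c, uint64_t *out, size_t max)
{
    size_t n = (size_t)c->area[0];

    if (n > max)
        n = max;
    for (size_t i = 0; i < n; i++)
        out[i] = c->area[i + 1];
    return n;
}
```

    A per-thread buffer of this shape also keeps coverage from unrelated
    threads out of the result, since each thread appends only to its own
    buffer.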
     

21 Mar, 2016

1 commit

  • Pull virtio/vhost updates from Michael Tsirkin:
    "New features, performance improvements, cleanups:

    - basic polling support for vhost
    - rework virtio to optionally use DMA API, fixing it on Xen
    - balloon stats gained a new entry
    - using the new napi_alloc_skb speeds up virtio net
    - virtio blk stats can now be read while another VCPU is busy
    inflating or deflating the balloon

    plus misc cleanups in various places"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb()
    vhost_net: basic polling support
    vhost: introduce vhost_vq_avail_empty()
    vhost: introduce vhost_has_work()
    virtio_balloon: Allow to resize and update the balloon stats in parallel
    virtio_balloon: Use a workqueue instead of "vballoon" kthread
    virtio/s390: size of SET_IND payload
    virtio/s390: use dev_to_virtio
    vhost: rename vhost_init_used()
    vhost: rename cross-endian helpers
    virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH
    vring: Use the DMA API on Xen
    virtio_pci: Use the DMA API if enabled
    virtio_mmio: Use the DMA API if enabled
    virtio: Add improved queue allocation API
    virtio_ring: Support DMA APIs
    vring: Introduce vring_use_dma_api()
    s390/dma: Allow per device dma ops
    alpha/dma: use common noop dma ops
    dma: Provide simple noop dma ops

    Linus Torvalds
     

02 Mar, 2016

1 commit

  • We are going to require dma_ops for several common drivers, even for
    systems that do have an identity mapping. Let's provide some minimal
    no-op dma_ops that can be used for that purpose.

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Joerg Roedel
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Michael S. Tsirkin

    Christian Borntraeger
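
    What "no-op" means for an identity-mapped system can be sketched as an
    operations table whose map step simply returns the physical address.
    The struct and function names below are hypothetical and far simpler
    than the kernel's real dma_ops; they only illustrate the shape of the
    idea.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_dma_addr_t;

/* Hypothetical, heavily reduced operations table. */
struct sketch_dma_ops {
    sketch_dma_addr_t (*map)(uint64_t phys, size_t len);
    void (*unmap)(sketch_dma_addr_t addr, size_t len);
    int (*supported)(uint64_t mask);
};

/* Identity mapping: the device address equals the physical address. */
static sketch_dma_addr_t noop_map(uint64_t phys, size_t len)
{
    (void)len;
    return phys;
}

/* Nothing was set up by noop_map(), so there is nothing to tear down. */
static void noop_unmap(sketch_dma_addr_t addr, size_t len)
{
    (void)addr;
    (void)len;
}

/* Every address mask is accepted in this sketch. */
static int noop_supported(uint64_t mask)
{
    (void)mask;
    return 1;
}

const struct sketch_dma_ops sketch_noop_ops = {
    .map = noop_map,
    .unmap = noop_unmap,
    .supported = noop_supported,
};
```

    A driver that always goes through such a table works unchanged whether
    the platform installs these no-op callbacks or real translating ones.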
     

20 Feb, 2016

1 commit


24 Jan, 2016

1 commit

  • Pull rdma updates from Doug Ledford:
    "Initial roundup of 4.5 merge window patches

    - Remove usage of ib_query_device and instead store attributes in
    ib_device struct

    - Move iopoll out of block and into lib, rename to irqpoll, and use
    in several places in the rdma stack as our new completion queue
    polling library mechanism. Update the other block drivers that
    already used iopoll to use the new mechanism too.

    - Replace the per-entry GID table locks with a single GID table lock

    - IPoIB multicast cleanup

    - Cleanups to the IB MR facility

    - Add support for 64bit extended IB counters

    - Fix for netlink oops while parsing RDMA nl messages

    - RoCEv2 support for the core IB code

    - mlx4 RoCEv2 support

    - mlx5 RoCEv2 support

    - Cross Channel support for mlx5

    - Timestamp support for mlx5

    - Atomic support for mlx5

    - Raw QP support for mlx5

    - MAINTAINERS update for mlx4/mlx5

    - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates

    - Add support for remote invalidate to the iSER driver (pushed
    through the RDMA tree due to dependencies, acknowledged by nab)

    - Update to NFSoRDMA (pushed through the RDMA tree due to
    dependencies, acknowledged by Bruce)"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
    IB/mlx5: Unify CQ create flags check
    IB/mlx5: Expose Raw Packet QP to user space consumers
    {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
    IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
    IB/mlx5: Add Raw Packet QP query functionality
    IB/mlx5: Add create and destroy functionality for Raw Packet QP
    IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
    IB/mlx5: Allocate a Transport Domain for each ucontext
    net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
    net/mlx5_core: Add RQ and SQ event handling
    net/mlx5_core: Export transport objects
    IB/mlx5: Expose CQE version to user-space
    IB/mlx5: Add CQE version 1 support to user QPs and SRQs
    IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
    IB/sa: Fix netlink local service GFP crash
    IB/srpt: Remove redundant wc array
    IB/qib: Improve ipoib UD performance
    IB/mlx4: Advertise RoCE v2 support
    IB/mlx4: Create and use another QP1 for RoCEv2
    IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
    ...

    Linus Torvalds
     

21 Jan, 2016

3 commits

  • UBSAN uses compile-time instrumentation to catch undefined behavior
    (UB). The compiler inserts code that performs certain kinds of checks
    before operations that could cause UB. If a check fails (i.e. UB is
    detected), a __ubsan_handle_* function is called to print an error message.

    So most of the work is done by the compiler. This patch just implements
    the ubsan handlers that print errors.

    GCC has had this capability since 4.9.x [1] (see the -fsanitize=undefined
    option and its suboptions).
    However, GCC 5.x has more checkers implemented [2].
    Article [3] has a bit more detail about UBSAN in GCC.

    [1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
    [2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
    [3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

    Issues which UBSAN has found thus far are:

    Found bugs:

    * out-of-bounds access - 97840cb67ff5 ("netfilter: nfnetlink: fix
    insufficient validation in nfnetlink_bind")

    undefined shifts:

    * d48458d4a768 ("jbd2: use a better hash function for the revoke
    table")

    * 10632008b9e1 ("clockevents: Prevent shift out of bounds")

    * 'x << -1' shift in ext4 -
    http://lkml.kernel.org/r/

    * undefined rol32(0) -
    http://lkml.kernel.org/r/

    * undefined dirty_ratelimit calculation -
    http://lkml.kernel.org/r/

    * undefined rounddown_pow_of_two(0) -
    http://lkml.kernel.org/r/

    * [WONTFIX] undefined shift in __bpf_prog_run -
    http://lkml.kernel.org/r/

    WONTFIX here because it should be fixed in the bpf program, not in the kernel.

    signed overflows:

    * 32a8df4e0b33f ("sched: Fix odd values in effective_load()
    calculations")

    * mul overflow in ntp -
    http://lkml.kernel.org/r/

    * incorrect conversion into rtc_time in rtc_time64_to_tm() -
    http://lkml.kernel.org/r/

    * unvalidated timespec in io_getevents() -
    http://lkml.kernel.org/r/

    * [NOTABUG] signed overflow in ktime_add_safe() -
    http://lkml.kernel.org/r/

    [akpm@linux-foundation.org: fix unused local warning]
    [akpm@linux-foundation.org: fix __int128 build woes]
    Signed-off-by: Andrey Ryabinin
    Cc: Peter Zijlstra
    Cc: Sasha Levin
    Cc: Randy Dunlap
    Cc: Rasmus Villemoes
    Cc: Jonathan Corbet
    Cc: Michal Marek
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Yury Gribov
    Cc: Dmitry Vyukov
    Cc: Konstantin Khlebnikov
    Cc: Kostya Serebryany
    Cc: Johannes Berg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
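
    The kinds of checks mentioned in the commit above (undefined shifts,
    signed overflows) can be written out by hand to show what is being
    tested before the operation runs. This is a hand-written analogue, not
    the code the compiler emits; the helper names are hypothetical, and
    unlike UBSAN they refuse the operation instead of reporting it.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Store value << shift in *out and return true only when the shift is
 * well defined for a signed int: the exponent must be in range, the
 * value non-negative, and the result representable.
 */
bool checked_shl_int(int value, int shift, int *out)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (shift < 0 || shift >= width)
        return false;             /* e.g. 'x << -1' or a shift that is too large */
    if (value < 0)
        return false;             /* left shift of a negative value */
    if (value > (INT_MAX >> shift))
        return false;             /* result would not fit in an int */

    *out = value << shift;
    return true;
}

/* Store a + b in *out and return true only when the sum cannot overflow. */
bool checked_add_int(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;

    *out = a + b;
    return true;
}
```

    Each early return corresponds to a condition under which the plain
    operator would have undefined behavior in C.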
     
  • The clz table (__clz_tab) in lib/clz_tab.c is also provided as part of
    libgcc.a, and many architectures link against libgcc. To allow the
    linker to avoid a multiple-definition link failure, clz_tab.o has to be
    in lib/lib.a rather than lib/builtin.o. The specific issue is that
    libgcc.a comes before lib/builtin.o on vmlinux.o's link command line, so
    its _clz.o is pulled to satisfy __clz_tab, and then when the remainder
    of lib/builtin.o is pulled in to satisfy all the other dependencies, the
    __clz_tab symbols conflict. By putting clz_tab.o in lib.a, the linker
    can simply avoid pulling it into vmlinux.o when this situation arises.

    The definitions of __clz_tab are the same in libgcc.a and in the kernel;
    arguably we could also simply rename the kernel version, but it's
    unlikely the libgcc version will ever change to become incompatible, so
    just using it seems reasonably safe.

    Signed-off-by: Chris Metcalf
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     
  • The test suite currently doesn't cover many corner cases when
    hex_dump_to_buffer() runs into overflow. Refactor and amend the test suite
    to cover most of the cases.

    This patch (of 9):

    Just to follow the scheme that most of the test modules are using.

    There is no functional change.

    Signed-off-by: Andy Shevchenko
    Acked-by: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     

12 Dec, 2015

1 commit


02 Dec, 2015

1 commit

  • This module allows errors to be injected into some of netdevice's notifier
    events. All network drivers use these notifiers to signal various events
    and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
    afterwards. Until recently I had to run failure tests by injecting
    a custom module, but now this infrastructure makes it trivial to test
    these failure paths. Some of the recent bugs I fixed were found using
    this module.
    Here's an example:
    $ cd /sys/kernel/debug/notifier-error-inject/netdev
    $ echo -22 > actions/NETDEV_CHANGEMTU/error
    $ ip link set eth0 mtu 1024
    RTNETLINK answers: Invalid argument

    CC: Akinobu Mita
    CC: "David S. Miller"
    CC: netdev
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

07 Nov, 2015

1 commit

  • This adds a simple module for testing the kernel's printf facilities.
    Previously, some %p extensions have caused a wrong return value in case
    the entire output didn't fit and/or been unusable in kasprintf(). This
    should help catch such issues. Also, it should help ensure that changes
    to the formatting algorithms don't break anything.

    I'm not sure if we have a struct dentry or struct file lying around at
    boot time or if we can fake one, but most %p extensions should be
    testable, as should the ordinary number and string formatting.

    The nature of vararg functions means we can't use a more conventional
    table-driven approach.

    For now, this is mostly a skeleton; contributions are very
    welcome. Some tests are/will be slightly annoying to write, since the
    expected output depends on stuff like CONFIG_*, sizeof(long), runtime
    values etc.

    Signed-off-by: Rasmus Villemoes
    Reviewed-by: Kees Cook
    Cc: Andy Shevchenko
    Cc: Martin Kletzander
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
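
    The failure mode named in the commit above (a wrong return value when
    the output does not fit) suggests the basic shape of such a test: format
    into a deliberately small buffer and check both the return value and the
    truncated contents. The sketch below does this against the standard C
    vsnprintf() in user space; check_truncated() is a hypothetical helper,
    not the kernel test module's code. Because the arguments are variadic,
    each case is a separate call rather than a row in a table.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Format into a buffer of bufsize bytes and return 1 if
 *  - the return value equals the length of the full expected output, and
 *  - the buffer holds a NUL-terminated prefix of the expected output.
 * Returns 0 otherwise.
 */
int check_truncated(size_t bufsize, const char *expect, const char *fmt, ...)
{
    char buf[64];
    size_t len = strlen(expect);
    size_t written;
    va_list ap;
    int ret;

    if (bufsize == 0 || bufsize > sizeof(buf))
        return 0;

    va_start(ap, fmt);
    ret = vsnprintf(buf, bufsize, fmt, ap);
    va_end(ap);

    /* The return value must describe the untruncated output. */
    if (ret < 0 || (size_t)ret != len)
        return 0;

    /* Only bufsize - 1 characters fit, followed by the terminator. */
    written = len < bufsize - 1 ? len : bufsize - 1;
    if (buf[written] != '\0')
        return 0;

    return memcmp(buf, expect, written) == 0;
}
```

    For example, check_truncated(3, "x=42", "x=%d", 42) passes only if the
    formatter reports a length of 4 while writing just "x=" and a
    terminator.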
     

08 Oct, 2015

1 commit

  • There's no good reason why users outside of networking should not
    be using this facility, e.g. for initializing their seeds.

    Therefore, make it accessible from there as get_random_once().

    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
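
    The "fill this buffer exactly once" semantics can be sketched as
    follows. This is a single-threaded illustration with hypothetical names
    (fill_once(), counting_fill()); the kernel facility also has to cope
    with concurrent callers, which this sketch ignores, and the fill
    callback here is a stand-in rather than a real random source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct once_state {
    bool done;
};

typedef void (*fill_fn)(void *buf, size_t len);

/*
 * Run fill() on buf only the first time this state is used.
 * Returns true if the buffer was filled by this call.
 */
bool fill_once(struct once_state *st, void *buf, size_t len, fill_fn fill)
{
    if (st->done)
        return false;

    fill(buf, len);
    st->done = true;
    return true;
}

/* Demo filler: writes a fixed pattern and counts how often it ran. */
unsigned int fill_calls;

void counting_fill(void *buf, size_t len)
{
    memset(buf, 0xA5, len);
    fill_calls++;
}
```

    A caller can invoke fill_once() on every use of the seed; only the first
    call pays for initialization and later calls leave the seed untouched.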
     

09 Sep, 2015

1 commit

  • Pull NMI backtrace update from Russell King:
    "These changes convert the x86 NMI handling to be a library
    implementation which other architectures can make use of. Thomas
    Gleixner has reviewed and tested these changes, and wishes me to send
    these rather than taking them through the tip tree.

    The final patch in the set adds an initial implementation using this
    infrastructure to ARM, even though it doesn't send the IPI at "NMI"
    level. Patches are in progress to add the ARM equivalent of NMI, but
    we still need the IRQ-level fallback for systems where the "NMI" isn't
    available due to secure firmware denying access to it"

    * 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
    ARM: add basic support for on-demand backtrace of other CPUs
    nmi: x86: convert to generic nmi handler
    nmi: create generic NMI backtrace implementation

    Linus Torvalds
     

04 Sep, 2015

1 commit

  • Pull locking and atomic updates from Ingo Molnar:
    "Main changes in this cycle are:

    - Extend atomic primitives with coherent logic op primitives
    (atomic_{or,and,xor}()) and deprecate the old partial APIs
    (atomic_{set,clear}_mask())

    The old ops were incoherent with incompatible signatures across
    architectures and with incomplete support. Now every architecture
    supports the primitives consistently (by Peter Zijlstra)

    - Generic support for 'relaxed atomics':

    - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
    - atomic_read_acquire()
    - atomic_set_release()

    This came out of porting qrwlock code to arm64 (by Will Deacon)

    - Clean up the fragile static_key APIs that were causing repeat bugs,
    by introducing a new one:

    DEFINE_STATIC_KEY_TRUE(name);
    DEFINE_STATIC_KEY_FALSE(name);

    which define a key of different types with an initial true/false
    value.

    Then allow:

    static_branch_likely()
    static_branch_unlikely()

    to take a key of either type and emit the right instruction for the
    case. To be able to know the 'type' of the static key we encode it
    in the jump entry (by Peter Zijlstra)

    - Static key self-tests (by Jason Baron)

    - qrwlock optimizations (by Waiman Long)

    - small futex enhancements (by Davidlohr Bueso)

    - ... and misc other changes"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
    jump_label/x86: Work around asm build bug on older/backported GCCs
    locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
    locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
    locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
    locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
    locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
    locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
    locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
    locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
    locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
    locking/static_keys: Make verify_keys() static
    jump label, locking/static_keys: Update docs
    locking/static_keys: Provide a selftest
    jump_label: Provide a self-test
    s390/uaccess, locking/static_keys: employ static_branch_likely()
    x86, tsc, locking/static_keys: Employ static_branch_likely()
    locking/static_keys: Add selftest
    locking/static_keys: Add a new static_key interface
    locking/static_keys: Rework update logic
    locking/static_keys: Add static_key_{en,dis}able() helpers
    ...

    Linus Torvalds
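
    The two atomic themes in the pull message above (logic ops and
    acquire/release orderings) have close analogues in C11 <stdatomic.h>,
    which makes them easy to try out in user space. The sketch below uses
    only standard C11 calls; the wrapper names are hypothetical and this is
    not the kernel API (the C11 fetch operations, for instance, return the
    previous value).

```c
#include <stdatomic.h>

/* Set the bits in mask; returns the previous flags value. */
unsigned int flags_set(atomic_uint *flags, unsigned int mask)
{
    return atomic_fetch_or(flags, mask);
}

/* Clear the bits in mask; returns the previous flags value. */
unsigned int flags_clear(atomic_uint *flags, unsigned int mask)
{
    return atomic_fetch_and(flags, ~mask);
}

/* Toggle the bits in mask; returns the previous flags value. */
unsigned int flags_toggle(atomic_uint *flags, unsigned int mask)
{
    return atomic_fetch_xor(flags, mask);
}

/* Writer: store the payload, then publish it with release ordering. */
void publish(atomic_int *ready, int *data, int value)
{
    *data = value;
    atomic_store_explicit(ready, 1, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * reader that sees ready == 1 also sees the payload written before it.
 * Returns 1 and fills *out when data is available, 0 otherwise.
 */
int consume(atomic_int *ready, const int *data, int *out)
{
    if (atomic_load_explicit(ready, memory_order_acquire)) {
        *out = *data;
        return 1;
    }
    return 0;
}
```

    The relaxed flavours drop the ordering guarantee altogether, which is
    enough for counters and statistics that do not publish other data.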
     

03 Sep, 2015

1 commit

  • Pull networking updates from David Miller:
    "Another merge window, another set of networking changes. I've heard
    rumblings that the lightweight tunnels infrastructure has been voted
    networking change of the year. But what do I know?

    1) Add conntrack support to openvswitch, from Joe Stringer.

    2) Initial support for VRF (Virtual Routing and Forwarding), which
    allows the segmentation of routing paths without using multiple
    devices. There are some semantic kinks to work out still, but
    this is a reasonably strong foundation. From David Ahern.

    3) Remove spinlock from act_bpf fast path, from Alexei Starovoitov.

    4) Ignore route nexthops with a link down state in ipv6, just like
    ipv4. From Andy Gospodarek.

    5) Remove spinlock from fast path of act_gact and act_mirred, from
    Eric Dumazet.

    6) Document the DSA layer, from Florian Fainelli.

    7) Add netconsole support to bcmgenet, systemport, and DSA. Also
    from Florian Fainelli.

    8) Add Mellanox Switch Driver and core infrastructure, from Jiri
    Pirko.

    9) Add support for "light weight tunnels", which allow for
    encapsulation and decapsulation without bearing the overhead of a
    full blown netdevice. From Thomas Graf, Jiri Benc, and a cast of
    others.

    10) Add Identifier Locator Addressing support for ipv6, from Tom
    Herbert.

    11) Support fragmented SKBs in iwlwifi, from Johannes Berg.

    12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.

    13) Add BQL support to 3c59x driver, from Loganaden Velvindron.

    14) Stop using a zero TX queue length to mean that a device shouldn't
    have a qdisc attached, use an explicit flag instead. From Phil
    Sutter.

    15) Use generic geneve netdevice infrastructure in openvswitch, from
    Pravin B Shelar.

    16) Add infrastructure to avoid re-forwarding a packet in software
    that was already forwarded by a hardware switch. From Scott
    Feldman.

    17) Allow AF_PACKET fanout function to be implemented in a bpf
    program, from Willem de Bruijn"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
    netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
    netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
    net: fec: clear receive interrupts before processing a packet
    ipv6: fix exthdrs offload registration in out_rt path
    xen-netback: add support for multicast control
    bgmac: Update fixed_phy_register()
    sock, diag: fix panic in sock_diag_put_filterinfo
    flow_dissector: Use 'const' where possible.
    flow_dissector: Fix function argument ordering dependency
    ixgbe: Resolve "initialized field overwritten" warnings
    ixgbe: Remove bimodal SR-IOV disabling
    ixgbe: Add support for reporting 2.5G link speed
    ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
    ixgbe: support for ethtool set_rxfh
    ixgbe: Avoid needless PHY access on copper phys
    ixgbe: cleanup to use cached mask value
    ixgbe: Remove second instance of lan_id variable
    ixgbe: use kzalloc for allocating one thing
    flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
    ixgbe: Remove unused PCI bus types
    ...

    Linus Torvalds
     

27 Aug, 2015

1 commit


25 Aug, 2015

1 commit

  • Sometimes a scatter-gather list has to be split into several chunks, or sub
    scatter lists. This happens for example if a scatter list will be
    handled by multiple DMA channels, each one filling a part of it.

    A concrete example comes with the media V4L2 API, where the scatter list
    is allocated from userspace to hold an image, regardless of the
    knowledge of how many DMAs will fill it :
    - in a simple RGB565 case, one DMA will pump data from the camera ISP
    to memory
    - in the trickier YUV422 case, 3 DMAs will pump data from the camera
    ISP pipes, one for pipe Y, one for pipe U and one for pipe V

    For these cases, it is necessary to split the original scatter list into
    multiple scatter lists, which is the purpose of this patch.

    The guarantees that are required for this patch are :
    - the intersection of spans of any couple of resulting scatter lists is
    empty.
    - the union of spans of all resulting scatter lists is a subrange of
    the span of the original scatter list.
    - streaming DMA API operations (mapping, unmapping) should not happen
    on both the resulting and the original scatter lists. Use either
    the original list or the resulting ones.
    - the caller is responsible for calling kfree() on the resulting
    scatterlists.

    Signed-off-by: Robert Jarzmik
    Signed-off-by: Jens Axboe

    Robert Jarzmik
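
    The splitting described in the commit above can be sketched on a
    simplified list of (address, length) segments. The function below
    extracts one byte range as a new segment list; calling it with
    consecutive offsets yields sub-lists whose spans do not intersect. The
    struct seg type and seg_sublist() are hypothetical and much simpler
    than the kernel's scatterlist API; in particular, the caller supplies
    the output array instead of receiving allocated memory.

```c
#include <stddef.h>

struct seg {
    unsigned long addr;   /* start address of the segment */
    size_t len;           /* length in bytes */
};

/*
 * Copy the byte range [skip, skip + size) of the list `in` into `out`.
 * Returns the number of segments written, or -1 if the range extends
 * past the end of the input list or `out` is too small.
 */
int seg_sublist(const struct seg *in, size_t n_in, size_t skip, size_t size,
                struct seg *out, size_t max_out)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n_in && size > 0; i++) {
        size_t take;

        /* Segment lies entirely before the requested range. */
        if (skip >= in[i].len) {
            skip -= in[i].len;
            continue;
        }

        /* Take what is left of this segment, capped at the bytes still needed. */
        take = in[i].len - skip;
        if (take > size)
            take = size;

        if (n_out == max_out)
            return -1;

        out[n_out].addr = in[i].addr + skip;
        out[n_out].len = take;
        n_out++;

        size -= take;
        skip = 0;         /* later segments are used from their start */
    }

    return size == 0 ? (int)n_out : -1;
}
```

    For the YUV422 case, three calls with (skip, size) set to consecutive
    plane offsets and sizes would produce one sub-list per DMA channel.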
     

03 Aug, 2015

2 commits

  • The 'jump label' self-test is in reality testing static keys - rename things
    accordingly.

    Also prettify the code in various places while at it.

    Acked-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Shuah Khan
    Cc: Thomas Gleixner
    Cc: benh@kernel.crashing.org
    Cc: bp@alien8.de
    Cc: davem@davemloft.net
    Cc: ddaney@caviumnetworks.com
    Cc: heiko.carstens@de.ibm.com
    Cc: linux-kernel@vger.kernel.org
    Cc: liuj97@gmail.com
    Cc: luto@amacapital.net
    Cc: michael@ellerman.id.au
    Cc: rabin@rab.in
    Cc: ralf@linux-mips.org
    Cc: rostedt@goodmis.org
    Cc: vbabka@suse.cz
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/0c091ecebd78a879ed8a71835d205a691a75ab4e.1438227999.git.jbaron@akamai.com
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Signed-off-by: Jason Baron
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: benh@kernel.crashing.org
    Cc: bp@alien8.de
    Cc: davem@davemloft.net
    Cc: ddaney@caviumnetworks.com
    Cc: heiko.carstens@de.ibm.com
    Cc: linux-kernel@vger.kernel.org
    Cc: liuj97@gmail.com
    Cc: luto@amacapital.net
    Cc: michael@ellerman.id.au
    Cc: rabin@rab.in
    Cc: ralf@linux-mips.org
    Cc: rostedt@goodmis.org
    Cc: shuahkh@osg.samsung.com
    Cc: vbabka@suse.cz
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/0c091ecebd78a879ed8a71835d205a691a75ab4e.1438227999.git.jbaron@akamai.com
    Signed-off-by: Ingo Molnar

    Jason Baron
     

17 Jul, 2015

1 commit

  • x86's NMI backtrace implementation (for arch_trigger_all_cpu_backtrace())
    is fairly generic in nature - the only architecture specific bits are
    the act of raising the NMI to other CPUs, and reporting the status of
    the NMI handler.

    These are fairly simple to factor out, and produce a generic
    implementation which can be shared between ARM and x86.

    Reviewed-by: Thomas Gleixner
    Signed-off-by: Russell King

    Russell King
     

03 Jul, 2015

1 commit

  • Pull kbuild updates from Michal Marek:
    "Just a few kbuild core commits this time:

    - kallsyms fix for CONFIG_XIP_KERNEL

    - bashisms in scripts/link-vmlinux.sh fixed

    - workaround to make DEBUG_INFO_REDUCED more useful yet still space
    efficient

    - clang is not wrongly detected when cross-compiling"

    * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    kbuild: include core debug info when DEBUG_INFO_REDUCED
    scripts: link-vmlinux: Don't pass page offset to kallsyms if XIP Kernel
    scripts: fix link-vmlinux.sh bash-ism
    Makefile: Fix detection of clang when cross-compiling

    Linus Torvalds
     

19 Jun, 2015

1 commit


11 Jun, 2015

1 commit

  • With CONFIG_DEBUG_INFO_REDUCED, we do get quite a lot of debug info
    (around 22.7 MB for a defconfig+DEBUG_INFO_REDUCED). However, the
    "basenames must match" rule used by -femit-struct-debug-baseonly
    option means that we miss some core data structures, such as struct
    {device, file, inode, mm_struct, page} etc.

    We can easily get these included as well, while still getting the
    benefits of CONFIG_DEBUG_INFO_REDUCED (faster build times and smaller
    individual object files): All it takes is a dummy translation unit
    including a few strategic headers and compiled with a flag overriding
    -femit-struct-debug-baseonly.

    This increases the size of .debug_info by ~0.3%, but these 90 KB
    contain some rather useful info.

    Signed-off-by: Rasmus Villemoes
    Signed-off-by: Michal Marek

    Rasmus Villemoes
     

11 May, 2015

1 commit

  • Add 842-format software compression and decompression functions.
    Update the MAINTAINERS 842 section to include the new files.

    The 842 compression function can compress any input data into the 842
    compression format. The 842 decompression function can decompress any
    standard-format 842 compressed data - specifically, either a compressed
    data buffer created by the 842 software compression function, or a
    compressed data buffer created by the 842 hardware compressor (located
    in PowerPC coprocessors).

    The 842 compressed data format is explained in the header comments.

    This is used in a later patch to provide a full software 842 compression
    and decompression crypto interface.

    Signed-off-by: Dan Streetman
    Signed-off-by: Herbert Xu

    Dan Streetman
     

18 Apr, 2015

1 commit

  • Pull sparc updates from David Miller:
    "The PowerPC folks have a really nice scalable IOMMU pool allocator
    that we wanted to make use of for sparc. So here we have a series
    that abstracts out their code into a common layer that anyone can make
    use of.

    Sparc is converted, and the PowerPC folks have reviewed and ACK'd this
    series and plan to convert PowerPC over as well"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    iommu-common: Fix PARISC compile-time warnings
    sparc: Make LDC use common iommu poll management functions
    sparc: Make sparc64 use scalable lib/iommu-common.c functions
    sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

    Linus Torvalds
     

17 Apr, 2015

3 commits

  • This file contains the implementation of all find_*_bit{,_le} functions,
    so giving it a more generic name looks reasonable.

    Signed-off-by: Yury Norov
    Reviewed-by: Rasmus Villemoes
    Reviewed-by: George Spelvin
    Cc: Alexey Klimov
    Cc: David S. Miller
    Cc: Daniel Borkmann
    Cc: Hannes Frederic Sowa
    Cc: Lai Jiangshan
    Cc: Mark Salter
    Cc: AKASHI Takahiro
    Cc: Thomas Graf
    Cc: Valentin Rothberg
    Cc: Chris Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • Currently the whole 'find_*_bit' family is located in lib/find_next_bit.c,
    except 'find_last_bit', which is in lib/find_last_bit.c. There seems to be
    no major benefit in keeping it separate.

    Signed-off-by: Yury Norov
    Reviewed-by: Rasmus Villemoes
    Reviewed-by: George Spelvin
    Cc: Alexey Klimov
    Cc: David S. Miller
    Cc: Daniel Borkmann
    Cc: Hannes Frederic Sowa
    Cc: Lai Jiangshan
    Cc: Mark Salter
    Cc: AKASHI Takahiro
    Cc: Thomas Graf
    Cc: Valentin Rothberg
    Cc: Chris Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
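
    For readers unfamiliar with the find_*_bit family discussed in the two
    commits above, the sketch below shows what a forward and a backward
    search over a bitmap of unsigned long words look like. It scans bit by
    bit for clarity rather than word by word, the _sketch names are
    hypothetical, and returning nbits for "not found" is this sketch's
    convention.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if the given bit is set in the bitmap, 0 otherwise. */
static int test_bit_sketch(const unsigned long *map, size_t bit)
{
    return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/* Index of the first set bit at or after start, or nbits if there is none. */
size_t find_next_bit_sketch(const unsigned long *map, size_t nbits, size_t start)
{
    for (size_t bit = start; bit < nbits; bit++)
        if (test_bit_sketch(map, bit))
            return bit;
    return nbits;
}

/* Index of the highest set bit below nbits, or nbits if there is none. */
size_t find_last_bit_sketch(const unsigned long *map, size_t nbits)
{
    for (size_t bit = nbits; bit-- > 0; )
        if (test_bit_sketch(map, bit))
            return bit;
    return nbits;
}
```

    Both searches share the same word and bit arithmetic, which is one
    reason keeping them in a single file is natural.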
     
  • Investigation of multithreaded iperf experiments on an ethernet
    interface shows the iommu->lock as the hottest lock identified by
    lockstat, with something of the order of 21M contentions out of
    27M acquisitions, and an average wait time of 26 us for the lock.
    This is not efficient. A more scalable design is to follow the ppc
    model, where the iommu_table has multiple pools, each stretching
    over a segment of the map, and with a separate lock for each pool.
    This model allows for better parallelization of the iommu map search.

    This patch adds the iommu range alloc/free function infrastructure.

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
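
    The pooled design described in the commit above (several pools, each
    covering a segment of the map) can be sketched as a range allocator in
    which a caller's CPU number selects the pool to search. This is a
    single-threaded illustration with hypothetical names and sizes; the
    per-pool lock is only indicated by a comment, and unlike a production
    allocator the sketch does not fall back to other pools when one is
    full.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAP_ENTRIES 64            /* hypothetical map size */
#define NR_POOLS    4             /* hypothetical number of pools */
#define NO_ENTRY    ((size_t)-1)

struct pool {
    size_t start, end;            /* this pool covers entries [start, end) */
    size_t hint;                  /* where the next search begins */
    /* a real implementation would keep one lock per pool here */
};

struct pooled_map {
    bool used[MAP_ENTRIES];
    struct pool pools[NR_POOLS];
};

void map_init(struct pooled_map *m)
{
    size_t per_pool = MAP_ENTRIES / NR_POOLS;

    for (size_t i = 0; i < MAP_ENTRIES; i++)
        m->used[i] = false;
    for (size_t p = 0; p < NR_POOLS; p++) {
        m->pools[p].start = p * per_pool;
        m->pools[p].end = m->pools[p].start + per_pool;
        m->pools[p].hint = m->pools[p].start;
    }
}

/* First index of a run of n free entries within [from, to), or NO_ENTRY. */
static size_t search_range(const struct pooled_map *m, size_t from, size_t to, size_t n)
{
    size_t run = 0;

    for (size_t i = from; i < to; i++) {
        run = m->used[i] ? 0 : run + 1;
        if (run == n)
            return i + 1 - n;
    }
    return NO_ENTRY;
}

/* Allocate n contiguous entries from the pool selected by the caller's CPU. */
size_t map_alloc(struct pooled_map *m, unsigned int cpu, size_t n)
{
    struct pool *p = &m->pools[cpu % NR_POOLS];
    size_t first;

    if (n == 0)
        return NO_ENTRY;

    first = search_range(m, p->hint, p->end, n);
    if (first == NO_ENTRY)        /* retry from the start of the pool */
        first = search_range(m, p->start, p->end, n);
    if (first == NO_ENTRY)
        return NO_ENTRY;

    for (size_t i = 0; i < n; i++)
        m->used[first + i] = true;
    p->hint = first + n;
    return first;
}

void map_free(struct pooled_map *m, size_t first, size_t n)
{
    for (size_t i = 0; i < n; i++)
        m->used[first + i] = false;
}
```

    Callers on different CPUs usually land in different pools, so with one
    lock per pool they would rarely contend with each other.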
     

18 Feb, 2015

1 commit


14 Feb, 2015

1 commit

  • This is a test module doing various nasty things like out-of-bounds
    accesses and use-after-free. It is useful for testing kernel debugging
    features like the kernel address sanitizer.

    It mostly concentrates on testing the slab allocator, but we might want
    to add other kinds of tests here in the future (such as out-of-bounds
    accesses to stack/global variables and so on).

    Signed-off-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrey Konovalov
    Cc: Yuri Gribov
    Cc: Konstantin Khlebnikov
    Cc: Sasha Levin
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     

13 Feb, 2015

1 commit

  • Test different scenarios of function calls located in lib/hexdump.c.

    Currently only hex_dump_to_buffer() is tested, and test data is provided
    only for little-endian CPUs.

    Signed-off-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     

12 Feb, 2015

1 commit

  • Pull s390 updates from Martin Schwidefsky:

    - The remaining patches for the z13 machine support: kernel build
    option for z13, the cache synonym avoidance, SMT support,
    compare-and-delay for spinloops and the CEX5S crypto adapter.

    - The ftrace support for function tracing with the gcc hotpatch option.
    This touches common code Makefiles, Steven is ok with the changes.

    - The hypfs file system gets an extension to access diagnose 0x0c data
    in user space for performance analysis for Linux running under z/VM.

    - The iucv hvc console gets wildcard support for the user id filtering.

    - The cacheinfo code is converted to use the generic infrastructure.

    - Cleanup and bug fixes.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
    s390/process: free vx save area when releasing tasks
    s390/hypfs: Eliminate hypfs interval
    s390/hypfs: Add diagnose 0c support
    s390/cacheinfo: don't use smp_processor_id() in preemptible context
    s390/zcrypt: fixed domain scanning problem (again)
    s390/smp: increase maximum value of NR_CPUS to 512
    s390/jump label: use different nop instruction
    s390/jump label: add sanity checks
    s390/mm: correct missing space when reporting user process faults
    s390/dasd: cleanup profiling
    s390/dasd: add locking for global_profile access
    s390/ftrace: hotpatch support for function tracing
    ftrace: let notrace function attribute disable hotpatching if necessary
    ftrace: allow architectures to specify ftrace compile options
    s390: reintroduce diag 44 calls for cpu_relax()
    s390/zcrypt: Add support for new crypto express (CEX5S) adapter.
    s390/zcrypt: Number of supported ap domains is not retrievable.
    s390/spinlock: add compare-and-delay to lock wait loops
    s390/tape: remove redundant if statement
    s390/hvc_iucv: add simple wildcard matches to the iucv allow filter
    ...

    Linus Torvalds
     

05 Feb, 2015

1 commit


04 Feb, 2015

1 commit


31 Jan, 2015

1 commit


29 Jan, 2015

1 commit

  • If the kernel is compiled with function tracer support the -pg compile option
    is passed to gcc to generate extra code into the prologue of each function.

    This patch replaces the "open-coded" -pg compile flag with a CC_FLAGS_FTRACE
    makefile variable which architectures can override if a different option
    should be used for code generation.

    Acked-by: Steven Rostedt
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

12 Dec, 2014

1 commit

  • Pull networking updates from David Miller:

    1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

    2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers. Thanks to Al Viro
    and Herbert Xu.

    3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

    4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

    5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

    6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

    7) Add a variant of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

    8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

    9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets. From Alexei
    Starovoitov.

    10) Support TSO/LSO in sunvnet driver, from David L Stevens.

    11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

    12) Remote checksum offload, from Tom Herbert.

    13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

    14) Add MPLS support to openvswitch, from Simon Horman.

    15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

    16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet. This tries to resolve the conflicting goals between the
    desired handling of bulk vs. RPC-like traffic.

    17) Allow userspace to ask for the CPU on which a packet was
    received/steered, via SO_INCOMING_CPU. From Eric Dumazet.

    18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

    19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

    20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

    21) Add VLAN packet scheduler action, from Jiri Pirko.

    22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
    Fix race condition between vxlan_sock_add and vxlan_sock_release
    net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
    net/mlx4: Add support for A0 steering
    net/mlx4: Refactor QUERY_PORT
    net/mlx4_core: Add explicit error message when rule doesn't meet configuration
    net/mlx4: Add A0 hybrid steering
    net/mlx4: Add mlx4_bitmap zone allocator
    net/mlx4: Add a check if there are too many reserved QPs
    net/mlx4: Change QP allocation scheme
    net/mlx4_core: Use tasklet for user-space CQ completion events
    net/mlx4_core: Mask out host side virtualization features for guests
    net/mlx4_en: Set csum level for encapsulated packets
    be2net: Export tunnel offloads only when a VxLAN tunnel is created
    gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
    cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
    net: fec: only enable mdio interrupt before phy device link up
    net: fec: clear all interrupt events to support i.MX6SX
    net: fec: reset fep link status in suspend function
    net: sock: fix access via invalid file descriptor
    net: introduce helper macro for_each_cmsghdr
    ...

    Linus Torvalds
     

11 Dec, 2014

2 commits

  • Pull nmi-safe seq_buf printk update from Steven Rostedt:
    "This code is a fork from the trace-3.19 pull as it needed the
    trace_seq clean ups from that branch.

    This code solves the issue of performing stack dumps from NMI context.
    The issue is that printk() is not safe from NMI context as if the NMI
    were to trigger when a printk() was being performed, the NMI could
    deadlock from the printk() internal locks. This has been seen in
    practice.

    With lots of review from Petr Mladek, this code went through several
    iterations, and we feel that it is now at a point of quality to be
    accepted into mainline.

    Here's what is contained in this patch set:

    - Creates a "seq_buf" generic buffer utility that allows a descriptor
    to be passed around where functions can write their own "printk()"
    formatted strings into it. The generic version was pulled out of
    the trace_seq() code that was made specifically for tracing.

    - The seq_buf code was changed to model the seq_file code. I have a
    patch (not included for 3.19) that converts the seq_file.c code
    over to use seq_buf.c like the trace_seq.c code does. This was
    done to make sure that seq_buf.c is compatible with seq_file.c. I
    may try to get that patch in for 3.20.

    - The seq_buf.c file was moved to lib/ to remove it from being
    dependent on CONFIG_TRACING.

    - The printk() was updated to allow for a per_cpu "override" of the
    internal calls. That is, instead of writing to the console, a call
    to printk() may do something else. This made it easier to allow
    the NMI to change what printk() does in order to call dump_stack()
    without needing to update that code as well.

    - Finally, the dump_stack from all CPUs via NMI code was converted to
    use the seq_buf code. The caller to trigger the NMI code would
    wait till all the NMIs finished, and then it would print the
    seq_buf data to the console safely from a non-NMI context.

    One added bonus is that this code also makes the NMI dump stack work
    on PREEMPT_RT kernels. As printk() includes sleeping locks on
    PREEMPT_RT, printk() only writes to console if the console does not
    use any rt_mutex converted spin locks, which a lot do"

    * tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    x86/nmi: Fix use of unallocated cpumask_var_t
    printk/percpu: Define printk_func when printk is not defined
    x86/nmi: Perform a safe NMI stack trace on all CPUs
    printk: Add per_cpu printk func to allow printk to be diverted
    seq_buf: Move the seq_buf code to lib/
    seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
    tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
    tracing: Have seq_buf use full buffer
    seq_buf: Add seq_buf_can_fit() helper function
    tracing: Add paranoid size check in trace_printk_seq()
    tracing: Use trace_seq_used() and seq_buf_used() instead of len
    tracing: Clean up tracing_fill_pipe_page()
    seq_buf: Create seq_buf_used() to find out how much was written
    tracing: Add a seq_buf_clear() helper and clear len and readpos in init
    tracing: Convert seq_buf fields to be like seq_file fields
    tracing: Convert seq_buf_path() to be like seq_path()
    tracing: Create seq_buf layer in trace_seq

    Linus Torvalds
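
    The "descriptor that functions write printk()-formatted strings into"
    idea from the pull message can be sketched in userspace as follows.
    This is only an illustration, not the kernel's lib/seq_buf.c: the
    struct and function names (sbuf, sbuf_init, sbuf_printf) are
    hypothetical, and the explicit overflowed flag is a simplification
    chosen for this example rather than the kernel's own bookkeeping.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a seq_buf-style descriptor. */
struct sbuf {
	char   *buffer;      /* caller-provided storage */
	size_t  size;        /* capacity of buffer in bytes */
	size_t  len;         /* bytes written so far, excluding the NUL */
	bool    overflowed;  /* set once a write did not fit */
};

void sbuf_init(struct sbuf *s, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	s->buffer = buf;
	s->size = size;
	s->len = 0;
	s->overflowed = false;
	s->buffer[0] = '\0';
}

/*
 * Append a printf-style formatted string to the buffer.
 * Returns 0 on success. If the text (plus its terminating NUL) does not
 * fit, the buffer is left holding only the earlier complete writes, the
 * descriptor is marked as overflowed, and -1 is returned. Every later
 * call then fails as well.
 */
int sbuf_printf(struct sbuf *s, const char *fmt, ...)
{
	va_list ap;
	size_t room;
	int n;

	if (s->overflowed)
		return -1;

	room = s->size - s->len;
	va_start(ap, fmt);
	n = vsnprintf(s->buffer + s->len, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		s->buffer[s->len] = '\0';  /* drop the partial write */
		s->overflowed = true;
		return -1;
	}
	s->len += (size_t)n;
	return 0;
}
```

    A caller collecting output in a context where printing directly is
    unsafe would fill such a descriptor with several sbuf_printf() calls
    and hand the finished buffer to the real output routine later, which
    mirrors the deferred console write described in the last bullet above.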
     
  • As there are now no remaining users of arch_fast_hash(), let's kill
    it entirely.

    This basically reverts commit 71ae8aac3e19 ("lib: introduce arch
    optimized hash library") and follow-up work, for example commit
    237217546d44 ("lib: hash: follow-up fixups for arch hash"),
    commit e3fec2f74f7f ("lib: Add missing arch generic-y entries for
    asm-generic/hash.h") and last but not least commit 6a02652df511
    ("perf tools: Fix include for non x86 architectures").

    Cc: Francesco Fusco
    Cc: Thomas Graf
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann