19 Dec, 2016

1 commit

  • Pull libnvdimm updates from Dan Williams:
    "The libnvdimm pull request is relatively small this time around due to
    some development topics being deferred to 4.11.

    As for this pull request the bulk of it has been in -next for several
    releases leading to one late fix being added (commit 868f036fee4b
    ("libnvdimm: fix mishandled nvdimm_clear_poison() return value")). It
    has received a build success notification from the 0day-kbuild robot
    and passes the latest libnvdimm unit tests.

    Summary:

    - Dynamic label support: To date namespace label support has been
    limited to disambiguating cases where PMEM (direct load/store) and
    BLK (mmio aperture) accessed-capacity alias on the same DIMM. Since
    4.9 added support for multiple namespaces per PMEM-region there is
    value to support namespace labels even in the non-aliasing case.
    The presence of a valid namespace index block force-enables label
    support when the kernel would otherwise rely on region boundaries,
    and permits the region to be sub-divided.

    - Handle media errors in namespace metadata: Complement the error
    handling for media errors in namespace data areas with support for
    clearing errors on writes, and downgrading potential machine-check
    exceptions to simple i/o errors on read.

    - Device-DAX region attributes: Add 'align', 'id', and 'size' as
    attributes for device-dax regions. In particular this enables
    userspace tooling to generically size memory mapping and i/o
    operations. Prevent userspace from growing assumptions /
    dependencies about the parent device topology for a dax region. A
    libnvdimm namespace may not always be the parent device of a dax
    region.

    - Various cleanups and small fixes"

    * tag 'libnvdimm-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    dax: add region 'id', 'size', and 'align' attributes
    libnvdimm: fix mishandled nvdimm_clear_poison() return value
    libnvdimm: replace mutex_is_locked() warnings with lockdep_assert_held
    libnvdimm, pfn: fix align attribute
    libnvdimm, e820: use module_platform_driver
    libnvdimm, namespace: use octal for permissions
    libnvdimm, namespace: avoid multiple sector calculations
    libnvdimm: remove else after return in nsio_rw_bytes()
    libnvdimm, namespace: fix the type of name variable
    libnvdimm: use consistent naming for request_mem_region()
    nvdimm: use the right length of "pmem"
    libnvdimm: check and clear poison before writing to pmem
    tools/testing/nvdimm: dynamic label support
    libnvdimm: allow a platform to force enable label support
    libnvdimm: use generic iostat interfaces

    Linus Torvalds
     

18 Dec, 2016

2 commits

  • Pull networking fixes and cleanups from David Miller:

    1) Revert bogus nla_ok() change, from Alexey Dobriyan.

    2) Various bpf validator fixes from Daniel Borkmann.

    3) Add some necessary SET_NETDEV_DEV() calls to hisi_femac and hip04
    drivers, from Dongpo Li.

    4) Several ethtool ksettings conversions from Philippe Reynes.

    5) Fix bugs in inet port management wrt. soreuseport, from Tom Herbert.

    6) XDP support for virtio_net, from John Fastabend.

    7) Fix NAT handling within a vrf, from David Ahern.

    8) Endianness fixes in dpaa_eth driver, from Claudiu Manoil.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
    net: mv643xx_eth: fix build failure
    isdn: Constify some function parameters
    mlxsw: spectrum: Mark split ports as such
    cgroup: Fix CGROUP_BPF config
    qed: fix old-style function definition
    net: ipv6: check route protocol when deleting routes
    r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected
    irda: w83977af_ir: cleanup an indent issue
    net: sfc: use new api ethtool_{get|set}_link_ksettings
    net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings
    net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings
    net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings
    net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings
    bpf: fix mark_reg_unknown_value for spilled regs on map value marking
    bpf: fix overflow in prog accounting
    bpf: dynamically allocate digest scratch buffer
    gtp: Fix initialization of Flags octet in GTPv1 header
    gtp: gtp_check_src_ms_ipv4() always return success
    net/x25: use designated initializers
    isdn: use designated initializers
    ...

    Linus Torvalds
     
  • Dan Williams
     

17 Dec, 2016

3 commits

  • Running ./test_verifier as unprivileged causes 1 out of 98 tests to fail:

    [...]
    #71 unpriv: check that printk is disallowed FAIL
    Unexpected error message!
    0: (7a) *(u64 *)(r10 -8) = 0
    1: (bf) r1 = r10
    2: (07) r1 += -8
    3: (b7) r2 = 8
    4: (bf) r3 = r1
    5: (85) call bpf_trace_printk#6
    unknown func bpf_trace_printk#6
    [...]

    The test case is correct, just that the error outcome changed with
    ebb676daa1a3 ("bpf: Print function name in addition to function id").
    Same as with e00c7b216f34 ("bpf: fix multiple issues in selftest suite
    and samples") issue 2), so just fix up the function name.

    Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Commit 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
    registers") introduced a regression where existing programs stopped
    loading due to reaching the verifier's maximum complexity limit,
    whereas prior to this commit they were loading just fine; the affected
    program has roughly 2k instructions.

    What was found is that state pruning couldn't be performed effectively
    anymore due to mismatches of the verifier's register state, in particular
    in the id tracking. It doesn't mean that 57a09bf0a416 is incorrect per
    se, but rather that the verifier needs to perform a lot more work for the
    same program with regards to involved map lookups.

    Since commit 57a09bf0a416 is only about tracking registers with type
    PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
    until they are promoted through pattern matching with a NULL check to
    either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
    id becomes irrelevant for the transitioned types.

    For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
    but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
    transferred further into other types that don't make use of it. Among
    others, one example is where UNKNOWN_VALUE is set on function call
    return with RET_INTEGER return type.

    states_equal() will then fall through the memcmp() on register state;
    note that the second memcmp() uses offsetofend(), so the id is part of
    that since d2a4dd37f6b4 ("bpf: fix state equivalence"). But the bisect
    pointed already to 57a09bf0a416, where we really reach beyond the complexity
    limit. What I found was that states_equal() often failed in this
    case due to id mismatches in spilled regs with registers in type
    PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
    a memcmp() on their reg state and don't have any other optimizations
    in place, therefore also id was relevant in this case for making a
    pruning decision.

    We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
    For the affected program, it resulted in a ~17 fold reduction of
    complexity and let the program load fine again. Selftest suite also
    runs fine. The only other place where env->id_gen is used currently is
    through direct packet access, but for these cases id is long living, thus
    a different scenario.

    Also, the current logic in mark_map_regs() is not fully correct when
    marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
    reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
    its id is reset and any subsequent registers that hold the original id
    and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
    anymore, since mark_map_reg() reuses the uncached regs[regno].id that
    was just overridden. Note, we don't need to cache it outside of
    mark_map_regs(), since it's called once on this_branch and the other
    time on other_branch, which are both two independent verifier states.
    A test case for this is added here, too.

    Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
    Signed-off-by: Daniel Borkmann
    Acked-by: Thomas Graf
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
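
The id-caching problem described in the commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the struct, the register count, and the two mark_map_regs_*() variants are simplified stand-ins invented for illustration, not the verifier's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy register state: only the fields the commit message talks about. */
enum reg_type { UNKNOWN_VALUE, PTR_TO_MAP_VALUE, PTR_TO_MAP_VALUE_OR_NULL };

struct reg_state {
    enum reg_type type;
    unsigned int id;    /* nonzero only while the NULL check is pending */
};

#define NREGS 4         /* hypothetical, tiny register file */

/* Promote one register if it still carries the id being resolved. */
static void mark_map_reg(struct reg_state *regs, size_t regno,
                         unsigned int id, enum reg_type type)
{
    struct reg_state *reg = &regs[regno];

    if (reg->type == PTR_TO_MAP_VALUE_OR_NULL && reg->id == id) {
        reg->type = type;
        /* After promotion the id is irrelevant, so reset it for
         * every target type; a stale id would defeat pruning. */
        reg->id = 0;
    }
}

/* Buggy variant: re-reads regs[regno].id on every iteration, so the
 * id is lost as soon as regs[regno] itself has been rewritten. */
static void mark_map_regs_uncached(struct reg_state *regs, size_t regno,
                                   enum reg_type type)
{
    for (size_t i = 0; i < NREGS; i++)
        mark_map_reg(regs, i, regs[regno].id, type);
}

/* Fixed variant: cache the id before any register is rewritten. */
static void mark_map_regs_cached(struct reg_state *regs, size_t regno,
                                 enum reg_type type)
{
    unsigned int id = regs[regno].id;

    for (size_t i = 0; i < NREGS; i++)
        mark_map_reg(regs, i, id, type);
}
```

With two registers sharing id 1 and the NULL check performed on the first one, the uncached variant leaves the second register as PTR_TO_MAP_VALUE_OR_NULL, while the cached variant promotes both.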
     
  • Pull powerpc updates from Michael Ellerman:
    "Highlights include:

    - Support for the kexec_file_load() syscall, which is a prereq for
    secure and trusted boot.

    - Prevent kernel execution of userspace on P9 Radix (similar to
    SMEP/PXN).

    - Sort the exception tables at build time, to save time at boot, and
    store them as relative offsets to save space in the kernel image &
    memory.

    - Allow building the kernel with thin archives, which should allow us
    to build an allyesconfig once some other fixes land.

    - Build fixes to allow us to correctly rebuild when changing the
    kernel endian from big to little or vice versa.

    - Plumbing so that we can avoid doing a full mm TLB flush on P9
    Radix.

    - Initial stack protector support (-fstack-protector).

    - Support for dumping the radix (aka. Linux) and hash page tables via
    debugfs.

    - Fix an oops in cxl coredump generation when cxl_get_fd() is used.

    - Freescale updates from Scott: "Highlights include 8xx hugepage
    support, qbman fixes/cleanup, device tree updates, and some misc
    cleanup."

    - Many and varied fixes and minor enhancements as always.

    Thanks to:
    Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
    Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
    Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
    Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
    Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
    Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
    Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
    Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
    Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"

    [ And thanks to Michael, who took time off from a new baby to get this
    pull request done. - Linus ]

    * tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
    powerpc/fsl/dts: add FMan node for t1042d4rdb
    powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
    powerpc/fsl/dts: add QMan and BMan nodes on t1024
    powerpc/fsl/dts: add QMan and BMan nodes on t1023
    soc/fsl/qman: test: use DEFINE_SPINLOCK()
    powerpc/fsl-lbc: use DEFINE_SPINLOCK()
    powerpc/8xx: Implement support of hugepages
    powerpc: get hugetlbpage handling more generic
    powerpc: port 64 bits pgtable_cache to 32 bits
    powerpc/boot: Request no dynamic linker for boot wrapper
    soc/fsl/bman: Use resource_size instead of computation
    soc/fsl/qe: use builtin_platform_driver
    powerpc/fsl_pmc: use builtin_platform_driver
    powerpc/83xx/suspend: use builtin_platform_driver
    powerpc/ftrace: Fix the comments for ftrace_modify_code
    powerpc/perf: macros for power9 format encoding
    powerpc/perf: power9 raw event format encoding
    powerpc/perf: update attribute_group data structure
    powerpc/perf: factor out the event format field
    powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
    ...

    Linus Torvalds
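
The powerpc summary above mentions storing exception table entries as relative offsets to save space. A minimal userspace sketch of that general technique follows; the struct layout and the rel_set()/rel_get() helpers are hypothetical names chosen for illustration and are not taken from the powerpc code.

```c
#include <assert.h>
#include <stdint.h>

/* Each field holds a 32-bit offset from the field's own address to its
 * target, instead of a full pointer. On a 64-bit build this halves the
 * entry size, provided targets lie within +/-2 GiB of the table. */
struct rel_entry {
    int32_t insn;   /* offset to the instruction that may fault */
    int32_t fixup;  /* offset to the fixup code */
};

/* Stand-ins for kernel text and the table; both are static, so they
 * sit in the same image and the offsets fit in 32 bits. */
static char text[64];
static struct rel_entry table[2];

/* Encode: store the distance from the field to the target. */
static void rel_set(int32_t *field, const void *target)
{
    *field = (int32_t)((intptr_t)target - (intptr_t)field);
}

/* Decode: add the stored offset back onto the field's address. */
static const void *rel_get(const int32_t *field)
{
    return (const void *)((intptr_t)field + *field);
}
```

One consequence of this encoding is that an entry cannot simply be copied to a new address, since the offsets are relative to where the entry lives; code that sorts such a table has to adjust the offsets as entries move.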
     

16 Dec, 2016

3 commits

  • …x/kernel/git/shuah/linux-kselftest

    Pull kselftest updates from Shuah Khan:
    "This update consists of:

    - new tests to exercise the Sync Kernel Infrastructure. These tests
    are part of a battery of Android libsync tests and are re-written
    to test the new sync user-space interfaces from Emilio López, and
    Gustavo Padovan.

    - test to run hw-independent mock tests for i915.ko from Chris Wilson

    - a new gpio test case from Bamvor Jian Zhang

    - missing gitignore additions"

    * tag 'linux-kselftest-4.10-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
    selftest/gpio: add gpio test case
    selftest: sync: improve assert() failure message
    kselftests: Exercise hw-independent mock tests for i915.ko
    selftests: add missing gitignore files/dirs
    selftests: add missing set-tz to timers .gitignore
    selftest: sync: stress test for merges
    selftest: sync: stress consumer/producer test
    selftest: sync: stress test for parallelism
    selftest: sync: wait tests for sw_sync framework
    selftest: sync: merge tests for sw_sync framework
    selftest: sync: fence tests for sw_sync framework
    selftest: sync: basic tests for sw_sync framework

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "This release has a few updates:

    - STM can hook into the function tracer
    - Function filtering now supports more advanced glob matching
    - Ftrace selftests updates and added tests
    - Softirq tag in traces now shows only softirqs
    - ARM nop added to non traced locations at compile time
    - New trace_marker_raw file that allows for binary input
    - Optimizations to the ring buffer
    - Removal of kmap in trace_marker
    - Wakeup and irqsoff tracers now adhere to the set_graph_notrace file
    - Other various fixes and clean ups"

    * tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (42 commits)
    selftests: ftrace: Shift down default message verbosity
    kprobes/trace: Fix kprobe selftest for newer gcc
    tracing/kprobes: Add a helper method to return number of probe hits
    tracing/rb: Init the CPU mask on allocation
    tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results
    tracing/fgraph: Have wakeup and irqsoff tracers ignore graph functions too
    fgraph: Handle a case where a tracer ignores set_graph_notrace
    tracing: Replace kmap with copy_from_user() in trace_marker writing
    ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it
    tracing: Allow benchmark to be enabled at early_initcall()
    tracing: Have system enable return error if one of the events fail
    tracing: Do not start benchmark on boot up
    tracing: Have the reg function allow to fail
    ring-buffer: Force rb_end_commit() and rb_set_commit_to_write() inline
    ring-buffer: Froce rb_update_write_stamp() to be inlined
    ring-buffer: Force inline of hotpath helper functions
    tracing: Make __buffer_unlock_commit() always_inline
    tracing: Make tracepoint_printk a static_key
    ring-buffer: Always inline rb_event_data()
    ring-buffer: Make rb_reserve_next_event() always inlined
    ...

    Linus Torvalds
     
  • [ This resurrects commit 53855d10f456, which was reverted in
    2b41226b39b6. It depended on commit d544abd5ff7d ("lib/radix-tree:
    Convert to hotplug state machine") so now it is correct to apply ]

    Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test
    suite as it adds a call to cpuhp_setup_state_nocalls() which is not
    currently emulated in the test suite. Add it, and delete the emulation
    of the old CPU hotplug mechanism.

    Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

15 Dec, 2016

25 commits

  • This file was used to implement call_rcu() before liburcu implemented
    that function. It hasn't even been compiled since before the test suite
    was added to the kernel. Remove it to reduce confusion.

    Link: http://lkml.kernel.org/r/1481667692-14500-5-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Cc: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • We have a check that setting a tag on a single entry at root succeeds,
    but we were missing a check that clearing a tag on that same entry also
    succeeds.

    Link: http://lkml.kernel.org/r/1481667692-14500-4-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Cc: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • radix_tree_join() was freeing nodes with a non-zero ->exceptional count,
    and radix_tree_split() wasn't zeroing ->exceptional when it allocated
    the new node. Fix this by making all callers of radix_tree_node_alloc()
    pass in the new counts (and some other always-initialised fields), which
    will prevent the problem recurring if in future we decide to do
    something similar.

    Link: http://lkml.kernel.org/r/1481667692-14500-3-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Cc: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • The kmem_cache_alloc implementation simply allocates new memory from
    malloc() and calls the ctor, which zeroes out the entire object. This
    means it cannot spot bugs where the object isn't properly reinitialised
    before being freed.

    Add a small (11 objects) cache before freeing objects back to malloc.
    This is enough to let us write a test to catch it, although the memory
    allocator is now aware of the structure of the radix tree node, since it
    chains free objects through ->private_data (like the percpu cache does).

    Link: http://lkml.kernel.org/r/1481667692-14500-2-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Cc: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
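
The small free-object cache described above might look roughly like the following. This is a sketch, not the test suite's allocator: struct toy_node, struct toy_cache, and the function names are hypothetical, and skipping the constructor on recycled objects is an assumption made so that stale fields stay visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_MAX 11    /* the "small (11 objects)" cache from the text */

/* Stand-in for a radix tree node; only two fields matter here. */
struct toy_node {
    unsigned int count;
    void *private_data;     /* reused to chain free objects */
};

struct toy_cache {
    struct toy_node *free_list;
    int nr_cached;
    void (*ctor)(struct toy_node *);
};

static struct toy_node *toy_cache_alloc(struct toy_cache *c)
{
    struct toy_node *node = c->free_list;

    if (node) {
        /* Recycled object: the ctor is deliberately not re-run, so
         * fields a caller forgot to reset are still observable. */
        c->free_list = node->private_data;
        c->nr_cached--;
        node->private_data = NULL;
        return node;
    }
    node = malloc(sizeof(*node));
    if (node && c->ctor)
        c->ctor(node);
    return node;
}

static void toy_cache_free(struct toy_cache *c, struct toy_node *node)
{
    if (c->nr_cached >= CACHE_MAX) {
        free(node);         /* cache full: hand back to malloc */
        return;
    }
    node->private_data = c->free_list;
    c->free_list = node;
    c->nr_cached++;
}

/* Constructor that zeroes the whole object, as the text describes. */
static void toy_ctor(struct toy_node *node)
{
    memset(node, 0, sizeof(*node));
}
```

A test can then free an object with a non-zero count, allocate again, and check whether the stale count comes back, which a malloc-plus-ctor allocator could never show.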
     
  • IDR needs more functionality from the kernel: kmalloc()/kfree(), and
    xchg().

    Link: http://lkml.kernel.org/r/1480369871-5271-67-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • The random iteration test only inserts order-0 entries currently.
    Update it to insert entries of order between 7 and 0. Also make the
    maximum index configurable, make some variables static, make the test
    duration variable, remove some useless spinning, and add a fifth thread
    which calls tag_tagged_items().

    Link: http://lkml.kernel.org/r/1480369871-5271-62-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • When replacing an entry with NULL, we need to delete any sibling
    entries. Also account deleting exceptional entries properly. Also fix
    a bug with radix_tree_iter_replace() where we would fail to remove
    entirely freed nodes. Also fix accounting bug when switching between
    normal and exceptional entries with replace_slot. Also add testcases
    for all these bugs.

    Link: http://lkml.kernel.org/r/1480369871-5271-61-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • Calculate how many nodes we need to allocate to split an old_order entry
    into multiple entries, each of size new_order. The test suite checks
    that we allocated exactly the right number of nodes; neither too many
    (checked by rtp->nr == 0), nor too few (checked by comparing
    nr_allocated before and after the call to radix_tree_split()).

    Link: http://lkml.kernel.org/r/1480369871-5271-60-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This new function splits a larger multiorder entry into smaller entries
    (potentially multi-order entries). These entries are initialised to
    RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't
    confused. The caller should then call radix_tree_for_each_slot() and
    radix_tree_replace_slot() in order to turn these retry entries into the
    intended new entries. Tags are replicated from the original multiorder
    entry into each new entry.

    Link: http://lkml.kernel.org/r/1480369871-5271-59-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This new function allows for the replacement of many smaller entries in
    the radix tree with one larger multiorder entry. From the point of view
    of an RCU walker, they may see a mixture of the smaller entries and the
    large entry during the same walk, but they will never see NULL for an
    index which was populated before the join.

    Link: http://lkml.kernel.org/r/1480369871-5271-58-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This is an exceptionally complicated function with just one caller
    (tag_pages_for_writeback). We devote a large portion of the runtime of
    the test suite to testing this one function which has one caller. By
    introducing the new function radix_tree_iter_tag_set(), we can eliminate
    all of the complexity while keeping the performance. The caller can now
    use a fairly standard radix_tree_for_each() loop, and it doesn't need to
    worry about tricksy things like 'start' wrapping.

    The test suite continues to spend a large amount of time investigating
    this function, but now it's testing the underlying primitives such as
    radix_tree_iter_resume() and the radix_tree_for_each_tagged() iterator
    which are also used by other parts of the kernel.

    Link: http://lkml.kernel.org/r/1480369871-5271-57-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This rather complicated function can be better implemented as an
    iterator. It has only one caller, so move the functionality to the only
    place that needs it. Update the test suite to follow the same pattern.

    Link: http://lkml.kernel.org/r/1480369871-5271-56-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Acked-by: Konstantin Khlebnikov
    Tested-by: Kirill A. Shutemov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This fixes several interlinked problems with the iterators in the
    presence of multiorder entries.

    1. radix_tree_iter_next() would only advance by one slot, which would
    result in the iterators returning the same entry more than once if
    there were sibling entries.

    2. radix_tree_next_slot() could return an internal pointer instead of
    a user pointer if a tagged multiorder entry was immediately followed by
    an entry of lower order.

    3. radix_tree_next_slot() expanded to a lot more code than it used to
    when multiorder support was compiled in. And I wasn't comfortable with
    entry_to_node() being in a header file.

    Fixing radix_tree_iter_next() for the presence of sibling entries
    necessarily involves examining the contents of the radix tree, so we now
    need to pass 'slot' to radix_tree_iter_next(), and we need to change the
    calling convention so it is called *before* dropping the lock which
    protects the tree. Also rename it to radix_tree_iter_resume(), as some
    people thought it was necessary to call radix_tree_iter_next() each time
    around the loop.

    radix_tree_next_slot() becomes closer to how it looked before multiorder
    support was introduced. It only checks to see if the next entry in the
    chunk is a sibling entry or a pointer to a node; this should be rare
    enough that handling this case out of line is not a performance impact
    (and such impact is amortised by the fact that the entry we just
    processed was a multiorder entry). Also, radix_tree_next_slot() used to
    force a new chunk lookup for untagged entries, which is more expensive
    than the out of line sibling entry skipping.

    Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
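
Problem 1 above (the same entry being returned more than once when sibling entries are present) can be shown with a toy model. The flat slot array and both walk functions below are invented for illustration and bear no relation to the real radix tree layout or iterator code.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8

struct slot {
    int value;          /* 0 means empty */
    int is_sibling;     /* 1 if this slot aliases an earlier entry */
};

/* Resolve a slot to the entry it stands for (sibling -> canonical). */
static const struct slot *canonical(const struct slot *slots, size_t i)
{
    while (i > 0 && slots[i].is_sibling)
        i--;
    return &slots[i];
}

/* Naive walk: advance by exactly one slot each time and report the
 * entry found there, so a multiorder entry shows up once per slot. */
static size_t walk_naive(const struct slot *slots, int *out)
{
    size_t n = 0;

    for (size_t i = 0; i < NSLOTS; i++) {
        const struct slot *s = canonical(slots, i);

        if (s->value)
            out[n++] = s->value;
    }
    return n;
}

/* Sibling-aware walk: skip the slots that merely alias an entry
 * that has already been returned. */
static size_t walk_skip_siblings(const struct slot *slots, int *out)
{
    size_t n = 0;

    for (size_t i = 0; i < NSLOTS; i++) {
        if (slots[i].is_sibling)
            continue;
        if (slots[i].value)
            out[n++] = slots[i].value;
    }
    return n;
}
```

Skipping siblings requires looking at the slot contents, which mirrors the commit's point that the resume helper needs the slot passed in and must run while the tree is still protected by the lock.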
     
  • Remove the old find_next_bit code in favour of linking in the find_bit
    code from tools/lib.

    Link: http://lkml.kernel.org/r/1480369871-5271-48-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This probably doubles the size of each item allocated by the test suite
    but it lets us check a few more things, and may be needed for upcoming
    API changes that require the caller pass in the order of the entry.

    Link: http://lkml.kernel.org/r/1480369871-5271-46-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • item_kill_tree() assumes that everything in the tree is a pointer to a
    struct item, which is annoying when testing the behaviour of exceptional
    entries. Fix it to delete exceptional entries on the assumption they
    don't need to be freed.

    Link: http://lkml.kernel.org/r/1480369871-5271-45-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • Calling rcu_barrier() allows all of the rcu-freed memory to be actually
    returned to the pool, and allows nr_allocated to return to 0. As well
    as allowing diffs between runs to be more useful, it also lets us
    pinpoint leaks more effectively.

    Link: http://lkml.kernel.org/r/1480369871-5271-44-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • This adds a simple benchmark for the iterator, similar to the one I used
    for commit 78c1d78488a3 ("radix-tree: introduce bit-optimized iterator").

    Building with make BENCHMARK=1 sets the radix tree order to 6, which
    gives performance comparable to in-kernel performance.

    Link: http://lkml.kernel.org/r/1480369871-5271-43-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Each thread needs to register itself with RCU, otherwise the reading
    thread's read lock has no effect and the freeing thread will free the
    memory in the tree without waiting for the read lock to be dropped.

    Link: http://lkml.kernel.org/r/1480369871-5271-42-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • Instead of reseeding the random number generator every time around the
    loop in big_gang_check(), seed it at the beginning of execution. Use
    rand_r() and an independent base seed for each thread in
    iteration_test() so they don't stomp all over each other's state. Since
    this particular test depends on the kernel scheduler, the iteration test
    can't be reproduced based purely on the random seed, but at least it
    won't pollute the other tests.

    Print the seed, and allow the seed to be specified so that a run which
    hits a problem can be reproduced.

    Link: http://lkml.kernel.org/r/1480369871-5271-41-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
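
The per-thread seeding scheme described above can be sketched as follows. The struct worker type, the worker count, and deriving each seed as base_seed + index are assumptions for illustration; the commit message only says that each thread gets an independent base seed and uses rand_r().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

#define NWORKERS 5
#define NDRAWS   4

struct worker {
    unsigned int seed;      /* private rand_r() state for this worker */
    int draws[NDRAWS];
};

/* Derive one seed per worker from a single printed/specified base
 * seed, so a failing run can be repeated with the same value. */
static void init_workers(struct worker *w, unsigned int base_seed)
{
    for (int i = 0; i < NWORKERS; i++)
        w[i].seed = base_seed + (unsigned int)i;
}

/* Body a thread would run: it only ever touches its own seed, so
 * other workers cannot perturb its random sequence. */
static void worker_run(struct worker *w)
{
    for (int i = 0; i < NDRAWS; i++)
        w->draws[i] = rand_r(&w->seed);
}
```

Because each worker owns its state, the numbers a worker draws do not depend on the order in which the workers are scheduled, unlike a shared rand() stream.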
     
  • It can be a source of mild concern when the test suite shows that we're
    leaking nodes. While poring over the source code looking for leaks can
    lead to some fascinating bugs being discovered, sometimes the leak is
    simply that these nodes were preallocated and are sitting on the per-CPU
    list. Free them by calling the CPU dead callback.

    Link: http://lkml.kernel.org/r/1480369871-5271-40-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • Rather than simply NOP out preempt_enable() and preempt_disable(), keep
    track of preempt_count and display it regularly in case either the test
    suite or the code under test is forgetting to balance the enables &
    disables. Only found a test-case that was forgetting to re-enable
    preemption, but it's a possibility worth checking.

    Link: http://lkml.kernel.org/r/1480369871-5271-39-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
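
A minimal sketch of counting preemption enables and disables instead of turning them into no-ops is shown below. The report_preempt_count() helper and the two sample tests are hypothetical; only the idea of tracking preempt_count and displaying it comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's preemption counter. */
static int preempt_count;

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }

/* Hypothetical helper: print the counter at a checkpoint and return
 * it, so an imbalance shows up in the test output. */
static int report_preempt_count(const char *where)
{
    printf("%s: preempt_count = %d\n", where, preempt_count);
    return preempt_count;
}

/* A well-behaved test: every disable has a matching enable. */
static void balanced_test(void)
{
    preempt_disable();
    preempt_enable();
}

/* A buggy test that forgets to re-enable preemption once. */
static void leaky_test(void)
{
    preempt_disable();
    preempt_disable();
    preempt_enable();
}
```

Printing the counter after each test makes a leaked disable visible as a value that never returns to zero.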
     
  • In order to test the preload code, it is necessary to fail GFP_ATOMIC
    allocations, which requires defining GFP_KERNEL and GFP_ATOMIC properly.
    Remove the obsolete __GFP_WAIT and copy the definitions of the __GFP
    flags which are used from the kernel include files. We also need the
    real definition of gfpflags_allow_blocking() to persuade the radix tree
    to actually use its preallocated nodes.

    Link: http://lkml.kernel.org/r/1480369871-5271-38-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
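
To illustrate why the flag definitions matter here, the sketch below shows how a blocking check can tell GFP_KERNEL from GFP_ATOMIC. The way the flags are composed follows my reading of the kernel's include/linux/gfp.h from that era and should be treated as an assumption; the bit positions are placeholders chosen for this example, not the kernel's values.

```c
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Placeholder bit positions, chosen for this sketch only. */
#define __GFP_HIGH            0x01u
#define __GFP_IO              0x02u
#define __GFP_FS              0x04u
#define __GFP_ATOMIC          0x08u
#define __GFP_DIRECT_RECLAIM  0x10u
#define __GFP_KSWAPD_RECLAIM  0x20u
#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)

/* Assumed composition: only GFP_KERNEL carries the direct-reclaim bit. */
#define GFP_ATOMIC (__GFP_HIGH | __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* An allocation may block only if it is allowed to do direct reclaim. */
static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
{
	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
}
```

With definitions like these, a harness can fail every allocation for which the check returns false, which would force the code under test to fall back on its preallocated nodes.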
     
  • Patch series "Radix tree patches for 4.10", v3.

    Mostly these are improvements; the only bug fixes in here relate to
    multiorder entries (which are unused in the 4.9 tree).

    This patch (of 32):

    The radix tree uses its own buggy WARN_ON_ONCE. Replace it with the
    definition from asm-generic/bug.h

    Link: http://lkml.kernel.org/r/1480369871-5271-37-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
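
For readers unfamiliar with the macro, a once-only warning can be sketched as below. This is a simplified userspace illustration, not the definition from asm-generic/bug.h: it relies on GNU C statement expressions (gcc or clang), and the `warnings_printed` counter is a hypothetical addition so that the behaviour can be observed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counter, present only to make the sketch observable. */
static int warnings_printed;

/*
 * Evaluates to the truth value of the condition every time, but prints
 * the warning only the first time the condition is true at a given call
 * site (each expansion gets its own static flag).
 */
#define WARN_ON_ONCE(condition) ({					\
	static bool warned_;						\
	int ret_ = !!(condition);					\
	if (ret_ && !warned_) {						\
		warned_ = true;						\
		warnings_printed++;					\
		fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
	}								\
	ret_;								\
})
```

Because the macro still returns the condition, callers can keep writing `if (WARN_ON_ONCE(cond)) return;` and take the error path on every hit while logging only once.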
     
  • Adjust spelling to the more common "mandatory". The variant "mandidory"
    is certainly wrong.

    Link: http://lkml.kernel.org/r/20161011073003.GA19476@amd
    Signed-off-by: Pavel Machek
    Acked-by: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     

14 Dec, 2016

2 commits

  • Pull arm64 updates from Catalin Marinas:

    - struct thread_info moved off-stack (also touching
    include/linux/thread_info.h and include/linux/restart_block.h)

    - cpus_have_cap() reworked to avoid __builtin_constant_p() for static
    key use (also touching drivers/irqchip/irq-gic-v3.c)

    - uprobes support (currently only for native 64-bit tasks)

    - Emulation of kernel Privileged Access Never (PAN) using TTBR0_EL1
    switching to a reserved page table

    - CPU capacity information passing via DT or sysfs (used by the
    scheduler)

    - support for systems without FP/SIMD (IOW, kernel avoids touching
    these registers; there is no soft-float ABI, nor kernel emulation for
    AArch64 FP/SIMD)

    - handling of hardware watchpoint with unaligned addresses, varied
    lengths and offsets from base

    - use of the page table contiguous hint for kernel mappings

    - hugetlb fixes for sizes involving the contiguous hint

    - remove unnecessary I-cache invalidation in flush_cache_range()

    - CNTHCTL_EL2 access fix for CPUs with VHE support (ARMv8.1)

    - boot-time checks for writable+executable kernel mappings

    - simplify asm/opcodes.h and avoid including the 32-bit ARM counterpart
    and make the arm64 kernel headers self-consistent (Xen headers patch
    merged separately)

    - Workaround for broken .inst support in certain binutils versions

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (60 commits)
    arm64: Disable PAN on uaccess_enable()
    arm64: Work around broken .inst when defective gas is detected
    arm64: Add detection code for broken .inst support in binutils
    arm64: Remove reference to asm/opcodes.h
    arm64: Get rid of asm/opcodes.h
    arm64: smp: Prevent raw_smp_processor_id() recursion
    arm64: head.S: Fix CNTHCTL_EL2 access on VHE system
    arm64: Remove I-cache invalidation from flush_cache_range()
    arm64: Enable HIBERNATION in defconfig
    arm64: Enable CONFIG_ARM64_SW_TTBR0_PAN
    arm64: xen: Enable user access before a privcmd hvc call
    arm64: Handle faults caused by inadvertent user access with PAN enabled
    arm64: Disable TTBR0_EL1 during normal kernel execution
    arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
    arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
    arm64: Factor out PAN enabling/disabling into separate uaccess_* macros
    arm64: Update the synchronous external abort fault description
    selftests: arm64: add test for unaligned/inexact watchpoint handling
    arm64: Allow hw watchpoint of length 3,5,6 and 7
    arm64: hw_breakpoint: Handle inexact watchpoint addresses
    ...

    Linus Torvalds
     
  • Lower the default message verbosity so that error results are no
    longer shown on stdout by default. Since that behavior
    is the same as giving the --quiet option, this patch removes
    --quiet and makes each --verbose increase the verbosity.

    In other words, the verbosity options change as follows:
    ftracetest -q -> ftracetest
    ftracetest -> ftracetest -v
    ftracetest -v -> ftracetest -v -v (or -vv)

    Link: http://lkml.kernel.org/r/148007872763.5917.15256235993753860592.stgit@devbox

    Acked-by: Shuah Khan
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     

13 Dec, 2016

4 commits

  • This test script does whitebox testing of the gpio subsystem (based on
    gpiolib). It manipulates a gpio device through chardev or sysfs and
    checks the result from debugfs. By default the script tests gpio-mockup
    through chardev. Users can test another gpio chip by passing its module
    name. Some of the testcases are turned off by default to avoid
    conflicts with gpiochips already in the system.

    In detail, it tests the following things:
    1. Direction and output value for a valid pin.
    2. Dynamic allocation of the gpio base.
    3. Adding single and multiple gpiochips to check for overlap.

    Run "tools/testing/selftests/gpio/gpio-mockup.sh -h" for usage.

    Acked-by: Shuah Khan
    Signed-off-by: Bamvor Jian Zhang
    Signed-off-by: Shuah Khan

    Bamvor Jian Zhang
     
  • Print "ERROR" on all messages instead of using ill-defined terms
    like "BAD".

    Signed-off-by: Gustavo Padovan
    Signed-off-by: Shuah Khan

    Gustavo Padovan
     
  • Pull documentation update from Jonathan Corbet:
    "These are the documentation changes for 4.10.

    It's another busy cycle for the docs tree, as the sphinx conversion
    continues. Highlights include:

    - Further work on PDF output, which remains a bit of a pain but
    should be more solid now.

    - Five more DocBook template files converted to Sphinx. Only 27 to
    go... Lots of plain-text files have also been converted and
    integrated.

    - Images in binary formats have been replaced with more
    source-friendly versions.

    - Various bits of organizational work, including the renaming of
    various files discussed at the kernel summit.

    - New documentation for the device_link mechanism.

    ... and, of course, lots of typo fixes and small updates"

    * tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits)
    dma-buf: Extract dma-buf.rst
    Update Documentation/00-INDEX
    docs: 00-INDEX: document directories/files with no docs
    docs: 00-INDEX: remove non-existing entries
    docs: 00-INDEX: add missing entries for documentation files/dirs
    docs: 00-INDEX: consolidate process/ and admin-guide/ description
    scripts: add a script to check if Documentation/00-INDEX is sane
    Docs: change sh -> awk in REPORTING-BUGS
    Documentation/core-api/device_link: Add initial documentation
    core-api: remove an unexpected unident
    ppc/idle: Add documentation for powersave=off
    Doc: Correct typo, "Introdution" => "Introduction"
    Documentation/atomic_ops.txt: convert to ReST markup
    Documentation/local_ops.txt: convert to ReST markup
    Documentation/assoc_array.txt: convert to ReST markup
    docs-rst: parse-headers.pl: cleanup the documentation
    docs-rst: fix media cleandocs target
    docs-rst: media/Makefile: reorganize the rules
    docs-rst: media: build SVG from graphviz files
    docs-rst: replace bayer.png by a SVG image
    ...

    Linus Torvalds
     
  • Merge updates from Andrew Morton:

    - various misc bits

    - most of MM (quite a lot of MM material is awaiting the merge of
    linux-next dependencies)

    - kasan

    - printk updates

    - procfs updates

    - MAINTAINERS

    - /lib updates

    - checkpatch updates

    * emailed patches from Andrew Morton: (123 commits)
    init: reduce rootwait polling interval time to 5ms
    binfmt_elf: use vmalloc() for allocation of vma_filesz
    checkpatch: don't emit unified-diff error for rename-only patches
    checkpatch: don't check c99 types like uint8_t under tools
    checkpatch: avoid multiple line dereferences
    checkpatch: don't check .pl files, improve absolute path commit log test
    scripts/checkpatch.pl: fix spelling
    checkpatch: don't try to get maintained status when --no-tree is given
    lib/ida: document locking requirements a bit better
    lib/rbtree.c: fix typo in comment of ____rb_erase_color
    lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
    MAINTAINERS: add drm and drm/i915 irc channels
    MAINTAINERS: add "C:" for URI for chat where developers hang out
    MAINTAINERS: add drm and drm/i915 bug filing info
    MAINTAINERS: add "B:" for URI where to file bugs
    get_maintainer: look for arbitrary letter prefixes in sections
    printk: add Kconfig option to set default console loglevel
    printk/sound: handle more message headers
    printk/btrfs: handle more message headers
    printk/kdb: handle more message headers
    ...

    Linus Torvalds