01 Mar, 2017

1 commit

  • Pull IDR rewrite from Matthew Wilcox:
    "The most significant part of the following is the patch to rewrite the
    IDR & IDA to be clients of the radix tree. But there's much more,
    including an enhancement of the IDA to be significantly more space
    efficient, an IDR & IDA test suite, some improvements to the IDR API
    (and driver changes to take advantage of those improvements), several
    improvements to the radix tree test suite and RCU annotations.

    The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree
    for most of the last cycle. Coupled with the IDR test suite, I feel
    pretty confident that any remaining bugs are quite hard to hit. 0-day
    did a great job of watching my git tree and pointing out problems; as
    it hit them, I added new test-cases to be sure not to be caught the
    same way twice"

    Willy goes on to expand a bit on the IDR rewrite rationale:
    "The radix tree and the IDR use very similar data structures.

    Merging the two codebases lets us share the memory allocation pools,
    and results in a net deletion of 500 lines of code. It also opens up
    the possibility of exposing more of the features of the radix tree to
    users of the IDR (and I have some interesting patches along those
    lines waiting for 4.12)

    It also shrinks the size of the 'struct idr' from 40 bytes to 24 which
    will shrink a fair few data structures that embed an IDR"

    * 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits)
    radix tree test suite: Add config option for map shift
    idr: Add missing __rcu annotations
    radix-tree: Fix __rcu annotations
    radix-tree: Add rcu_dereference and rcu_assign_pointer calls
    radix tree test suite: Run iteration tests for longer
    radix tree test suite: Fix split/join memory leaks
    radix tree test suite: Fix leaks in regression2.c
    radix tree test suite: Fix leaky tests
    radix tree test suite: Enable address sanitizer
    radix_tree_iter_resume: Fix out of bounds error
    radix-tree: Store a pointer to the root in each node
    radix-tree: Chain preallocated nodes through ->parent
    radix tree test suite: Dial down verbosity with -v
    radix tree test suite: Introduce kmalloc_verbose
    idr: Return the deleted entry from idr_remove
    radix tree test suite: Build separate binaries for some tests
    ida: Use exceptional entries for small IDAs
    ida: Move ida_bitmap to a percpu variable
    Reimplement IDR and IDA using the radix tree
    radix-tree: Add radix_tree_iter_delete
    ...

    Linus Torvalds
     

28 Feb, 2017

4 commits

  • Merge yet more updates from Andrew Morton:

    - a few MM remainders

    - misc things

    - autofs updates

    - signals

    - affs updates

    - ipc

    - nilfs2

    - spelling.txt updates

    * emailed patches from Andrew Morton: (78 commits)
    mm, x86: fix HIGHMEM64 && PARAVIRT build config for native_pud_clear()
    mm: add arch-independent testcases for RODATA
    hfs: atomically read inode size
    mm: clarify mm_struct.mm_{users,count} documentation
    mm: use mmget_not_zero() helper
    mm: add new mmget() helper
    mm: add new mmgrab() helper
    checkpatch: warn when formats use %Z and suggest %z
    lib/vsprintf.c: remove %Z support
    scripts/spelling.txt: add some typo-words
    scripts/spelling.txt: add "followings" pattern and fix typo instances
    scripts/spelling.txt: add "therfore" pattern and fix typo instances
    scripts/spelling.txt: add "overwriten" pattern and fix typo instances
    scripts/spelling.txt: add "overwritting" pattern and fix typo instances
    scripts/spelling.txt: add "deintialize(d)" pattern and fix typo instances
    scripts/spelling.txt: add "disassocation" pattern and fix typo instances
    scripts/spelling.txt: add "omited" pattern and fix typo instances
    scripts/spelling.txt: add "explictely" pattern and fix typo instances
    scripts/spelling.txt: add "applys" pattern and fix typo instances
    scripts/spelling.txt: add "configuartion" pattern and fix typo instances
    ...

    Linus Torvalds
     
  • Pull cgroup updates from Tejun Heo:
    "Several noteworthy changes.

    - Parav's rdma controller is finally merged. It is very
    straightforward and can limit the absolute numbers of common rdma
    constructs used by different cgroups.

    - kernel/cgroup.c got too chubby and disorganized. Created
    kernel/cgroup/ subdirectory and moved all cgroup related files
    under kernel/ there and reorganized the core code. This hurts for
    backporting patches but was long overdue.

    - cgroup v2 process listing reimplemented so that it no longer
    depends on allocating a buffer large enough to cache the entire
    result to sort and uniq the output. v2 has always mangled the sort
    order to ensure that users don't depend on the sorted output, so
    this shouldn't surprise anybody. This makes the pid listing
    functions use the same iterators that are used internally, which
    have to have the same iterating capabilities anyway.

    - perf cgroup filtering now works automatically on cgroup v2. This
    patch was posted a long time ago but somehow fell through the
    cracks.

    - misc fixes and documentation updates"

    * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits)
    kernfs: fix locking around kernfs_ops->release() callback
    cgroup: drop the matching uid requirement on migration for cgroup v2
    cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
    cgroup: misc cleanups
    cgroup: call subsys->*attach() only for subsystems which are actually affected by migration
    cgroup: track migration context in cgroup_mgctx
    cgroup: cosmetic update to cgroup_taskset_add()
    rdmacg: Fixed uninitialized current resource usage
    cgroup: Add missing cgroup-v2 PID controller documentation.
    rdmacg: Added documentation for rdmacg
    IB/core: added support to use rdma cgroup controller
    rdmacg: Added rdma cgroup controller
    cgroup: fix a comment typo
    cgroup: fix RCU related sparse warnings
    cgroup: move namespace code to kernel/cgroup/namespace.c
    cgroup: rename functions for consistency
    cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c
    cgroup: separate out cgroup1_kf_syscall_ops
    cgroup: refactor mount path and clearly distinguish v1 and v2 paths
    cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c
    ...

    Linus Torvalds
     
  • This patch makes arch-independent testcases for RODATA. Both x86 and
    x86_64 already have testcases for RODATA, but they are arch-specific
    because they use inline assembly directly.

    Also, cacheflush.h is not a suitable location for rodata-test related
    things. Because they lived in cacheflush.h, changing the state of
    CONFIG_DEBUG_RODATA_TEST adds overhead to the kernel build.

    To solve the above issues, write arch-independent testcases and move
    them to a shared location.

    [jinb.park7@gmail.com: fix config dependency]
    Link: http://lkml.kernel.org/r/20170209131625.GA16954@pjb1027-Latitude-E5410
    Link: http://lkml.kernel.org/r/20170129105436.GA9303@pjb1027-Latitude-E5410
    Signed-off-by: Jinbum Park
    Acked-by: Kees Cook
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Arjan van de Ven
    Cc: Laura Abbott
    Cc: Russell King
    Cc: Valentin Rothberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jinbum Park
     
  • Commit 4a9d4b024a31 ("switch fput to task_work_add") implements a
    schedule_work() for completing fput(), but did not guarantee calling
    __fput() after unpacking initramfs. Because of this, there is a
    possibility that during boot a driver can see ETXTBSY when it tries to
    load a binary from initramfs as fput() is still pending on that binary.

    This patch makes sure that fput() is completed after unpacking initramfs
    and removes the call to flush_delayed_fput() in kernel_init() which
    happens very late after unpacking initramfs.

    Link: http://lkml.kernel.org/r/20170201140540.22051-1-lokeshvutla@ti.com
    Signed-off-by: Lokesh Vutla
    Reported-by: Murali Karicheri
    Cc: Al Viro
    Cc: Tero Kristo
    Cc: Sekhar Nori
    Cc: Nishanth Menon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lokesh Vutla
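
The ordering problem described in this commit can be modelled in user space. The following is a minimal sketch under stated assumptions, not kernel code: `toy_file`, `toy_fput()`, `toy_flush_delayed_fput()` and `toy_exec()` are hypothetical names, and a singly linked list stands in for the kernel's deferred-work machinery. It only shows why the flush has to happen before anything tries to execute a freshly unpacked binary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a file whose final release has been deferred. */
struct toy_file {
    int write_refs;         /* nonzero while a writer still holds it */
    struct toy_file *next;  /* link in the pending-release list */
};

static struct toy_file *toy_pending; /* releases queued but not yet run */

/* Defer the final release instead of running it immediately. */
void toy_fput(struct toy_file *f)
{
    assert(f != NULL);
    f->next = toy_pending;
    toy_pending = f;
}

/* Run every queued release; afterwards no file is left half-closed. */
void toy_flush_delayed_fput(void)
{
    while (toy_pending) {
        struct toy_file *f = toy_pending;

        toy_pending = f->next;
        f->next = NULL;
        f->write_refs = 0;
    }
}

/* Executing a file that is still held for writing fails with ETXTBSY. */
int toy_exec(const struct toy_file *f)
{
    return f->write_refs ? -ETXTBSY : 0;
}
```

In this model, calling `toy_exec()` between `toy_fput()` and `toy_flush_delayed_fput()` returns `-ETXTBSY`, while calling it after the flush succeeds. That mirrors the motivation for completing the pending fput() work right after the initramfs is unpacked.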
     

23 Feb, 2017

4 commits

  • Merge updates from Andrew Morton:
    "142 patches:

    - DAX updates

    - various misc bits

    - OCFS2 updates

    - most of MM"

    * emailed patches from Andrew Morton: (142 commits)
    mm/z3fold.c: limit first_num to the actual range of possible buddy indexes
    mm: fix stray kernel-doc notation
    zram: remove obsolete sysfs attrs
    mm/memblock.c: remove unnecessary log and clean up
    oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA
    mm: drop unused argument of zap_page_range()
    mm: drop zap_details::check_swap_entries
    mm: drop zap_details::ignore_dirty
    mm, page_alloc: warn_alloc nodemask is NULL when cpusets are disabled
    mm: help __GFP_NOFAIL allocations which do not trigger OOM killer
    mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically
    mm: consolidate GFP_NOFAIL checks in the allocator slowpath
    lib/show_mem.c: teach show_mem to work with the given nodemask
    arch, mm: remove arch specific show_mem
    mm, page_alloc: warn_alloc print nodemask
    mm, page_alloc: do not report all nodes in show_mem
    Revert "mm: bail out in shrink_inactive_list()"
    mm, vmscan: consider eligible zones in get_scan_count
    mm, vmscan: cleanup lru size claculations
    mm, vmscan: do not count freed pages as PGDEACTIVATE
    ...

    Linus Torvalds
     
  • Pull printk updates from Petr Mladek:

    - Add Petr Mladek, Sergey Senozhatsky as printk maintainers, and Steven
    Rostedt as the printk reviewer. This idea came up after the
    discussion about printk issues at Kernel Summit. It was formulated
    and discussed at lkml[1].

    - Extend a lock-less NMI per-cpu buffers idea to handle recursive
    printk() calls by Sergey Senozhatsky[2]. It is the first step in
    sanitizing printk as discussed at Kernel Summit.

    The change makes it possible to see messages that would normally get
    ignored or would cause a deadlock.

    It also makes it possible to enable lockdep in printk(). This has
    already paid off: testing in linux-next helped to discover two old
    problems that were hidden before[3][4].

    - Remove unused parameter by Sergey Senozhatsky. Clean up after a past
    change.

    [1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com
    [2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com
    [3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
    [4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    printk: drop call_console_drivers() unused param
    printk: convert the rest to printk-safe
    printk: remove zap_locks() function
    printk: use printk_safe buffers in printk
    printk: report lost messages in printk safe/nmi contexts
    printk: always use deferred printk when flush printk_safe lines
    printk: introduce per-cpu safe_print seq buffer
    printk: rename nmi.c and exported api
    printk: use vprintk_func in vprintk()
    MAINTAINERS: Add printk maintainers

    Linus Torvalds
     
  • SLUB creates a per-cache directory under /sys/kernel/slab which hosts a
    bunch of debug files. Usually, there aren't that many caches on a
    system and this doesn't really matter; however, if memcg is in use, each
    cache can have per-cgroup sub-caches. SLUB creates the same directories
    for these sub-caches under /sys/kernel/slab/$CACHE/cgroup.

    Unfortunately, because there can be a lot of cgroups, active or
    draining, the product of the numbers of caches, cgroups and files in
    each directory can reach a very high number - hundreds of thousands is
    commonplace. Millions and beyond aren't difficult to reach either.

    What's under /sys/kernel/slab is primarily for debugging and the
    information and control on a root cache already cover its
    sub-caches. While having a separate directory for each sub-cache can be
    helpful for development, it doesn't make much sense to pay this amount
    of overhead by default.

    This patch introduces a boot parameter slub_memcg_sysfs which determines
    whether to create sysfs directories for per-memcg sub-caches. It also
    adds CONFIG_SLUB_MEMCG_SYSFS_ON which determines the boot parameter's
    default value and defaults to 0.

    [akpm@linux-foundation.org: kset_unregister(NULL) is legal]
    Link: http://lkml.kernel.org/r/20170204145203.GB26958@mtj.duckdns.org
    Signed-off-by: Tejun Heo
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Vladimir Davydov
    Cc: Michal Hocko
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Pull char/misc driver updates from Greg KH:
    "Here is the big char/misc driver patchset for 4.11-rc1.

    Lots of different driver subsystems updated here: rework for the
    hyperv subsystem to handle new platforms better, mei and w1 and extcon
    driver updates, as well as a number of other "minor" driver updates.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (169 commits)
    goldfish: Sanitize the broken interrupt handler
    x86/platform/goldfish: Prevent unconditional loading
    vmbus: replace modulus operation with subtraction
    vmbus: constify parameters where possible
    vmbus: expose hv_begin/end_read
    vmbus: remove conditional locking of vmbus_write
    vmbus: add direct isr callback mode
    vmbus: change to per channel tasklet
    vmbus: put related per-cpu variable together
    vmbus: callback is in softirq not workqueue
    binder: Add support for file-descriptor arrays
    binder: Add support for scatter-gather
    binder: Add extra size to allocator
    binder: Refactor binder_transact()
    binder: Support multiple /dev instances
    binder: Deal with contexts in debugfs
    binder: Support multiple context managers
    binder: Split flat_binder_object
    auxdisplay: ht16k33: remove private workqueue
    auxdisplay: ht16k33: rework input device initialization
    ...

    Linus Torvalds
     

22 Feb, 2017

2 commits

  • Pull rodata updates from Kees Cook:
    "This renames the (now inaccurate) DEBUG_RODATA and related
    SET_MODULE_RONX configs to the more sensible STRICT_KERNEL_RWX and
    STRICT_MODULE_RWX"

    * tag 'rodata-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONX
    arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common

    Linus Torvalds
     
  • Pull exception table module split from Paul Gortmaker:
    "Final extable.h related changes.

    This completes the separation of exception table content from the
    module.h header file. This is achieved with the final commit that
    removes the one line back compatible change that sourced extable.h
    into the module.h file.

    The commits are unchanged since January, with the exception of a
    couple Acks that came in for the last two commits a bit later. The
    changes have been in linux-next for quite some time[1] and have got
    widespread arch coverage via toolchains I have and also from
    additional ones the kbuild bot has.

    Maintainers of the various arch were Cc'd during the postings to
    lkml[2] and informed that the intention was to take the remaining arch
    specific changes and lump them together with the final two non-arch
    specific changes and submit for this merge window.

    The ia64 diffstat stands out and probably warrants a mention. In an
    earlier review, Al Viro made a valid comment that the original header
    separation of content left something to be desired, and that it get
    fixed as a part of this change, hence the larger diffstat"

    * tag 'extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (21 commits)
    module.h: remove extable.h include now users have migrated
    core: migrate exception table users off module.h and onto extable.h
    cris: migrate exception table users off module.h and onto extable.h
    hexagon: migrate exception table users off module.h and onto extable.h
    microblaze: migrate exception table users off module.h and onto extable.h
    unicore32: migrate exception table users off module.h and onto extable.h
    score: migrate exception table users off module.h and onto extable.h
    metag: migrate exception table users off module.h and onto extable.h
    arc: migrate exception table users off module.h and onto extable.h
    nios2: migrate exception table users off module.h and onto extable.h
    sparc: migrate exception table users onto extable.h
    openrisc: migrate exception table users off module.h and onto extable.h
    frv: migrate exception table users off module.h and onto extable.h
    sh: migrate exception table users off module.h and onto extable.h
    xtensa: migrate exception table users off module.h and onto extable.h
    mn10300: migrate exception table users off module.h and onto extable.h
    alpha: migrate exception table users off module.h and onto extable.h
    arm: migrate exception table users off module.h and onto extable.h
    m32r: migrate exception table users off module.h and onto extable.h
    ia64: ensure exception table search users include extable.h
    ...

    Linus Torvalds
     

21 Feb, 2017

4 commits

  • Pull locking updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Implement wraparound-safe refcount_t and kref_t types based on
    generic atomic primitives (Peter Zijlstra)

    - Improve and fix the ww_mutex code (Nicolai Hähnle)

    - Add self-tests to the ww_mutex code (Chris Wilson)

    - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr
    Bueso)

    - Micro-optimize the current-task logic all around the core kernel
    (Davidlohr Bueso)

    - Tidy up after recent optimizations: remove stale code and APIs,
    clean up the code (Waiman Long)

    - ... plus misc fixes, updates and cleanups"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
    fork: Fix task_struct alignment
    locking/spinlock/debug: Remove spinlock lockup detection code
    lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS
    lkdtm: Convert to refcount_t testing
    kref: Implement 'struct kref' using refcount_t
    refcount_t: Introduce a special purpose refcount type
    sched/wake_q: Clarify queue reinit comment
    sched/wait, rcuwait: Fix typo in comment
    locking/mutex: Fix lockdep_assert_held() fail
    locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()
    locking/rwsem: Reinit wake_q after use
    locking/rwsem: Remove unnecessary atomic_long_t casts
    jump_labels: Move header guard #endif down where it belongs
    locking/atomic, kref: Implement kref_put_lock()
    locking/ww_mutex: Turn off __must_check for now
    locking/atomic, kref: Avoid more abuse
    locking/atomic, kref: Use kref_get_unless_zero() more
    locking/atomic, kref: Kill kref_sub()
    locking/atomic, kref: Add kref_read()
    locking/atomic, kref: Add KREF_INIT()
    ...

    Linus Torvalds
     
  • Pull scheduler updates from Ingo Molnar:
    "The main changes in this (fairly busy) cycle were:

    - There was a class of scheduler bugs related to forgetting to update
    the rq-clock timestamp which can cause weird and hard to debug
    problems, so there's a new debug facility for this: which uncovered
    a whole lot of bugs which convinced us that we want to keep the
    debug facility.

    (Peter Zijlstra, Matt Fleming)

    - Various cputime related updates: eliminate cputime and use u64
    nanoseconds directly, simplify and improve the arch interfaces,
    implement delayed accounting more widely, etc. - (Frederic
    Weisbecker)

    - Move code around for better structure plus cleanups (Ingo Molnar)

    - Move IO schedule accounting deeper into the scheduler plus related
    changes to improve the situation (Tejun Heo)

    - ... plus a round of sched/rt and sched/deadline fixes, plus other
    fixes, updates and cleanups"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (85 commits)
    sched/core: Remove unlikely() annotation from sched_move_task()
    sched/autogroup: Rename auto_group.[ch] to autogroup.[ch]
    sched/topology: Split out scheduler topology code from core.c into topology.c
    sched/core: Remove unnecessary #include headers
    sched/rq_clock: Consolidate the ordering of the rq_clock methods
    delayacct: Include
    sched/core: Clean up comments
    sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds
    sched/clock: Add dummy clear_sched_clock_stable() stub function
    sched/cputime: Remove generic asm headers
    sched/cputime: Remove unused nsec_to_cputime()
    s390, sched/cputime: Remove unused cputime definitions
    powerpc, sched/cputime: Remove unused cputime definitions
    s390, sched/cputime: Make arch_cpu_idle_time() to return nsecs
    ia64, sched/cputime: Remove unused cputime definitions
    ia64: Convert vtime to use nsec units directly
    ia64, sched/cputime: Move the nsecs based cputime headers to the last arch using it
    sched/cputime: Remove jiffies based cputime
    sched/cputime, vtime: Return nsecs instead of cputime_t to account
    sched/cputime: Complete nsec conversion of tick based accounting
    ...

    Linus Torvalds
     
  • Pull EFI updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Changes to the EFI init code to establish whether secure boot
    authentication was performed at boot time. (Josh Boyer, David
    Howells)

    - Wire up the UEFI memory attributes table for x86. This eliminates
    any runtime memory regions that are both writable and executable,
    on recent firmware versions. (Sai Praneeth)

    - Move the BGRT init code to an earlier stage so that we can still
    use efi_mem_reserve(). (Dave Young)

    - Preserve debug symbols in the ARM/arm64 UEFI stub (Ard Biesheuvel)

    - Code deduplication work and various other cleanups (Lukas Wunner)

    - ... plus various other fixes and cleanups"

    * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efi/libstub: Make file I/O chunking x86-specific
    efi: Print the secure boot status in x86 setup_arch()
    efi: Disable secure boot if shim is in insecure mode
    efi: Get and store the secure boot status
    efi: Add SHIM and image security database GUID definitions
    arm/efi: Allow invocation of arbitrary runtime services
    x86/efi: Allow invocation of arbitrary runtime services
    efi/libstub: Preserve .debug sections after absolute relocation check
    efi/x86: Add debug code to print cooked memmap
    efi/x86: Move the EFI BGRT init code to early init code
    efi: Use typed function pointers for the runtime services table
    efi/esrt: Fix typo in pr_err() message
    x86/efi: Add support for EFI_MEMORY_ATTRIBUTES_TABLE
    efi: Introduce the EFI_MEM_ATTR bit and set it from the memory attributes table
    efi: Make EFI_MEMORY_ATTRIBUTES_TABLE initialization common across all architectures
    x86/efi: Deduplicate efi_char16_printk()
    efi: Deduplicate efi_file_size() / _read() / _close()

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "The RCU changes in this cycle are:

    - Dynticks updates, consolidating open-coded counter accesses into a
    well-defined API

    - SRCU updates: Simplify algorithm, add formal verification

    - Documentation updates

    - Miscellaneous fixes

    - Torture-test updates

    Most of the diffstat comes from the relatively large documentation
    update"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
    srcu: Reduce probability of SRCU ->unlock_count[] counter overflow
    rcutorture: Add CBMC-based formal verification for SRCU
    srcu: Force full grace-period ordering
    srcu: Implement more-efficient reader counts
    rcu: Adjust FQS offline checks for exact online-CPU detection
    rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead
    rcu: Abstract extended quiescent state determination
    rcu: Abstract dynticks extended quiescent state enter/exit operations
    rcu: Add lockdep checks to synchronous expedited primitives
    rcu: Eliminate unused expedited_normal counter
    llist: Clarify comments about when locking is needed
    rcu: Fix comment in rcu_organize_nocb_kthreads()
    rcu: Enable RCU tracepoints by default to aid in debugging
    rcu: Make rcu_cpu_starting() use its "cpu" argument
    rcu: Add comment headers to expedited-grace-period counter functions
    rcu: Don't wake rcuc/X kthreads on NOCB CPUs
    rcu: Re-enable TASKS_RCU for User Mode Linux
    rcu: Once again use NMI-based stack traces in stall warnings
    rcu: Remove short-term CPU kicking
    rcu: Add long-term CPU kicking
    ...

    Linus Torvalds
     

14 Feb, 2017

1 commit

  • The IDR is very similar to the radix tree. It has some functionality that
    the radix tree did not have (alloc next free, cyclic allocation, a
    callback-based for_each, destroy tree), which is readily implementable on
    top of the radix tree. A few small changes were needed in order to use a
    tag to represent nodes with free space below them. More extensive
    changes were needed to support storing NULL as a valid entry in an IDR.
    Plain radix trees still interpret NULL as a not-present entry.

    The IDA is reimplemented as a client of the newly enhanced radix tree. As
    in the current implementation, it uses a bitmap at the last level of the
    tree.

    Signed-off-by: Matthew Wilcox
    Signed-off-by: Matthew Wilcox
    Tested-by: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Ross Zwisler
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton

    Matthew Wilcox
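
The "alloc next free" operation and the bitmap kept at the last level of the tree can be illustrated with a small user-space sketch. This is not the kernel implementation: `toy_ida`, `toy_ida_alloc()` and `toy_ida_free()` are hypothetical names, and a flat fixed-size bitmap stands in for the radix tree, so the sketch only shows the allocation semantics.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical toy allocator: a flat bitmap stands in for the radix tree. */
#define TOY_IDA_MAX_IDS   256u
#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define TOY_IDA_WORDS \
    ((TOY_IDA_MAX_IDS + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

struct toy_ida {
    unsigned long bitmap[TOY_IDA_WORDS]; /* bit set = ID in use */
};

/* Return the lowest free ID >= start, or -1 if none is left. */
int toy_ida_alloc(struct toy_ida *ida, unsigned int start)
{
    for (unsigned int id = start; id < TOY_IDA_MAX_IDS; id++) {
        unsigned long *word = &ida->bitmap[id / TOY_BITS_PER_WORD];
        unsigned long mask = 1UL << (id % TOY_BITS_PER_WORD);

        if (!(*word & mask)) {
            *word |= mask; /* claim the ID */
            return (int)id;
        }
    }
    return -1;
}

/* Release an ID so that a later allocation can hand it out again. */
void toy_ida_free(struct toy_ida *ida, unsigned int id)
{
    assert(id < TOY_IDA_MAX_IDS);
    ida->bitmap[id / TOY_BITS_PER_WORD] &=
        ~(1UL << (id % TOY_BITS_PER_WORD));
}
```

Starting from a zeroed `struct toy_ida`, three allocations from 0 yield 0, 1 and 2; after freeing 1, the next allocation from 0 returns 1 again. The linear scan is only for clarity. The commit describes using a radix tree tag to find nodes with free space below them, which this sketch does not model.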
     

10 Feb, 2017

1 commit

  • These files were including module.h for exception table related
    functions. We've now separated that content out into its own file
    "extable.h" so now move over to that and where possible, avoid all
    the extra header content in module.h that we don't really need to
    compile these non-modular files.

    Note:
    init/main.c still needs module.h for __init_or_module
    kernel/extable.c still needs module.h for is_module_text_address

    ...and so we don't get the benefit of removing module.h from the cpp
    feed for these two files, unlike the almost universal 1:1 exchange
    of module.h for extable.h we were able to do in the arch dirs.

    Cc: Rusty Russell
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Acked-by: Jessica Yu
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

08 Feb, 2017

2 commits

  • A preparation patch for printk_safe work. No functional change.
    - rename nmi.c to printk_safe.c
    - add the `printk_safe' prefix to those exported functions that are
    used by both printk-safe and printk-nmi.

    Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Tejun Heo
    Cc: Calvin Owens
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Peter Hurley
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     
  • Both of these options are poorly named. The features they provide are
    necessary for system security and should not be considered debug only.
    Change the names to CONFIG_STRICT_KERNEL_RWX and
    CONFIG_STRICT_MODULE_RWX to better describe what these options do.

    Signed-off-by: Laura Abbott
    Acked-by: Jessica Yu
    Signed-off-by: Kees Cook

    Laura Abbott
     

07 Feb, 2017

1 commit


06 Feb, 2017

1 commit


04 Feb, 2017

1 commit

  • This adds the kbuild infrastructure that will allow architectures to emit
    vmlinux symbol CRCs as 32-bit offsets to another location in the kernel
    where the actual value is stored. This works around problems with CRCs
    being mistaken for relocatable symbols on kernels that self relocate at
    runtime (i.e., powerpc with CONFIG_RELOCATABLE=y).

    For the kbuild side of things, this comes down to the following:

    - introducing a Kconfig symbol MODULE_REL_CRCS

    - adding a -R switch to genksyms to instruct it to emit the CRC symbols
    as references into the .rodata section

    - making modpost distinguish such references from absolute CRC symbols
    by the section index (SHN_ABS)

    - making kallsyms disregard non-absolute symbols with a __crc_ prefix

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Linus Torvalds

    Ard Biesheuvel
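
The idea of storing a 32-bit offset to the place where the CRC actually lives, instead of the CRC itself, can be sketched in plain C. This is an illustration under stated assumptions, not what genksyms or modpost emit: `toy_crc_entry`, `toy_crc_set_ref()` and `toy_crc_resolve()` are hypothetical names, and the offset is computed at run time here rather than by the linker.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout: a 32-bit reference field stored next to the CRC
 * value it refers to. Both field names are made up for this example.
 */
struct toy_crc_entry {
    int32_t  crc_ref;   /* signed offset from &crc_ref to the CRC value */
    uint32_t crc_value; /* the actual CRC */
};

/* Record where the value lives, relative to the reference field itself. */
void toy_crc_set_ref(struct toy_crc_entry *e)
{
    intptr_t delta = (intptr_t)&e->crc_value - (intptr_t)&e->crc_ref;

    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    e->crc_ref = (int32_t)delta;
}

/* Resolve the reference: add the offset to the field's own address. */
uint32_t toy_crc_resolve(const struct toy_crc_entry *e)
{
    const char *base = (const char *)&e->crc_ref;

    return *(const uint32_t *)(base + e->crc_ref);
}
```

Because the reference is relative to its own address, a copy of the entry at a different address (a stand-in for a kernel that relocated itself) still resolves to the same CRC without any fix-up. An absolute value stored in the same slot would not have that property if the toolchain treated it as an address.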
     

01 Feb, 2017

1 commit

  • Before invoking the arch specific handler, efi_mem_reserve() reserves
    the given memory region through memblock.

    efi_bgrt_init() will call efi_mem_reserve() after mm_init(), at which
    time memblock is dead and should not be used anymore.

    The EFI BGRT code depends on ACPI initialization to get the BGRT ACPI
    table, so move parsing of the BGRT table to ACPI early boot code to
    ensure that efi_mem_reserve() in EFI BGRT code can still use memblock safely.

    Tested-by: Bhupesh Sharma
    Signed-off-by: Dave Young
    Signed-off-by: Ard Biesheuvel
    Cc: Len Brown
    Cc: Linus Torvalds
    Cc: Matt Fleming
    Cc: Peter Zijlstra
    Cc: Rafael J. Wysocki
    Cc: Thomas Gleixner
    Cc: linux-acpi@vger.kernel.org
    Cc: linux-efi@vger.kernel.org
    Link: http://lkml.kernel.org/r/1485868902-20401-9-git-send-email-ard.biesheuvel@linaro.org
    Signed-off-by: Ingo Molnar

    Dave Young
     

31 Jan, 2017

1 commit


24 Jan, 2017

1 commit


19 Jan, 2017

1 commit

  • PC/104 form factor devices serve a specific niche of embedded system
    users; most Linux users will not have PC/104 form factor devices. This
    patch introduces the PC104 Kconfig option, which should be used to
    filter PC/104 specific device drivers and options, so that only those
    users interested in PC/104 related options are exposed to them.

    Signed-off-by: William Breathitt Gray
    Signed-off-by: Greg Kroah-Hartman

    William Breathitt Gray
     

17 Jan, 2017

1 commit

  • RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
    synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
    process. There should be no reason not to want this, and there is no
    need to worry about real-time latency at this point. Therefore make it
    the default.

    Note that users wishing to avoid expediting entirely, for example when
    bringing up new hardware possibly having flaky IPIs, can use the
    rcu_normal boot parameter to override boot-time expediting.

    Signed-off-by: Sebastian Andrzej Siewior
    [ paulmck: Reworded commit log. ]
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Sebastian Andrzej Siewior
     

14 Jan, 2017

2 commits

  • Since we need to change the implementation, stop exposing internals.

    Provide KREF_INIT() to allow static initialization of struct kref.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
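
Why a static initializer helps once the internals are hidden can be sketched in userspace C. This is only an illustration: the KREF_INIT() name comes from the commit message, while the struct layout and the demo_read(), demo_get() and demo_put() helpers are assumptions made for this example and are not the kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for struct kref; this field layout is an assumption. */
struct kref {
    atomic_int refcount;
};

/* Static initializer: callers state the starting count, never the field name. */
#define KREF_INIT(n) { .refcount = (n) }

/* Hypothetical helpers for this sketch, not the kernel API. */
static inline int demo_read(struct kref *k)
{
    return atomic_load(&k->refcount);
}

static inline void demo_get(struct kref *k)
{
    atomic_fetch_add(&k->refcount, 1);
}

/* Returns true when the caller dropped the last reference. */
static inline bool demo_put(struct kref *k)
{
    int old = atomic_fetch_sub(&k->refcount, 1);

    assert(old > 0); /* dropping a reference nobody holds is a bug */
    return old == 1;
}

/* A statically allocated object that starts with one reference. */
static struct kref demo_ref = KREF_INIT(1);
```

Because callers only ever write KREF_INIT(1), the counter's type can change later without touching any of the static definitions.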
     
  • Currently we switch to the stable sched_clock if we guess the TSC is
    usable, and then switch back to the unstable path if it turns out TSC
    isn't stable during SMP bringup after all.

    Delay switching to the stable path until after SMP bringup is
    complete. This way we'll avoid switching during the time we detect the
    worst of the TSC offences.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 Jan, 2017

2 commits

  • We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
    not right when CONFIG_NET is disabled and there is no socket interface:

    warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

    I don't know what the correct solution for this is, but simply removing
    the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
    'if NET' section avoids the warning and does not produce other build
    errors.

    Fixes: 483c4933ea09 ("cgroup: Fix CGROUP_BPF config")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     
  • Added rdma cgroup controller that does accounting and limit enforcement
    on rdma/IB resources.

    Added rdma cgroup header file which defines its APIs to perform
    charging/uncharging functionality. It also defines APIs for the RDMA/IB
    stack for device registration. Devices which are registered will
    participate in the controller functions of accounting and limit
    enforcement. It defines the rdmacg_device structure to bind the IB stack
    and the RDMA cgroup controller.

    RDMA resources are tracked using a resource pool. A resource pool is a
    per-device, per-cgroup entity, which allows setting up accounting limits
    on a per-device basis.

    Currently resources are defined by the RDMA cgroup.

    A resource pool is created or destroyed dynamically whenever
    charging or uncharging occurs, respectively, and whenever user
    configuration is done. This trades a little more code for less memory:
    resource pool objects are created only when necessary, instead of
    during cgroup creation and device registration.

    Signed-off-by: Parav Pandit
    Signed-off-by: Tejun Heo

    Parav Pandit
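
The create-on-charge, destroy-on-uncharge lifetime described above can be sketched in plain C. This is a simplified model and not the controller's code: the fixed-size table, the integer device and cgroup ids, the single POOL_LIMIT, and every name below are assumptions for illustration, and the case where user configuration keeps a pool alive is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_POOLS  8
#define POOL_LIMIT 2 /* hypothetical per-pool limit */

/* One pool per (device, cgroup) pair. */
struct pool {
    int device_id;
    int cgroup_id;
    int usage;
};

static struct pool *pools[MAX_POOLS];
static int live_pools;

static struct pool **find_slot(int device_id, int cgroup_id)
{
    for (size_t i = 0; i < MAX_POOLS; i++)
        if (pools[i] && pools[i]->device_id == device_id &&
            pools[i]->cgroup_id == cgroup_id)
            return &pools[i];
    return NULL;
}

/* Charge one resource; the pool is allocated on first use. */
static int charge(int device_id, int cgroup_id)
{
    struct pool **slot = find_slot(device_id, cgroup_id);

    if (!slot) {
        for (size_t i = 0; i < MAX_POOLS && !slot; i++)
            if (!pools[i])
                slot = &pools[i];
        if (!slot)
            return -1; /* table full */
        *slot = calloc(1, sizeof(**slot));
        if (!*slot)
            return -1; /* out of memory */
        (*slot)->device_id = device_id;
        (*slot)->cgroup_id = cgroup_id;
        live_pools++;
    }
    if ((*slot)->usage >= POOL_LIMIT)
        return -1; /* limit enforcement */
    (*slot)->usage++;
    return 0;
}

/* Uncharge one resource; the pool is freed when its usage reaches zero. */
static void uncharge(int device_id, int cgroup_id)
{
    struct pool **slot = find_slot(device_id, cgroup_id);

    if (!slot)
        return;
    assert((*slot)->usage > 0);
    if (--(*slot)->usage == 0) {
        free(*slot);
        *slot = NULL;
        live_pools--;
    }
}
```

A cgroup that never touches a given device never pays for a pool, which is the memory side of the tradeoff the commit message mentions.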
     

26 Dec, 2016

1 commit

  • Add a new page flag, PageWaiters, to indicate the page waitqueue has
    tasks waiting. This can be tested rather than testing waitqueue_active
    which requires another cacheline load.

    This bit is always set when the page has tasks on page_waitqueue(page),
    and is set and cleared under the waitqueue lock. It may be set when
    there are no tasks on the waitqueue, which will cause a harmless extra
    wakeup check that will clear the bit.

    The generic bit-waitqueue infrastructure is no longer used for pages.
    Instead, waitqueues are used directly with a custom key type. The
    generic code was not flexible enough to have PageWaiters manipulation
    under the waitqueue lock (which simplifies concurrency).

    This improves the performance of page lock intensive microbenchmarks by
    2-3%.

    Putting two bits in the same word opens the opportunity to remove the
    memory barrier between clearing the lock bit and testing the waiters
    bit, after some work on the arch primitives (e.g., ensuring memory
    operand widths match and cover both bits).

    Signed-off-by: Nicholas Piggin
    Cc: Dave Hansen
    Cc: Bob Peterson
    Cc: Steven Whitehouse
    Cc: Andrew Lutomirski
    Cc: Andreas Gruenbacher
    Cc: Peter Zijlstra
    Cc: Mel Gorman
    Signed-off-by: Linus Torvalds

    Nicholas Piggin
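
The idea of keeping the waiters bit next to the lock bit can be sketched in userspace C11. This is a rough model under stated assumptions, not the kernel's implementation: the bit values, struct page_sketch, and the helper names are invented for the example, the wait queue is reduced to a counter, and the single atomic read-modify-write in unlock() stands in for the combined clear-and-test that the commit message describes only as a future opportunity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PG_LOCKED  (1u << 0) /* hypothetical bit positions */
#define PG_WAITERS (1u << 1)

struct page_sketch {
    atomic_uint flags;   /* lock bit and waiters bit share one word */
    int wakeups_issued;  /* stands in for walking the wait queue */
};

/* Returns true if the caller took the lock. */
static bool try_lock(struct page_sketch *p)
{
    return !(atomic_fetch_or(&p->flags, PG_LOCKED) & PG_LOCKED);
}

/* Waiter side: note that a task is about to sleep on this page. */
static void mark_waiter(struct page_sketch *p)
{
    atomic_fetch_or(&p->flags, PG_WAITERS);
}

/* Unlock: visit the wait queue only if the waiters bit was set. */
static void unlock(struct page_sketch *p)
{
    unsigned int old = atomic_fetch_and(&p->flags, ~PG_LOCKED);

    assert(old & PG_LOCKED); /* unlocking an unlocked page is a bug */
    if (old & PG_WAITERS) {
        /* A real wake path would take the waitqueue lock here. */
        atomic_fetch_and(&p->flags, ~PG_WAITERS);
        p->wakeups_issued++;
    }
}
```

In the uncontended case unlock() touches only the flags word, so the separate cacheline holding the wait queue is never loaded.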
     

25 Dec, 2016

1 commit


18 Dec, 2016

2 commits

  • Pull networking fixes and cleanups from David Miller:

    1) Revert bogus nla_ok() change, from Alexey Dobriyan.

    2) Various bpf validator fixes from Daniel Borkmann.

    3) Add some necessary SET_NETDEV_DEV() calls to hisi_femac and hip04
    drivers, from Dongpo Li.

    4) Several ethtool ksettings conversions from Philippe Reynes.

    5) Fix bugs in inet port management wrt. soreuseport, from Tom Herbert.

    6) XDP support for virtio_net, from John Fastabend.

    7) Fix NAT handling within a vrf, from David Ahern.

    8) Endianness fixes in dpaa_eth driver, from Claudiu Manoil

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
    net: mv643xx_eth: fix build failure
    isdn: Constify some function parameters
    mlxsw: spectrum: Mark split ports as such
    cgroup: Fix CGROUP_BPF config
    qed: fix old-style function definition
    net: ipv6: check route protocol when deleting routes
    r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected
    irda: w83977af_ir: cleanup an indent issue
    net: sfc: use new api ethtool_{get|set}_link_ksettings
    net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings
    net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings
    net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings
    net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings
    bpf: fix mark_reg_unknown_value for spilled regs on map value marking
    bpf: fix overflow in prog accounting
    bpf: dynamically allocate digest scratch buffer
    gtp: Fix initialization of Flags octet in GTPv1 header
    gtp: gtp_check_src_ms_ipv4() always return success
    net/x25: use designated initializers
    isdn: use designated initializers
    ...

    Linus Torvalds
     
  • CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
    enabled, making it rather challenging to turn CGROUP_BPF on.

    Signed-off-by: Andy Lutomirski
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Andy Lutomirski
     

15 Dec, 2016

1 commit

  • Pull modules updates from Jessica Yu:
    "Summary of modules changes for the 4.10 merge window:

    - The rodata= cmdline parameter has been extended to additionally
    apply to module mappings

    - Fix a hard to hit race between module loader error/clean up
    handling and ftrace registration

    - Some code cleanups, notably panic.c and modules code use a unified
    taint_flags table now. This is much cleaner than duplicating the
    taint flag code in modules.c"

    * tag 'modules-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    module: fix DEBUG_SET_MODULE_RONX typo
    module: extend 'rodata=off' boot cmdline parameter to module mappings
    module: Fix a comment above strong_try_module_get()
    module: When modifying a module's text ignore modules which are going away too
    module: Ensure a module's state is set accordingly during module coming cleanup code
    module: remove trailing whitespace
    taint/module: Clean up global and module taint flags handling
    modpost: free allocated memory

    Linus Torvalds
     

14 Dec, 2016

1 commit

  • Pull workqueue updates from Tejun Heo:
    "Mostly patches to initialize workqueue subsystem earlier and get rid
    of keventd_up().

    The patches were headed for the last merge cycle but got delayed due
    to a bug found at the last minute, which is fixed now.

    Also, to help debugging, destroy_workqueue() is more chatty now on a
    sanity check failure."

    * 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: move wq_numa_init() to workqueue_init()
    workqueue: remove keventd_up()
    debugobj, workqueue: remove keventd_up() usage
    slab, workqueue: remove keventd_up() usage
    power, workqueue: remove keventd_up() usage
    tty, workqueue: remove keventd_up() usage
    mce, workqueue: remove keventd_up() usage
    workqueue: make workqueue available early during boot
    workqueue: dump workqueue state on sanity check failures in destroy_workqueue()

    Linus Torvalds
     

13 Dec, 2016

3 commits

  • Pull documentation update from Jonathan Corbet:
    "These are the documentation changes for 4.10.

    It's another busy cycle for the docs tree, as the sphinx conversion
    continues. Highlights include:

    - Further work on PDF output, which remains a bit of a pain but
    should be more solid now.

    - Five more DocBook template files converted to Sphinx. Only 27 to
    go... Lots of plain-text files have also been converted and
    integrated.

    - Images in binary formats have been replaced with more
    source-friendly versions.

    - Various bits of organizational work, including the renaming of
    various files discussed at the kernel summit.

    - New documentation for the device_link mechanism.

    ... and, of course, lots of typo fixes and small updates"

    * tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits)
    dma-buf: Extract dma-buf.rst
    Update Documentation/00-INDEX
    docs: 00-INDEX: document directories/files with no docs
    docs: 00-INDEX: remove non-existing entries
    docs: 00-INDEX: add missing entries for documentation files/dirs
    docs: 00-INDEX: consolidate process/ and admin-guide/ description
    scripts: add a script to check if Documentation/00-INDEX is sane
    Docs: change sh -> awk in REPORTING-BUGS
    Documentation/core-api/device_link: Add initial documentation
    core-api: remove an unexpected unident
    ppc/idle: Add documentation for powersave=off
    Doc: Correct typo, "Introdution" => "Introduction"
    Documentation/atomic_ops.txt: convert to ReST markup
    Documentation/local_ops.txt: convert to ReST markup
    Documentation/assoc_array.txt: convert to ReST markup
    docs-rst: parse-headers.pl: cleanup the documentation
    docs-rst: fix media cleandocs target
    docs-rst: media/Makefile: reorganize the rules
    docs-rst: media: build SVG from graphviz files
    docs-rst: replace bayer.png by a SVG image
    ...

    Linus Torvalds
     
  • Merge updates from Andrew Morton:

    - various misc bits

    - most of MM (quite a lot of MM material is awaiting the merge of
    linux-next dependencies)

    - kasan

    - printk updates

    - procfs updates

    - MAINTAINERS

    - /lib updates

    - checkpatch updates

    * emailed patches from Andrew Morton : (123 commits)
    init: reduce rootwait polling interval time to 5ms
    binfmt_elf: use vmalloc() for allocation of vma_filesz
    checkpatch: don't emit unified-diff error for rename-only patches
    checkpatch: don't check c99 types like uint8_t under tools
    checkpatch: avoid multiple line dereferences
    checkpatch: don't check .pl files, improve absolute path commit log test
    scripts/checkpatch.pl: fix spelling
    checkpatch: don't try to get maintained status when --no-tree is given
    lib/ida: document locking requirements a bit better
    lib/rbtree.c: fix typo in comment of ____rb_erase_color
    lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
    MAINTAINERS: add drm and drm/i915 irc channels
    MAINTAINERS: add "C:" for URI for chat where developers hang out
    MAINTAINERS: add drm and drm/i915 bug filing info
    MAINTAINERS: add "B:" for URI where to file bugs
    get_maintainer: look for arbitrary letter prefixes in sections
    printk: add Kconfig option to set default console loglevel
    printk/sound: handle more message headers
    printk/btrfs: handle more message headers
    printk/kdb: handle more message headers
    ...

    Linus Torvalds
     
  • Pull timer updates from Thomas Gleixner:
    "The time/timekeeping/timer folks deliver with this update:

    - Fix a reintroduced signed/unsigned issue and clean up the whole
    signed/unsigned mess in the timekeeping core so this won't happen
    accidentally again.

    - Add a new trace clock based on boot time

    - Prevent injection of random sleep times when PM tracing abuses the
    RTC for storage

    - Make posix timers configurable for real tiny systems

    - Add tracepoints for the alarm timer subsystem so timer based
    suspend wakeups can be instrumented

    - The usual pile of fixes and updates to core and drivers"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    timekeeping: Use mul_u64_u32_shr() instead of open coding it
    timekeeping: Get rid of pointless typecasts
    timekeeping: Make the conversion call chain consistently unsigned
    timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion
    alarmtimer: Add tracepoints for alarm timers
    trace: Update documentation for mono, mono_raw and boot clock
    trace: Add an option for boot clock as trace clock
    timekeeping: Add a fast and NMI safe boot clock
    timekeeping/clocksource_cyc2ns: Document intended range limitation
    timekeeping: Ignore the bogus sleep time if pm_trace is enabled
    selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous"
    clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap
    clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map()
    arm64: dts: rockchip: Arch counter doesn't tick in system suspend
    clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend
    posix-timers: Make them configurable
    posix_cpu_timers: Move the add_device_randomness() call to a proper place
    timer: Move sys_alarm from timer.c to itimer.c
    ptp_clock: Allow for it to be optional
    Kconfig: Regenerate *.c_shipped files after previous changes
    ...

    Linus Torvalds