07 Feb, 2018

40 commits

  • test_sort.c performs array-based and linked-list sort tests. The code
    can be compiled either as a loadable module or built into the kernel.

    The current code does not allow the test_sort.ko module to be unloaded
    after successful completion.

    This patch adds support for unloading the "test_sort.ko" module by
    adding module_exit support; a minimal sketch follows below.

    A previous patch implemented auto-unload support by returning -EAGAIN
    from the module_init() function on success, but this approach is not
    ideal.

    Auto-unload might seem like a nice optimization, but it encourages
    inconsistent behaviour, and behaviour that differs from all other normal
    modules.
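
    A minimal sketch of the change, assuming the usual module boilerplate
    (illustrative, not the actual test_sort.c code):

    ```
    #include <linux/module.h>

    static int __init test_sort_init(void)
    {
            return 0;       /* report success; no -EAGAIN auto-unload trick */
    }

    static void __exit test_sort_exit(void)
    {
            /* nothing to tear down; exists so "rmmod test_sort" works */
    }

    module_init(test_sort_init);
    module_exit(test_sort_exit);
    MODULE_LICENSE("GPL");
    ```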

    Link: http://lkml.kernel.org/r/1513967133-6843-1-git-send-email-pravin.shedge4linux@gmail.com
    Signed-off-by: Pravin Shedge
    Cc: Kostenzer Felix
    Cc: Andy Shevchenko
    Cc: Geert Uytterhoeven
    Cc: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pravin Shedge
     
  • No need to get into the submenu to disable all related config entries.

    This makes it easier to disable all RUNTIME_TESTS config options without
    entering the submenu. It will also make the enabled/disabled state
    visible from the outer menu.

    This is only intended to change menuconfig UI, not change the config
    dependencies.

    Link: http://lkml.kernel.org/r/20171209162742.7363-1-vincent.legoll@gmail.com
    Signed-off-by: Vincent Legoll
    Cc: Ingo Molnar
    Cc: Byungchul Park
    Cc: Peter Zijlstra
    Cc: "Paul E. McKenney"
    Cc: Josh Poimboeuf
    Cc: Geert Uytterhoeven
    Cc: Randy Dunlap
    Cc: "Luis R. Rodriguez"
    Cc: Nicholas Piggin
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vincent Legoll
     
  • We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
    It's essentially a joined iteration in search for a non-zero bit, which is
    currently implemented as a lookup join (find a nonzero bit on the lhs,
    lookup the rhs to see if it's set there).

    Implement a direct join (find a nonzero bit on the incrementally built
    join); a conceptual sketch follows below. Also add generic bitmap
    benchmarks in the new `test_find_bit` module for the new function (see
    `find_next_and_bit` in [2] and [3] below).
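
    A conceptual sketch of the direct join (assumed shape, not the actual
    lib/find_bit.c code): AND the two words and search the combined word,
    instead of finding a bit in addr1 and then probing addr2 for it.

    ```
    #define BITS_PER_LONG (8 * sizeof(unsigned long))

    static unsigned long find_next_and_bit_sketch(const unsigned long *addr1,
                                                  const unsigned long *addr2,
                                                  unsigned long nbits,
                                                  unsigned long start)
    {
            unsigned long i, word = 0;

            for (i = start / BITS_PER_LONG; i * BITS_PER_LONG < nbits; i++) {
                    word = addr1[i] & addr2[i];     /* incrementally built join */
                    if (i == start / BITS_PER_LONG)
                            word &= ~0UL << (start % BITS_PER_LONG);
                    if (word)
                            break;
            }
            if (!word)
                    return nbits;
            i = i * BITS_PER_LONG + __builtin_ctzl(word);
            return i < nbits ? i : nbits;
    }
    ```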

    For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
    faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory
    usage. Note that on Arm, the new pure-C implementation still outperforms
    the old one that uses a mix of C and asm (`find_next_bit`) [3].

    [1] Approximate benchmark code:

    ```
    unsigned long src1p[nr_cpumask_longs] = {pattern1};
    unsigned long src2p[nr_cpumask_longs] = {pattern2};
    for (/*a bunch of repetitions*/) {
        for (int n = -1; n <= nr_cpu_ids; ++n) {
            asm volatile("" : "+rm"(src1p)); // prevent any optimization
            asm volatile("" : "+rm"(src2p));
            unsigned long result = cpumask_next_and(n, src1p, src2p);
            asm volatile("" : "+rm"(result));
        }
    }
    ```
    Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
    Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
    Signed-off-by: Clement Courbet
    Signed-off-by: Geert Uytterhoeven
    Cc: Yury Norov
    Cc: Geert Uytterhoeven
    Cc: Alexey Dobriyan
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton

    Signed-off-by: Linus Torvalds

    Clement Courbet
     
  • As suggested in review comments:
    * printk: align numbers using whitespaces instead of tabs;
    * return error value from init() to avoid calling rmmod if testing again;
    * use ktime_get instead of get_cycles, as some arches don't support the
    latter (see the timing sketch after the dmesg output below);

    The output in dmesg (on QEMU arm64):
    [ 38.823430] Start testing find_bit() with random-filled bitmap
    [ 38.845358] find_next_bit: 20138448 ns, 163968 iterations
    [ 38.856217] find_next_zero_bit: 10615328 ns, 163713 iterations
    [ 38.863564] find_last_bit: 7111888 ns, 163967 iterations
    [ 40.944796] find_first_bit: 2081007216 ns, 163968 iterations
    [ 40.944975]
    [ 40.944975] Start testing find_bit() with sparse bitmap
    [ 40.945268] find_next_bit: 73216 ns, 656 iterations
    [ 40.967858] find_next_zero_bit: 22461008 ns, 327025 iterations
    [ 40.968047] find_last_bit: 62320 ns, 656 iterations
    [ 40.978060] find_first_bit: 9889360 ns, 656 iterations
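
    A hedged sketch of the timing approach: measure a full find_next_bit()
    sweep with ktime_get() (illustrative names, not the exact test module
    code):

    ```
    static void time_find_next_bit(const unsigned long *bitmap,
                                   unsigned long len)
    {
            unsigned long i, cnt = 0;
            ktime_t time;

            time = ktime_get();
            for (i = 0; i < len; cnt++)
                    i = find_next_bit(bitmap, len, i) + 1;
            time = ktime_sub(ktime_get(), time);
            pr_err("find_next_bit: %18llu ns, %6lu iterations\n",
                   (unsigned long long)ktime_to_ns(time), cnt);
    }
    ```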

    Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
    Signed-off-by: Yury Norov
    Tested-by: Geert Uytterhoeven
    Cc: Alexey Dobriyan
    Cc: Clement Courbet
    Cc: Matthew Wilcox
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • As suggested in review comments, rename test_find_bit.c to
    find_bit_benchmark.c.

    Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
    Signed-off-by: Yury Norov
    Tested-by: Geert Uytterhoeven
    Cc: Alexey Dobriyan
    Cc: Clement Courbet
    Cc: Matthew Wilcox
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • stackdepot used to call memcmp(), which compiler tools normally
    instrument, therefore every lookup used to unnecessarily call instrumented
    code. This is somewhat ok in the case of KASAN, but under KMSAN a lot of
    time was spent in the instrumentation.
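
    A sketch of the uninstrumented comparison described above (assumed
    shape, not necessarily the exact patch):

    ```
    /* Compare n unsigned longs without going through the instrumented
     * memcmp(); stack depot entries are word-sized, so this is enough. */
    static inline int stackdepot_memcmp(const unsigned long *u1,
                                        const unsigned long *u2,
                                        unsigned int n)
    {
            for ( ; n-- ; u1++, u2++)
                    if (*u1 != *u2)
                            return 1;
            return 0;
    }
    ```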

    Link: http://lkml.kernel.org/r/20171117172149.69562-1-glider@google.com
    Signed-off-by: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     
  • The behaviour of bitmap_fill() differs from bitmap_zero() in how the
    bits beyond the bitmap are handled: bitmap_zero() clears the entire
    bitmap up to the unsigned long boundary, while bitmap_fill() mimics
    bitmap_set().

    Here we change the bitmap_fill() behaviour to be consistent with
    bitmap_zero() and add a note to the documentation.
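
    An illustration of the difference, assuming 64-bit longs and nbits = 70
    (so the bitmap occupies two words; the helpers are the real bitmap API):

    ```
    DECLARE_BITMAP(map, 70);

    bitmap_zero(map, 70);   /* clears all 128 bits, up to the word boundary */
    bitmap_fill(map, 70);   /* after this change: sets all 128 bits as well */
    bitmap_set(map, 0, 70); /* sets exactly bits 0..69, leaves 70..127 alone */
    ```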

    The change might reveal some bugs in the code where unused bits are
    handled differently and in such cases bitmap_set() has to be used.

    Link: http://lkml.kernel.org/r/20180109172430.87452-4-andriy.shevchenko@linux.intel.com
    Signed-off-by: Andy Shevchenko
    Suggested-by: Rasmus Villemoes
    Cc: Randy Dunlap
    Cc: Yury Norov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     
  • Since we have separate explicit test cases for bitmap_zero() /
    bitmap_clear() and bitmap_fill() / bitmap_set(), clean up
    test_zero_fill_copy() to only test bitmap_copy() functionality and thus
    rename a function to reflect the changes.

    While here, replace bitmap_fill() by bitmap_set() with proper values.

    Link: http://lkml.kernel.org/r/20180109172430.87452-3-andriy.shevchenko@linux.intel.com
    Signed-off-by: Andy Shevchenko
    Reviewed-by: Yury Norov
    Cc: Randy Dunlap
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     
  • Explicitly test bitmap_fill() and bitmap_set() functions.

    For bitmap_fill() we expect behaviour consistent with bitmap_zero(),
    i.e. the trailing bits will be set up to the unsigned long boundary.

    Link: http://lkml.kernel.org/r/20180109172430.87452-2-andriy.shevchenko@linux.intel.com
    Signed-off-by: Andy Shevchenko
    Reviewed-by: Yury Norov
    Cc: Randy Dunlap
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     
  • Explicitly test bitmap_zero() and bitmap_clear() functions.

    Link: http://lkml.kernel.org/r/20180109172430.87452-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Andy Shevchenko
    Reviewed-by: Yury Norov
    Cc: Rasmus Villemoes
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     
  • Replace bitmap_{from,to}_u32array with bitmap_{from,to}_arr32 across the
    kernel. Additionally:
    * __check_eq_bitmap() now takes a single nbits argument.
    * __check_eq_u32_array is not used in the new test but may be used in
    future, so it is not removed here, but is annotated as __used.

    Tested on arm64 and 32-bit BE mips.

    [arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
    Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
    [ynorov@caviumnetworks.com: fix net/core/ethtool.c]
    Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
    Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com
    Signed-off-by: Yury Norov
    Signed-off-by: Arnd Bergmann
    Cc: Ben Hutchings
    Cc: David Decotigny
    Cc: David S. Miller
    Cc: Geert Uytterhoeven
    Cc: Matthew Wilcox
    Cc: Rasmus Villemoes
    Cc: Heiner Kallweit
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • This patchset replaces bitmap_{to,from}_u32array with simpler and more
    standard-looking copy-like functions.

    bitmap_from_u32array() takes 4 arguments (bitmap_to_u32array is similar):
    - unsigned long *bitmap, which is destination;
    - unsigned int nbits, the length of destination bitmap, in bits;
    - const u32 *buf, the source; and
    - unsigned int nwords, the length of source buffer in ints.

    The function's description details it as:
    * copy min(nbits, 32*nwords) bits from @buf to @bitmap, remaining
    * bits between nword and nbits in @bitmap (if any) are cleared.

    Having two size arguments looks unneeded and potentially dangerous.

    It is unneeded because normally the user of a copy-like function should
    take care of the size of the destination and make it big enough to fit
    the source data.

    And it is dangerous because the function may hide a possible error if
    the user doesn't provide a big enough bitmap: data is silently dropped.

    That's why all copy-like functions have one argument for the size of the
    data being copied, and I don't see any reason to make
    bitmap_from_u32array() different.

    One exception that comes to mind is strncpy(), which also takes the size
    of the destination as an argument, but that is justified by the
    possibility of receiving broken strings as the source. This is not the
    case for bitmap_{from,to}_u32array().

    There are not many real users of bitmap_{from,to}_u32array(), and they
    all very clearly provide a destination size matched with the size of the
    source, so the additional functionality is in fact not used. For example:

        bitmap_from_u32array(to->link_modes.supported,
                             __ETHTOOL_LINK_MODE_MASK_NBITS,
                             link_usettings.link_modes.supported,
                             __ETHTOOL_LINK_MODE_MASK_NU32);

    where:

        #define __ETHTOOL_LINK_MODE_MASK_NU32 \
                DIV_ROUND_UP(__ETHTOOL_LINK_MODE_MASK_NBITS, 32)

    In this patch, bitmap_copy_safe and bitmap_{from,to}_arr32 are introduced.

    'Safe' in bitmap_copy_safe() stands for clearing the unused bits of the
    bitmap beyond the last bit, up to the end of the last word. It is useful
    for hardening the API when the bitmap is assumed to be exposed to
    userspace.

    The bitmap_{from,to}_arr32 functions are replacements for
    bitmap_{from,to}_u32array. They don't take the unneeded nwords argument
    and are therefore simpler to implement and understand. A usage sketch
    follows below.
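
    A usage sketch of the new calls (the single nbits argument covers both
    sides; the constants are borrowed from the ethtool example above):

    ```
    DECLARE_BITMAP(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
    u32 words[__ETHTOOL_LINK_MODE_MASK_NU32];

    bitmap_from_arr32(supported, words, __ETHTOOL_LINK_MODE_MASK_NBITS);
    bitmap_to_arr32(words, supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
    ```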

    This patch suggests an optimization for 32-bit systems - aliasing
    bitmap_{from,to}_arr32 to bitmap_copy_safe.

    Another possible optimization is aliasing the 64-bit LE
    bitmap_{from,to}_arr32 to a more generic function(s). But I didn't end
    up with a function that would be helpful by itself and could be used to
    alias the 64-bit LE bitmap_{from,to}_arr32, the way bitmap_copy_safe()
    does, so I preferred to leave things as they are.

    The following patch switches the kernel to the new API and introduces a
    test for it.

    Discussion is here: https://lkml.org/lkml/2017/11/15/592

    [ynorov@caviumnetworks.com: rename bitmap_copy_safe to bitmap_copy_clear_tail]
    Link: http://lkml.kernel.org/r/20180201172508.5739-3-ynorov@caviumnetworks.com
    Link: http://lkml.kernel.org/r/20171228150019.27953-1-ynorov@caviumnetworks.com
    Signed-off-by: Yury Norov
    Cc: Ben Hutchings
    Cc: David Decotigny
    Cc: David S. Miller
    Cc: Geert Uytterhoeven
    Cc: Matthew Wilcox
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • Replace my codeaurora.org address with my kernel.org address so that
    emails don't bounce.

    Link: http://lkml.kernel.org/r/20180129173258.10643-1-sboyd@codeaurora.org
    Signed-off-by: Stephen Boyd
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     
  • This reverts commit 92266d6ef60c ("async: simplify lowest_in_progress()")
    which was simply wrong: In the case where domain is NULL, we now use the
    wrong offsetof() in the list_first_entry macro, so we don't actually
    fetch the ->cookie value, but rather the eight bytes located
    sizeof(struct list_head) further into the struct async_entry.

    On 64 bit, that's the data member, while on 32 bit, that's a u64 built
    from func and data in some order.

    I think the bug happens to be harmless in practice: It obviously only
    affects callers which pass a NULL domain, and AFAICT the only such
    caller is

    async_synchronize_full() ->
    async_synchronize_full_domain(NULL) ->
    async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL)

    and the ASYNC_COOKIE_MAX means that in practice we end up waiting for
    the async_global_pending list to be empty - but it would break if
    somebody happened to pass (void*)-1 as the data element to
    async_schedule, and of course also if somebody ever does a
    async_synchronize_cookie_domain(, NULL) with a "finite" cookie value.

    Maybe the "harmless in practice" means this isn't -stable material. But
    I'm not completely confident my quick git grep'ing is enough, and there
    might be affected code in one of the earlier kernels that has since been
    removed, so I'll leave the decision to the stable guys.

    Link: http://lkml.kernel.org/r/20171128104938.3921-1-linux@rasmusvillemoes.dk
    Fixes: 92266d6ef60c "async: simplify lowest_in_progress()"
    Signed-off-by: Rasmus Villemoes
    Acked-by: Tejun Heo
    Cc: Arjan van de Ven
    Cc: Adam Wallis
    Cc: Lai Jiangshan
    Cc: [3.10+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • Use a separate fd set for select()'s exception-fds parameter to fix the
    following gcc warning:

    pager.c:36:12: error: passing argument 2 to restrict-qualified parameter aliases with argument 4 [-Werror=restrict]
    select(1, &in, NULL, &in, NULL);
    ^~~ ~~~
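
    A sketch of the fix: keep the read set and the exception set in separate
    fd_set objects so the restrict-qualified arguments no longer alias.

    ```
    fd_set in, exception;

    FD_ZERO(&in);
    FD_SET(0, &in);
    exception = in;
    select(1, &in, NULL, &exception, NULL);
    ```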

    Link: http://lkml.kernel.org/r/20180101105626.7168-1-sergey.senozhatsky@gmail.com
    Signed-off-by: Sergey Senozhatsky
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergey Senozhatsky
     
  • Exported header doesn't use anything from ,
    it is which uses memcmp().

    Link: http://lkml.kernel.org/r/20171225171121.GA22754@avx2
    Signed-off-by: Alexey Dobriyan
    Reviewed-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Nearly all modern compilers support a stack-protector option, and nearly
    all modern distributions enable the kernel stack-protector, so enabling
    this by default in kernel builds would make sense. However, Kconfig does
    not have knowledge of available compiler features, so it isn't safe to
    force on, as this would unconditionally break builds for the compilers or
    architectures that don't have support. Instead, this introduces a new
    option, CONFIG_CC_STACKPROTECTOR_AUTO, which attempts to discover the best
    possible stack-protector available, and will allow builds to proceed even
    if the compiler doesn't support any stack-protector.

    This option is made the default so that kernels built with modern
    compilers will be protected-by-default against stack buffer overflows,
    avoiding things like the recent BlueBorne attack. Selection of a specific
    stack-protector option remains available, including disabling it.

    Additionally, tiny.config is adjusted to use CC_STACKPROTECTOR_NONE, since
    that's the option with the least code size (and it used to be the default,
    so we have to explicitly choose it there now).

    Link: http://lkml.kernel.org/r/1510076320-69931-4-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Tested-by: Laura Abbott
    Cc: Masahiro Yamada
    Cc: Arnd Bergmann
    Cc: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Various portions of the kernel, especially per-architecture pieces,
    need to know if the compiler is building with the stack protector.
    This was done in the arch/Kconfig with 'select', but this doesn't
    allow a way to do auto-detected compiler support. In preparation for
    creating an on-if-available default, move the logic for the definition of
    CONFIG_CC_STACKPROTECTOR into the Makefile.

    Link: http://lkml.kernel.org/r/1510076320-69931-3-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Tested-by: Laura Abbott
    Cc: Masahiro Yamada
    Cc: Arnd Bergmann
    Cc: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In order to make stack-protector failures warn instead of unconditionally
    breaking the build, this moves the compiler output sanity-check earlier,
    and sets a flag for later testing. Future patches can choose to warn or
    fail, depending on the flag value.

    Link: http://lkml.kernel.org/r/1510076320-69931-2-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Tested-by: Laura Abbott
    Cc: Masahiro Yamada
    Cc: Arnd Bergmann
    Cc: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • A single character (line break) should be put into a sequence. Thus use
    the corresponding function "seq_putc".

    This issue was detected by using the Coccinelle software.
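
    The typical shape of the substitution (illustrative, not a specific hunk
    from the patch):

    ```
    seq_puts(m, "\n");      /* before: string output for one character */
    seq_putc(m, '\n');      /* after: single-character output */
    ```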

    Link: http://lkml.kernel.org/r/04fb69fe-d820-9141-820f-07e9a48f4635@users.sourceforge.net
    Signed-off-by: Markus Elfring
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Markus Elfring
     
  • Rearrange args for smaller code.

    lookup revolves around memcmp() which gets len 3rd arg, so propagate
    length as 3rd arg.

    readdir and lookup add additional arg to VFS ->readdir and ->lookup, so
    better add it to the end.

    Space savings on x86_64:

    add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-18 (-18)
    Function old new delta
    proc_readdir 22 13 -9
    proc_lookup 18 9 -9

    proc_match() is smaller if not inlined, I promise!

    Link: http://lkml.kernel.org/r/20180104175958.GB5204@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • use_pde() is used at every open/read/write/... of every random /proc
    file. Negative refcount happens only if PDE is being deleted by module
    (read: never). So it gets "likely".

    unuse_pde() gets "unlikely" for the same reason.

    close_pdeo() gets unlikely as the completion is filled only if there is a
    race between PDE removal and close() (read: never ever).

    It even saves code on x86_64 defconfig:

    add/remove: 0/0 grow/shrink: 1/2 up/down: 2/-20 (-18)
    Function old new delta
    close_pdeo 183 185 +2
    proc_reg_get_unmapped_area 119 111 -8
    proc_reg_poll 85 73 -12

    Link: http://lkml.kernel.org/r/20180104175657.GA5204@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • /proc/self inode numbers, value of proc_inode_cache and st_nlink of
    /proc/$TGID are fixed constants.

    Link: http://lkml.kernel.org/r/20180103184707.GA31849@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Document what ->pde_unload_lock actually does.

    Link: http://lkml.kernel.org/r/20180103185120.GB31849@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • struct proc_dir_entry became a bit messy over the years:

    * move 16-bit ->mode_t before namelen to get rid of padding
    * make ->in_use the first field: it seems to be the most used,
    resulting in smaller code on x86_64 (defconfig):

    add/remove: 0/0 grow/shrink: 7/13 up/down: 24/-67 (-43)
    Function old new delta
    proc_readdir_de 451 455 +4
    proc_get_inode 282 286 +4
    pde_put 65 69 +4
    remove_proc_subtree 294 297 +3
    remove_proc_entry 297 300 +3
    proc_register 295 298 +3
    proc_notify_change 94 97 +3
    unuse_pde 27 26 -1
    proc_reg_write 89 85 -4
    proc_reg_unlocked_ioctl 85 81 -4
    proc_reg_read 89 85 -4
    proc_reg_llseek 87 83 -4
    proc_reg_get_unmapped_area 123 119 -4
    proc_entry_rundown 139 135 -4
    proc_reg_poll 91 85 -6
    proc_reg_mmap 79 73 -6
    proc_get_link 55 49 -6
    proc_reg_release 108 101 -7
    proc_reg_open 298 291 -7
    close_pdeo 228 218 -10

    * move writeable fields together into the first cacheline (on x86_64);
    those include
    * ->in_use: reference count, taken every open/read/write/close etc
    * ->count: reference count, taken at readdir on every entry
    * ->pde_openers: tracks (nearly) every open, dirtied
    * ->pde_unload_lock: spinlock protecting ->pde_openers
    * ->proc_iops, ->proc_fops, ->data: writeonce fields,
    used right together with previous group.

    * other rarely written fields go into 1st/2nd and 2nd/3rd cacheline on
    32-bit and 64-bit respectively.

    Additionally on 32-bit, ->subdir, ->subdir_node, ->namelen, ->name go
    fully into 2nd cacheline, separated from writeable fields. They are all
    used during lookup.

    Link: http://lkml.kernel.org/r/20171220215914.GA7877@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext
    data") added a bounce buffer to avoid hardened usercopy checks. Copying
    to the bounce buffer was implemented with a simple memcpy() assuming
    that it is always valid to read from kernel memory iff the
    kern_addr_valid() check passed.

    A simple, but pointless, test case like "dd if=/proc/kcore of=/dev/null"
    can now easily crash the kernel, since the former exception handling on
    invalid kernel addresses doesn't work anymore.

    Also adding a kern_addr_valid() implementation wouldn't help here. Most
    architectures simply return 1 here, while a couple implemented a page
    table walk to figure out if something is mapped at the address in
    question.

    With DEBUG_PAGEALLOC active mappings are established and removed all the
    time, so that relying on the result of kern_addr_valid() before
    executing the memcpy() also doesn't work.

    Therefore simply use probe_kernel_read() to copy to the bounce buffer.
    This also allows read_kcore() to be simplified; a sketch follows below.
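
    A sketch of the resulting copy path (abridged and illustrative, not the
    exact read_kcore() hunk): a faulting kernel address now yields zeroes
    for userspace instead of an oops.

    ```
    if (probe_kernel_read(buf, (void *)start, tsz)) {
            if (clear_user(buffer, tsz))
                    return -EFAULT;
    } else {
            if (copy_to_user(buffer, buf, tsz))
                    return -EFAULT;
    }
    ```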

    At least on s390 this fixes the observed crashes and doesn't introduce
    warnings that were removed with df04abfd181a ("fs/proc/kcore.c: Add
    bounce buffer for ktext data"), even though the generic
    probe_kernel_read() implementation uses uaccess functions.

    While looking into this I'm also wondering if kern_addr_valid() could be
    completely removed...(?)

    Link: http://lkml.kernel.org/r/20171202132739.99971-1-heiko.carstens@de.ibm.com
    Fixes: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
    Fixes: f5509cc18daa ("mm: Hardened usercopy")
    Signed-off-by: Heiko Carstens
    Acked-by: Kees Cook
    Cc: Jiri Olsa
    Cc: Al Viro
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • It is a 1:1 wrapper around seq_release().

    Link: http://lkml.kernel.org/r/20171122171510.GA12161@avx2
    Signed-off-by: Alexey Dobriyan
    Acked-by: Cyrill Gorcunov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • dentry name can be evaluated later, right before calling into VFS.

    Also, spend less time under ->mmap_sem.

    Link: http://lkml.kernel.org/r/20171110163034.GA2534@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Iterators aren't necessary, as you can just grab the first entry and
    delete it until no entries are left.

    Link: http://lkml.kernel.org/r/20171121191121.GA20757@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Mahesh Salgaonkar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Current code does:

    if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2)

    However sscanf() is broken garbage.

    It silently accepts whitespace between format specifiers
    (did you know that?).

    It silently accepts valid strings which result in integer overflow.

    Do not use sscanf() for any even remotely reliable parsing code.

    OK
    # readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
    /lib/systemd/systemd

    broken
    # readlink '/proc/1/map_files/ 55a23af39000-55a23b05b000'
    /lib/systemd/systemd

    broken
    # readlink '/proc/1/map_files/55a23af39000-55a23b05b000 '
    /lib/systemd/systemd

    very broken
    # readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000'
    /lib/systemd/systemd

    Andrei said:

    : This patch breaks criu. It was a bug in criu. And this bug is on a minor
    : path, which works when memfd_create() isn't available. It is a reason why
    : I ask to not backport this patch to stable kernels.
    :
    : In CRIU this bug can be triggered, only if this patch will be backported
    : to a kernel which version is lower than v3.16.

    Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Pavel Emelyanov
    Cc: Andrei Vagin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • READ_ONCE and WRITE_ONCE are useless when only a single read or write
    is being made.

    Link: http://lkml.kernel.org/r/20171120204033.GA9446@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • PROC_NUMBUF is 13 which is enough for "negative int + \n + \0".

    However PIDs and TGIDs are never negative and newline is not a concern,
    so use just 10 per integer.
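
    The arithmetic behind the change: the largest 32-bit value, 4294967295,
    is 10 decimal digits, so 10 bytes per integer suffice (the buffer name
    and snprintf() use here are illustrative):

    ```
    unsigned int tgid = 4294967295u;        /* illustrative worst case */
    char name[10 + 1];                      /* "4294967295" + NUL */

    snprintf(name, sizeof(name), "%u", tgid);
    ```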

    Link: http://lkml.kernel.org/r/20171120203005.GA27743@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • The variable real_size is initialized with a value that is never read;
    it is re-assigned a new value later on, so the initialization is
    redundant and can be removed.

    Cleans up clang warning:

    lib/test_kasan.c:422:21: warning: Value stored to 'real_size' during its initialization is never read

    Link: http://lkml.kernel.org/r/20180206144950.32457-1-colin.king@canonical.com
    Signed-off-by: Colin Ian King
    Acked-by: Andrey Ryabinin
    Reviewed-by: Andrew Morton
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Colin Ian King
     
  • Right now the fact that KASAN uses a single shadow byte for 8 bytes of
    memory is scattered all over the code.

    This change defines KASAN_SHADOW_SCALE_SHIFT early in asm include files
    and makes use of this constant where necessary.
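
    A sketch of how the constant relates addresses to shadow bytes (generic
    KASAN; KASAN_SHADOW_OFFSET is arch-specific and assumed to be defined):

    ```
    #define KASAN_SHADOW_SCALE_SHIFT 3
    #define KASAN_SHADOW_SCALE_SIZE  (1UL << KASAN_SHADOW_SCALE_SHIFT)

    /* One shadow byte tracks KASAN_SHADOW_SCALE_SIZE (8) bytes of memory. */
    static inline void *kasan_mem_to_shadow(const void *addr)
    {
            return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
                    + KASAN_SHADOW_OFFSET;
    }
    ```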

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/34937ca3b90736eaad91b568edf5684091f662e3.1515775666.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov
    Acked-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • Use the new one.

    Link: http://lkml.kernel.org/r/de3b7ffc30a55178913a7d3865216aa7accf6c40.1515775666.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • Detect frees of pointers into the middle of heap objects.

    Link: http://lkml.kernel.org/r/cb569193190356beb018a03bb8d6fbae67e7adbc.1514378558.git.dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     
  • Both kasan_slab_free() and kasan_poison_kfree() deal with freeing of
    slab objects. However, kasan_poison_kfree() mishandles
    SLAB_TYPESAFE_BY_RCU (it must also not poison such objects) and does not
    detect double-frees.

    Unify code between these functions.

    This solves both of the problems and allows adding more common code
    (e.g. detection of invalid frees).

    Link: http://lkml.kernel.org/r/385493d863acf60408be219a021c3c8e27daa96f.1514378558.git.dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     
  • Detect frees of pointers into the middle of mempool objects.

    I did a one-off test, but it turned out to be very tricky, so I reverted
    it. First, mempool does not call kasan_poison_kfree() unless the
    allocation function fails. I stubbed an allocation function to fail on
    the second and subsequent allocations. But then mempool stopped calling
    kasan_poison_kfree() at all, because it only does so when the allocation
    function is mempool_kmalloc(). We could support this special failing
    test allocation function in mempool, but it also can't live with the
    kasan tests, because these are in a module.

    Link: http://lkml.kernel.org/r/bf7a7d035d7a5ed62d2dd0e3d2e8a4fcdf456aa7.1514378558.git.dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     
  • __builtin_return_address(1) is unreliable without frame pointers.
    With defconfig on kmalloc_pagealloc_invalid_free test I am getting:

    BUG: KASAN: double-free or invalid-free in (null)

    Pass caller PC from callers explicitly.
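
    A sketch of the approach: callers hand their own PC down via _RET_IP_
    (i.e. __builtin_return_address(0)) instead of the callee guessing one
    level up. Function names here are illustrative.

    ```
    void report_invalid_free(void *object, unsigned long ip);

    static __always_inline void free_hook(void *object)
    {
            /* the PC recorded is that of free_hook()'s caller */
            report_invalid_free(object, _RET_IP_);
    }
    ```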

    Link: http://lkml.kernel.org/r/9b01bc2d237a4df74ff8472a3bf6b7635908de01.1514378558.git.dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     
  • Patch series "kasan: detect invalid frees".

    KASAN detects double-frees, but does not detect invalid-frees (when a
    pointer into the middle of a heap object is passed to free). We recently
    had a very unpleasant case in crypto code which freed an inner object
    inside of a heap allocation. This went unnoticed during free, but
    totally corrupted the heap and later led to a bunch of random crashes
    all over the kernel code.

    Detect invalid frees.

    This patch (of 5):

    Detect frees of pointers into the middle of large heap objects.

    I dropped const from kasan_kfree_large() because it starts propagating
    through a bunch of functions in kasan_report.c, slab/slub nearest_obj(),
    all of their local variables, fixup_red_left(), etc.

    Link: http://lkml.kernel.org/r/1b45b4fe1d20fc0de1329aab674c1dd973fee723.1514378558.git.dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov