25 Feb, 2017

40 commits

  • pr_cont(...) and printk(KERN_CONT ...) uses should be discouraged
    as their output can be interleaved by multiple logging processes.

    Link: http://lkml.kernel.org/r/7100ba00098694ec81471a299583ed068975fd05.1483465888.git.joe@perches.com
    Signed-off-by: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
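
As a rough userspace illustration of the pr_cont concern above: a line assembled from several separate print calls can be split by output from another context, whereas a line built in a buffer first can be emitted in one call. The helper name format_values and the "values:" prefix are hypothetical; this is a sketch, not kernel code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "values: 1 2 3" in a caller-supplied buffer
 * so the finished line can be emitted with one print call.
 * Returns 0 on success, -1 if the line does not fit.
 */
static int format_values(char *buf, size_t len, const int *vals, int n)
{
	int off = snprintf(buf, len, "values:");
	int i;

	if (off < 0 || (size_t)off >= len)
		return -1;

	for (i = 0; i < n; i++) {
		size_t room = len - (size_t)off;
		int ret = snprintf(buf + off, room, " %d", vals[i]);

		if (ret < 0 || (size_t)ret >= room)
			return -1;	/* truncated: line did not fit */
		off += ret;
	}
	return 0;
}
```

In kernel code the finished buffer would then typically go out through a single call such as pr_info("%s\n", buf).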
     
  • Embedded function names are less appropriate to use when refactoring can
    cause function renaming. Prefer the use of "%s", __func__ to embedded
    function names.

    Link: http://lkml.kernel.org/r/ac9631fdbac5af3507c5bfe88ad9064f0ed764ec.1483510416.git.joe@perches.com
    Signed-off-by: Joe Perches
    Acked-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
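
A minimal sketch of the "%s", __func__ pattern recommended above. The function report_timeout and its message text are made up for illustration; snprintf stands in for a kernel logging call.

```c
#include <stdio.h>

/*
 * Hypothetical function used only for illustration. Because the message
 * uses "%s" with __func__, renaming the function later keeps the log
 * prefix correct without touching the format string.
 */
static int report_timeout(char *buf, size_t len, int ms)
{
	return snprintf(buf, len, "%s: timed out after %d ms", __func__, ms);
}
```

Had the format string embedded the name as "report_timeout: ...", a rename would leave a stale prefix in the log.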
     
  • Remove the functions introduced as wrappers for providing backwards
    compatibility to the prior LZ4 version. They're not needed anymore
    since there are no callers left.

    Link: http://lkml.kernel.org/r/1486321748-19085-6-git-send-email-4sschmid@informatik.uni-hamburg.de
    Signed-off-by: Sven Schmidt
    Cc: Bongkyu Kim
    Cc: Rui Salvaterra
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Kees Cook
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Schmidt
     
  • Update fs/pstore and fs/squashfs to use the updated functions from the
    new LZ4 module.

    Link: http://lkml.kernel.org/r/1486321748-19085-5-git-send-email-4sschmid@informatik.uni-hamburg.de
    Signed-off-by: Sven Schmidt
    Cc: Bongkyu Kim
    Cc: Rui Salvaterra
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Kees Cook
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Schmidt
     
  • Update the crypto modules using LZ4 compression as well as the test
    cases in testmgr.h to work with the new LZ4 module version.

    Link: http://lkml.kernel.org/r/1486321748-19085-4-git-send-email-4sschmid@informatik.uni-hamburg.de
    Signed-off-by: Sven Schmidt
    Cc: Bongkyu Kim
    Cc: Rui Salvaterra
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Kees Cook
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Schmidt
     
  • Update the unlz4 wrapper to work with the updated LZ4 kernel module
    version.

    Link: http://lkml.kernel.org/r/1486321748-19085-3-git-send-email-4sschmid@informatik.uni-hamburg.de
    Signed-off-by: Sven Schmidt
    Cc: Bongkyu Kim
    Cc: Rui Salvaterra
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Kees Cook
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Schmidt
     
  • Patch series "Update LZ4 compressor module", v7.

    This patchset updates the LZ4 compression module to a version based on
    LZ4 v1.7.3. This makes it possible to use the fast compression
    algorithm, also known as LZ4 fast, which provides an "acceleration"
    parameter as a tradeoff between high compression ratio and high
    compression speed.

    We want to use LZ4 fast in order to support compression in lustre and
    (mostly, based on that) investigate data reduction techniques on behalf
    of storage systems.

    Also, it will be useful for other users of LZ4 compression, as with LZ4
    fast it is possible to enable applications to use fast and/or high
    compression depending on the use case. For instance, ZRAM offers an
    LZ4 backend and could benefit from an updated LZ4 in the kernel.

    LZ4 homepage: http://www.lz4.org/
    LZ4 source repository: https://github.com/lz4/lz4
    Source version: 1.7.3

    Benchmark (taken from [1], Core i5-4300U @1.9GHz):
    ----------------|--------------|----------------|----------
    Compressor      | Compression  | Decompression  | Ratio
    ----------------|--------------|----------------|----------
    memcpy          | 4200 MB/s    | 4200 MB/s      | 1.000
    LZ4 fast 50     | 1080 MB/s    | 2650 MB/s      | 1.375
    LZ4 fast 17     |  680 MB/s    | 2220 MB/s      | 1.607
    LZ4 fast 5      |  475 MB/s    | 1920 MB/s      | 1.886
    LZ4 default     |  385 MB/s    | 1850 MB/s      | 2.101

    [1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html

    [PATCH 1/5] lib: Update LZ4 compressor module
    [PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version
    [PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
    [PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version
    [PATCH 5/5] lib/lz4: Remove back-compat wrappers

    This patch (of 5):

    Update the LZ4 kernel module to LZ4 v1.7.3 by Yann Collet. The kernel
    module is inspired by the previous work by Chanho Min. The updated LZ4
    module will not break existing code since the patchset contains
    appropriate changes.

    API changes:

    New method LZ4_compress_fast, which differs from the variant available
    in the kernel by the new acceleration parameter. The parameter makes it
    possible to trade compression ratio for compression speed and vice
    versa.

    LZ4_decompress_fast is the respective decompression method, featuring a
    very fast decoder (multiple GB/s per core), able to reach RAM speed in
    multi-core systems. The decompressor can decompress data compressed
    with LZ4 fast as well as with the LZ4 HC (high compression) algorithm.

    Also the useful functions LZ4_decompress_safe_partial and
    LZ4_compress_destsize were added. The latter reverses the logic by
    trying to compress as much data as possible from source to dest while
    the former aims to decompress partial blocks of data.

    A bunch of streaming functions were also added which allow
    compressing/decompressing data in multiple steps (so-called "streaming
    mode").

    The methods lz4_compress and lz4_decompress_unknownoutputsize are now
    known as LZ4_compress_default and LZ4_decompress_safe, respectively.
    The old methods will be removed since there are no callers left in the
    code.

    [arnd@arndb.de: fix KERNEL_LZ4 support]
    Link: http://lkml.kernel.org/r/20170208211946.2839649-1-arnd@arndb.de
    [akpm@linux-foundation.org: simplify]
    [akpm@linux-foundation.org: fix the simplification]
    [4sschmid@informatik.uni-hamburg.de: fix performance regressions]
    Link: http://lkml.kernel.org/r/1486898178-17125-2-git-send-email-4sschmid@informatik.uni-hamburg.de
    [4sschmid@informatik.uni-hamburg.de: v8]
    Link: http://lkml.kernel.org/r/1487182598-15351-2-git-send-email-4sschmid@informatik.uni-hamburg.de
    Link: http://lkml.kernel.org/r/1486321748-19085-2-git-send-email-4sschmid@informatik.uni-hamburg.de
    Signed-off-by: Sven Schmidt
    Signed-off-by: Arnd Bergmann
    Cc: Bongkyu Kim
    Cc: Rui Salvaterra
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Kees Cook
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sven Schmidt
     
  • The Kconfig currently controlling compilation of this code is:

    lib/Kconfig.debug:config TEST_SORT
    lib/Kconfig.debug: bool "Array-based sort test"

    ...meaning that it currently is not being built as a module by anyone.

    Let's remove the few traces of modular infrastructure use, so that
    when reading the code there is no doubt it is builtin-only.

    Since module_init translates to device_initcall in the non-modular case,
    the init ordering becomes slightly earlier when we change it to use
    subsys_initcall as done here. However, since it is a self contained
    test, this shouldn't be an issue and subsys_initcall seems like a better
    fit for this particular case.

    We also delete the MODULE_LICENSE tag since that information is now
    contained at the top of the file in the comments.

    Link: http://lkml.kernel.org/r/20170124225608.7319-1-paul.gortmaker@windriver.com
    Signed-off-by: Paul Gortmaker
    Cc: Kostenzer Felix
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Gortmaker
     
  • Along with the addition made to Kconfig.debug, the prior existing but
    permanently disabled test function has been slightly refactored.

    Patch has been tested using QEMU 2.1.2 with a .config obtained through
    'make defconfig' (x86_64) and manually enabling the option.

    [arnd@arndb.de: move sort self-test into a separate file]
    Link: http://lkml.kernel.org/r/20170112110657.3123790-1-arnd@arndb.de
    Link: http://lkml.kernel.org/r/HE1PR09MB0394B0418D504DCD27167D4FD49B0@HE1PR09MB0394.eurprd09.prod.outlook.com
    Signed-off-by: Kostenzer Felix
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kostenzer Felix
     
  • Prepare to mark sensitive kernel structures for randomization by making
    sure they're using designated initializers. These were identified
    during allyesconfig builds of x86, arm, and arm64, with most initializer
    fixes extracted from grsecurity.

    Link: http://lkml.kernel.org/r/20161217010253.GA140470@beast
    Signed-off-by: Kees Cook
    Acked-by: Peter Zijlstra (Intel)
    Cc: David Howells
    Cc: Jie Chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
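
To illustrate why designated initializers matter for layout randomization, here is a small sketch. The structure demo_ops and its callbacks are hypothetical stand-ins, not taken from the kernel.

```c
/* Hypothetical ops table, standing in for a sensitive kernel structure. */
struct demo_ops {
	int (*open)(int id);
	int (*close)(int id);
	const char *name;
};

static int demo_open(int id)  { return id + 1; }
static int demo_close(int id) { return id - 1; }

/*
 * A positional initializer, { demo_open, demo_close, "demo" }, silently
 * depends on the declaration order of the members. The designated form
 * below names each member, so it stays correct even if the member order
 * is changed or randomized at build time.
 */
static const struct demo_ops demo = {
	.name  = "demo",
	.close = demo_close,
	.open  = demo_open,
};
```

The members are deliberately listed in a different order than they are declared to show that the order no longer matters.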
     
  • While working on a thermal driver I encountered a scenario where the
    divisor could be negative. Instead of adding local code to handle this,
    I thought I would first try to add support for it in DIV_ROUND_CLOSEST.

    Add support to DIV_ROUND_CLOSEST for negative divisors if both dividend
    and divisor variable types are signed. This should not alter current
    behavior for users of the macro, as negative divisors were previously
    not supported.

    Before:

    DIV_ROUND_CLOSEST( 59, 4) = 15
    DIV_ROUND_CLOSEST( 59, -4) = -14
    DIV_ROUND_CLOSEST( -59, 4) = -15
    DIV_ROUND_CLOSEST( -59, -4) = 14

    After:

    DIV_ROUND_CLOSEST( 59, 4) = 15
    DIV_ROUND_CLOSEST( 59, -4) = -15
    DIV_ROUND_CLOSEST( -59, 4) = -15
    DIV_ROUND_CLOSEST( -59, -4) = 15

    [akpm@linux-foundation.org: fix comment, per Guenter]
    Link: http://lkml.kernel.org/r/20161222102217.29011-1-niklas.soderlund+renesas@ragnatech.se
    Signed-off-by: Niklas Söderlund
    Reviewed-by: Guenter Roeck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Niklas Söderlund
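
The Before/After tables above can be reproduced with two plain functions. This is a sketch for signed int only; the real macro is type-generic and also covers unsigned operands, which are left out here. The function names are hypothetical.

```c
/* Behaviour before the change: only the sign of the dividend is checked. */
static int div_round_closest_old(int x, int d)
{
	return x > 0 ? (x + d / 2) / d : (x - d / 2) / d;
}

/*
 * Behaviour after the change: add half the divisor when dividend and
 * divisor have the same sign, subtract it otherwise, so the truncating
 * division rounds to the closest value in all four sign combinations.
 */
static int div_round_closest_new(int x, int d)
{
	return ((x > 0) == (d > 0)) ? (x + d / 2) / d : (x - d / 2) / d;
}
```

Only the two rows with a negative divisor differ between the functions, which matches the claim that existing users with positive divisors are unaffected.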
     
  • This saves 32 bytes on my x86-64 build, mostly due to alignment
    considerations and sharing more code between find_next_bit and
    find_next_zero_bit, but it does save a couple of instructions.

    There are really two parts to this commit:
    - First, the first half of the test, (!nbits || start >= nbits), is
      trivially a subset of the second half, since nbits and start are both
      unsigned.
    - Second, while looking at the disassembly, I noticed that GCC was
      predicting the branch taken. Since this is a failure case, it's
      clearly the less likely of the two branches, so add an unlikely() to
      override GCC's heuristics.

    [mawilcox@microsoft.com: v2]
    Link: http://lkml.kernel.org/r/1483709016-1834-1-git-send-email-mawilcox@linuxonhyperv.com
    Signed-off-by: Matthew Wilcox
    Acked-by: Yury Norov
    Acked-by: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
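
The equivalence claimed in the first part of the commit above can be checked directly. In this sketch the function names are hypothetical and unlikely() is defined locally through the GCC/Clang builtin __builtin_expect.

```c
#include <stdbool.h>

/* Hint to the compiler that cond is rarely true (GCC/Clang builtin). */
#define unlikely(cond) __builtin_expect(!!(cond), 0)

/* Original form of the early-exit test. */
static bool out_of_range_old(unsigned long nbits, unsigned long start)
{
	return !nbits || start >= nbits;
}

/*
 * Simplified form: for unsigned values, nbits == 0 already implies
 * start >= nbits, so the first half of the old test is redundant.
 */
static bool out_of_range_new(unsigned long nbits, unsigned long start)
{
	return unlikely(start >= nbits);
}
```

Both functions return the same result for every pair of inputs; the hint only affects how the compiler lays out the branch.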
     
  • Allow the atomic64 test code to be compiled either as a loadable
    module or built into the kernel.

    Link: http://lkml.kernel.org/r/1483470276-10517-3-git-send-email-geert@linux-m68k.org
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • Extract the glob test code into its own source file, so that it can
    be compiled either as a loadable module or built into the kernel.

    Link: http://lkml.kernel.org/r/1483470276-10517-2-git-send-email-geert@linux-m68k.org
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • Extract the crc32 test code into its own source file, so that it can
    be compiled either as a loadable module or built into the kernel.

    Link: http://lkml.kernel.org/r/1483470276-10517-1-git-send-email-geert@linux-m68k.org
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • The object notes_attr of type bin_attribute is not modified after
    being initialized by ksysfs_init. Apart from its initialization in
    ksysfs_init, it is only passed as an argument to the function
    sysfs_create_bin_file, and that argument is of type const. Therefore,
    add __ro_after_init to its declaration.

    Link: http://lkml.kernel.org/r/1486839969-16891-1-git-send-email-bhumirks@gmail.com
    Signed-off-by: Bhumika Goyal
    Acked-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bhumika Goyal
     
  • NOTIFY_STOP_MASK (0x8000) has only one bit set and there is no need to
    compare output of "ret & NOTIFY_STOP_MASK" to NOTIFY_STOP_MASK. We just
    need to make sure the output is non-zero, that's it.

    Link: http://lkml.kernel.org/r/88ee58264a2bfab1c97ffc8ac753e25f55f57c10.1483593065.git.viresh.kumar@linaro.org
    Signed-off-by: Viresh Kumar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Viresh Kumar
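
A short sketch of the single-bit argument above. The value 0x8000 is the one quoted in the commit message; the function names and the sample return values used to exercise them are illustrative.

```c
#include <stdbool.h>

#define NOTIFY_STOP_MASK 0x8000	/* value quoted in the commit message */

/* Old form: compare the masked value against the mask itself. */
static bool stop_requested_old(int ret)
{
	return (ret & NOTIFY_STOP_MASK) == NOTIFY_STOP_MASK;
}

/*
 * New form: with exactly one bit in the mask, the masked value is either
 * 0 or the mask, so testing for non-zero gives the same answer.
 */
static bool stop_requested_new(int ret)
{
	return (ret & NOTIFY_STOP_MASK) != 0;
}
```

The shortcut relies on the mask having a single bit set; with a multi-bit mask the two forms would differ whenever only some of the bits are set.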
     
  • With CONFIG_BALLOON_COMPACTION=y the kernel will mount balloon_mnt for
    balloon page migration when we probe a virtio_balloon device. However
    we do not unmount it when removing the device. Fix this.

    Fixes: b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature")
    Link: http://lkml.kernel.org/r/1486531318-35189-1-git-send-email-xieyisheng1@huawei.com
    Signed-off-by: Yisheng Xie
    Acked-by: Minchan Kim
    Cc: Rafael Aquini
    Cc: Konstantin Khlebnikov
    Cc: Gioh Kim
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Jason Wang
    Cc: Hanjun Guo
    Cc: Xishi Qiu
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yisheng Xie
     
  • The CHECK_DATA_CORRUPTION() macro was designed to have callers do
    something meaningful/protective on failure. However, using "return
    false" in the macro too strictly limits the design patterns of callers.
    Instead, let callers handle the logic test directly, but make sure that
    the result IS checked by forcing __must_check (which appears to not be
    able to be used directly on macro expressions).

    Link: http://lkml.kernel.org/r/20170206204547.GA125312@beast
    Signed-off-by: Kees Cook
    Suggested-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
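
The pattern described above, where the result of a macro is forced through a must-check inline function, might look roughly like this in userspace. All names (MUST_CHECK, CHECK_CORRUPTION, list_insert_valid) are hypothetical, the reporting is reduced to fprintf, and the macro relies on the GNU statement-expression extension; it is a simplified sketch, not the kernel's definition.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's __must_check annotation. */
#define MUST_CHECK __attribute__((warn_unused_result))

/*
 * The attribute cannot be attached to a macro expression, so the result
 * is routed through a trivial inline function that carries it.
 */
static inline MUST_CHECK bool check_corruption_result(bool v)
{
	return v;
}

/*
 * Report the problem, then hand the verdict back to the caller instead
 * of returning on the caller's behalf (GNU statement expression).
 */
#define CHECK_CORRUPTION(cond, msg)					\
	check_corruption_result(({					\
		bool corrupt_ = (cond);					\
		if (corrupt_)						\
			fprintf(stderr, "corruption: %s\n", msg);	\
		corrupt_;						\
	}))

/* The caller decides what the protective action is: refuse the insert. */
static bool list_insert_valid(const void *prev, const void *next)
{
	if (CHECK_CORRUPTION(prev == NULL, "prev is NULL") ||
	    CHECK_CORRUPTION(next == NULL, "next is NULL"))
		return false;
	return true;
}
```

Writing CHECK_CORRUPTION(...) as a bare statement would now draw an unused-result warning, which is the point of the wrapper.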
     
  • There is a header which provides macros for various gcc-specific
    constructs, e.g. __weak for __attribute__((weak)). I've replaced all
    instances of gcc-specific attributes with the right macros for all
    files under /arch/m68k.

    Link: http://lkml.kernel.org/r/1485540901-1988-3-git-send-email-gidisrael@gmail.com
    Signed-off-by: Gideon Israel Dsouza
    Cc: Greg Ungerer
    Cc: Geert Uytterhoeven
    Cc: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gideon Israel Dsouza
     
  • Add __mode(x) into compiler-gcc.h as part of a cleanup task I've taken
    up, to replace gcc specific attributes with macros.

    The next patch is a cleanup of the m68k subsystem and it requires a new
    macro to wrap __attribute__ ((mode (...)))

    Link: http://lkml.kernel.org/r/1485540901-1988-2-git-send-email-gidisrael@gmail.com
    Signed-off-by: Gideon Israel Dsouza
    Cc: Greg Ungerer
    Cc: Geert Uytterhoeven
    Cc: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gideon Israel Dsouza
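
A sketch of what a wrapper of the shape described above does. The macro body follows the commit text, __mode(x) wrapping __attribute__((mode(...))); the typedef names are hypothetical and the sizes assume a GCC-compatible compiler on a target with 8-bit bytes.

```c
/* Wrapper with the shape the commit describes. */
#define __mode(x) __attribute__((mode(x)))

/*
 * GCC machine modes select the integer width:
 * QI = 1 byte, HI = 2 bytes, SI = 4 bytes, DI = 8 bytes.
 */
typedef int qi_int __mode(QI);
typedef int hi_int __mode(HI);
typedef int si_int __mode(SI);
typedef int di_int __mode(DI);
```

The macro only hides the attribute spelling; the resulting types are identical to those declared with the raw __attribute__ syntax.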
     
  • The timer APIs this header needs are ktime_get(), ktime_add_us(), and
    ktime_compare(). So, including only the header that declares them
    seems enough. This commit will cut unnecessary header file parsing.

    Link: http://lkml.kernel.org/r/1481679225-10885-1-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     
  • Commit 63159f5dcccb ("uapi: Use __kernel_long_t in struct mq_attr")
    changed the types from long to __kernel_long_t, but didn't add a
    linux/types.h include. Code that tries to include this header directly
    breaks:

    /usr/include/linux/mqueue.h:26:2: error: unknown type name '__kernel_long_t'
    __kernel_long_t mq_flags; /* message queue flags */

    This also upsets configure tests for this header:

    checking linux/mqueue.h usability... no
    checking linux/mqueue.h presence... yes
    configure: WARNING: linux/mqueue.h: present but cannot be compiled
    configure: WARNING: linux/mqueue.h: check for missing prerequisite headers?
    configure: WARNING: linux/mqueue.h: see the Autoconf documentation
    configure: WARNING: linux/mqueue.h: section "Present But Cannot Be Compiled"
    configure: WARNING: linux/mqueue.h: proceeding with the compiler's result
    checking for linux/mqueue.h... no

    Link: http://lkml.kernel.org/r/20170119194644.4403-1-vapier@gentoo.org
    Signed-off-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     
  • Previously, the hidepid parameter was checked by comparing literal
    integers 0, 1, 2. Let's add a proper enum for this, to make the
    checking more expressive:

    0 → HIDEPID_OFF
    1 → HIDEPID_NO_ACCESS
    2 → HIDEPID_INVISIBLE

    This changes the internal labelling only; the userspace-facing interface
    remains unmodified, and still works with literal integers 0, 1, 2.

    No functional changes.

    Link: http://lkml.kernel.org/r/1484572984-13388-2-git-send-email-djalal@gmail.com
    Signed-off-by: Lafcadio Wluiki
    Signed-off-by: Djalal Harouni
    Acked-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lafcadio Wluiki
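
A minimal sketch of how the enum above makes such checks read. The enumerator names and values come from the commit message; the two helper functions are hypothetical and heavily simplified (they ignore the gid exemption and any privilege checks).

```c
#include <stdbool.h>

enum {
	HIDEPID_OFF       = 0,
	HIDEPID_NO_ACCESS = 1,
	HIDEPID_INVISIBLE = 2,
};

/* Simplified: may an unprivileged user read another user's pid entry? */
static bool foreign_pid_readable(int hide_pid)
{
	return hide_pid < HIDEPID_NO_ACCESS;
}

/* Simplified: does another user's pid entry show up in listings? */
static bool foreign_pid_listed(int hide_pid)
{
	return hide_pid < HIDEPID_INVISIBLE;
}
```

Compared with "hide_pid < 2", the named constant states which level of hiding the check is about.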
     
  • After staring at this code for a while I've figured that using a small
    2-entry array describing ARGV and ENVP is the way to address the code
    duplication critique.

    Link: http://lkml.kernel.org/r/20170105185724.GA12027@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • To make the code clearer, use rb_entry() instead of container_of() to
    deal with rbtree.

    Link: http://lkml.kernel.org/r/4fd1f82818665705ce75c5156a060ae7caa8e0a9.1482160150.git.geliangtang@gmail.com
    Signed-off-by: Geliang Tang
    Cc: Jan Kara
    Cc: Al Viro
    Cc: "David S. Miller"
    Cc: Juergen Gross
    Cc: Dmitry Torokhov
    Cc: Seth Forshee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geliang Tang
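
For readers unfamiliar with the two macros above, this userspace sketch shows that rb_entry() is a thin alias of container_of() that states the intent. The struct rb_node here is a toy stand-in, and struct extent and extent_start are hypothetical.

```c
#include <stddef.h>

/* Toy stand-in for the kernel's rbtree node. */
struct rb_node {
	struct rb_node *left, *right;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same operation, but the name says the pointer is an rbtree node. */
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

/* Hypothetical item stored in a tree. */
struct extent {
	unsigned long start;
	struct rb_node node;
};

static unsigned long extent_start(struct rb_node *n)
{
	return rb_entry(n, struct extent, node)->start;
}
```

Since both macros expand to the same expression, the change is purely about readability.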
     
  • Given that the arch does not add its own implementations, simply use the
    asm-generic/current.h (generic-y) header instead of duplicating code.

    Link: http://lkml.kernel.org/r/1485992878-4780-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Davidlohr Bueso
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     
  • The build of frv defconfig gives a warning:

    arch/frv/mb93090-mb00/pci-frv.c:176:5: warning: ignoring return value of 'pci_assign_resource', declared with attribute warn_unused_result

    Just print an error message to silence the warning. We cannot do much
    here on error.

    Link: http://lkml.kernel.org/r/1484256471-5379-1-git-send-email-sudipm.mukherjee@gmail.com
    Signed-off-by: Sudip Mukherjee
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sudip Mukherjee
     
  • Make a kasan test which uses a SLAB_ACCOUNT slab cache. If the test is
    run within a non default memcg, then it uncovers the bug fixed by
    "kasan: drain quarantine of memcg slab objects"[1].

    If run without fix [1] it shows "Slab cache still has objects", and the
    kmem_cache structure is leaked.
    Here's an unpatched kernel test:

    $ dmesg -c > /dev/null
    $ mkdir /sys/fs/cgroup/memory/test
    $ echo $$ > /sys/fs/cgroup/memory/test/tasks
    $ modprobe test_kasan 2> /dev/null
    $ dmesg | grep -B1 still
    [ 123.456789] kasan test: memcg_accounted_kmem_cache allocate memcg accounted object
    [ 124.456789] kmem_cache_destroy test_cache: Slab cache still has objects

    Kernels with fix [1] don't have the "Slab cache still has objects"
    warning or the underlying leak.

    The new test runs and passes in the default (root) memcg, though in the
    root memcg it won't uncover the problem fixed by [1].

    Link: http://lkml.kernel.org/r/1482257462-36948-2-git-send-email-gthelen@google.com
    Signed-off-by: Greg Thelen
    Reviewed-by: Vladimir Davydov
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Thelen
     
  • Per memcg slab accounting and kasan have a problem with kmem_cache
    destruction.
    - kmem_cache_create() allocates a kmem_cache, which is used for
    allocations from processes running in root (top) memcg.
    - Processes running in non root memcg and allocating with either
    __GFP_ACCOUNT or from a SLAB_ACCOUNT cache use a per memcg
    kmem_cache.
    - Kasan catches use-after-free by having kfree() and kmem_cache_free()
    defer freeing of objects. Objects are placed in a quarantine.
    - kmem_cache_destroy() destroys root and non root kmem_caches. It takes
    care to drain the quarantine of objects from the root memcg's
    kmem_cache, but ignores objects associated with non root memcg. This
    causes leaks because quarantined per memcg objects refer to per memcg
    kmem cache being destroyed.

    To see the problem:

    1) create a slab cache with kmem_cache_create(,,,SLAB_ACCOUNT,)
    2) from non root memcg, allocate and free a few objects from cache
    3) dispose of the cache with kmem_cache_destroy()

    kmem_cache_destroy() will trigger a "Slab cache still has objects"
    warning indicating that the per memcg kmem_cache structure was leaked.

    Fix the leak by draining kasan quarantined objects allocated from non
    root memcg.

    Racing memcg deletion is tricky, but handled. kmem_cache_destroy() =>
    shutdown_memcg_caches() => __shutdown_memcg_cache() => shutdown_cache()
    flushes per memcg quarantined objects, even if that memcg has been
    rmdir'd and gone through memcg_deactivate_kmem_caches().

    This leak only affects destroyed SLAB_ACCOUNT kmem caches when kasan is
    enabled. So I don't think it's worth patching stable kernels.

    Link: http://lkml.kernel.org/r/1482257462-36948-1-git-send-email-gthelen@google.com
    Signed-off-by: Greg Thelen
    Reviewed-by: Vladimir Davydov
    Acked-by: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Thelen
     
  • Commit 31bc3858ea3e ("add automatic onlining policy for the newly added
    memory") provides the capability to have added memory automatically
    onlined during add, but this appears to be slightly broken.

    The current implementation uses walk_memory_range() to call
    online_memory_block, which uses memory_block_change_state() to online
    the memory. Instead, we should be calling device_online() for the
    memory block in online_memory_block(). This would online the memory
    (the memory bus online routine memory_subsys_online() called from
    device_online calls memory_block_change_state()) and properly update the
    device struct offline flag.

    As a result of the current implementation, attempting to remove a memory
    block after adding it using auto online fails. This is because doing a
    remove, for instance

    echo offline > /sys/devices/system/memory/memoryXXX/state

    uses device_offline() which checks the dev->offline flag.

    Link: http://lkml.kernel.org/r/20170222220744.8119.19687.stgit@ltcalpine2-lp14.aus.stglabs.ibm.com
    Signed-off-by: Nathan Fontenot
    Cc: Michael Ellerman
    Cc: Michael Roth
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nathan Fontenot
     
  • With rw_page, page_endio is used for completing IO on a page and it
    propagates write errors to the address space if the IO fails. The
    problem is that it accesses page->mapping directly, which might be okay
    for file-backed pages but is not for anonymous pages. Otherwise, it
    can corrupt a field of anon_vma under us and the system panics
    randomly.

    swap_writepage
      bdev_writepage
        ops->rw_page

    I encountered the BUG while developing a new zram feature and it was
    really hard to figure out because it caused random crashes, sometimes
    in mmap_sem lockdep, sometimes in other places never related to
    zram/zsmalloc, and it was not reproducible with some configurations.

    Considering how subtle the bug is and that people do fast-swap tests
    with brd, I think it is worth adding a stable mark.

    Fixes: dd6bd0d9c7db ("swap: use bdev_read_page() / bdev_write_page()")
    Signed-off-by: Minchan Kim
    Acked-by: Michal Hocko
    Cc: Matthew Wilcox
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • We are using the wrong flag value in the task_numa_fault function. This
    can result in a wrong numa fault statistics update, because we update
    num_pages_migrate, numa_fault_locality, etc. based on the flag argument
    passed.

    Fixes: bae473a423 ("mm: introduce fault_env")
    Link: http://lkml.kernel.org/r/1487498395-9544-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
    Signed-off-by: Aneesh Kumar K.V
    Acked-by: Hillf Danton
    Acked-by: Kirill A. Shutemov
    Cc: Rik van Riel
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     
  • Do the prot_none/FOLL_NUMA check after we are sure this is a THP pte.
    Archs can implement prot_none such that it can return true for regular
    pmd entries.

    Link: http://lkml.kernel.org/r/1487498326-8734-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
    Signed-off-by: Aneesh Kumar K.V
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Hillf Danton
    Cc: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     
  • Clean up the rest of the dma_addr_t and phys_addr_t type casting in mm:
    - use %pad for dma_addr_t
    - use %pa for phys_addr_t

    Link: http://lkml.kernel.org/r/1486618489-13912-1-git-send-email-miles.chen@mediatek.com
    Signed-off-by: Miles Chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miles Chen
     
  • The class index and fullness group are not encoded in
    (first)page->mapping any more, after commit 3783689a1aa8 ("zsmalloc:
    introduce zspage structure"). Instead, they are stored in struct zspage.

    Just delete this unneeded comment.

    Link: http://lkml.kernel.org/r/1486620822-36826-1-git-send-email-xieyisheng1@huawei.com
    Signed-off-by: Yisheng Xie
    Suggested-by: Sergey Senozhatsky
    Reviewed-by: Sergey Senozhatsky
    Acked-by: Minchan Kim
    Cc: Nitin Gupta
    Cc: Hanjun Guo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yisheng Xie
     
  • arch_zone_lowest/highest_possible_pfn[] is set to 0 and [ZONE_MOVABLE]
    is skipped in the loop. No need to reset them to 0 again.

    This patch just removes the redundant code.

    Link: http://lkml.kernel.org/r/20170209141731.60208-1-richard.weiyang@gmail.com
    Signed-off-by: Wei Yang
    Cc: Anshuman Khandual
    Cc: Mel Gorman
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • We had used page->lru to link the component pages (except the first
    page) of a zspage, and used INIT_LIST_HEAD(&page->lru) to init it.
    Therefore, to get the last page's next page, which is NULL, we had to
    use page flag PG_Private_2 to identify it.

    But now, we use page->freelist to link all of the pages in zspage and
    init the page->freelist as NULL for last page, so no need to use
    PG_Private_2 anymore.

    This removes the redundant SetPagePrivate2 in create_page_chain and
    ClearPagePrivate2 in reset_page(), saving a few cycles for migration of
    zsmalloc pages :)

    Link: http://lkml.kernel.org/r/1487076509-49270-1-git-send-email-xieyisheng1@huawei.com
    Signed-off-by: Yisheng Xie
    Reviewed-by: Sergey Senozhatsky
    Acked-by: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yisheng Xie
     
  • At the end of a window period, if the number of reclaimed pages is
    greater than the number scanned, an unsigned underflow can result in a
    huge pressure value and thus a critical event. Reclaimed pages were
    found to go higher than scanned because reclaimed slab pages are added
    to reclaimed in shrink_node without a corresponding increment to
    scanned pages.

    Minchan Kim mentioned that this can also happen in the case of a THP
    page where the scanned is 1 and reclaimed could be 512.

    Link: http://lkml.kernel.org/r/1486641577-11685-1-git-send-email-vinmenon@codeaurora.org
    Signed-off-by: Vinayak Menon
    Acked-by: Minchan Kim
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Vlastimil Babka
    Cc: Rik van Riel
    Cc: Vladimir Davydov
    Cc: Anton Vorontsov
    Cc: Shiraz Hashim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vinayak Menon
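
The underflow described above can be reproduced in a few lines. The formula below is an assumption, loosely modelled on a scanned/reclaimed pressure calculation, and the function names are hypothetical; the point is only how the unsigned subtraction wraps and how an early check avoids it.

```c
/*
 * Pressure as a percentage (assumed formula). When reclaimed > scanned,
 * reclaimed * scale / scanned exceeds scale and the unsigned subtraction
 * wraps around to a huge value. The caller must pass scanned != 0.
 */
static unsigned long pressure_unguarded(unsigned long scanned,
					unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure = scale - (reclaimed * scale / scanned);

	return pressure * 100 / scale;
}

/* Guarded variant: treat reclaimed >= scanned as "no pressure". */
static unsigned long pressure_guarded(unsigned long scanned,
				      unsigned long reclaimed)
{
	if (reclaimed >= scanned)
		return 0;	/* also covers scanned == 0 */
	return pressure_unguarded(scanned, reclaimed);
}
```

With the THP case mentioned above (scanned = 1, reclaimed = 512), the unguarded version yields a value far above 100, while the guarded one reports 0.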
     
  • Remove the prototypes for shmem_mapping() and shmem_zero_setup() from
    linux/mm.h, since they are already provided in linux/shmem_fs.h. But
    shmem_fs.h must then provide the inline stub for shmem_mapping() when
    CONFIG_SHMEM is not set, and a few more C files now need to #include it.

    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1702081658250.1549@eggly.anvils
    Signed-off-by: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Michal Simek
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
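
The inline-stub arrangement mentioned above follows a common header pattern, sketched here in userspace. The shmem_mapping() signature is an assumption, struct address_space is left opaque, and count_shmem is a hypothetical caller.

```c
#include <stdbool.h>
#include <stddef.h>

struct address_space;	/* opaque here; only pointers are passed around */

#ifdef CONFIG_SHMEM
/* With the feature enabled, the real implementation is linked in. */
bool shmem_mapping(struct address_space *mapping);
#else
/* Stub: with shmem compiled out, no mapping can be a shmem mapping. */
static inline bool shmem_mapping(struct address_space *mapping)
{
	(void)mapping;
	return false;
}
#endif

/* Callers need no #ifdef of their own. */
static size_t count_shmem(struct address_space **maps, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (shmem_mapping(maps[i]))
			hits++;
	return hits;
}
```

Keeping the prototype and the stub in the same header is what lets every file that includes it compile in both configurations.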