08 Apr, 2020

13 commits

  • It's clearer to just put this inline.

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200317193201.9924-5-adobriyan@gmail.com
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • The process maps file was the only user of version (introduced back in
    2005). Now that it uses ppos instead, we can remove it.

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200317193201.9924-4-adobriyan@gmail.com
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • The ppos is a private cursor, just like m->version. Use the canonical
    cursor, not a special one.

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200317193201.9924-3-adobriyan@gmail.com
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Instead of setting m->version in the show method, set it in m_next(),
    where it should be. Also remove the fallback code for failing to find a
    vma, or version being zero.

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200317193201.9924-2-adobriyan@gmail.com
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Instead of calling vma_stop() from m_start() and m_next(), do its work
    in m_stop().

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200317193201.9924-1-adobriyan@gmail.com
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • top(1) reads all /proc/*/statm files, but kernel threads will always have
    zeros. Print those zeros directly without going through
    seq_put_decimal_ull().

    Reading /proc/2/statm (which is kthreadd) gets about 3% faster.

    My system has more kernel threads than normal processes after booting KDE.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200307154435.GA2788@avx2
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
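
The behaviour described above can be checked from userspace. The following is a minimal sketch in C, assuming the seven-field statm layout documented in proc(5); the helper name statm_all_zero is made up for illustration. On most systems /proc/2/statm belongs to kthreadd, though that may not hold inside a PID namespace.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if every field of a statm file is zero
 * (as expected for a kernel thread), 0 if any field is non-zero, and -1
 * if the file cannot be opened or does not contain seven numbers.
 */
int statm_all_zero(const char *path)
{
    unsigned long v[7];
    FILE *f = fopen(path, "r");
    int n, i;

    if (!f)
        return -1;
    /* statm fields: size resident shared text lib data dt */
    n = fscanf(f, "%lu %lu %lu %lu %lu %lu %lu",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6]);
    fclose(f);
    if (n != 7)
        return -1;

    for (i = 0; i < 7; i++)
        if (v[i] != 0)
            return 0;
    return 1;
}
```

Calling statm_all_zero("/proc/2/statm") would be expected to return 1 when PID 2 is a kernel thread, while the same call on /proc/self/statm returns 0 for any ordinary process.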
     
  • Now that "struct proc_ops" exists, we can start putting stuff there which
    could not fly with VFS "struct file_operations"...

    Most of the fs/proc/inode.c file is dedicated to making open/read/.../close
    reliable in the event of disappearing /proc entries, which usually happens
    when a module is being removed. Files like /proc/cpuinfo, which never
    disappear, simply do not need such protection.

    Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such
    "permanent" files.

    Enable "permanent" flag for

    /proc/cpuinfo
    /proc/kmsg
    /proc/modules
    /proc/slabinfo
    /proc/stat
    /proc/sysvipc/*
    /proc/swaps

    More will come once I figure out a foolproof way to prevent out-of-tree
    module authors from marking their stuff "permanent" for performance reasons
    when it is not.

    This should help with scalability: benchmark is "read /proc/cpuinfo R times
    by N threads scattered over the system".

    N       R       t, s (before)   t, s (after)    change
    ------------------------------------------------------
    64      4096    1.582458        1.530502        -3.2%
    256     4096    6.371926        6.125168        -3.9%
    1024    4096    25.64888        24.47528        -4.6%

    Benchmark source:

    #include <chrono>
    #include <iostream>
    #include <thread>
    #include <vector>

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    const int NR_CPUS = sysconf(_SC_NPROCESSORS_ONLN);
    int N;
    const char *filename;
    int R;

    int xxx = 0;

    // Pin the calling thread to CPU n.
    int glue(int n)
    {
        cpu_set_t m;
        CPU_ZERO(&m);
        CPU_SET(n, &m);
        return sched_setaffinity(0, sizeof(cpu_set_t), &m);
    }

    void f(int n)
    {
        glue(n % NR_CPUS);

        // Spin until main() releases all threads at once.
        while (*(volatile int *)&xxx == 0) {
        }

        for (int i = 0; i < R; i++) {
            int fd = open(filename, O_RDONLY);
            char buf[4096];
            ssize_t rv = read(fd, buf, sizeof(buf));
            // Keep the compiler from discarding the read() result.
            asm volatile ("" :: "g" (rv));
            close(fd);
        }
    }

    int main(int argc, char *argv[])
    {
        if (argc < 4) {
            std::cerr << "usage: " << argv[0] << ' ' << "N /proc/filename R\n";
            return 1;
        }

        N = atoi(argv[1]);
        filename = argv[2];
        R = atoi(argv[3]);

        // Pin the main thread to the first CPU that accepts it.
        for (int i = 0; i < NR_CPUS; i++) {
            if (glue(i) == 0)
                break;
        }

        std::vector<std::thread> T;
        T.reserve(N);
        for (int i = 0; i < N; i++) {
            T.emplace_back(f, i);
        }

        auto t0 = std::chrono::system_clock::now();
        {
            *(volatile int *)&xxx = 1;
            for (auto& t: T) {
                t.join();
            }
        }
        auto t1 = std::chrono::system_clock::now();
        std::chrono::duration<double> dt = t1 - t0;
        std::cout << dt.count() << '\n';

        return 0;
    }

    P.S.:
    An explicit randomization marker is added because adding a member that is
    not a function pointer will silently disable structure layout randomization.

    [akpm@linux-foundation.org: coding style fixes]
    Reported-by: kbuild test robot
    Reported-by: Dan Carpenter
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Cc: Al Viro
    Cc: Joe Perches
    Link: http://lkml.kernel.org/r/20200222201539.GA22576@avx2
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Fix sparse locking imbalance warning:

    warning: context imbalance in close_pdeo() - unexpected unlock

    Signed-off-by: Jules Irenge
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200227201538.GA30462@avx2
    Signed-off-by: Linus Torvalds

    Jules Irenge
     
  • Only declare _UFFDIO_WRITEPROTECT if the user specified
    UFFDIO_REGISTER_MODE_WP and if all the checks passed. Then when the user
    registers regions with shmem/hugetlbfs we won't expose the new ioctl to
    them. Even with a completely anonymous memory range, we'll only expose the
    new WP ioctl bit if the register mode has MODE_WP.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Reviewed-by: Mike Rapoport
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Brian Geffon
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Pavel Emelyanov
    Cc: Rik van Riel
    Cc: Shaohua Li
    Link: http://lkml.kernel.org/r/20200220163112.11409-18-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • It does not make sense to try to wake up any waiting thread when we're
    write-protecting a memory region. Only wake up when resolving a write
    protected page fault.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Reviewed-by: Mike Rapoport
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Brian Geffon
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Pavel Emelyanov
    Cc: Rik van Riel
    Cc: Shaohua Li
    Link: http://lkml.kernel.org/r/20200220163112.11409-16-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Introduce the new uffd-wp APIs for userspace.

    Firstly, we'll allow UFFDIO_REGISTER with write protection tracking
    using the new UFFDIO_REGISTER_MODE_WP flag. Note that this flag can
    co-exist with the existing UFFDIO_REGISTER_MODE_MISSING, in which case the
    userspace program can not only resolve missing page faults but also
    track page data changes along the way.

    Secondly, we introduce the new UFFDIO_WRITEPROTECT API to do page level
    write protection tracking. Note that we will need to register the memory
    region with UFFDIO_REGISTER_MODE_WP before that.

    [peterx@redhat.com: write up the commit message]
    [peterx@redhat.com: remove useless block, write commit message, check against
    VM_MAYWRITE rather than VM_WRITE when register]
    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Reviewed-by: Jerome Glisse
    Cc: Bobby Powers
    Cc: Brian Geffon
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Cc: Rik van Riel
    Cc: Shaohua Li
    Link: http://lkml.kernel.org/r/20200220163112.11409-14-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
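
The two-step flow described in the uffd-wp commits above (register with UFFDIO_REGISTER_MODE_WP, then issue UFFDIO_WRITEPROTECT) might look roughly like this from userspace. This is a sketch under assumptions, not the authors' test code: it needs the linux/userfaultfd.h header from 5.7 or later, the helper name uffd_wp_region is hypothetical, and it registers with MODE_WP only so that no missing-fault handler is required.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: write-protect the page-aligned range
 * [addr, addr + len) through userfaultfd.
 * Returns the userfaultfd descriptor on success, -1 on any failure
 * (syscall unavailable, feature unsupported, range not eligible, ...).
 */
int uffd_wp_region(void *addr, size_t len)
{
    struct uffdio_api api = {
        .api = UFFD_API,
        .features = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
    };
    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)addr, .len = len },
        .mode = UFFDIO_REGISTER_MODE_WP,
    };
    struct uffdio_writeprotect wp = {
        .range = { .start = (uintptr_t)addr, .len = len },
        .mode = UFFDIO_WRITEPROTECT_MODE_WP,
    };
    int uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);

    if (uffd < 0)
        return -1;

    if (ioctl(uffd, UFFDIO_API, &api) < 0 ||
        ioctl(uffd, UFFDIO_REGISTER, &reg) < 0 ||
        /* The kernel advertises the WP ioctl only for eligible ranges. */
        !(reg.ioctls & (1ULL << _UFFDIO_WRITEPROTECT)) ||
        ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0) {
        close(uffd);
        return -1;
    }
    return uffd;
}
```

After a successful call, a write to the range raises a userfaultfd event on the returned descriptor, and the writer stays blocked until another thread resolves the fault by issuing UFFDIO_WRITEPROTECT again with the WP mode bit cleared.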
     
  • This allows UFFDIO_COPY to map pages write-protected.

    [peterx@redhat.com: switch to VM_WARN_ON_ONCE in mfill_atomic_pte; add brackets
    around "dst_vma->vm_flags & VM_WRITE"; fix wordings in comments and
    commit messages]
    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Reviewed-by: Jerome Glisse
    Reviewed-by: Mike Rapoport
    Cc: Bobby Powers
    Cc: Brian Geffon
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Pavel Emelyanov
    Cc: Rik van Riel
    Cc: Shaohua Li
    Link: http://lkml.kernel.org/r/20200220163112.11409-6-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • This replaces all remaining open encodings with is_vm_hugetlb_page().

    Signed-off-by: Anshuman Khandual
    Signed-off-by: Andrew Morton
    Acked-by: Vlastimil Babka
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Michael Ellerman
    Cc: Alexander Viro
    Cc: Will Deacon
    Cc: "Aneesh Kumar K.V"
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Arnd Bergmann
    Cc: Ingo Molnar
    Cc: Arnaldo Carvalho de Melo
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Geert Uytterhoeven
    Cc: Guo Ren
    Cc: Mel Gorman
    Cc: Paul Burton
    Cc: Paul Mackerras
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Yoshinori Sato
    Link: http://lkml.kernel.org/r/1582520593-30704-4-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     

06 Apr, 2020

8 commits

  • Pull fsnotify updates from Jan Kara:
    "This implements the fanotify FAN_DIR_MODIFY event.

    This event reports the name in a directory under which a change
    happened and together with the directory filehandle and fstatat()
    allows reliable and efficient implementation of directory
    synchronization"

    * tag 'fsnotify_for_v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    fanotify: Fix the checks in fanotify_fsid_equal
    fanotify: report name info for FAN_DIR_MODIFY event
    fanotify: record name info for FAN_DIR_MODIFY event
    fanotify: Drop fanotify_event_has_fid()
    fanotify: prepare to report both parent and child fid's
    fanotify: send FAN_DIR_MODIFY event flavor with dir inode and name
    fanotify: divorce fanotify_path_event and fanotify_fid_event
    fanotify: Store fanotify handles differently
    fanotify: Simplify create_fd()
    fanotify: fix merging marks masks with FAN_ONDIR
    fanotify: merge duplicate events on parent and child
    fsnotify: replace inode pointer with an object id
    fsnotify: simplify arguments passing to fsnotify_parent()
    fsnotify: use helpers to access data by data_type
    fsnotify: funnel all dirent events through fsnotify_name()
    fsnotify: factor helpers fsnotify_dentry() and fsnotify_file()
    fsnotify: tidy up FS_ and FAN_ constants

    Linus Torvalds
     
  • Pull ext2/udf updates from Jan Kara:
    "Cleanups and fixes for ext2 and one cleanup for udf"

    * tag 'for_v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2: fix empty body warnings when -Wextra is used
    ext2: fix debug reference to ext2_xattr_cache
    udf: udf_sb.h: Replace zero-length array with flexible-array member
    ext2: xattr.h: Replace zero-length array with flexible-array member
    ext2: Silence lockdep warning about reclaim under xattr_sem

    Linus Torvalds
     
  • Pull 9p updates from Dominique Martinet:
    "Not much new, but a few patches for this cycle:

    - Fix read with O_NONBLOCK to allow incomplete read and return
    immediately

    - Rest is just cleanup (indent, unused field in struct, extra
    semicolon)"

    * tag '9p-for-5.7' of git://github.com/martinetd/linux:
    net/9p: remove unused p9_req_t aux field
    9p: read only once on O_NONBLOCK
    9pnet: allow making incomplete read requests
    9p: Remove unneeded semicolon
    9p: Fix Kconfig indentation

    Linus Torvalds
     
  • Pull vfs pathwalk fix from Al Viro:
    "Dumb braino in legitimize_path()..."

    * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fix a braino in legitimize_path()

    Linus Torvalds
     
  • brown paperbag time... wrong order of arguments ended up confusing
    the values to check dentry and mount_lock seqcounts against.

    Reported-by: kernel test robot
    Fixes: 2aa38470853a ("non-RCU analogue of the previous commit")
    Tested-by: kernel test robot
    Signed-off-by: Al Viro

    Al Viro
     
  • Commit 9255782f7061 ("sysfs: Wrap __compat_only_sysfs_link_entry_to_kobj
    function to change the symlink name") made this function a wrapper
    around a new non-underscored function, which is a bit odd. The normal
    naming convention is the other way around: the underscored function is
    the wrappee, and the non-underscored function is the wrapper.

    There's only one single user (well, two call-sites in that user) of the
    more limited double underscore version of this function, so just remove
    the oddly named wrapper entirely and just add the extra NULL argument to
    the user.

    I considered just doing that in the merge, but that tends to make
    history really hard to read.

    Link: https://lore.kernel.org/lkml/CAHk-=wgkkmNV5tMzQDmPAQuNJBuMcry--Jb+h8H1o4RA3kF7QQ@mail.gmail.com/
    Cc: Sourabh Jain
    Cc: Michael Ellerman
    Signed-off-by: Linus Torvalds

    Linus Torvalds
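
The naming convention described in the commit above can be illustrated in plain C. This is only a sketch with made-up names (make_link and __make_link are not kernel functions), and double-underscore identifiers are a kernel habit that userspace code would normally avoid because such names are reserved.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Wrappee: the underscored function takes every parameter.
 * A NULL name means "reuse the target name for the link".
 */
int __make_link(char *buf, size_t len, const char *target, const char *name)
{
    return snprintf(buf, len, "%s -> %s", name ? name : target, target);
}

/* Wrapper: the plain name covers the common case by passing NULL. */
int make_link(char *buf, size_t len, const char *target)
{
    return __make_link(buf, len, target, NULL);
}
```

With this layout a caller that needs the extra argument calls the underscored function directly, which matches what the commit does when it adds the NULL argument at the call sites.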
     
  • Pull powerpc updates from Michael Ellerman:
    "Slightly late as I had to rebase mid-week to insert a bug fix:

    - A large series from Nick for 64-bit to further rework our exception
    vectors, and rewrite portions of the syscall entry/exit and
    interrupt return in C. The result is much easier to follow code
    that is also faster in general.

    - Cleanup of our ptrace code to split various parts out that had
    become badly intertwined with #ifdefs over the years.

    - Changes to our NUMA setup under the PowerVM hypervisor which should
    hopefully avoid non-sensical topologies which can lead to warnings
    from the workqueue code and other problems.

    - MAINTAINERS updates to remove some of our old orphan entries and
    update the status of others.

    - Quite a few other small changes and fixes all over the map.

    Thanks to: Abdul Haleem, afzal mohammed, Alexey Kardashevskiy, Andrew
    Donnellan, Aneesh Kumar K.V, Balamuruhan S, Cédric Le Goater, Chen
    Zhou, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Clement
    Courbet, Daniel Axtens, David Gibson, Douglas Miller, Fabiano Rosas,
    Fangrui Song, Ganesh Goudar, Gautham R. Shenoy, Greg Kroah-Hartman,
    Greg Kurz, Gustavo Luiz Duarte, Hari Bathini, Ilie Halip, Jan Kara,
    Joe Lawrence, Joe Perches, Kajol Jain, Larry Finger, Laurentiu Tudor,
    Leonardo Bras, Libor Pechacek, Madhavan Srinivasan, Mahesh Salgaonkar,
    Masahiro Yamada, Masami Hiramatsu, Mauricio Faria de Oliveira, Michael
    Neuling, Michal Suchanek, Mike Rapoport, Nageswara R Sastry, Nathan
    Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick
    Desaulniers, Oliver O'Halloran, Po-Hsu Lin, Pratik Rajesh Sampat,
    Rasmus Villemoes, Ravi Bangoria, Roman Bolshakov, Sam Bobroff,
    Sandipan Das, Santosh S, Sedat Dilek, Segher Boessenkool, Shilpasri G
    Bhat, Sourabh Jain, Srikar Dronamraju, Stephen Rothwell, Tyrel
    Datwyler, Vaibhav Jain, YueHaibing"

    * tag 'powerpc-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits)
    powerpc: Make setjmp/longjmp signature standard
    powerpc/cputable: Remove unnecessary copy of cpu_spec->oprofile_type
    powerpc: Suppress .eh_frame generation
    powerpc: Drop -fno-dwarf2-cfi-asm
    powerpc/32: drop unused ISA_DMA_THRESHOLD
    powerpc/powernv: Add documentation for the opal sensor_groups sysfs interfaces
    selftests/powerpc: Fix try-run when source tree is not writable
    powerpc/vmlinux.lds: Explicitly retain .gnu.hash
    powerpc/ptrace: move ptrace_triggered() into hw_breakpoint.c
    powerpc/ptrace: create ppc_gethwdinfo()
    powerpc/ptrace: create ptrace_get_debugreg()
    powerpc/ptrace: split out ADV_DEBUG_REGS related functions.
    powerpc/ptrace: move register viewing functions out of ptrace.c
    powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.
    powerpc/ptrace: split out SPE related functions.
    powerpc/ptrace: split out ALTIVEC related functions.
    powerpc/ptrace: split out VSX related functions.
    powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET
    powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64
    powerpc/ptrace: remove unused header includes
    ...

    Linus Torvalds
     
  • Pull ext4 updates from Ted Ts'o:

    - Replace ext4's bmap and iopoll implementations to use iomap.

    - Clean up extent tree handling.

    - Other cleanups and miscellaneous bug fixes

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits)
    ext4: save all error info in save_error_info() and drop ext4_set_errno()
    ext4: fix incorrect group count in ext4_fill_super error message
    ext4: fix incorrect inodes per group in error message
    ext4: don't set dioread_nolock by default for blocksize < pagesize
    ext4: disable dioread_nolock whenever delayed allocation is disabled
    ext4: do not commit super on read-only bdev
    ext4: avoid ENOSPC when avoiding to reuse recently deleted inodes
    ext4: unregister sysfs path before destroying jbd2 journal
    ext4: check for non-zero journal inum in ext4_calculate_overhead
    ext4: remove map_from_cluster from ext4_ext_map_blocks
    ext4: clean up ext4_ext_insert_extent() call in ext4_ext_map_blocks()
    ext4: mark block bitmap corrupted when found instead of BUGON
    ext4: use flexible-array member for xattr structs
    ext4: use flexible-array member in struct fname
    Documentation: correct the description of FIEMAP_EXTENT_LAST
    ext4: move ext4_fiemap to use iomap framework
    ext4: make ext4_ind_map_blocks work with fiemap
    ext4: move ext4 bmap to use iomap infrastructure
    ext4: optimize ext4_ext_precache for 0 depth
    ext4: add IOMAP_F_MERGED for non-extent based mapping
    ...

    Linus Torvalds
     

05 Apr, 2020

2 commits

  • Pull exfat filesystem from Al Viro:
    "Shiny new fs/exfat replacement for drivers/staging/exfat"

    * 'work.exfat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    exfat: update file system parameter handling
    staging: exfat: make staging/exfat and fs/exfat mutually exclusive
    MAINTAINERS: add exfat filesystem
    exfat: add Kconfig and Makefile
    exfat: add nls operations
    exfat: add misc operations
    exfat: add exfat cache
    exfat: add bitmap operations
    exfat: add fat entry operations
    exfat: add file operations
    exfat: add directory operations
    exfat: add inode operations
    exfat: add super block operations
    exfat: add in-memory and on-disk structures and headers

    Linus Torvalds
     
  • Pull nfsd updates from Chuck Lever:

    - Fix EXCHANGE_ID response when NFSD runs in a container

    - A battery of new static trace points

    - Socket transports now use bio_vec to send Replies

    - NFS/RDMA now supports filesystems with no .splice_read method

    - Favor memcpy() over DMA mapping for small RPC/RDMA Replies

    - Add pre-requisites for supporting multiple Write chunks

    - Numerous minor fixes and clean-ups

    [ Chuck is filling in for Bruce this time while he and his family settle
    into a new house ]

    * tag 'nfsd-5.7' of git://git.linux-nfs.org/projects/cel/cel-2.6: (39 commits)
    svcrdma: Fix leak of transport addresses
    SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()'
    SUNRPC/cache: don't allow invalid entries to be flushed
    nfsd: fsnotify on rmdir under nfsd/clients/
    nfsd4: kill warnings on testing stateids with mismatched clientids
    nfsd: remove read permission bit for ctl sysctl
    NFSD: Fix NFS server build errors
    sunrpc: Add tracing for cache events
    SUNRPC/cache: Allow garbage collection of invalid cache entries
    nfsd: export upcalls must not return ESTALE when mountd is down
    nfsd: Add tracepoints for update of the expkey and export cache entries
    nfsd: Add tracepoints for exp_find_key() and exp_get_by_name()
    nfsd: Add tracing to nfsd_set_fh_dentry()
    nfsd: Don't add locks to closed or closing open stateids
    SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends
    SUNRPC: Refactor xs_sendpages()
    svcrdma: Avoid DMA mapping small RPC Replies
    svcrdma: Fix double sync of transport header buffer
    svcrdma: Refactor chunk list encoders
    SUNRPC: Add encoders for list item discriminators
    ...

    Linus Torvalds
     

04 Apr, 2020

2 commits

  • Pull SPDX updates from Greg KH:
    "Here are three SPDX patches for 5.7-rc1.

    One fixes up the SPDX tag for a single driver, while the other two go
    through the tree and add SPDX tags for all of the .gitignore files as
    needed.

    Nothing too complex, but you will get a merge conflict with your
    current tree, that should be trivial to handle (one file modified by
    two things, one file deleted.)

    All three of these have been in linux-next for a while, with no
    reported issues other than the merge conflict"

    * tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
    ASoC: MT6660: make spdxcheck.py happy
    .gitignore: add SPDX License Identifier
    .gitignore: remove too obvious comments

    Linus Torvalds
     
  • Pull cgroup updates from Tejun Heo:

    - Christian extended clone3 so that processes can be spawned into
    cgroups directly.

    This is not only neat in terms of semantics but also avoids grabbing
    the global cgroup_threadgroup_rwsem for migration.

    - Daniel added !root xattr support to cgroupfs.

    Userland already uses xattrs on cgroupfs for bookkeeping. This will
    allow delegated cgroups to support such usages.

    - Prateek tried to make cpuset hotplug handling synchronous but that
    led to possible deadlock scenarios. Reverted.

    - Other minor changes including release_agent_path handling cleanup.

    * 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    docs: cgroup-v1: Document the cpuset_v2_mode mount option
    Revert "cpuset: Make cpuset hotplug synchronous"
    cgroupfs: Support user xattrs
    kernfs: Add option to enable user xattrs
    kernfs: Add removed_size out param for simple_xattr_set
    kernfs: kvmalloc xattr value instead of kmalloc
    cgroup: Restructure release_agent_path handling
    selftests/cgroup: add tests for cloning into cgroups
    clone3: allow spawning processes into cgroups
    cgroup: add cgroup_may_write() helper
    cgroup: refactor fork helpers
    cgroup: add cgroup_get_from_file() helper
    cgroup: unify attach permission checking
    cpuset: Make cpuset hotplug synchronous
    cgroup.c: Use built-in RCU list checking
    kselftest/cgroup: add cgroup destruction test
    cgroup: Clean up css_set task traversal

    Linus Torvalds
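
Spawning a process directly into a cgroup, as the clone3 extension in this pull allows, could be sketched as follows. The sketch assumes 5.7 or later uapi headers (struct clone_args with a cgroup field, CLONE_INTO_CGROUP) and a cgroup v2 directory that the caller has already opened; the helper name spawn_into_cgroup is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/sched.h>   /* struct clone_args, CLONE_INTO_CGROUP */
#include <signal.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork-like clone3() whose child starts life in the
 * cgroup referred to by cgroup_fd (a descriptor for a cgroup v2 directory).
 * Returns the child's PID in the parent, 0 in the child, -1 on error.
 */
pid_t spawn_into_cgroup(int cgroup_fd)
{
    struct clone_args args = {
        .flags = CLONE_INTO_CGROUP,
        .exit_signal = SIGCHLD,
    };

    if (cgroup_fd < 0) {
        errno = EBADF;
        return -1;
    }
    args.cgroup = (uint64_t)cgroup_fd;

    /* glibc may lack a clone3() wrapper, so use the raw syscall. */
    return (pid_t)syscall(SYS_clone3, &args, sizeof(args));
}
```

A caller would typically open the target directory under the cgroup2 mount with O_RDONLY | O_DIRECTORY, call the helper, and have the child exec or _exit while the parent waits for it as after fork().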
     

03 Apr, 2020

15 commits

  • Merge updates from Andrew Morton:
    "A large amount of MM, plenty more to come.

    Subsystems affected by this patch series:
    - tools
    - kthread
    - kbuild
    - scripts
    - ocfs2
    - vfs
    - mm: slub, kmemleak, pagecache, gup, swap, memcg, pagemap, mremap,
    sparsemem, kasan, pagealloc, vmscan, compaction, mempolicy,
    hugetlbfs, hugetlb"

    * emailed patches from Andrew Morton: (155 commits)
    include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP
    mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS
    selftests/vm: fix map_hugetlb length used for testing read and write
    mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge()
    mm/hugetlb.c: clean code by removing unnecessary initialization
    hugetlb_cgroup: add hugetlb_cgroup reservation docs
    hugetlb_cgroup: add hugetlb_cgroup reservation tests
    hugetlb: support file_region coalescing again
    hugetlb_cgroup: support noreserve mappings
    hugetlb_cgroup: add accounting for shared mappings
    hugetlb: disable region_add file_region coalescing
    hugetlb_cgroup: add reservation accounting for private mappings
    mm/hugetlb_cgroup: fix hugetlb_cgroup migration
    hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations
    hugetlb_cgroup: add hugetlb_cgroup reservation counter
    hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race
    hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
    mm/memblock.c: remove redundant assignment to variable max_addr
    mm: mempolicy: require at least one nodeid for MPOL_PREFERRED
    mm: mempolicy: use VM_BUG_ON_VMA in queue_pages_test_walk()
    ...

    Linus Torvalds
     
  • Pull xfs updates from Darrick Wong:
    "There's a lot going on this cycle with cleanups in the log code, the
    btree code, and the xattr code.

    We're tightening metadata validation and online fsck checking, and
    introducing a common btree rebuilding library so that we can refactor
    xfs_repair and introduce online repair in a future cycle.

    We also fixed a few visible bugs -- most notably there's one in
    getdents that we introduced in 5.6; and a fix for hangs when disabling
    quotas.

    This series has been running fstests & other QA in the background for
    over a week and looks good so far.

    I anticipate sending a second pull request next week. That batch will
    change how xfs interacts with memory reclaim; how the log batches and
    throttles log items; how hard writes near ENOSPC will try to squeeze
    more space out of the filesystem; and hopefully fix the last of the
    umount hangs after a catastrophic failure. That should ease a lot of
    problems when running at the limits, but for now I'm leaving that in
    for-next for another week to make sure we got all the subtleties
    right.

    Summary:

    - Fix a hard to trigger race between iclog error checking and log
    shutdown.

    - Strengthen the AGF verifier.

    - Ratelimit some of the more spammy error messages.

    - Remove the icdinode uid/gid members and just use the ones in the
    vfs inode.

    - Hold ILOCK across insert/collapse range.

    - Clean up the extended attribute interfaces.

    - Clean up the attr flags mess.

    - Restore PF_MEMALLOC after exiting xfsaild thread to avoid
    triggering warnings in the process accounting code.

    - Remove the flexibly-sized array from struct xfs_agfl to eliminate
    compiler warnings about unaligned pointers and packed structures.

    - Various macro and typedef removals.

    - Stale metadata buffers if we decide they're corrupt outside of a
    verifier.

    - Check directory data/block/free block owners.

    - Fix a UAF when aborting inactivation of a corrupt xattr fork.

    - Teach online scrub to report failed directory and attr name lookups
    as a metadata corruption instead of a runtime error.

    - Avoid potential buffer overflows in sysfs files by using scnprintf.

    - Fix a regression in getdents lookups due to a mistake in pointer
    arithmetic.

    - Refactor btree cursor private data structures to use anonymous
    unions.

    - Cleanups in the log unmounting code.

    - Fix a potential mishandling of ENOMEM errors on multi-block
    directory buffer lookups.

    - Fix an incorrect test in the block allocation code.

    - Cleanups and name prefix shortening in the scrub code.

    - Introduce btree bulk loading code for online repair and scrub.

    - Fix a quotaoff log item leak (and hang) when the fs goes down
    midway through a quotaoff operation.

    - Remove di_version from the incore inode.

    - Refactor some of the log shutdown checking code.

    - Record the forcing of the log unmount records in the log force
    counters.

    - Fix a longstanding bug where quotacheck would purge the
    administrator's default quota grace interval and warning limits.

    - Reduce memory usage when scrubbing directory and xattr trees.

    - Don't let fsfreeze race with GETFSMAP or online scrub.

    - Handle bio_add_page failures more gracefully in xlog_write_iclog"

    * tag 'xfs-5.7-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (108 commits)
    xfs: prohibit fs freezing when using empty transactions
    xfs: shutdown on failure to add page to log bio
    xfs: directory bestfree check should release buffers
    xfs: drop all altpath buffers at the end of the sibling check
    xfs: preserve default grace interval during quotacheck
    xfs: remove xlog_state_want_sync
    xfs: move the ioerror check out of xlog_state_clean_iclog
    xfs: refactor xlog_state_clean_iclog
    xfs: remove the aborted parameter to xlog_state_done_syncing
    xfs: simplify log shutdown checking in xfs_log_release_iclog
    xfs: simplify the xfs_log_release_iclog calling convention
    xfs: factor out a xlog_wait_on_iclog helper
    xfs: merge xlog_cil_push into xlog_cil_push_work
    xfs: remove the di_version field from struct icdinode
    xfs: simplify a check in xfs_ioctl_setattr_check_cowextsize
    xfs: simplify di_flags2 inheritance in xfs_ialloc
    xfs: only check the superblock version for dinode size calculation
    xfs: add a new xfs_sb_version_has_v3inode helper
    xfs: fix unmount hang and memory leak on shutdown during quotaoff
    xfs: factor out quotaoff intent AIL removal and memory free
    ...

    Linus Torvalds
     
  • Pull hibernation fix from Darrick Wong:
    "Fix a regression where we broke the userspace hibernation driver by
    disallowing writes to the swap device"

    * tag 'vfs-5.7-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    hibernate: Allow uswsusp to write to swap

    Linus Torvalds
     
  • Pull iomap updates from Darrick Wong:
    "We're fixing tracepoints and comments in this cycle, so there
    shouldn't be any surprises here.

    I anticipate sending a second pull request next week with a single bug
    fix for readahead, but it's still undergoing QA.

    Summary:

    - Fix a broken tracepoint

    - Fix a broken comment"

    * tag 'iomap-5.7-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    iomap: fix comments in iomap_dio_rw
    iomap: Remove pgoff from tracepoints

    Linus Torvalds
     
  • Pull vfs pathwalk sanitizing from Al Viro:
    "Massive pathwalk rewrite and cleanups.

    Several iterations have been posted; hopefully this thing is getting
    readable and understandable now. Pretty much all parts of pathname
    resolution are affected...

    The branch is identical to what has sat in -next, except for commit
    message in "lift all calls of step_into() out of follow_dotdot/
    follow_dotdot_rcu", crediting Qian Cai for reporting the bug; only
    commit message changed there."

    * 'work.dotdot1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (69 commits)
    lookup_open(): don't bother with fallbacks to lookup+create
    atomic_open(): no need to pass struct open_flags anymore
    open_last_lookups(): move complete_walk() into do_open()
    open_last_lookups(): lift O_EXCL|O_CREAT handling into do_open()
    open_last_lookups(): don't abuse complete_walk() when all we want is unlazy
    open_last_lookups(): consolidate fsnotify_create() calls
    take post-lookup part of do_last() out of loop
    link_path_walk(): sample parent's i_uid and i_mode for the last component
    __nd_alloc_stack(): make it return bool
    reserve_stack(): switch to __nd_alloc_stack()
    pick_link(): take reserving space on stack into a new helper
    pick_link(): more straightforward handling of allocation failures
    fold path_to_nameidata() into its only remaining caller
    pick_link(): pass it struct path already with normal refcounting rules
    fs/namei.c: kill follow_mount()
    non-RCU analogue of the previous commit
    helper for mount rootwards traversal
    follow_dotdot(): be lazy about changing nd->path
    follow_dotdot_rcu(): be lazy about changing nd->path
    follow_dotdot{,_rcu}(): massage loops
    ...

    Linus Torvalds
     
  • Pull exec/proc updates from Eric Biederman:
    "This contains two significant pieces of work: the work to sort out
    proc_flush_task, and the work to solve a deadlock between strace and
    exec.

    Fixing proc_flush_task so that it no longer requires a persistent
    mount makes improvements to proc possible. The removal of the
    persistent mount solves an old regression that caused the hidepid
    mount option to only work on remount not on mount. The regression was
    found and reported by the Android folks. This further allows Alexey
    Gladkov's work making proc mount options specific to an individual
    mount of proc to move forward.

    The work on exec starts solving a long-standing issue with exec: it
    holds mutexes while blocking on userspace applications, which makes
    exec extremely deadlock prone. For the moment this adds a second mutex
    with a narrower scope that handles all of the easy cases, which makes
    the tricky cases easy to spot. With a little luck the code to solve
    those deadlocks will be ready by the next merge window"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (25 commits)
    signal: Extend exec_id to 64bits
    pidfd: Use new infrastructure to fix deadlocks in execve
    perf: Use new infrastructure to fix deadlocks in execve
    proc: io_accounting: Use new infrastructure to fix deadlocks in execve
    proc: Use new infrastructure to fix deadlocks in execve
    kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve
    kernel: doc: remove outdated comment cred.c
    mm: docs: Fix a comment in process_vm_rw_core
    selftests/ptrace: add test cases for dead-locks
    exec: Fix a deadlock in strace
    exec: Add exec_update_mutex to replace cred_guard_mutex
    exec: Move exec_mmap right after de_thread in flush_old_exec
    exec: Move cleanup of posix timers on exec out of de_thread
    exec: Factor unshare_sighand out of de_thread and call it separately
    exec: Only compute current once in flush_old_exec
    pid: Improve the comment about waiting in zap_pid_ns_processes
    proc: Remove the now unnecessary internal mount of proc
    uml: Create a private mount of proc for mconsole
    uml: Don't consult current to find the proc_mnt in mconsole_proc
    proc: Use a list of inodes to flush from proc
    ...

    Linus Torvalds
     
  • hugetlbfs page faults can race with truncate and hole punch operations.
    Current code in the page fault path attempts to handle this by 'backing
    out' operations if we encounter the race. One obvious omission in the
    current code is removing a page newly added to the page cache. This is
    pretty straightforward to address, but there is a more subtle and
    difficult issue of backing out hugetlb reservations. To handle this
    correctly, the 'reservation state' before page allocation needs to be
    noted so that it can be properly backed out. There are four distinct
    possibilities for reservation state: shared/reserved, shared/no-resv,
    private/reserved and private/no-resv. Backing out a reservation may
    require memory allocation which could fail so that needs to be taken
    into account as well.

    Instead of writing the required complicated code for this rare
    occurrence, just eliminate the race. i_mmap_rwsem is now held in read
    mode for the duration of page fault processing. Hold i_mmap_rwsem in
    write mode when modifying i_size. In this way, truncation cannot
    proceed when page faults are being processed. In addition, i_size
    will not change during fault processing so a single check can be made
    to ensure faults are not beyond (proposed) end of file. Faults can
    still race with hole punch, but that race is handled by existing code
    and the use of hugetlb_fault_mutex.
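
    The exclusion described above can be illustrated with a toy, single-threaded
    model. This is only a sketch under stated assumptions: every toy_* name is
    hypothetical, the lock is a plain counter rather than a real rw_semaphore,
    and try-lock calls stand in for the kernel's blocking lock operations.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reader/writer lock: many readers or one writer (hypothetical type). */
struct toy_rwsem {
    int readers;
    bool writer;
};

/* Toy inode carrying only the two fields the commit message discusses. */
struct toy_inode {
    struct toy_rwsem i_mmap_rwsem;
    long i_size;
};

static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    if (sem->writer)
        return false;
    sem->readers++;
    return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
    assert(sem->readers > 0);
    sem->readers--;
}

static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
    if (sem->writer || sem->readers > 0)
        return false;
    sem->writer = true;
    return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
    assert(sem->writer);
    sem->writer = false;
}

/* Truncation changes i_size only while holding the lock in write mode. */
static bool toy_truncate(struct toy_inode *inode, long new_size)
{
    if (!toy_down_write_trylock(&inode->i_mmap_rwsem))
        return false;           /* a fault is in progress */
    inode->i_size = new_size;
    toy_up_write(&inode->i_mmap_rwsem);
    return true;
}

/*
 * Start of fault processing: take the lock in read mode, then make the
 * single end-of-file check. On success the caller keeps read mode until
 * toy_fault_end(), so i_size cannot change in between.
 */
static bool toy_fault_begin(struct toy_inode *inode, long offset)
{
    if (!toy_down_read_trylock(&inode->i_mmap_rwsem))
        return false;
    if (offset >= inode->i_size) {
        toy_up_read(&inode->i_mmap_rwsem);
        return false;           /* fault beyond end of file */
    }
    return true;
}

static void toy_fault_end(struct toy_inode *inode)
{
    toy_up_read(&inode->i_mmap_rwsem);
}
```

    In this model, a toy_truncate() attempted between toy_fault_begin() and
    toy_fault_end() fails and leaves i_size untouched; in the kernel the
    writer would sleep until the fault completes rather than fail.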

    With this modification, checks for races with truncation in the page
    fault path can be simplified and removed. remove_inode_hugepages no
    longer needs to take hugetlb_fault_mutex in the case of truncation.
    Comments are expanded to explain reasoning behind locking.

    Signed-off-by: Mike Kravetz
    Signed-off-by: Andrew Morton
    Cc: Andrea Arcangeli
    Cc: "Aneesh Kumar K . V"
    Cc: Davidlohr Bueso
    Cc: Hugh Dickins
    Cc: "Kirill A . Shutemov"
    Cc: Michal Hocko
    Cc: Naoya Horiguchi
    Cc: Prakash Sangappa
    Link: http://lkml.kernel.org/r/20200316205756.146666-3-mike.kravetz@oracle.com
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • Patch series "hugetlbfs: use i_mmap_rwsem for more synchronization", v2.

    While discussing the issue with huge_pte_offset [1], I remembered that
    there were more outstanding hugetlb races. These issues are:

    1) For shared pmds, huge PTE pointers returned by huge_pte_alloc can become
    invalid via a call to huge_pmd_unshare by another thread.
    2) hugetlbfs page faults can race with truncation causing invalid global
    reserve counts and state.

    A previous attempt was made to use i_mmap_rwsem in this manner as
    described at [2]. However, those patches were reverted starting with [3]
    due to locking issues.

    To effectively use i_mmap_rwsem to address the above issues it needs to be
    held (in read mode) during page fault processing. However, during fault
    processing we need to lock the page we will be adding. Lock ordering
    requires we take page lock before i_mmap_rwsem. Waiting until after
    taking the page lock is too late in the fault process for the
    synchronization we want to do.

    To address this lock ordering issue, the following patches change the lock
    ordering for hugetlb pages. This is not too invasive as hugetlbfs
    processing is done separately from core mm in many places. However, I don't
    really like this idea. Much ugliness is contained in the new routine
    hugetlb_page_mapping_lock_write() of patch 1.

    The only other way I can think of to address these issues is by catching
    all the races. After catching a race: clean up, back out, retry, etc.,
    as needed. This can get really ugly, especially for huge page
    reservations. At one time, I started writing some of the reservation
    backout code for page faults and it got so ugly and complicated I went
    down the path of adding synchronization to avoid the races. Any other
    suggestions would be welcome.

    [1] https://lore.kernel.org/linux-mm/1582342427-230392-1-git-send-email-longpeng2@huawei.com/
    [2] https://lore.kernel.org/linux-mm/20181222223013.22193-1-mike.kravetz@oracle.com/
    [3] https://lore.kernel.org/linux-mm/20190103235452.29335-1-mike.kravetz@oracle.com
    [4] https://lore.kernel.org/linux-mm/1584028670.7365.182.camel@lca.pw/
    [5] https://lore.kernel.org/lkml/20200312183142.108df9ac@canb.auug.org.au/

    This patch (of 2):

    While looking at BUGs associated with invalid huge page map counts, it was
    discovered and observed that a huge pte pointer could become 'invalid' and
    point to another task's page table. Consider the following:

    A task takes a page fault on a shared hugetlbfs file and calls
    huge_pte_alloc to get a ptep. Suppose the returned ptep points to a
    shared pmd.

    Now, another task truncates the hugetlbfs file. As part of truncation, it
    unmaps everyone who has the file mapped. If the range being truncated is
    covered by a shared pmd, huge_pmd_unshare will be called. For all but the
    last user of the shared pmd, huge_pmd_unshare will clear the pud pointing
    to the pmd. If the task in the middle of the page fault is not the last
    user, the ptep returned by huge_pte_alloc now points to another task's
    page table or worse. This leads to bad things such as incorrect page
    map/reference counts or invalid memory references.

    To fix, expand the use of i_mmap_rwsem as follows:
    - i_mmap_rwsem is held in read mode whenever huge_pmd_share is called.
    huge_pmd_share is only called via huge_pte_alloc, so callers of
    huge_pte_alloc take i_mmap_rwsem before calling. In addition, callers
    of huge_pte_alloc continue to hold the semaphore until finished with
    the ptep.
    - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called.

    One problem with this scheme is that it requires taking i_mmap_rwsem
    before taking the page lock during page faults. This is not the order
    specified in the rest of mm code. Handling of hugetlbfs pages is mostly
    isolated today. Therefore, we use this alternative locking order for
    PageHuge() pages.

    mapping->i_mmap_rwsem
    hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
    page->flags PG_locked (lock_page)

    To help with lock ordering issues, hugetlb_page_mapping_lock_write() is
    introduced to write lock the i_mmap_rwsem associated with a page.

    In most cases it is easy to get address_space via vma->vm_file->f_mapping.
    However, in the case of migration or memory errors for anon pages we do
    not have an associated vma. A new routine _get_hugetlb_page_mapping()
    will use anon_vma to get address_space in these cases.

    Signed-off-by: Mike Kravetz
    Signed-off-by: Andrew Morton
    Cc: Michal Hocko
    Cc: Hugh Dickins
    Cc: Naoya Horiguchi
    Cc: "Aneesh Kumar K . V"
    Cc: Andrea Arcangeli
    Cc: "Kirill A . Shutemov"
    Cc: Davidlohr Bueso
    Cc: Prakash Sangappa
    Link: http://lkml.kernel.org/r/20200316205756.146666-2-mike.kravetz@oracle.com
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • The userfaultfd fault path was killable by default even if the caller did
    not set FAULT_FLAG_KILLABLE. That made sense earlier, because gup did not
    set FAULT_FLAG_KILLABLE properly. Now that the previous patch applies
    FAULT_FLAG_KILLABLE even in the gup code, it makes sense for userfaultfd
    to honor FAULT_FLAG_KILLABLE as well.

    Because we're unconditionally setting FAULT_FLAG_KILLABLE in gup code
    right now, this patch should have no functional change. It also cleans up
    the code a little by introducing some helpers.

    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160300.9941-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • handle_userfault() is currently the only place in the kernel page
    fault procedures that can respond to non-fatal userspace signals. It was
    trying to detect such an allowance by checking against the USER & KILLABLE
    flags, which was "unofficial".

    This patch introduces a new flag (FAULT_FLAG_INTERRUPTIBLE) to show
    that the fault handler allows the fault procedure to respond even to
    non-fatal signals. It also adds this new flag to the default fault
    flags so that all the page fault handlers can benefit from it.
    With that in place, the userfault check is replaced with a check of
    the new flag.

    Since the line is getting even longer, clean up the fault flags a bit too
    to make life easier for TTY users.

    Although we've got a new flag and applied it, we shouldn't have any
    functional change with this patch so far.
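
    How a handler might map these flags to a wait state can be sketched in
    userspace C. This is an illustration only: the bit values, the
    FAULT_FLAG_DEFAULT composition, the wait_state enum and blocking_state()
    are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>

/* Illustrative bit values only; the real ones are defined by the kernel. */
enum {
    FAULT_FLAG_ALLOW_RETRY   = 1 << 0,
    FAULT_FLAG_KILLABLE      = 1 << 1,
    FAULT_FLAG_INTERRUPTIBLE = 1 << 2,
};

/* Assumed composition of the default flags, with the new flag included. */
#define FAULT_FLAG_DEFAULT \
    (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE | FAULT_FLAG_INTERRUPTIBLE)

/* Hypothetical stand-ins for the task states a sleeping fault could use. */
enum wait_state {
    WAIT_UNINTERRUPTIBLE,   /* no signal wakes the sleeper */
    WAIT_KILLABLE,          /* only fatal signals wake the sleeper */
    WAIT_INTERRUPTIBLE,     /* any signal wakes the sleeper */
};

/*
 * Pick the most permissive wait state that the caller explicitly allows,
 * instead of inferring it from an unrelated combination of flags.
 */
static enum wait_state blocking_state(unsigned int flags)
{
    if (flags & FAULT_FLAG_INTERRUPTIBLE)
        return WAIT_INTERRUPTIBLE;
    if (flags & FAULT_FLAG_KILLABLE)
        return WAIT_KILLABLE;
    return WAIT_UNINTERRUPTIBLE;
}
```

    Under these assumptions, a caller passing the default flags gets an
    interruptible wait, while a caller that sets only FAULT_FLAG_KILLABLE
    can be woken by fatal signals alone.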

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Reviewed-by: David Hildenbrand
    Cc: Andrea Arcangeli
    Cc: Bobby Powers
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220195348.16302-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • This patch removes the risky path in handle_userfault(), so that we can be
    sure the callers of handle_mm_fault() will know that the VMAs might
    have changed. Thanks to the previous patch, we don't lose responsiveness
    either, since the core mm code can now handle nonfatal userspace
    signals even if we return VM_FAULT_RETRY.

    Suggested-by: Andrea Arcangeli
    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Xu
    Signed-off-by: Andrew Morton
    Tested-by: Brian Geffon
    Reviewed-by: Jerome Glisse
    Cc: Bobby Powers
    Cc: David Hildenbrand
    Cc: Denis Plotnikov
    Cc: "Dr . David Alan Gilbert"
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: "Kirill A . Shutemov"
    Cc: Martin Cracauer
    Cc: Marty McFadden
    Cc: Matthew Wilcox
    Cc: Maya Gokhale
    Cc: Mel Gorman
    Cc: Mike Kravetz
    Cc: Mike Rapoport
    Cc: Pavel Emelyanov
    Link: http://lkml.kernel.org/r/20200220160234.9646-1-peterx@redhat.com
    Signed-off-by: Linus Torvalds

    Peter Xu
     
  • Rename (__)memcg_kmem_(un)charge() into (__)memcg_kmem_(un)charge_page()
    to better reflect what they are actually doing:

    1) call __memcg_kmem_(un)charge_memcg() to actually charge or uncharge
    the current memcg

    2) set or clear the PageKmemcg flag

    Signed-off-by: Roman Gushchin
    Signed-off-by: Andrew Morton
    Reviewed-by: Shakeel Butt
    Acked-by: Johannes Weiner
    Cc: Michal Hocko
    Cc: Vladimir Davydov
    Link: http://lkml.kernel.org/r/20200109202659.752357-4-guro@fb.com
    Signed-off-by: Linus Torvalds

    Roman Gushchin
     
  • This notice fills my boot logs with scary-looking asterisks but doesn't
    really tell me anything. Let's just remove it; validation errors are
    already reported separately, so this is just a redundant list of
    filesystems.

    $ dmesg | grep VALIDATE
    [ 0.306256] *** VALIDATE tmpfs ***
    [ 0.307422] *** VALIDATE proc ***
    [ 0.308355] *** VALIDATE cgroup ***
    [ 0.308741] *** VALIDATE cgroup2 ***
    [ 0.813256] *** VALIDATE bpf ***
    [ 0.815272] *** VALIDATE ramfs ***
    [ 0.815665] *** VALIDATE hugetlbfs ***
    [ 0.876970] *** VALIDATE nfs ***
    [ 0.877383] *** VALIDATE nfs4 ***

    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Reviewed-by: Seth Arnold
    Cc: Alexander Viro
    Link: http://lkml.kernel.org/r/202003061617.A8835CAAF@keescook
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • OCFS2 doesn't mind if memory reclaim makes I/Os happen; it just cares that
    it won't be reentered, so it can use memalloc_nofs_save() instead of
    memalloc_noio_save().

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Andrew Morton
    Reviewed-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Changwei Ge
    Cc: Gang He
    Cc: Jun Piao
    Link: http://lkml.kernel.org/r/20200326200214.1102-1-willy@infradead.org
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Since snprintf() returns the would-be-output size instead of the actual
    output size, the succeeding calls may go beyond the given buffer limit.
    Fix it by replacing with scnprintf().
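
    The difference between the two return values can be shown with a userspace
    sketch. my_scnprintf() below is a hypothetical stand-in written for this
    example to mimic the kernel helper's return semantics; it is not the
    kernel's implementation, and append_words() is likewise only illustrative.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Plain snprintf(): returns the length the full output would have had. */
static int snprintf_return(char *buf, size_t size, const char *s)
{
    return snprintf(buf, size, "%s", s);
}

/*
 * Stand-in for scnprintf(): returns the number of characters actually
 * stored in buf, not counting the trailing NUL.
 */
static int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;               /* formatting error: report nothing written */
    if ((size_t)n >= size)
        return (int)(size - 1); /* output was truncated to fit */
    return n;
}

/*
 * Append `word` plus a space `count` times, advancing the offset by each
 * return value. Because the return value never exceeds the space left,
 * the offset always stays below `size`.
 */
static size_t append_words(char *buf, size_t size, const char *word, int count)
{
    size_t off = 0;

    assert(buf != NULL && size > 0);
    for (int i = 0; i < count; i++)
        off += (size_t)my_scnprintf(buf + off, size - off, "%s ", word);
    return off;
}
```

    With an 8-byte buffer and a 10-character string, snprintf_return() reports
    10 although only 7 characters fit, so adding that value to an offset would
    point past the end of the buffer; my_scnprintf() reports 7.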

    Signed-off-by: Takashi Iwai
    Signed-off-by: Andrew Morton
    Acked-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Joseph Qi
    Cc: Changwei Ge
    Cc: Gang He
    Cc: Jun Piao
    Link: http://lkml.kernel.org/r/20200311093516.25300-1-tiwai@suse.de
    Signed-off-by: Linus Torvalds

    Takashi Iwai