08 Nov, 2018

1 commit

  • We use IOCB_HIPRI to poll for IO in the caller instead of scheduling.
    This information is not available for (or after) IO submission. The
    driver may make different queue choices based on the type of IO, so
    make the fact that we will poll for this IO known to the lower layers
    as well.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Keith Busch
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jens Axboe

    Jens Axboe
     

05 Nov, 2018

2 commits

  • Pull UBIFS updates from Richard Weinberger:

    - Full filesystem authentication feature, UBIFS is now able to have the
    whole filesystem structure authenticated plus user data encrypted and
    authenticated.

    - Minor cleanups

    * tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits)
    ubifs: Remove unneeded semicolon
    Documentation: ubifs: Add authentication whitepaper
    ubifs: Enable authentication support
    ubifs: Do not update inode size in-place in authenticated mode
    ubifs: Add hashes and HMACs to default filesystem
    ubifs: authentication: Authenticate super block node
    ubifs: Create hash for default LPT
    ubfis: authentication: Authenticate master node
    ubifs: authentication: Authenticate LPT
    ubifs: Authenticate replayed journal
    ubifs: Add auth nodes to garbage collector journal head
    ubifs: Add authentication nodes to journal
    ubifs: authentication: Add hashes to index nodes
    ubifs: Add hashes to the tree node cache
    ubifs: Create functions to embed a HMAC in a node
    ubifs: Add helper functions for authentication support
    ubifs: Add separate functions to init/crc a node
    ubifs: Format changes for authentication support
    ubifs: Store read superblock node
    ubifs: Drop write_node
    ...

    Linus Torvalds
     
  • Pull NFS client bugfixes from Trond Myklebust:
    "Highlights include:

    Bugfix:
    - Fix build issues on architectures that don't provide 64-bit cmpxchg

    Cleanups:
    - Fix a spelling mistake"

    * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFS: fix spelling mistake, EACCESS -> EACCES
    SUNRPC: Use atomic(64)_t for seq_send(64)

    Linus Torvalds
     

04 Nov, 2018

9 commits

  • Pull cifs fixes and updates from Steve French:
    "Three small fixes (one Kerberos related, one for stable, and another
    fixes an oops in xfstest 377), two helpful debugging improvements,
    three patches for cifs directio and some minor cleanup"

    * tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
    cifs: fix signed/unsigned mismatch on aio_read patch
    cifs: don't dereference smb_file_target before null check
    CIFS: Add direct I/O functions to file_operations
    CIFS: Add support for direct I/O write
    CIFS: Add support for direct I/O read
    smb3: missing defines and structs for reparse point handling
    smb3: allow more detailed protocol info on open files for debugging
    smb3: on kerberos mount if server doesn't specify auth type use krb5
    smb3: add trace point for tree connection
    cifs: fix spelling mistake, EACCESS -> EACCES
    cifs: fix return value for cifs_listxattr

    Linus Torvalds
     
  • syzbot is reporting too large a memory allocation at bfs_fill_super() [1].
    Since the file system image is corrupted such that bfs_sb->s_start == 0,
    bfs_fill_super() is trying to allocate 8MB of contiguous memory. Fix
    this by adding a sanity check on bfs_sb->s_start, __GFP_NOWARN and
    printf().

    [1] https://syzkaller.appspot.com/bug?id=16a87c236b951351374a84c8a32f40edbc034e96

    Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Signed-off-by: Tetsuo Handa
    Reported-by: syzbot
    Reviewed-by: Andrew Morton
    Cc: Tigran Aivazian
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • ocfs2_defrag_extent() might leak allocated clusters. When the file
    system has insufficient space, the number of claimed clusters might be
    less than the caller wants. If that happens, the original code might
    directly commit the transaction without returning clusters.

    This patch is based on code in ocfs2_add_clusters_in_btree().

    [akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac]
    Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com
    Signed-off-by: Larry Chen
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Larry Chen
     
  • The handling of timestamps outside of the 1970..2038 range in the dlm
    glue is rather inconsistent: on 32-bit architectures, this has always
    wrapped around to negative timestamps in the 1902..1969 range, while on
    64-bit kernels all timestamps are interpreted as positive 34 bit numbers
    in the 1970..2514 year range.

    Now that the VFS code handles 64-bit timestamps on all architectures, we
    can make the behavior more consistent here, and return the same result
    that we had on 64-bit already, making the file system y2038 safe in the
    process. Outside of dlmglue, it already uses 64-bit on-disk timestamps
    anyway, so that part is fine.

    For consistency, I'm changing ocfs2_pack_timespec() to clamp anything
    outside of the supported range to the minimum and maximum values. This
    avoids a possible ambiguity of values before 1970 in particular, which
    used to be interpreted as times at the end of the 2514 range previously.

    Link: http://lkml.kernel.org/r/20180619155826.4106487-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
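
The clamping described in this commit can be sketched in user-space C as follows. This is only an illustration, not the kernel's ocfs2_pack_timespec(): the function and macro names are hypothetical, and the layout (34 bits of seconds stored above 30 bits of nanoseconds in one 64-bit word) is an assumption derived from the "positive 34 bit numbers" wording above.

```c
#include <stdint.h>

/* Assumed layout: 34 bits of seconds above 30 bits of nanoseconds. */
#define SKETCH_SEC_BITS   34
#define SKETCH_SEC_SHIFT  (64 - SKETCH_SEC_BITS)
#define SKETCH_SEC_MAX    ((INT64_C(1) << SKETCH_SEC_BITS) - 1)
#define SKETCH_NSEC_MASK  ((UINT64_C(1) << SKETCH_SEC_SHIFT) - 1)

/* Clamp out-of-range seconds to the minimum/maximum instead of wrapping. */
static uint64_t pack_timespec_sketch(int64_t sec, uint32_t nsec)
{
    if (sec < 0)
        sec = 0;                 /* before 1970: clamp to the minimum */
    if (sec > SKETCH_SEC_MAX)
        sec = SKETCH_SEC_MAX;    /* past the 34-bit range: clamp to the maximum */
    return ((uint64_t)sec << SKETCH_SEC_SHIFT) | (nsec & SKETCH_NSEC_MASK);
}

/* Recover the seconds field; always non-negative. */
static int64_t unpack_seconds_sketch(uint64_t packed)
{
    return (int64_t)(packed >> SKETCH_SEC_SHIFT);
}
```

With clamping, a pre-1970 value packs to second 0 rather than being reinterpreted as a time near the end of the supported range.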
     
  • ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read
    several blocks from disk. Currently, the input argument *bhs* may or may
    not be NULL, depending on the caller. If the function fails to read
    blocks from disk, the corresponding bh is put and set to NULL.

    That is not appropriate for a non-NULL input bh, because the caller
    does not know that its bhs have been put and reassigned.

    If the buffer head is managed by the caller, ocfs2_read_blocks() and
    ocfs2_read_blocks_sync() should not set it to NULL. Doing so causes the
    caller to access illegal memory and crash.

    Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com
    Signed-off-by: Changwei Ge
    Reviewed-by: Guozhonghua
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changwei Ge
     
  • Somehow, file system metadata was corrupted, which causes
    ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().

    According to the original design intention, if this happens we should
    skip the problematic block and continue to retrieve dir entries. But
    there is an obvious misuse of brelse in the related code.

    After ocfs2_check_dir_entry() fails, the current code just moves to the
    next position and uses the problematic buffer head again and again,
    during which the problematic buffer head is released multiple times.
    I suppose this is a serious, long-lived issue in ocfs2. It may cause
    other file systems that are in use on the same host to misbehave.

    So we should also consider backporting this patch to linux-stable.

    Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
    Signed-off-by: Changwei Ge
    Suggested-by: Changkuo Shi
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changwei Ge
     
  • When -EIOCBQUEUED is returned, it means that aio_complete() will be called
    from dio_complete(), which runs asynchronously with respect to
    write_iter. Generally, IO is much slower than executing instructions,
    but we still can't take the risk of accessing a freed iocb.

    And we do face a BUG crash issue. Using the crash tool, we can see that
    the iocb has already been freed.

    crash> struct -x kiocb ffff881a350f5900
    struct kiocb {
    ki_filp = 0xffff881a350f5a80,
    ki_pos = 0x0,
    ki_complete = 0x0,
    private = 0x0,
    ki_flags = 0x0
    }

    And the backtrace shows:
    ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
    aio_run_iocb+0x229/0x2f0
    do_io_submit+0x291/0x540
    SyS_io_submit+0x10/0x20
    system_call_fastpath+0x16/0x75

    Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
    Signed-off-by: Changwei Ge
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changwei Ge
     
  • While one node recovers a dead node, quota recovery work will
    be queued. We should avoid calling quota when it is not supported, so
    check the quota flags.

    Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com
    Signed-off-by: guozhonghua
    Reviewed-by: Jan Kara
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guozhonghua
     
  • Remove ocfs2_is_o2cb_active(). We have similar functions to identify
    which cluster stack is being used via osb->osb_cluster_stack.

    Secondly, the current implementation of ocfs2_is_o2cb_active() is not
    totally safe. Based on the design of stackglue, we need to get
    ocfs2_stack_lock before using ocfs2_stack related data structures, and
    that active_stack pointer can be NULL in the case of mount failure.

    Link: http://lkml.kernel.org/r/1495441079-11708-1-git-send-email-ghe@suse.com
    Signed-off-by: Gang He
    Reviewed-by: Joseph Qi
    Reviewed-by: Eric Ren
    Acked-by: Changwei Ge
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gang He
     

03 Nov, 2018

13 commits

  • The patch "CIFS: Add support for direct I/O read" had
    a signed/unsigned mismatch (ssize_t vs. size_t) in the
    return from one function. Similar trivial change
    in aio_write

    Signed-off-by: Long Li
    Signed-off-by: Steve French
    Reported-by: Julia Lawall

    Steve French
     
  • There is a null check on dst_file->private_data which suggests
    it can be potentially null. However, before this check, pointer
    smb_file_target is derived from dst_file->private_data and dereferenced
    in the call to tlink_tcon, hence there is a potential null pointer
    dereference.

    Fix this by assigning smb_file_target and target_tcon after the
    null pointer sanity checks.

    Detected by CoverityScan, CID#1475302 ("Dereference before null check")

    Fixes: 04b38d601239 ("vfs: pull btrfs clone API to vfs layer")
    Signed-off-by: Colin Ian King
    Signed-off-by: Steve French

    Colin Ian King
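
The ordering fix described in this commit (derive and dereference the pointer only after the null checks) can be illustrated with a small stand-alone sketch. All struct and function names here are hypothetical stand-ins, not the CIFS data structures, and the -EBADF return value is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the file, its private data and the tree connection. */
struct tcon_sketch { int id; };
struct file_private_sketch { struct tcon_sketch *tcon; };
struct file_sketch { struct file_private_sketch *private_data; };

/* Return the target tcon id, or a negative errno if anything is missing. */
static int target_tcon_id_sketch(const struct file_sketch *dst)
{
    const struct file_private_sketch *target;

    /* Check first ... */
    if (dst == NULL || dst->private_data == NULL)
        return -EBADF;

    /* ... and only then assign and dereference. */
    target = dst->private_data;
    if (target->tcon == NULL)
        return -EBADF;

    return target->tcon->id;
}
```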
     
  • With direct read/write functions implemented, add them to file_operations.

    Direct I/O is used under two conditions:
    1. When mounting with "cache=none", CIFS uses direct I/O for all user file
    data transfer.
    2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data
    transfer on this file.

    Signed-off-by: Long Li
    Signed-off-by: Steve French
    Reviewed-by: Ronnie Sahlberg

    Long Li
     
  • With direct I/O write, user-supplied buffers are pinned in memory and data
    are transferred directly from user buffers to the transport layer.

    Change in v3: add support for kernel AIO

    Change in v4:
    Refactor common write code to __cifs_writev for direct and non-direct I/O.
    Retry on direct I/O failure.

    Signed-off-by: Long Li
    Signed-off-by: Steve French

    Long Li
     
  • With direct I/O read, we transfer the data directly from transport layer to
    the user data buffer.

    Change in v3: add support for kernel AIO

    Change in v4:
    Refactor common read code to __cifs_readv for direct and non-direct I/O.
    Retry on direct I/O failure.

    Signed-off-by: Long Li
    Signed-off-by: Steve French

    Long Li
     
  • We were missing some structs from MS-FSCC relating to
    reparse point handling. Add them to protocol defines
    in smb2pdu.h

    Signed-off-by: Steve French
    Reviewed-by: Aurelien Aptel

    Steve French
     
  • In order to debug complex problems it is often helpful to
    have detailed information on the client and server view
    of the open file information. Add the ability for root to
    view the list of smb3 open files and dump the persistent
    handle and other info so that it can be more easily
    correlated with server logs.

    Sample output from "cat /proc/fs/cifs/open_files"

    # Version:1
    # Format:
    #
    0x5 0x800000378 0x8000 1 7704 0 some-file 0x14
    0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d
    0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b
    0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67

    Signed-off-by: Steve French
    Reviewed-by: Ronnie Sahlberg

    Steve French
     
  • Some servers (e.g. Azure) do not include a spnego blob in the SMB3
    negotiate protocol response, so on kerberos mounts ("sec=krb5")
    we can fail, as we expected the server to list its supported
    auth types (OIDs in the spnego blob in the negprot response).
    Change this so that on krb5 mounts we default to trying krb5 if the
    server doesn't list its supported protocol mechanisms.

    Signed-off-by: Steve French
    Reviewed-by: Ronnie Sahlberg
    CC: Stable

    Steve French
     
  • In debugging certain scenarios, especially reconnect cases,
    it can be helpful to have a dynamic trace point for the
    result of tree connect. See sample output below
    from a reconnect event. The new event is 'smb3_tcon'

    TASK-PID CPU# |||| TIMESTAMP FUNCTION
    | | | |||| | |
    cifsd-6071 [001] .... 2659.897923: smb3_reconnect: server=localhost current_mid=0xa
    kworker/1:1-71 [001] .... 2666.026342: smb3_cmd_done: sid=0x0 tid=0x0 cmd=0 mid=0
    kworker/1:1-71 [001] .... 2666.026576: smb3_cmd_err: sid=0xc49e1787 tid=0x0 cmd=1 mid=1 status=0xc0000016 rc=-5
    kworker/1:1-71 [001] .... 2666.031677: smb3_cmd_done: sid=0xc49e1787 tid=0x0 cmd=1 mid=2
    kworker/1:1-71 [001] .... 2666.031921: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=3 mid=3
    kworker/1:1-71 [001] .... 2666.031923: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\test rc=0
    kworker/1:1-71 [001] .... 2666.032097: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=11 mid=4
    kworker/1:1-71 [001] .... 2666.032265: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=3 mid=5
    kworker/1:1-71 [001] .... 2666.032266: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\IPC$ rc=0
    kworker/1:1-71 [001] .... 2666.032386: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=11 mid=6

    Signed-off-by: Steve French
    Reviewed-by: Ronnie Sahlberg

    Steve French
     
  • Trivial fix to a spelling mistake of the error access name EACCESS,
    rename to EACCES

    Signed-off-by: Colin Ian King
    Signed-off-by: Steve French

    Colin Ian King
     
  • If the application buffer was too small to fit all the names
    we would still count the number of bytes and return this for
    listxattr. This would then trigger a BUG in usercopy.c

    Fix the computation of the size so that we return -ERANGE
    correctly when the buffer is too small.

    This fixes the kernel BUG for xfstest generic/377

    Signed-off-by: Ronnie Sahlberg
    Signed-off-by: Steve French
    Reviewed-by: Aurelien Aptel

    Ronnie Sahlberg
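
The size handling described in this commit follows the usual listxattr(2) convention: a zero-sized buffer is a size probe, and a non-zero buffer that is too small yields -ERANGE rather than a byte count. A minimal user-space sketch of that convention is shown below; it is not the cifs_listxattr() code, and the function name and the array-of-names input are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy NUL-terminated names back to back into buf.
 * buf_size == 0: return the number of bytes required (size probe).
 * buffer too small: return -ERANGE and copy nothing.
 * otherwise: return the number of bytes written.
 */
static long list_names_sketch(const char *const *names, size_t count,
                              char *buf, size_t buf_size)
{
    size_t needed = 0;
    size_t i;
    char *p = buf;

    for (i = 0; i < count; i++)
        needed += strlen(names[i]) + 1;

    if (buf_size == 0)
        return (long)needed;
    if (needed > buf_size)
        return -ERANGE;      /* never report more bytes than fit in the buffer */

    for (i = 0; i < count; i++) {
        size_t n = strlen(names[i]) + 1;
        memcpy(p, names[i], n);
        p += n;
    }
    return (long)needed;
}
```

Computing the total before copying keeps the returned count and the bytes actually written in agreement, which is the property the fix restores.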
     
  • Pull block layer fixes from Jens Axboe:
    "The biggest part of this pull request is the revert of the blkcg
    cleanup series. It had one fix earlier for a stacked device issue, but
    another one was reported. Rather than play whack-a-mole with this,
    revert the entire series and try again for the next kernel release.

    Apart from that, only small fixes/changes.

    Summary:

    - Indentation fixup for mtip32xx (Colin Ian King)

    - The blkcg cleanup series revert (Dennis Zhou)

    - Two NVMe fixes. One fixing a regression in the nvme request
    initialization in this merge window, causing nvme-fc to not work.
    The other is a suspend/resume p2p resource issue (James, Keith)

    - Fix sg discard merge, allowing us to merge in cases where we didn't
    before (Jianchao Wang)

    - Call rq_qos_exit() after the queue is frozen, preventing a hang
    (Ming)

    - Fix brd queue setup, fixing an oops if we fail setting up all
    devices (Ming)"

    * tag 'for-linus-20181102' of git://git.kernel.dk/linux-block:
    nvme-pci: fix conflicting p2p resource adds
    nvme-fc: fix request private initialization
    blkcg: revert blkcg cleanups series
    block: brd: associate with queue until adding disk
    block: call rq_qos_exit() after queue is frozen
    mtip32xx: clean an indentation issue, remove extraneous tabs
    block: fix the DISCARD request merge

    Linus Torvalds
     
  • Pull vfs dedup fixes from Dave Chinner:
    "This reworks the vfs data cloning infrastructure.

    We discovered many issues with these interfaces late in the 4.19 cycle
    - the worst of them (data corruption, setuid stripping) were fixed for
    XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all
    the problems was needed. That rework is the contents of this pull
    request.

    Rework the vfs_clone_file_range and vfs_dedupe_file_range
    infrastructure to use a common .remap_file_range method and supply
    generic bounds and sanity checking functions that are shared with the
    data write path. The current VFS infrastructure has problems with
    rlimit, LFS file sizes, file time stamps, maximum filesystem file
    sizes, stripping setuid bits, etc and so they are addressed in these
    commits.

    We also introduce the ability for the ->remap_file_range methods to
    return short clones so that clones for vfs_copy_file_range() don't get
    rejected if the entire range can't be cloned. It also allows
    filesystems to silently skip deduplication of partial EOF blocks if
    they are not capable of doing so without requiring errors to be thrown
    to userspace.

    Existing filesystems are converted to use the new remap_file_range
    method, and both XFS and ocfs2 are modified to make use of the new
    generic checking infrastructure"

    * tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
    xfs: remove [cm]time update from reflink calls
    xfs: remove xfs_reflink_remap_range
    xfs: remove redundant remap partial EOF block checks
    xfs: support returning partial reflink results
    xfs: clean up xfs_reflink_remap_blocks call site
    xfs: fix pagecache truncation prior to reflink
    ocfs2: remove ocfs2_reflink_remap_range
    ocfs2: support partial clone range and dedupe range
    ocfs2: fix pagecache truncation prior to reflink
    ocfs2: truncate page cache for clone destination file before remapping
    vfs: clean up generic_remap_file_range_prep return value
    vfs: hide file range comparison function
    vfs: enable remap callers that can handle short operations
    vfs: plumb remap flags through the vfs dedupe functions
    vfs: plumb remap flags through the vfs clone functions
    vfs: make remap_file_range functions take and return bytes completed
    vfs: remap helper should update destination inode metadata
    vfs: pass remap flags to generic_remap_checks
    vfs: pass remap flags to generic_remap_file_range_prep
    vfs: combine the clone and dedupe into a single remap_file_range
    ...

    Linus Torvalds
     

02 Nov, 2018

9 commits

  • Pull misc vfs updates from Al Viro:
    "No common topic, really - a handful of assorted stuff; the least
    trivial bits are Mark's dedupe patches"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs/exofs: only use true/false for asignment of bool type variable
    fs/exofs: fix potential memory leak in mount option parsing
    Delete invalid assignment statements in do_sendfile
    iomap: remove duplicated include from iomap.c
    vfs: dedupe should return EPERM if permission is not granted
    vfs: allow dedupe of user owned read-only files
    ntfs: don't open-code ERR_CAST
    ext4: don't open-code ERR_CAST

    Linus Torvalds
     
  • Pull AFS updates from Al Viro:
    "AFS series, with some iov_iter bits included"

    * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
    missing bits of "iov_iter: Separate type from direction and use accessor functions"
    afs: Probe multiple fileservers simultaneously
    afs: Fix callback handling
    afs: Eliminate the address pointer from the address list cursor
    afs: Allow dumping of server cursor on operation failure
    afs: Implement YFS support in the fs client
    afs: Expand data structure fields to support YFS
    afs: Get the target vnode in afs_rmdir() and get a callback on it
    afs: Calc callback expiry in op reply delivery
    afs: Fix FS.FetchStatus delivery from updating wrong vnode
    afs: Implement the YFS cache manager service
    afs: Remove callback details from afs_callback_break struct
    afs: Commit the status on a new file/dir/symlink
    afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS
    afs: Don't invoke the server to read data beyond EOF
    afs: Add a couple of tracepoints to log I/O errors
    afs: Handle EIO from delivery function
    afs: Fix TTL on VL server and address lists
    afs: Implement VL server rotation
    afs: Improve FS server rotation error handling
    ...

    Linus Torvalds
     
  • This reverts a series committed earlier due to null pointer exception
    bug report in [1]. It seems there are edge case interactions that I did
    not consider and will need some time to understand what causes the
    adverse interactions.

    The original series can be found in [2] with a follow up series in [3].

    [1] https://www.spinics.net/lists/cgroups/msg20719.html
    [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/
    [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/

    This reverts the following commits:
    d459d853c2ed, b2c3fa546705, 101246ec02b5, b3b9f24f5fcc, e2b0989954ae,
    f0fcb3ec89f3, c839e7a03f92, bdc2491708c4, 74b7c02a9bc1, 5bf9a1f3b4ef,
    a7b39b4e961c, 07b05bcc3213, 49f4c2dc2b50, 27e6fa996c53

    Signed-off-by: Dennis Zhou
    Signed-off-by: Jens Axboe

    Dennis Zhou
     
  • Pull compiler attribute updates from Miguel Ojeda:
    "This is an effort to disentangle the include/linux/compiler*.h headers
    and bring them up to date.

    The main idea behind the series is to use feature checking macros
    (i.e. __has_attribute) instead of compiler version checks (e.g.
    GCC_VERSION), which are compiler-agnostic (so they can be shared,
    reducing the size of compiler-specific headers) and version-agnostic.

    Other related improvements have been made in the headers as well,
    which, on top of the use of __has_attribute, have amounted to a
    significant simplification of these headers (e.g. GCC_VERSION is now
    only guarding a few non-attribute macros).

    This series should also help the efforts to support compiling the
    kernel with clang and icc. A fair amount of documentation and comments
    have also been added, clarified or removed; and the headers are now
    more readable, which should help kernel developers in general.

    The series was triggered due to the move to gcc >= 4.6. In turn, this
    series has also triggered Sparse to gain the ability to recognize
    __has_attribute on its own.

    Finally, the __nonstring variable attribute series has been also
    applied on top; plus two related patches from Nick Desaulniers for
    unreachable() that came a bit afterwards"

    * tag 'compiler-attributes-for-linus-4.20-rc1' of https://github.com/ojeda/linux:
    compiler-gcc: remove comment about gcc 4.5 from unreachable()
    compiler.h: update definition of unreachable()
    Compiler Attributes: ext4: remove local __nonstring definition
    Compiler Attributes: auxdisplay: panel: use __nonstring
    Compiler Attributes: enable -Wstringop-truncation on W=1 (gcc >= 8)
    Compiler Attributes: add support for __nonstring (gcc >= 8)
    Compiler Attributes: add MAINTAINERS entry
    Compiler Attributes: add Doc/process/programming-language.rst
    Compiler Attributes: remove uses of __attribute__ from compiler.h
    Compiler Attributes: KENTRY used twice the "used" attribute
    Compiler Attributes: use feature checks instead of version checks
    Compiler Attributes: add missing SPDX ID in compiler_types.h
    Compiler Attributes: remove unneeded sparse (__CHECKER__) tests
    Compiler Attributes: homogenize __must_be_array
    Compiler Attributes: remove unneeded tests
    Compiler Attributes: always use the extra-underscores syntax
    Compiler Attributes: remove unused attributes

    Linus Torvalds
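
The feature-check approach described in this pull request can be sketched outside the kernel as follows. This is a minimal illustration, not the contents of the kernel headers: the sketch_nonstring macro, the struct and the helper are hypothetical names, and the fallback definition of __has_attribute for compilers that lack it is an assumption made so the example builds everywhere.

```c
#include <string.h>

/* Compilers without __has_attribute simply report "not supported". */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Ask the compiler about the feature instead of comparing version numbers. */
#if __has_attribute(__nonstring__)
# define sketch_nonstring __attribute__((__nonstring__))
#else
# define sketch_nonstring
#endif

/* A fixed-width tag that is deliberately not NUL-terminated. */
struct label_sketch {
    char tag[4] sketch_nonstring;
};

/* Fill all four bytes of the tag; no terminator is stored. */
static void label_set_sketch(struct label_sketch *l, const char *four_bytes)
{
    memcpy(l->tag, four_bytes, sizeof(l->tag));
}
```

The same source builds whether or not the compiler knows the attribute, with no compiler name or version appearing in the check.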
     
  • backmerge to do fixup of iov_iter_kvec() conflict

    Al Viro
     
  • Pull overlayfs updates from Miklos Szeredi:
    "A mix of fixes and cleanups"

    * tag 'ovl-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ovl: automatically enable redirect_dir on metacopy=on
    ovl: check whiteout in ovl_create_over_whiteout()
    ovl: using posix_acl_xattr_size() to get size instead of posix_acl_to_xattr()
    ovl: abstract ovl_inode lock with a helper
    ovl: remove the 'locked' argument of ovl_nlink_{start,end}
    ovl: relax requirement for non null uuid of lower fs
    ovl: fold copy-up helpers into callers
    ovl: untangle copy up call chain
    ovl: relax permission checking on underlying layers
    ovl: fix recursive oi->lock in ovl_link()
    vfs: fix FIGETBSZ ioctl on an overlayfs file
    ovl: clean up error handling in ovl_get_tmpfile()
    ovl: fix error handling in ovl_verify_set_fh()

    Linus Torvalds
     
  • Current behavior is to automatically disable metacopy if redirect_dir is
    not enabled and proceed with the mount.

    If "metacopy=on" mount option was given, then this behavior can confuse the
    user: no mount failure, yet metacopy is disabled.

    This patch makes metacopy=on imply redirect_dir=on.

    The converse is also true: turning off full redirect with redirect_dir=
    {off|follow|nofollow} will disable metacopy.

    If both metacopy=on and redirect_dir={off|follow|nofollow} are specified,
    then mount will fail, since there's no way to correctly resolve the
    conflict.

    Reported-by: Daniel Walsh
    Fixes: d5791044d2e5 ("ovl: Provide a mount option metacopy=on/off...")
    Cc: # v4.19
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
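
The option resolution rules stated in this commit can be sketched as a small stand-alone function. This is only an illustration of the three rules, not the overlayfs code: the type and field names are hypothetical, the "explicit" flags model whether an option was given on the mount command line as opposed to coming from a default, and -EINVAL stands in for "mount will fail".

```c
#include <errno.h>
#include <stdbool.h>

enum redirect_sketch { REDIRECT_ON, REDIRECT_OFF, REDIRECT_FOLLOW, REDIRECT_NOFOLLOW };

struct ovl_opts_sketch {
    bool metacopy;                  /* effective metacopy setting */
    bool metacopy_explicit;         /* metacopy= was given by the user */
    enum redirect_sketch redirect;  /* effective redirect_dir setting */
    bool redirect_explicit;         /* redirect_dir= was given by the user */
};

/* Returns 0 on success, -EINVAL (assumed) when the options conflict. */
static int ovl_resolve_opts_sketch(struct ovl_opts_sketch *o)
{
    if (o->metacopy && o->redirect != REDIRECT_ON) {
        /* Both given explicitly and contradictory: fail the mount. */
        if (o->metacopy_explicit && o->redirect_explicit)
            return -EINVAL;

        if (o->redirect_explicit)
            o->metacopy = false;        /* full redirect turned off: disable metacopy */
        else
            o->redirect = REDIRECT_ON;  /* metacopy=on implies redirect_dir=on */
    }
    return 0;
}
```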
     
  • Pull stackleak gcc plugin from Kees Cook:
    "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin
    was ported from grsecurity by Alexander Popov. It provides efficient
    stack content poisoning at syscall exit. This creates a defense
    against at least two classes of flaws:

    - Uninitialized stack usage. (We continue to work on improving the
    compiler to do this in other ways: e.g. unconditional zero init was
    proposed to GCC and Clang, and more plugin work has started too).

    - Stack content exposure. By greatly reducing the lifetime of valid
    stack contents, exposures via either direct read bugs or unknown
    cache side-channels become much more difficult to exploit. This
    complements the existing buddy and heap poisoning options, but
    provides the coverage for stacks.

    The x86 hooks are included in this series (which have been reviewed by
    Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already
    been merged through the arm64 tree (written by Laura Abbott and
    reviewed by Mark Rutland and Will Deacon).

    With VLAs having been removed this release, there is no need for
    alloca() protection, so it has been removed from the plugin"

    * tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    arm64: Drop unneeded stackleak_check_alloca()
    stackleak: Allow runtime disabling of kernel stack erasing
    doc: self-protection: Add information about STACKLEAK feature
    fs/proc: Show STACKLEAK metrics in the /proc file system
    lkdtm: Add a test for STACKLEAK
    gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
    x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls

    Linus Torvalds
     
  • Trivial fix to a spelling mistake of the error access name EACCESS,
    rename to EACCES

    Signed-off-by: Colin Ian King
    Signed-off-by: Trond Myklebust

    Colin Ian King
     

01 Nov, 2018

3 commits

  • Pull fuse updates from Miklos Szeredi:
    "As well as the usual bug fixes, this adds the following new features:

    - cached readdir and readlink

    - max I/O size increased from 128k to 1M

    - improved performance and scalability of request queues

    - copy_file_range support

    The only non-fuse bits are trivial cleanups of macros in
    "

    * tag 'fuse-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (31 commits)
    fuse: enable caching of symlinks
    fuse: only invalidate atime in direct read
    fuse: don't need GETATTR after every READ
    fuse: allow fine grained attr cache invaldation
    bitops: protect variables in bit_clear_unless() macro
    bitops: protect variables in set_mask_bits() macro
    fuse: realloc page array
    fuse: add max_pages to init_out
    fuse: allocate page array more efficiently
    fuse: reduce size of struct fuse_inode
    fuse: use iversion for readdir cache verification
    fuse: use mtime for readdir cache verification
    fuse: add readdir cache version
    fuse: allow using readdir cache
    fuse: allow caching readdir
    fuse: extract fuse_emit() helper
    fuse: add FOPEN_CACHE_DIR
    fuse: split out readdir.c
    fuse: Use hash table to link processing request
    fuse: kill req->intr_unique
    ...

    Linus Torvalds
     
  • Pull ceph updates from Ilya Dryomov:
    "The highlights are:

    - a series that fixes some old memory allocation issues in libceph
    (myself). We no longer allocate memory in places where allocation
    failures cannot be handled and BUG when the allocation fails.

    - support for copy_file_range() syscall (Luis Henriques). If size and
    alignment conditions are met, it leverages RADOS copy-from
    operation. Otherwise, a local copy is performed.

    - a patch that reduces memory requirement of ceph_sync_read() from
    the size of the entire read to the size of one object (Zheng Yan).

    - fallocate() syscall is now restricted to FALLOC_FL_PUNCH_HOLE (Luis
    Henriques)"

    * tag 'ceph-for-4.20-rc1' of git://github.com/ceph/ceph-client: (25 commits)
    ceph: new mount option to disable usage of copy-from op
    ceph: support copy_file_range file operation
    libceph: support the RADOS copy-from operation
    ceph: add non-blocking parameter to ceph_try_get_caps()
    libceph: check reply num_data_items in setup_request_data()
    libceph: preallocate message data items
    libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls
    libceph: introduce alloc_watch_request()
    libceph: assign cookies in linger_submit()
    libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get()
    ceph: num_ops is off by one in ceph_aio_retry_work()
    libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op()
    ceph: set timeout conditionally in __cap_delay_requeue
    libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist()
    libceph: introduce ceph_pagelist_alloc()
    libceph: osd_req_op_cls_init() doesn't need to take opcode
    libceph: bump CEPH_MSG_MAX_DATA_LEN
    ceph: only allow punch hole mode in fallocate
    ceph: refactor ceph_sync_read()
    ceph: check if LOOKUPNAME request was aborted when filling trace
    ...

    Linus Torvalds
     
  • Merge more updates from Andrew Morton:

    - the rest of MM

    - lib/bitmap updates

    - hfs updates

    - fatfs updates

    - various other misc things

    * emailed patches from Andrew Morton: (94 commits)
    mm/gup.c: fix __get_user_pages_fast() comment
    mm: Fix warning in insert_pfn()
    memory-hotplug.rst: add some details about locking internals
    powerpc/powernv: hold device_hotplug_lock when calling memtrace_offline_pages()
    powerpc/powernv: hold device_hotplug_lock when calling device_online()
    mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
    mm/memory_hotplug: make add_memory() take the device_hotplug_lock
    mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
    mm/memblock.c: warn if zero alignment was requested
    memblock: stop using implicit alignment to SMP_CACHE_BYTES
    docs/boot-time-mm: remove bootmem documentation
    mm: remove include/linux/bootmem.h
    memblock: replace BOOTMEM_ALLOC_* with MEMBLOCK variants
    mm: remove nobootmem
    memblock: rename __free_pages_bootmem to memblock_free_pages
    memblock: rename free_all_bootmem to memblock_free_all
    memblock: replace free_bootmem_late with memblock_free_late
    memblock: replace free_bootmem{_node} with memblock_free
    mm: nobootmem: remove bootmem allocation APIs
    memblock: replace alloc_bootmem with memblock_alloc
    ...

    Linus Torvalds
     

31 Oct, 2018

3 commits

  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below and then
    semi-automated removal of duplicated '#include <linux/memblock.h>'

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • All architectures use memblock for early memory management. There is no need
    for the CONFIG_HAVE_MEMBLOCK configuration option.

    [rppt@linux.vnet.ibm.com: of/fdt: fixup #ifdefs]
    Link: http://lkml.kernel.org/r/20180919103457.GA20545@rapoport-lnx
    [rppt@linux.vnet.ibm.com: csky: fixups after bootmem removal]
    Link: http://lkml.kernel.org/r/20180926112744.GC4628@rapoport-lnx
    [rppt@linux.vnet.ibm.com: remove stale #else and the code it protects]
    Link: http://lkml.kernel.org/r/1538067825-24835-1-git-send-email-rppt@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1536927045-23536-4-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Michal Hocko
    Tested-by: Jonathan Cameron
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • setattr_copy can't truncate timestamps correctly for
    msdos/vfat, so truncate and copy them ourselves.
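
    To illustrate what truncating a timestamp to FAT precision can mean,
    here is a userspace sketch. It is not the kernel patch. The helper names
    are hypothetical, and the granularities (2 seconds for modification
    time, 10 ms for creation time, one day for access time) come from the
    FAT on-disk format rather than from this commit message. The sketch
    assumes UTC and timestamps at or after the Unix epoch, and ignores any
    time zone offset handling.

```c
#include <time.h>

#define SECS_PER_DAY 86400L

/* Hypothetical helper: round down to 2-second granularity (FAT mtime). */
static struct timespec fat_trunc_mtime_sketch(struct timespec ts)
{
    ts.tv_sec -= ts.tv_sec % 2;
    ts.tv_nsec = 0;
    return ts;
}

/* Hypothetical helper: round down to 10 ms granularity (FAT creation time). */
static struct timespec fat_trunc_crtime_sketch(struct timespec ts)
{
    ts.tv_nsec -= ts.tv_nsec % 10000000L;
    return ts;
}

/* Hypothetical helper: keep only the date (FAT access time), UTC assumed. */
static struct timespec fat_trunc_atime_sketch(struct timespec ts)
{
    ts.tv_sec -= ts.tv_sec % SECS_PER_DAY;
    ts.tv_nsec = 0;
    return ts;
}
```

    Each field needs its own rule, which a single filesystem-wide time
    granularity cannot express.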

    Link: http://lkml.kernel.org/r/a2b4701b1125573fafaeaae6802050ca86d6f8cc.1538363961.git.sorenson@redhat.com
    Signed-off-by: Frank Sorenson
    Acked-by: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frank Sorenson