13 May, 2017

1 commit


11 May, 2017

2 commits

  • Pull NFS client updates from Trond Myklebust:
    "Highlights include:

    Stable bugfixes:
    - Fix use after free in write error path
    - Use GFP_NOIO for two allocations in writeback
    - Fix a hang in OPEN related to server reboot
    - Check the result of nfs4_pnfs_ds_connect
    - Fix an rcu lock leak

    Features:
    - Removal of the unmaintained and unused OSD pNFS layout
    - Cleanup and removal of lots of unnecessary dprintk()s
    - Cleanup and removal of some memory failure paths now that GFP_NOFS
    is guaranteed to never fail.
    - Remove the v3-only data server limitation on pNFS/flexfiles

    Bugfixes:
    - RPC/RDMA connection handling bugfixes
    - Copy offload: fixes to ensure the copied data is COMMITed to disk.
    - Readdir: switch back to using the ->iterate VFS interface
    - File locking fixes from Ben Coddington
    - Various use-after-free and deadlock issues in pNFS
    - Write path bugfixes"

    * tag 'nfs-for-4.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (89 commits)
    pNFS/flexfiles: Always attempt to call layoutstats when flexfiles is enabled
    NFSv4.1: Work around a Linux server bug...
    NFS append COMMIT after synchronous COPY
    NFSv4: Fix exclusive create attributes encoding
    NFSv4: Fix an rcu lock leak
    nfs: use kmap/kunmap directly
    NFS: always treat the invocation of nfs_getattr as cache hit when noac is on
    Fix nfs_client refcounting if kmalloc fails in nfs4_proc_exchange_id and nfs4_proc_async_renew
    NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION
    pNFS: Fix NULL dereference in pnfs_generic_alloc_ds_commits
    pNFS: Fix a typo in pnfs_generic_alloc_ds_commits
    pNFS: Fix a deadlock when coalescing writes and returning the layout
    pNFS: Don't clear the layout return info if there are segments to return
    pNFS: Ensure we commit the layout if it has been invalidated
    pNFS: Don't send COMMITs to the DSes if the server invalidated our layout
    pNFS/flexfiles: Fix up the ff_layout_write_pagelist failure path
    pNFS: Ensure we check layout validity before marking it for return
    NFS4.1 handle interrupted slot reuse from ERR_DELAY
    NFSv4: check return value of xdr_inline_decode
    nfs/filelayout: fix NULL pointer dereference in fl_pnfs_update_layout()
    ...

    Linus Torvalds
     
  • Pull overlayfs update from Miklos Szeredi:
    "The biggest part of this is making st_dev/st_ino on the overlay behave
    like a normal filesystem (i.e. st_ino doesn't change on copy up,
    st_dev is the same for all files and directories). Currently this only
    works if all layers are on the same filesystem, but future work will
    move the general case towards more sane behavior.

    There are also miscellaneous fixes, including fixes to handling
    append-only files. There's a small change in the VFS, but that only
    has an effect on overlayfs, since otherwise file->f_path.dentry->inode
    and file_inode(file) are always the same"

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ovl: update documentation w.r.t. constant inode numbers
    ovl: persistent inode numbers for upper hardlinks
    ovl: merge getattr for dir and nondir
    ovl: constant st_ino/st_dev across copy up
    ovl: persistent inode number for directories
    ovl: set the ORIGIN type flag
    ovl: lookup non-dir copy-up-origin by file handle
    ovl: use an auxiliary var for overlay root entry
    ovl: store file handle of lower inode on copy up
    ovl: check if all layers are on the same fs
    ovl: do not set overlay.opaque on non-dir create
    ovl: check IS_APPEND() on real upper inode
    vfs: ftruncate check IS_APPEND() on real upper inode
    ovl: Use designated initializers
    ovl: lockdep annotate of nested stacked overlayfs inode lock

    Linus Torvalds
     

09 May, 2017

2 commits

  • Pull PCI updates from Bjorn Helgaas:

    - add framework for supporting PCIe devices in Endpoint mode (Kishon
    Vijay Abraham I)

    - use non-postable PCI config space mappings when possible (Lorenzo
    Pieralisi)

    - clean up and unify mmap of PCI BARs (David Woodhouse)

    - export and unify Function Level Reset support (Christoph Hellwig)

    - avoid FLR for Intel 82579 NICs (Sasha Neftin)

    - add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)

    - short-circuit config access failures for disconnected devices (Keith
    Busch)

    - remove D3 sleep delay when possible (Adrian Hunter)

    - freeze PME scan before suspending devices (Lukas Wunner)

    - stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)

    - disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)

    - add arch-specific alignment control to improve device passthrough by
    avoiding multiple BARs in a page (Yongji Xie)

    - add sysfs sriov_drivers_autoprobe to control VF driver binding
    (Bodong Wang)

    - allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)

    - fix crashes when unbinding host controllers that don't support
    removal (Brian Norris)

    - add driver for MicroSemi Switchtec management interface (Logan
    Gunthorpe)

    - add driver for Faraday Technology FTPCI100 host bridge (Linus
    Walleij)

    - add i.MX7D support (Andrey Smirnov)

    - use generic MSI support for Aardvark (Thomas Petazzoni)

    - make Rockchip driver modular (Brian Norris)

    - advertise 128-byte Read Completion Boundary support for Rockchip
    (Shawn Lin)

    - advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)

    - convert atomic_t to refcount_t in HV driver (Elena Reshetova)

    - add CPU IRQ affinity in HV driver (K. Y. Srinivasan)

    - fix PCI bus removal in HV driver (Long Li)

    - add support for ThunderX2 DMA alias topology (Jayachandran C)

    - add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)

    - add ITE 8893 bridge DMA alias quirk (Jarod Wilson)

    - restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
    (Manish Jaggi)

    * tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
    PCI: Don't allow unbinding host controllers that aren't prepared
    ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
    MAINTAINERS: Add PCI Endpoint maintainer
    Documentation: PCI: Add userguide for PCI endpoint test function
    tools: PCI: Add sample test script to invoke pcitest
    tools: PCI: Add a userspace tool to test PCI endpoint
    Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
    misc: Add host side PCI driver for PCI test function device
    PCI: Add device IDs for DRA74x and DRA72x
    dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
    PCI: dwc: dra7xx: Workaround for errata id i870
    dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
    PCI: dwc: dra7xx: Add EP mode support
    PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
    dt-bindings: PCI: Add DT bindings for PCI designware EP mode
    PCI: dwc: designware: Add EP mode support
    Documentation: PCI: Add binding documentation for pci-test endpoint function
    ixgbe: Use pcie_flr() instead of duplicating it
    IB/hfi1: Use pcie_flr() instead of duplicating it
    PCI: imx6: Fix spelling mistake: "contol" -> "control"
    ...

    Linus Torvalds
     
  • Commit afddba49d18f ("fs: introduce write_begin, write_end, and
    perform_write aops") introduced AOP_FLAG_UNINTERRUPTIBLE flag which was
    checked in pagecache_write_begin(), but that check was removed by
    4e02ed4b4a2f ("fs: remove prepare_write/commit_write").

    Between these two commits, commit d9414774dc0c ("cifs: Convert cifs to
    new aops.") added a check in cifs_write_begin(), but that check was soon
    removed by commit a98ee8c1c707 ("[CIFS] fix regression in
    cifs_write_begin/cifs_write_end").

    Therefore, AOP_FLAG_UNINTERRUPTIBLE flag is checked nowhere. Let's
    remove this flag. This patch has no functionality changes.

    Link: http://lkml.kernel.org/r/1489294781-53494-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Signed-off-by: Tetsuo Handa
    Reviewed-by: Jeff Layton
    Reviewed-by: Christoph Hellwig
    Cc: Nick Piggin
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

05 May, 2017

1 commit


04 May, 2017

1 commit

  • Show MADV_FREE pages info of each vma in smaps. The interface is for
    diagnostic or monitoring purposes; userspace can use it to understand
    what happens in the application. Since userspace can dirty MADV_FREE
    pages without notice from the kernel, this interface is the only place
    we can get accurate accounting info about MADV_FREE pages.
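
    A rough userspace sketch of how a monitoring tool might total these
    pages. It is not part of the patch. It assumes the per-vma field is
    printed as "LazyFree: <n> kB", which is the label I understand mainline
    kernels to use; the text above does not name the field. The helper name
    sum_lazyfree_kb() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sum the "LazyFree:" values (in kB) found in smaps-formatted text.
 * Assumption: each vma block carries a line of the form
 * "LazyFree:   <n> kB". Lines that do not match are ignored.
 */
static long sum_lazyfree_kb(FILE *smaps)
{
    char line[256];
    long total = 0, kb;

    assert(smaps != NULL);
    while (fgets(line, sizeof(line), smaps)) {
        /* sscanf returns 1 only when the label and the number both match */
        if (sscanf(line, "LazyFree: %ld kB", &kb) == 1)
            total += kb;
    }
    return total;
}
```

    A caller would typically pass the stream returned by
    fopen("/proc/self/smaps", "r") and close it afterwards.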

    [mhocko@kernel.org: update Documentation/filesystems/proc.txt]
    Link: http://lkml.kernel.org/r/89efde633559de1ec07444f2ef0f4963a97a2ce8.1487965799.git.shli@fb.com
    Signed-off-by: Shaohua Li
    Acked-by: Johannes Weiner
    Acked-by: Minchan Kim
    Acked-by: Michal Hocko
    Acked-by: Hillf Danton
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     

03 May, 2017

2 commits

  • Pull livepatch updates from Jiri Kosina:

    - a per-task consistency model is being added for architectures that
    support reliable stack dumping (extending this set, which is currently
    rather small, is in the works).

    This extends the nature of the types of patches that can be applied
    by live patching infrastructure. The code stems from the design
    proposal made [1] back in November 2014. It's a hybrid of SUSE's
    kGraft and RH's kpatch, combining advantages of both: it uses
    kGraft's per-task consistency and syscall barrier switching combined
    with kpatch's stack trace switching. There are also a number of
    fallback options which make it quite flexible.

    Most of the heavy lifting done by Josh Poimboeuf with help from
    Miroslav Benes and Petr Mladek

    [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz

    - module load time patch optimization from Zhou Chengming

    - a few assorted small fixes

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: add missing printk newlines
    livepatch: Cancel transition a safe way for immediate patches
    livepatch: Reduce the time of finding module symbols
    livepatch: make klp_mutex proper part of API
    livepatch: allow removal of a disabled patch
    livepatch: add /proc/<pid>/patch_state
    livepatch: change to a per-task consistency model
    livepatch: store function sizes
    livepatch: use kstrtobool() in enabled_store()
    livepatch: move patching functions into patch.c
    livepatch: remove unnecessary object loaded check
    livepatch: separate enabled and patched states
    livepatch/s390: add TIF_PATCH_PENDING thread flag
    livepatch/s390: reorganize TIF thread flag bits
    livepatch/powerpc: add TIF_PATCH_PENDING thread flag
    livepatch/x86: add TIF_PATCH_PENDING thread flag
    livepatch: create temporary klp_update_patch_state() stub
    x86/entry: define _TIF_ALLWORK_MASK flags explicitly
    stacktrace/x86: add function for detecting reliable stack traces

    Linus Torvalds
     
  • Pull documentation update from Jonathan Corbet:
    "A reasonably busy cycle for documentation this time around. There is a
    new guide for user-space API documents, rather sparsely populated at
    the moment, but it's a start. Markus improved the infrastructure for
    converting diagrams. Mauro has converted much of the USB documentation
    over to RST. Plus the usual set of fixes, improvements, and tweaks.

    There's a bit more than the usual amount of reaching out of
    Documentation/ to fix comments elsewhere in the tree; I have acks for
    those where I could get them"

    * tag 'docs-4.12' of git://git.lwn.net/linux: (74 commits)
    docs: Fix a couple typos
    docs: Fix a spelling error in vfio-mediated-device.txt
    docs: Fix a spelling error in ioctl-number.txt
    MAINTAINERS: update file entry for HSI subsystem
    Documentation: allow installing man pages to a user defined directory
    Doc/PM: Sync with intel_powerclamp code behavior
    zr364xx.rst: usb/devices is now at /sys/kernel/debug/
    usb.rst: move documentation from proc_usb_info.txt to USB ReST book
    convert philips.txt to ReST and add to media docs
    docs-rst: usb: update old usbfs-related documentation
    arm: Documentation: update a path name
    docs: process/4.Coding.rst: Fix a couple of document refs
    docs-rst: fix usb cross-references
    usb: gadget.h: be consistent at kernel doc macros
    usb: composite.h: fix two warnings when building docs
    usb: get rid of some ReST doc build errors
    usb.rst: get rid of some Sphinx errors
    usb/URB.txt: convert to ReST and update it
    usb/persist.txt: convert to ReST and add to driver-api book
    usb/hotplug.txt: convert to ReST and add to driver-api book
    ...

    Linus Torvalds
     

21 Apr, 2017

1 commit


20 Apr, 2017

1 commit

  • Starting to leave behind the legacy of the pci_mmap_page_range() interface
    which takes "user-visible" BAR addresses. This takes just the resource and
    offset.

    For now, both APIs coexist and depending on the platform, one is
    implemented as a wrapper around the other.

    Signed-off-by: David Woodhouse
    Signed-off-by: Bjorn Helgaas

    David Woodhouse
     

19 Apr, 2017

2 commits

  • This is relatively esoteric, and knowing that we don't have it makes life
    easier in some cases rather than just an eventual -EINVAL from
    pci_mmap_page_range().

    Signed-off-by: David Woodhouse
    Signed-off-by: Bjorn Helgaas

    David Woodhouse
     
  • Most of the almost-identical versions of pci_mmap_page_range() silently
    ignore the 'write_combine' argument and give uncached mappings.

    Yet we allow the PCIIOC_WRITE_COMBINE ioctl in /proc/bus/pci, expose the
    'resourceX_wc' file in sysfs, and allow an attempted mapping to apparently
    succeed.

    To fix this, introduce a macro arch_can_pci_mmap_wc() which indicates
    whether the platform can do a write-combining mapping. On x86 this ends up
    being pat_enabled(), while the few other platforms that support it can just
    set it to a literal '1'.

    Signed-off-by: David Woodhouse
    Signed-off-by: Bjorn Helgaas

    David Woodhouse
     

03 Apr, 2017

1 commit


30 Mar, 2017

1 commit

  • As ftp.kernel.org is closed [0], this commit fixes dead URLs in
    documents to use www.kernel.org instead.

    [0] https://www.kernel.org/shutting-down-ftp-services.html

    Signed-off-by: SeongJae Park
    Acked-by: Theodore Ts'o
    Acked-by: David S. Miller
    Reviewed-by: Mauro Carvalho Chehab
    Signed-off-by: Jonathan Corbet

    SeongJae Park
     

08 Mar, 2017

1 commit


03 Mar, 2017

1 commit

  • Add a system call to make extended file information available, including
    file creation and some attribute flags where available through the
    underlying filesystem.

    The getattr inode operation is altered to take two additional arguments: a
    u32 request_mask and an unsigned int flags that indicate the
    synchronisation mode. This change is propagated to the vfs_getattr*()
    function.

    Functions like vfs_stat() are now inline wrappers around new functions
    vfs_statx() and vfs_statx_fd() to reduce stack usage.

    ========
    OVERVIEW
    ========

    The idea was initially proposed as a set of xattrs that could be retrieved
    with getxattr(), but the general preference proved to be for a new syscall
    with an extended stat structure.

    A number of requests were gathered for features to be included. The
    following have been included:

    (1) Make the fields a consistent size on all arches and make them large.

    (2) Spare space, request flags and information flags are provided for
    future expansion.

    (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
    __s64).

    (4) Creation time: The SMB protocol carries the creation time, which could
    be exported by Samba, which will in turn help CIFS make use of
    FS-Cache as that can be used for coherency data (stx_btime).

    This is also specified in NFSv4 as a recommended attribute and could
    be exported by NFSD [Steve French].

    (5) Lightweight stat: Ask for just those details of interest, and allow a
    netfs (such as NFS) to approximate anything not of interest, possibly
    without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
    Dilger] (AT_STATX_DONT_SYNC).

    (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
    its cached attributes are up to date [Trond Myklebust]
    (AT_STATX_FORCE_SYNC).

    And the following have been left out for future extension:

    (7) Data version number: Could be used by userspace NFS servers [Aneesh
    Kumar].

    Can also be used to modify fill_post_wcc() in NFSD which retrieves
    i_version directly, but has just called vfs_getattr(). It could get
    it from the kstat struct if it used vfs_xgetattr() instead.

    (There's disagreement on the exact semantics of a single field, since
    not all filesystems do this the same way).

    (8) BSD stat compatibility: Including more fields from the BSD stat such
    as creation time (st_btime) and inode generation number (st_gen)
    [Jeremy Allison, Bernd Schubert].

    (9) Inode generation number: Useful for FUSE and userspace NFS servers
    [Bernd Schubert].

    (This was asked for but later deemed unnecessary with the
    open-by-handle capability available and caused disagreement as to
    whether it's a security hole or not).

    (10) Extra coherency data may be useful in making backups [Andreas Dilger].

    (No particular data were offered, but things like last backup
    timestamp, the data version number and the DOS archive bit would come
    into this category).

    (11) Allow the filesystem to indicate what it can/cannot provide: A
    filesystem can now say it doesn't support a standard stat feature if
    that isn't available, so if, for instance, inode numbers or UIDs don't
    exist or are fabricated locally...

    (This requires a separate system call - I have an fsinfo() call idea
    for this).

    (12) Store a 16-byte volume ID in the superblock that can be returned in
    struct xstat [Steve French].

    (Deferred to fsinfo).

    (13) Include granularity fields in the time data to indicate the
    granularity of each of the times (NFSv4 time_delta) [Steve French].

    (Deferred to fsinfo).

    (14) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags.
    Note that the Linux IOC flags are a mess and filesystems such as Ext4
    define flags that aren't in linux/fs.h, so translation in the kernel
    may be a necessity (or, possibly, we provide the filesystem type too).

    (Some attributes are made available in stx_attributes, but the general
    feeling was that the IOC flags were too ext[234]-specific and shouldn't
    be exposed through statx this way).

    (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
    Michael Kerrisk].

    (Deferred, probably to fsinfo. Finding out if there's an ACL or
    seclabel might require extra filesystem operations).

    (16) Femtosecond-resolution timestamps [Dave Chinner].

    (A __reserved field has been left in the statx_timestamp struct for
    this - if there proves to be a need).

    (17) A set multiple attributes syscall to go with this.

    ===============
    NEW SYSTEM CALL
    ===============

    The new system call is:

    int ret = statx(int dfd,
                    const char *filename,
                    unsigned int flags,
                    unsigned int mask,
                    struct statx *buffer);

    The dfd, filename and flags parameters indicate the file to query, in a
    similar way to fstatat(). There is no equivalent of lstat() as that can be
    emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags. There is
    also no equivalent of fstat() as that can be emulated by passing a NULL
    filename to statx() with the fd of interest in dfd.

    Whether or not statx() synchronises the attributes with the backing store
    can be controlled by OR'ing a value into the flags argument (this typically
    only affects network filesystems):

    (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
    respect.

    (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
    its attributes with the server - which might require data writeback to
    occur to get the timestamps correct.

    (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
    network filesystem. The resulting values should be considered
    approximate.

    mask is a bitmask indicating the fields in struct statx that are of
    interest to the caller. The user should set this to STATX_BASIC_STATS to
    get the basic set returned by stat(). It should be noted that asking for
    more information may entail extra I/O operations.

    buffer points to the destination for the data. This must be 256 bytes in
    size.

    ======================
    MAIN ATTRIBUTES RECORD
    ======================

    The following structures are defined in which to return the main attribute
    set:

    struct statx_timestamp {
        __s64 tv_sec;
        __s32 tv_nsec;
        __s32 __reserved;
    };

    struct statx {
        __u32 stx_mask;
        __u32 stx_blksize;
        __u64 stx_attributes;
        __u32 stx_nlink;
        __u32 stx_uid;
        __u32 stx_gid;
        __u16 stx_mode;
        __u16 __spare0[1];
        __u64 stx_ino;
        __u64 stx_size;
        __u64 stx_blocks;
        __u64 __spare1[1];
        struct statx_timestamp stx_atime;
        struct statx_timestamp stx_btime;
        struct statx_timestamp stx_ctime;
        struct statx_timestamp stx_mtime;
        __u32 stx_rdev_major;
        __u32 stx_rdev_minor;
        __u32 stx_dev_major;
        __u32 stx_dev_minor;
        __u64 __spare2[14];
    };

    The defined bits in request_mask and stx_mask are:

    STATX_TYPE          Want/got stx_mode & S_IFMT
    STATX_MODE          Want/got stx_mode & ~S_IFMT
    STATX_NLINK         Want/got stx_nlink
    STATX_UID           Want/got stx_uid
    STATX_GID           Want/got stx_gid
    STATX_ATIME         Want/got stx_atime{,_ns}
    STATX_MTIME         Want/got stx_mtime{,_ns}
    STATX_CTIME         Want/got stx_ctime{,_ns}
    STATX_INO           Want/got stx_ino
    STATX_SIZE          Want/got stx_size
    STATX_BLOCKS        Want/got stx_blocks
    STATX_BASIC_STATS   [The stuff in the normal stat struct]
    STATX_BTIME         Want/got stx_btime{,_ns}
    STATX_ALL           [All currently available stuff]

    stx_btime is the file creation time, stx_mask is a bitmask indicating the
    data provided and __spare*[] are where as-yet undefined fields can be
    placed.

    Time fields are structures with separate seconds and nanoseconds fields
    plus a reserved field in case we want to add even finer resolution. Note
    that times will be negative if before 1970; in such a case, the nanosecond
    fields will also be negative if not zero.

    The bits defined in the stx_attributes field convey information about a
    file, how it is accessed, where it is and what it does. The following
    attributes map to FS_*_FL flags and are the same numerical value:

    STATX_ATTR_COMPRESSED   File is compressed by the fs
    STATX_ATTR_IMMUTABLE    File is marked immutable
    STATX_ATTR_APPEND       File is append-only
    STATX_ATTR_NODUMP       File is not to be dumped
    STATX_ATTR_ENCRYPTED    File requires key to decrypt in fs

    Within the kernel, the supported flags are listed by:

    KSTAT_ATTR_FS_IOC_FLAGS

    [Are any other IOC flags of sufficient general interest to be exposed
    through this interface?]

    New flags include:

    STATX_ATTR_AUTOMOUNT Object is an automount trigger

    These are for the use of GUI tools that might want to mark files specially,
    depending on what they are.
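
    A tool of that kind might turn stx_attributes into a short flag string.
    The sketch below assumes the STATX_ATTR_* bits from the installed
    <linux/stat.h>. The letter 'm' for automount follows the sample output
    later in this message; the other letters and the format_attrs() helper
    are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/stat.h>   /* STATX_ATTR_* bits */

/*
 * Write one letter per set attribute into buf (NUL-terminated) and
 * return the number of letters written. Unknown bits are ignored.
 */
static size_t format_attrs(unsigned long long attrs, char *buf, size_t len)
{
    static const struct { unsigned long long bit; char flag; } table[] = {
        { STATX_ATTR_COMPRESSED, 'c' },
        { STATX_ATTR_IMMUTABLE,  'i' },
        { STATX_ATTR_APPEND,     'a' },
        { STATX_ATTR_NODUMP,     'd' },
        { STATX_ATTR_ENCRYPTED,  'e' },
        { STATX_ATTR_AUTOMOUNT,  'm' },
    };
    size_t n = 0;

    assert(buf != NULL && len > 0);
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        /* keep one byte free for the terminating NUL */
        if ((attrs & table[i].bit) && n + 1 < len)
            buf[n++] = table[i].flag;
    }
    buf[n] = '\0';
    return n;
}
```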

    Fields in struct statx come in a number of classes:

    (0) stx_dev_*, stx_blksize.

    These are local system information and are always available.

    (1) stx_mode, stx_nlink, stx_uid, stx_gid, stx_[amc]time, stx_ino,
    stx_size, stx_blocks.

    These will be returned whether the caller asks for them or not. The
    corresponding bits in stx_mask will be set to indicate whether they
    actually have valid values.

    If the caller didn't ask for them, then they may be approximated. For
    example, NFS won't waste any time updating them from the server,
    unless as a byproduct of updating something requested.

    If the values don't actually exist for the underlying object (such as
    UID or GID on a DOS file), then the bit won't be set in the stx_mask,
    even if the caller asked for the value. In such a case, the returned
    value will be a fabrication.

    Note that there are instances where the type might not be valid, for
    instance Windows reparse points.

    (2) stx_rdev_*.

    This will be set only if stx_mode indicates we're looking at a
    blockdev or a chardev, otherwise will be 0.

    (3) stx_btime.

    Similar to (1), except this will be set to 0 if it doesn't exist.

    =======
    TESTING
    =======

    The following test program can be used to test the statx system call:

    samples/statx/test-statx.c

    Just compile and run, passing it paths to the files you want to examine.
    The file is built automatically if CONFIG_SAMPLES is enabled.

    Here's some example output. Firstly, an NFS directory that crosses to
    another FSID. Note that the AUTOMOUNT attribute is set because transiting
    this directory will cause d_automount to be invoked by the VFS.

    [root@andromeda ~]# /tmp/test-statx -A /warthog/data
    statx(/warthog/data) = 0
    results=7ff
    Size: 4096 Blocks: 8 IO Block: 1048576 directory
    Device: 00:26 Inode: 1703937 Links: 125
    Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041
    Access: 2016-11-24 09:02:12.219699527+0000
    Modify: 2016-11-17 10:44:36.225653653+0000
    Change: 2016-11-17 10:44:36.225653653+0000
    Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)

    Secondly, the result of automounting on that directory.

    [root@andromeda ~]# /tmp/test-statx /warthog/data
    statx(/warthog/data) = 0
    results=7ff
    Size: 4096 Blocks: 8 IO Block: 1048576 directory
    Device: 00:27 Inode: 2 Links: 125
    Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041
    Access: 2016-11-24 09:02:12.219699527+0000
    Modify: 2016-11-17 10:44:36.225653653+0000
    Change: 2016-11-17 10:44:36.225653653+0000

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

02 Mar, 2017

1 commit

  • Pull f2fs updates from Jaegeuk Kim:
    "This round introduces several interesting features such as on-disk NAT
    bitmaps, IO alignment, and a discard thread. And it includes a couple
    of major bug fixes as below.

    Enhancements:

    - introduce on-disk bitmaps to avoid scanning NAT blocks when getting
    free nids

    - support IO alignment to prepare open-channel SSD integration in
    future

    - introduce a discard thread to avoid long latency during checkpoint
    and fstrim

    - use SSR for warm node and enable inline_xattr by default

    - introduce in-memory bitmaps to check FS consistency for debugging

    - improve write_begin by avoiding needless read IO

    Bug fixes:

    - fix broken zone_reset behavior for SMR drive

    - fix wrong victim selection policy during GC

    - fix missing behavior when preparing discard commands

    - fix bugs in atomic write support and fiemap

    - workaround to handle multiple f2fs_add_link calls having same name

    ... and it includes a bunch of clean-up patches as well"

    * tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (97 commits)
    f2fs: avoid to flush nat journal entries
    f2fs: avoid to issue redundant discard commands
    f2fs: fix a plint compile warning
    f2fs: add f2fs_drop_inode tracepoint
    f2fs: Fix zoned block device support
    f2fs: remove redundant set_page_dirty()
    f2fs: fix to enlarge size of write_io_dummy mempool
    f2fs: fix memory leak of write_io_dummy mempool during umount
    f2fs: fix to update F2FS_{CP_}WB_DATA count correctly
    f2fs: use MAX_FREE_NIDS for the free nids target
    f2fs: introduce free nid bitmap
    f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint
    f2fs: update the comment of default nr_pages to skipping
    f2fs: drop the duplicate pval in f2fs_getxattr
    f2fs: Don't update the xattr data that same as the exist
    f2fs: kill __is_extent_same
    f2fs: avoid bggc->fggc when enough free segments are avaliable after cp
    f2fs: select target segment with closer temperature in SSR mode
    f2fs: show simple call stack in fault injection message
    f2fs: no need lock_op in f2fs_write_inline_data
    ...

    Linus Torvalds
     

01 Mar, 2017

1 commit

  • Pull ceph updates from Ilya Dryomov:
    "This time around we have:

    - support for rbd data-pool feature, which enables rbd images on
    erasure-coded pools (myself). CEPH_PG_MAX_SIZE has been bumped to
    allow erasure-coded profiles with k+m up to 32.

    - a patch for ceph_d_revalidate() performance regression introduced
    in 4.9, along with some cleanups in the area (Jeff Layton)

    - a set of fixes for unsafe ->d_parent accesses in CephFS (Jeff
    Layton)

    - buffered reads are now processed in rsize windows instead of rasize
    windows (Andreas Gerstmayr). The new default for rsize mount option
    is 64M.

    - ack vs commit distinction is gone, greatly simplifying ->fsync()
    and MOSDOpReply handling code (myself)

    ... also a few filesystem bug fixes from Zheng, a CRUSH sync up (CRUSH
    computations are still serialized though) and several minor fixes and
    cleanups all over"

    * tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client: (52 commits)
    libceph, rbd, ceph: WRITE | ONDISK -> WRITE
    libceph: get rid of ack vs commit
    ceph: remove special ack vs commit behavior
    ceph: tidy some white space in get_nonsnap_parent()
    crush: fix dprintk compilation
    crush: do is_out test only if we do not collide
    ceph: remove req from unsafe list when unregistering it
    rbd: constify device_type structure
    rbd: kill obj_request->object_name and rbd_segment_name_cache
    rbd: store and use obj_request->object_no
    rbd: RBD_V{1,2}_DATA_FORMAT macros
    rbd: factor out __rbd_osd_req_create()
    rbd: set offset and length outside of rbd_obj_request_create()
    rbd: support for data-pool feature
    rbd: introduce rbd_init_layout()
    rbd: use rbd_obj_bytes() more
    rbd: remove now unused rbd_obj_request_wait() and helpers
    rbd: switch rbd_obj_method_sync() to ceph_osdc_call()
    libceph: pass reply buffer length through ceph_osdc_call()
    rbd: do away with obj_request in rbd_obj_read_sync()
    ...

    Linus Torvalds
     

28 Feb, 2017

4 commits

  • Fix typos and add the following to the scripts/spelling.txt:

    an user||a user
    an userspace||a userspace

    I also added "userspace" to the list since it is a common word in Linux.
    I found some instances of "an userfaultfd", but I did not add it to the
    list. Finding every word that starts with "user", such as "userland",
    would be endless, so I had to draw a line somewhere.

    Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     
  • This is the same as bf72eda5 except that it's a different file. Sync
    documentation with changes made by 730c9eec in 2009.

    Link: http://lkml.kernel.org/r/148577165630.9801.6081791213151121657.stgit@pluto.themaw.net
    Signed-off-by: Tomohiro Kusumi
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomohiro Kusumi
     
  • This is the same as d8732841 except that it's a different file. A
    caller has no devid input, and devid is obtained via superblock.

    Link: http://lkml.kernel.org/r/148577165119.9801.16967562019122274820.stgit@pluto.themaw.net
    Signed-off-by: Tomohiro Kusumi
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomohiro Kusumi
     
  • Link: http://lkml.kernel.org/r/148577164606.9801.12571810310561599401.stgit@pluto.themaw.net
    Signed-off-by: Tomohiro Kusumi
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomohiro Kusumi
     

24 Feb, 2017

2 commits


23 Feb, 2017

1 commit


20 Feb, 2017

1 commit

  • This patch sets the io_pages bdi hint based on the rsize mount option.
    Without this patch, large buffered reads (request size > max readahead)
    are processed sequentially in chunks of the readahead size (i.e. read
    requests are sent out up to the readahead size, then the
    do_generic_file_read() function waits until the first page is received).

    With this patch, read requests are sent out at once, up to the size
    specified in the rsize mount option (default: 64 MB).
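
    The relationship between rsize and a page-count hint can be sketched
    as below. This is only an illustration, not the patch itself: it
    assumes 4 KiB pages, assumes the hint is the rsize rounded up to whole
    pages, and bytes_to_pages() is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>

/* Assumed page size for illustration; the real value depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Round a byte count up to whole pages.
 * Hypothetical helper, not the kernel's code.
 */
static unsigned long bytes_to_pages(unsigned long bytes)
{
    /* Guard against overflow in the round-up addition below. */
    assert(bytes <= ULONG_MAX - (SKETCH_PAGE_SIZE - 1));
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

    Under these assumptions, and reading 64 MB as 64 MiB, the default
    rsize corresponds to 16384 pages.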

    Signed-off-by: Andreas Gerstmayr
    Acked-by: Jeff Layton
    Signed-off-by: Yan, Zheng

    Andreas Gerstmayr
     

18 Feb, 2017

1 commit

  • Change module filename from af-rxrpc.ko to rxrpc.ko so as to be consistent
    with the other protocol drivers.

    Also adjust the documentation to reflect this.

    Further, there is no longer a standalone rxkad module, as it has been
    merged into the rxrpc core, so get rid of references to that.

    Reported-by: Marc Dionne
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

29 Jan, 2017

1 commit


25 Jan, 2017

1 commit

  • Commit bc3e53f682d9 ("mm: distinguish between mlocked and pinned pages")
    added VmPin in /proc/<pid>/status. Report that in
    Documentation/filesystems/proc.txt.

    Also move Umask after Name to keep correct order.
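
    For readers who want to pick the VmPin value out of status-style
    text, a minimal sketch follows. It assumes the usual
    "Key:<whitespace><number> kB" line layout; status_field_kb() is a
    hypothetical helper, not part of any kernel or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up a "Key:   <number> kB" field in status-style text.
 * Returns 0 and stores the number on success, -1 if the key is
 * absent or has no numeric value. Hypothetical helper for illustration.
 */
static int status_field_kb(const char *text, const char *key, long *kb)
{
    size_t klen;
    const char *line = text;

    assert(text && key && kb);
    klen = strlen(key);

    while (line && *line) {
        /* A field matches when the line starts with "key:". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return sscanf(line + klen + 1, "%ld", kb) == 1 ? 0 : -1;
        /* Otherwise skip to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

    The text would typically be the contents of /proc/self/status read
    into a NUL-terminated buffer, with "VmPin" passed as the key.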

    Link: http://lkml.kernel.org/r/20170114201219.30387-1-fabf@skynet.be
    Signed-off-by: Fabian Frederick
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

18 Dec, 2016

2 commits

  • …/linux/kernel/git/mszeredi/vfs

    Pull partial readlink cleanups from Miklos Szeredi.

    This is the uncontroversial part of the readlink cleanup patch-set that
    simplifies the default readlink handling.

    Miklos and Al are still discussing the rest of the series.

    * git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    vfs: make generic_readlink() static
    vfs: remove ".readlink = generic_readlink" assignments
    vfs: default to generic_readlink()
    vfs: replace calling i_op->readlink with vfs_readlink()
    proc/self: use generic_readlink
    ecryptfs: use vfs_get_link()
    bad_inode: add missing i_op initializers

    Linus Torvalds
     
  • Pull more vfs updates from Al Viro:
    "In this pile:

    - autofs-namespace series
    - dedupe stuff
    - more struct path constification"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
    ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
    ocfs2: charge quota for reflinked blocks
    ocfs2: fix bad pointer cast
    ocfs2: always unlock when completing dio writes
    ocfs2: don't eat io errors during _dio_end_io_write
    ocfs2: budget for extent tree splits when adding refcount flag
    ocfs2: prohibit refcounted swapfiles
    ocfs2: add newlines to some error messages
    ocfs2: convert inode refcount test to a helper
    simple_write_end(): don't zero in short copy into uptodate
    exofs: don't mess with simple_write_{begin,end}
    9p: saner ->write_end() on failing copy into non-uptodate page
    fix gfs2_stuffed_write_end() on short copies
    fix ceph_write_end()
    nfs_write_end(): fix handling of short copies
    vfs: refactor clone/dedupe_file_range common functions
    fs: try to clone files first in vfs_copy_file_range
    vfs: misc struct path constification
    namespace.c: constify struct path passed to a bunch of primitives
    quota: constify struct path in quota_on
    ...

    Linus Torvalds
     

17 Dec, 2016

2 commits

  • Pull overlayfs updates from Miklos Szeredi:
    "This update contains:

    - try to clone on copy-up

    - allow renaming a directory

    - split source into manageable chunks

    - misc cleanups and fixes

    It does not contain the read-only fd data inconsistency fix, which Al
    didn't like. I'll leave that to the next year..."

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (36 commits)
    ovl: fix reStructuredText syntax errors in documentation
    ovl: fix return value of ovl_fill_super
    ovl: clean up kstat usage
    ovl: fold ovl_copy_up_truncate() into ovl_copy_up()
    ovl: create directories inside merged parent opaque
    ovl: opaque cleanup
    ovl: show redirect_dir mount option
    ovl: allow setting max size of redirect
    ovl: allow redirect_dir to default to "on"
    ovl: check for emptiness of redirect dir
    ovl: redirect on rename-dir
    ovl: lookup redirects
    ovl: consolidate lookup for underlying layers
    ovl: fix nested overlayfs mount
    ovl: check namelen
    ovl: split super.c
    ovl: use d_is_dir()
    ovl: simplify lookup
    ovl: check lower existence of rename target
    ovl: rename: simplify handling of lower/merged directory
    ...

    Linus Torvalds
     
  • Pull vfs updates from Al Viro:

    - more ->d_init() stuff (work.dcache)

    - pathname resolution cleanups (work.namei)

    - a few missing iov_iter primitives - copy_from_iter_full() and
    friends. Either copy the full requested amount, advance the iterator
    and return true, or fail, return false and do _not_ advance the
    iterator. Quite a few open-coded callers converted (and became more
    readable and harder to fuck up that way) (work.iov_iter)

    - several assorted patches, the big one being logfs removal
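
    The all-or-nothing contract described for copy_from_iter_full() can be
    illustrated with a userspace toy. This is a sketch only: struct
    toy_iter and toy_copy_from_iter_full() are hypothetical stand-ins, not
    the kernel's struct iov_iter or its primitives, and the only failure
    modelled here is a source that is too short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an iterator over a source buffer. */
struct toy_iter {
    const char *data;   /* start of the remaining data */
    size_t      count;  /* bytes remaining */
};

/*
 * Copy exactly `bytes` from the iterator, or nothing at all.
 * On success: advance the iterator and return true.
 * On failure: leave the iterator untouched and return false.
 */
static bool toy_copy_from_iter_full(void *dst, size_t bytes, struct toy_iter *it)
{
    assert(dst && it);
    if (bytes > it->count)
        return false;           /* short source: do not advance */
    memcpy(dst, it->data, bytes);
    it->data  += bytes;
    it->count -= bytes;
    return true;
}
```

    A caller then needs only a single boolean check, with no partially
    advanced iterator to unwind on failure.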

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    logfs: remove from tree
    vfs: fix put_compat_statfs64() does not handle errors
    namei: fold should_follow_link() with the step into not-followed link
    namei: pass both WALK_GET and WALK_MORE to should_follow_link()
    namei: invert WALK_PUT logics
    namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
    namei: saner calling conventions for mountpoint_last()
    namei.c: get rid of user_path_parent()
    switch getfrag callbacks to ..._full() primitives
    make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
    [iov_iter] new primitives - copy_from_iter_full() and friends
    don't open-code file_inode()
    ceph: switch to use of ->d_init()
    ceph: unify dentry_operations instances
    lustre: switch to use of ->d_init()

    Linus Torvalds
     

16 Dec, 2016

4 commits

  • - Fix broken long line block quote
    - Fix missing newline before bullets list
    - Use correct numbered list syntax

    Signed-off-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi

    Amir Goldstein
     
  • Current code returns EXDEV when a directory would need to be copied up to
    move. We could copy up the directory tree in this case, but there's
    another, simpler solution: point to old lower directory from moved upper
    directory.

    This is achieved with a "trusted.overlay.redirect" xattr storing the path
    relative to the root of the overlay. After such an attribute has been set,
    the directory can be moved without further actions required.
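
    As a rough illustration of how a root-relative redirect value could be
    mapped back onto a lower layer, consider the sketch below. It is not
    the overlayfs lookup code: lower_path_for_redirect() is a hypothetical
    helper, and it assumes the stored value starts with '/' and is simply
    appended to the lower layer's root path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the path of the old lower directory from a redirect value.
 * Returns 0 on success, -1 if the value is not root-relative or the
 * result does not fit in `out`. Hypothetical helper for illustration.
 */
static int lower_path_for_redirect(const char *lower_root, const char *redirect,
                                   char *out, size_t size)
{
    int n;

    assert(lower_root && redirect && out);
    if (redirect[0] != '/')
        return -1;      /* this sketch only handles root-relative values */
    n = snprintf(out, size, "%s%s", lower_root, redirect);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

    For example, a lower root of "/lower" and a redirect value of "/a/b"
    would yield "/lower/a/b" under these assumptions.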

    This is a backward-incompatible feature: old kernels won't be able to
    correctly mount an overlay containing redirected directories.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • The quirk for file locks and leases no longer applies.

    Add missing info about renaming directory residing on lower layer.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Pull PCI updates from Bjorn Helgaas:
    "PCI changes:

    - add support for PCI on ARM64 boxes with ACPI. We already had this
    for theoretical spec-compliant hardware; now we're adding quirks
    for the actual hardware (Cavium, HiSilicon, Qualcomm, X-Gene)

    - add runtime PM support for hotplug ports

    - enable runtime suspend for Intel UHCI that uses platform-specific
    wakeup signaling

    - add yet another host bridge registration interface. We hope this is
    extensible enough to subsume the others

    - expose device revision in sysfs for DRM

    - to avoid device conflicts, make sure any VF BAR updates are done
    before enabling the VF

    - avoid unnecessary link retrains for ASPM

    - allow INTx masking on Mellanox devices that support it

    - allow access to non-standard VPD for Chelsio devices

    - update Broadcom iProc support for PAXB v2, PAXC v2, inbound DMA,
    etc

    - update Rockchip support for max-link-speed

    - add NVIDIA Tegra210 support

    - add Layerscape LS1046a support

    - update R-Car compatibility strings

    - add Qualcomm MSM8996 support

    - remove some uninformative bootup messages"

    * tag 'pci-v4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (115 commits)
    PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
    PCI: Expand "VPD access disabled" quirk message
    PCI: pciehp: Remove loading message
    PCI: hotplug: Remove hotplug core message
    PCI: Remove service driver load/unload messages
    PCI/AER: Log AER IRQ when claiming Root Port
    PCI/AER: Log errors with PCI device, not PCIe service device
    PCI/AER: Remove unused version macros
    PCI/PME: Log PME IRQ when claiming Root Port
    PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors
    PCI: Move config space size macros to pci_regs.h
    x86/platform/intel-mid: Constify mid_pci_platform_pm
    PCI/ASPM: Don't retrain link if ASPM not possible
    PCI: iproc: Skip check for legacy IRQ on PAXC buses
    PCI: pciehp: Leave power indicator on when enabling already-enabled slot
    PCI: pciehp: Prioritize data-link event over presence detect
    PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar
    PCI: rcar: Use gen2 fallback compatibility last
    PCI: rcar-gen2: Use gen2 fallback compatibility last
    PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()
    ..

    Linus Torvalds
     

15 Dec, 2016

2 commits

  • Pull xfs updates from Dave Chinner:
    "There is quite a varied bunch of stuff in this update, and some of it
    you will have already merged through the ext4 tree which imported the
    dax-4.10-iomap-pmd topic branch from the XFS tree.

    There is also a new direct IO implementation that uses the iomap
    infrastructure. It's much simpler, faster, and has lower IO latency
    than the existing direct IO infrastructure.

    Summary:
    - DAX PMD faults via iomap infrastructure
    - Direct-io support in iomap infrastructure
    - removal of now-redundant XFS inode iolock, replaced with VFS
    i_rwsem
    - synchronisation with fixes and changes in userspace libxfs code
    - extent tree lookup helpers
    - lots of little corruption detection improvements to verifiers
    - optimised CRC calculations
    - faster buffer cache lookups
    - deprecation of barrier/nobarrier mount options - we always use
    REQ_FUA/REQ_FLUSH where appropriate for data integrity now
    - cleanups to speculative preallocation
    - miscellaneous minor bug fixes and cleanups"

    * tag 'xfs-for-linus-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (63 commits)
    xfs: nuke unused tracepoint definitions
    xfs: use GPF_NOFS when allocating btree cursors
    xfs: use xfs_vn_setattr_size to check on new size
    xfs: deprecate barrier/nobarrier mount option
    xfs: Always flush caches when integrity is required
    xfs: ignore leaf attr ichdr.count in verifier during log replay
    xfs: use rhashtable to track buffer cache
    xfs: optimise CRC updates
    xfs: make xfs btree stats less huge
    xfs: don't cap maximum dedupe request length
    xfs: don't allow di_size with high bit set
    xfs: error out if trying to add attrs and anextents > 0
    xfs: don't crash if reading a directory results in an unexpected hole
    xfs: complain if we don't get nextents bmap records
    xfs: check for bogus values in btree block headers
    xfs: forbid AG btrees with level == 0
    xfs: several xattr functions can be void
    xfs: handle cow fork in xfs_bmap_trace_exlist
    xfs: pass state not whichfork to trace_xfs_extlist
    xfs: Move AGI buffer type setting to xfs_read_agi
    ...

    Linus Torvalds
     
  • Logfs was introduced to the kernel in 2009, and hasn't seen any non
    drive-by changes since 2012, while having lots of unsolved issues
    including the complete lack of error handling, with more and more
    issues popping up without any fixes.

    The logfs.org domain has been bouncing mail, and the maintainer
    on the non-logfs.org domain hasn't responded to past queries either.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig