12 Oct, 2016

19 commits

  • Using const is generally a good idea.

    Julia Lawall has created a list of always const and almost always const
    structs in the kernel sources.

    Link: https://lkml.org/lkml/2016/8/28/95

    Add the most frequently used structs (> 50 cases) that are always or
    almost always const.

    Link: http://lkml.kernel.org/r/1e16020f8027654db0095bbfbcc11da51025365c.1472664220.git.joe@perches.com
    Signed-off-by: Joe Perches
    Acked-by: Kees Cook
    Cc: Julia Lawall
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Make it easier to add new structs that should be const.

    Link: http://lkml.kernel.org/r/e5a8da43e7c11525bafbda1ca69a8323614dd942.1472664220.git.joe@perches.com
    Signed-off-by: Joe Perches
    Cc: Julia Lawall
    Cc: Kees Cook
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • < sigh > Comment these tests out.

    These are just too enticing to people that don't verify that
    both source and dest addresses really must be __aligned(2).

    It helps make Dan Carpenter happy too.

    Link: http://lkml.kernel.org/r/dc32ec66d24647f4cdf824c8dfbbc59aa7ce7b7d.1472665676.git.joe@perches.com
    Signed-off-by: Joe Perches
    Cc: Dan Carpenter
    Cc: Greg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Warn when block comments are not aligned on the *

    /*
     * block comment, no warning
     */

    /*
    * block comment, emit warning
    */

    Link: http://lkml.kernel.org/r/edb57bd330adfe024b95ec2a807d4aa7f0c8b112.1472261299.git.joe@perches.com
    Signed-off-by: Joe Perches
    Reported-by: Sudip Mukherjee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • S_ uses should be avoided where octal is more intelligible.

    Linus didst say:

    : It's *much* easier to parse and understand the octal numbers, while the
    : symbolic macro names are just random line noise and hard as hell to
    : understand. You really have to think about it.
    :
    : So we should rather go the other way: convert existing bad symbolic
    : permission bit macro use to just use the octal numbers.
    :
    : The symbolic names are good for the *other* bits (ie sticky bit, and the
    : inode mode _type_ numbers etc), but for the permission bits, the symbolic
    : names are just insane crap. Nobody sane should ever use them. Not in the
    : kernel, not in user space.
    (http://lkml.kernel.org/r/CA+55aFw5v23T-zvDZp-MmD_EYxF8WbafwwB59934FV7g21uMGQ@mail.gmail.com)
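
    To illustrate the point of the quote, here is a minimal userspace sketch
    (not part of the patch) that spells the same permission both ways. It uses
    the POSIX macros from <sys/stat.h> rather than the kernel's S_IRUGO-style
    shorthands, and both helper names are made up for this example.

```c
#include <sys/stat.h>

/* Hypothetical helper: owner read/write, group and others read-only. */
static unsigned int symbolic_rw_r_r(void)
{
	return S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
}

/* The same permission written as the octal number the quote prefers. */
static unsigned int octal_rw_r_r(void)
{
	return 0644;
}
```

    Both helpers return the same value; the octal form can be read at a glance,
    while the symbolic form has to be decoded one macro at a time.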

    Link: http://lkml.kernel.org/r/7232ef011d05a92f4caa86a5e9830d87966a2eaf.1470180926.git.joe@perches.com
    Signed-off-by: Joe Perches
    Cc: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Use get_maintainer to check the status of individual files. If
    "obsolete", suggest leaving the files alone.

    Link: http://lkml.kernel.org/r/7ceaa510dc9d2df05ec4b456baed7bb1415550b3.1471889575.git.joe@perches.com
    Signed-off-by: Joe Perches
    Cc: SF Markus Elfring
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Today there are platforms with many CPUs (up to 4K). Trying to boot only
    part of the CPUs may result in a very long string.

    For example, take the NPS platform that is part of arch/arc. This
    platform has an SMP system with 256 cores, each with 16 HW threads (an SMT
    machine), where each HW thread appears as a CPU to the kernel. In this
    example there is a total of 4K CPUs. When one tries to boot only part of
    the HW threads from each core, the string representing the map may be
    long. For example, if for the sake of performance we decide to boot only
    the first half of the HW threads of each core, the map will look like:
    0-7,16-23,32-39,...,4080-4087

    This patch introduces new syntax to accommodate such a use case. I
    added an optional postfix to a range of CPUs which selects, according
    to a given modulo, the desired range of remainders, i.e.:

    :used_size/group_size

    For example, the above map can be described in the new syntax like this:
    0-4095:8/16

    Note that this patch is backward compatible with current syntax.
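
    The selection rule can be sketched in plain C as follows. This is an
    illustration only, not the kernel's parser: mark_range() and count_marked()
    are hypothetical names, the map is a simple array of bool instead of a
    bitmap, and groups are assumed to be counted from the first CPU of the
    range.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustrative sketch only, not the kernel implementation.
 * Marks CPUs in [first, last] where, within every group of
 * group_size consecutive CPUs counted from 'first', only the
 * first used_size CPUs are selected.
 * 'map' must have room for at least last + 1 entries.
 */
static void mark_range(bool *map, unsigned int first, unsigned int last,
		       unsigned int used_size, unsigned int group_size)
{
	unsigned int cpu;

	if (group_size == 0)
		return;
	for (cpu = first; cpu <= last; cpu++) {
		if ((cpu - first) % group_size < used_size)
			map[cpu] = true;
	}
}

/* Count the selected CPUs among the first 'n' entries of the map. */
static unsigned int count_marked(const bool *map, size_t n)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		count += map[i];
	return count;
}
```

    Calling mark_range(map, 0, 4095, 8, 16) on a zeroed 4096-entry map selects
    CPUs 0-7, 16-23 and so on up to 4080-4087, which is half of all CPUs.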

    [akpm@linux-foundation.org: rework documentation]
    Link: http://lkml.kernel.org/r/1473579629-4283-1-git-send-email-noamca@mellanox.com
    Signed-off-by: Noam Camus
    Cc: David Decotigny
    Cc: Ben Hutchings
    Cc: David S. Miller
    Cc: Pan Xinhui
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Noam Camus
     
  • Set "overflow" bit upon encountering it instead of postponing to the end
    of the conversion. Somehow gcc unwedges itself and generates better code:

    $ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
    _parse_integer 177 139 -38
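
    A minimal userspace sketch of the idea, assuming a 64-bit unsigned long
    long. It is not the kernel's _parse_integer(): the function name
    parse_unsigned() and the flag OVERFLOW_FLAG are hypothetical, and the
    sketch only shows the overflow bit being set at the point where the
    overflow is detected.

```c
#include <limits.h>

#define OVERFLOW_FLAG (1U << 31)	/* hypothetical flag bit for this sketch */

/*
 * Parse digits of 's' in the given base (2..16) into *result.
 * Returns the number of characters consumed, with OVERFLOW_FLAG
 * OR-ed in as soon as an overflow is detected.
 */
static unsigned int parse_unsigned(const char *s, unsigned int base,
				   unsigned long long *result)
{
	unsigned long long res = 0;
	unsigned int consumed = 0;

	for (;;) {
		unsigned int c = (unsigned char)*s;
		unsigned int lc = c | 0x20;	/* fold ASCII letters to lower case */
		unsigned int val;

		if (c >= '0' && c <= '9')
			val = c - '0';
		else if (lc >= 'a' && lc <= 'f')
			val = lc - 'a' + 10;
		else
			break;
		if (val >= base)
			break;

		/* Flag the overflow right here instead of after the loop. */
		if (res > (ULLONG_MAX - val) / base)
			consumed |= OVERFLOW_FLAG;
		res = res * base + val;
		consumed++;
		s++;
	}
	*result = res;
	return consumed;
}
```

    A caller masks off OVERFLOW_FLAG to get the character count and treats a
    set flag as an out-of-range input.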

    Inspired by patch from Zhaoxiu Zeng.

    Link: http://lkml.kernel.org/r/20160826221920.GA1909@p183.telecom.by
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Make isdigit into a simple range checking inline function:

    return '0' <= c && c <= '9';
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • The strncpy_from_user() accessor is effectively a copy_from_user()
    specialised to copy strings, terminating early at a NUL byte if possible.
    In other respects it is identical, and can be used to copy an arbitrarily
    large buffer from userspace into the kernel. Conceptually, it exposes a
    similar attack surface.

    As with copy_from_user(), we check the destination range when the kernel
    is built with KASAN, but unlike copy_from_user() we do not check the
    destination buffer when using HARDENED_USERCOPY. As strncpy_from_user()
    calls get_user() in a loop, we must call check_object_size() explicitly.

    This patch adds this instrumentation to strncpy_from_user(), per the same
    rationale as with the regular copy_from_user(). In the absence of
    hardened usercopy this will have no impact as the instrumentation expands
    to an empty static inline function.

    Link: http://lkml.kernel.org/r/1472221903-31181-1-git-send-email-mark.rutland@arm.com
    Signed-off-by: Mark Rutland
    Cc: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Rutland
     
  • The pthread_mutex_t in regression1.c wasn't being initialized properly.
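
    The commit text does not say how the initialization was fixed. As a
    reminder, these are the two standard POSIX ways to initialize a
    pthread_mutex_t; the variable and function names below are hypothetical
    and are not taken from regression1.c.

```c
#include <pthread.h>
#include <stddef.h>

/* Statically allocated mutex, initialized at compile time. */
static pthread_mutex_t static_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lock and unlock the static mutex; returns 0 on success. */
static int use_static_lock(void)
{
	int err = pthread_mutex_lock(&static_lock);

	if (err)
		return err;
	return pthread_mutex_unlock(&static_lock);
}

/* Dynamically initialized mutex: init before first use, destroy after. */
static int use_dynamic_lock(void)
{
	pthread_mutex_t lock;
	int err = pthread_mutex_init(&lock, NULL);

	if (err)
		return err;
	err = pthread_mutex_lock(&lock);
	if (!err)
		err = pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return err;
}
```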

    Link: http://lkml.kernel.org/r/20160815194237.25967-4-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler
    Cc: Konstantin Khlebnikov
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Shuah Khan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     
  • There are four cases I can see where we could end up with a NULL 'slot' in
    radix_tree_next_slot(). This unit test exercises all four of them, making
    sure that if in the future we have an unsafe path through
    radix_tree_next_slot(), we'll catch it.

    Here are details on the four cases:

    1) radix_tree_iter_retry() via a non-tagged iteration like
    radix_tree_for_each_slot(). In this case we currently aren't seeing a bug
    because radix_tree_iter_retry() sets

    iter->next_index = iter->index;

    which means that in the else case in radix_tree_next_slot(), 'count' is
    zero, so we skip over the while() loop and effectively just return NULL
    without ever dereferencing 'slot'.

    2) radix_tree_iter_retry() via tagged iteration like
    radix_tree_for_each_tagged(). This case was giving us NULL pointer
    dereferences in testing, and was fixed with this commit:

    commit 3cb9185c6730 ("radix-tree: fix radix_tree_iter_retry() for tagged
    iterators.")

    This fix doesn't explicitly check for 'slot' being NULL, though; it works
    around the NULL pointer dereference by instead zeroing iter->tags in
    radix_tree_iter_retry(), which makes us bail out of the if() case in
    radix_tree_next_slot() before we dereference 'slot'.

    3) radix_tree_iter_next() via a non-tagged iteration like
    radix_tree_for_each_slot(). This currently happens in shmem_tag_pins()
    and shmem_partial_swap_usage().

    As with non-tagged iteration, 'count' in the else case of
    radix_tree_next_slot() is zero, so we skip over the while() loop and
    effectively just return NULL without ever dereferencing 'slot'.

    4) radix_tree_iter_next() via tagged iteration like
    radix_tree_for_each_tagged(). This happens in shmem_wait_for_pins().

    radix_tree_iter_next() zeros out iter->tags, so we end up exiting
    radix_tree_next_slot() here:

    if (flags & RADIX_TREE_ITER_TAGGED) {
            void *canon = slot;

            iter->tags >>= 1;
            if (unlikely(!iter->tags))
                    return NULL;

    Link: http://lkml.kernel.org/r/20160815194237.25967-3-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler
    Cc: Konstantin Khlebnikov
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Shuah Khan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     
  • There are four cases I can see where we could end up with a NULL 'slot' in
    radix_tree_next_slot(). Yet radix_tree_next_slot() never actually checks
    whether 'slot' is NULL. It just happens that for the cases where 'slot'
    is NULL, some other combination of factors prevents us from dereferencing
    it.

    It would be very easy for someone to unwittingly change one of these
    factors without realizing that we are implicitly depending on it to save
    us from a NULL pointer dereference.

    Add a comment documenting the things that allow 'slot' to be safely passed
    as NULL to radix_tree_next_slot().

    Here are details on the four cases:

    1) radix_tree_iter_retry() via a non-tagged iteration like
    radix_tree_for_each_slot(). In this case we currently aren't seeing a bug
    because radix_tree_iter_retry() sets

    iter->next_index = iter->index;

    which means that in the else case in radix_tree_next_slot(), 'count' is
    zero, so we skip over the while() loop and effectively just return NULL
    without ever dereferencing 'slot'.

    2) radix_tree_iter_retry() via tagged iteration like
    radix_tree_for_each_tagged(). This case was giving us NULL pointer
    dereferences in testing, and was fixed with this commit:

    commit 3cb9185c6730 ("radix-tree: fix radix_tree_iter_retry() for tagged
    iterators.")

    This fix doesn't explicitly check for 'slot' being NULL, though; it works
    around the NULL pointer dereference by instead zeroing iter->tags in
    radix_tree_iter_retry(), which makes us bail out of the if() case in
    radix_tree_next_slot() before we dereference 'slot'.

    3) radix_tree_iter_next() via a non-tagged iteration like
    radix_tree_for_each_slot(). This currently happens in shmem_tag_pins()
    and shmem_partial_swap_usage().

    As with non-tagged iteration, 'count' in the else case of
    radix_tree_next_slot() is zero, so we skip over the while() loop and
    effectively just return NULL without ever dereferencing 'slot'.

    4) radix_tree_iter_next() via tagged iteration like
    radix_tree_for_each_tagged(). This happens in shmem_wait_for_pins().

    radix_tree_iter_next() zeros out iter->tags, so we end up exiting
    radix_tree_next_slot() here:

    if (flags & RADIX_TREE_ITER_TAGGED) {
            void *canon = slot;

            iter->tags >>= 1;
            if (unlikely(!iter->tags))
                    return NULL;

    Link: http://lkml.kernel.org/r/20160815194237.25967-2-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler
    Cc: Konstantin Khlebnikov
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Shuah Khan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     
  • The select(2) syscall performs a kmalloc(size, GFP_KERNEL) where size grows
    with the number of fds passed. We had a customer report page allocation
    failures of order-4 for this allocation. This is a costly order, so it might
    easily fail, as the VM expects such allocation to have a lower-order fallback.

    A trivial fallback is vmalloc(), as the memory doesn't have to be physically
    contiguous and the allocation is temporary, lasting only for the duration of
    the syscall. There were some concerns about whether this would have a negative
    impact on the system by exposing vmalloc() to userspace. Although excessive use
    of vmalloc() can cause some system-wide performance issues (TLB flushes etc.), a
    large-order allocation is not free either, and excessive reclaim/compaction can
    have a similar effect. Also note that the size is effectively limited by
    RLIMIT_NOFILE, which defaults to 1024 on the systems I checked. That means the
    bitmaps will fit well within a single page, and thus the vmalloc() fallback can
    only be exercised for processes where root allows a higher limit.

    Note that the poll(2) syscall seems to use a linked list of order-0 pages, so
    it doesn't need this kind of fallback.
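
    The size argument can be checked with a little arithmetic. The sketch below
    is a simplified model, not the kernel code: it assumes select() needs six
    fd bitmaps per call (three input sets and three result sets), each rounded
    up to whole longs, and the helper names are hypothetical.

```c
#include <stddef.h>

#define BITMAPS_PER_SELECT 6	/* assumed: in, out, ex plus three result sets */

/* Bytes needed for one fd bitmap, rounded up to whole longs. */
static size_t fd_bitmap_bytes(size_t nfds)
{
	size_t bits_per_long = 8 * sizeof(long);
	size_t longs = (nfds + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(long);
}

/* Total bytes for all bitmaps used by one select() call in this model. */
static size_t select_alloc_bytes(size_t nfds)
{
	return BITMAPS_PER_SELECT * fd_bitmap_bytes(nfds);
}
```

    Under these assumptions, 1024 descriptors need 768 bytes, well below a
    4 KiB page, while 65536 descriptors need 49152 bytes, which is more than
    eight 4 KiB pages and therefore a high-order request if it has to be
    physically contiguous.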

    [eric.dumazet@gmail.com: fix failure path logic]
    [akpm@linux-foundation.org: use proper type for size]
    Link: http://lkml.kernel.org/r/20160927084536.5923-1-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Alexander Viro
    Cc: Eric Dumazet
    Cc: David Laight
    Cc: Hillf Danton
    Cc: Nicholas Piggin
    Cc: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • After much discussion, it seems that the fallocate feature flag
    FALLOC_FL_ZERO_RANGE maps nicely to SCSI WRITE SAME; and the feature
    FALLOC_FL_PUNCH_HOLE maps nicely to the devices that have been whitelisted
    for zeroing SCSI UNMAP. Punch still requires that FALLOC_FL_KEEP_SIZE is
    set. A length that goes past the end of the device will be clamped to the
    device size if KEEP_SIZE is set; or will return -EINVAL if not. Both
    start and length must be aligned to the device's logical block size.

    Since the semantics of fallocate are fairly well established already, wire
    up the two pieces. The other fallocate variants (collapse range, insert
    range, and allocate blocks) are not supported.
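
    The end-of-device rule described above can be sketched as a small pure
    function. This is a model of the described behaviour, not the kernel code:
    check_range() is a hypothetical name, SKETCH_KEEP_SIZE stands in for
    FALLOC_FL_KEEP_SIZE, rejecting an empty or out-of-device start is an
    assumption, and the block-size alignment check is left out.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_KEEP_SIZE 0x01	/* hypothetical stand-in for FALLOC_FL_KEEP_SIZE */

/*
 * Validate a byte range against a device size.
 * Returns 0 (possibly after shrinking *len) or -EINVAL.
 */
static int check_range(uint64_t dev_size, int mode,
		       uint64_t start, uint64_t *len)
{
	/* Assumption of this sketch: empty or out-of-device ranges fail. */
	if (*len == 0 || start >= dev_size)
		return -EINVAL;

	/* Range runs past the end: clamp with KEEP_SIZE, reject otherwise. */
	if (*len > dev_size - start) {
		if (!(mode & SKETCH_KEEP_SIZE))
			return -EINVAL;
		*len = dev_size - start;
	}
	return 0;
}
```

    For an 8192-byte device, a 4096-byte request starting at offset 6144 is
    clamped to 2048 bytes when the keep-size flag is set and fails with
    -EINVAL when it is not.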

    Link: http://lkml.kernel.org/r/147518379992.22791.8849838163218235007.stgit@birch.djwong.org
    Signed-off-by: Darrick J. Wong
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Bart Van Assche
    Cc: Theodore Ts'o
    Cc: Martin K. Petersen
    Cc: Mike Snitzer # tweaked header
    Cc: Brian Foster
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
     
  • Make sure that the offset and length arguments that we're using to
    construct WRITE SAME and DISCARD requests are actually aligned to the
    logical block size. Failure to do this causes other errors in other parts
    of the block layer or the SCSI layer because disks don't support partial
    logical block writes.
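
    A minimal sketch of such an alignment test, assuming the logical block size
    is a power of two. The function name is hypothetical and the code is an
    illustration, not the block layer's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if both offset and length are multiples of block_size (a power of two). */
static bool range_is_block_aligned(uint64_t offset, uint64_t length,
				   uint64_t block_size)
{
	return ((offset | length) & (block_size - 1)) == 0;
}
```

    A range at offset 512 with length 1024 passes for a 512-byte logical block
    size but fails for a 4096-byte one, which is why a fixed 512-byte check is
    not enough on 4k-LBA devices.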

    Link: http://lkml.kernel.org/r/147518379026.22791.4437508871355153928.stgit@birch.djwong.org
    Signed-off-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Cc: Theodore Ts'o
    Cc: Mike Snitzer # tweaked header
    Cc: Brian Foster
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
     
  • Patch series "fallocate for block devices", v11.

    This is a patchset to fix page cache coherency with BLKZEROOUT and
    implement fallocate for block devices.

    The first patch is a fix to the existing BLKZEROOUT ioctl to invalidate
    the page cache if the zeroing command to the underlying device succeeds.
    Without this patch we still have the pagecache coherence bug that's been
    in the kernel forever.

    The second patch changes the internal block device functions to reject
    attempts to discard or zeroout that are not aligned to the logical block
    size. Previously, we only checked that the start/len parameters were
    512-byte aligned, which caused kernel BUG_ONs for unaligned IOs to 4k-LBA
    devices.

    The third patch creates an fallocate handler for block devices, wires up
    the FALLOC_FL_PUNCH_HOLE flag to zeroing-discard, and connects
    FALLOC_FL_ZERO_RANGE to write-same so that we can have a consistent
    fallocate interface between files and block devices. It also allows the
    combination of PUNCH_HOLE and NO_HIDE_STALE to invoke non-zeroing discard.

    Test cases for the new block device fallocate are now in xfstests as
    generic/349-351.

    This patch (of 3):

    Invalidate the page cache (as a regular O_DIRECT write would do) to avoid
    returning stale cache contents at a later time.

    Link: http://lkml.kernel.org/r/147518378313.22791.16649519283678515021.stgit@birch.djwong.org
    Signed-off-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Cc: Theodore Ts'o
    Cc: Mike Snitzer
    Cc: Brian Foster
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
     
  • In the dlm_migrate_request_handler(), when `ret' is -EEXIST, the mle
    should be freed, otherwise the memory will be leaked.

    Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4A3D3522A@H3CMLB12-EX.srv.huawei-3com.com
    Signed-off-by: Guozhonghua
    Reviewed-by: Mark Fasheh
    Cc: Eric Ren
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guozhonghua
     
  • it actually worked only when requested area ended on the page boundary...

    Reported-by: Marco Grassi
    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

11 Oct, 2016

21 commits

  • Pull networking fixes from David Miller:

    1) Netfilter list handling fix, from Linus.

    2) RXRPC/AFS bug fixes (oops on call to serviceless endpoints, build
    warnings, missing notifications, etc.) From David Howells.

    3) Kernel log message missing newlines, from Colin Ian King.

    4) Don't enter direct reclaim in netlink dumps, the idea is to use a
    high order allocation first and fallback quickly to a 0-order
    allocation if such a high-order one cannot be done cheaply and
    without reclaim. From Eric Dumazet.

    5) Fix firmware download errors in btusb bluetooth driver, from Ethan
    Hsieh.

    6) Missing Kconfig deps for QCOM_EMAC, from Geert Uytterhoeven.

    7) Fix MDIO_XGENE dup Kconfig entry. From Laura Abbott.

    8) Constrain ipv6 rtr_solicits sysctl values properly, from Maciej
    Żenczykowski.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
    netfilter: Fix slab corruption.
    be2net: Enable VF link state setting for BE3
    be2net: Fix TX stats for TSO packets
    be2net: Update Copyright string in be_hw.h
    be2net: NCSI FW section should be properly updated with ethtool for BE3
    be2net: Provide an alternate way to read pf_num for BEx chips
    wan/fsl_ucc_hdlc: Fix size used in dma_free_coherent()
    net: macb: NULL out phydev after removing mdio bus
    xen-netback: make sure that hashes are not send to unaware frontends
    Fixing a bug in team driver due to incorrect 'unsigned int' to 'int' conversion
    MAINTAINERS: add myself as a maintainer of xen-netback
    ipv6 addrconf: disallow rtr_solicits < -1
    Bluetooth: btusb: Fix atheros firmware download error
    drivers: net: phy: Correct duplicate MDIO_XGENE entry
    ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM
    net: ethernet: mediatek: remove hwlro property in the device tree
    net: ethernet: mediatek: get hw lro capability by the chip id instead of by the dtsi
    net: ethernet: mediatek: get the chip id by ETHDMASYS registers
    net: bgmac: Fix errant feature flag check
    netlink: do not enter direct reclaim from netlink_dump()
    ...

    Linus Torvalds
     
  • Use the correct pattern for singly linked list insertion and
    deletion. We can also calculate the list head outside of the
    mutex.
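
    The commit text does not spell out the pattern. A widely used idiom for
    singly linked lists walks a pointer to the link itself, so the head needs
    no special case. The sketch below assumes that this is what is meant; it
    is plain userspace C with hypothetical names and no locking or RCU, so it
    is not the netfilter code.

```c
#include <stddef.h>

struct node {
	int value;
	struct node *next;
};

/* Insert 'new_node' so the list stays sorted by ascending value. */
static void list_insert_sorted(struct node **head, struct node *new_node)
{
	struct node **pp = head;

	/* Walk the links themselves, so the head needs no special case. */
	while (*pp && (*pp)->value < new_node->value)
		pp = &(*pp)->next;
	new_node->next = *pp;
	*pp = new_node;
}

/* Unlink 'victim' if present; returns 1 if it was removed, 0 otherwise. */
static int list_remove(struct node **head, struct node *victim)
{
	struct node **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return 1;
		}
	}
	return 0;
}
```

    Because both functions only ever write through 'pp', removing the first
    element and removing one in the middle go through the same code path.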

    Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
    Signed-off-by: Linus Torvalds
    Reviewed-by: Aaron Conole
    Signed-off-by: David S. Miller

    net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
    1 file changed, 33 insertions(+), 75 deletions(-)

    Linus Torvalds
     
  • Pull more vfs updates from Al Viro:
    ">rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Al Viro
     
  • Pull MTD updates from Brian Norris:
    "I've not been very active this cycle, so these are mostly from Boris,
    for the NAND flash subsystem.

    NAND:

    - Add the infrastructure to automate NAND timings configuration

    - Provide a generic DT property to maximize ECC strength

    - Some refactoring in the core bad block table handling, to help with
    improving some of the logic in error cases.

    - Minor cleanups and fixes

    MTD:

    - Add APIs for handling page pairing; this is necessary for reliably
    supporting MLC and TLC NAND flash, where paired-page disturbance
    affects reliability. Upper layers (e.g., UBI) should make use of
    these in the near future"

    * tag 'for-linus-20161008' of git://git.infradead.org/linux-mtd: (35 commits)
    mtd: nand: fix trivial spelling error
    mtdpart: Propagate _get/put_device()
    mtd: nand: Provide nand_cleanup() function to free NAND related resources
    mtd: Kill the OF_MTD Kconfig option
    mtd: nand: mxc: Test CONFIG_OF instead of CONFIG_OF_MTD
    mtd: nand: Fix nand_command_lp() for 8bits opcodes
    mtd: nand: sunxi: Support ECC maximization
    mtd: nand: Support maximizing ECC when using software BCH
    mtd: nand: Add an option to maximize the ECC strength
    mtd: nand: mxc: Add timing setup for v2 controllers
    mtd: nand: mxc: implement onfi get/set features
    mtd: nand: sunxi: switch from manual to automated timing config
    mtd: nand: automate NAND timings selection
    mtd: nand: Expose data interface for ONFI mode 0
    mtd: nand: Add function to convert ONFI mode to data_interface
    mtd: nand: convert ONFI mode into data interface
    mtd: nand: Introduce nand_data_interface
    mtd: nand: Create a NAND reset function
    mtd: nand: remove unnecessary 'extern' from function declarations
    MAINTAINERS: Add maintainer entry for Ingenic JZ4780 NAND driver
    ...

    Linus Torvalds
     
  • Pull vfs xattr updates from Al Viro:
    "xattr stuff from Andreas

    This completes the switch to xattr_handler ->get()/->set() from
    ->getxattr/->setxattr/->removexattr"

    * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: Remove {get,set,remove}xattr inode operations
    xattr: Stop calling {get,set,remove}xattr inode operations
    vfs: Check for the IOP_XATTR flag in listxattr
    xattr: Add __vfs_{get,set,remove}xattr helpers
    libfs: Use IOP_XATTR flag for empty directory handling
    vfs: Use IOP_XATTR flag for bad-inode handling
    vfs: Add IOP_XATTR inode operations flag
    vfs: Move xattr_resolve_name to the front of fs/xattr.c
    ecryptfs: Switch to generic xattr handlers
    sockfs: Get rid of getxattr iop
    sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
    kernfs: Switch to generic xattr handlers
    hfs: Switch to generic xattr handlers
    jffs2: Remove jffs2_{get,set,remove}xattr macros
    xattr: Remove unnecessary NULL attribute name check

    Linus Torvalds
     
  • Pull crypto updates from Herbert Xu:
    "Here is the crypto update for 4.9:

    API:
    - The crypto engine code now supports hashes.

    Algorithms:
    - Allow keys >= 2048 bits in FIPS mode for RSA.

    Drivers:
    - Memory overwrite fix for vmx ghash.
    - Add support for building ARM sha1-neon in Thumb2 mode.
    - Reenable ARM ghash-ce code by adding import/export.
    - Reenable img-hash by adding import/export.
    - Add support for multiple cores in omap-aes.
    - Add little-endian support for sha1-powerpc.
    - Add Cavium HWRNG driver for ThunderX SoC"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
    crypto: caam - treat SGT address pointer as u64
    crypto: ccp - Make syslog errors human-readable
    crypto: ccp - clean up data structure
    crypto: vmx - Ensure ghash-generic is enabled
    crypto: testmgr - add guard to dst buffer for ahash_export
    crypto: caam - Unmap region obtained by of_iomap
    crypto: sha1-powerpc - little-endian support
    crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
    crypto: vmx - Fix memory corruption caused by p8_ghash
    crypto: ghash-generic - move common definitions to a new header file
    crypto: caam - fix sg dump
    hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
    crypto: omap-sham - shrink the internal buffer size
    crypto: omap-sham - add support for export/import
    crypto: omap-sham - convert driver logic to use sgs for data xmit
    crypto: omap-sham - change the DMA threshold value to a define
    crypto: omap-sham - add support functions for sg based data handling
    crypto: omap-sham - rename sgl to sgl_tmp for deprecation
    crypto: omap-sham - align algorithms on word offset
    crypto: omap-sham - add context export/import stubs
    ...

    Linus Torvalds
     
  • Pull dlm fix from David Teigland:
    "This includes a bug fix for a bad memory access during workqueue
    cleanup, which can happen while shutting down the dlm networking
    layer"

    * tag 'dlm-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
    dlm: free workqueues after the connections

    Linus Torvalds
     
  • Pull Ceph updates from Ilya Dryomov:
    "The big ticket item here is support for rbd exclusive-lock feature,
    with maintenance operations offloaded to userspace (Douglas Fuller,
    Mike Christie and myself). Another block device bullet is a series
    fixing up layering error paths (myself).

    On the filesystem side, we've got patches that improve our handling of
    buffered vs dio write races (Neil Brown) and a few assorted fixes from
    Zheng. Also included a couple of random cleanups and a minor CRUSH
    update"

    * tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-client: (39 commits)
    crush: remove redundant local variable
    crush: don't normalize input of crush_ln iteratively
    libceph: ceph_build_auth() doesn't need ceph_auth_build_hello()
    libceph: use CEPH_AUTH_UNKNOWN in ceph_auth_build_hello()
    ceph: fix description for rsize and rasize mount options
    rbd: use kmalloc_array() in rbd_header_from_disk()
    ceph: use list_move instead of list_del/list_add
    ceph: handle CEPH_SESSION_REJECT message
    ceph: avoid accessing / when mounting a subpath
    ceph: fix mandatory flock check
    ceph: remove warning when ceph_releasepage() is called on dirty page
    ceph: ignore error from invalidate_inode_pages2_range() in direct write
    ceph: fix error handling of start_read()
    rbd: add rbd_obj_request_error() helper
    rbd: img_data requests don't own their page array
    rbd: don't call rbd_osd_req_format_read() for !img_data requests
    rbd: rework rbd_img_obj_exists_submit() error paths
    rbd: don't crash or leak on errors in rbd_img_obj_parent_read_full_callback()
    rbd: move bumping img_request refcount into rbd_obj_request_submit()
    rbd: mark the original request as done if stat request fails
    ...

    Linus Torvalds
     
  • Pull splice fixups from Al Viro:
    "A couple of fixups for interaction of pipe-backed iov_iter with
    O_DIRECT reads + constification of a couple of primitives in uio.h
    missed by previous rounds.

    Kudos to davej - his fuzzing has caught those bugs"

    * 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    [btrfs] fix check_direct_IO() for non-iovec iterators
    constify iov_iter_count() and iter_is_iovec()
    fix ITER_PIPE interaction with direct_IO

    Linus Torvalds
     
  • Pull misc vfs updates from Al Viro:
    "Assorted misc bits and pieces.

    There are several single-topic branches left after this (rename2
    series from Miklos, current_time series from Deepa Dinamani, xattr
    series from Andreas, uaccess stuff from me) and I'd prefer to
    send those separately"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
    proc: switch auxv to use of __mem_open()
    hpfs: support FIEMAP
    cifs: get rid of unused arguments of CIFSSMBWrite()
    posix_acl: uapi header split
    posix_acl: xattr representation cleanups
    fs/aio.c: eliminate redundant loads in put_aio_ring_file
    fs/internal.h: add const to ns_dentry_operations declaration
    compat: remove compat_printk()
    fs/buffer.c: make __getblk_slow() static
    proc: unsigned file descriptors
    fs/file: more unsigned file descriptors
    fs: compat: remove redundant check of nr_segs
    cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
    cifs: don't use memcpy() to copy struct iov_iter
    get rid of separate multipage fault-in primitives
    fs: Avoid premature clearing of capabilities
    fs: Give dentry to inode_change_ok() instead of inode
    fuse: Propagate dentry down to inode_change_ok()
    ceph: Propagate dentry down to inode_change_ok()
    xfs: Propagate dentry down to inode_change_ok()
    ...

    Linus Torvalds
     
  • Pull ARM pcmcia updates from Russell King:
    "These updates lay the foundations for more generic soc_common PCMCIA
    support, which will result in several of the board specific drivers
    being eliminated.

    As the dependencies for this are complex, the preliminary work is
    being submitted now, with the remainder scheduled for the next merge
    window"

    * 'pcmcia' of git://git.armlinux.org.uk/~rmk/linux-arm:
    pcmcia: soc_common: add driver-data pointer
    pcmcia: soc_common: add support for voltage sense GPIOs
    pcmcia: soc_common: constify pcmcia_low_level ops pointer
    pcmcia: soc_common: switch to a per-socket cpufreq notifier
    pcmcia: soc_common: add support for Vcc and Vpp regulators
    pcmcia: soc_common: add CF socket state helper
    pcmcia: soc_common: restore previous socket state on error
    pcmcia: soc_common: add support for reset and bus enable GPIOs
    pcmcia: soc_common: request legacy detect GPIO with active low
    pcmcia: soc_common: ignore invalid interrupts
    pcmcia: soc_common: switch to using gpio_descs
    pcmcia: soc_common: use devm_gpio_request_one()

    Linus Torvalds
     
  • Pull nios2 update from Ley Foon Tan:
    "Use of_property_read_bool() instead of open-coding it"

    * tag 'nios2-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
    nios2: use of_property_read_bool

    Linus Torvalds
     
  • Pull CRIS updates from Jesper Nilsson.

    * tag 'cris-for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
    cris: return of class_create should be considered
    CRIS: defconfig: remove MTDRAM_ABS_POS
    CRIS v32: remove some double unlocks
    Fix typos
    cris: migrate exception table users off module.h and onto extable.h
    cris: v10: axisflashmap: remove unused ifdefs
    cris: use generic io.h
    cris: fix Kconfig mismatch when building with CONFIG_PCI
    cris: cardbus: fix header include path
    cris: add dev88_defconfig
    cris: irq: stop loop from accessing array out of bounds
    cris: fasttimer: fix mixed declarations and code compile warning
    cris: intmem: fix pointer comparison compile warning
    cris: intmem: fix device_initcall compile warning

    Linus Torvalds
     
  • Pull protection keys syscall interface from Thomas Gleixner:
    "This is the final step of Protection Keys support which adds the
    syscalls so user space can actually allocate keys and protect memory
    areas with them. Details and usage examples can be found in the
    documentation.

    The mm side of this has been acked by Mel"

    * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/pkeys: Update documentation
    x86/mm/pkeys: Do not skip PKRU register if debug registers are not used
    x86/pkeys: Fix pkeys build breakage for some non-x86 arches
    x86/pkeys: Add self-tests
    x86/pkeys: Allow configuration of init_pkru
    x86/pkeys: Default to a restrictive init PKRU
    pkeys: Add details of system call use to Documentation/
    generic syscalls: Wire up memory protection keys syscalls
    x86: Wire up protection keys system calls
    x86/pkeys: Allocation/free syscalls
    x86/pkeys: Make mprotect_key() mask off additional vm_flags
    mm: Implement new pkey_mprotect() system call
    x86/pkeys: Add fault handling for PF_PK page fault bit

    Linus Torvalds
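
As a rough illustration of the syscall interface this pull describes, the sketch below allocates a protection key, tags one anonymous page with it, and releases everything again. It is not taken from the kernel documentation or self-tests. The helper name demo_pkey_protect() is hypothetical, the calls go through raw syscall(2) on the assumption that the C library may not provide wrappers, and the fallback value for PKEY_DISABLE_WRITE assumes the kernel uapi definition.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE 0x2 /* assumed to match the kernel uapi headers */
#endif

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 if a key was allocated and attached to a page,
 *         1 if no key could be allocated (for example ENOSYS on older
 *           kernels, or EINVAL/ENOSPC without hardware support or free keys),
 *        -1 on any other failure.
 */
int demo_pkey_protect(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Allocate a key whose initial rights forbid writes for this thread. */
    long pkey = syscall(SYS_pkey_alloc, 0UL,
                        (unsigned long)PKEY_DISABLE_WRITE);
    if (pkey < 0) {
        munmap(buf, len);
        return 1;
    }

    /* Attach the key to the page; the page permissions stay read/write. */
    int rc = 0;
    if (syscall(SYS_pkey_mprotect, buf, len,
                (unsigned long)(PROT_READ | PROT_WRITE), pkey) != 0)
        rc = -1;

    /* Drop the mapping before returning the key to the kernel. */
    munmap(buf, len);
    syscall(SYS_pkey_free, pkey);
    return rc;
}
```

A caller would treat a return value of 1 as "protection keys not usable here" and fall back to plain mprotect(). The sketch never writes to the tagged page, since a write would fault while the key still has PKEY_DISABLE_WRITE set.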
     
  • Pull x86 updates from Thomas Gleixner:
    "A pile of regression fixes and updates:

    - address the fallout of the patches which made the cpuid - nodeid
    relation permanent: Handling of invalid APIC ids and preventing
    pointless warning messages.

    - force eager FPU when protection keys are enabled. Protection keys
    are not generating FPU exceptions so they cannot work with the lazy
    FPU mechanism.

    - prevent force migration of interrupts which are not part of the CPU
    vector domain.

    - handle the fact that APIC ids are not updated in the ACPI/MADT
    tables on physical CPU hotplug

    - remove bash-isms from syscall table generator script

    - use the hypervisor supplied APIC frequency when running on VMware"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/pkeys: Make protection keys an "eager" feature
    x86/apic: Prevent pointless warning messages
    x86/acpi: Prevent LAPIC id 0xff from being accounted
    arch/x86: Handle non enumerated CPU after physical hotplug
    x86/unwind: Fix oprofile module link error
    x86/vmware: Skip lapic calibration on VMware
    x86/syscalls: Remove bash-isms in syscall table generator
    x86/irq: Prevent force migration of irqs which are not in the vector domain

    Linus Torvalds
     
  • looking for duplicate ->iov_base makes sense only for
    iovec-backed iterators; for kvec-backed ones it's pointless,
    for bvec-backed ones it's pointless and broken on 32bit (we
    walk through an array of struct bio_vec accessing them as if
    they were struct iovec; works by accident on 64bit, but on
    32bit it'll blow up) and for pipe-backed ones it's pointless
    and ends up oopsing.

    Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • by making sure we call iov_iter_advance() on original
    iov_iter even if direct_IO (done on its copy) has returned 0.
    It's a no-op for old iov_iter flavours and does the right thing
    (== truncation of the stuff we'd allocated, but not filled) in
    ITER_PIPE case. Failures (e.g. -EIO) get caught and dealt with
    by cleanup in generic_file_read_iter().

    Signed-off-by: Al Viro

    Al Viro
     
  • Pull perf tooling updates from Thomas Gleixner:

    - handle uretprobe placement properly on little endian PPC64

    - fix buffer handling in libtraceevent

    - add a missing pointer dereference in perf probe

    - fix the build of host tools in cross builds

    - fix Intel PT timestamp handling

    - synchronize memcpy, cpufeatures and bpf headers with the kernel headers

    - support for vendor supplied JSON files describing PMU events

    - a new set of tool tips

    - initial work for clang/llvm support

    - address some style issues found by cppcheck

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
    tools build: Add feature detection for g++
    tools build: Support compiling C++ source file
    perf top/report: Add tips about a list option
    perf report/top: Add a tip about system-wide collection from all CPUs
    perf report/top: Add a tip about source line numbers with overhead
    tools: Synchronize tools/include/uapi/linux/bpf.h
    tools: Synchronize tools/arch/x86/include/asm/cpufeatures.h
    perf bench mem: Sync memcpy assembly sources with the kernel
    perf jevents: Fix Intel JSON fixed counter conversions
    tools lib traceevent: Fix kbuffer_read_at_offset()
    perf intel-pt: Fix MTC timestamp calculation for large MTC periods
    perf intel-pt: Fix estimated timestamps for cycle-accurate mode
    perf uretprobe ppc64le: Fix probe location
    perf pmu-events: Add Skylake frontend MSR support
    perf pmu-events: Fix fixed counters on Intel
    perf tools: Make alias matching case-insensitive
    perf tools: Allow period= in perf stat CPU event descriptions.
    perf tools: Add README for info on parsing JSON/map files
    perf list jevents: Add support for event list topics
    perf list: Support long jevents descriptions
    ...

    Linus Torvalds
     
  • Pull scheduler fix from Thomas Gleixner:
    "A revert of a commit which pointlessly widened a preempt disabled
    section which in turn caused might_sleep() to trigger.

    The patch intended to prevent usage of smp_processor_id() in
    preemptible context, but the usage in that case is fine because the
    thread is pinned on a single cpu and therefore cannot be migrated off"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    Revert "sched/core: Do not use smp_processor_id() with preempt enabled in smpboot_thread_fn()"

    Linus Torvalds