15 Jul, 2017

1 commit

  • The value written to the fail-nth file is parsed as 0-based. Parsing it
    as one-based is more natural to understand, and it makes it possible to
    cancel the previous setup by simply writing '0'.

    This change also converts task->fail_nth from signed to unsigned int.

    Link: http://lkml.kernel.org/r/1491490561-10485-3-git-send-email-akinobu.mita@gmail.com
    Signed-off-by: Akinobu Mita
    Cc: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
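
    The one-based semantics described above can be sketched as a small
    user-space model in Python. This is not the kernel implementation:
    the FailNth class and its method names are hypothetical, and only the
    "N-th call fails, 0 cancels" behaviour is taken from the message.

```python
class FailNth:
    """Toy model of a per-task, one-based fail-nth counter (hypothetical)."""

    def __init__(self):
        self.fail_nth = 0  # 0 means "no failure scheduled"

    def write(self, text):
        """Model writing a decimal string to the fail-nth file."""
        value = int(text)
        if value < 0:
            raise ValueError("fail-nth is unsigned in this model")
        self.fail_nth = value

    def should_fail(self):
        """Return True exactly once, on the N-th call after write(N)."""
        if self.fail_nth == 0:
            return False
        self.fail_nth -= 1
        return self.fail_nth == 0


task = FailNth()
task.write("3")
results = [task.should_fail() for _ in range(5)]

task.write("2")
task.write("0")  # writing 0 cancels the previous setup
cancelled = [task.should_fail() for _ in range(3)]
```

    With write("3") only the third call reports a failure, and writing
    "0" after write("2") leaves every later call unaffected.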
     

14 Jul, 2017

28 commits

  • Pull NFS client updates from Anna Schumaker:
    "Stable bugfixes:
    - Fix -EACCESS on commit to DS handling
    - Fix initialization of nfs_page_array->npages
    - Only invalidate dentries that are actually invalid

    Features:
    - Enable NFSoRDMA transparent state migration
    - Add support for lookup-by-filehandle
    - Add support for nfs re-exporting

    Other bugfixes and cleanups:
    - Christoph cleaned up the way we declare NFS operations
    - Clean up various internal structures
    - Various cleanups to commits
    - Various improvements to error handling
    - Set the dt_type of . and .. entries in NFS v4
    - Make slot allocation more reliable
    - Fix fscache stat printing
    - Fix uninitialized variable warnings
    - Fix potential list overrun in nfs_atomic_open()
    - Fix a race in NFSoRDMA RPC reply handler
    - Fix return size for nfs42_proc_copy()
    - Fix against MAC forgery timing attacks"

    * tag 'nfs-for-4.13-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (68 commits)
    NFS: Don't run wake_up_bit() when nobody is waiting...
    nfs: add export operations
    nfs4: add NFSv4 LOOKUPP handlers
    nfs: add a nfs_ilookup helper
    nfs: replace d_add with d_splice_alias in atomic_open
    sunrpc: use constant time memory comparison for mac
    NFSv4.2 fix size storage for nfs42_proc_copy
    xprtrdma: Fix documenting comments in frwr_ops.c
    xprtrdma: Replace PAGE_MASK with offset_in_page()
    xprtrdma: FMR does not need list_del_init()
    xprtrdma: Demote "connect" log messages
    NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration
    NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration
    xprtrdma: Don't defer MR recovery if ro_map fails
    xprtrdma: Fix FRWR invalidation error recovery
    xprtrdma: Fix client lock-up after application signal fires
    xprtrdma: Rename rpcrdma_req::rl_free
    xprtrdma: Pass only the list of registered MRs to ro_unmap_sync
    xprtrdma: Pre-mark remotely invalidated MRs
    xprtrdma: On invalidation failure, remove MWs from rl_registered
    ...

    Linus Torvalds
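
    The "constant time memory comparison for mac" item above concerns a
    general technique: compare a received MAC against the expected one
    without returning early on the first differing byte. A minimal sketch
    in Python, using the standard library's hmac.compare_digest as a
    stand-in for the kernel helper; the key, message and the choice of
    SHA-256 are assumptions made for illustration only.

```python
import hashlib
import hmac


def mac_matches(key, message, received_mac):
    """Verify a MAC with a comparison whose timing does not depend on
    where the first mismatching byte is."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, received_mac)


key = b"example-key"          # hypothetical key
message = b"rpc payload"      # hypothetical message
good = hmac.new(key, message, hashlib.sha256).digest()
bad = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit of the first byte

ok = mac_matches(key, message, good)
rejected = not mac_matches(key, message, bad)
```

    A plain equality test gives the same answers, but its running time
    can reveal how many leading bytes an attacker has guessed correctly.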
     
  • Pull SCSI target updates from Nicholas Bellinger:
    "It's been unusually busy for summer, with most of the efforts centered
    around TCMU developments and various target-core + fabric driver bug
    fixing activities. Not particularly large in terms of LoC, but lots of
    smaller patches from many different folks.

    The highlights include:

    - ibmvscsis logical partition manager support (Michael Cyr + Bryant
    Ly)

    - Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch +
    nab)

    - Add support for TMR percpu LUN reference counting (nab)

    - Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown
    (Bart)

    - Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi)

    - Fix TCMU module removal (Xiubo Li)

    - Fix iser-target OOPs during login failure (Andrea Righi + Sagi)

    - Breakup target-core free_device backend driver callback (mnc)

    - Perform TCMU add/delete/reconfig synchronously (mnc)

    - Fix TCMU multiple UIO open/close sequences (mnc)

    - Fix TCMU CHECK_CONDITION sense handling (mnc)

    - Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab)

    - Introduce TYPE_ZBC support in PSCSI (Damien Le Moal)

    - Fix possible TCMU memory leak + OOPs when recalculating cmd base
    size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc)

    - Add login_keys_workaround attribute for non RFC initiators (Robert
    LeBlanc + Arun Easi + nab)"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits)
    iscsi-target: Add login_keys_workaround attribute for non RFC initiators
    Revert "qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT"
    tcmu: clean up the code and with one small fix
    tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size
    target: export lio pgr/alua support as device attr
    target: Fix return sense reason in target_scsi3_emulate_pr_out
    target: Fix cmd size for PR-OUT in passthrough_parse_cdb
    tcmu: Fix dev_config_store
    target: pscsi: Introduce TYPE_ZBC support
    target: Use macro for WRITE_VERIFY_32 operation codes
    target: fix SAM_STAT_BUSY/TASK_SET_FULL handling
    target: remove transport_complete
    pscsi: finish cmd processing from pscsi_req_done
    tcmu: fix sense handling during completion
    target: add helper to copy sense to se_cmd buffer
    target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE
    target: make device_mutex and device_list static
    tcmu: Fix flushing cmd entry dcache page
    tcmu: fix multiple uio open/close sequences
    tcmu: drop configured check in destroy
    ...

    Linus Torvalds
     
  • "perf lock" shows fairly heavy contention for the bit waitqueue locks
    when doing an I/O heavy workload.
    Use a bit to tell whether or not there has been contention for a lock
    so that we can optimise away the bit waitqueue operations in those cases.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
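
    The idea above can be modelled in a few lines of Python. This is a
    toy, single-threaded model and not the NFS code: FlaggedLock,
    contended and wakeups are hypothetical names, and the waitqueue
    wake-up is represented by a simple counter.

```python
class FlaggedLock:
    """Lock that records contention so unlock can skip the wake-up path."""

    def __init__(self):
        self.locked = False
        self.contended = False  # set only when somebody failed to lock
        self.wakeups = 0        # stand-in for waitqueue wake-up calls

    def try_lock(self):
        if self.locked:
            self.contended = True
            return False
        self.locked = True
        return True

    def unlock(self):
        self.locked = False
        if self.contended:
            # Only pay for the wake-up when a waiter may exist.
            self.contended = False
            self.wakeups += 1


lock = FlaggedLock()
for _ in range(3):            # uncontended lock/unlock cycles
    lock.try_lock()
    lock.unlock()
uncontended_wakeups = lock.wakeups

lock.try_lock()
second_attempt = lock.try_lock()  # fails and marks the lock contended
lock.unlock()
contended_wakeups = lock.wakeups
```

    The three uncontended cycles never touch the wake-up path; only the
    cycle in which a second locker showed up does.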
     
  • Pull nfsd updates from Bruce Fields:
    "Chuck's RDMA update overhauls the "call receive" side of the
    RPC-over-RDMA transport to use the new rdma_rw API.

    Christoph cleaned the way nfs operations are declared, removing a
    bunch of function-pointer casts and declaring the operation vectors as
    const.

    Christoph's changes touch both client and server, and both client and
    server pulls this time around should be based on the same commits from
    Christoph"

    * tag 'nfsd-4.13' of git://linux-nfs.org/~bfields/linux: (53 commits)
    svcrdma: fix an incorrect check on -E2BIG and -EINVAL
    nfsd4: factor ctime into change attribute
    svcrdma: Remove svc_rdma_chunk_ctxt::cc_dir field
    svcrdma: use offset_in_page() macro
    svcrdma: Clean up after converting svc_rdma_recvfrom to rdma_rw API
    svcrdma: Clean-up svc_rdma_unmap_dma
    svcrdma: Remove frmr cache
    svcrdma: Remove unused Read completion handlers
    svcrdma: Properly compute .len and .buflen for received RPC Calls
    svcrdma: Use generic RDMA R/W API in RPC Call path
    svcrdma: Add recvfrom helpers to svc_rdma_rw.c
    sunrpc: Allocate up to RPCSVC_MAXPAGES per svc_rqst
    svcrdma: Don't account for Receive queue "starvation"
    svcrdma: Improve Reply chunk sanity checking
    svcrdma: Improve Write chunk sanity checking
    svcrdma: Improve Read chunk sanity checking
    svcrdma: Remove svc_rdma_marshal.c
    svcrdma: Avoid Send Queue overflow
    svcrdma: Squelch disconnection messages
    sunrpc: Disable splice for krb5i
    ...

    Linus Torvalds
     
  • This will be needed in order to implement the get_parent export op
    for nfsd.

    Signed-off-by: Jeff Layton
    Signed-off-by: Anna Schumaker

    Jeff Layton
     
  • This helper makes it possible to find an existing NFS inode by the file
    handle and fattr.

    Signed-off-by: Peng Tao
    [hch: split from a larger patch]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Anna Schumaker

    Peng Tao
     
  • Transparent State Migration copies a client's lease state from the
    server where a filesystem used to reside to the server where it now
    resides. When an NFSv4.1 client first contacts that destination
    server, it uses EXCHANGE_ID to detect trunking relationships.

    The lease that was copied there is returned to that client, but the
    destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
    the client. This is because the lease was confirmed on the source
    server (before it was copied).

    Normally, when CONFIRMED_R is set, a client purges the lease and
    creates a new one. However, that throws away the entire benefit of
    Transparent State Migration.

    Therefore, the client must not purge that lease when it is possible
    that Transparent State Migration has occurred.

    Reported-by: Xuan Qi
    Signed-off-by: Chuck Lever
    Tested-by: Xuan Qi
    Signed-off-by: Anna Schumaker

    Chuck Lever
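
    The decision described above can be sketched as follows. This is a
    simplified Python model, not the client code: should_purge_lease and
    its migration_possible parameter are hypothetical, while the flag
    value is the one defined for EXCHGID4_FLAG_CONFIRMED_R in RFC 5661.

```python
EXCHGID4_FLAG_CONFIRMED_R = 0x80000000  # from the NFSv4.1 specification


def should_purge_lease(eir_flags, migration_possible):
    """Decide whether the client throws the lease away.

    eir_flags: flags word returned by EXCHANGE_ID.
    migration_possible: True when Transparent State Migration may have
    moved the lease to this server (hypothetical input to the model).
    """
    confirmed = bool(eir_flags & EXCHGID4_FLAG_CONFIRMED_R)
    if not confirmed:
        return False  # nothing confirmed yet, nothing to purge
    # Normally CONFIRMED_R means purge; keep the lease if it may have
    # been migrated, otherwise the benefit of migration is lost.
    return not migration_possible


normal_case = should_purge_lease(EXCHGID4_FLAG_CONFIRMED_R, False)
migrated_case = should_purge_lease(EXCHGID4_FLAG_CONFIRMED_R, True)
unconfirmed_case = should_purge_lease(0, False)
```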
     
  • If the page cache is being flushed, then we want to ensure that we
    do start a commit once the pages are done being flushed.
    If we just wait until all I/O is done to that file, we can end up
    livelocking until the balance_dirty_pages() mechanism puts its
    foot down and forces I/O to stop.
    So instead we do more or less the same thing that O_DIRECT does,
    and set up a counter to tell us when the flush is done.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
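
    A counter of this kind can be sketched in Python as below. It is a
    toy model under stated assumptions: FlushTracker and its callback are
    hypothetical, all writes are started before any of them completes,
    and the "commit" is just a string appended to a list.

```python
class FlushTracker:
    """Count outstanding writes of one flush and fire a callback at zero."""

    def __init__(self, on_done):
        self.outstanding = 0
        self.on_done = on_done

    def start_write(self):
        self.outstanding += 1

    def end_write(self):
        self.outstanding -= 1
        if self.outstanding == 0:
            self.on_done()  # flush finished: start the commit now


commits = []
tracker = FlushTracker(lambda: commits.append("COMMIT"))

for _ in range(3):
    tracker.start_write()

tracker.end_write()
tracker.end_write()
before_last = list(commits)   # still empty, one write is outstanding
tracker.end_write()
after_last = list(commits)
```

    The commit is triggered by the completion of this flush's own writes
    and does not have to wait for unrelated I/O on the file to drain.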
     
  • Remove the 'layout_private' fields that were only used by the pNFS OSD
    layout driver.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     
  • An interrupted rename will leave the old dentry behind if the rename
    succeeds. Fix this by forcing a lookup the next time through
    ->d_revalidate.

    A previous attempt at solving this problem took the approach of completing
    the work of the rename asynchronously. However, that approach was wrong
    since it would allow the d_move() to occur after the directory's i_mutex
    had been dropped by the original process.

    Signed-off-by: Benjamin Coddington
    Reviewed-by: Jeff Layton
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
     
  • NFS uses some int, and unsigned int :1, and bool as flags in structs and
    args. Assert the preference for uniformly replacing these with the bool
    type.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
     
  • Signed-off-by: Christoph Hellwig
    Acked-by: Trond Myklebust

    Christoph Hellwig
     
  • struct svc_procinfo contains function pointers, and marking it as
    constant prevents it from being used as an attack vector for
    code injections.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • pc_count is the only writeable member of struct svc_procinfo, which is
    a good candidate to be const-ified as it contains function pointers.

    This patch moves it out of struct svc_procinfo, and into a
    separate writable array that is pointed to by struct svc_version.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Remove the now unused typedef.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Drop the resp argument as it can trivially be derived from the rqstp
    argument. With that all functions now have the same prototype, and we
    can remove the unsafe casting to kxdrproc_t.

    Signed-off-by: Christoph Hellwig
    Acked-by: Trond Myklebust

    Christoph Hellwig
     
  • Drop the argp argument as it can trivially be derived from the rqstp
    argument. With that all functions now have the same prototype, and we
    can remove the unsafe casting to kxdrproc_t.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Drop the p and resp arguments as they are always NULL or can trivially
    be derived from the rqstp argument. With that all functions now have the
    same prototype, and we can remove the unsafe casting to kxdrproc_t.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Drop the argp and resp arguments as they can trivially be derived from
    the rqstp argument. With that all functions now have the same prototype,
    and we can remove the unsafe casting to svc_procfunc as well as the
    svc_procfunc typedef itself.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • struct rpc_procinfo contains function pointers, and marking it as
    constant prevents it from being used as an attack vector for
    code injections.

    Signed-off-by: Christoph Hellwig
    Acked-by: Trond Myklebust

    Christoph Hellwig
     
  • p_count is the only writeable member of struct rpc_procinfo, which is
    a good candidate to be const-ified as it contains function pointers.

    This patch moves it out of struct rpc_procinfo, and into a
    separate writable array that is pointed to by struct rpc_version and
    indexed by p_statidx.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
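
    The layout change above, an immutable procedure table plus a separate
    writable counter array indexed by a per-procedure slot, can be
    illustrated in Python. This is only an analogy to the C structures:
    ProcInfo, PROCEDURES, counts and call are hypothetical names, and a
    frozen dataclass stands in for a const struct.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Callable


@dataclass(frozen=True)
class ProcInfo:
    """Read-only procedure description (holds a function reference)."""
    name: str
    encode: Callable[[str], bytes]
    statidx: int  # index into the separate, writable counter array


PROCEDURES = (
    ProcInfo("NULL", lambda arg: b"", 0),
    ProcInfo("GETATTR", lambda arg: arg.encode(), 1),
)

# The only mutable state lives outside the procedure table.
counts = [0] * len(PROCEDURES)


def call(proc, arg):
    counts[proc.statidx] += 1
    return proc.encode(arg)


getattr_proc = PROCEDURES[1]
first = call(getattr_proc, "fh1")
second = call(getattr_proc, "fh2")

try:
    getattr_proc.encode = lambda arg: b"hijacked"
    overwrite_blocked = False
except FrozenInstanceError:
    overwrite_blocked = True
```

    Statistics still accumulate per procedure, while an attempt to
    replace the function reference in the table is rejected.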
     
  • Pass struct rpc_request as the first argument instead of an untyped blob.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jeff Layton
    Acked-by: Trond Myklebust

    Christoph Hellwig
     
  • Pass struct rpc_request as the first argument instead of an untyped blob,
    and mark the data object as const.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jeff Layton

    Christoph Hellwig
     
  • Merge yet more updates from Andrew Morton:

    - various misc things

    - kexec updates

    - sysctl core updates

    - scripts/gdb updates

    - checkpoint-restart updates

    - ipc updates

    - kernel/watchdog updates

    - Kees's "rough equivalent to the glibc _FORTIFY_SOURCE=1 feature"

    - "stackprotector: ascii armor the stack canary"

    - more MM bits

    - checkpatch updates

    * emailed patches from Andrew Morton : (96 commits)
    writeback: rework wb_[dec|inc]_stat family of functions
    ARM: samsung: usb-ohci: move inline before return type
    video: fbdev: omap: move inline before return type
    video: fbdev: intelfb: move inline before return type
    USB: serial: safe_serial: move __inline__ before return type
    drivers: tty: serial: move inline before return type
    drivers: s390: move static and inline before return type
    x86/efi: move asmlinkage before return type
    sh: move inline before return type
    MIPS: SMP: move asmlinkage before return type
    m68k: coldfire: move inline before return type
    ia64: sn: pci: move inline before type
    ia64: move inline before return type
    FRV: tlbflush: move asmlinkage before return type
    CRIS: gpio: move inline before return type
    ARM: HP Jornada 7XX: move inline before return type
    ARM: KVM: move asmlinkage before type
    checkpatch: improve the STORAGE_CLASS test
    mm, migration: do not trigger OOM killer when migrating memory
    drm/i915: use __GFP_RETRY_MAYFAIL
    ...

    Linus Torvalds
     
  • Pull VFIO updates from Alex Williamson:

    - Include Intel XXV710 in INTx workaround (Alex Williamson)

    - Make use of ERR_CAST() for error return (Dan Carpenter)

    - Fix vfio_group release deadlock from iommu notifier (Alex Williamson)

    - Unset KVM-VFIO attributes only on group match (Alex Williamson)

    - Fix release path group/file matching with KVM-VFIO (Alex Williamson)

    - Remove unnecessary lock uses triggering lockdep splat (Alex Williamson)

    * tag 'vfio-v4.13-rc1' of git://github.com/awilliam/linux-vfio:
    vfio: Remove unnecessary uses of vfio_container.group_lock
    vfio: New external user group/file match
    kvm-vfio: Decouple only when we match a group
    vfio: Fix group release deadlock
    vfio: Use ERR_CAST() instead of open coding it
    vfio/pci: Add Intel XXV710 to hidden INTx devices

    Linus Torvalds
     
  • Pull RTC updates from Alexandre Belloni:
    "Here is the pull-request for the RTC subsystem for 4.13.

    Subsystem:

    - expose non volatile RAM using nvmem instead of open coding in many
    drivers. Unfortunately, this option has to be enabled by default to
    not break existing users.

    - rtctest can now test for cutoff dates, showing when an RTC will
    start failing to properly save time and date.

    - new RTC registration functions to remove race conditions in drivers

    Newly supported RTCs:

    - Broadcom STB wake-timer

    - Epson RX8130CE

    - Maxim IC DS1308

    - STMicroelectronics STM32H7

    Drivers:

    - ds1307: use regmap, use nvmem, more cleanups

    - ds3232: temperature reading support

    - gemini: renamed to ftrtc010

    - m41t80: use CCF to expose the clock

    - rv8803: use nvmem

    - s3c: many cleanups

    - st-lpc: fix y2106 bug"

    * tag 'rtc-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (51 commits)
    rtc: Remove wrong deprecation comment
    nvmem: include linux/err.h from header
    rtc: st-lpc: make it robust against y2038/2106 bug
    rtc: rtctest: add check for problematic dates
    tools: timer: add rtctest_setdate
    rtc: ds1307: remove ds1307_remove
    rtc: ds1307: use generic nvmem
    rtc: ds1307: switch to rtc_register_device
    rtc: rv8803: remove rv8803_remove
    rtc: rv8803: use generic nvmem support
    rtc: rv8803: switch to rtc_register_device
    rtc: add generic nvmem support
    rtc: at91rm9200: remove race condition
    rtc: introduce new registration method
    rtc: class separate id allocation from registration
    rtc: class separate device allocation from registration
    rtc: stm32: add STM32H7 RTC support
    dt-bindings: rtc: stm32: add support for STM32H7
    rtc: ds1307: add ds1308 variant
    rtc: ds3232: add temperature support
    ...

    Linus Torvalds
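
    The cutoff dates mentioned above (y2038 and y2106) follow from
    storing seconds since 1970-01-01 UTC in a 32-bit integer. A short
    Python calculation, assuming a plain seconds counter with no offset;
    last_representable is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits, signed):
    """Last instant a seconds-since-epoch counter of this width can hold."""
    max_seconds = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return EPOCH + timedelta(seconds=max_seconds)


y2038 = last_representable(32, signed=True)
y2106 = last_representable(32, signed=False)

print(y2038.isoformat())  # 2038-01-19T03:14:07+00:00
print(y2106.isoformat())  # 2106-02-07T06:28:15+00:00
```

    One second after each of these instants the counter wraps, which is
    the kind of date a cutoff test needs to probe.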
     
  • Pull MTD updates from Brian Norris:
    "General updates:
    - Cleanups and additional flash support for "dataflash" driver
    - new driver for mchp23k256 SPI SRAM device
    - improve handling of MTDs without eraseblocks (i.e., MTD_NO_ERASE)
    - refactor and improve "sub-partition" handling with TRX partition
    parser; partitions can now be created as sub-partitions of another
    partition

    SPINOR updates, from Cyrille Pitchen and Marek Vasut:
    - introduce support for the SPI 1-2-2 and 1-4-4 protocols.
    - introduce support for the Double Data Rate (DDR) mode.
    - introduce support for the Octo SPI protocols.
    - add support for new memory parts for Spansion, Macronix and Winbond.
    - add fixes for the Aspeed, STM32 and Cadence QSPI controller drivers.
    - clean up the st_spi_fsm driver.

    NAND updates, from Boris Brezillon:
    - addition of on-die ECC support to Micron driver
    - addition of helpers to help drivers choose most appropriate ECC
    settings
    - deletion of dead-code (cached programming and ->errstat() hook)
    - make sure drivers that do not support the SET/GET FEATURES command
    use a dummy ->set/get_features implementation returning -ENOTSUPP
    (required for Micron on-die ECC)
    - change the semantic of ecc->write_page() for drivers setting the
    NAND_ECC_CUSTOM_PAGE_ACCESS flag
    - support exiting 'GET STATUS' command in default ->cmdfunc()
    implementations
    - change the prototype of ->setup_data_interface()

    A bunch of driver related changes:
    - various cleanup, fixes and improvements of the MTK driver
    - OMAP DT bindings fixes
    - support for ->setup_data_interface() in the fsmc driver
    - support for imx7 in the gpmi driver
    - finalization of the denali driver rework (thanks to Masahiro for
    the work he's done on this driver)
    - fix "bitflips in erased pages" handling in the ifc driver
    - addition of PM ops and dynamic timing configuration to the atmel
    driver"

    * tag 'for-linus-20170713' of git://git.infradead.org/linux-mtd: (118 commits)
    Documentation: ABI: mtd: describe "offset" more precisely
    mtd: Fix check in mtd_unpoint()
    mtd: nand: mtk: release lock on error path
    mtd: st_spi_fsm: remove SPINOR_OP_RDSR2 and use SPINOR_OP_RDCR instead
    mtd: spi-nor: cqspi: remove duplicate const
    mtd: spi-nor: Add support for Spansion S25FL064L
    mtd: spi-nor: Add support for mx66u51235f
    mtd: nand: mtk: add ->setup_data_interface() hook
    mtd: nand: mtk: remove unneeded mtk_ecc_hw_init from mtk_ecc_resume
    mtd: nand: mtk: remove unneeded mtk_nfc_hw_init from mtk_nfc_resume
    mtd: nand: mtk: disable ecc irq when writing page with hwecc
    mtd: nand: mtk: fix incorrect register setting order about ecc irq
    mtd: partitions: fixup some allocate_partition() whitespace
    mtd: parsers: trx: fix pr_err format for printing offset
    MAINTAINERS: Update SPI NOR subsystem git repositories
    mtd: extract TRX parser out of bcm47xxpart into a separated module
    mtd: partitions: add support for partition parsers
    mtd: partitions: add support for subpartitions
    mtd: partitions: rename "master" to the "parent" where appropriate
    mtd: partitions: remove sysfs files when deleting all master's partitions
    ...

    Linus Torvalds
     
  • Pull more drm updates from Dave Airlie:
    "i915, amd and some core fixes + mediatek color support.

    Some fixes tree came in since the main pull request for rc1, primarily
    i915 and drm-misc and one amd fix. The drm core vblank regression fix
    is probably the most important thing.

    I've also added the mediatek feature pull, it wasn't that big and
    didn't look like it would have any impact outside of mediatek, in fact
    it looks to just be a single feature, and some cleanups"

    * tag 'drm-fixes-for-v4.13-rc1' of git://people.freedesktop.org/~airlied/linux: (31 commits)
    drm/i915: Make DP-MST connector info work
    drm/i915/gvt: Use fence error from GVT request for workload status
    drm/i915/gvt: remove scheduler_mutex in per-engine workload_thread
    drm/i915/gvt: Revert "drm/i915/gvt: Fix possible recursive locking issue"
    drm/i915/gvt: Audit the command buffer address
    drm/i915/gvt: Fix a memory leak in intel_gvt_init_gtt()
    drm/rockchip: fix NULL check on devm_kzalloc() return value
    drm/i915/fbdev: Check for existence of ifbdev->vma before operations
    drm/radeon: Fix eDP for single-display iMac10,1 (v2)
    drm/i915: Hold RPM wakelock while initializing OA buffer
    drm/i915/cnl: Fix the CURSOR_COEFF_MASK used in DDI Vswing Programming
    drm/i915/cfl: Fix Workarounds.
    drm/i915: Avoid undefined behaviour of "u32 >> 32"
    drm/i915: reintroduce VLV/CHV PFI programming power domain workaround
    drm/i915: Fix an error checking test
    drm/i915: Disable MSI for all pre-gen5
    drm/atomic: Add missing drm_atomic_state_clear to atomic_remove_fb
    drm: vblank: Fix vblank timestamp update
    drm/i915/gvt: Make function dpy_reg_mmio_readx safe
    drm/mediatek: separate color module to fixup error memory reallocation
    ...

    Linus Torvalds
     

13 Jul, 2017

11 commits

  • Pull sysctl fix from Eric Biederman:
    "A rather embarrassing and hard to hit bug was merged into 4.11-rc1.

    Andrei Vagin tracked this bug down and after some staring at the code
    I came up with a fix"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    proc: Fix proc_sys_prune_dcache to hold a sb reference

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Fix 64-bit division in mlx5 IPSEC offload support, from Ilan Tayari
    and Arnd Bergmann.

    2) Fix race in statistics gathering in bnxt_en driver, from Michael
    Chan.

    3) Can't use a mutex in RCU reader protected section on tap driver, from
    Cong WANG.

    4) Fix mdb leak in bridging code, from Eduardo Valentin.

    5) Fix free of wrong pointer variable in nfp driver, from Dan Carpenter.

    6) Buffer overflow in brcmfmac driver, from Arend van Spriel.

    7) ioremap_nocache() return value needs to be checked in smsc911x
    driver, from Alexey Khoroshilov.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
    net: stmmac: revert "support future possible different internal phy mode"
    sfc: don't read beyond unicast address list
    datagram: fix kernel-doc comments
    socket: add documentation for missing elements
    smsc911x: Add check for ioremap_nocache() return code
    brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
    net: hns: Bugfix for Tx timeout handling in hns driver
    net: ipmr: ipmr_get_table() returns NULL
    nfp: freeing the wrong variable
    mlxsw: spectrum_switchdev: Check status of memory allocation
    mlxsw: spectrum_switchdev: Remove unused variable
    mlxsw: spectrum_router: Fix use-after-free in route replace
    mlxsw: spectrum_router: Add missing rollback
    samples/bpf: fix a build issue
    bridge: mdb: fix leak on complete_info ptr on fail path
    tap: convert a mutex to a spinlock
    cxgb4: fix BUG() on interrupt deallocating path of ULD
    qed: Fix printk option passed when printing ipv6 addresses
    net: Fix minor code bug in timestamping.txt
    net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer
    ...

    Linus Torvalds
     
  • …drm-misc into drm-next

    Core Changes:
    - Fix empty timestamps on hw without vblank counter (Laurent)
    - Clear atomic state before retrying ww/mutex acquisition in remove_fb (Maarten)

    Driver Changes:
    - rockchip: Fix incorrect NULL pointer check after allocation (Gustavo)

    Cc: Gustavo A. R. Silva <garsilva@embeddedor.com>
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

    * tag 'drm-misc-next-fixes-2017-07-10' of git://anongit.freedesktop.org/git/drm-misc:
    drm/rockchip: fix NULL check on devm_kzalloc() return value
    drm/atomic: Add missing drm_atomic_state_clear to atomic_remove_fb
    drm: vblank: Fix vblank timestamp update
    DRM: Fix an incorrectly formatted table
    bridge: Fix panel-bridge error return on !panel.
    drm/rockchip: gem: add the lacks lock and trivial changes

    Dave Airlie
     
  • Currently the writeback statistics code uses percpu counters to hold
    various statistics. Furthermore we have 2 families of functions - those
    which disable local irq and those which don't and whose names begin
    with double underscore. However, they both end up calling
    __add_wb_stat which in turn calls percpu_counter_add_batch which is
    already irq-safe.

    Exploiting this fact allows us to eliminate the __wb_* functions since
    they don't add any further protection than we already have.
    Furthermore, refactor the wb_* functions to call __add_wb_stat directly
    without the irq-disabling dance. This will likely result in better
    runtime of code which deals with modifying the stat counters.

    While at it also document why percpu_counter_add_batch is in fact
    preempt and irq-safe since at least 3 people got confused.

    Link: http://lkml.kernel.org/r/1498029937-27293-1-git-send-email-nborisov@suse.com
    Signed-off-by: Nikolay Borisov
    Acked-by: Tejun Heo
    Reviewed-by: Jan Kara
    Cc: Josef Bacik
    Cc: Mel Gorman
    Cc: Jeff Layton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nikolay Borisov
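
    For readers unfamiliar with batched percpu counters, the following
    Python model shows the general technique: each CPU accumulates a
    local delta and folds it into the shared count once the delta reaches
    the batch size. It is a sketch under stated assumptions, not the
    kernel code: PercpuCounter and its methods are hypothetical, and the
    model is single-threaded, so it says nothing about irq or preemption
    safety.

```python
class PercpuCounter:
    """Toy model of a batched per-CPU counter."""

    def __init__(self, ncpus, batch):
        self.count = 0             # shared, approximate value
        self.local = [0] * ncpus   # per-CPU pending deltas
        self.batch = batch

    def add(self, cpu, amount):
        pending = self.local[cpu] + amount
        if abs(pending) >= self.batch:
            # Fold the whole pending delta into the shared count.
            self.count += pending
            self.local[cpu] = 0
        else:
            self.local[cpu] = pending

    def exact_sum(self):
        return self.count + sum(self.local)


counter = PercpuCounter(ncpus=2, batch=4)
for _ in range(3):
    counter.add(0, 1)
approx_before_fold = counter.count   # nothing folded yet
counter.add(0, 1)                    # fourth increment reaches the batch
counter.add(1, 1)                    # stays local to CPU 1
approx_after_fold = counter.count
exact = counter.exact_sum()
```

    The shared count lags behind the exact sum by at most the pending
    per-CPU deltas, which is the price paid for cheap updates.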
     
  • Make the code like the rest of the kernel.

    Link: http://lkml.kernel.org/r/667a515b8d0f10f2465d519f8595edd91552fc5e.1499284835.git.joe@perches.com
    Signed-off-by: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Page migration (for memory hotplug, soft_offline_page or mbind) needs to
    allocate new memory. This can trigger the OOM killer if the target
    memory is depleted. Although quite unlikely, this is still possible,
    especially for memory hotplug (offlining of memory).

    Up to now we didn't really have reasonable means to back off.
    __GFP_NORETRY can fail just too easily and __GFP_THISNODE sticks to a
    single node and that is not suitable for all callers.

    But now that we have __GFP_RETRY_MAYFAIL we should use it. It is
    preferable to fail the migration than disrupt the system by killing some
    processes.

    Link: http://lkml.kernel.org/r/20170623085345.11304-7-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Alex Belits
    Cc: Chris Wilson
    Cc: Christoph Hellwig
    Cc: Darrick J. Wong
    Cc: David Daney
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: NeilBrown
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • __GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to
    the page allocator. This has been true but only for allocation
    requests larger than PAGE_ALLOC_COSTLY_ORDER. It has always been
    ignored for smaller sizes. This is a bit unfortunate because there is
    no way to express the same semantic for those requests and they are
    considered too important to fail so they might end up looping in the
    page allocator forever, similarly to GFP_NOFAIL requests.

    Now that the whole tree has been cleaned up and accidental or misled
    usage of __GFP_REPEAT flag has been removed for !costly requests we can
    give the original flag a better name and more importantly a more useful
    semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user
    that the allocator would try really hard but there is no promise of a
    success. This will work independent of the order and overrides the
    default allocator behavior. Page allocator users have several levels of
    guarantee vs. cost options (take GFP_KERNEL as an example)

    - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_
    attempt to free memory at all. The most lightweight mode, which doesn't
    even kick the background reclaim. Should be used carefully because
    it might deplete the memory and the next user might hit the more
    aggressive reclaim

    - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic
    allocation without any attempt to free memory from the current
    context but can wake kswapd to reclaim memory if the zone is below
    the low watermark. Can be used from either atomic contexts or when
    the request is a performance optimization and there is another
    fallback for a slow path.

    - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) -
    non sleeping allocation with an expensive fallback so it can access
    some portion of memory reserves. Usually used from interrupt/bh
    context with an expensive slow path fallback.

    - GFP_KERNEL - both background and direct reclaim are allowed and the
    _default_ page allocator behavior is used. That means that !costly
    allocation requests are basically nofail but there is no guarantee of
    that behavior so failures have to be checked properly by callers
    (e.g. OOM killer victim is allowed to fail currently).

    - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior
    and all allocation requests fail early rather than cause disruptive
    reclaim (one round of reclaim in this implementation). The OOM killer
    is not invoked.

    - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator
    behavior and all allocation requests try really hard. The request
    will fail if the reclaim cannot make any progress. The OOM killer
    won't be triggered.

    - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
    and all allocation requests will loop endlessly until they succeed.
    This might be really dangerous especially for larger orders.

    Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL
    because they already had their semantic. No new users are added.
    __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if
    there is no progress and we have already passed the OOM point.

    This means that all the reclaim opportunities have been exhausted except
    the most disruptive one (the OOM killer), and a user-defined fallback
    behavior is more sensible than continuing to retry in the page allocator.
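
    The retry semantics listed above can be summarized as a small decision
    function. This is a toy userspace model, not the code in
    __alloc_pages_slowpath: the TOY_* flag values, struct toy_decision and
    toy_decide() are hypothetical names, the costly-order threshold of 3 is
    an assumption, and so is the treatment of costly-order default requests
    and of OOM eligibility for the default and nofail cases.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this toy model; not the kernel's values. */
#define TOY_NORETRY       0x1u
#define TOY_RETRY_MAYFAIL 0x2u
#define TOY_NOFAIL        0x4u

/* Assumed threshold: orders above this are treated as "costly". */
#define TOY_COSTLY_ORDER  3u

struct toy_decision {
	bool retry;   /* keep looping in the allocator slow path */
	bool may_oom; /* the OOM killer may be invoked */
};

/*
 * Decide what happens after one round of reclaim did not satisfy the
 * request. reclaim_progress says whether that round freed anything.
 */
static struct toy_decision toy_decide(unsigned int flags, unsigned int order,
				      bool reclaim_progress)
{
	struct toy_decision d;

	/* NORETRY and RETRY_MAYFAIL never trigger the OOM killer. */
	d.may_oom = !(flags & (TOY_NORETRY | TOY_RETRY_MAYFAIL));

	if (flags & TOY_NOFAIL)
		d.retry = true;              /* loop until success */
	else if (flags & TOY_NORETRY)
		d.retry = false;             /* fail early after one round */
	else if (flags & TOY_RETRY_MAYFAIL)
		d.retry = reclaim_progress;  /* try hard, fail on no progress */
	else
		/* Default: !costly requests are basically nofail (assumed:
		 * costly ones continue only while reclaim makes progress). */
		d.retry = order <= TOY_COSTLY_ORDER || reclaim_progress;

	return d;
}
```

    A caller using the RETRY_MAYFAIL policy is expected to handle a failed
    allocation with its own fallback, which is the case the model returns
    retry == false for when reclaim makes no progress.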

    [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c]
    [mhocko@suse.com: semantic fix]
    Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz
    [mhocko@kernel.org: address other thing spotted by Vlastimil]
    Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz
    Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Alex Belits
    Cc: Chris Wilson
    Cc: Christoph Hellwig
    Cc: Darrick J. Wong
    Cc: David Daney
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: NeilBrown
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • Patch series "stackprotector: ascii armor the stack canary", v2.

    Zero out the first byte of the stack canary value on 64-bit systems, in
    order to mitigate unterminated C string overflows.

    The null byte prevents C string functions both from reading the canary
    and from writing it, should the canary value be guessed or obtained
    through some other means.

    Reducing the entropy by 8 bits is acceptable on 64-bit systems, which
    will still have 56 bits of entropy left, but not on 32-bit systems, so
    the "ascii armor" canary is only implemented on 64-bit systems.

    Inspired by the "ascii armor" code in execshield and Daniel Micay's
    linux-hardened tree.

    Also see https://github.com/thestinger/linux-hardened/

    This patch (of 5):

    Introduce get_random_canary(), which provides a random unsigned long
    canary value with the first byte zeroed out on 64-bit architectures, in
    order to mitigate non-terminated C string overflows.

    The null byte prevents C string functions both from reading the canary
    and from writing it, should the canary value be guessed or obtained
    through some other means.

    Reducing the entropy by 8 bits is acceptable on 64-bit systems, which
    will still have 56 bits of entropy left, but not on 32-bit systems, so
    the "ascii armor" canary is only implemented on 64-bit systems.

    Inspired by the "ascii armor" code in the old execshield patches, and
    Daniel Micay's linux-hardened tree.
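
    The effect of zeroing the first byte can be illustrated in userspace.
    This is a minimal sketch, not the kernel's get_random_canary():
    ascii_armor() and bytes_before_nul() are hypothetical helpers, and the
    first byte is cleared through a byte copy so the sketch does not depend
    on endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clear the lowest-addressed byte of a 64-bit value, keeping the other 56 bits. */
static uint64_t ascii_armor(uint64_t raw)
{
	unsigned char bytes[sizeof(raw)];

	memcpy(bytes, &raw, sizeof(raw));
	bytes[0] = 0; /* the byte a C string function would reach first */
	memcpy(&raw, bytes, sizeof(raw));
	return raw;
}

/*
 * Count how many bytes a C string function would process before hitting
 * a NUL when walking over the value in memory order.
 */
static size_t bytes_before_nul(uint64_t value)
{
	unsigned char bytes[sizeof(value)];
	const unsigned char *nul;

	memcpy(bytes, &value, sizeof(value));
	nul = memchr(bytes, 0, sizeof(bytes));
	return nul ? (size_t)(nul - bytes) : sizeof(bytes);
}
```

    With the armored value, a string copy that runs into the canary stops at
    its first byte, so it can neither leak the remaining bytes nor write a
    matching canary past that point.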

    Link: http://lkml.kernel.org/r/20170524155751.424-2-riel@redhat.com
    Signed-off-by: Rik van Riel
    Acked-by: Kees Cook
    Cc: Daniel Micay
    Cc: "Theodore Ts'o"
    Cc: H. Peter Anvin
    Cc: Andy Lutomirski
    Cc: Ingo Molnar
    Cc: Catalin Marinas
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
     
  • This adds support for compiling with a rough equivalent to the glibc
    _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
    overflow checks for string.h functions when the compiler determines the
    size of the source or destination buffer at compile-time. Unlike glibc,
    it covers buffer reads in addition to writes.

    GNU C __builtin_*_chk intrinsics are avoided because they would force a
    much more complex implementation. They aren't designed to detect read
    overflows and offer no real benefit when using an implementation based
    on inline checks. Inline checks don't add up to much code size and
    allow full use of the regular string intrinsics while avoiding the need
    for a bunch of _chk functions and per-arch assembly to avoid wrapper
    overhead.

    This detects various overflows at compile-time in various drivers and
    some non-x86 core kernel code. There will likely be issues caught in
    regular use at runtime too.
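
    The inline-check approach described above can be sketched in userspace
    as follows. This is an illustration under assumptions, not the kernel's
    string.h code: fortified_memcpy_checked() and the fortified_memcpy()
    macro are hypothetical names, and the sketch returns -1 on a detected
    overflow so it can be exercised safely, where a hardened implementation
    would treat the overflow as fatal. It relies on the GCC/Clang builtin
    __builtin_object_size(), which yields (size_t)-1 when the size is not
    known at compile time.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy n bytes only if both buffers are large enough.
 * A size of (size_t)-1 means "unknown", so the comparison can never fire.
 * Returns 0 on success, -1 if the copy would overrun either buffer.
 */
static int fortified_memcpy_checked(void *dst, size_t dst_size,
				    const void *src, size_t src_size,
				    size_t n)
{
	if (n > dst_size)
		return -1; /* write overflow */
	if (n > src_size)
		return -1; /* read overflow */
	memcpy(dst, src, n);
	return 0;
}

/*
 * Convenience wrapper: ask the compiler for the object sizes.
 * Mode 0 reports the space left in the whole enclosing object, which is
 * the conservative choice; mode 1 would catch intra-object overflows.
 */
#define fortified_memcpy(dst, src, n)                                      \
	fortified_memcpy_checked((dst), __builtin_object_size((dst), 0),   \
				 (src), __builtin_object_size((src), 0),   \
				 (n))
```

    Because the check is an ordinary comparison placed in front of the
    regular memcpy(), no separate _chk function is needed, and reads from
    the source are covered the same way as writes to the destination.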

    Future improvements left out of the initial implementation for simplicity,
    as it's all quite optional and can be done incrementally:

    * Some of the fortified string functions (strncpy, strcat) don't yet
    place a limit on reads from the source based on __builtin_object_size of
    the source buffer.

    * Extending coverage to more string functions like strlcat.

    * It should be possible to optionally use __builtin_object_size(x, 1) for
    some functions (C strings) to detect intra-object overflows (like
    glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
    approach to avoid likely compatibility issues.

    * The compile-time checks should be made available via a separate config
    option which can be enabled by default (or always enabled) once enough
    time has passed to get the issues it catches fixed.

    Kees said:
    "This is great to have. While it was out-of-tree code, it would have
    blocked at least CVE-2016-3858 from being exploitable (improper size
    argument to strlcpy()). I've sent a number of fixes for
    out-of-bounds-reads that this detected upstream already"

    [arnd@arndb.de: x86: fix fortified memcpy]
    Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de
    [keescook@chromium.org: avoid panic() in favor of BUG()]
    Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast
    [keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help]
    Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com
    Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org
    Signed-off-by: Daniel Micay
    Signed-off-by: Kees Cook
    Signed-off-by: Arnd Bergmann
    Acked-by: Kees Cook
    Cc: Mark Rutland
    Cc: Daniel Axtens
    Cc: Rasmus Villemoes
    Cc: Andy Shevchenko
    Cc: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Micay
     
  • Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
    HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.

    LOCKUP_DETECTOR implies the general boot, sysctl, and programming
    interfaces for the lockup detectors.

    An architecture that wants to use a hard lockup detector must define
    HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH.

    Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the
    minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and
    does not implement the LOCKUP_DETECTOR interfaces.

    sparc is unusual in that it has started to implement some of the
    interfaces, but not fully yet. It should probably be converted to a full
    HAVE_HARDLOCKUP_DETECTOR_ARCH.

    [npiggin@gmail.com: fix]
    Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com
    Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com
    Signed-off-by: Nicholas Piggin
    Reviewed-by: Don Zickus
    Reviewed-by: Babu Moger
    Tested-by: Babu Moger [sparc]
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicholas Piggin
     
  • For architectures that define HAVE_NMI_WATCHDOG, instead of having them
    provide the complete touch_nmi_watchdog() function, just have them
    provide arch_touch_nmi_watchdog().

    This gives the generic code more flexibility in implementing this
    function, and arch implementations don't miss out on touching the
    softlockup watchdog or other generic details.
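
    The resulting split can be sketched as a generic wrapper around the
    arch hook. This is a userspace illustration only: the two stub
    functions and their counters are hypothetical stand-ins for the arch
    implementation and for the generic soft lockup code.

```c
/* Counters that let the sketch observe which hooks were called. */
static unsigned int arch_touches;
static unsigned int softlockup_touches;

/* Stand-in for the minimal hook an architecture provides. */
static void arch_touch_nmi_watchdog(void)
{
	arch_touches++;
}

/* Stand-in for the generic soft lockup watchdog touch. */
static void touch_softlockup_watchdog(void)
{
	softlockup_touches++;
}

/*
 * Generic entry point: call the arch hook, then handle the generic
 * details, so no architecture can forget the soft lockup watchdog.
 */
static void touch_nmi_watchdog(void)
{
	arch_touch_nmi_watchdog();
	touch_softlockup_watchdog();
}
```

    Keeping the wrapper in generic code means later changes to what a touch
    involves only need to be made in one place.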

    Link: http://lkml.kernel.org/r/20170616065715.18390-3-npiggin@gmail.com
    Signed-off-by: Nicholas Piggin
    Reviewed-by: Don Zickus
    Reviewed-by: Babu Moger
    Tested-by: Babu Moger [sparc]
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicholas Piggin