19 Jun, 2017

1 commit

  • A stack guard page is a useful feature to reduce the risk of stack smashing
    into a different mapping. We have been using a single page gap which
    is sufficient to prevent having stack adjacent to a different mapping.
    But this seems to be insufficient in the light of the stack usage in
    userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
    used functions. Others use constructs like gid_t buffer[NGROUPS_MAX]
    which is 256kB or stack strings with MAX_ARG_STRLEN.

    This becomes especially dangerous for suid binaries combined with the
    default of no limit on the stack size, because those applications can be
    tricked into consuming a large portion of the stack, and a single glibc
    call could then jump over the guard page. Unfortunately, these attacks
    are not theoretical.

    Make those attacks less probable by increasing the stack guard gap
    to 1MB (on systems with 4k pages; but make it depend on the page size
    because systems with larger base pages might cap stack allocations in
    the PAGE_SIZE units) which should cover larger alloca() and VLA stack
    allocations. It is obviously not a full fix because the problem is
    somehow inherent, but it should reduce attack space a lot.

    One could argue that the gap size should be configurable from userspace,
    but that can be done later when somebody finds that the new 1MB is wrong
    for some special case applications. For now, add a kernel command line
    option (stack_guard_gap) to specify the stack gap size (in page units).
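
The arithmetic behind the gap size can be sketched in userspace C. This is a minimal model, not the kernel's code: the 4 KiB page size, the 256-page default (1 MiB / 4 KiB) and every `sketch_`/`SKETCH_` name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: 4 KiB base pages; a real kernel takes this from the architecture. */
#define SKETCH_PAGE_SHIFT 12UL

/* Default gap of 256 pages: 256 << 12 = 1 MiB with 4 KiB pages. */
#define SKETCH_DEFAULT_GAP_PAGES 256UL

/*
 * Convert the value of a "stack_guard_gap=<pages>" option into bytes.
 * A string that is not a plain decimal number leaves the default in place.
 */
unsigned long sketch_gap_bytes(const char *arg)
{
    unsigned long pages = SKETCH_DEFAULT_GAP_PAGES;
    unsigned long val;
    char *end;

    assert(arg != NULL);
    val = strtoul(arg, &end, 10);
    if (end != arg && *end == '\0')
        pages = val;
    return pages << SKETCH_PAGE_SHIFT;
}
```

Because the gap is expressed in pages, the same option value yields a larger gap in bytes on systems with larger base pages, which matches the intent described above.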

    Implementation wise, first delete all the old code for stack guard page:
    because although we could get away with accounting one extra page in a
    stack vma, accounting a larger gap can break userspace - case in point,
    a program run with "ulimit -S -v 20000" failed when the 1MB gap was
    counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
    and strict non-overcommit mode.

    Instead of keeping the gap inside the stack vma, maintain the stack guard
    gap as a gap between vmas: using vm_start_gap() in place of vm_start
    (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
    places which need to respect the gap - mainly arch_get_unmapped_area(),
    and the vma tree's subtree_gap support for that.

    Original-patch-by: Oleg Nesterov
    Original-patch-by: Michal Hocko
    Signed-off-by: Hugh Dickins
    Acked-by: Michal Hocko
    Tested-by: Helge Deller # parisc
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

13 Jun, 2017

1 commit

  • Pull Xtensa fixes from Max Filippov:

    - don't use linux IRQ #0 in legacy irq domains: fixes timer interrupt
    assignment when its hardware IRQ # is 0 and the kernel is built w/o
    device tree support

    - reduce reservation size for double exception vector literals from 48
    to 20 bytes: fixes build on cores with small user exception vector

    - cleanups: use kmalloc_array instead of kmalloc in simdisk_init and
    seq_puts instead of seq_printf in c_show.

    * tag 'xtensa-20170612' of git://github.com/jcmvbkbc/linux-xtensa:
    xtensa: don't use linux IRQ #0
    xtensa: reduce double exception literal reservation
    xtensa: ISS: Use kmalloc_array() in simdisk_init()
    xtensa: Use seq_puts() in c_show()

    Linus Torvalds
     

06 Jun, 2017

2 commits

  • Linux IRQ #0 is reserved for error reporting and may not be used.
    Increase NR_IRQS for one additional slot and increase
    irq_domain_add_legacy parameter first_irq value to 1, so that linux
    IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains.
    Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa
    PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform
    data definitions.

    This fixes inability to use hardware IRQ #0 in configurations that don't
    use device tree and allows for non-identity mapping between linux IRQ #
    and hardware IRQ #.
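
A static translation of this kind can be sketched as a fixed offset. The macro and function names below are hypothetical stand-ins, and the offset of 1 is inferred from the first_irq value mentioned above rather than copied from the patch.

```c
#include <assert.h>

/* Assumption: the legacy domain starts at linux IRQ 1, leaving linux IRQ 0 unused. */
#define SKETCH_FIRST_LINUX_IRQ 1u

/* Static hardware IRQ # to linux IRQ # translation (hypothetical stand-in). */
#define SKETCH_PIC_LINUX_IRQ(hwirq) ((hwirq) + SKETCH_FIRST_LINUX_IRQ)

/* One extra slot is needed because linux IRQ 0 no longer maps to anything. */
unsigned int sketch_nr_irqs(unsigned int nr_hw_irqs)
{
    return nr_hw_irqs + SKETCH_FIRST_LINUX_IRQ;
}

/* Checked translation for a controller with nr_hw_irqs hardware lines. */
unsigned int sketch_linux_irq(unsigned int hwirq, unsigned int nr_hw_irqs)
{
    assert(hwirq < nr_hw_irqs);
    return SKETCH_PIC_LINUX_IRQ(hwirq);
}
```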

    Cc: stable@vger.kernel.org
    Signed-off-by: Max Filippov

    Max Filippov
     
  • Double exception vector only needs 20 bytes of space for 5 literals, not
    48. Reduce the reservation for double exception vector literals
    accordingly. This fixes build for configurations with small user
    exception vector size.

    Signed-off-by: Max Filippov

    Max Filippov
     

10 May, 2017

1 commit

  • Regularly, when a new header is created in include/uapi/, the developer
    forgets to add it in the corresponding Kbuild file. This error is usually
    detected after the release is out.

    In fact, all headers under uapi directories should be exported, thus it's
    useless to have an exhaustive list.

    After this patch, the following files, which were not exported, are now
    exported (with make headers_install_all):
    asm-arc/kvm_para.h
    asm-arc/ucontext.h
    asm-blackfin/shmparam.h
    asm-blackfin/ucontext.h
    asm-c6x/shmparam.h
    asm-c6x/ucontext.h
    asm-cris/kvm_para.h
    asm-h8300/shmparam.h
    asm-h8300/ucontext.h
    asm-hexagon/shmparam.h
    asm-m32r/kvm_para.h
    asm-m68k/kvm_para.h
    asm-m68k/shmparam.h
    asm-metag/kvm_para.h
    asm-metag/shmparam.h
    asm-metag/ucontext.h
    asm-mips/hwcap.h
    asm-mips/reg.h
    asm-mips/ucontext.h
    asm-nios2/kvm_para.h
    asm-nios2/ucontext.h
    asm-openrisc/shmparam.h
    asm-parisc/kvm_para.h
    asm-powerpc/perf_regs.h
    asm-sh/kvm_para.h
    asm-sh/ucontext.h
    asm-tile/shmparam.h
    asm-unicore32/shmparam.h
    asm-unicore32/ucontext.h
    asm-x86/hwcap2.h
    asm-xtensa/kvm_para.h
    drm/armada_drm.h
    drm/etnaviv_drm.h
    drm/vgem_drm.h
    linux/aspeed-lpc-ctrl.h
    linux/auto_dev-ioctl.h
    linux/bcache.h
    linux/btrfs_tree.h
    linux/can/vxcan.h
    linux/cifs/cifs_mount.h
    linux/coresight-stm.h
    linux/cryptouser.h
    linux/fsmap.h
    linux/genwqe/genwqe_card.h
    linux/hash_info.h
    linux/kcm.h
    linux/kcov.h
    linux/kfd_ioctl.h
    linux/lightnvm.h
    linux/module.h
    linux/nbd-netlink.h
    linux/nilfs2_api.h
    linux/nilfs2_ondisk.h
    linux/nsfs.h
    linux/pr.h
    linux/qrtr.h
    linux/rpmsg.h
    linux/sched/types.h
    linux/sed-opal.h
    linux/smc.h
    linux/smc_diag.h
    linux/stm.h
    linux/switchtec_ioctl.h
    linux/vfio_ccw.h
    linux/wil6210_uapi.h
    rdma/bnxt_re-abi.h

    Note that I have removed from this list the files which are generated in every
    exported directory (like .install or .install.cmd).

    Thanks to Julien Floret for the tip to get all
    subdirs with a pure makefile command.

    For the record, note that exported files for asm directories are a mix of
    files listed by:
    - include/uapi/asm-generic/Kbuild.asm;
    - arch//include/uapi/asm/Kbuild;
    - arch//include/asm/Kbuild.

    Signed-off-by: Nicolas Dichtel
    Acked-by: Daniel Vetter
    Acked-by: Russell King
    Acked-by: Mark Salter
    Acked-by: Michael Ellerman (powerpc)
    Signed-off-by: Masahiro Yamada

    Nicolas Dichtel
     

09 May, 2017

4 commits

  • Pull PCI updates from Bjorn Helgaas:

    - add framework for supporting PCIe devices in Endpoint mode (Kishon
    Vijay Abraham I)

    - use non-postable PCI config space mappings when possible (Lorenzo
    Pieralisi)

    - clean up and unify mmap of PCI BARs (David Woodhouse)

    - export and unify Function Level Reset support (Christoph Hellwig)

    - avoid FLR for Intel 82579 NICs (Sasha Neftin)

    - add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)

    - short-circuit config access failures for disconnected devices (Keith
    Busch)

    - remove D3 sleep delay when possible (Adrian Hunter)

    - freeze PME scan before suspending devices (Lukas Wunner)

    - stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)

    - disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)

    - add arch-specific alignment control to improve device passthrough by
    avoiding multiple BARs in a page (Yongji Xie)

    - add sysfs sriov_drivers_autoprobe to control VF driver binding
    (Bodong Wang)

    - allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)

    - fix crashes when unbinding host controllers that don't support
    removal (Brian Norris)

    - add driver for MicroSemi Switchtec management interface (Logan
    Gunthorpe)

    - add driver for Faraday Technology FTPCI100 host bridge (Linus
    Walleij)

    - add i.MX7D support (Andrey Smirnov)

    - use generic MSI support for Aardvark (Thomas Petazzoni)

    - make Rockchip driver modular (Brian Norris)

    - advertise 128-byte Read Completion Boundary support for Rockchip
    (Shawn Lin)

    - advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)

    - convert atomic_t to refcount_t in HV driver (Elena Reshetova)

    - add CPU IRQ affinity in HV driver (K. Y. Srinivasan)

    - fix PCI bus removal in HV driver (Long Li)

    - add support for ThunderX2 DMA alias topology (Jayachandran C)

    - add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)

    - add ITE 8893 bridge DMA alias quirk (Jarod Wilson)

    - restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
    (Manish Jaggi)

    * tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
    PCI: Don't allow unbinding host controllers that aren't prepared
    ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
    MAINTAINERS: Add PCI Endpoint maintainer
    Documentation: PCI: Add userguide for PCI endpoint test function
    tools: PCI: Add sample test script to invoke pcitest
    tools: PCI: Add a userspace tool to test PCI endpoint
    Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
    misc: Add host side PCI driver for PCI test function device
    PCI: Add device IDs for DRA74x and DRA72x
    dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
    PCI: dwc: dra7xx: Workaround for errata id i870
    dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
    PCI: dwc: dra7xx: Add EP mode support
    PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
    dt-bindings: PCI: Add DT bindings for PCI designware EP mode
    PCI: dwc: designware: Add EP mode support
    Documentation: PCI: Add binding documentation for pci-test endpoint function
    ixgbe: Use pcie_flr() instead of duplicating it
    IB/hfi1: Use pcie_flr() instead of duplicating it
    PCI: imx6: Fix spelling mistake: "contol" -> "control"
    ...

    Linus Torvalds
     
  • * A multiplication for the size determination of a memory allocation
    indicated that an array data structure should be processed.
    Thus use the corresponding function "kmalloc_array".

    This issue was detected by using the Coccinelle software.

    * Replace the specification of a data type by a pointer dereference
    to make the corresponding size determination a bit safer according to
    the Linux coding style convention.
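
Both points can be illustrated with a userspace analogue. This sketch assumes malloc() in place of the kernel allocator; `sketch_malloc_array` and `struct sketch_disk` are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Array allocation that refuses requests whose byte count would overflow. */
void *sketch_malloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    return malloc(n * size);
}

/* Hypothetical element type, used only for illustration. */
struct sketch_disk {
    int fd;
    unsigned long size;
};

struct sketch_disk *sketch_alloc_disks(size_t count)
{
    struct sketch_disk *disks;

    assert(count > 0);
    /* sizeof(*disks) stays correct even if the type of 'disks' changes later. */
    disks = sketch_malloc_array(count, sizeof(*disks));
    return disks;
}
```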

    Signed-off-by: Markus Elfring
    Signed-off-by: Max Filippov

    Markus Elfring
     
  • A string which did not contain a data format specification should be put
    into a sequence. Thus use the corresponding function "seq_puts".
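
The difference between the two calls can be modelled in userspace. The buffer type and both functions below are simplified stand-ins written for this illustration; they are not the kernel's seq_file implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for an output sequence: a fixed buffer and a fill count. */
struct sketch_seq {
    char buf[128];
    size_t count;
};

/* Append a plain string; the text is copied verbatim, '%' included. */
void sketch_seq_puts(struct sketch_seq *m, const char *s)
{
    size_t len;

    assert(m != NULL && s != NULL);
    len = strlen(s);
    if (m->count + len < sizeof(m->buf)) {
        memcpy(m->buf + m->count, s, len + 1);
        m->count += len;
    }
}

/* Append formatted text; only worth the format parsing when there are arguments. */
void sketch_seq_printf(struct sketch_seq *m, const char *fmt, ...)
{
    va_list ap;
    size_t room;
    int n;

    assert(m != NULL && fmt != NULL);
    room = sizeof(m->buf) - m->count;
    va_start(ap, fmt);
    n = vsnprintf(m->buf + m->count, room, fmt, ap);
    va_end(ap);
    if (n >= 0 && (size_t)n < room)
        m->count += (size_t)n;
    else
        m->buf[m->count] = '\0';    /* drop output that did not fit */
}
```

Using the plain-string variant for constant text skips format parsing and avoids surprises if the text ever contains a '%' character.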

    This issue was detected by using the Coccinelle software.

    Signed-off-by: Markus Elfring
    Signed-off-by: Max Filippov

    Markus Elfring
     
  • Pull Xtensa updates from Max Filippov:

    - clearly mark references to spilled register locations with SPILL_SLOT
    macros

    - clean up xtensa ptrace: use generic tracehooks, move internal kernel
    definitions from uapi/asm to asm, make locally-used functions static,
    fix code style and alignment

    - use command line parameters passed to ISS as kernel command line.

    * tag 'xtensa-20170507' of git://github.com/jcmvbkbc/linux-xtensa:
    xtensa: clean up access to spilled registers locations
    xtensa: use generic tracehooks
    xtensa: move internal ptrace definitions from uapi/asm to asm
    xtensa: clean up xtensa/kernel/ptrace.c
    xtensa: drop unused fast_io_protect function
    xtensa: use ITLB_HIT_BIT instead of hardcoded number
    xtensa: ISS: update kernel command line in platform_setup
    xtensa: ISS: add argc/argv simcall definitions
    xtensa: ISS: cleanup setup.c

    Linus Torvalds
     

08 May, 2017

1 commit


03 May, 2017

1 commit

  • Pull networking updates from David Miller:
    "Here are some highlights from the 2065 networking commits that
    happened this development cycle:

    1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)

    2) Add a generic XDP driver, so that anyone can test XDP even if they
    lack a networking device whose driver has explicit XDP support
    (me).

    3) Sparc64 now has an eBPF JIT too (me)

    4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
    Starovoitov)

    5) Make netfilter network namespace teardown less expensive (Florian
    Westphal)

    6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)

    7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)

    8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)

    9) Multiqueue support in stmmac driver (Joao Pinto)

    10) Remove TCP timewait recycling, it never really could possibly work
    well in the real world and timestamp randomization really zaps any
    hint of usability this feature had (Soheil Hassas Yeganeh)

    11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
    Aleksandrov)

    12) Add socket busy poll support to epoll (Sridhar Samudrala)

    13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
    and several others)

    14) IPSEC hw offload infrastructure (Steffen Klassert)"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
    tipc: refactor function tipc_sk_recv_stream()
    tipc: refactor function tipc_sk_recvmsg()
    net: thunderx: Optimize page recycling for XDP
    net: thunderx: Support for XDP header adjustment
    net: thunderx: Add support for XDP_TX
    net: thunderx: Add support for XDP_DROP
    net: thunderx: Add basic XDP support
    net: thunderx: Cleanup receive buffer allocation
    net: thunderx: Optimize CQE_TX handling
    net: thunderx: Optimize RBDR descriptor handling
    net: thunderx: Support for page recycling
    ipx: call ipxitf_put() in ioctl error path
    net: sched: add helpers to handle extended actions
    qed*: Fix issues in the ptp filter config implementation.
    qede: Fix concurrency issue in PTP Tx path processing.
    stmmac: Add support for SIMATIC IOT2000 platform
    net: hns: fix ethtool_get_strings overflow in hns driver
    tcp: fix wraparound issue in tcp_lp
    bpf, arm64: fix jit branch offset related to ldimm64
    bpf, arm64: implement jiting of BPF_XADD
    ...

    Linus Torvalds
     

02 May, 2017

2 commits

  • Pull uaccess unification updates from Al Viro:
    "This is the uaccess unification pile. It's _not_ the end of uaccess
    work, but the next batch of that will go into the next cycle. This one
    mostly takes copy_from_user() and friends out of arch/* and gets the
    zero-padding behaviour in sync for all architectures.

    Dealing with the nocache/writethrough mess is for the next cycle;
    fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
    sold on access_ok() in there, BTW; just not in this pile), same for
    reducing __copy_... callsites, strn*... stuff, etc. - there will be a
    pile about as large as this one in the next merge window.

    This one sat in -next for weeks. -3KLoC"
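
The zero-padding behaviour mentioned above can be modelled in userspace. In this sketch the fault is simulated by a `readable` byte count, and all `sketch_` names are hypothetical; it only illustrates the split between a raw per-architecture copy and a generic wrapper that zeroes the uncopied tail.

```c
#include <assert.h>
#include <string.h>

/*
 * Stand-in for an architecture's raw copy primitive: copies until a
 * simulated fault after 'readable' bytes and returns the number of
 * bytes NOT copied.
 */
size_t sketch_raw_copy(void *to, const void *from, size_t n, size_t readable)
{
    size_t done = n < readable ? n : readable;

    memcpy(to, from, done);
    return n - done;
}

/* Generic wrapper: whatever could not be copied is zero-filled in the destination. */
size_t sketch_copy_from_user(void *to, const void *from, size_t n, size_t readable)
{
    size_t left;

    assert(to != NULL && from != NULL);
    left = sketch_raw_copy(to, from, n, readable);
    if (left)
        memset((char *)to + (n - left), 0, left);
    return left;
}
```

Keeping the zeroing in one generic place is what lets every architecture behave the same way, since the raw primitive only has to report how much it failed to copy.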

    * 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
    HAVE_ARCH_HARDENED_USERCOPY is unconditional now
    CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
    m32r: switch to RAW_COPY_USER
    hexagon: switch to RAW_COPY_USER
    microblaze: switch to RAW_COPY_USER
    get rid of padding, switch to RAW_COPY_USER
    ia64: get rid of copy_in_user()
    ia64: sanitize __access_ok()
    ia64: get rid of 'segment' argument of __do_{get,put}_user()
    ia64: get rid of 'segment' argument of __{get,put}_user_check()
    ia64: add extable.h
    powerpc: get rid of zeroing, switch to RAW_COPY_USER
    esas2r: don't open-code memdup_user()
    alpha: fix stack smashing in old_adjtimex(2)
    don't open-code kernel_setsockopt()
    mips: switch to RAW_COPY_USER
    mips: get rid of tail-zeroing in primitives
    mips: make copy_from_user() zero tail explicitly
    mips: clean and reorder the forest of macros...
    mips: consolidate __invoke_... wrappers
    ...

    Linus Torvalds
     
  • Define macros SPILL_SLOT* that return a reference to the stack location
    of the spill slot for a specific register and use them instead of
    open-coded address calculations.

    Signed-off-by: Max Filippov

    Max Filippov
     

01 May, 2017

4 commits


29 Apr, 2017

1 commit


27 Apr, 2017

2 commits


20 Apr, 2017

1 commit


19 Apr, 2017

3 commits


08 Apr, 2017

1 commit

  • Introduce a new getsockopt operation to retrieve the socket cookie
    for a specific socket based on the socket fd. It returns a unique
    non-decreasing cookie for each socket.
    Tested: https://android-review.googlesource.com/#/c/358163/

    Acked-by: Willem de Bruijn
    Signed-off-by: Chenbo Feng
    Signed-off-by: David S. Miller

    Chenbo Feng
     

06 Apr, 2017

1 commit


05 Apr, 2017

1 commit


02 Apr, 2017

1 commit


01 Apr, 2017

2 commits


31 Mar, 2017

1 commit

  • When __pa is applied to a virtual address in the uncached KSEG region, the
    result is incorrect. Fix it by checking whether the original address is in
    the uncached KSEG and adjusting the result. This looks better than masking
    off bits because pfn_valid works correctly with the new __pa results,
    and it may be made to work in the noMMU case once we get a definition for
    the uncached memory view.
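
The adjustment can be sketched as follows. The address layout is an illustrative assumption (a cached view followed directly by an uncached view of the same physical memory), and the names are hypothetical; a real configuration takes these values from the core's headers.

```c
#include <assert.h>

/* Illustrative layout (assumption): two views of the same physical memory. */
#define SKETCH_KSEG_CACHED_VADDR 0xd0000000UL
#define SKETCH_KSEG_BYPASS_VADDR 0xd8000000UL
#define SKETCH_KSEG_SIZE         0x08000000UL
#define SKETCH_KSEG_PADDR        0x00000000UL

/* Virtual-to-physical translation that accepts both the cached and uncached view. */
unsigned long sketch_pa(unsigned long va)
{
    unsigned long off;

    assert(va >= SKETCH_KSEG_CACHED_VADDR);
    off = va - SKETCH_KSEG_CACHED_VADDR;
    if (off >= SKETCH_KSEG_SIZE)    /* address lies in the uncached view */
        off -= SKETCH_KSEG_SIZE;
    return off + SKETCH_KSEG_PADDR;
}
```

With this check, an address in either view resolves to the same physical address, which is what callers that pass coherent DMA addresses rely on.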

    This is required for the dma_common_mmap and DMA debug code to work
    correctly: they both indirectly use __pa with coherent DMA addresses.
    In case of DMA debug the visible effect is false reports that an address
    mapped for DMA is accessed by CPU.

    Cc: stable@vger.kernel.org
    Tested-by: Boris Brezillon
    Reviewed-by: Boris Brezillon
    Signed-off-by: Max Filippov

    Max Filippov
     

29 Mar, 2017

3 commits


25 Mar, 2017

1 commit

  • This socket option returns the NAPI ID associated with the queue on which
    the last frame was received. Applications can use this information to
    split the incoming flows among threads based on the Rx queue on which
    they are received.

    If the NAPI ID actually represents a sender_cpu then the value is ignored
    and 0 is returned.

    Signed-off-by: Sridhar Samudrala
    Signed-off-by: Alexander Duyck
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Sridhar Samudrala
     

23 Mar, 2017

1 commit

  • Allows reading of SK_MEMINFO_VARS via socket option. This way an
    application can get all meminfo related information in single socket
    option call instead of multiple calls.

    Adds helper function, sk_get_meminfo(), and uses that for both
    getsockopt and sock_diag_put_meminfo().

    Suggested by Eric Dumazet.

    Signed-off-by: Josh Hunt
    Reviewed-by: Jason Baron
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Josh Hunt
     

14 Mar, 2017

3 commits


10 Mar, 2017

1 commit

  • If an architecture uses 4level-fixup.h we don't need to do anything as
    it includes 5level-fixup.h.

    If an architecture uses pgtable-nop*d.h, define __ARCH_USE_5LEVEL_HACK
    before inclusion of the header. This makes asm-generic code use
    5level-fixup.h.

    If an architecture has 4-level paging or folds levels on its own,
    include 5level-fixup.h directly.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov