21 Mar, 2016

1 commit

  • Pull x86 protection key support from Ingo Molnar:
    "This tree adds support for a new memory protection hardware feature
    that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

    There's a background article at LWN.net:

    https://lwn.net/Articles/643797/

    The gist is that protection keys allow the encoding of
    user-controllable permission masks in the pte. So instead of having a
    fixed protection mask in the pte (which needs a system call to change
    and works on a per page basis), the user can map a (handful of)
    protection mask variants and can change the masks at runtime relatively
    cheaply, without having to change every single page in the affected
    virtual memory range.

    This allows the dynamic switching of the protection bits of large
    amounts of virtual memory, via user-space instructions. It also
    allows more precise control of MMU permission bits: for example the
    executable bit is separate from the read bit (see more about that
    below).

    This tree adds the MM infrastructure and low level x86 glue needed for
    that, plus it adds a high level API to make use of protection keys -
    if a user-space application calls:

    mmap(..., PROT_EXEC);

    or

    mprotect(ptr, sz, PROT_EXEC);

    (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
    this special case, and will set a special protection key on this
    memory range. It also sets the appropriate bits in the Protection
    Keys User Rights (PKRU) register so that the memory becomes unreadable
    and unwritable.

    So using protection keys the kernel is able to implement 'true'
    PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
    PROT_READ as well. Unreadable executable mappings have security
    advantages: they cannot be read via information leaks to figure out
    ASLR details, nor can they be scanned for ROP gadgets - and they
    cannot be used by exploits for data purposes either.

    We know about no user-space code that relies on pure PROT_EXEC
    mappings today, but binary loaders could start making use of this new
    feature to map binaries and libraries in a more secure fashion.

    There is other pending pkeys work that offers more high level system
    call APIs to manage protection keys - but those are not part of this
    pull request.

    Right now there's a Kconfig that controls this feature
    (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
    (like most x86 CPU feature enablement code that has no runtime
    overhead), but it's not user-configurable at the moment. If there's
    any serious problem with this then we can make it configurable and/or
    flip the default"

    * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
    x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
    mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
    x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
    mm/core, x86/mm/pkeys: Add execute-only protection keys support
    x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
    x86/mm/pkeys: Allow kernel to modify user pkey rights register
    x86/fpu: Allow setting of XSAVE state
    x86/mm: Factor out LDT init from context init
    mm/core, x86/mm/pkeys: Add arch_validate_pkey()
    mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
    x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
    x86/mm/pkeys: Add Kconfig prompt to existing config option
    x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
    x86/mm/pkeys: Dump PKRU with other kernel registers
    mm/core, x86/mm/pkeys: Differentiate instruction fetches
    x86/mm/pkeys: Optimize fault handling in access_error()
    mm/core: Do not enforce PKEY permissions on remote mm access
    um, pkeys: Add UML arch_*_access_permitted() methods
    mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
    x86/mm/gup: Simplify get_user_pages() PTE bit handling
    ...

    Linus Torvalds
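
The PKRU handling described in the message above can be sketched in user-space C. This is a minimal illustration, not the kernel's code: the function names are hypothetical, and the layout assumed is the architectural one (two bits per key, access-disable in the even bit, write-disable in the odd bit). Instruction fetches are not checked against PKRU, which is what makes an execute-only key possible.

```c
#include <stdint.h>

/* Assumed PKRU layout: 16 keys, 2 bits each. */
#define PKRU_BITS_PER_PKEY 2
#define PKRU_AD_BIT 0x1u /* access disable: blocks data reads and writes */
#define PKRU_WD_BIT 0x2u /* write disable: blocks data writes only */

/* Hypothetical helper: return a PKRU value in which data access through
 * `pkey` (0..15) is denied; the bits of all other keys are left alone. */
static uint32_t pkru_deny_data_access(uint32_t pkru, unsigned pkey)
{
    unsigned shift = pkey * PKRU_BITS_PER_PKEY;

    return pkru | ((PKRU_AD_BIT | PKRU_WD_BIT) << shift);
}

/* Hypothetical helper: nonzero if data reads through `pkey` are allowed. */
static int pkru_allows_read(uint32_t pkru, unsigned pkey)
{
    return !((pkru >> (pkey * PKRU_BITS_PER_PKEY)) & PKRU_AD_BIT);
}

/* Hypothetical helper: nonzero if data writes through `pkey` are allowed.
 * A write needs both the access-disable and write-disable bits clear. */
static int pkru_allows_write(uint32_t pkru, unsigned pkey)
{
    uint32_t bits = (pkru >> (pkey * PKRU_BITS_PER_PKEY)) &
                    (PKRU_AD_BIT | PKRU_WD_BIT);

    return bits == 0;
}
```

Because the permission lives in a per-thread register rather than in each pte, changing what a key allows touches one register value instead of every page in the range.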
     

20 Mar, 2016

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "This was delayed a day or two by some build-breakage on old toolchains
    which we've now fixed.

    There are two PCI commits, both acked by Bjorn.

    There's one commit to mm/hugepage.c which is (co)authored by Kirill.

    Highlights:
    - Restructure Linux PTE on Book3S/64 to Radix format from Paul
    Mackerras
    - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh
    Kumar K.V
    - Add POWER9 cputable entry from Michael Neuling
    - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
    - Add support for new ftrace ABI on ppc64le from Torsten Duwe

    Various cleanups & minor fixes from:
    - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy,
    Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell
    Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh.

    General:
    - atomics: Allow architectures to define their own __atomic_op_*
    helpers from Boqun Feng
    - Implement atomic{, 64}_*_return_* variants and acquire/release/
    relaxed variants for (cmp)xchg from Boqun Feng
    - Add powernv_defconfig from Jeremy Kerr
    - Fix BUG_ON() reporting in real mode from Balbir Singh
    - Add xmon command to dump OPAL msglog from Andrew Donnellan
    - Add xmon command to dump process/task similar to ps(1) from Douglas
    Miller
    - Clean up memory hotplug failure paths from David Gibson

    pci/eeh:
    - Redesign SR-IOV on PowerNV to give absolute isolation between VFs
    from Wei Yang.
    - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
    - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
    - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
    - MAINTAINERS: Update EEH details and maintainership from Russell
    Currey

    cxl:
    - Support added to the CXL driver for running on both bare-metal and
    hypervisor systems, from Christophe Lombard and Frederic Barrat.
    - Ignore probes for virtual afu pci devices from Vaibhav Jain

    perf:
    - Export Power8 generic and cache events to sysfs from Sukadev
    Bhattiprolu
    - hv-24x7: Fix usage with chip events, display change in counter
    values, display domain indices in sysfs, eliminate domain suffix in
    event names, from Sukadev Bhattiprolu

    Freescale:
    - Updates from Scott: "Highlights include 8xx optimizations, 32-bit
    checksum optimizations, 86xx consolidation, e5500/e6500 cpu
    hotplug, more fman and other dt bits, and minor fixes/cleanup"

    * tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
    powerpc: Fix unrecoverable SLB miss during restore_math()
    powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
    powerpc/rcpm: Fix build break when SMP=n
    powerpc/book3e-64: Use hardcoded mttmr opcode
    powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible
    powerpc/T104xRDB: add tdm riser card node to device tree
    powerpc32: PAGE_EXEC required for inittext
    powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree
    powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)
    powerpc/86xx: Introduce and use common dtsi
    powerpc/86xx: Update device tree
    powerpc/86xx: Move dts files to fsl directory
    powerpc/86xx: Switch to kconfig fragments approach
    powerpc/86xx: Update defconfigs
    powerpc/86xx: Consolidate common platform code
    powerpc32: Remove one insn in mulhdu
    powerpc32: small optimisation in flush_icache_range()
    powerpc: Simplify test in __dma_sync()
    powerpc32: move xxxxx_dcache_range() functions inline
    powerpc32: Remove clear_pages() and define clear_page() inline
    ...

    Linus Torvalds
     

19 Mar, 2016

1 commit

  • Pull cgroup updates from Tejun Heo:
    "cgroup changes for v4.6-rc1. No userland visible behavior changes in
    this pull request. I'll send out a separate pull request for the
    addition of cgroup namespace support.

    - The biggest change is the revamping of cgroup core task migration
    and controller handling logic. There are quite a few places where
    controllers and tasks are manipulated. Previously, many of those
    places implemented custom operations for each specific use case
    assuming specific starting conditions. While this worked, it makes
    the code fragile and difficult to follow.

    The bulk of this pull request restructures these operations so that
    most related operations are performed through common helpers which
    implement recursive (subtrees are always processed consistently)
    and idempotent (they make cgroup hierarchy converge to the target
    state rather than performing operations assuming specific starting
    conditions). This makes the code a lot easier to understand,
    verify and extend.

    - Implicit controller support is added. This is primarily for using
    perf_event on the v2 hierarchy so that perf can match cgroup v2
    path without requiring the user to do anything special. The kernel
    portion of perf_event changes is acked but userland changes are
    still pending review.

    - cgroup_no_v1= boot parameter added to ease testing cgroup v2 in
    certain environments.

    - There is a regression introduced during v4.4 devel cycle where
    attempts to migrate zombie tasks can mess up internal object
    management. This was fixed earlier this week and included in this
    pull request w/ stable cc'd.

    - Misc non-critical fixes and improvements"

    * 'for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (44 commits)
    cgroup: avoid false positive gcc-6 warning
    cgroup: ignore css_sets associated with dead cgroups during migration
    Documentation: cgroup v2: Trivial heading correction.
    cgroup: implement cgroup_subsys->implicit_on_dfl
    cgroup: use css_set->mg_dst_cgrp for the migration target cgroup
    cgroup: make cgroup[_taskset]_migrate() take cgroup_root instead of cgroup
    cgroup: move migration destination verification out of cgroup_migrate_prepare_dst()
    cgroup: fix incorrect destination cgroup in cgroup_update_dfl_csses()
    cgroup: Trivial correction to reflect controller.
    cgroup: remove stale item in cgroup-v1 document INDEX file.
    cgroup: update css iteration in cgroup_update_dfl_csses()
    cgroup: allocate 2x cgrp_cset_links when setting up a new root
    cgroup: make cgroup_calc_subtree_ss_mask() take @this_ss_mask
    cgroup: reimplement rebind_subsystems() using cgroup_apply_control() and friends
    cgroup: use cgroup_apply_enable_control() in cgroup creation path
    cgroup: combine cgroup_mutex locking and offline css draining
    cgroup: factor out cgroup_{apply|finalize}_control() from cgroup_subtree_control_write()
    cgroup: introduce cgroup_{save|propagate|restore}_control()
    cgroup: make cgroup_drain_offline() and cgroup_apply_control_{disable|enable}() recursive
    cgroup: factor out cgroup_apply_control_enable() from cgroup_subtree_control_write()
    ...

    Linus Torvalds
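
The "recursive and idempotent" property described in the message above can be illustrated with a toy model. The structure and function below are hypothetical and far simpler than the cgroup core: each node carries a target state and a current state, and one helper walks the subtree and makes the current state converge on the target, whatever the starting condition.

```c
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a cgroup: a wanted and an active controller mask. */
struct node {
    unsigned wanted;                  /* target state */
    unsigned active;                  /* current state */
    struct node *child[MAX_CHILDREN];
    size_t nr_children;
};

/* Make `n` and its whole subtree match their target state.
 * Returns the number of nodes that had to be changed. */
static unsigned apply_control(struct node *n)
{
    unsigned changes = 0;

    if (n->active != n->wanted) {
        n->active = n->wanted;
        changes++;
    }
    for (size_t i = 0; i < n->nr_children; i++)
        changes += apply_control(n->child[i]);

    return changes;
}
```

Calling the helper a second time changes nothing, so callers do not need to know which state the hierarchy started in.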
     

18 Mar, 2016

2 commits

  • Pull USB updates from Greg KH:
    "Here is the big USB patchset for 4.6-rc1.

    The normal mess is here, gadget and xhci fixes and updates, and lots
    of other driver updates and cleanups as well. Full details are in the
    shortlog.

    All have been in linux-next for a while with no reported issues"

    * tag 'usb-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (266 commits)
    USB: core: let USB device know device node
    usb: devio: Add ioctl to disallow detaching kernel USB drivers.
    usb: gadget: f_acm: Fix configfs attr name
    usb: udc: lpc32xx: remove USB PLL and USB OTG clock management
    usb: udc: lpc32xx: remove direct access to clock controller registers
    usb: udc: lpc32xx: switch to clock prepare/unprepare model
    usb: renesas_usbhs: gadget: fix giveback status code in usbhsg_pipe_disable()
    usb: gadget: renesas_usb3: Use ARCH_RENESAS
    usb: dwc2: Fix issues in dwc2_complete_non_isoc_xfer_ddma()
    usb: dwc2: Add support for Lantiq ARX and XRX SoCs
    usb: phy: generic: Handle late registration of gadget
    usb: gadget: bdc_udc: fix race condition in bdc_udc_exit()
    usb: musb: core: added missing const qualifier to musb_hdrc_platform_data::config
    usb: dwc2: Move host-specific core functions into hcd.c
    usb: dwc2: Move register save and restore functions
    usb: dwc2: Use kmem_cache_free()
    usb: dwc2: host: If using uframe scheduler, end splits better
    usb: dwc2: host: Totally redo the microframe scheduler
    usb: dwc2: host: Properly set even/odd frame
    usb: dwc2: host: Add dwc2_hcd_get_future_frame_number() call
    ...

    Linus Torvalds
     
  • Pull tty/serial updates from Greg KH:
    "Here's the big tty/serial driver pull request for 4.6-rc1.

    Lots of changes in here, Peter has been on a tear again, with lots of
    refactoring and bug fixes, many thanks to the great work he has been
    doing. Lots of driver updates and fixes as well, full details in the
    shortlog.

    All have been in linux-next for a while with no reported issues"

    * tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (220 commits)
    serial: 8250: describe CONFIG_SERIAL_8250_RSA
    serial: samsung: optimize UART rx fifo access routine
    serial: pl011: add mark/space parity support
    serial: sa1100: make sa1100_register_uart_fns a function
    tty: serial: 8250: add MOXA Smartio MUE boards support
    serial: 8250: convert drivers to use up_to_u8250p()
    serial: 8250/mediatek: fix building with SERIAL_8250=m
    serial: 8250/ingenic: fix building with SERIAL_8250=m
    serial: 8250/uniphier: fix modular build
    Revert "drivers/tty/serial: make 8250/8250_ingenic.c explicitly non-modular"
    Revert "drivers/tty/serial: make 8250/8250_mtk.c explicitly non-modular"
    serial: mvebu-uart: initial support for Armada-3700 serial port
    serial: mctrl_gpio: Add missing module license
    serial: ifx6x60: avoid uninitialized variable use
    tty/serial: at91: fix bad offset for UART timeout register
    tty/serial: at91: restore dynamic driver binding
    serial: 8250: Add hardware dependency to RT288X option
    TTY, devpts: document pty count limiting
    tty: goldfish: support platform_device with id -1
    drivers: tty: goldfish: Add device tree bindings
    ...

    Linus Torvalds
     

17 Mar, 2016

2 commits

  • Pull power management and ACPI updates from Rafael Wysocki:
    "This time the majority of changes go into cpufreq and they are
    significant.

    First off, the way CPU frequency updates are triggered is different
    now. Instead of having to set up and manage a deferrable timer for
    each CPU in the system to evaluate and possibly change its frequency
    periodically, cpufreq governors set up callbacks to be invoked by the
    scheduler on a regular basis (basically on utilization updates). The
    "old" governors, "ondemand" and "conservative", still do all of their
    work in process context (although that is triggered by the scheduler
    now), but intel_pstate does it all in the callback invoked by the
    scheduler with no need for any additional asynchronous processing.

    Of course, this eliminates the overhead related to the management of
    all those timers, but also it allows the cpufreq governor code to be
    simplified quite a bit. On top of that, the common code and data
    structures used by the "ondemand" and "conservative" governors are
    cleaned up and made more straightforward and some long-standing and
    quite annoying problems are addressed. In particular, the handling of
    governor sysfs attributes is modified and the related locking becomes
    more fine grained which allows some concurrency problems to be avoided
    (particularly deadlocks with the core cpufreq code).

    In principle, the new mechanism for triggering frequency updates
    allows utilization information to be passed from the scheduler to
    cpufreq. Although the current code doesn't make use of it, in the
    works is a new cpufreq governor that will make decisions based on the
    scheduler's utilization data. That should allow the scheduler and
    cpufreq to work more closely together in the long run.

    In addition to the core and governor changes, cpufreq drivers are
    updated too. Fixes and optimizations go into intel_pstate, the
    cpufreq-dt driver is updated on top of some modification in the
    Operating Performance Points (OPP) framework and there are fixes and
    other updates in the powernv cpufreq driver.

    Apart from the cpufreq updates there is some new ACPICA material,
    including a fix for a problem introduced by previous ACPICA updates,
    and some less significant changes in the ACPI code, like CPPC code
    optimizations, ACPI processor driver cleanups and support for loading
    ACPI tables from initrd.

    Also updated are the generic power domains framework, the Intel RAPL
    power capping driver and the turbostat utility and we have a bunch of
    traditional assorted fixes and cleanups.

    Specifics:

    - Redesign of cpufreq governors and the intel_pstate driver to make
    them use callbacks invoked by the scheduler to trigger CPU
    frequency evaluation instead of using per-CPU deferrable timers for
    that purpose (Rafael Wysocki).

    - Reorganization and cleanup of cpufreq governor code to make it more
    straightforward and fix some concurrency problems in it (Rafael
    Wysocki, Viresh Kumar).

    - Cleanup and improvements of locking in the cpufreq core (Viresh
    Kumar).

    - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
    Kumar, Eric Biggers).

    - intel_pstate driver updates including fixes, optimizations and a
    modification to make it enable hardware-coordinated P-state
    selection (HWP) by default if supported by the processor (Philippe
    Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
    Franciosi).

    - Operating Performance Points (OPP) framework updates to improve its
    handling of voltage regulators and device clocks and updates of the
    cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).

    - Updates of the powernv cpufreq driver to fix initialization and
    cleanup problems in it and correct its worker thread handling with
    respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
    Bhat).

    - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).

    - ACPICA updates including one fix for a regression introduced by
    previous changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
    Colin Ian King).

    - Support for installing ACPI tables from initrd (Lv Zheng).

    - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
    Chaugule).

    - Support for _HID(ACPI0010) devices (ACPI processor containers) and
    ACPI processor driver cleanups (Sudeep Holla).

    - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
    Aleksey Makarov).

    - Modification of the ACPI PCI IRQ management code to make it treat
    255 in the Interrupt Line register as "not connected" on x86 (as
    per the specification) and avoid attempts to use that value as a
    valid interrupt vector (Chen Fan).

    - ACPI APEI fixes related to resource leaks (Josh Hunt).

    - Removal of modularity from a few ACPI drivers (BGRT, GHES,
    intel_pmic_crc) that cannot be built as modules in practice (Paul
    Gortmaker).

    - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
    as a valid resource type (Harb Abdulhamid).

    - New device ID (future AMD I2C controller) in the ACPI driver for
    AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).

    - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).

    - cpuidle menu governor optimization to avoid a square root
    computation in it (Rasmus Villemoes).

    - Fix for potential use-after-free in the generic device properties
    framework (Heikki Krogerus).

    - Updates of the generic power domains (genpd) framework including
    support for multiple power states of a domain, fixes and debugfs
    output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
    Geert Uytterhoeven).

    - Intel RAPL power capping driver updates to reduce IPI overhead in
    it (Jacob Pan).

    - System suspend/hibernation code cleanups (Eric Biggers, Saurabh
    Sengar).

    - Year 2038 fix for the process freezer (Abhilash Jindal).

    - turbostat utility updates including new features (decoding of more
    registers and CPUID fields, sub-second intervals support, GFX MHz
    and RC6 printout, --out command line option), fixes (syscall jitter
    detection and workaround, reduction of the number of syscalls
    made, fixes related to Xeon x200 processors, compiler warning
    fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"
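
The callback-driven frequency evaluation described in the message above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: every name here is hypothetical and none of it is the kernel's interface, and a single hook stands in for the per-CPU ones. The governor registers a hook, the scheduler side calls it on each utilization update, and the hook itself decides whether enough time has passed to re-evaluate the frequency, so no timer has to be armed or cancelled.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical hook a governor hands to the scheduler side. */
struct util_hook {
    void (*func)(struct util_hook *hook, uint64_t now_ns);
    uint64_t last_eval_ns;    /* time of the last frequency evaluation */
    uint64_t min_interval_ns; /* minimum spacing between evaluations */
    unsigned evaluations;     /* how often the frequency was re-evaluated */
};

static struct util_hook *registered_hook;

/* Governor side: install (or remove, with NULL) the hook. */
static void set_util_hook(struct util_hook *hook)
{
    registered_hook = hook;
}

/* Scheduler side: called whenever utilization is updated. */
static void scheduler_utilization_update(uint64_t now_ns)
{
    if (registered_hook)
        registered_hook->func(registered_hook, now_ns);
}

/* Governor callback: rate-limit evaluations without using a timer. */
static void governor_update(struct util_hook *hook, uint64_t now_ns)
{
    if (now_ns - hook->last_eval_ns < hook->min_interval_ns)
        return; /* too soon since the last evaluation */

    hook->last_eval_ns = now_ns;
    hook->evaluations++; /* a real governor would pick a frequency here */
}
```

A governor that needs process context would use the callback only to kick off deferred work, while one that can act directly would do everything inside it.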

    * tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
    tools/power turbostat: bugfix: TDP MSRs print bits fixing
    tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
    tools/power turbostat: call __cpuid() instead of __get_cpuid()
    tools/power turbostat: indicate SMX and SGX support
    tools/power turbostat: detect and work around syscall jitter
    tools/power turbostat: show GFX%rc6
    tools/power turbostat: show GFXMHz
    tools/power turbostat: show IRQs per CPU
    tools/power turbostat: make fewer systems calls
    tools/power turbostat: fix compiler warnings
    tools/power turbostat: add --out option for saving output in a file
    tools/power turbostat: re-name "%Busy" field to "Busy%"
    tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
    tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
    tools/power turbostat: allow sub-sec intervals
    ACPI / APEI: ERST: Fixed leaked resources in erst_init
    ACPI / APEI: Fix leaked resources
    intel_pstate: Do not skip samples partially
    intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
    intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
    ...

    Linus Torvalds
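
The cpuidle item in the list above mentions avoiding a square root. The general technique is to compare squares instead of taking the root. The sketch below assumes a test of the form "average exceeds six standard deviations"; the factor of six and the function name are illustrative assumptions, not taken from the message.

```c
#include <stdint.h>

/* Nonzero if avg > 6 * sqrt(variance), computed without a square root.
 * Both sides are non-negative, so squaring preserves the ordering:
 *   avg > 6 * stddev   <=>   avg * avg > 36 * variance */
static int avg_exceeds_six_sigma(uint32_t avg, uint64_t variance)
{
    /* avg * avg always fits in 64 bits. If 36 * variance would overflow,
     * it is larger than any possible avg * avg, so the answer is "no". */
    if (variance > UINT64_MAX / 36)
        return 0;

    return (uint64_t)avg * avg > variance * 36;
}
```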
     
  • Merge first patch-bomb from Andrew Morton:

    - some misc things

    - ocfs2 updates

    - about half of MM

    - checkpatch updates

    - autofs4 update

    * emailed patches from Andrew Morton: (120 commits)
    autofs4: fix string.h include in auto_dev-ioctl.h
    autofs4: use pr_xxx() macros directly for logging
    autofs4: change log print macros to not insert newline
    autofs4: make autofs log prints consistent
    autofs4: fix some white space errors
    autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked()
    autofs4: fix coding style line length in autofs4_wait()
    autofs4: fix coding style problem in autofs4_get_set_timeout()
    autofs4: coding style fixes
    autofs: show pipe inode in mount options
    kallsyms: add support for relative offsets in kallsyms address table
    kallsyms: don't overload absolute symbol type for percpu symbols
    x86: kallsyms: disable absolute percpu symbols on !SMP
    checkpatch: fix another left brace warning
    checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses
    checkpatch: warn on bare unsigned or signed declarations without int
    checkpatch: exclude asm volatile from complex macro check
    mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate()
    mm: migrate: consolidate mem_cgroup_migrate() calls
    mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
    ...

    Linus Torvalds
     

16 Mar, 2016

5 commits

  • Page poisoning is currently set up as a feature if architectures don't
    have architecture debug page_alloc to allow unmapping of pages. It has
    uses apart from that though. Clearing of the pages on free provides an
    increase in security as it helps to limit the risk of information leaks.
    Allow page poisoning to be enabled as a separate option independent of
    kernel_map pages since the two features do separate work. Because of
    how hibernation is implemented, the checks on alloc cannot occur if
    hibernation is enabled. The runtime alloc checks can also be enabled
    with an option when !HIBERNATION.

    Credit to Grsecurity/PaX team for inspiring this work

    Signed-off-by: Laura Abbott
    Cc: Rafael J. Wysocki
    Cc: "Kirill A. Shutemov"
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Kees Cook
    Cc: Mathias Krause
    Cc: Dave Hansen
    Cc: Jianyu Zhan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
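
The two halves of page poisoning described above, filling a page on free and checking it on allocation, can be sketched in user-space C. This is a toy model: the names, the page size and the poison byte are assumptions for illustration, not the kernel's implementation.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096
#define POISON_BYTE 0xaa /* assumed poison pattern */

/* On free: overwrite the page so stale contents cannot leak later. */
static void poison_page(unsigned char *page)
{
    memset(page, POISON_BYTE, PAGE_SIZE_BYTES);
}

/* On alloc: return 1 if the pattern is intact, 0 if something wrote to
 * the page while it was free. */
static int check_poison(const unsigned char *page)
{
    for (size_t i = 0; i < PAGE_SIZE_BYTES; i++)
        if (page[i] != POISON_BYTE)
            return 0;
    return 1;
}
```

The fill step alone gives the information-leak protection; the check step is the separate part that the message says cannot run when hibernation is enabled.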
     
  • This patch extends existing "kernelcore" option and introduces
    kernelcore=mirror option. By specifying "mirror" instead of specifying
    the amount of memory, non-mirrored (non-reliable) region will be
    arranged into ZONE_MOVABLE.

    [akpm@linux-foundation.org: fix build with CONFIG_HAVE_MEMBLOCK_NODE_MAP=n]
    Signed-off-by: Taku Izumi
    Tested-by: Sudeep Holla
    Cc: Tony Luck
    Cc: Xishi Qiu
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Dave Hansen
    Cc: Matt Fleming
    Cc: Arnd Bergmann
    Cc: Steve Capper
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Taku Izumi
     
  • Pull irq updates from Thomas Gleixner:
    "The 4.6 pile of irq updates contains:

    - Support for IPI irqdomains to support proper integration of IPIs to
    and from coprocessors. The first user of this new facility is
    MIPS. The relevant MIPS patches come with the core to avoid merge
    ordering issues and have been acked by Ralf.

    - A new command line option to set the default interrupt affinity
    mask at boot time.

    - Support for some more new ARM and MIPS interrupt controllers:
    tango, alpine-msix and bcm6345-l1

    - Two small cleanups for x86/apic which we merged into irq/core to
    avoid yet another branch in x86 with two tiny commits.

    - The usual set of updates, cleanups in drivers/irqchip. Mostly in
    the area of ARM-GIC, armada-370-xp and atmel chips. Nothing
    outstanding here"

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
    irqchip/irq-alpine-msi: Release the correct domain on error
    irqchip/mxs: Fix error check of of_io_request_and_map()
    irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
    genirq: Export IRQ functions for module use
    irqchip/gic/realview: Support more RealView DCC variants
    Documentation/bindings: Document the Alpine MSIX driver
    irqchip: Add the Alpine MSIX interrupt controller
    irqchip/gic-v3: Always return IRQ_SET_MASK_OK_DONE in gic_set_affinity
    irqchip/gic-v3-its: Mark its_init() and its children as __init
    irqchip/gic-v3: Remove gic_root_node variable from the ITS code
    irqchip/gic-v3: ACPI: Add redistributor support via GICC structures
    irqchip/gic-v3: Add ACPI support for GICv3/4 initialization
    irqchip/gic-v3: Refactor gic_of_init() for GICv3 driver
    x86/apic: Deinline _flat_send_IPI_mask, save ~150 bytes
    x86/apic: Deinline __default_send_IPI_*, save ~200 bytes
    dt-bindings: interrupt-controller: Add SoC-specific compatible string to Marvell ODMI
    irqchip/mips-gic: Add new DT property to reserve IPIs
    MIPS: Delete smp-gic.c
    MIPS: Make smp CMP, CPS and MT use the new generic IPI functions
    MIPS: Add generic SMP IPI support
    ...

    Linus Torvalds
     
  • Pull x86 mm updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Enable full ASLR randomization for 32-bit programs (Hector
    Marco-Gisbert)

    - Add initial minimal INVPCID support, to flush global mappings (Andy
    Lutomirski)

    - Add KASAN enhancements (Andrey Ryabinin)

    - Fix mmiotrace for huge pages (Karol Herbst)

    - ... misc cleanups and small enhancements"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm/32: Enable full randomization on i386 and X86_32
    x86/mm/kmmio: Fix mmiotrace for hugepages
    x86/mm: Avoid premature success when changing page attributes
    x86/mm/ptdump: Remove paravirt_enabled()
    x86/mm: Fix INVPCID asm constraint
    x86/dmi: Switch dmi_remap() from ioremap() [uncached] to ioremap_cache()
    x86/mm: If INVPCID is available, use it to flush global mappings
    x86/mm: Add a 'noinvpcid' boot option to turn off INVPCID
    x86/mm: Add INVPCID helpers
    x86/kasan: Write protect kasan zero shadow
    x86/kasan: Clear kasan_zero_page after TLB flush
    x86/mm/numa: Check for failures in numa_clear_kernel_node_hotplug()
    x86/mm/numa: Clean up numa_clear_kernel_node_hotplug()
    x86/mm: Make kmap_prot into a #define
    x86/mm/32: Set NX in __supported_pte_mask before enabling paging
    x86/mm: Streamline and restore probe_memory_block_size()

    Linus Torvalds
     
  • Pull x86 asm updates from Ingo Molnar:
    "This is another big update. Main changes are:

    - lots of x86 system call (and other traps/exceptions) entry code
    enhancements. In particular the complex parts of the 64-bit entry
    code have been migrated to C code as well, and a number of dusty
    corners have been refreshed. (Andy Lutomirski)

    - vDSO special mapping robustification and general cleanups (Andy
    Lutomirski)

    - cpufeature refactoring, cleanups and speedups (Borislav Petkov)

    - lots of other changes ..."

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
    x86/cpufeature: Enable new AVX-512 features
    x86/entry/traps: Show unhandled signal for i386 in do_trap()
    x86/entry: Call enter_from_user_mode() with IRQs off
    x86/entry/32: Change INT80 to be an interrupt gate
    x86/entry: Improve system call entry comments
    x86/entry: Remove TIF_SINGLESTEP entry work
    x86/entry/32: Add and check a stack canary for the SYSENTER stack
    x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
    x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed
    x86/entry: Vastly simplify SYSENTER TF (single-step) handling
    x86/entry/traps: Clear DR6 early in do_debug() and improve the comment
    x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions
    x86/entry/32: Restore FLAGS on SYSEXIT
    x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
    x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test
    selftests/x86: In syscall_nt, test NT|TF as well
    x86/asm-offsets: Remove PARAVIRT_enabled
    x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
    uprobes: __create_xol_area() must nullify xol_mapping.fault
    x86/cpufeature: Create a new synthetic cpu capability for machine check recovery
    ...

    Linus Torvalds
     

15 Mar, 2016

1 commit

  • Pull scheduler updates from Ingo Molnar:
    "The main changes in this cycle are:

    - Make schedstats a runtime tunable (disabled by default) and
    optimize it via static keys.

    As most distributions enable CONFIG_SCHEDSTATS=y due to its
    instrumentation value, this is a nice performance enhancement.
    (Mel Gorman)

    - Implement 'simple waitqueues' (swait): these are just pure
    waitqueues without any of the more complex features of full-blown
    waitqueues (callbacks, wake flags, wake keys, etc.). Simple
    waitqueues have less memory overhead and are faster.

    Use simple waitqueues in the RCU code (in 4 different places) and
    for handling KVM vCPU wakeups.

    (Peter Zijlstra, Daniel Wagner, Thomas Gleixner, Paul Gortmaker,
    Marcelo Tosatti)

    - sched/numa enhancements (Rik van Riel)

    - NOHZ performance enhancements (Rik van Riel)

    - Various sched/deadline enhancements (Steven Rostedt)

    - Various fixes (Peter Zijlstra)

    - ... and a number of other fixes, cleanups and smaller enhancements"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
    sched/cputime: Fix steal_account_process_tick() to always return jiffies
    sched/deadline: Remove dl_new from struct sched_dl_entity
    Revert "kbuild: Add option to turn incompatible pointer check into error"
    sched/deadline: Remove superfluous call to switched_to_dl()
    sched/debug: Fix preempt_disable_ip recording for preempt_disable()
    sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity
    time, acct: Drop irq save & restore from __acct_update_integrals()
    acct, time: Change indentation in __acct_update_integrals()
    sched, time: Remove non-power-of-two divides from __acct_update_integrals()
    sched/rt: Kick RT bandwidth timer immediately on start up
    sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug
    sched/debug: Move sched_domain_sysctl to debug.c
    sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c
    sched/rt: Fix PI handling vs. sched_setscheduler()
    sched/core: Remove duplicated sched_group_set_shares() prototype
    sched/fair: Consolidate nohz CPU load update code
    sched/fair: Avoid using decay_load_missed() with a negative value
    sched/deadline: Always calculate end of period on sched_yield()
    sched/cgroup: Fix cgroup entity load tracking tear-down
    rcu: Use simple wait queues where possible in rcutree
    ...

    Linus Torvalds
     

12 Mar, 2016

1 commit


10 Mar, 2016

1 commit

  • Some HP laptops seem to have invalid 64 bit FADT X_PM* addresses
    which are causing various boot issues. In these cases, it would
    be useful to force ACPI to use the valid legacy 32 bit equivalent
    PM addresses. Add an acpi_force_32bit_fadt_addr to set the ACPICA
    acpi_gbl_use32_bit_fadt_addresses to TRUE to force this override.

    Link: https://bugs.launchpad.net/bugs/1529381
    Signed-off-by: Colin Ian King
    Signed-off-by: Rafael J. Wysocki

    Colin Ian King
     

08 Mar, 2016

2 commits

  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Armada-3700's uart is a simple serial port, which doesn't
    support configuring the modem control lines. The uart port has a
    32-byte Tx FIFO and a 64-byte Rx FIFO.

    The uart driver implements the uart core operations. It also supports the
    system (early) console based on Armada-3700's serial port.

    Known Issue:

    The uart driver currently doesn't support clock programming, which means
    the baud-rate stays with the default value configured by the bootloader
    at boot time.

    [gregory.clement@free-electrons.com: Rewrite many parts which are too long
    to enumerate]

    Signed-off-by: Wilson Ding
    Signed-off-by: Nadav Haklai
    Signed-off-by: Gregory CLEMENT
    Acked-by: Rob Herring
    Signed-off-by: Greg Kroah-Hartman

    Wilson Ding
     

06 Mar, 2016

1 commit


01 Mar, 2016

1 commit

  • Most newer Rockchip SoCs provide the possibility to use a usb-phy
    as passthrough for the debug uart (uart2), making it possible, for
    example, to get console output without needing to open the device.

    This patch adds an early_initcall to enable this functionality
    conditionally via the commandline and also disables the corresponding
    usb controller in the devicetree.

    Currently only data for the rk3288 is provided, but at least the
    rk3188 and arm64 rk3368 also provide this functionality and will be
    enabled later.

    On a spliced usb cable the signals are tx on white wire(D+) and
    rx on green wire(D-).

    The one caveat is that currently the reconfiguration of the phy
    happens as early_initcall, as the code depends on the unflattened
    devicetree being available. Everything is fine if only a regular
    console is active, as the console-replay will happen after the
    reconfiguration. But with earlycon active, output up to smp-init
    currently will get lost.

    The phy is an optional property for the connected dwc2 controller,
    so we still provide the phy device but fail all phy-ops with -EBUSY
    to make sure the dwc2 does not try to transmit anything on the
    repurposed phy.

    Signed-off-by: Heiko Stuebner
    Signed-off-by: Kishon Vijay Abraham I

    Heiko Stuebner
     

29 Feb, 2016

1 commit


22 Feb, 2016

1 commit

  • It may be useful to debug writes to the readonly sections of memory,
    so provide a cmdline "rodata=off" to allow for this. This can be
    expanded in the future to support "log" and "write" modes, but that
    will need to be architecture-specific.

    This also makes KDB software breakpoints more usable, as read-only
    mappings can now be disabled on any kernel.

    Suggested-by: H. Peter Anvin
    Signed-off-by: Kees Cook
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: David Brown
    Cc: Denys Vlasenko
    Cc: Emese Revfy
    Cc: Linus Torvalds
    Cc: Mathias Krause
    Cc: Michael Ellerman
    Cc: PaX Team
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kernel-hardening@lists.openwall.com
    Cc: linux-arch
    Link: http://lkml.kernel.org/r/1455748879-21872-3-git-send-email-keescook@chromium.org
    Signed-off-by: Ingo Molnar

    Kees Cook
     

19 Feb, 2016

1 commit

  • This sets the bit in 'cr4' to actually enable the protection
    keys feature. We also include a boot-time disable for the
    feature "nopku".

    Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE cpuid
    bit to appear set. At this point in boot, identify_cpu()
    has already run the actual CPUID instructions and populated
    the "cpu features" structures. We need to go back and
    re-run identify_cpu() to make sure it gets updated values.

    We *could* simply re-populate the 11th word of the cpuid
    data, but this is probably quick enough.

    Also note that with the cpu_has() check and X86_FEATURE_PKU
    present in disabled-features.h, we do not need an #ifdef
    for setup_pku().
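
    As a rough user-space illustration of the PKU/OSPKE distinction
    described above, the sketch below classifies /proc/cpuinfo-style text.
    It assumes the two feature bits show up in the "flags" line as "pku"
    and "ospke"; the helper names are made up for this example and are
    not part of the patch.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pkeys_state(cpuinfo_text):
    """Classify protection-key support from /proc/cpuinfo-style text.

    Assumption: the flag names are "pku" (CPU feature) and "ospke"
    (feature enabled by the OS via CR4.PKE).
    """
    flags = cpu_flags(cpuinfo_text)
    if "ospke" in flags:
        return "enabled"
    if "pku" in flags:
        return "supported"   # advertised by the CPU, not enabled by the OS
    return "absent"
```

    On a running system the input would typically be the contents of
    /proc/cpuinfo, e.g. pkeys_state(open("/proc/cpuinfo").read()).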

    Signed-off-by: Dave Hansen
    Reviewed-by: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20160212210229.6708027C@viggo.jf.intel.com
    [ Small readability edits. ]
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

18 Feb, 2016

1 commit


17 Feb, 2016

1 commit


16 Feb, 2016

1 commit


11 Feb, 2016

1 commit

  • Pull workqueue fixes from Tejun Heo:
    "Workqueue fixes for v4.5-rc3.

    - Remove a spurious triggering of flush dependency warning.

    - Officially break local execution guarantee of unbound work items
    and add a debug feature to flush out usages which depend on it.

    - Work around CPU -> NODE mapping becoming invalid on CPU offline.

    The branch is young but pushing out early as stable kernels are being
    affected"

    * 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
    workqueue: implement "workqueue.debug_force_rr_cpu" debug feature
    workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs
    Revert "workqueue: make sure delayed work run in local cpu"
    workqueue: skip flush dependency checks for legacy workqueues

    Linus Torvalds
     

10 Feb, 2016

1 commit

  • Workqueue used to guarantee local execution for work items queued
    without explicit target CPU. The guarantee is gone now which can
    break some usages in subtle ways. To flush out those cases, this
    patch implements a debug feature which forces round-robin CPU
    selection for all such work items.

    The debug feature defaults to off and can be enabled with a kernel
    parameter. The default can be flipped with a debug config option.
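
    To make the idea concrete, here is a small user-space sketch of
    round-robin CPU selection for work items queued without an explicit
    target CPU. It only models the selection policy described above; the
    class and method names are hypothetical and this is not the kernel
    implementation.

```python
import itertools


class RoundRobinCpuPicker:
    """Hand out CPUs from an allowed set in rotating order."""

    def __init__(self, allowed_cpus):
        cpus = sorted(allowed_cpus)
        if not cpus:
            raise ValueError("need at least one allowed CPU")
        self._cycle = itertools.cycle(cpus)

    def pick(self, requested_cpu=None):
        # An explicit target CPU is always honoured; only requests
        # without a target go through the rotation.
        if requested_cpu is not None:
            return requested_cpu
        return next(self._cycle)
```

    Code that silently relied on "queued here, runs here" would see its
    work land on a different CPU on almost every call under such a
    policy, which is what makes the dependency visible.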

    If you hit this commit during bisection, please refer to 041bd12e272c
    ("Revert "workqueue: make sure delayed work run in local cpu"") for
    more information and ping me.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

09 Feb, 2016

2 commits

  • This adds a chicken bit to turn off INVPCID in case something goes
    wrong. It's an early_param() because we do TLB flushes before we
    parse __setup() parameters.

    Signed-off-by: Andy Lutomirski
    Reviewed-by: Borislav Petkov
    Cc: Andrew Morton
    Cc: Andrey Ryabinin
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Luis R. Rodriguez
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Toshi Kani
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/f586317ed1bc2b87aee652267e515b90051af385.1454096309.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • schedstats is very useful during debugging and performance tuning but it
    incurs overhead to calculate the stats. As such, even though it can be
    disabled at build time, it is often enabled as the information is useful.

    This patch adds a kernel command-line and sysctl tunable to enable or
    disable schedstats on demand (when it's built in). It is disabled
    by default as someone who knows they need it can also learn to enable
    it when necessary.

    The benefits are dependent on how scheduler-intensive the workload is.
    If it is then the patch reduces the number of cycles spent calculating
    the stats with a small benefit from reducing the cache footprint of the
    scheduler.

    These measurements were taken from a 48-core 2-socket
    machine with Xeon(R) E5-2670 v3 cpus, although they were also tested on a
    single-socket 8-core machine with an Intel i7-3770 processor.

    netperf-tcp
                             4.5.0-rc1              4.5.0-rc1
                               vanilla           nostats-v3r1
    Hmean    64       560.45 (  0.00%)       575.98 (  2.77%)
    Hmean   128       766.66 (  0.00%)       795.79 (  3.80%)
    Hmean   256       950.51 (  0.00%)       981.50 (  3.26%)
    Hmean  1024      1433.25 (  0.00%)      1466.51 (  2.32%)
    Hmean  2048      2810.54 (  0.00%)      2879.75 (  2.46%)
    Hmean  3312      4618.18 (  0.00%)      4682.09 (  1.38%)
    Hmean  4096      5306.42 (  0.00%)      5346.39 (  0.75%)
    Hmean  8192     10581.44 (  0.00%)     10698.15 (  1.10%)
    Hmean 16384     18857.70 (  0.00%)     18937.61 (  0.42%)

    Small gains here. UDP_STREAM showed nothing interesting and neither did
    the TCP_RR tests. The gains on the 8-core machine were very similar.
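
    For readers who want to check the percentages, the following sketch
    recomputes them from the two value columns. It assumes the bracketed
    figure is the relative change against the vanilla kernel, signed so
    that positive means better; that matches the throughput rows above,
    but the function itself is illustrative and not taken from the
    benchmark tooling.

```python
def gain_pct(baseline, patched, higher_is_better=True):
    """Percentage change of patched relative to baseline, signed so
    that a positive value always means an improvement."""
    delta = patched - baseline if higher_is_better else baseline - patched
    return 100.0 * delta / baseline


# Throughput rows from the netperf-tcp table (higher is better).
print(f"Hmean 64:  {gain_pct(560.45, 575.98):.2f}%")
print(f"Hmean 128: {gain_pct(766.66, 795.79):.2f}%")
```

    For time-based results, where lower is better, pass
    higher_is_better=False. Because the tables show rounded values,
    recomputed figures for those rows can differ slightly in the last
    digits.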

    tbench4
                                   4.5.0-rc1              4.5.0-rc1
                                     vanilla           nostats-v3r1
    Hmean mb/sec-1          500.85 (  0.00%)       522.43 (  4.31%)
    Hmean mb/sec-2          984.66 (  0.00%)      1018.19 (  3.41%)
    Hmean mb/sec-4         1827.91 (  0.00%)      1847.78 (  1.09%)
    Hmean mb/sec-8         3561.36 (  0.00%)      3611.28 (  1.40%)
    Hmean mb/sec-16        5824.52 (  0.00%)      5929.03 (  1.79%)
    Hmean mb/sec-32       10943.10 (  0.00%)     10802.83 ( -1.28%)
    Hmean mb/sec-64       15950.81 (  0.00%)     16211.31 (  1.63%)
    Hmean mb/sec-128      15302.17 (  0.00%)     15445.11 (  0.93%)
    Hmean mb/sec-256      14866.18 (  0.00%)     15088.73 (  1.50%)
    Hmean mb/sec-512      15223.31 (  0.00%)     15373.69 (  0.99%)
    Hmean mb/sec-1024     14574.25 (  0.00%)     14598.02 (  0.16%)
    Hmean mb/sec-2048     13569.02 (  0.00%)     13733.86 (  1.21%)
    Hmean mb/sec-3072     12865.98 (  0.00%)     13209.23 (  2.67%)

    Small gains of 2-4% at low thread counts and otherwise flat. The
    gains on the 8-core machine were slightly different:

    tbench4 on 8-core i7-3770 single socket machine
    Hmean mb/sec-1          442.59 (  0.00%)       448.73 (  1.39%)
    Hmean mb/sec-2          796.68 (  0.00%)       794.39 ( -0.29%)
    Hmean mb/sec-4         1322.52 (  0.00%)      1343.66 (  1.60%)
    Hmean mb/sec-8         2611.65 (  0.00%)      2694.86 (  3.19%)
    Hmean mb/sec-16        2537.07 (  0.00%)      2609.34 (  2.85%)
    Hmean mb/sec-32        2506.02 (  0.00%)      2578.18 (  2.88%)
    Hmean mb/sec-64        2511.06 (  0.00%)      2569.16 (  2.31%)
    Hmean mb/sec-128       2313.38 (  0.00%)      2395.50 (  3.55%)
    Hmean mb/sec-256       2110.04 (  0.00%)      2177.45 (  3.19%)
    Hmean mb/sec-512       2072.51 (  0.00%)      2053.97 ( -0.89%)

    In contrast, this shows a relatively steady 2-3% gain at higher thread
    counts. Due to the nature of the patch and the type of workload, it's
    not a surprise that the result will depend on the CPU used.

    hackbench-pipes
                             4.5.0-rc1              4.5.0-rc1
                               vanilla           nostats-v3r1
    Amean     1       0.0637 (  0.00%)       0.0660 ( -3.59%)
    Amean     4       0.1229 (  0.00%)       0.1181 (  3.84%)
    Amean     7       0.1921 (  0.00%)       0.1911 (  0.52%)
    Amean    12       0.3117 (  0.00%)       0.2923 (  6.23%)
    Amean    21       0.4050 (  0.00%)       0.3899 (  3.74%)
    Amean    30       0.4586 (  0.00%)       0.4433 (  3.33%)
    Amean    48       0.5910 (  0.00%)       0.5694 (  3.65%)
    Amean    79       0.8663 (  0.00%)       0.8626 (  0.43%)
    Amean   110       1.1543 (  0.00%)       1.1517 (  0.22%)
    Amean   141       1.4457 (  0.00%)       1.4290 (  1.16%)
    Amean   172       1.7090 (  0.00%)       1.6924 (  0.97%)
    Amean   192       1.9126 (  0.00%)       1.9089 (  0.19%)

    Some small gains and losses and while the variance data is not included,
    it's close to the noise. The UMA machine did not show anything particularly
    different.

    pipetest
                                   4.5.0-rc1              4.5.0-rc1
                                     vanilla           nostats-v2r2
    Min         Time          4.13 (  0.00%)         3.99 (  3.39%)
    1st-qrtle   Time          4.38 (  0.00%)         4.27 (  2.51%)
    2nd-qrtle   Time          4.46 (  0.00%)         4.39 (  1.57%)
    3rd-qrtle   Time          4.56 (  0.00%)         4.51 (  1.10%)
    Max-90%     Time          4.67 (  0.00%)         4.60 (  1.50%)
    Max-93%     Time          4.71 (  0.00%)         4.65 (  1.27%)
    Max-95%     Time          4.74 (  0.00%)         4.71 (  0.63%)
    Max-99%     Time          4.88 (  0.00%)         4.79 (  1.84%)
    Max         Time          4.93 (  0.00%)         4.83 (  2.03%)
    Mean        Time          4.48 (  0.00%)         4.39 (  1.91%)
    Best99%Mean Time          4.47 (  0.00%)         4.39 (  1.91%)
    Best95%Mean Time          4.46 (  0.00%)         4.38 (  1.93%)
    Best90%Mean Time          4.45 (  0.00%)         4.36 (  1.98%)
    Best50%Mean Time          4.36 (  0.00%)         4.25 (  2.49%)
    Best10%Mean Time          4.23 (  0.00%)         4.10 (  3.13%)
    Best5%Mean  Time          4.19 (  0.00%)         4.06 (  3.20%)
    Best1%Mean  Time          4.13 (  0.00%)         4.00 (  3.39%)

    Small improvement and similar gains were seen on the UMA machine.

    The gain is small but it stands to reason that doing less work in the
    scheduler is a good thing. The downside is that the lack of schedstats and
    tracepoints may be surprising to experts doing performance analysis until
    they find the existence of the schedstats= parameter or schedstats sysctl.
    It will be automatically activated for latencytop and sleep profiling to
    alleviate the problem. For tracepoints, there is a simple warning as it's
    not safe to activate schedstats in the context when it's known the tracepoint
    may be wanted but is unavailable.

    Signed-off-by: Mel Gorman
    Reviewed-by: Matt Fleming
    Reviewed-by: Srikar Dronamraju
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
    Signed-off-by: Ingo Molnar

    Mel Gorman
     

08 Feb, 2016

1 commit

  • If we isolate CPUs, then we don't want random device interrupts on them. Even
    w/o the user space irq balancer enabled we can end up with irqs on non boot
    cpus and chasing newly requested interrupts is a tedious task.

    Allow the default irq affinity mask to be restricted.
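
    As a side illustration: affinity settings are usually written as CPU
    lists (such as "0-3,8") but displayed as hex masks, for instance in
    /proc/irq/default_smp_affinity. The sketch below converts one form
    to the other. The function name is hypothetical and the accepted
    list syntax is an assumption, not a description of the kernel's
    parser.

```python
def cpulist_to_mask(cpulist):
    """Convert a CPU list such as "0-3,8" into an integer bitmask."""
    mask = 0
    for chunk in cpulist.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        first, sep, last = chunk.partition("-")
        lo = int(first)
        hi = int(last) if sep else lo
        if hi < lo:
            raise ValueError(f"bad CPU range: {chunk!r}")
        for cpu in range(lo, hi + 1):
            mask |= 1 << cpu   # one bit per CPU number
    return mask
```

    For example, format(cpulist_to_mask("0-3,8"), "x") gives "10f":
    bits 0 to 3 plus bit 8.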

    Signed-off-by: Thomas Gleixner
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1602031948190.25254@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

04 Feb, 2016

1 commit

  • This patch provides a way of working around a slight regression
    introduced by commit 84638335900f ("mm: rework virtual memory
    accounting").

    Before that commit, RLIMIT_DATA controlled only the size of the brk
    region. The change caused problems with all existing versions of
    valgrind, because valgrind sets RLIMIT_DATA to zero.

    This patch fixes the rlimit check (the limit is actually in bytes, not
    pages) and by default turns it into a warning that is printed on the
    first VmData misuse:

    "mmap: top (795): VmData 516096 exceed data ulimit 512000. Will be forbidden soon."

    Behavior is controlled by boot param ignore_rlimit_data=y/n and by sysfs
    /sys/module/kernel/parameters/ignore_rlimit_data. For now it is set to "y".
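
    The bytes-versus-pages point can be illustrated with a short sketch.
    It assumes the accounted data size is kept in pages while the rlimit
    is expressed in bytes, so one side has to be converted before the
    comparison. The function and its parameters are made up for this
    example and do not mirror the kernel code.

```python
import resource

PAGE_SIZE = resource.getpagesize()


def exceeds_data_limit(data_pages, new_pages, limit_bytes,
                       page_size=PAGE_SIZE):
    """Return True if growing by new_pages would exceed limit_bytes."""
    if limit_bytes == resource.RLIM_INFINITY:
        return False
    # Convert pages to bytes so both sides use the same unit.
    return (data_pages + new_pages) * page_size > limit_bytes
```

    With 4096-byte pages, the figures from the warning above work out as
    126 pages (516096 bytes) against a 512000-byte limit, so the check
    fires. Comparing the raw page count 126 against 512000 would not.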

    [akpm@linux-foundation.org: tweak kernel-parameters.txt text]
    Signed-off-by: Konstantin Khlebnikov
    Link: http://lkml.kernel.org/r/20151228211015.GL2194@uranus
    Reported-by: Christian Borntraeger
    Cc: Cyrill Gorcunov
    Cc: Linus Torvalds
    Cc: Vegard Nossum
    Cc: Peter Zijlstra
    Cc: Vladimir Davydov
    Cc: Andy Lutomirski
    Cc: Quentin Casasnovas
    Cc: Kees Cook
    Cc: Willy Tarreau
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

30 Jan, 2016

1 commit

  • Move them to a separate header and have the following
    dependency:

    x86/cpufeatures.h
    Signed-off-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     

25 Jan, 2016

1 commit

  • Pull MIPS updates from Ralf Baechle:
    "This is the main pull request for MIPS for 4.5 plus some 4.4 fixes.

    The executive summary:

    - ATH79 platform improvements, use DT bindings for the ATH79 USB PHY.
    - Avoid useless rebuilds for zboot.
    - jz4780: Add NEMC, BCH and NAND device tree nodes
    - Initial support for the MicroChip's DT platform. As all the device
    drivers are missing this is still of limited use.
    - Some Loongson3 cleanups.
    - The unavoidable whitespace polishing.
    - Reduce clock skew when synchronizing the CPU cycle counters on CPU
    startup.
    - Add MIPS R6 fixes.
    - Lots of cleanups across arch/mips as fallout from KVM.
    - Lots of minor fixes and changes for IEEE 754-2008 support to the
    FPU emulator / fp-assist software.
    - Minor Ralink, BCM47xx and bcm963xx platform support improvements.
    - Support SMP on BCM63168"

    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (84 commits)
    MIPS: zboot: Add support for serial debug using the PROM
    MIPS: zboot: Avoid useless rebuilds
    MIPS: BMIPS: Enable ARCH_WANT_OPTIONAL_GPIOLIB
    MIPS: bcm63xx: nvram: Remove unused bcm63xx_nvram_get_psi_size() function
    MIPS: bcm963xx: Update bcm_tag field image_sequence
    MIPS: bcm963xx: Move extended flash address to bcm_tag header file
    MIPS: bcm963xx: Move Broadcom BCM963xx image tag data structure
    MIPS: bcm63xx: nvram: Use nvram structure definition from header file
    MIPS: bcm963xx: Add Broadcom BCM963xx board nvram data structure
    MAINTAINERS: Add KVM for MIPS entry
    MIPS: KVM: Add missing newline to kvm_err()
    MIPS: Move KVM specific opcodes into asm/inst.h
    MIPS: KVM: Use cacheops.h definitions
    MIPS: Break down cacheops.h definitions
    MIPS: Use EXCCODE_ constants with set_except_vector()
    MIPS: Update trap codes
    MIPS: Move Cause.ExcCode trap codes to mipsregs.h
    MIPS: KVM: Make kvm_mips_{init,exit}() static
    MIPS: KVM: Refactor added offsetof()s
    MIPS: KVM: Convert EXPORT_SYMBOL to _GPL
    ...

    Linus Torvalds
     

21 Jan, 2016

1 commit

  • Kmem accounting might incur overhead that some users can't put up with.
    Besides, the implementation is still considered unstable. So let's
    provide a way to disable it for those users who aren't happy with it.

    To disable kmem accounting for cgroup2, pass cgroup.memory=nokmem at
    boot time.

    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

20 Jan, 2016

1 commit

  • Add an `ieee754=' kernel parameter to control IEEE Std 754 conformance
    mode.

    Use separate flags copied from the respective CPU feature flags, and
    adjusted according to the conformance mode selected, to make binaries
    requesting individual NaN encoding modes accepted or rejected as needed.
    Update the initial setting for FCSR and, in the full FPU emulation mode,
    its read-only mask accordingly. Accept the mode selection requested for
    legacy processors as well.

    As with the EF_MIPS_NAN2008 ELF file header flag adjust both ABS2008 and
    NAN2008 bits at the same time, to match the choice made for hardware
    currently implemented.

    Signed-off-by: Maciej W. Rozycki
    Cc: Andrew Morton
    Cc: Matthew Fortune
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/11481/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     

18 Jan, 2016

1 commit

  • Pull documentation updates from Jon Corbet:
    "A relatively boring cycle in the docs tree. There's a few kernel-doc
    fixes and various document tweaks.

    One patch reaches out of the documentation subtree to fix a comment in
    init/do_mounts_rd.c. There didn't seem to be anybody more appropriate
    to take that one, so I accepted it"

    * tag 'docs-4.5' of git://git.lwn.net/linux: (29 commits)
    thermal: add description for integral_cutoff unit
    Documentation: update libhugetlbfs site url
    Documentation: Explain pci=conf1,conf2 more verbosely
    DMA-API: fix confusing sentence in Documentation/DMA-API.txt
    Documentation: translations: update linux cross reference link
    Documentation: fix typo in CodingStyle
    init, Documentation: Remove ramdisk_blocksize mentions
    Documentation-getdelays: Apply a recommendation from "checkpatch.pl" in main()
    Documentation: HOWTO: update versions from 3.x to 4.x
    Documentation: remove outdated references from translations
    Doc: treewide: Fix grammar "a" to "an"
    Documentation: cpu-hotplug: Fix sysfs mount instructions
    can-doc: Add hint about getting timestamps
    Fix CFQ I/O scheduler parameter name in documentation
    Documentation: arm: remove dead links from Marvell Berlin docs
    Documentation: HOWTO: update code cross reference link
    Doc: Docbook/iio: Fix typo in iio.tmpl
    DocBook: make index.html generation less verbose by default
    DocBook: Cleanup: remove an unused $(call) line
    DocBook: Add a help message for DOCBOOKS env var
    ...

    Linus Torvalds
     

16 Jan, 2016

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "Core:
    - Ground work for the new Power9 MMU from Aneesh Kumar K.V
    - Optimise FP/VMX/VSX context switching from Anton Blanchard

    Misc:
    - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
    Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
    Andrew Donnellan
    - Allow wrapper to work on non-English system from Laurent Vivier
    - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
    - Fix module autoload for rackmeter & axonram drivers from Luis de
    Bethencourt
    - Include KVM guest test in all interrupt vectors from Paul Mackerras
    - Fix DSCR inheritance over fork() from Anton Blanchard
    - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
    fully ordered from Boqun Feng
    - Print MSR TM bits in oops messages from Michael Neuling
    - Add TM signal return & invalid stack selftests from Michael Neuling
    - Limit EPOW reset event warnings from Vipin K Parashar
    - Remove the Cell QPACE code from Rashmica Gupta
    - Append linux_banner to exception information in xmon from Rashmica
    Gupta
    - Add selftest to check if VSRs are corrupted from Rashmica Gupta
    - Remove broken GregorianDay() from Daniel Axtens
    - Import Anton's context_switch2 benchmark into selftests from
    Michael Ellerman
    - Add selftest script to test HMI functionality from Daniel Axtens
    - Remove obsolete OPAL v2 support from Stewart Smith
    - Make enter_rtas() private from Michael Ellerman
    - PPR exception cleanups from Michael Ellerman
    - Add page soft dirty tracking from Laurent Dufour
    - Add support for Nvlink NPUs from Alistair Popple
    - Add support for kexec on 476fpe from Alistair Popple
    - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
    - Copy only required pieces of the mm_context_t to the paca from
    Michael Neuling
    - Add a kmsg_dumper that flushes OPAL console output on panic from
    Russell Currey
    - Implement save_stack_trace_regs() to enable kprobe stack tracing
    from Steven Rostedt
    - Add HWCAP bits for Power9 from Michael Ellerman
    - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
    - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
    - scripts/recordmcount.pl: support data in text section on powerpc
    from Ulrich Weigand
    - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

    cxl:
    - cxl: Fix possible idr warning when contexts are released from
    Vaibhav Jain
    - cxl: use correct operator when writing pcie config space values
    from Andrew Donnellan
    - cxl: Fix DSI misses when the context owning task exits from Vaibhav
    Jain
    - cxl: fix build for GCC 4.6.x from Brian Norris
    - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
    - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
    Krishnan

    Freescale:
    - Freescale updates from Scott: Highlights include moving QE code out
    of arch/powerpc (to be shared with arm), device tree updates, and
    minor fixes"

    * tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
    powerpc/module: Handle R_PPC64_ENTRY relocations
    scripts/recordmcount.pl: support data in text section on powerpc
    powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
    powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
    powerpc/mm: Fix _PAGE_PTE breaking swapoff
    cxl: Enable PCI device ID for future IBM CXL adapter
    cxl: use -Werror only with CONFIG_PPC_WERROR
    cxl: fix build for GCC 4.6.x
    powerpc: Add HWCAP bits for Power9
    powerpc/powernv: Reserve PE#0 on NPU
    powerpc/powernv: Change NPU PE# assignment
    powerpc/powernv: Fix update of NVLink DMA mask
    powerpc/powernv: Remove misleading comment in pci.c
    powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
    powerpc: Fix build break due to paca mm_context_t changes
    cxl: Fix DSI misses when the context owning task exits
    MAINTAINERS: Update Scott Wood's e-mail address
    powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
    powerpc: Fix style of self-test config prompts
    powerpc/powernv: Only delay opal_rtc_read() retry when necessary
    ...

    Linus Torvalds
     

15 Jan, 2016

2 commits

  • Socket memory can be a significant share of overall memory consumed by
    common workloads. In order to provide reasonable resource isolation in
    the unified hierarchy, this type of memory needs to be included in the
    tracking/accounting of a cgroup under active memory resource control.

    Overhead is only incurred when a non-root control group is created AND
    the memory controller is instructed to track and account the memory
    footprint of that group. cgroup.memory=nosocket can be specified on the
    boot commandline to override any runtime configuration and forcibly
    exclude socket memory from active memory resource control.
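
    A minimal sketch of how such an option string could be inspected
    from user space follows. It assumes that cgroup.memory= takes a
    comma-separated list of tokens (nosocket here, nokmem in the
    related kmem commit) and that splitting the command line on
    whitespace is good enough; quoting is ignored. The function name is
    hypothetical.

```python
def cgroup_memory_options(cmdline):
    """Collect the comma-separated tokens of every cgroup.memory= option."""
    options = set()
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        if sep and key == "cgroup.memory":
            # Empty tokens (e.g. from a trailing comma) are dropped.
            options.update(tok for tok in value.split(",") if tok)
    return options
```

    Fed with the contents of /proc/cmdline, the test
    "nosocket" in cgroup_memory_options(text) tells whether socket
    memory accounting was disabled at boot.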

    Signed-off-by: Johannes Weiner
    Acked-by: David S. Miller
    Reviewed-by: Vladimir Davydov
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • People complained that setting the PCI config space access mechanism
    through "pci=conf1" or "pci=conf2" on the command line is not really
    documented. Yeah, can you blame them? Look at what we have now.

    So try to improve the situation a bit by explaining what those "conf1"
    and "conf2" things actually mean.

    See http://wiki.osdev.org/PCI for more info.
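
    For orientation, configuration mechanism #1 ("conf1") works by
    writing a 32-bit address to I/O port 0xCF8 and then accessing the
    data through port 0xCFC. The sketch below only builds that address
    value; it performs no port I/O, and the function name is made up
    for this example.

```python
CONFIG_ADDRESS = 0xCF8  # I/O port the address is written to
CONFIG_DATA = 0xCFC     # I/O port the data is then read from or written to


def conf1_address(bus, device, function, register):
    """Build the 32-bit CONFIG_ADDRESS value for configuration mechanism #1."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8
            and 0 <= register < 256):
        raise ValueError("field out of range")
    return (0x80000000            # enable bit
            | (bus << 16)         # bits 23..16: bus number
            | (device << 11)      # bits 15..11: device number
            | (function << 8)     # bits 10..8:  function number
            | (register & 0xFC))  # bits 7..2:   dword-aligned register
```

    For instance, hex(conf1_address(1, 2, 3, 0x10)) is "0x80011310".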

    Suggested-by: Eric Morton
    Signed-off-by: Borislav Petkov
    [jc: Added the above URL to the document too]
    Signed-off-by: Jonathan Corbet

    Borislav Petkov
     

14 Jan, 2016

1 commit

  • Pull tty/serial updates from Greg KH:
    "Here is the big serial/tty driver update for 4.5-rc1.

    Lots of driver updates and some tty core changes. All of these have
    been in linux-next and the details are in the shortlog"

    * tag 'tty-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (127 commits)
    drivers/tty/serial: delete unused MODULE_DEVICE_TABLE from atmel_serial.c
    serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock
    serial: 8250: of: Fix the driver and actually compile the 8250_of
    tty: amba-pl011: use iotype instead of access_32b to track 32-bit I/O
    tty: amba-pl011: fix earlycon register offsets
    serial: sh-sci: Drop the sci_fck clock fallback
    sh: sh7734: Correct SCIF type for BRG
    sh: Remove sci_ick clock alias
    sh: Rename sci_ick and sci_fck clock to fck
    serial: sh-sci: Add support for optional BRG on (H)SCIF
    serial: sh-sci: Add support for optional external (H)SCK input
    serial: sh-sci: Prepare for multiple sampling clock sources
    serial: sh-sci: Correct SCIF type on R-Car for BRG
    serial: sh-sci: Correct SCIF type on RZ/A1H
    serial: sh-sci: Replace struct sci_port_info by type/regtype encoding
    serial: sh-sci: Add BRG register definitions
    serial: sh-sci: Take into account sampling rate for max baud rate
    serial: sh-sci: Merge sci_scbrr_calc() and sci_baud_calc_hscif()
    serial: sh-sci: Avoid calculating the receive margin for HSCIF
    serial: sh-sci: Improve bit rate error calculation for HSCIF
    ...

    Linus Torvalds