16 Apr, 2015

1 commit

  • Pull networking updates from David Miller:

    1) Add BQL support to via-rhine, from Tino Reichardt.

    2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers
    can support hw switch offloading. From Florian Fainelli.

    3) Allow 'ip address' commands to initiate multicast group join/leave,
    from Madhu Challa.

    4) Many ipv4 FIB lookup optimizations from Alexander Duyck.

    5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel
    Borkmann.

    6) Remove the ugly compat support in ARP for ugly layers like ax25,
    rose, etc. And use this to clean up the neigh layer, then use it to
    implement MPLS support. All from Eric Biederman.

    7) Support L3 forwarding offloading in switches, from Scott Feldman.

    8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed
    up route lookups even further. From Alexander Duyck.

    9) Many improvements and bug fixes to the rhashtable implementation,
    from Herbert Xu and Thomas Graf. In particular, in the case where
    an rhashtable user bulk adds a large number of items into an empty
    table, we expand the table much more sanely.

    10) Don't make the tcp_metrics hash table per-namespace, from Eric
    Biederman.

    11) Extend EBPF to access SKB fields, from Alexei Starovoitov.

    12) Split out new connection request sockets so that they can be
    established in the main hash table. Much less false sharing since
    hash lookups go direct to the request sockets instead of having to
    go first to the listener then to the request socks hashed
    underneath. From Eric Dumazet.

    13) Add async I/O support for crypto AF_ALG sockets, from Tadeusz Struk.

    14) Support stable privacy address generation for RFC7217 in IPV6. From
    Hannes Frederic Sowa.

    15) Hash network namespace into IP frag IDs, also from Hannes Frederic
    Sowa.

    16) Convert PTP get/set methods to use 64-bit time, from Richard
    Cochran.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits)
    fm10k: Bump driver version to 0.15.2
    fm10k: corrected VF multicast update
    fm10k: mbx_update_max_size does not drop all oversized messages
    fm10k: reset head instead of calling update_max_size
    fm10k: renamed mbx_tx_dropped to mbx_tx_oversized
    fm10k: update xcast mode before synchronizing multicast addresses
    fm10k: start service timer on probe
    fm10k: fix function header comment
    fm10k: comment next_vf_mbx flow
    fm10k: don't handle mailbox events in iov_event path and always process mailbox
    fm10k: use separate workqueue for fm10k driver
    fm10k: Set PF queues to unlimited bandwidth during virtualization
    fm10k: expose tx_timeout_count as an ethtool stat
    fm10k: only increment tx_timeout_count in Tx hang path
    fm10k: remove extraneous "Reset interface" message
    fm10k: separate PF only stats so that VF does not display them
    fm10k: use hw->mac.max_queues for stats
    fm10k: only show actual queues, not the maximum in hardware
    fm10k: allow creation of VLAN on default vid
    fm10k: fix unused warnings
    ...

    Linus Torvalds
     

15 Apr, 2015

39 commits

  • Pull ARM updates from Russell King:
    "Included in this update are both some long term fixes and some new
    features.

    Fixes:

    - An integer overflow in the calculation of ELF_ET_DYN_BASE.

    - Avoiding OOMs for high-order IOMMU allocations

    - SMP requires the data cache to be enabled for synchronisation
    primitives to work, so prevent the CPU_DCACHE_DISABLE option being
    visible on SMP builds.

    - A bug going back 10+ years in the noMMU ARM94* CPU support code,
    where it corrupts registers. Found by folk getting Linux running
    on their cameras.

    - Versatile Express needs an errata workaround enabled for CPU
    hot-unplug to work.

    Features:

    - Clean up module linker by handling out of range relocations
    separately from relocation cases we don't handle.

    - Fix a long term bug in the pci_mmap_page_range() code, which we
    hope won't impact userspace (we hope there's no users of the
    existing broken interface.)

    - Don't map DMA coherent allocations when we don't have a MMU.

    - Drop experimental status for SMP_ON_UP.

    - Warn when DT doesn't specify ePAPR mandatory cache properties.

    - Add documentation concerning how we find the start of physical
    memory for AUTO_ZRELADDR kernels, detailing why we have chosen the
    mask and the implications of changing it.

    - Updates from Ard Biesheuvel to address some issues with large
    kernels (such as allyesconfig) failing to link.

    - Allow hibernation to work on modern (ARMv7) CPUs - this appears to
    have never worked in the past on these CPUs.

    - Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output
    format (hopefully without userspace breaking... let's hope that if
    it causes someone a problem, they tell us.)

    - Fix tegra-ahb DT offsets.

    - Rework ARM errata 643719 code (and the ARMv7 flush_cache_louis()/
    flush_dcache_all() code) to be more efficient, and enable this
    errata workaround by default for ARMv7+SMP CPUs. This complements
    the Versatile Express fix above.

    - Rework ARMv7 context code for errata 430973, so that only Cortex A8
    CPUs are impacted by the branch target buffer flush when this
    errata is enabled. Also update the help text to indicate that all
    r1p* A8 CPUs are impacted.

    - Switch ARM to the generic show_mem() implementation, it conveys all
    the information which we were already reporting.

    - Prevent slow timer sources being used for udelay() - timers running
    at less than 1MHz are not useful for this, and can cause udelay()
    to return immediately, without any wait. Using such a slow timer
    is silly.

    - VDSO support for 32-bit ARM, mainly for gettimeofday() using the
    ARM architected timer.

    - Perf support for Scorpion performance monitoring units"

    vdso semantic conflict fixed up as per linux-next.
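
    The udelay() item above can be illustrated with a small sketch. It is
    not the ARM delay code: delay_ticks() and timer_usable_for_udelay()
    are hypothetical names, and the plain hz * usecs / 1000000 conversion
    is an assumption chosen to show why a timer slower than 1MHz gives a
    zero-tick wait for short delays.

```c
#include <stdint.h>

/* Hypothetical helper: number of timer ticks a timer-based delay loop
 * would wait for `usecs` microseconds on a timer running at `timer_hz`.
 * Integer division truncates, so short delays on slow timers become 0. */
uint64_t delay_ticks(uint64_t timer_hz, uint64_t usecs)
{
    return (timer_hz * usecs) / 1000000ULL;
}

/* A timer can resolve a 1 us delay only if it ticks at least once per
 * microsecond, i.e. runs at 1 MHz or faster. */
int timer_usable_for_udelay(uint64_t timer_hz)
{
    return timer_hz >= 1000000ULL;
}
```

    With a 32768 Hz timer a 1 us request works out to zero ticks, so a
    loop waiting on it would return at once; a 24 MHz timer gives 24
    ticks for the same request.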

    * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits)
    ARM: update errata 430973 documentation to cover Cortex A8 r1p*
    ARM: ensure delay timer has sufficient accuracy for delays
    ARM: switch to use the generic show_mem() implementation
    ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs
    ARM: enable ARM errata 643719 workaround by default
    ARM: cache-v7: optimise test for Cortex A9 r0pX devices
    ARM: cache-v7: optimise branches in v7_flush_cache_louis
    ARM: cache-v7: consolidate initialisation of cache level index
    ARM: cache-v7: shift CLIDR to extract appropriate field before masking
    ARM: cache-v7: use movw/movt instructions
    ARM: allow 16-bit instructions in ALT_UP()
    ARM: proc-arm94*.S: fix setup function
    ARM: vexpress: fix CPU hotplug with CT9x4 tile.
    ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP
    ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address
    ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address
    ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros
    ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL
    ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility
    ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations
    ...

    Linus Torvalds
     
  • Pull s390 updates from Martin Schwidefsky:
    "The major change in this merge is the removal of the support for
    31-bit kernels. Naturally 31-bit user space will continue to work via
    the compat layer.

    And then some cleanup, some improvements and bug fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits)
    s390/smp: wait until secondaries are active & online
    s390/hibernate: fix save and restore of kernel text section
    s390/cacheinfo: add missing facility check
    s390/syscalls: simplify syscall_get_arch()
    s390/irq: enforce correct irqclass_sub_desc array size
    s390: remove "64" suffix from mem64.S and swsusp_asm64.S
    s390/ipl: cleanup macro usage
    s390/ipl: cleanup shutdown_action attributes
    s390/ipl: cleanup bin attr usage
    s390/uprobes: fix address space annotation
    s390: add missing arch_release_task_struct() declaration
    s390: make couple of functions and variables static
    s390/maccess: improve s390_kernel_write()
    s390/maccess: remove potentially broken probe_kernel_write()
    s390/watchdog: support for KVM hypervisors and delete pr_info messages
    s390/watchdog: enable KEEPALIVE for /dev/watchdog
    s390/dasd: remove setting of scheduler from driver
    s390/traps: panic() instead of die() on translation exception
    s390: remove test_facility(2) (== z/Architecture mode active) checks
    s390/cmpxchg: simplify cmpxchg_double
    ...

    Linus Torvalds
     
  • Pull power management and ACPI updates from Rafael Wysocki:
    "These are mostly fixes and cleanups all over, although there are a few
    items that sort of fall into the new feature category.

    First off, we have new callbacks for PM domains that should help us to
    handle some issues related to device initialization in a better way.

    There also is some consolidation in the unified device properties API
    area allowing us to use that interface for accessing data coming from
    platform initialization code in addition to firmware-provided data.

    We have some new device/CPU IDs in a few drivers, support for new
    chips and a new cpufreq driver too.

    Specifics:

    - Generic PM domains support update including new PM domain callbacks
    to handle device initialization better (Russell King, Rafael J
    Wysocki, Kevin Hilman)

    - Unified device properties API update including a new mechanism for
    accessing data provided by platform initialization code (Rafael J
    Wysocki, Adrian Hunter)

    - ARM cpuidle update including ARM32/ARM64 handling consolidation
    (Daniel Lezcano)

    - intel_idle update including support for the Silvermont Core in the
    Baytrail SOC and for the Airmont Core in the Cherrytrail and
    Braswell SOCs (Len Brown, Mathias Krause)

    - New cpufreq driver for Hisilicon ACPU (Leo Yan)

    - intel_pstate update including support for the Knights Landing chip
    (Dasaratharaman Chandramouli, Kristen Carlson Accardi)

    - QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann)

    - powernv cpufreq driver update (Shilpasri G Bhat)

    - devfreq update including Tegra support changes (Tomeu Vizoso,
    MyungJoo Ham, Chanwoo Choi)

    - powercap RAPL (Running-Average Power Limit) driver update including
    support for Intel Broadwell server chips (Jacob Pan, Mathias Krause)

    - ACPI device enumeration update related to the handling of the
    special PRP0001 device ID allowing DT-style 'compatible' property
    to be used for ACPI device identification (Rafael J Wysocki)

    - ACPI EC driver update including limited _DEP support (Lan Tianyu,
    Lv Zheng)

    - ACPI backlight driver update including a new mechanism to allow
    native backlight handling to be forced on non-Windows 8 systems and
    a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede)

    - New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu)

    - Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
    Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki)

    - Fixes related to suspend-to-idle for the iTCO watchdog driver and
    the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu)

    - PM tracing support for the suspend phase of system suspend/resume
    transitions (Zhonghui Fu)

    - Configurable delay for the system suspend/resume testing facility
    (Brian Norris)

    - PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)"

    * tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
    ACPI / scan: Fix NULL pointer dereference in acpi_companion_match()
    ACPI / scan: Rework modalias creation when "compatible" is present
    intel_idle: mark cpu id array as __initconst
    powercap / RAPL: mark rapl_ids array as __initconst
    powercap / RAPL: add ID for Broadwell server
    intel_pstate: Knights Landing support
    intel_pstate: remove MSR test
    cpufreq: fix qoriq uniprocessor build
    ACPI / scan: Take the PRP0001 position in the list of IDs into account
    ACPI / scan: Simplify acpi_match_device()
    ACPI / scan: Generalize of_compatible matching
    device property: Introduce firmware node type for platform data
    device property: Make it possible to use secondary firmware nodes
    PM / watchdog: iTCO: stop watchdog during system suspend
    cpufreq: hisilicon: add acpu driver
    ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
    cpufreq: powernv: Report cpu frequency throttling
    intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs
    intel_idle: Update support for Silvermont Core in Baytrail SOC
    PM / devfreq: tegra: Register governor on module init
    ...

    Linus Torvalds
     
  • Jeff Kirsher says:

    ====================
    Intel Wired LAN Driver Updates 2015-04-14

    This series contains updates to fm10k only.

    Fixed transmit statistics, which were actually using values from the
    receive ring instead of the transmit ring. Fixed up spelling mistakes
    in code comments and resolved unused argument warnings. Added support
    for netconsole. Fixed up statistic reporting so that we only report
    from actual queues and display PF-only stats for just the PF and not
    the VF. Also fixed an issue where, when returning virtualization
    queues from the VF back to the PF, we were retaining the VF rate
    limiter.

    Fixed up the driver to use a separate workqueue, which helps reduce
    and stabilize latency between scheduling the work in our interrupt and
    actually performing the work.

    Fixed a bug where the VF tried to set a multicast address before
    requesting the required xcast mode.

    Fix VF multicast update since VFs were being improperly added to the
    switch's multicast group. The error stems from the fact that incorrect
    arguments were passed to the update_mc_addr().

    Thanks to Alex Duyck for the extensive review.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull input subsystem updates from Dmitry Torokhov:
    "You will get the following new drivers:

    - Qualcomm PM8941 power key driver
    - ChipOne icn8318 touchscreen controller driver
    - Broadcom iProc touchscreen and keypad drivers
    - Semtech SX8654 I2C touchscreen controller driver

    ALPS driver now supports newer SS4 devices; Elantech got a fix that
    should make it work on some ASUS laptops; and a slew of other
    enhancements and random fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (51 commits)
    Input: alps - non interleaved V2 dualpoint has separate stick button bits
    Input: alps - fix touchpad buttons getting stuck when used with trackpoint
    Input: atkbd - document "no new force-release quirks" policy
    Input: ALPS - make alps_get_pkt_id_ss4_v2() and others static
    Input: ALPS - V7 devices can report 5-finger taps
    Input: ALPS - add support for SS4 touchpad devices
    Input: ALPS - refactor alps_set_abs_params_mt()
    Input: elantech - fix absolute mode setting on some ASUS laptops
    Input: atmel_mxt_ts - split out touchpad initialisation logic
    Input: atmel_mxt_ts - implement support for T100 touch object
    Input: cros_ec_keyb - fix clearing keyboard state on wakeup
    Input: gscps2 - drop pci_ids dependency
    Input: synaptics - allocate 3 slots to keep stability in image sensors
    Input: Revert "Revert "synaptics - use dmax in input_mt_assign_slots""
    Input: MT - make slot assignment work for overcovered solutions
    mfd: tc3589x: enforce device-tree only mode
    Input: tc3589x - localize platform data
    Input: tsc2007 - Convert msecs to jiffies only once
    Input: edt-ft5x06 - remove EV_SYN event report
    Input: edt-ft5x06 - allow to setting the maximum axes value through the DT
    ...

    Linus Torvalds
     
  • Pull i2c updates from Wolfram Sang:
    "Most notable:

    - introducing the i2c_quirk infrastructure. Now, flaws of I2C
    controllers can be described and the core will check if the flaws
    collide with the messages to be sent

    - wait_for_completion return type cleanup series

    - new drivers for Digicolor, Netlogic XLP, Ingenic JZ4780

    - updates to the I2C slave framework which include API changes. Its
    only user was updated, too. Documentation was finally added

    - changed dynamic bus numbering for the DT case. This could change
    bus numbers for users. However, it fixes a collision where dynamic
    and static busses request the same id.

    - driver bugfixes, cleanups"
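
    The dynamic bus numbering change can be sketched as follows. This is
    only an illustration, not the i2c core code: first_dynamic_bus_id()
    is a hypothetical helper, and the rule "start one past the highest
    fixed id" is an assumption drawn from the commit subject "i2c: busses
    with dynamic ids should start after fixed ids for DT".

```c
#include <stddef.h>

/* Hypothetical helper: given the bus ids fixed through DT aliases,
 * return the first id that can be handed out dynamically without
 * colliding with any of them. */
int first_dynamic_bus_id(const int *fixed_ids, size_t count)
{
    int highest = -1;   /* no fixed ids: dynamic numbering starts at 0 */

    for (size_t i = 0; i < count; i++) {
        if (fixed_ids[i] > highest)
            highest = fixed_ids[i];
    }
    return highest + 1;
}
```

    With fixed ids 0, 3 and 1, the first dynamic bus would get id 4, so a
    later static request for id 2 or 3 cannot find it already taken.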

    * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (52 commits)
    i2c: xlp9xx: Driver for Netlogic XLP9XX/5XX I2C controller
    of: Add vendor prefix 'netlogic'
    i2c: davinci: use ICPFUNC to toggle I2C as gpio for bus recovery
    i2c: davinci: use bus recovery infrastructure
    i2c: change input parameter to i2c_adapter for prepare/unprepare_recovery
    i2c: i2c-mux-gpio: remove error messages for probe deferrals
    i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780
    i2c: dln2: set the device tree node of the adapter
    i2c: davinci: fixup wait_for_completion_timeout handling
    i2c: mpc: Fix ISR return value
    i2c: slave-eeprom: add more info when to increase the pointer
    i2c: slave: add documentation for i2c-slave-eeprom
    Documentation: i2c: describe the new slave mode
    i2c: slave: rework the slave API
    i2c: add support for the Digicolor I2C controller
    i2c: busses with dynamic ids should start after fixed ids for DT
    of: base: add function to get highest id of an alias stem
    i2c: designware: Suppress error message if platform_get_irq() < 0
    i2c: mpc: assign the correct prescaler from SVR
    i2c: img-scb: fixup of wait_for_completion_timeout return handling
    ...

    Linus Torvalds
     
  • Pull VFIO updates from Alex Williamson:

    - VFIO platform bus driver support (Baptiste Reynal, Antonios Motakis,
    testing and review by Eric Auger)

    - Split VFIO irqfd support to separate module (Alex Williamson)

    - vfio-pci VGA arbiter client (Alex Williamson)

    - New vfio-pci.ids= module option (Alex Williamson)

    - vfio-pci D3 power state support for idle devices (Alex Williamson)

    * tag 'vfio-v4.1-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
    vfio-pci: Fix use after free
    vfio-pci: Move idle devices to D3hot power state
    vfio-pci: Remove warning if try-reset fails
    vfio-pci: Allow PCI IDs to be specified as module options
    vfio-pci: Add VGA arbiter client
    vfio-pci: Add module option to disable VGA region access
    vgaarb: Stub vga_set_legacy_decoding()
    vfio: Split virqfd into a separate module for vfio bus drivers
    vfio: virqfd_lock can be static
    vfio: put off the allocation of "minor" in vfio_create_group
    vfio/platform: implement IRQ masking/unmasking via an eventfd
    vfio: initialize the virqfd workqueue in VFIO generic code
    vfio: move eventfd support code for VFIO_PCI to a separate file
    vfio: pass an opaque pointer on virqfd initialization
    vfio: add local lock for virqfd instead of depending on VFIO PCI
    vfio: virqfd: rename vfio_pci_virqfd_init and vfio_pci_virqfd_exit
    vfio: add a vfio_ prefix to virqfd_enable and virqfd_disable and export
    vfio/platform: support for level sensitive interrupts
    vfio/platform: trigger an interrupt via eventfd
    vfio/platform: initial interrupts support code
    ...

    Linus Torvalds
     
  • Pull pincontrol updates from Linus Walleij:
    "This is the bulk of pin control changes for the v4.1 development
    cycle. Nothing really exciting this time: we basically added a few
    new drivers and subdrivers and stabilized them in linux-next. Some
    cleanups too. With sunrisepoint Intel has a real fine fully featured
    pin control driver for contemporary hardware, and the AMD driver is
    also for large deployments. Most of the others are ARM devices.

    New drivers:
    - Intel Sunrisepoint
    - AMD KERNCZ GPIO
    - Broadcom Cygnus IOMUX

    New subdrivers:
    - Marvell MVEBU Armada 39x SoCs
    - Samsung Exynos 5433
    - nVidia Tegra 210
    - Mediatek MT8135
    - Mediatek MT8173
    - AMLogic Meson8b
    - Qualcomm PM8916

    On top of this cleanups and development history for the above drivers
    as issues were fixed after merging"

    * tag 'pinctrl-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (71 commits)
    pinctrl: sirf: move sgpio lock into state container
    pinctrl: Add support for PM8916 GPIO's and MPP's
    pinctrl: bcm2835: Fix support for threaded level triggered IRQs
    sh-pfc: r8a7790: add EtherAVB pin groups
    pinctrl: Document "function" + "pins" pinmux binding
    pinctrl: intel: Add Intel Sunrisepoint pin controller and GPIO support
    pinctrl: fsl: imx: Check for 0 config register
    pinctrl: Add support for Meson8b
    documentation: Extend pinctrl docs for Meson8b
    pinctrl: Cleanup Meson8 driver
    Fix inconsistent spinlock of AMD GPIO driver which can be recognized by static analysis tool smatch. Declare constant Variables with Sparse's suggestion.
    pinctrl: at91: convert __raw to endian agnostic IO
    pinctrl: constify of_device_id array
    pinctrl: pinconf-generic: add dt node names to error messages
    pinctrl: pinconf-generic: scan also referenced phandle node
    pinctrl: mvebu: add suspend/resume support to Armada XP pinctrl driver
    pinctrl: st: Display pin's function when printing pinctrl debug information
    pinctrl: st: Show correct pin direction also in GPIO mode
    pinctrl: st: Supply a GPIO get_direction() call-back
    pinctrl: st: Move st_get_pio_control() further up the source file
    ...

    Linus Torvalds
     
  • Pull backlight updates from Lee Jones:
    "Changes to existing drivers:

    - Use of_get_child_by_name() instead of refcount; 88pm860x_bl

    - Terminate array with NULL element; da9052_bl"

    * tag 'backlight-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
    backlight: da9052_bl: Terminate da9052_wled_ids array with empty element
    backlight: 88pm860x_bl: Use of_get_child_by_name() instead of refcount hack

    Linus Torvalds
     
  • Pull MFD updates from Lee Jones:
    "Changes to existing drivers:

    - Rename child driver [axp288_battery => axp288_fuel_gauge]; axp20x
    - Rename child driver [max77693-flash => max77693-led]; max77693
    - Error handling fixes; intel_soc_pmic
    - GPIO tweaking; intel_soc_pmic
    - Remove non-DT code; vexpress-sysreg, tc3589x
    - Remove unused/legacy code; ti_am335x_tscadc, rts5249, rtsx_gops, rtsx_pcr,
    rtc-s5m, sec-core, max77693, menelaus,
    wm5102-tables
    - Trivial fixups; rtsx_pci, da9150-core, sec-core, max7769, max77693,
    mc13xxx-core, dln2, hi6421-pmic-core, rk808, twl4030-power,
    lpc_ich, menelaus, twl6040
    - Update register/address values; rts5227, rts5249
    - DT and/or binding document fixups; arizona, da9150, mt6397, axp20x,
    qcom-rpm, qcom-spmi-pmic
    - Couple of trivial core Kconfig fixups
    - Remove use of seq_printf return value; ab8500-debugfs
    - Remove __exit markups; menelaus, tps65010
    - Fix platform-device name collisions; mfd-core

    New drivers/supported devices:

    - Add support for wm8280/wm8281 into arizona
    - Add support for COMe-cBL6 into kempld-core
    - Add support for rts524a and rts525a into rts5249
    - Add support for ipq8064 into qcom_rpm
    - Add support for extcon into axp20x
    - New MediaTek MT6397 PMIC driver
    - New Maxim MAX77843 PMIC driver
    - New Intel Quark X1000 I2C-GPIO driver
    - New Skyworks SKY81452 driver"

    * tag 'mfd-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (76 commits)
    mfd: sec: Fix RTC alarm interrupt number on S2MPS11
    mfd: wm5102: Remove registers for output 3R from readable list
    mfd: tps65010: Remove incorrect __exit markups
    mfd: devicetree: bindings: Add Qualcomm RPM regulator subnodes
    mfd: axp20x: Add support for extcon cell
    mfd: lpc_ich: Sort IDs
    mfd: twl6040: Remove wrong and unneeded "platform:twl6040" modalias
    mfd: qcom-spmi-pmic: Add specific compatible strings for Qualcomm's SPMI PMIC's
    mfd: axp20x: Fix duplicate const for model names
    mfd: menelaus: Use macro for magic number
    mfd: menelaus: Drop support for SW controller VCORE
    mfd: menelaus: Delete omap_has_menelaus
    mfd: arizona: Correct type of gpio_defaults
    mfd: lpc_ich: Sort IDs
    mfd: Fix a typo in Kconfig
    mfd: qcom_rpm: Add support for IPQ8064
    mfd: devicetree: qcom_rpm: Document IPQ8064 resources
    mfd: core: Fix platform-device name collisions
    mfd: intel_quark_i2c_gpio: Don't crash if !DMI
    dt-bindings: Add vendor-prefix for X-Powers
    ...

    Linus Torvalds
     
  • Merge first patchbomb from Andrew Morton:

    - arch/sh updates

    - ocfs2 updates

    - kernel/watchdog feature

    - about half of mm/

    * emailed patches from Andrew Morton : (122 commits)
    Documentation: update arch list in the 'memtest' entry
    Kconfig: memtest: update number of test patterns up to 17
    arm: add support for memtest
    arm64: add support for memtest
    memtest: use phys_addr_t for physical addresses
    mm: move memtest under mm
    mm, hugetlb: abort __get_user_pages if current has been oom killed
    mm, mempool: do not allow atomic resizing
    memcg: print cgroup information when system panics due to panic_on_oom
    mm: numa: remove migrate_ratelimited
    mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
    mm: split ET_DYN ASLR from mmap ASLR
    s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
    mm: expose arch_mmap_rnd when available
    s390: standardize mmap_rnd() usage
    powerpc: standardize mmap_rnd() usage
    mips: extract logic for mmap_rnd()
    arm64: standardize mmap_rnd() usage
    x86: standardize mmap_rnd() usage
    arm: factor out mmap ASLR into mmap_rnd
    ...

    Linus Torvalds
     
  • Since arm64/arm support the memtest command line option, update the
    "memtest" entry.

    Signed-off-by: Vladimir Murzin
    Cc: "H. Peter Anvin"
    Cc: Catalin Marinas
    Cc: Ingo Molnar
    Cc: Mark Rutland
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Murzin
     
  • Additional test patterns for memtest were introduced since commit
    63823126c221 ("x86: memtest: add additional (regular) test patterns"),
    but it looks like Kconfig was not updated at that time.

    Update Kconfig entry with the actual number of maximum test patterns.

    Signed-off-by: Vladimir Murzin
    Cc: "H. Peter Anvin"
    Cc: Catalin Marinas
    Cc: Ingo Molnar
    Cc: Mark Rutland
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Murzin
     
  • Add support for memtest command line option.

    Signed-off-by: Vladimir Murzin
    Acked-by: Will Deacon
    Cc: "H. Peter Anvin"
    Cc: Catalin Marinas
    Cc: Ingo Molnar
    Cc: Mark Rutland
    Cc: Russell King
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Murzin
     
  • Add support for memtest command line option.

    Signed-off-by: Vladimir Murzin
    Acked-by: Will Deacon
    Tested-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: Russell King
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Murzin
     
  • Since memtest might be used by other architectures, pass input parameters
    as phys_addr_t instead of long to prevent overflow.
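
    A minimal sketch of the problem, assuming a 32-bit kernel whose
    physical addresses are wider than long. The typedefs below are
    stand-ins invented for the illustration (the kernel's own types are
    not used), and through_narrow()/through_wide() are hypothetical names.

```c
#include <stdint.h>

/* Stand-ins (assumptions): a 32-bit wide `long` and a 64-bit wide
 * physical address type, as on a 32-bit system with large physical
 * addressing. */
typedef uint32_t narrow_ulong_t;
typedef uint64_t wide_phys_addr_t;

/* Passing the address through the narrow type drops its upper 32 bits. */
wide_phys_addr_t through_narrow(wide_phys_addr_t addr)
{
    narrow_ulong_t truncated = (narrow_ulong_t)addr;

    return truncated;
}

/* Passing it through the wide type keeps the full value. */
wide_phys_addr_t through_wide(wide_phys_addr_t addr)
{
    return addr;
}
```

    An address such as 0x880000000 comes back as 0x80000000 from the
    narrow path, so a test range above 4GB would silently point at the
    wrong memory.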

    Signed-off-by: Vladimir Murzin
    Acked-by: Will Deacon
    Tested-by: Mark Rutland
    Cc: "H. Peter Anvin"
    Cc: Catalin Marinas
    Cc: Ingo Molnar
    Cc: Russell King
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Murzin
     
  • Memtest is a simple feature which fills the memory with a given set of
    patterns and validates memory contents; if bad memory regions are detected
    it reserves them via memblock API. Since memblock API is widely used by
    other architectures this feature can be enabled outside of x86 world.
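
    The fill-and-validate idea can be sketched in plain C. This is a
    userspace illustration under stated assumptions, not the kernel's
    memtest code: fill_pattern() and check_pattern() are hypothetical
    names, and where the real feature reserves bad regions via memblock,
    this sketch only counts mismatching words.

```c
#include <stddef.h>
#include <stdint.h>

/* Write `pattern` into every 64-bit word of the region. */
void fill_pattern(volatile uint64_t *mem, size_t words, uint64_t pattern)
{
    for (size_t i = 0; i < words; i++)
        mem[i] = pattern;
}

/* Read the region back and count the words that do not hold `pattern`.
 * The index of the first mismatch is stored in *first_bad when that
 * pointer is non-NULL and at least one mismatch is found. */
size_t check_pattern(const volatile uint64_t *mem, size_t words,
                     uint64_t pattern, size_t *first_bad)
{
    size_t bad = 0;

    for (size_t i = 0; i < words; i++) {
        if (mem[i] != pattern) {
            if (bad == 0 && first_bad)
                *first_bad = i;
            bad++;
        }
    }
    return bad;
}
```

    Running the pair once per pattern over a region gives one pass of
    such a test; a word that reads back differently marks memory that
    should not be handed to the allocator.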

    This patch set promotes memtest to live under generic mm umbrella and
    enables memtest feature for arm/arm64.

    It was reported that this patch set was useful for tracking down an issue
    with some errant DMA on an arm64 platform.

    This patch (of 6):

    There is nothing platform dependent in the core memtest code, so other
    platforms might benefit from this feature too.

    [linux@roeck-us.net: MEMTEST depends on MEMBLOCK]
    Signed-off-by: Vladimir Murzin
    Acked-by: Will Deacon
    Tested-by: Mark Rutland
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Catalin Marinas
    Cc: Russell King
    Cc: Paul Bolle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Murzin
     
  • If __get_user_pages() is faulting a significant number of hugetlb pages,
    usually as the result of mmap(MAP_LOCKED), it can potentially allocate a
    very large amount of memory.

    If the process has been oom killed, this can consume a lot of memory
    and potentially deplete memory reserves.

    In the same way that commit 4779280d1ea4 ("mm: make get_user_pages()
    interruptible") aborted for pending SIGKILLs when faulting non-hugetlb
    memory, based on the premise of commit 462e00cc7151 ("oom: stop
    allocating user memory if TIF_MEMDIE is set"), hugetlb page faults now
    terminate when the process has been oom killed.

    Signed-off-by: David Rientjes
    Acked-by: Rik van Riel
    Acked-by: Greg Thelen
    Cc: Naoya Horiguchi
    Acked-by: Davidlohr Bueso
    Acked-by: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Allocating a large number of elements in atomic context could quickly
    deplete memory reserves, so just disallow atomic resizing entirely.

    Nothing currently uses mempool_resize() with anything other than
    GFP_KERNEL, so convert existing callers to drop the gfp_mask.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: David Rientjes
    Acked-by: Steffen Maier [zfcp]
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Steve French
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • If kernel panics due to oom, caused by a cgroup reaching its limit, when
    'compulsory panic_on_oom' is enabled, then we will only see that the OOM
    happened because of "compulsory panic_on_oom is enabled" but this doesn't
    tell the difference between mempolicy and memcg. And dumping system wide
    information is plain wrong and more confusing. This patch provides the
    information of the cgroup whose limit triggered the panic.

    Signed-off-by: Balasubramani Vivekanandan
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Balasubramani Vivekanandan
     
  • This code is dead since commit 9e645ab6d089 ("sched/numa: Continue PTE
    scanning even if migrate rate limited") so remove it.

    Signed-off-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • The arch_randomize_brk() function is used on several architectures,
    even those that don't support ET_DYN ASLR. To avoid bulky extern/#define
    tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for
    the architectures that support it, while still handling CONFIG_COMPAT_BRK.

    Signed-off-by: Kees Cook
    Cc: Hector Marco-Gisbert
    Cc: Russell King
    Reviewed-by: Ingo Molnar
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Alexander Viro
    Cc: Oleg Nesterov
    Cc: Andy Lutomirski
    Cc: "David A. Long"
    Cc: Andrey Ryabinin
    Cc: Arun Chandran
    Cc: Yann Droneaud
    Cc: Min-Hua Chen
    Cc: Paul Burton
    Cc: Alex Smith
    Cc: Markos Chandras
    Cc: Vineeth Vijayan
    Cc: Jeff Bailey
    Cc: Michael Holzheu
    Cc: Ben Hutchings
    Cc: Behan Webster
    Cc: Ismael Ripoll
    Cc: Jan-Simon Möller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • This fixes the "offset2lib" weakness in ASLR for arm, arm64, mips,
    powerpc, and x86. The problem is that if there is a leak of ASLR from
    the executable (ET_DYN), it means a leak of shared library offset as
    well (mmap), and vice versa. Further details and a PoC of this attack
    are available here:

    http://cybersecurity.upv.es/attacks/offset2lib/offset2lib.html

    With this patch, a PIE linked executable (ET_DYN) has its own ASLR
    region:

    $ ./show_mmaps_pie
    54859ccd6000-54859ccd7000 r-xp ... /tmp/show_mmaps_pie
    54859ced6000-54859ced7000 r--p ... /tmp/show_mmaps_pie
    54859ced7000-54859ced8000 rw-p ... /tmp/show_mmaps_pie
    7f75be764000-7f75be91f000 r-xp ... /lib/x86_64-linux-gnu/libc.so.6
    7f75be91f000-7f75beb1f000 ---p ... /lib/x86_64-linux-gnu/libc.so.6
    7f75beb1f000-7f75beb23000 r--p ... /lib/x86_64-linux-gnu/libc.so.6
    7f75beb23000-7f75beb25000 rw-p ... /lib/x86_64-linux-gnu/libc.so.6
    7f75beb25000-7f75beb2a000 rw-p ...
    7f75beb2a000-7f75beb4d000 r-xp ... /lib64/ld-linux-x86-64.so.2
    7f75bed45000-7f75bed46000 rw-p ...
    7f75bed46000-7f75bed47000 r-xp ...
    7f75bed47000-7f75bed4c000 rw-p ...
    7f75bed4c000-7f75bed4d000 r--p ... /lib64/ld-linux-x86-64.so.2
    7f75bed4d000-7f75bed4e000 rw-p ... /lib64/ld-linux-x86-64.so.2
    7f75bed4e000-7f75bed4f000 rw-p ...
    7fffb3741000-7fffb3762000 rw-p ... [stack]
    7fffb377b000-7fffb377d000 r--p ... [vvar]
    7fffb377d000-7fffb377f000 r-xp ... [vdso]

    The change is to add a call to the newly created arch_mmap_rnd() into the
    ELF loader for handling ET_DYN ASLR in a separate region from mmap ASLR,
    as was already done on s390. Removes CONFIG_BINFMT_ELF_RANDOMIZE_PIE,
    which is no longer needed.
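
    As a rough userspace sketch of the idea (not the kernel's code), the
    ET_DYN load address can be thought of as a fixed base plus its own
    random offset, rounded down to a page boundary. The function name, the
    4KB page size and the idea of passing the random value in as a parameter
    (standing in for arch_mmap_rnd()) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size */

/* Round an address down to the start of its page. */
static uint64_t sketch_page_start(uint64_t addr)
{
    return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper: compute where an ET_DYN image would be placed.
 * et_dyn_base stands in for ELF_ET_DYN_BASE, vaddr for the first segment's
 * p_vaddr, and rnd for the value arch_mmap_rnd() would return.
 */
uint64_t sketch_et_dyn_load_bias(uint64_t et_dyn_base, uint64_t vaddr,
                                 bool randomize, uint64_t rnd)
{
    uint64_t bias = et_dyn_base - vaddr;

    if (randomize)
        bias += rnd; /* offset is independent of the mmap base */
    return sketch_page_start(bias);
}
```

    Because the offset is added to a static base rather than derived from
    mm->mmap_base, learning the executable's address in this model says
    nothing about where the shared libraries were mapped.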

    Signed-off-by: Kees Cook
    Reported-by: Hector Marco-Gisbert
    Cc: Russell King
    Reviewed-by: Ingo Molnar
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Alexander Viro
    Cc: Oleg Nesterov
    Cc: Andy Lutomirski
    Cc: "David A. Long"
    Cc: Andrey Ryabinin
    Cc: Arun Chandran
    Cc: Yann Droneaud
    Cc: Min-Hua Chen
    Cc: Paul Burton
    Cc: Alex Smith
    Cc: Markos Chandras
    Cc: Vineeth Vijayan
    Cc: Jeff Bailey
    Cc: Michael Holzheu
    Cc: Ben Hutchings
    Cc: Behan Webster
    Cc: Ismael Ripoll
    Cc: Jan-Simon Möller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for moving ET_DYN randomization into the ELF loader (which
    requires a static ELF_ET_DYN_BASE), this redefines s390's existing ET_DYN
    randomization in a call to arch_mmap_rnd(). This refactoring results in
    the same ET_DYN randomization on s390.

    Signed-off-by: Kees Cook
    Acked-by: Martin Schwidefsky
    Cc: Heiko Carstens
    Reviewed-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • When an architecture fully supports randomizing the ELF load location,
    a per-arch mmap_rnd() function is used to find a randomized mmap base.
    In preparation for randomizing the location of ET_DYN binaries
    separately from mmap, this renames and exports these functions as
    arch_mmap_rnd(). Additionally introduces CONFIG_ARCH_HAS_ELF_RANDOMIZE
    for describing this feature on architectures that support it
    (which is a superset of ARCH_BINFMT_ELF_RANDOMIZE_PIE, since s390
    already supports a separated ET_DYN ASLR from mmap ASLR without the
    ARCH_BINFMT_ELF_RANDOMIZE_PIE logic).

    Signed-off-by: Kees Cook
    Cc: Hector Marco-Gisbert
    Cc: Russell King
    Reviewed-by: Ingo Molnar
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Alexander Viro
    Cc: Oleg Nesterov
    Cc: Andy Lutomirski
    Cc: "David A. Long"
    Cc: Andrey Ryabinin
    Cc: Arun Chandran
    Cc: Yann Droneaud
    Cc: Min-Hua Chen
    Cc: Paul Burton
    Cc: Alex Smith
    Cc: Markos Chandras
    Cc: Vineeth Vijayan
    Cc: Jeff Bailey
    Cc: Michael Holzheu
    Cc: Ben Hutchings
    Cc: Behan Webster
    Cc: Ismael Ripoll
    Cc: Jan-Simon Möller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for splitting out ET_DYN ASLR, this refactors the use of
    mmap_rnd() to be used similarly to arm and x86, and extracts the
    checking of PF_RANDOMIZE.

    Signed-off-by: Kees Cook
    Acked-by: Martin Schwidefsky
    Cc: Heiko Carstens
    Reviewed-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for splitting out ET_DYN ASLR, this refactors the use of
    mmap_rnd() to be used similarly to arm and x86.

    (Can mmap ASLR be safely enabled in the legacy mmap case here? Other
    archs use "mm->mmap_base = TASK_UNMAPPED_BASE + random_factor".)

    Signed-off-by: Kees Cook
    Reviewed-by: Ingo Molnar
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for splitting out ET_DYN ASLR, extract the mmap ASLR
    selection into a separate function.

    Signed-off-by: Kees Cook
    Reviewed-by: Ingo Molnar
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for splitting out ET_DYN ASLR, this refactors the use of
    mmap_rnd() to be used similarly to arm and x86. This additionally
    enables mmap ASLR on legacy mmap layouts, which appeared to be missing
    on arm64, and was already supported on arm. Additionally removes a
    copy/pasted declaration of an unused function.

    Signed-off-by: Kees Cook
    Cc: Russell King
    Cc: Catalin Marinas
    Reviewed-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for splitting out ET_DYN ASLR, this refactors the use of
    mmap_rnd() to be used similarly to arm, and extracts the checking of
    PF_RANDOMIZE.

    Signed-off-by: Kees Cook
    Reviewed-by: Ingo Molnar
    Cc: Oleg Nesterov
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • To address the "offset2lib" ASLR weakness[1], this separates ET_DYN ASLR
    from mmap ASLR, as already done on s390. The architectures that are
    already randomizing mmap (arm, arm64, mips, powerpc, s390, and x86), have
    their various forms of arch_mmap_rnd() made available via the new
    CONFIG_ARCH_HAS_ELF_RANDOMIZE. For these architectures,
    arch_randomize_brk() is collapsed as well.

    This is an alternative to the solutions in:
    https://lkml.org/lkml/2015/2/23/442

    I've been able to test x86 and arm, and the buildbot (so far) seems happy
    with building the rest.

    [1] http://cybersecurity.upv.es/attacks/offset2lib/offset2lib.html

    This patch (of 10):

    In preparation for splitting out ET_DYN ASLR, this moves the ASLR
    calculations for mmap on ARM into a separate routine, similar to x86.
    This also removes the redundant check of personality (PF_RANDOMIZE is
    already set before calling arch_pick_mmap_layout).

    Signed-off-by: Kees Cook
    Cc: Hector Marco-Gisbert
    Cc: Russell King
    Reviewed-by: Ingo Molnar
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Alexander Viro
    Cc: Oleg Nesterov
    Cc: Andy Lutomirski
    Cc: "David A. Long"
    Cc: Andrey Ryabinin
    Cc: Arun Chandran
    Cc: Yann Droneaud
    Cc: Min-Hua Chen
    Cc: Paul Burton
    Cc: Alex Smith
    Cc: Markos Chandras
    Cc: Vineeth Vijayan
    Cc: Jeff Bailey
    Cc: Michael Holzheu
    Cc: Ben Hutchings
    Cc: Behan Webster
    Cc: Ismael Ripoll
    Cc: Jan-Simon Möller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down
    address allocation strategy, load_elf_binary() will attempt to map a PIE
    binary into an address range immediately below mm->mmap_base.

    Unfortunately, load_elf_binary() does not take account of the need to
    allocate sufficient space for the entire binary which means that, while
    the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent
    PT_LOAD segment(s) end up being mapped above mm->mmap_base into the area
    that is supposed to be the "gap" between the stack and the binary.

    Since the size of the "gap" on x86_64 is only guaranteed to be 128MB this
    means that binaries with large data segments > 128MB can end up mapping
    part of their data segment over their stack resulting in corruption of the
    stack (and the data segment once the binary starts to run).

    Any PIE binary with a data segment > 128MB is vulnerable to this although
    address randomization means that the actual gap between the stack and the
    end of the binary is normally greater than 128MB. The larger the data
    segment of the binary the higher the probability of failure.

    Fix this by calculating the total size of the binary in the same way as
    load_elf_interp().
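
    A minimal sketch of such a total-size calculation, assuming it spans
    from the page start of the first PT_LOAD segment to the end of the last
    one. The struct and function names are hypothetical and the program
    header is reduced to the three fields the calculation needs; a 4KB page
    size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PT_LOAD   1       /* ELF program header type for loadable segments */
#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size */

/* Reduced stand-in for an ELF program header. */
struct sketch_phdr {
    uint32_t type;
    uint64_t vaddr;
    uint64_t memsz;
};

/*
 * Return the size of the address range needed to hold every PT_LOAD
 * segment, or 0 if there is none. Segments are assumed to be sorted by
 * ascending vaddr, as the ELF specification requires for PT_LOAD entries.
 */
uint64_t sketch_total_mapping_size(const struct sketch_phdr *ph, size_t n)
{
    size_t i;
    int first = -1, last = -1;

    for (i = 0; i < n; i++) {
        if (ph[i].type != SKETCH_PT_LOAD)
            continue;
        last = (int)i;
        if (first == -1)
            first = (int)i;
    }
    if (first == -1)
        return 0;

    return ph[last].vaddr + ph[last].memsz -
           (ph[first].vaddr & ~(SKETCH_PAGE_SIZE - 1));
}
```

    Reserving this whole range up front means later segments land inside
    the reservation instead of spilling past mm->mmap_base.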

    Signed-off-by: Michael Davidson
    Cc: Alexander Viro
    Cc: Jiri Kosina
    Cc: Kees Cook
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Davidson
     
  • When !MMU, the build reports a warning. The related warning with
    allmodconfig under c6x:

    CC mm/memcontrol.o
    mm/memcontrol.c:2802:12: warning: 'mem_cgroup_move_account' defined but not used [-Wunused-function]
    static int mem_cgroup_move_account(struct page *page,
    ^

    Signed-off-by: Chen Gang
    Acked-by: Michal Hocko
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen Gang
     
  • Implement huge KVA mapping interfaces on x86.

    On x86, MTRRs can override PAT memory types with a 4KB granularity. When
    using a huge page, MTRRs can override the memory type of the huge page,
    which may lead to a performance penalty. The processor can also behave in
    an undefined manner if a huge page is mapped to a memory range that MTRRs
    have mapped with multiple different memory types. Therefore, the mapping
    code falls back to use a smaller page size toward 4KB when a mapping range
    is covered by non-WB type of MTRRs. The WB type of MTRRs has no effect on
    the PAT memory types.

    pud_set_huge() and pmd_set_huge() call mtrr_type_lookup() to see if a
    given range is covered by MTRRs. MTRR_TYPE_WRBACK indicates that the
    range is either covered by WB or not covered and the MTRR default value is
    set to WB. 0xFF indicates that MTRRs are disabled.
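
    The decision described above can be sketched as a small predicate. This
    is an illustration under assumptions, not the patch itself: the function
    name is hypothetical, the lookup result is passed in rather than obtained
    from mtrr_type_lookup(), and treating 0xFF (MTRRs disabled) as "huge page
    allowed" is a reading of the paragraph above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MTRR_TYPE_WRBACK 6    /* x86 write-back memory type encoding */
#define SKETCH_MTRR_DISABLED    0xFF /* lookup result when MTRRs are off */

/*
 * Hypothetical predicate: may a huge page be used for a range whose MTRR
 * lookup returned mtrr_type? Any other type means the caller should fall
 * back to a smaller page size.
 */
bool sketch_huge_page_allowed(uint8_t mtrr_type)
{
    return mtrr_type == SKETCH_MTRR_TYPE_WRBACK ||
           mtrr_type == SKETCH_MTRR_DISABLED;
}
```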

    HAVE_ARCH_HUGE_VMAP is selected when X86_64 or X86_32 with X86_PAE is set.
    X86_32 without X86_PAE is not supported since such a config is unlikely
    to benefit from this feature, and there was an issue found in testing.

    [fengguang.wu@intel.com: ioremap_pud_capable can be static]
    Signed-off-by: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Dave Hansen
    Cc: Robert Elliott
    Signed-off-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani
     
  • Implement huge I/O mapping capability interfaces for ioremap() on x86.

    IOREMAP_MAX_ORDER is defined to PUD_SHIFT on x86/64 and PMD_SHIFT on
    x86/32, which overrides the default value defined in .

    Signed-off-by: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Dave Hansen
    Cc: Robert Elliott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani
     
  • Change vunmap_pmd_range() and vunmap_pud_range() to tear down huge KVA
    mappings when they are set. pud_clear_huge() and pmd_clear_huge() return
    zero when no-operation is performed, i.e. huge page mapping was not used.

    These changes are only enabled when CONFIG_HAVE_ARCH_HUGE_VMAP is defined
    on the architecture.

    [akpm@linux-foundation.org: use consistent code layout]
    Signed-off-by: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Dave Hansen
    Cc: Robert Elliott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani
     
  • ioremap_pud_range() and ioremap_pmd_range() are changed to create huge I/O
    mappings when their capability is enabled, and a request meets required
    conditions -- both virtual & physical addresses are aligned by their huge
    page size, and a requested range fulfills their huge page size. When
    pud_set_huge() or pmd_set_huge() returns zero, i.e. no-operation is
    performed, the code simply falls back to the next level.
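
    The conditions listed above can be sketched as a single check. The
    function and parameter names are hypothetical; the kernel evaluates the
    equivalent test per page-table level inside ioremap_pud_range() and
    ioremap_pmd_range(), and its exact form may differ from this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is a multiple of size; size must be a power of two. */
static bool sketch_is_aligned(uint64_t v, uint64_t size)
{
    return (v & (size - 1)) == 0;
}

/*
 * Hypothetical check: can [vaddr, end) be mapped to paddr with one huge
 * page of huge_size bytes? Requires the capability to be enabled, both
 * addresses to be aligned, and the range to cover a full huge page.
 */
bool sketch_huge_mapping_eligible(bool enabled, uint64_t vaddr, uint64_t end,
                                  uint64_t paddr, uint64_t huge_size)
{
    return enabled &&
           end >= vaddr &&
           end - vaddr >= huge_size &&
           sketch_is_aligned(vaddr, huge_size) &&
           sketch_is_aligned(paddr, huge_size);
}
```

    When the check fails, the caller simply descends to the next page-table
    level, which is the fallback the paragraph above describes.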

    The changes are only enabled when CONFIG_HAVE_ARCH_HUGE_VMAP is defined on
    the architecture.

    Signed-off-by: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Dave Hansen
    Cc: Robert Elliott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani
     
  • Add ioremap_pud_enabled() and ioremap_pmd_enabled(), which return 1 when
    I/O mappings with pud/pmd are enabled on the kernel.

    ioremap_huge_init() calls arch_ioremap_pud_supported() and
    arch_ioremap_pmd_supported() to initialize the capabilities at boot-time.

    A new kernel option "nohugeiomap" is also added, so that user can disable
    the huge I/O map capabilities when necessary.
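
    A userspace sketch of how the boot-time capabilities and the
    "nohugeiomap" option might combine. The struct and function names are
    hypothetical, the architecture answers are passed in as plain booleans
    instead of calling arch_ioremap_pud_supported() and
    arch_ioremap_pmd_supported(), and the substring match on the command
    line is a simplification of real kernel parameter parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the two capability flags. */
struct sketch_ioremap_caps {
    bool pud;
    bool pmd;
};

/*
 * Decide the capabilities once, at "boot": each one is enabled only if the
 * architecture supports it and the user did not pass nohugeiomap.
 */
struct sketch_ioremap_caps sketch_ioremap_huge_init(const char *cmdline,
                                                    bool arch_pud,
                                                    bool arch_pmd)
{
    struct sketch_ioremap_caps caps;
    /* Naive substring search; real parsing matches whole parameters. */
    bool disabled = cmdline != NULL && strstr(cmdline, "nohugeiomap") != NULL;

    caps.pud = !disabled && arch_pud;
    caps.pmd = !disabled && arch_pmd;
    return caps;
}
```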

    Signed-off-by: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Dave Hansen
    Cc: Robert Elliott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani
     
  • ioremap() and its related interfaces are used to create I/O mappings to
    memory-mapped I/O devices. The mapping sizes of the traditional I/O
    devices are relatively small. Non-volatile memory (NVM), however, has
    many GB and is going to have TB soon. It is not very efficient to create
    large I/O mappings with 4KB.

    This patchset extends the ioremap() interfaces to transparently create I/O
    mappings with huge pages whenever possible. ioremap() continues to use
    4KB mappings when a huge page does not fit into a requested range. There
    is no change necessary to the drivers using ioremap(). A requested
    physical address must be aligned by a huge page size (1GB or 2MB on x86)
    for using huge page mapping, though. The kernel huge I/O mapping will
    improve performance of NVM and other devices with large memory, and reduce
    the time to create their mappings as well.

    On x86, MTRRs can override PAT memory types with a 4KB granularity. When
    using a huge page, MTRRs can override the memory type of the huge page,
    which may lead to a performance penalty. The processor can also behave in
    an undefined manner if a huge page is mapped to a memory range that MTRRs
    have mapped with multiple different memory types. Therefore, the mapping
    code falls back to use a smaller page size toward 4KB when a mapping range
    is covered by non-WB type of MTRRs. The WB type of MTRRs has no effect on
    the PAT memory types.

    The patchset introduces HAVE_ARCH_HUGE_VMAP, which indicates that the arch
    supports huge KVA mappings for ioremap(). User may specify a new kernel
    option "nohugeiomap" to disable the huge I/O mapping capability of
    ioremap() when necessary.

    Patch 1-4 change common files to support huge I/O mappings. There is no
    change in functionality unless HAVE_ARCH_HUGE_VMAP is defined on the
    architecture of the system.

    Patch 5-6 implement the HAVE_ARCH_HUGE_VMAP funcs on x86, and set
    HAVE_ARCH_HUGE_VMAP on x86.

    This patch (of 6):

    __get_vm_area_node() takes unsigned long size, which is a 64-bit value on
    a 64-bit kernel. However, fls(size) simply ignores the upper 32 bits.
    Change to use fls_long() to handle the size properly.
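
    The truncation can be reproduced in userspace with stand-ins for the two
    helpers. The functions below are simplified re-implementations written
    for illustration, not the kernel's versions, and uint64_t is used in
    place of unsigned long to model a 64-bit kernel regardless of the host.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for fls(): 1-based index of the highest set bit of a 32-bit
 * value, or 0 if no bit is set. */
int sketch_fls(uint32_t x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Stand-in for fls_long() on a 64-bit kernel: same, over all 64 bits. */
int sketch_fls_long(uint64_t x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* What happens when a 64-bit size is handed to the 32-bit helper: the
 * value is narrowed first, so the upper 32 bits never reach the loop. */
int sketch_truncated_fls(uint64_t size)
{
    return sketch_fls((uint32_t)size);
}
```

    For a 4GB size (bit 32 set), sketch_truncated_fls() returns 0 while
    sketch_fls_long() returns 33; for sizes below 4GB the two agree.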

    Signed-off-by: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Dave Hansen
    Cc: Robert Elliott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani