16 Feb, 2015

1 commit


11 Feb, 2015

39 commits

  • Pull live patching infrastructure from Jiri Kosina:
    "Let me provide a bit of history first, before describing what is in
    this pile.

    Originally, there was kSplice as a standalone project that implemented
    stop_machine()-based patching for the Linux kernel. This project was
    later acquired, and the current owner is providing live patching as a
    proprietary service, without any intentions to have their
    implementation merged.

    Then, due to rising user/customer demand, both Red Hat and SUSE
    started working on their own implementation (not knowing about each
    other), and announced first versions roughly at the same time [1] [2].

    The principal difference between the two solutions is how they are
    making sure that the patching is performed in a consistent way when it
    comes to different execution threads with respect to the semantic
    nature of the change that is being introduced.

    In a nutshell, kPatch is issuing stop_machine(), then looking at
    stacks of all existing processes, and if it decides that the system
    is in a state that can be patched safely, it proceeds inserting code
    redirection machinery to the patched functions.

    On the other hand, kGraft provides a per-thread consistency during one
    single pass of a process through the kernel and performs a lazy
    continuous migration of threads from "unpatched" universe to the
    "patched" one at safe checkpoints.

    If interested in a more detailed discussion about the consistency
    models and their possible combinations, please see the thread that
    evolved around [3].

    It pretty quickly became obvious to the interested parties that it's
    absolutely impractical in this case to have several isolated solutions
    for one task to co-exist in the kernel. During a dedicated Live
    Kernel Patching track at LPC in Dusseldorf, all the interested parties
    sat together and came up with a joint approach that would work for both
    distro vendors. Steven Rostedt took notes [4] from this meeting.

    And the foundation for that approach is what's present in this pull
    request.

    It provides a basic infrastructure for function "live patching" (i.e.
    code redirection), including API for kernel modules containing the
    actual patches, and API/ABI for userspace to be able to operate on the
    patches (look up what patches are applied, enable/disable them, etc).

    It's relatively simple and minimalistic, as it's making use of
    existing kernel infrastructure (namely ftrace) as much as possible.
    It's also self-contained, in a sense that it doesn't hook itself in
    any other kernel subsystem (it doesn't even touch any other code).
    It's now implemented for x86 only as a reference architecture, but
    support for powerpc, s390 and arm is already in the works (adding
    arch-specific support basically boils down to teaching ftrace about
    regs-saving).

    Once this common infrastructure gets merged, both Red Hat and SUSE
    have agreed to immediately start porting their current solutions on
    top of this, abandoning their out-of-tree code. The plan basically is
    that each patch will be marked by flag(s) that would indicate which
    consistency model it is willing to use (again, the details have been
    sketched out already in the thread at [3]).

    Before this happens, the current codebase can be used to patch a large
    group of security/stability problems the patches for which are not too
    complex (in a sense that they don't introduce non-trivial change of
    function's return value semantics, they don't change layout of data
    structures, etc) -- this corresponds to LEAVE_FUNCTION &&
    SWITCH_FUNCTION semantics described at [3].

    This tree has been in linux-next since December.

    [1] https://lkml.org/lkml/2014/4/30/477
    [2] https://lkml.org/lkml/2014/7/14/857
    [3] https://lkml.org/lkml/2014/11/7/354
    [4] http://linuxplumbersconf.org/2014/wp-content/uploads/2014/10/LPC2014_LivePatching.txt

    [ The core code is introduced by the three commits authored by Seth
    Jennings, which got a lot of changes incorporated during numerous
    respins and reviews of the initial implementation. All the followup
    commits have materialized only after public tree has been created,
    so they were not folded into initial three commits so that the
    public tree doesn't get rebased ]"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: add missing newline to error message
    livepatch: rename config to CONFIG_LIVEPATCH
    livepatch: fix uninitialized return value
    livepatch: support for repatching a function
    livepatch: enforce patch stacking semantics
    livepatch: change ARCH_HAVE_LIVE_PATCHING to HAVE_LIVE_PATCHING
    livepatch: fix deferred module patching order
    livepatch: handle ancient compilers with more grace
    livepatch: kconfig: use bool instead of boolean
    livepatch: samples: fix usage example comments
    livepatch: MAINTAINERS: add git tree location
    livepatch: use FTRACE_OPS_FL_IPMODIFY
    livepatch: move x86 specific ftrace handler code to arch/x86
    livepatch: samples: add sample live patching module
    livepatch: kernel: add support for live patching
    livepatch: kernel: add TAINT_LIVEPATCH

    Linus Torvalds
     
  • Pull HID updates from Jiri Kosina:
    "Updates for HID code

    - improvements of Logitech HID++ protocol implementation, from
    Benjamin Tissoires

    - support for composite RMI devices, from Andrew Duggan

    - new driver for BETOP controller, from Huang Bo

    - fixup for conflicting mapping in HID core between PC-101/103/104
    and PC-102/105 keyboards from David Herrmann

    - new hardware support and fixes in Wacom driver, from Ping Cheng

    - assorted small fixes and device ID additions all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (33 commits)
    HID: wacom: add support for Cintiq 27QHD and 27QHD touch
    HID: wacom: consolidate input capability settings for pen and touch
    HID: wacom: make sure touch arbitration is applied consistently
    HID: pidff: Fix initialisation for Microsoft Sidewinder FF Pro 2
    HID: hyperv: match wait_for_completion_timeout return type
    HID: wacom: Report ABS_MISC event for Cintiq Companion Hybrid
    HID: Use Kbuild idiom in Makefiles
    HID: do not bind to Microchip Pick16F1454
    HID: hid-lg4ff: use DEVICE_ATTR_RW macro
    HID: hid-lg4ff: fix sysfs attribute permission
    HID: wacom: report In Range event according to the spec
    HID: wacom: process invalid Cintiq and Intuos data in wacom_intuos_inout()
    HID: rmi: Add support for the touchpad in the Razer Blade 14 laptop
    HID: rmi: Support touchpads with external buttons
    HID: rmi: Use hid_report_len to compute the size of reports
    HID: logitech-hidpp: store the name of the device in struct hidpp
    HID: microsoft: add support for Japanese Surface Type Cover 3
    HID: fixup the conflicting keyboard mappings quirk
    HID: apple: fix battery support for the 2009 ANSI wireless keyboard
    HID: fix Kconfig text
    ...

    Linus Torvalds
     
  • Commit 84683a7e081f ("sata_dwc_460ex: enable COMPILE_TEST for the
    driver") enabled this driver for non-ppc460-ex platforms, but it was
    then disabled for ARM and ARM64 by commit 2de5a9c004e9 ("sata_dwc_460ex:
    disable compilation on ARM and ARM64") because it's too noisy and
    broken.

    This disables it entirely, because it's too noisy on x86-64 too, and
    there's no point in disabling architectures one by one. At a minimum,
    the code isn't 64-bit clean, and even on 32-bit it is questionable
    whether it makes sense.

    Cc: Andy Shevchenko
    Cc: Tejun Heo
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Merge misc updates from Andrew Morton:
    "Bite-sized chunks this time, to avoid the MTA ratelimiting woes.

    - fs/notify updates

    - ocfs2

    - some of MM"

    That laconic "some MM" is mainly the removal of remap_file_pages(),
    which is a big simplification of the VM, and which gets rid of a *lot*
    of random cruft and special cases because we no longer support the
    non-linear mappings that it used.

    From a user interface perspective, nothing has changed, because the
    remap_file_pages() syscall still exists, it's just done by emulating the
    old behavior by creating a lot of individual small mappings instead of
    one non-linear one.

    The emulation is slower than the old "native" non-linear mappings, but
    nobody really uses or cares about remap_file_pages(), and simplifying
    the VM is a big advantage.

    * emailed patches from Andrew Morton : (78 commits)
    memcg: zap memcg_slab_caches and memcg_slab_mutex
    memcg: zap memcg_name argument of memcg_create_kmem_cache
    memcg: zap __memcg_{charge,uncharge}_slab
    mm/page_alloc.c: place zone_id check before VM_BUG_ON_PAGE check
    mm: hugetlb: fix type of hugetlb_treat_as_movable variable
    mm, hugetlb: remove unnecessary lower bound on sysctl handlers
    mm: memory: merge shared-writable dirtying branches in do_wp_page()
    mm: memory: remove ->vm_file check on shared writable vmas
    xtensa: drop _PAGE_FILE and pte_file()-related helpers
    x86: drop _PAGE_FILE and pte_file()-related helpers
    unicore32: drop pte_file()-related helpers
    um: drop _PAGE_FILE and pte_file()-related helpers
    tile: drop pte_file()-related helpers
    sparc: drop pte_file()-related helpers
    sh: drop _PAGE_FILE and pte_file()-related helpers
    score: drop _PAGE_FILE and pte_file()-related helpers
    s390: drop pte_file()-related helpers
    parisc: drop _PAGE_FILE and pte_file()-related helpers
    openrisc: drop _PAGE_FILE and pte_file()-related helpers
    nios2: drop _PAGE_FILE and pte_file()-related helpers
    ...

    Linus Torvalds
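
The merge message above says remap_file_pages() is now emulated by creating many small individual mappings instead of one non-linear one. A minimal userspace sketch of that idea follows. It is not the kernel's emulation code; `map_pages_in_order()` and `demo_swapped_view()` are hypothetical helpers that assume a Linux system with mmap() and MAP_FIXED.

```c
#define _DEFAULT_SOURCE /* MAP_ANONYMOUS and mkstemp() under strict -std modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Build one contiguous virtual window in which slot i shows file page
 * order[i]. Each slot is its own small linear mapping. Returns the base
 * address or MAP_FAILED. Hypothetical helper, not a kernel interface.
 */
void *map_pages_in_order(int fd, const size_t *order, size_t npages)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *base;

	assert(order != NULL && npages > 0);

	/* Reserve address space first so the small mappings stay adjacent. */
	base = mmap(NULL, npages * page, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return MAP_FAILED;

	for (size_t i = 0; i < npages; i++) {
		/* One single-page mapping per slot, placed with MAP_FIXED. */
		void *p = mmap(base + i * page, page, PROT_READ,
			       MAP_SHARED | MAP_FIXED, fd,
			       (off_t)(order[i] * page));
		if (p == MAP_FAILED) {
			munmap(base, npages * page);
			return MAP_FAILED;
		}
	}
	return base;
}

/*
 * Write a two-page file (page 0 filled with 'A', page 1 with 'B') and
 * view it with the pages swapped. Returns 0 if the view is as expected.
 */
int demo_swapped_view(void)
{
	char path[] = "/tmp/remap-demo-XXXXXX";
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t order[2] = { 1, 0 };
	char *buf, *view;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the open descriptor keeps the file alive */

	buf = malloc(2 * page);
	if (!buf) {
		close(fd);
		return -1;
	}
	memset(buf, 'A', page);
	memset(buf + page, 'B', page);
	if (write(fd, buf, 2 * page) != (ssize_t)(2 * page)) {
		free(buf);
		close(fd);
		return -1;
	}
	free(buf);

	view = map_pages_in_order(fd, order, 2);
	close(fd); /* existing mappings survive closing the descriptor */
	if (view == MAP_FAILED)
		return -1;

	ok = view[0] == 'B' && view[page] == 'A';
	munmap(view, 2 * page);
	return ok ? 0 : -1;
}
```

Each slot costs a separate mapping, which is consistent with the message's remark that the emulation is slower than the old native non-linear mappings.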
     
  • Pull gfs2 updates from Steven Whitehouse:
    "This time we have mostly clean ups. There is a bug fix for a NULL
    dereference relating to ACLs, and another which improves (but does not
    fix entirely) an allocation fall-back code path. The other three
    patches are small clean ups"

    * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
    GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl()
    GFS2: use __vmalloc GFP_NOFS for fs-related allocations.
    GFS2: Eliminate a nonsense goto
    GFS2: fix sprintf format specifier
    GFS2: Eliminate __gfs2_glock_remove_from_lru

    Linus Torvalds
     
  • Pull xfs update from Dave Chinner:
    "This update contains:

    - RENAME_EXCHANGE support

    - Rework of the superblock logging infrastructure

    - Rework of the XFS_IOCTL_SETXATTR implementation
    * enables use inside user namespaces
    * fixes inconsistencies setting extent size hints

    - fixes for missing buffer type annotations used in log recovery

    - more consolidation of libxfs headers

    - preparation patches for block based PNFS support

    - miscellaneous bug fixes and cleanups"

    * tag 'xfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (37 commits)
    xfs: only trace buffer items if they exist
    xfs: report proper f_files in statfs if we overshoot imaxpct
    xfs: fix panic_mask documentation
    xfs: xfs_ioctl_setattr_check_projid can be static
    xfs: growfs should use synchronous transactions
    xfs: fix behaviour of XFS_IOC_FSSETXATTR on directories
    xfs: factor projid hint checking out of xfs_ioctl_setattr
    xfs: factor extsize hint checking out of xfs_ioctl_setattr
    xfs: XFS_IOCTL_SETXATTR can run in user namespaces
    xfs: kill xfs_ioctl_setattr behaviour mask
    xfs: disaggregate xfs_ioctl_setattr
    xfs: factor out xfs_ioctl_setattr transaction preamble
    xfs: separate xflags from xfs_ioctl_setattr
    xfs: FSX_NONBLOCK is not used
    xfs: don't allocate an ioend for direct I/O completions
    xfs: change kmem_free to use generic kvfree()
    xfs: factor out a xfs_update_prealloc_flags() helper
    xfs: remove incorrect error negation in attr_multi ioctl
    xfs: set superblock buffer type correctly
    xfs: set buf types when converting extent formats
    ...

    Linus Torvalds
     
  • Pull quota interface unification and misc cleanups from Jan Kara:
    "The first part of the series unifying XFS and VFS quota interfaces.

    This part unifies turning quotas on and off so quota-tools and
    xfs_quota can be used to manage any filesystem. This is useful so
    that userspace doesn't have to distinguish which filesystem it is
    working with. As a result we can then easily reuse tests for project
    quotas in XFS for ext4.

    This also contains minor cleanups and fixes for udf, isofs, and ext3"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (23 commits)
    udf: remove bool assignment to 0/1
    udf: use bool for done
    quota: Store maximum space limit in bytes
    quota: Remove quota_on_meta callback
    ocfs2: Use generic helpers for quotaon and quotaoff
    ext4: Use generic helpers for quotaon and quotaoff
    quota: Add ->quota_{enable,disable} callbacks for VFS quotas
    quota: Wire up ->quota_{enable,disable} callbacks into Q_QUOTA{ON,OFF}
    quota: Split ->set_xstate callback into two
    xfs: Remove some pointless quota checks
    xfs: Remove some useless flags tests
    xfs: Remove useless test
    quota: Verify flags passed to Q_SETINFO
    quota: Cleanup flags definitions
    ocfs2: Move OLQF_CLEAN flag out of generic quota flags
    quota: Don't store flags for v2 quota format
    jbd: drop jbd_ENOSYS debug
    udf: destroy sbi mutex in put_super
    udf: Check length of extended attributes and allocation descriptors
    udf: Remove repeated loads blocksize
    ...

    Linus Torvalds
     
  • Pull file locking related changes #1 from Jeff Layton:
    "This patchset contains a fairly major overhaul of how file locks are
    tracked within the inode. Rather than a single list, we now create a
    per-inode "lock context" that contains individual lists for the file
    locks, and a new dedicated spinlock for them.

    There are changes in other trees that are based on top of this set so
    it may be easiest to pull this in early"

    * tag 'locks-v3.20-1' of git://git.samba.org/jlayton/linux:
    locks: update comments that refer to inode->i_flock
    locks: consolidate NULL i_flctx checks in locks_remove_file
    locks: keep a count of locks on the flctx lists
    locks: clean up the lm_change prototype
    locks: add a dedicated spinlock to protect i_flctx lists
    locks: remove i_flock field from struct inode
    locks: convert lease handling to file_lock_context
    locks: convert posix locks to file_lock_context
    locks: move flock locks to file_lock_context
    ceph: move spinlocking into ceph_encode_locks_to_buffer and ceph_count_locks
    locks: add a new struct file_locking_context pointer to struct inode
    locks: have locks_release_file use flock_lock_file to release generic flock locks
    locks: add new struct list_head to struct file_lock

    Linus Torvalds
     
  • Pull ACPI and power management updates from Rafael Wysocki:
    "We have a few new features this time, including a new SFI-based
    cpufreq driver, a new devfreq driver for Tegra Activity Monitor, a new
    devfreq class for providing its governors with raw utilization data
    and a new ACPI driver for AMD SoCs.

    Still, the majority of changes here are reworks of existing code to
    make it more straightforward or to prepare it for implementing new
    features on top of it. The primary example is the rework of ACPI
    resources handling from Jiang Liu, Thomas Gleixner and Lv Zheng with
    support for IOAPIC hotplug implemented on top of it, but there is
    quite a number of changes of this kind in the cpufreq core, ACPICA,
    ACPI EC driver, ACPI processor driver and the generic power domains
    core code too.

    The most active developer is Viresh Kumar with his cpufreq changes.

    Specifics:

    - Rework of the core ACPI resources parsing code to fix issues in it
    and make using resource offsets more convenient and consolidation
    of some resource-handling code in a couple of places that have grown
    analogous data structures and code to cover the same gap in the
    core (Jiang Liu, Thomas Gleixner, Lv Zheng).

    - ACPI-based IOAPIC hotplug support on top of the resources handling
    rework (Jiang Liu, Yinghai Lu).

    - ACPICA update to upstream release 20150204 including an interrupt
    handling rework that allows drivers to install raw handlers for
    ACPI GPEs which then become entirely responsible for the given GPE
    and the ACPICA core code won't touch it (Lv Zheng, David E Box,
    Octavian Purdila).

    - ACPI EC driver rework to fix several concurrency issues and other
    problems related to events handling on top of the ACPICA's new
    support for raw GPE handlers (Lv Zheng).

    - New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power
    Subsystem) driver for Intel chips (Ken Xue).

    - Two minor fixes of the ACPI LPSS driver (Heikki Krogerus, Jarkko
    Nikula).

    - Two new blacklist entries for machines (Samsung 730U3E/740U3E and
    510R) where the native backlight interface doesn't work correctly
    while the ACPI one does (Hans de Goede).

    - Rework of the ACPI processor driver's handling of idle states to
    make the code more straightforward and less bloated overall (Rafael
    J Wysocki).

    - Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht,
    Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki, Yaowei
    Bai).

    - PCI core power management modification to avoid resuming (some)
    runtime-suspended devices during system suspend if they are in the
    right states already (Rafael J Wysocki).

    - New SFI-based cpufreq driver for Intel platforms using SFI
    (Srinidhi Kasagar).

    - cpufreq core fixes, cleanups and simplifications (Viresh Kumar,
    Doug Anderson, Wolfram Sang).

    - SkyLake CPU support and other updates for the intel_pstate driver
    (Kristen Carlson Accardi, Srinivas Pandruvada).

    - cpufreq-dt driver cleanup (Markus Elfring).

    - Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla).

    - Generic power domains core code fixes and cleanups (Ulf Hansson).

    - Operating Performance Points (OPP) core code cleanups and kernel
    documentation update (Nishanth Menon).

    - New debugfs interface to make the list of PM QoS constraints
    available to user space (Nishanth Menon).

    - New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso).

    - New devfreq class (devfreq_event) to provide raw utilization data
    to devfreq governors (Chanwoo Choi).

    - Assorted minor fixes and cleanups related to power management
    (Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist, Pavel
    Machek, Todd E Brandt, Wonhong Kwon).

    - turbostat updates (Len Brown) and cpupower Makefile improvement
    (Sriram Raghunathan)"

    * tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (151 commits)
    tools/power turbostat: relax dependency on APERF_MSR
    tools/power turbostat: relax dependency on invariant TSC
    Merge branch 'pci/host-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into acpi-resources
    tools/power turbostat: decode MSR_*_PERF_LIMIT_REASONS
    tools/power turbostat: relax dependency on root permission
    ACPI / video: Add disable_native_backlight quirk for Samsung 510R
    ACPI / PM: Remove unneeded nested #ifdef
    USB / PM: Remove unneeded #ifdef and associated dead code
    intel_pstate: provide option to only use intel_pstate with HWP
    ACPI / EC: Add GPE reference counting debugging messages
    ACPI / EC: Add query flushing support
    ACPI / EC: Refine command storm prevention support
    ACPI / EC: Add command flushing support.
    ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag
    ACPI: add AMD ACPI2Platform device support for x86 system
    ACPI / table: remove duplicate NULL check for the handler of acpi_table_parse()
    ACPI / EC: Update revision due to raw handler mode.
    ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp.
    ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
    ACPICA: Events: Enable APIs to allow interrupt/polling adaptive request based GPE handling model
    ...

    Linus Torvalds
     
  • Pull PCI changes from Bjorn Helgaas:
    "Enumeration
    - Move domain assignment from arm64 to generic code (Lorenzo Pieralisi)
    - ARM: Remove artificial dependency on pci_sys_data domain (Lorenzo Pieralisi)
    - ARM: Move to generic PCI domains (Lorenzo Pieralisi)
    - Generate uppercase hex for modalias var in uevent (Ricardo Ribalda Delgado)
    - Add and use generic config accessors on ARM, PowerPC (Rob Herring)

    Resource management
    - Free resources on failure in of_pci_get_host_bridge_resources() (Lorenzo Pieralisi)
    - Fix infinite loop with ROM image of size 0 (Michel Dänzer)

    PCI device hotplug
    - Handle surprise add even if surprise removal isn't supported (Bjorn Helgaas)

    Virtualization
    - Mark AMD/ATI VGA devices that don't reset on D3hot->D0 transition (Alex Williamson)
    - Add DMA alias quirk for Adaptec 3405 (Alex Williamson)
    - Add Wellsburg (X99) to Intel PCH root port ACS quirk (Alex Williamson)
    - Add ACS quirk for Emulex NICs (Vasundhara Volam)

    MSI
    - Fail MSI-X mappings if there's no space assigned to MSI-X BAR (Yijing Wang)

    Freescale Layerscape host bridge driver
    - Fix platform_no_drv_owner.cocci warnings (Julia Lawall)

    NVIDIA Tegra host bridge driver
    - Remove unnecessary tegra_pcie_fixup_bridge() (Lucas Stach)

    Renesas R-Car host bridge driver
    - Fix error handling of irq_of_parse_and_map() (Dmitry Torokhov)

    TI Keystone host bridge driver
    - Fix error handling of irq_of_parse_and_map() (Dmitry Torokhov)
    - Fix misspelling of current function in debug output (Julia Lawall)

    Xilinx AXI host bridge driver
    - Fix harmless format string warning (Arnd Bergmann)

    Miscellaneous
    - Use standard parsing functions for ASPM sysfs setters (Chris J Arges)
    - Add pci_device_to_OF_node() stub for !CONFIG_OF (Kevin Hao)
    - Delete unnecessary NULL pointer checks (Markus Elfring)
    - Add and use defines for PCIe Max_Read_Request_Size (Rafał Miłecki)
    - Include clk.h instead of clk-private.h (Stephen Boyd)"

    * tag 'pci-v3.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
    PCI: Add pci_device_to_OF_node() stub for !CONFIG_OF
    PCI: xilinx: Convert to use generic config accessors
    PCI: xgene: Convert to use generic config accessors
    PCI: tegra: Convert to use generic config accessors
    PCI: rcar: Convert to use generic config accessors
    PCI: generic: Convert to use generic config accessors
    powerpc/powermac: Convert PCI to use generic config accessors
    powerpc/fsl_pci: Convert PCI to use generic config accessors
    ARM: ks8695: Convert PCI to use generic config accessors
    ARM: sa1100: Convert PCI to use generic config accessors
    ARM: integrator: Convert PCI to use generic config accessors
    PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver
    ARM: dts: versatile: add PCI controller binding
    of/pci: Free resources on failure in of_pci_get_host_bridge_resources()
    PCI: versatile: Add DT docs for ARM Versatile PB PCIe driver
    PCI: Fail MSI-X mappings if there's no space assigned to MSI-X BAR
    r8169: use PCI define for Max_Read_Request_Size
    [SCSI] esas2r: use PCI define for Max_Read_Request_Size
    tile: use PCI define for Max_Read_Request_Size
    rapidio/tsi721: use PCI define for Max_Read_Request_Size
    ...

    Linus Torvalds
     
  • mem_cgroup->memcg_slab_caches is a list of kmem caches corresponding to
    the given cgroup. Currently, it is only used on css free in order to
    destroy all caches corresponding to the memory cgroup being freed. The
    list is protected by memcg_slab_mutex. The mutex is also used to protect
    kmem_cache->memcg_params->memcg_caches arrays and synchronizes
    kmem_cache_destroy vs memcg_unregister_all_caches.

    However, we can perfectly get on without these two. To destroy all caches
    corresponding to a memory cgroup, we can walk over the global list of kmem
    caches, slab_caches, and we can do all the synchronization stuff using the
    slab_mutex instead of the memcg_slab_mutex. This patch therefore gets rid
    of the memcg_slab_caches and memcg_slab_mutex.

    Apart from this nice cleanup, it also:

    - ensures that rcu_barrier() is called at most once when a root cache is
    destroyed or a memory cgroup is freed, no matter how many caches have
    SLAB_DESTROY_BY_RCU flag set;

    - fixes the race between kmem_cache_destroy and kmem_cache_create that
    exists, because memcg_cleanup_cache_params, which is called from
    kmem_cache_destroy after checking that kmem_cache->refcount=0,
    releases the slab_mutex, which gives kmem_cache_create a chance to
    make an alias to a cache doomed to be destroyed.

    Signed-off-by: Vladimir Davydov
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
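
The commit above replaces the per-cgroup cache list with a walk over the global cache list. A minimal sketch of that pattern, under stated assumptions: `struct cache_model` and `destroy_memcg_caches()` are hypothetical stand-ins, and the locking that the kernel does with slab_mutex is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kmem cache linked into one global list. */
struct cache_model {
	struct cache_model *next; /* link in the global list (models slab_caches) */
	int memcg_id;             /* owning cgroup; 0 models a root cache */
	int destroyed;            /* set once the cache has been torn down */
};

/*
 * Walk the global list and destroy every cache that belongs to memcg_id.
 * Returns how many caches were destroyed. In the kernel this walk would
 * run under slab_mutex; the sketch is single-threaded.
 */
int destroy_memcg_caches(struct cache_model *all, int memcg_id)
{
	int destroyed = 0;

	assert(memcg_id != 0); /* root caches are not owned by a cgroup */

	for (struct cache_model *c = all; c != NULL; c = c->next) {
		if (c->memcg_id == memcg_id && !c->destroyed) {
			c->destroyed = 1;
			destroyed++;
		}
	}
	return destroyed;
}
```

The trade-off is a longer walk (all caches rather than one cgroup's) in exchange for dropping a second list and a second mutex.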
     
  • Instead of passing the name of the memory cgroup which the cache is
    created for in the memcg_name argument, let's obtain it immediately in
    memcg_create_kmem_cache.

    Signed-off-by: Vladimir Davydov
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • They are simple wrappers around memcg_{charge,uncharge}_kmem, so let's
    zap them and call these functions directly.

    Signed-off-by: Vladimir Davydov
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • If the page being freed and its buddy page are not in the same zone, the
    zone->lock currently held for the page being freed can't prevent the buddy
    page from being allocated. With a very small probability, this can trigger
    VM_BUG_ON_PAGE in page_is_buddy(), for example:

    cpu 0:                                          cpu 1:
    hold zone_1 lock
    check page and its buddy
    PageBuddy(buddy) is true                        hold zone_2 lock
    page_order(buddy) == order is true              alloc buddy
    trigger VM_BUG_ON_PAGE(page_count(buddy) != 0)

    zone_1->lock prevents the freeing page getting allocated
    zone_2->lock prevents the buddy page getting allocated
    they are not the same zone->lock.

    If we can't remove the zone_id check statement, it's better to handle this
    rare race. This patch fixes this by placing the zone_id check before the
    VM_BUG_ON_PAGE check.

    Signed-off-by: Weijie Yang
    Acked-by: Mel Gorman
    Cc: Johannes Weiner
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
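
The reordering described in the commit above can be sketched in plain C. This is a toy model, not the kernel's page_is_buddy(): `struct page_model` and both functions are hypothetical, and the `bug_hits` counter stands in for VM_BUG_ON_PAGE firing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Snapshot of the page fields that the buddy check looks at. */
struct page_model {
	int zone_id;
	bool page_buddy;    /* models PageBuddy() */
	unsigned int order; /* models page_order() */
	int count;          /* models page_count() */
};

/* Old ordering: the refcount assertion runs before the zone check. */
bool is_buddy_zone_check_late(const struct page_model *page,
			      const struct page_model *buddy,
			      unsigned int order, int *bug_hits)
{
	assert(bug_hits != NULL);
	if (buddy->page_buddy && buddy->order == order) {
		if (buddy->count != 0)
			(*bug_hits)++; /* would be VM_BUG_ON_PAGE */
		if (page->zone_id != buddy->zone_id)
			return false;
		return true;
	}
	return false;
}

/* New ordering: pages in different zones are rejected first. */
bool is_buddy_zone_check_first(const struct page_model *page,
			       const struct page_model *buddy,
			       unsigned int order, int *bug_hits)
{
	assert(bug_hits != NULL);
	if (buddy->page_buddy && buddy->order == order) {
		if (page->zone_id != buddy->zone_id)
			return false;
		if (buddy->count != 0)
			(*bug_hits)++; /* only reached for same-zone buddies */
		return true;
	}
	return false;
}
```

Feeding both functions a buddy from another zone whose flags still look free but whose count is already non-zero (the state cpu 1 produces in the race) makes only the old ordering hit the assertion; both return false.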
     
  • hugepages_treat_as_movable is declared as unsigned long, but
    proc_dointvec() is used for parsing it:

    static struct ctl_table vm_table[] = {
    ...
            {
                    .procname       = "hugepages_treat_as_movable",
                    .data           = &hugepages_treat_as_movable,
                    .maxlen         = sizeof(int),
                    .mode           = 0644,
                    .proc_handler   = proc_dointvec,
            },

    This seems harmless, but it's better to use int type here.

    Signed-off-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Manfred Spraul
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
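
The type mismatch in the commit above can be illustrated in userspace. This is a sketch, not the sysctl code: `store_int_into_ulong()` is a hypothetical helper that models an int-sized store landing in storage declared as unsigned long.

```c
#include <assert.h>
#include <string.h>

/*
 * Model what an int-sized handler does to a variable declared as
 * unsigned long: only the first sizeof(int) bytes are rewritten, the
 * remaining bytes keep whatever they held before.
 */
unsigned long store_int_into_ulong(unsigned long initial, int value)
{
	unsigned long storage = initial;

	assert(sizeof(value) <= sizeof(storage));
	memcpy(&storage, &value, sizeof(value));
	return storage;
}
```

Where unsigned long is wider than int, `store_int_into_ulong(~0UL, 0)` does not come back as zero, because the untouched bytes survive. That the real variable starts out zeroed and is only ever written this way is presumably why the commit can call the mismatch harmless.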
     
  • Commit ed4d4902ebdd ("mm, hugetlb: remove hugetlb_zero and
    hugetlb_infinity") replaced 'unsigned long hugetlb_zero' with 'int zero'
    leading to out-of-bounds access in proc_doulongvec_minmax(). Use
    '.extra1 = NULL' instead of '.extra1 = &zero'. Passing NULL is
    equivalent to passing minimal value, which is 0 for unsigned types.

    Fixes: ed4d4902ebdd ("mm, hugetlb: remove hugetlb_zero and hugetlb_infinity")
    Signed-off-by: Andrey Ryabinin
    Reported-by: Dmitry Vyukov
    Suggested-by: Manfred Spraul
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
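
The fix above relies on a NULL lower bound meaning "no lower bound". A minimal sketch of both halves of the argument follows. The names are hypothetical and the handler is a simplification, not the kernel's proc_doulongvec_minmax().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An int-sized zero followed by unrelated data, as it might sit in memory. */
struct int_bounds {
	int zero;
	int neighbour;
};

/* What an unsigned long-sized read starting at `zero` would see. */
unsigned long read_min_as_ulong(const struct int_bounds *b)
{
	unsigned long v = 0;
	size_t n = sizeof(v) < sizeof(*b) ? sizeof(v) : sizeof(*b);

	memcpy(&v, b, n); /* spills into `neighbour` when long is wider than int */
	return v;
}

/*
 * Simplified lower-bound check: `min` plays the role of extra1 and is
 * read as unsigned long. NULL means no lower bound, which for an
 * unsigned value is the same as a minimum of 0.
 * Returns 0 if val was stored, -1 if it was rejected.
 */
int store_ulong_min(unsigned long *data, unsigned long val,
		    const unsigned long *min)
{
	assert(data != NULL);
	if (min != NULL && val < *min)
		return -1;
	*data = val;
	return 0;
}
```

With a wider unsigned long, the value read through the int-sized zero depends on whatever follows it in memory, so the effective minimum is unpredictable; passing NULL avoids the read altogether.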
     
  • Whether there is a vm_ops->page_mkwrite or not, the page dirtying is
    pretty much the same. Make sure the page references are the same in both
    cases, then merge the two branches.

    It's tempting to go even further and page-lock the !page_mkwrite case, to
    get it in line with everybody else setting the page table and thus further
    simplify the model. But that's not quite compelling enough to justify
    dropping the pte lock, then relocking and verifying the entry for
    filesystems without ->page_mkwrite, which notably includes tmpfs. Leave
    it for now and lock the page late in the !page_mkwrite case.

    Signed-off-by: Johannes Weiner
    Acked-by: Kirill A. Shutemov
    Reviewed-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Shared anonymous mmaps are implemented with shmem files, so all VMAs with
    shared writable semantics also have an underlying backing file.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Jan Kara
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Max Filippov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Guan Xuetao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    This patch also increases the number of bits available for swap offset.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced remap_file_pages(2) implementation with emulation. Nobody
    creates non-linear mapping anymore.

    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    This patch also increases the number of bits available for swap offset.

    Signed-off-by: Kirill A. Shutemov
    Cc: Chen Liqin
    Cc: Lennox Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Jonas Bonn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Ley Foon Tan
    Reviewed-by: Tobias Klauser
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    This patch also increases the number of bits available for swap offset.

    Signed-off-by: Kirill A. Shutemov
    Cc: David Howells
    Cc: Koichi Yasutake
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Michal Simek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: James Hogan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    This patch also increases the number of bits available for swap offset.

    Signed-off-by: Kirill A. Shutemov
    Cc: Tony Luck
    Cc: Fenghua Yu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    This patch also increases the number of bits available for swap offset.

    Signed-off-by: Kirill A. Shutemov
    Cc: Richard Kuo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We've replaced the remap_file_pages(2) implementation with emulation.
    Nobody creates non-linear mappings anymore.

    This patch also increases the number of bits available for swap offset.

    Signed-off-by: Kirill A. Shutemov
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov