24 Oct, 2020

1 commit

  • Pull documentation fixes from Jonathan Corbet:
    "A handful of late-arriving documentation fixes"

    * tag 'docs-5.10-2' of git://git.lwn.net/linux:
    docs: Add two missing entries in vm sysctl index
    docs/vm: trivial fixes to several spelling mistakes
    docs: submitting-patches: describe preserving review/test tags
    Documentation: Chinese translation of Documentation/arm64/hugetlbpage.rst
    Documentation: x86: fix a missing word in x86_64/mm.rst.
    docs: driver-api: remove a duplicated index entry
    docs: lkdtm: Modernize and improve details
    docs: deprecated.rst: Expand str*cpy() replacement notes
    docs/cpu-load: format the example code.

    Linus Torvalds
     

22 Oct, 2020

1 commit


14 Oct, 2020

1 commit

  • Disable parsing of the HMAT for debug, to work around broken platform
    instances, or cases where it is otherwise not wanted.

    [rdunlap@infradead.org: fix build when CONFIG_ACPI is not set]
    Link: https://lkml.kernel.org/r/70e5ee34-9809-a997-7b49-499e4be61307@infradead.org

    Signed-off-by: Dan Williams
    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Cc: Dave Hansen
    Cc: Andy Lutomirski
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Borislav Petkov
    Cc: "H. Peter Anvin"
    Cc: Ard Biesheuvel
    Cc: Benjamin Herrenschmidt
    Cc: Ben Skeggs
    Cc: Brice Goglin
    Cc: Catalin Marinas
    Cc: Daniel Vetter
    Cc: Dave Jiang
    Cc: David Airlie
    Cc: David Hildenbrand
    Cc: Greg Kroah-Hartman
    Cc: Ira Weiny
    Cc: Jason Gunthorpe
    Cc: Jeff Moyer
    Cc: Jia He
    Cc: Joao Martins
    Cc: Jonathan Cameron
    Cc: Michael Ellerman
    Cc: Mike Rapoport
    Cc: Paul Mackerras
    Cc: Pavel Tatashin
    Cc: "Rafael J. Wysocki"
    Cc: Tom Lendacky
    Cc: Vishal Verma
    Cc: Wei Yang
    Cc: Will Deacon
    Cc: Bjorn Helgaas
    Cc: Boris Ostrovsky
    Cc: Hulk Robot
    Cc: Jason Yan
    Cc: "Jérôme Glisse"
    Cc: Juergen Gross
    Cc: kernel test robot
    Cc: Randy Dunlap
    Cc: Stefano Stabellini
    Cc: Vivek Goyal
    Link: https://lkml.kernel.org/r/159643095540.4062302.732962081968036212.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Linus Torvalds

    Dan Williams
     

05 Aug, 2020

1 commit

  • Pull documentation updates from Jonathan Corbet:
    "It's been a busy cycle for documentation - hopefully the busiest for a
    while to come. Changes include:

    - Some new Chinese translations

    - Progress on the battle against double words words and non-HTTPS
    URLs

    - Some block-mq documentation

    - More RST conversions from Mauro. At this point, that task is
    essentially complete, so we shouldn't see this kind of churn again
    for a while. Unless we decide to switch to asciidoc or
    something...:)

    - Lots of typo fixes, warning fixes, and more"

    * tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits)
    scripts/kernel-doc: optionally treat warnings as errors
    docs: ia64: correct typo
    mailmap: add entry for
    doc/zh_CN: add cpu-load Chinese version
    Documentation/admin-guide: tainted-kernels: fix spelling mistake
    MAINTAINERS: adjust kprobes.rst entry to new location
    devices.txt: document rfkill allocation
    PCI: correct flag name
    docs: filesystems: vfs: correct flag name
    docs: filesystems: vfs: correct sync_mode flag names
    docs: path-lookup: markup fixes for emphasis
    docs: path-lookup: more markup fixes
    docs: path-lookup: fix HTML entity mojibake
    CREDITS: Replace HTTP links with HTTPS ones
    docs: process: Add an example for creating a fixes tag
    doc/zh_CN: add Chinese translation prefer section
    doc/zh_CN: add clearing-warn-once Chinese version
    doc/zh_CN: add admin-guide index
    doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label
    futex: MAINTAINERS: Re-add selftests directory
    ...

    Linus Torvalds
     

13 Jul, 2020

1 commit

  • Drop the doubled word "see".

    Signed-off-by: Randy Dunlap
    Cc: Jonathan Corbet
    Cc: linux-doc@vger.kernel.org
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Borislav Petkov
    Cc: x86@kernel.org
    Cc: "H. Peter Anvin"
    Link: https://lore.kernel.org/r/20200703213107.30758-3-rdunlap@infradead.org
    Signed-off-by: Jonathan Corbet

    Randy Dunlap
     

18 Jun, 2020

1 commit

  • Explain how the GS/FS based addressing can be utilized in user space
    applications along with the differences between the generic prctl() based
    GS/FS base control and the FSGSBASE version available on newer CPUs.

    Originally-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Chang S. Bae
    Signed-off-by: Sasha Levin
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Tony Luck
    Link: https://lkml.kernel.org/r/20200528201402.1708239-15-sashal@kernel.org

    Thomas Gleixner
     

29 Apr, 2020

1 commit


31 Dec, 2019

1 commit


03 Sep, 2019

1 commit

  • After commit cf65a0f6f6ff ("dma-mapping: move all DMA mapping code to
    kernel/dma") some of the files are referring to outdated information,
    i.e. old file names of DMA mapping sources. Fix it here.

    Note, the lines with "Glue code for..." have been removed completely.

    Signed-off-by: Andy Shevchenko
    Signed-off-by: Christoph Hellwig

    Andy Shevchenko
     

15 Jul, 2019

1 commit


10 Jul, 2019

1 commit

  • Pull Documentation updates from Jonathan Corbet:
    "It's been a relatively busy cycle for docs:

    - A fair pile of RST conversions, many from Mauro. These create more
    than the usual number of simple but annoying merge conflicts with
    other trees, unfortunately. He has a lot more of these waiting in
    the wings that, I think, will go to you directly later on.

    - A new document on how to use merges and rebases in kernel repos,
    and one on Spectre vulnerabilities.

    - Various improvements to the build system, including automatic
    markup of function() references because some people, for reasons I
    will never understand, were of the opinion that
    :c:func:``function()`` is unattractive and not fun to type.

    - We now recommend using sphinx 1.7, but still support back to 1.4.

    - Lots of smaller improvements, warning fixes, typo fixes, etc"

    * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits)
    docs: automarkup.py: ignore exceptions when seeking for xrefs
    docs: Move binderfs to admin-guide
    Disable Sphinx SmartyPants in HTML output
    doc: RCU callback locks need only _bh, not necessarily _irq
    docs: format kernel-parameters -- as code
    Doc : doc-guide : Fix a typo
    platform: x86: get rid of a non-existent document
    Add the RCU docs to the core-api manual
    Documentation: RCU: Add TOC tree hooks
    Documentation: RCU: Rename txt files to rst
    Documentation: RCU: Convert RCU UP systems to reST
    Documentation: RCU: Convert RCU linked list to reST
    Documentation: RCU: Convert RCU basic concepts to reST
    docs: filesystems: Remove uneeded .rst extension on toctables
    scripts/sphinx-pre-install: fix out-of-tree build
    docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/
    Documentation: PGP: update for newer HW devices
    Documentation: Add section about CPU vulnerabilities for Spectre
    Documentation: platform: Delete x86-laptop-drivers.txt
    docs: Note that :c:func: should no longer be used
    ...

    Linus Torvalds
     

15 Jun, 2019

1 commit

  • Convert the cgroup-v1 files to ReST format, in order to
    allow a later addition to the admin-guide.

    The conversion is actually:
    - add blank lines and indentation in order to identify paragraphs;
    - fix tables markups;
    - add some lists markups;
    - mark literal blocks;
    - adjust title markups.

    At its new index.rst, let's add a :orphan: while this is not linked to
    the main index.rst file, in order to avoid build warnings.

    Signed-off-by: Mauro Carvalho Chehab
    Acked-by: Tejun Heo
    Signed-off-by: Tejun Heo

    Mauro Carvalho Chehab
     

09 Jun, 2019

1 commit

  • Mostly due to x86 and acpi conversion, several documentation
    links are still pointing to the old file. Fix them.

    Signed-off-by: Mauro Carvalho Chehab
    Reviewed-by: Wolfram Sang
    Reviewed-by: Sven Van Asbroeck
    Reviewed-by: Bhupesh Sharma
    Acked-by: Mark Brown
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

11 May, 2019

1 commit

  • Pull more documentation updates from Jonathan Corbet:
    "Some late arriving documentation changes. In particular, this contains
    the conversion of the x86 docs to RST, which has been in the works for
    some time but needed a couple of final tweaks"

    * tag 'docs-5.2a' of git://git.lwn.net/linux: (29 commits)
    Documentation: x86: convert x86_64/machinecheck to reST
    Documentation: x86: convert x86_64/cpu-hotplug-spec to reST
    Documentation: x86: convert x86_64/fake-numa-for-cpusets to reST
    Documentation: x86: convert x86_64/5level-paging.txt to reST
    Documentation: x86: convert x86_64/mm.txt to reST
    Documentation: x86: convert x86_64/uefi.txt to reST
    Documentation: x86: convert x86_64/boot-options.txt to reST
    Documentation: x86: convert i386/IO-APIC.txt to reST
    Documentation: x86: convert usb-legacy-support.txt to reST
    Documentation: x86: convert orc-unwinder.txt to reST
    Documentation: x86: convert resctrl_ui.txt to reST
    Documentation: x86: convert microcode.txt to reST
    Documentation: x86: convert pti.txt to reST
    Documentation: x86: convert amd-memory-encryption.txt to reST
    Documentation: x86: convert intel_mpx.txt to reST
    Documentation: x86: convert protection-keys.txt to reST
    Documentation: x86: convert pat.txt to reST
    Documentation: x86: convert mtrr.txt to reST
    Documentation: x86: convert tlb.txt to reST
    Documentation: x86: convert zero-page.txt to reST
    ...

    Linus Torvalds
     

09 May, 2019

7 commits


16 Apr, 2019

1 commit

  • This fixes a PT typo, and the following 56-bit address-space
    addresses:

    * the hole extends from 0100000000000000 to feffffffffffffff
    * the KASAN shadow memory area stops at fffffbffffffffff (see kasan.h)
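
The hole bounds quoted above follow from simple arithmetic on the size of each
address-space half. The sketch below is illustrative only: the helper names are
hypothetical, and it assumes a 64-bit address space whose user and kernel halves
each span 2**half_bits bytes.

```python
def canonical_hole(half_bits):
    """Return (first, last) address of the non-canonical hole.

    Assumes a 64-bit address space whose lower (user) and upper (kernel)
    halves each cover 2**half_bits bytes.
    """
    first = 1 << half_bits                     # one past the top of the user half
    last = (1 << 64) - (1 << half_bits) - 1    # one below the start of the kernel half
    return first, last


def fmt(addr):
    """Format an address as 16 lowercase hex digits, as in mm.rst."""
    return f"{addr:016x}"


# 56-bit halves (5-level paging) and 47-bit halves (4-level paging).
hole56 = tuple(fmt(a) for a in canonical_hole(56))
hole47 = tuple(fmt(a) for a in canonical_hole(47))
print(hole56)  # ('0100000000000000', 'feffffffffffffff')
print(hole47)  # ('0000800000000000', 'ffff7fffffffffff')
```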

    Signed-off-by: Stephen Kitt
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: alex.popov@linux.com
    Cc: bhe@redhat.com
    Cc: corbet@lwn.net
    Cc: kirill.shutemov@linux.intel.com
    Cc: linux-doc@vger.kernel.org
    Link: http://lkml.kernel.org/r/20190415150853.10354-1-steve@sk2.org
    Signed-off-by: Ingo Molnar

    Stephen Kitt
     

11 Dec, 2018

1 commit

  • dma-debug is now capable of adding new entries to its pool on-demand if
    the initial preallocation was insufficient, so the IOMMU_LEAK logic no
    longer needs to explicitly change the pool size. This does lose it the
    ability to save a couple of megabytes of RAM by reducing the pool size
    below its default, but it seems unlikely that that is a realistic
    concern these days (or indeed that anyone is actively debugging AGP
    drivers' DMA usage any more). Getting rid of dma_debug_resize_entries()
    will make room for further streamlining in the dma-debug code itself.

    Removing the call reveals quite a lot of cruft which has been useless
    for nearly a decade since commit 19c1a6f5764d ("x86 gart: reimplement
    IOMMU_LEAK feature by using DMA_API_DEBUG"), including the entire
    'iommu=leak' parameter, which controlled nothing except whether
    dma_debug_resize_entries() was called or not.

    Signed-off-by: Robin Murphy
    Acked-by: Thomas Gleixner
    Tested-by: Qian Cai
    Signed-off-by: Christoph Hellwig

    Robin Murphy
     

07 Nov, 2018

1 commit

  • On 5-level paging the LDT remap area is placed in the middle of the KASLR
    randomization region and it can overlap with the direct mapping, the
    vmalloc or the vmap area.

    The LDT mapping is per mm, so it cannot be moved into the P4D page table
    next to the CPU_ENTRY_AREA without complicating PGD table allocation for
    5-level paging.

    The 4 PGD slot gap just before the direct mapping is reserved for
    hypervisors, so it cannot be used.

    Move the direct mapping one slot deeper and use the resulting gap for the
    LDT remap area. The resulting layout is the same for 4 and 5 level paging.
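
To get a feel for what "one slot deeper" means, the size of a PGD slot can be
worked out from the paging geometry. This is a rough sketch, not kernel code: it
assumes 4 KiB pages and 9 index bits per page-table level, the helper names are
hypothetical, and the start address is the 4-level direct-mapping address quoted
in the 06 Oct 2018 entry of this log, used purely as an example input.

```python
PAGE_SHIFT = 12        # assumption: 4 KiB pages
BITS_PER_LEVEL = 9     # assumption: 512 entries per page-table level


def pgd_shift(levels):
    """Bit position at which the top-level (PGD) index starts."""
    return PAGE_SHIFT + BITS_PER_LEVEL * (levels - 1)


def pgd_slot_size(levels):
    """Bytes of virtual address space covered by one PGD entry."""
    return 1 << pgd_shift(levels)


def pgd_index(addr, levels):
    """Index of the PGD entry that maps addr."""
    return (addr >> pgd_shift(levels)) & ((1 << BITS_PER_LEVEL) - 1)


example_start = 0xffff880000000000                    # example input only
one_slot_deeper = example_start + pgd_slot_size(4)    # 4-level paging

print(pgd_slot_size(4) >> 30, "GiB per 4-level PGD slot")
print(pgd_slot_size(5) >> 40, "TiB per 5-level PGD slot")
print(f"{one_slot_deeper:016x}", pgd_index(example_start, 4), pgd_index(one_slot_deeper, 4))
```

Under these assumptions a 4-level PGD slot covers 512 GiB, so shifting a region by
one slot frees a 512 GiB gap directly below it.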

    [ tglx: Massaged changelog ]

    Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Andy Lutomirski
    Cc: bp@alien8.de
    Cc: hpa@zytor.com
    Cc: dave.hansen@linux.intel.com
    Cc: peterz@infradead.org
    Cc: boris.ostrovsky@oracle.com
    Cc: jgross@suse.com
    Cc: bhe@redhat.com
    Cc: willy@infradead.org
    Cc: linux-mm@kvack.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20181026122856.66224-2-kirill.shutemov@linux.intel.com

    Kirill A. Shutemov
     

02 Nov, 2018

1 commit

  • Pull stackleak gcc plugin from Kees Cook:
    "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin
    was ported from grsecurity by Alexander Popov. It provides efficient
    stack content poisoning at syscall exit. This creates a defense
    against at least two classes of flaws:

    - Uninitialized stack usage. (We continue to work on improving the
    compiler to do this in other ways: e.g. unconditional zero init was
    proposed to GCC and Clang, and more plugin work has started too).

    - Stack content exposure. By greatly reducing the lifetime of valid
    stack contents, exposures via either direct read bugs or unknown
    cache side-channels become much more difficult to exploit. This
    complements the existing buddy and heap poisoning options, but
    provides the coverage for stacks.

    The x86 hooks are included in this series (which have been reviewed by
    Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already
    been merged through the arm64 tree (written by Laura Abbott and
    reviewed by Mark Rutland and Will Deacon).

    With VLAs having been removed this release, there is no need for
    alloca() protection, so it has been removed from the plugin"

    * tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    arm64: Drop unneeded stackleak_check_alloca()
    stackleak: Allow runtime disabling of kernel stack erasing
    doc: self-protection: Add information about STACKLEAK feature
    fs/proc: Show STACKLEAK metrics in the /proc file system
    lkdtm: Add a test for STACKLEAK
    gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
    x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls

    Linus Torvalds
     

25 Oct, 2018

1 commit

  • Pull documentation updates from Jonathan Corbet:
    "This is a fairly typical cycle for documentation. There's some welcome
    readability improvements for the formatted output, some LICENSES
    updates including the addition of the ISC license, the removal of the
    unloved and unmaintained 00-INDEX files, the deprecated APIs document
    from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
    fixes and corrections"

    * tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
    docs: Fix typos in histogram.rst
    docs: Introduce deprecated APIs list
    kernel-doc: fix declaration type determination
    doc: fix a typo in adding-syscalls.rst
    docs/admin-guide: memory-hotplug: remove table of contents
    doc: printk-formats: Remove bogus kobject references for device nodes
    Documentation: preempt-locking: Use better example
    dm flakey: Document "error_writes" feature
    docs/completion.txt: Fix a couple of punctuation nits
    LICENSES: Add ISC license text
    LICENSES: Add note to CDDL-1.0 license that it should not be used
    docs/core-api: memory-hotplug: add some details about locking internals
    docs/core-api: rename memory-hotplug-notifier to memory-hotplug
    docs: improve readability for people with poorer eyesight
    yama: clarify ptrace_scope=2 in Yama documentation
    docs/vm: split memory hotplug notifier description to Documentation/core-api
    docs: move memory hotplug description into admin-guide/mm
    doc: Fix acronym "FEKEK" in ecryptfs
    docs: fix some broken documentation references
    iommu: Fix passthrough option documentation
    ...

    Linus Torvalds
     

06 Oct, 2018

2 commits

  • After the cleanups from Baoquan He, make it even more readable:

    - Remove the 'bits' area size column: it's pretty pointless and was even
    wrong for some of the entries. Given that MB, GB, TB, PB are 20, 30,
    40 and 50 bits, an "8 TB" size description makes it obvious that it's
    43 bits.

    - Introduce an "offset" column:

    --------------------------------------------------------------------------------------------
        start addr    |   offset   |     end addr     |  size   | VM area description
    ------------------|------------|------------------|---------|-------------------------------
     ...
     ffff880000000000 |  -120   TB | ffffc7ffffffffff |   64 TB | direct mapping of all physical memory (page_offset_base),
                      |            |                  |         | this is what limits max physical memory supported.

    The -120 TB notation makes it obvious where this particular virtual memory
    region starts: 120 TB down from the top of the 64-bit virtual memory space.
    Especially the layout of the kernel mappings is a *lot* more obvious when
    written this way, plus it's much easier to compare it with the size column
    and understand/check/validate and modify the kernel's layout in the future.

    - Mark the point from which the 47-bit and 56-bit kernel layouts are 100% identical:
    this starts at the -512 GB offset and the EFI region.

    - Re-shuffle the size descriptions to be continuous blocks of sizes, instead of the
    often mixed size. I.e. write "0.5 TB" instead of "512 GB" if we are still in
    the TB-granular region of the map.

    - Make the 47-bit and 56-bit descriptions use the *exact* same layout and wording,
    and only differ where there's a material difference. This makes it easy to compare
    the two tables side by side by switching between two terminal tabs.

    - Plus enhance a lot of other stylistic/typographical details: make the tables
    explicitly tabular, add headers, enhance certain entries, etc. etc.
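
The size and offset columns described in the list above can be cross-checked
mechanically. The following is a minimal sketch, assuming binary units (1 TB =
2**40 bytes) and a 64-bit address space; the function names are hypothetical and
the only data used is the direct-mapping row quoted in this changelog.

```python
UNIT_BITS = {"KB": 10, "MB": 20, "GB": 30, "TB": 40, "PB": 50}  # binary units


def size_bits(count, unit):
    """Number of address bits spanned by a power-of-two size such as 8 TB."""
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a positive power of two")
    return UNIT_BITS[unit] + count.bit_length() - 1


def offset_from_top(addr, unit="TB"):
    """Signed distance from the top of the 64-bit space (2**64) to addr."""
    return (addr - (1 << 64)) / (1 << UNIT_BITS[unit])


def end_addr(start, count, unit):
    """Last address of a region that starts at start and spans count units."""
    return start + count * (1 << UNIT_BITS[unit]) - 1


start = 0xffff880000000000
print(size_bits(8, "TB"))                    # 43
print(offset_from_top(start))                # -120.0
print(f"{end_addr(start, 64, 'TB'):016x}")   # ffffc7ffffffffff
```

Checking a row this way ties the three columns together: the start address gives
the offset, and start plus size gives the end address.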

    Note that there are some apparent errors in the tables as well, but I'll fix
    them in a separate patch to make it easier to review/validate.

    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: linux-doc@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: thgarnie@google.com
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • In Documentation/x86/x86_64/mm.txt, the description of the x86-64 virtual
    memory layout has become a confusing hodgepodge of inconsistencies:

    - there's a hard to read mixture of 'TB' and 'bits' notation
    - the entries sometimes mention a size in the description and sometimes not
    - sometimes they list holes by address, sometimes only as an 'unused hole' line

    So make it all a coherent, readable, well organized description.

    Signed-off-by: Baoquan He
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: linux-doc@vger.kernel.org
    Cc: thgarnie@google.com
    Link: http://lkml.kernel.org/r/20181006084327.27467-3-bhe@redhat.com
    Signed-off-by: Ingo Molnar

    Baoquan He
     

10 Sep, 2018

1 commit

  • This is a respin with a wider audience (all that get_maintainer returned)
    and I know this spams a *lot* of people. Not sure what would be the correct
    way, so my apologies for ruining your inbox.

    The 00-INDEX files are supposed to give a summary of all files present
    in a directory, but these files are horribly out of date and their
    usefulness is brought into question. Often a simple "ls" would reveal
    the same information as the filenames are generally quite descriptive as
    a short introduction to what the file covers (it should not surprise
    anyone what Documentation/sched/sched-design-CFS.txt covers).

    A few years back it was mentioned that these files were no longer really
    needed, and they have since then grown further out of date, so perhaps
    it is time to just throw them out.

    A short status check yields the following _outdated_ 00-INDEX files. The
    first counter is the number of files listed in 00-INDEX but missing from
    the directory; the second is the number of files present but not listed
    in 00-INDEX.

    List of outdated 00-INDEX:
    Documentation: (4/10)
    Documentation/sysctl: (0/1)
    Documentation/timers: (1/0)
    Documentation/blockdev: (3/1)
    Documentation/w1/slaves: (0/1)
    Documentation/locking: (0/1)
    Documentation/devicetree: (0/5)
    Documentation/power: (1/1)
    Documentation/powerpc: (0/5)
    Documentation/arm: (1/0)
    Documentation/x86: (0/9)
    Documentation/x86/x86_64: (1/1)
    Documentation/scsi: (4/4)
    Documentation/filesystems: (2/9)
    Documentation/filesystems/nfs: (0/2)
    Documentation/cgroup-v1: (0/2)
    Documentation/kbuild: (0/4)
    Documentation/spi: (1/0)
    Documentation/virtual/kvm: (1/0)
    Documentation/scheduler: (0/2)
    Documentation/fb: (0/1)
    Documentation/block: (0/1)
    Documentation/networking: (6/37)
    Documentation/vm: (1/3)
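
The two counters in the list above amount to a pair of set differences. The
sketch below is not the script used for the commit (the changelog does not show
one); the function name and the file names in the example are hypothetical, and
it assumes the 00-INDEX file itself is ignored on both sides.

```python
def index_staleness(listed, present):
    """Return (missing, unlisted) counters for one directory.

    listed  -- file names mentioned in the directory's 00-INDEX
    present -- file names actually found in the directory
    """
    listed = set(listed) - {"00-INDEX"}     # assumption: ignore the index itself
    present = set(present) - {"00-INDEX"}
    missing = listed - present              # listed in 00-INDEX but gone from disk
    unlisted = present - listed             # on disk but never added to 00-INDEX
    return len(missing), len(unlisted)


# Hypothetical directory contents, for illustration only.
counters = index_staleness(
    listed=["00-INDEX", "a.txt", "gone.txt"],
    present=["00-INDEX", "a.txt", "new.txt"],
)
print("(%d/%d)" % counters)  # (1/1)
```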

    Then there are 364 subdirectories in Documentation/ with several files that
    are missing 00-INDEX altogether (and another 120 with a single file and no
    00-INDEX).

    I don't really have an opinion as to whether or not we /should/ have 00-INDEX,
    but the above 00-INDEX should either be removed or be kept up to date. If
    we should keep the files, I can try to keep them updated, but I'd rather not
    if we just want to delete them anyway.

    As a starting point, remove all index-files and references to 00-INDEX and
    see where the discussion is going.

    Signed-off-by: Henrik Austad
    Acked-by: "Paul E. McKenney"
    Just-do-it-by: Steven Rostedt
    Reviewed-by: Jens Axboe
    Acked-by: Paul Moore
    Acked-by: Greg Kroah-Hartman
    Acked-by: Mark Brown
    Acked-by: Mike Rapoport
    Cc: [Almost everybody else]
    Signed-off-by: Jonathan Corbet

    Henrik Austad
     

05 Sep, 2018

1 commit

  • The STACKLEAK feature (initially developed by PaX Team) has the following
    benefits:

    1. Reduces the information that can be revealed through kernel stack leak
    bugs. The idea of erasing the thread stack at the end of syscalls is
    similar to CONFIG_PAGE_POISONING and memzero_explicit() in kernel
    crypto, which all comply with FDP_RIP.2 (Full Residual Information
    Protection) of the Common Criteria standard.

    2. Blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
    CVE-2010-2963). Bugs of that kind should be eliminated by improving C
    compilers in the future, which might take a long time.

    This commit introduces the code filling the used part of the kernel
    stack with a poison value before returning to userspace. The full
    STACKLEAK feature also contains the gcc plugin, which comes in a
    separate commit.

    The STACKLEAK feature is ported from grsecurity/PaX. More information at:
    https://grsecurity.net/
    https://pax.grsecurity.net/

    This code is modified from Brad Spengler/PaX Team's code in the last
    public patch of grsecurity/PaX based on our understanding of the code.
    Changes or omissions from the original code are ours and don't reflect
    the original grsecurity/PaX code.

    Performance impact:

    Hardware: Intel Core i7-4770, 16 GB RAM

    Test #1: building the Linux kernel on a single core
    0.91% slowdown

    Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
    4.2% slowdown

    So the STACKLEAK description in Kconfig includes: "The tradeoff is the
    performance impact: on a single CPU system kernel compilation sees a 1%
    slowdown, other systems and workloads may vary and you are advised to
    test this feature on your expected workload before deploying it".

    Signed-off-by: Alexander Popov
    Acked-by: Thomas Gleixner
    Reviewed-by: Dave Hansen
    Acked-by: Ingo Molnar
    Signed-off-by: Kees Cook

    Alexander Popov
     

14 Aug, 2018

1 commit

  • Pull x86 timer updates from Thomas Gleixner:
    "Early TSC based time stamping to allow better boot time analysis.

    This comes with a general cleanup of the TSC calibration code which
    grew warts and duct taping over the years and removes 250 lines of
    code. Initiated and mostly implemented by Pavel with help from various
    folks"

    * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    x86/kvmclock: Mark kvm_get_preset_lpj() as __init
    x86/tsc: Consolidate init code
    sched/clock: Disable interrupts when calling generic_sched_clock_init()
    timekeeping: Prevent false warning when persistent clock is not available
    sched/clock: Close a hole in sched_clock_init()
    x86/tsc: Make use of tsc_calibrate_cpu_early()
    x86/tsc: Split native_calibrate_cpu() into early and late parts
    sched/clock: Use static key for sched_clock_running
    sched/clock: Enable sched clock early
    sched/clock: Move sched clock initialization and merge with generic clock
    x86/tsc: Use TSC as sched clock early
    x86/tsc: Initialize cyc2ns when tsc frequency is determined
    x86/tsc: Calibrate tsc only once
    ARM/time: Remove read_boot_clock64()
    s390/time: Remove read_boot_clock64()
    timekeeping: Default boot time offset to local_clock()
    timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()
    s390/time: Add read_persistent_wall_and_boot_offset()
    x86/xen/time: Output xen sched_clock time from 0
    x86/xen/time: Initialize pv xen time in init_hypervisor_platform()
    ...

    Linus Torvalds
     

20 Jul, 2018

1 commit

  • Currently, the notsc kernel parameter disables the use of the TSC by
    sched_clock(). However, this parameter does not prevent the kernel from
    accessing the TSC in other places.

    The only rationale to boot with notsc is to avoid timing discrepancies on
    multi-socket systems where the TSCs are not properly synchronized, and thus
    exclude the TSC from being used for timekeeping. But that prevents using the
    TSC as sched_clock() as well, which is not necessary as the core sched_clock()
    implementation can handle non-synchronized TSC based sched clocks just
    fine.

    However, there is another method to solve the above problem: booting with
    the tsc=unstable parameter. This parameter allows sched_clock() to use the
    TSC and just excludes it from timekeeping.

    So there is no real reason to keep notsc, but for compatibility reasons the
    parameter has to stay. Make it behave like 'tsc=unstable' instead.
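
The resulting policy can be summarized as a toy model. This is only a sketch of
the behavior the changelog describes, not kernel code: the function name and the
returned dictionary are hypothetical, and it assumes a usable TSC is present.

```python
def tsc_usage(cmdline):
    """Model which clocks may use the TSC for a given kernel command line.

    After this change, 'notsc' is treated exactly like 'tsc=unstable':
    the TSC stays available to sched_clock() but is excluded from timekeeping.
    """
    params = cmdline.split()
    unstable = "tsc=unstable" in params or "notsc" in params
    return {
        "sched_clock": True,          # assumption: a usable TSC exists
        "timekeeping": not unstable,  # marked unstable -> not used for timekeeping
    }


for example in ("quiet", "quiet tsc=unstable", "quiet notsc"):
    print(example, "->", tsc_usage(example))
```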

    [ tglx: Massaged changelog ]

    Signed-off-by: Pavel Tatashin
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Dou Liyang
    Reviewed-by: Thomas Gleixner
    Cc: steven.sistare@oracle.com
    Cc: daniel.m.jordan@oracle.com
    Cc: linux@armlinux.org.uk
    Cc: schwidefsky@de.ibm.com
    Cc: heiko.carstens@de.ibm.com
    Cc: john.stultz@linaro.org
    Cc: sboyd@codeaurora.org
    Cc: hpa@zytor.com
    Cc: peterz@infradead.org
    Cc: prarit@redhat.com
    Cc: feng.tang@intel.com
    Cc: pmladek@suse.com
    Cc: gnomes@lxorguk.ukuu.org.uk
    Cc: linux-s390@vger.kernel.org
    Cc: boris.ostrovsky@oracle.com
    Cc: jgross@suse.com
    Cc: pbonzini@redhat.com
    Link: https://lkml.kernel.org/r/20180719205545.16512-12-pasha.tatashin@oracle.com

    Pavel Tatashin
     

07 Jul, 2018

1 commit

  • The current NUMA emulation capabilities for splitting System RAM by a
    fixed size or by a set number of nodes may result in some nodes being
    larger than others. The implementation prioritizes establishing a
    minimum usable memory size over satisfying the requested number of NUMA
    nodes.

    Introduce a uniform split capability that evenly partitions each
    physical NUMA node into N emulated nodes. For example numa=fake=3U
    creates 6 emulated nodes total on a system that has 2 physical nodes.

    This capability is useful for debugging and evaluating platform
    memory-side-cache capabilities as described by the ACPI HMAT (see
    5.2.27.5 Memory Side Cache Information Structure in ACPI 6.2a).

    Compare numa=fake=6, which results in only 5 nodes being created, against
    numa=fake=3U, which takes the 2 physical nodes and evenly divides them.

    numa=fake=6
    available: 5 nodes (0-4)
    node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
    node 0 size: 2648 MB
    node 0 free: 2443 MB
    node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
    node 1 size: 2672 MB
    node 1 free: 2442 MB
    node 2 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
    node 2 size: 5291 MB
    node 2 free: 5278 MB
    node 3 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
    node 3 size: 2677 MB
    node 3 free: 2665 MB
    node 4 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
    node 4 size: 2676 MB
    node 4 free: 2663 MB
    node distances:
    node 0 1 2 3 4
    0: 10 20 10 20 20
    1: 20 10 20 10 10
    2: 10 20 10 20 20
    3: 20 10 20 10 10
    4: 20 10 20 10 10

    numa=fake=3U
    available: 6 nodes (0-5)
    node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
    node 0 size: 2900 MB
    node 0 free: 2637 MB
    node 1 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
    node 1 size: 3023 MB
    node 1 free: 3012 MB
    node 2 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
    node 2 size: 2015 MB
    node 2 free: 2004 MB
    node 3 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
    node 3 size: 2704 MB
    node 3 free: 2522 MB
    node 4 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
    node 4 size: 2709 MB
    node 4 free: 2698 MB
    node 5 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
    node 5 size: 2612 MB
    node 5 free: 2601 MB
    node distances:
    node 0 1 2 3 4 5
    0: 10 10 10 20 20 20
    1: 10 10 10 20 20 20
    2: 10 10 10 20 20 20
    3: 20 20 20 10 10 10
    4: 20 20 20 10 10 10
    5: 20 20 20 10 10 10
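
The node count and the distance table for the uniform split can be reproduced
with a few lines. This is an illustrative sketch rather than the kernel's
implementation: the function names are hypothetical, and the local/remote
distances of 10 and 20 are simply the values visible in the listing above.

```python
def uniform_fake_nodes(physical_nodes, split):
    """Total emulated nodes when each physical node is split into 'split' parts."""
    return physical_nodes * split


def fake_distances(physical_nodes, split, local=10, remote=20):
    """Distance matrix in which emulated nodes inherit their parent's distances."""
    # parent[i] is the physical node that emulated node i was carved from.
    parent = [p for p in range(physical_nodes) for _ in range(split)]
    return [[local if a == b else remote for b in parent] for a in parent]


total = uniform_fake_nodes(2, 3)     # numa=fake=3U on 2 physical nodes
matrix = fake_distances(2, 3)
print(total)
for node, row in enumerate(matrix):
    print(f"{node}: " + " ".join(str(d) for d in row))
```

With 2 physical nodes and a split of 3 this yields 6 nodes and the same block
structure as the numa=fake=3U distance table.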

    Signed-off-by: Dan Williams
    Cc: David Rientjes
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Wei Yang
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/153089328617.27680.14930758266174305832.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Ingo Molnar

    Dan Williams
     

28 May, 2018

3 commits


12 Apr, 2018

1 commit


03 Apr, 2018

1 commit

  • Commit:

    f5a40711fa58 ("x86/mm: Set MODULES_END to 0xffffffffff000000")

    changed MODULES_END back to a fixed value, but didn't update the documentation
    of memory layout for 4-level paging.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Andrey Ryabinin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: f5a40711fa58 ("x86/mm: Set MODULES_END to 0xffffffffff000000")
    Link: http://lkml.kernel.org/r/20180402121025.10244-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar

    Kirill A. Shutemov
     

16 Feb, 2018

1 commit

  • All pieces of the puzzle are in place and we can now allow booting with
    CONFIG_X86_5LEVEL=y on a machine without LA57 support.

    The kernel will detect that LA57 is missing and fold p4d at runtime.

    Update the documentation and the Kconfig option description to reflect the
    change.

    Signed-off-by: Kirill A. Shutemov
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20180214182542.69302-10-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar

    Kirill A. Shutemov
     

05 Jan, 2018

1 commit

  • vaddr_end for KASLR is only documented in the KASLR code itself and is
    adjusted depending on config options. So it's not surprising that a change
    of the memory layout causes KASLR to have the wrong vaddr_end. This can map
    arbitrary stuff into other areas, causing hard-to-understand problems.

    Remove the whole ifdef magic and define the start of the cpu_entry_area to
    be the end of the KASLR vaddr range.

    Add documentation to that effect.

    Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    Reported-by: Benjamin Gilbert
    Signed-off-by: Thomas Gleixner
    Tested-by: Benjamin Gilbert
    Cc: Andy Lutomirski
    Cc: Greg Kroah-Hartman
    Cc: stable
    Cc: Dave Hansen
    Cc: Peter Zijlstra
    Cc: Thomas Garnier
    Cc: Alexander Kuleshov
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos

    Thomas Gleixner