13 Aug, 2020

1 commit

  • Make kernel GNU build-id available in VMCOREINFO. Having build-id in
    VMCOREINFO facilitates presenting appropriate kernel namelist image with
    debug information file to kernel crash dump analysis tools. Currently
    VMCOREINFO lacks uniquely identifiable key for crash analysis automation.

    As for whether this patch is necessary, or whether the matching of
    linux_banner and OSRELEASE in VMCOREINFO employed by crash(8) meets the
    need: IMO the build-id approach is more foolproof. In most instances it
    is a cryptographic hash generated from internal code/ELF bits, unlike
    the kernel version string on which linux_banner is based, which is
    external to the code. I feel each is intended for a different purpose.
    Also, OSRELEASE is not suitable when two different kernel builds from
    the same version have different features enabled.

    Currently, for most Linux (and non-Linux) systems, the build-id can be
    extracted using standard methods for file types such as user mode crash
    dumps, shared libraries, loadable kernel modules, etc. The Linux kernel
    dump is the exception. Having the build-id in VMCOREINFO brings some
    uniformity for automation tools.

    Tyler said:

    : I think this is a nice improvement over today's linux_banner approach for
    : correlating vmlinux to a kernel dump.
    :
    : The elf notes parsing in this patch lines up with what is described in
    : the "Notes (Nhdr)" section of the elf(5) man page.
    :
    : BUILD_ID_MAX is sufficient to hold a sha1 build-id, which is the default
    : build-id type today in GNU ld(2). It is also sufficient to hold the
    : "fast" build-id, which is the default build-id type today in LLVM lld(2).

    Signed-off-by: Vijay Balakrishna
    Signed-off-by: Andrew Morton
    Reviewed-by: Tyler Hicks
    Acked-by: Baoquan He
    Cc: Dave Young
    Cc: Vivek Goyal
    Link: http://lkml.kernel.org/r/1591849672-34104-1-git-send-email-vijayb@linux.microsoft.com
    Signed-off-by: Linus Torvalds

    Vijay Balakrishna
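
The note walk that the review above refers to can be sketched in user space. This is a minimal illustration that assumes the note layout from elf(5) (three 32-bit header words, then name and descriptor, each padded to 4 bytes). The names `note_hdr`, `NOTE_ALIGN` and `find_build_id` are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF note header as described in elf(5): three 32-bit words. */
struct note_hdr {
	uint32_t n_namesz;	/* length of the owner name, including the NUL */
	uint32_t n_descsz;	/* length of the descriptor (payload) */
	uint32_t n_type;	/* note type, interpreted per owner name */
};

#define NOTE_TYPE_GNU_BUILD_ID 3	/* value of NT_GNU_BUILD_ID */
#define NOTE_ALIGN(x) (((size_t)(x) + 3) & ~(size_t)3)	/* pad to 4 bytes */

/*
 * Walk the notes in buf[0..len) and return a pointer to the build-id bytes
 * of the first "GNU" note of type NT_GNU_BUILD_ID, storing its length in
 * *id_len. Returns NULL if no such note is found or a note is truncated.
 */
const unsigned char *find_build_id(const unsigned char *buf, size_t len,
				   size_t *id_len)
{
	size_t off = 0;

	assert(buf != NULL && id_len != NULL);
	while (off <= len && len - off >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_off, desc_off;

		memcpy(&h, buf + off, sizeof(h));	/* avoid unaligned reads */
		name_off = off + sizeof(h);
		desc_off = name_off + NOTE_ALIGN(h.n_namesz);
		if (desc_off > len || h.n_descsz > len - desc_off)
			return NULL;			/* truncated note */
		if (h.n_type == NOTE_TYPE_GNU_BUILD_ID && h.n_namesz == 4 &&
		    memcmp(buf + name_off, "GNU", 4) == 0) {
			*id_len = h.n_descsz;
			return buf + desc_off;
		}
		off = desc_off + NOTE_ALIGN(h.n_descsz);	/* next note */
	}
	return NULL;
}
```

A sha1 build-id is 20 bytes, so a caller would typically see `*id_len == 20` for a kernel linked with the GNU ld default.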
     

03 Jul, 2020

1 commit

  • Right now user-space tools like 'makedumpfile' and 'crash' need to rely
    on a best-guess method of determining value of 'MAX_PHYSMEM_BITS'
    supported by underlying kernel.

    This value is used in user-space code to calculate the bit-space
    required to store a section for SPARSEMEM (similar to the existing
    calculation method used in the kernel implementation):

    #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

    Now, regressions have been reported in user-space utilities
    like 'makedumpfile' and 'crash' on arm64, with the recently added
    kernel support for 52-bit physical address space, as there is
    no clear method of determining this value in user-space
    (other than reading kernel CONFIG flags).

    As per suggestion from makedumpfile maintainer (Kazu), it makes more
    sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself
    rather than in arch-specific code, so that the user-space code for other
    archs can also benefit from this addition to the vmcoreinfo and use it
    as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

    A reference 'makedumpfile' implementation which reads the
    'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion
    is available here:

    While at it also update vmcoreinfo documentation for 'MAX_PHYSMEM_BITS'
    variable being added to vmcoreinfo.

    'MAX_PHYSMEM_BITS' defines the maximum supported physical address
    space memory.

    Signed-off-by: Bhupesh Sharma
    Tested-by: John Donnelly
    Acked-by: Dave Young
    Cc: Boris Petkov
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: James Morse
    Cc: Mark Rutland
    Cc: Will Deacon
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Dave Anderson
    Cc: Kazuhito Hagio
    Cc: x86@kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: kexec@lists.infradead.org
    Link: https://lore.kernel.org/r/1589395957-24628-2-git-send-email-bhsharma@redhat.com
    Signed-off-by: Catalin Marinas

    Bhupesh Sharma
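
As a rough illustration of the user-space side, the sketch below looks up a `NUMBER(...)` entry in a vmcoreinfo text blob and applies the SECTIONS_SHIFT formula quoted above. It assumes the `NUMBER(name)=value` line format that vmcoreinfo uses for numeric entries; the function names are hypothetical, and SECTION_SIZE_BITS is passed in by the caller because this commit does not say how a tool obtains it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the value of the "NUMBER(<name>)=<value>" line in a vmcoreinfo
 * text blob, or -1 if the key is absent. Only matches at the start of a line.
 */
long vmcoreinfo_number(const char *info, const char *name)
{
	char key[64];
	const char *p;
	int n;

	assert(info != NULL && name != NULL);
	n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;		/* name too long for this sketch */

	for (p = strstr(info, key); p != NULL; p = strstr(p + 1, key)) {
		if (p == info || p[-1] == '\n')
			return strtol(p + n, NULL, 10);
	}
	return -1;
}

/* Same arithmetic as the kernel's SECTIONS_SHIFT definition. */
long sections_shift(long max_physmem_bits, long section_size_bits)
{
	return max_physmem_bits - section_size_bits;
}
```

With this in place a tool can fall back to its old best-guess logic only when `vmcoreinfo_number()` returns -1, that is, when the dump comes from a kernel that predates this change.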
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this source code is licensed under the gnu general public license
    version 2 see the file copying for more details

    this source code is licensed under general public license version 2
    see

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 52 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

06 Mar, 2019

1 commit

  • Right now, pages inflated as part of a balloon driver will be dumped by
    dump tools like makedumpfile. While XEN is able to check in the crash
    kernel whether a certain pfn is actually backed by memory in the
    hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of
    other balloon inflated memory will essentially result in zero pages
    getting allocated by the hypervisor and the dump getting filled with
    this data.

    The allocation and reading of zero pages can directly be avoided if a
    dumping tool could know which pages only contain stale information not
    to be dumped.

    We now have PG_offline which can be (and already is by virtio-balloon)
    used for marking pages as logically offline. Follow up patches will
    make use of this flag also in other balloon implementations.

    Let's export PG_offline via PAGE_OFFLINE_MAPCOUNT_VALUE, so makedumpfile
    can directly skip pages that are logically offline and the content
    therefore stale.

    Please note that this is also helpful for a problem we were seeing under
    Hyper-V: Dumping logically offline memory (pages kept fake offline while
    onlining a section via online_page_callback) would under some conditions
    result in a kernel panic when dumping them.

    Link: http://lkml.kernel.org/r/20181119101616.8901-4-david@redhat.com
    Signed-off-by: David Hildenbrand
    Acked-by: Michael S. Tsirkin
    Acked-by: Dave Young
    Cc: "Kirill A. Shutemov"
    Cc: Baoquan He
    Cc: Omar Sandoval
    Cc: Arnd Bergmann
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Lianbo Jiang
    Cc: Borislav Petkov
    Cc: Kazuhito Hagio
    Cc: Alexander Duyck
    Cc: Alexey Dobriyan
    Cc: Boris Ostrovsky
    Cc: Christian Hansen
    Cc: David Rientjes
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Jonathan Corbet
    Cc: Juergen Gross
    Cc: Julien Freche
    Cc: Kairui Song
    Cc: Konstantin Khlebnikov
    Cc: "K. Y. Srinivasan"
    Cc: Len Brown
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Miles Chen
    Cc: Nadav Amit
    Cc: Naoya Horiguchi
    Cc: Pankaj gupta
    Cc: Pavel Machek
    Cc: Pavel Tatashin
    Cc: Rafael J. Wysocki
    Cc: "Rafael J. Wysocki"
    Cc: Stefano Stabellini
    Cc: Stephen Hemminger
    Cc: Stephen Rothwell
    Cc: Vitaly Kuznetsov
    Cc: Vlastimil Babka
    Cc: Xavier Deguillard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
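
A dump tool's use of the exported value might look like the sketch below. It assumes the tool has already read the page's `_mapcount` word from the dump and has parsed `PAGE_OFFLINE_MAPCOUNT_VALUE` from vmcoreinfo when it is present. The struct and function names are hypothetical, and any further checks a real tool applies before trusting the field (for example, ruling out page types that reuse the same storage) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values a dump tool would fill in while parsing vmcoreinfo (hypothetical). */
struct dump_info {
	bool has_offline_value;	/* was PAGE_OFFLINE_MAPCOUNT_VALUE exported? */
	int offline_value;	/* the exported number, if present */
};

/*
 * Decide whether a page can be skipped as logically offline.
 * Dumps from older kernels do not export the value, so nothing is skipped.
 */
bool page_is_offline(const struct dump_info *di, int mapcount)
{
	assert(di != NULL);
	return di->has_offline_value && mapcount == di->offline_value;
}
```

Comparing against the exported number, rather than a constant compiled into the tool, keeps the tool independent of how a given kernel encodes PG_offline.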
     

23 Aug, 2018

3 commits

  • The get_seconds() call returns a 32-bit timestamp on some architectures,
    and will overflow in the future. The newer ktime_get_real_seconds()
    always returns a 64-bit timestamp that does not suffer from this problem.

    Link: http://lkml.kernel.org/r/20180618150329.941903-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Andrew Morton
    Cc: Dave Young
    Cc: Baoquan He
    Cc: "Kirill A. Shutemov"
    Cc: Petr Tesarik
    Cc: Marc-André Lureau
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • The vmcoreinfo information is useful for runtime debugging tools, not just
    for crash dumps. A lot of this information can be determined by other
    means, but this is much more convenient, and it only adds a page at most
    to the file.

    Link: http://lkml.kernel.org/r/fddbcd08eed76344863303878b12de1c1e2a04b6.1531953780.git.osandov@fb.com
    Signed-off-by: Omar Sandoval
    Cc: Alexey Dobriyan
    Cc: Bhupesh Sharma
    Cc: Eric Biederman
    Cc: James Morse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Omar Sandoval
     
  • This is preparation for allowing CRASH_CORE to be enabled for any
    architecture.

    swapper_pg_dir is always either an array or a macro expanding to NULL.
    In the latter case, VMCOREINFO_SYMBOL() won't work, as it tries to take
    the address of the given symbol:

    #define VMCOREINFO_SYMBOL(name) \
    vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)

    Instead, use VMCOREINFO_SYMBOL_ARRAY(), which uses the value:

    #define VMCOREINFO_SYMBOL_ARRAY(name) \
    vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)

    This is the same thing for the array case but isn't an error for the macro
    case.

    Link: http://lkml.kernel.org/r/c05f9781ec204f40fc96f95086e7b6de6a3eb2c3.1532563124.git.osandov@fb.com
    Signed-off-by: Omar Sandoval
    Cc: Alexey Dobriyan
    Cc: Bhupesh Sharma
    Cc: Eric Biederman
    Cc: James Morse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Omar Sandoval
     

08 Jun, 2018

1 commit

  • We're already using a union of many fields here, so stop abusing the
    _mapcount and make page_type its own field. That implies renaming some of
    the machinery that creates PageBuddy, PageBalloon and PageKmemcg; bring
    back the PG_buddy, PG_balloon and PG_kmemcg names.

    As suggested by Kirill, make page_type a bitmask. Because it starts out
    life as -1 (thanks to sharing the storage with _mapcount), setting a page
    flag means clearing the appropriate bit. This gives us space for probably
    twenty or so extra bits (depending how paranoid we want to be about
    _mapcount underflow).

    Link: http://lkml.kernel.org/r/20180518194519.3820-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Acked-by: Kirill A. Shutemov
    Acked-by: Vlastimil Babka
    Cc: Christoph Lameter
    Cc: Dave Hansen
    Cc: Jérôme Glisse
    Cc: Lai Jiangshan
    Cc: Martin Schwidefsky
    Cc: Pekka Enberg
    Cc: Randy Dunlap
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

14 Apr, 2018

1 commit

  • Since commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache,
    valid when PageSwapBacked"), PG_swapcache is an alias for
    PG_owner_priv_1, which may be also used for other purposes.

    To know whether the bit indeed has the PG_swapcache meaning, it is
    necessary to check PG_swapbacked, hence this bit must be exported.

    Link: http://lkml.kernel.org/r/20180410161345.142e142d@ezekiel.suse.cz
    Signed-off-by: Petr Tesarik
    Reviewed-by: Andrew Morton
    Cc: Dave Young
    Cc: Xunlei Pang
    Cc: Baoquan He
    Cc: Hari Bathini
    Cc: "Kirill A. Shutemov"
    Cc: "Marc-André Lureau"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Tesarik
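
The check that becomes possible once PG_swapbacked is exported can be sketched as follows. The bit numbers are parameters because a tool would read them from the dump's vmcoreinfo; the function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Return true only if the page is really in the swap cache: the aliased
 * PG_swapcache bit counts only when PG_swapbacked is set as well.
 */
bool page_in_swapcache(unsigned long flags, unsigned int swapbacked_bit,
		       unsigned int swapcache_bit)
{
	assert(swapbacked_bit < sizeof(flags) * CHAR_BIT);
	assert(swapcache_bit < sizeof(flags) * CHAR_BIT);

	bool swapbacked = (flags >> swapbacked_bit) & 1UL;
	bool owner_priv = (flags >> swapcache_bit) & 1UL;

	return swapbacked && owner_priv;
}
```

Testing the aliased bit alone would misclassify pages whose owner uses PG_owner_priv_1 for some other purpose.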
     

20 Mar, 2018

1 commit

  • The following patch is going to use the symbol from the fw_cfg module,
    to call the function and write the note location details in the
    vmcoreinfo entry, so qemu can produce dumps with the vmcoreinfo note.

    CC: Andrew Morton
    CC: Hari Bathini
    CC: Tony Luck
    CC: Vivek Goyal
    Acked-by: Baoquan He
    Acked-by: Dave Young
    Signed-off-by: Marc-André Lureau
    Acked-by: Gabriel Somlo
    Signed-off-by: Michael S. Tsirkin

    Marc-André Lureau
     

14 Jan, 2018

1 commit

  • Depending on configuration mem_section can now be an array or a pointer
    to an array allocated dynamically. In most cases, we can continue to
    refer to it as 'mem_section' regardless of what it is.

    But there's one exception: '&mem_section' means "address of the array"
    if mem_section is an array, but if mem_section is a pointer, it would
    mean "address of the pointer".

    We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
    writes down address of pointer into vmcoreinfo, not array as we wanted.

    Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
    situation correctly for both cases.

    Link: http://lkml.kernel.org/r/20180112162532.35896-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov
    Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
    Acked-by: Baoquan He
    Acked-by: Dave Young
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: Baoquan He
    Cc: Vivek Goyal
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
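
The difference between the two cases is plain C and can be reproduced in user space. In this sketch the macro and variable names are hypothetical stand-ins for the two VMCOREINFO macros and for the two possible shapes of mem_section.

```c
#include <assert.h>
#include <stdint.h>

struct section { unsigned long map; };

struct section static_sections[4];			/* array case */
struct section *dynamic_sections = static_sections;	/* pointer case */

/* Mirrors VMCOREINFO_SYMBOL(): records the address of the named object. */
#define SYMBOL_ADDR(name)	((uintptr_t)&name)
/* Mirrors VMCOREINFO_SYMBOL_ARRAY(): records the value of the name. */
#define SYMBOL_ARRAY_ADDR(name)	((uintptr_t)name)

void demo(void)
{
	/* For a true array both forms give the address of element 0. */
	assert(SYMBOL_ADDR(static_sections) ==
	       SYMBOL_ARRAY_ADDR(static_sections));

	/* For a pointer, only the value form finds the array. */
	assert(SYMBOL_ARRAY_ADDR(dynamic_sections) ==
	       (uintptr_t)&static_sections[0]);

	/* The address-of form records where the pointer variable lives. */
	assert(SYMBOL_ADDR(dynamic_sections) !=
	       SYMBOL_ARRAY_ADDR(dynamic_sections));
}
```

Because the value form works in both configurations, a dump tool does not need to know which one the kernel was built with.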
     

18 Nov, 2017

1 commit

  • parse_crashkernel_mem() silently returns if we get zero bytes in the
    parsing function. It is useful for debugging to add a message,
    especially if the kernel cannot boot correctly.

    Use pr_info instead of pr_warn because size = 0 is expected behavior in
    some cases, e.g. with crashkernel=2G-4G:128M the size will be 0 if
    system memory is less than 2G.

    Link: http://lkml.kernel.org/r/20171114080129.GA6115@dhcp-128-65.nay.redhat.com
    Signed-off-by: Dave Young
    Cc: Baoquan He
    Cc: Vivek Goyal
    Cc: Bhupesh Sharma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Young
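
To see why a size of 0 is a normal outcome, here is a simplified user-space sketch of range-based crashkernel parsing. It handles only the `start-[end]:size[,...]` form with K/M/G suffixes and is not the kernel's parse_crashkernel_mem(); both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end is left after it. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Return the reservation size selected by system_ram, or 0 when no range
 * matches or the string is malformed. A range with no end is open-ended.
 */
unsigned long long crashkernel_size(const char *arg,
				    unsigned long long system_ram)
{
	char *cur = (char *)arg;

	assert(arg != NULL);
	while (*cur) {
		unsigned long long start, end = ULLONG_MAX, size;

		start = parse_size(cur, &cur);
		if (*cur != '-')
			return 0;
		cur++;
		if (*cur != ':')
			end = parse_size(cur, &cur);
		if (*cur != ':')
			return 0;
		cur++;
		size = parse_size(cur, &cur);

		if (system_ram >= start && system_ram < end)
			return size;
		if (*cur != ',')
			break;
		cur++;
	}
	return 0;	/* no range matched: nothing is reserved */
}
```

With `crashkernel=2G-4G:128M`, a machine with 1G of memory falls outside the only range, so the result is 0. That is the situation in which the kernel now prints an informational message instead of returning silently.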
     

13 Jul, 2017

3 commits

    Currently vmcoreinfo data is updated at boot time in a subsys_initcall(),
    so it risks being modified by buggy code while the system is
    running.

    As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on,
    when using "crash", "makedumpfile", etc utility to parse this vmcore, we
    probably will get "Segmentation fault" or other unexpected errors.

    E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
    system; 3) trigger kdump, then we obviously will fail to recognize the
    crash context correctly due to the corrupted vmcoreinfo.

    Now, except for vmcoreinfo, all the crash data is well protected
    (including the cpu note, which is fully updated in the crash path, so
    its correctness is guaranteed). Given that vmcoreinfo data is a large
    chunk prepared for kdump, we had better protect it as well.

    To solve this, we relocate and copy vmcoreinfo_data to the crash memory
    when kdump is loaded via the kexec syscalls. Because the whole crash
    memory will be protected by the existing arch_kexec_protect_crashkres()
    mechanism, we naturally protect vmcoreinfo_data from write (even read)
    access under the kernel direct mapping after kdump is loaded.

    Since kdump is usually loaded at the very early stage after boot, we can
    trust the correctness of the vmcoreinfo data copied.

    On the other hand, we still need to operate on the vmcoreinfo safe copy
    when a crash happens in order to generate vmcoreinfo_note again. We rely
    on vmap() to map out a new kernel virtual address and use this new one
    instead in the following crash_save_vmcoreinfo().

    BTW, we do not touch vmcoreinfo_note, because it will be fully updated
    using the protected vmcoreinfo_data after crash which is surely correct
    just like the cpu crash note.

    Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
    Signed-off-by: Xunlei Pang
    Tested-by: Michael Holzheu
    Cc: Benjamin Herrenschmidt
    Cc: Dave Young
    Cc: Eric Biederman
    Cc: Hari Bathini
    Cc: Juergen Gross
    Cc: Mahesh Salgaonkar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     
    vmcoreinfo_max_size applies to vmcoreinfo_data; the correct object to
    use here is vmcoreinfo_note, whose total size is VMCOREINFO_NOTE_SIZE.

    As explained in commit 77019967f06b ("kdump: fix exported size of
    vmcoreinfo note"), this should not affect actual behavior, but it is
    better to fix it. The change should also be safe and backward compatible.

    After this, we can get rid of the variable vmcoreinfo_max_size and use
    the corresponding macros directly; fewer variables means more safety for
    vmcoreinfo operation.

    [xlpang@redhat.com: fix build warning]
    Link: http://lkml.kernel.org/r/1494830606-27736-1-git-send-email-xlpang@redhat.com
    Link: http://lkml.kernel.org/r/1493281021-20737-2-git-send-email-xlpang@redhat.com
    Signed-off-by: Xunlei Pang
    Reviewed-by: Mahesh Salgaonkar
    Reviewed-by: Dave Young
    Cc: Hari Bathini
    Cc: Benjamin Herrenschmidt
    Cc: Eric Biederman
    Cc: Juergen Gross
    Cc: Michael Holzheu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     
  • As Eric said,
    "what we need to do is move the variable vmcoreinfo_note out of the
    kernel's .bss section. And modify the code to regenerate and keep this
    information in something like the control page.

    Definitely something like this needs a page all to itself, and ideally
    far away from any other kernel data structures. I clearly was not
    watching closely the data someone decided to keep this silly thing in
    the kernel's .bss section."

    This patch allocates extra pages for these vmcoreinfo_XXX variables. One
    advantage is that it improves the safety of vmcoreinfo, because
    vmcoreinfo is now kept far away from other kernel data structures.

    Link: http://lkml.kernel.org/r/1493281021-20737-1-git-send-email-xlpang@redhat.com
    Signed-off-by: Xunlei Pang
    Tested-by: Michael Holzheu
    Reviewed-by: Juergen Gross
    Suggested-by: Eric Biederman
    Cc: Benjamin Herrenschmidt
    Cc: Dave Young
    Cc: Hari Bathini
    Cc: Mahesh Salgaonkar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

09 May, 2017

2 commits

  • Get rid of multiple definitions of append_elf_note() & final_note()
    functions. Reuse these functions compiled under CONFIG_CRASH_CORE. Also,
    define Elf_Word and use it instead of generic u32 or the more specific
    Elf64_Word.

    Link: http://lkml.kernel.org/r/149035342324.6881.11667840929850361402.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Acked-by: Tony Luck
    Cc: Fenghua Yu
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini
     
  • Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
    reuse crashkernel parameter for fadump", v4.

    Traditionally, kdump is used to save vmcore in case of a crash. Some
    architectures like powerpc can save vmcore using architecture specific
    support instead of kexec/kdump mechanism. Such architecture specific
    support also needs to reserve memory, to be used by dump capture kernel.
    The crashkernel parameter can be reused for memory reservation by such
    architecture specific infrastructure.

    This patchset removes dependency with CONFIG_KEXEC for crashkernel
    parameter and vmcoreinfo related code as it can be reused without kexec
    support. Also, crashkernel parameter is reused instead of
    fadump_reserve_mem to reserve memory for fadump.

    The first patch moves crashkernel parameter parsing and vmcoreinfo
    related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE. The
    second patch reuses the definitions of append_elf_note() & final_note()
    functions under CONFIG_CRASH_CORE in IA64 arch code. The third patch
    removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
    in powerpc. The next patch reuses crashkernel parameter for reserving
    memory for fadump, instead of the fadump_reserve_mem parameter. This
    has the advantage of using all syntaxes crashkernel parameter supports,
    for fadump as well. The last patch updates fadump kernel documentation
    about use of crashkernel parameter.

    This patch (of 5):

    Traditionally, kdump is used to save vmcore in case of a crash. Some
    architectures like powerpc can save vmcore using architecture specific
    support instead of kexec/kdump mechanism. Such architecture specific
    support also needs to reserve memory, to be used by dump capture kernel.
    The crashkernel parameter can be reused for memory reservation by such
    architecture specific infrastructure.

    But currently, code related to vmcoreinfo and parsing of crashkernel
    parameter is built under CONFIG_KEXEC_CORE. This patch introduces
    CONFIG_CRASH_CORE and moves the above mentioned code under this config,
    allowing code reuse without dependency on CONFIG_KEXEC. There is no
    functional change with this patch.

    Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Cc: Fenghua Yu
    Cc: Tony Luck
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini