13 Feb, 2019

1 commit

  • [ Upstream commit df44b479654f62b478c18ee4d8bc4e9f897a9844 ]

    Propagate error code back to userspace if writing the /sys/.../uevent
    file fails. Before, the write operation always returned with success,
    even if we failed to recognize the input string or if we failed to
    generate the uevent itself.

    With the error codes properly propagated back to userspace, we are
    able to react in userspace accordingly by not assuming and awaiting
    a uevent that is not delivered.

    Signed-off-by: Peter Rajnoha
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Peter Rajnoha
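
    With the write now reporting failure, a userspace caller can check the
    result before waiting for an event. A minimal sketch, assuming a
    hypothetical helper named write_uevent() (not part of the commit) and
    plain POSIX file I/O:

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write an action string (e.g. "add" or "change") to a sysfs uevent file.
 * Returns 0 on success or a negative errno value on failure.
 * write_uevent() is a hypothetical helper name used for illustration.
 */
static int write_uevent(const char *path, const char *action)
{
	size_t len = strlen(action);
	ssize_t n;
	int err = 0;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, action, len);
	if (n < 0)
		err = -errno;	/* the kernel rejected the string or could not emit the uevent */
	else if ((size_t)n != len)
		err = -EIO;	/* short write, treated as a failure here */

	close(fd);
	return err;
}
```

    A caller would only start waiting for the uevent when write_uevent()
    returns 0, and would otherwise report the error immediately.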
     

23 Aug, 2018

1 commit

  • An ordinary arm64 defconfig build has ~64 KB worth of __ksymtab entries,
    each consisting of two 64-bit fields containing absolute references, to
    the symbol itself and to a char array containing its name, respectively.

    When we build the same configuration with KASLR enabled, we end up with an
    additional ~192 KB of relocations in the .init section, i.e., one 24 byte
    entry for each absolute reference, which all need to be processed at boot
    time.

    Given how the struct kernel_symbol that describes each entry is completely
    local to module.c (except for the references emitted by EXPORT_SYMBOL()
    itself), we can easily modify it to contain two 32-bit relative references
    instead. This reduces the size of the __ksymtab section by 50% for all
    64-bit architectures, and gets rid of the runtime relocations entirely for
    architectures implementing KASLR, either via standard PIE linking (arm64)
    or using custom host tools (x86).

    Note that the binary search involving __ksymtab contents relies on each
    section being sorted by symbol name. This is implemented based on the
    input section names, not the names in the ksymtab entries, so this patch
    does not interfere with that.

    Given that the use of place-relative relocations requires support both in
    the toolchain and in the module loader, we cannot enable this feature for
    all architectures. So make it dependent on whether
    CONFIG_HAVE_ARCH_PREL32_RELOCATIONS is defined.

    Link: http://lkml.kernel.org/r/20180704083651.24360-4-ard.biesheuvel@linaro.org
    Signed-off-by: Ard Biesheuvel
    Acked-by: Jessica Yu
    Acked-by: Michael Ellerman
    Reviewed-by: Will Deacon
    Acked-by: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Bjorn Helgaas
    Cc: Catalin Marinas
    Cc: James Morris
    Cc: Josh Poimboeuf
    Cc: Kees Cook
    Cc: Nicolas Pitre
    Cc: Paul Mackerras
    Cc: Petr Mladek
    Cc: Russell King
    Cc: "Serge E. Hallyn"
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Cc: Thomas Garnier
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ard Biesheuvel
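
    The place-relative encoding described above can be sketched in userspace
    C. This is an illustration under stated assumptions, not the kernel's
    struct kernel_symbol: the names abs_symbol, rel_symbol, rel_set(),
    rel_get() and abs_reloc_bytes() are hypothetical, and the offsets are
    assumed to fit in 32 bits because entry and target live in the same image.

```c
#include <stddef.h>
#include <stdint.h>

/* Absolute layout: two pointers, each needing a runtime relocation under KASLR. */
struct abs_symbol {
	const void *value;
	const char *name;
};

/* Relative layout: two signed 32-bit offsets, each measured from its own field. */
struct rel_symbol {
	int32_t value_offset;
	int32_t name_offset;
};

/* Store target as an offset from the address of the field itself. */
static void rel_set(int32_t *field, const void *target)
{
	*field = (int32_t)((intptr_t)target - (intptr_t)field);
}

/* Recover the absolute address: field address plus stored offset. */
static const void *rel_get(const int32_t *field)
{
	return (const void *)((intptr_t)field + *field);
}

/*
 * Relocation bytes needed for an absolute table of the given size:
 * 16-byte entries, two references per entry, 24 bytes per relocation.
 */
static size_t abs_reloc_bytes(size_t ksymtab_bytes)
{
	size_t entries = ksymtab_bytes / 16;

	return entries * 2 * 24;
}
```

    With the figures from the commit message, 64 KB of absolute entries is
    4096 entries, hence 8192 references and 8192 * 24 bytes = 192 KB of
    relocations. The relative entry is 8 bytes instead of 16 on 64-bit
    targets and needs no relocation, since the offset does not change when
    the whole image is moved.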
     

18 Aug, 2018

1 commit

  • Pull modules updates from Jessica Yu:
    "Summary of modules changes for the 4.19 merge window:

    - Fix modules kallsyms for livepatch. Livepatch modules can have
    SHN_UNDEF symbols in their module symbol tables for later symbol
    resolution, but kallsyms shouldn't be returning these symbols

    - Some code cleanups and minor reshuffling in load_module() were done
    to log the module name when module signature verification fails"

    * tag 'modules-for-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    kernel/module: Use kmemdup to replace kmalloc+memcpy
    ARM: module: fix modsign build error
    modsign: log module name in the event of an error
    module: replace VMLINUX_SYMBOL_STR() with __stringify() or string literal
    module: print sensible error code
    module: setup load info before module_sig_check()
    module: make it clear when we're handling the module copy in info->hdr
    module: exclude SHN_UNDEF symbols from kallsyms api

    Linus Torvalds
     

03 Aug, 2018

1 commit


17 Jul, 2018

1 commit

  • Both the init_module and finit_module syscalls call either directly
    or indirectly the security_kernel_read_file LSM hook. This patch
    replaces the direct call in init_module with a call to the new
    security_kernel_load_data hook and makes the corresponding changes
    in SELinux, LoadPin, and IMA.

    Signed-off-by: Mimi Zohar
    Cc: Jeff Vander Stoep
    Cc: Casey Schaufler
    Cc: Kees Cook
    Acked-by: Jessica Yu
    Acked-by: Paul Moore
    Acked-by: Kees Cook
    Signed-off-by: James Morris

    Mimi Zohar
     

02 Jul, 2018

1 commit

  • Now that we have the load_info struct all initialized (including
    info->name, which contains the name of the module) before
    module_sig_check(), make the load_info struct and hence module name
    available to mod_verify_sig() so that we can log the module name in the
    event of an error.

    Signed-off-by: Jessica Yu

    Jessica Yu
     

25 Jun, 2018

2 commits


22 Jun, 2018

2 commits

  • We want to be able to log the module name in early error messages, such as
    when module signature verification fails. Previously, the module name was
    set in layout_and_allocate(), meaning that any error messages that happen
    before (such as those in module_sig_check()) won't be logged with a module
    name, which isn't terribly helpful.

    In order to do this, reshuffle the order in load_module() and set up
    load info earlier so that we can log the module name along with these
    error messages. This requires splitting rewrite_section_headers() out of
    setup_load_info().

    While we're at it, clean up and split up the operations done in
    layout_and_allocate(), setup_load_info(), and rewrite_section_headers()
    more cleanly so these functions only perform what their names suggest.

    Signed-off-by: Jessica Yu

    Jessica Yu
     
  • In load_module(), it's not always clear whether we're handling the
    temporary module copy in info->hdr (which is freed at the end of
    load_module()) or if we're handling the module already allocated and
    copied to its final place. Adding an info->mod field and using it
    whenever we're handling the temporary copy makes that explicitly clear.

    Signed-off-by: Jessica Yu

    Jessica Yu
     

18 Jun, 2018

1 commit

  • Livepatch modules are special in that we preserve their entire symbol
    tables in order to be able to apply relocations after module load. The
    unwanted side effect of this is that undefined (SHN_UNDEF) symbols of
    livepatch modules are accessible via the kallsyms api and this can
    confuse symbol resolution in livepatch (klp_find_object_symbol()) and
    cause subtle bugs in livepatch.

    Have the module kallsyms api skip over SHN_UNDEF symbols. These symbols
    are usually not available for normal modules anyway as we cut down their
    symbol tables to just the core (non-undefined) symbols, so this should
    really just affect livepatch modules. Note that this patch doesn't
    affect the display of undefined symbols in /proc/kallsyms.

    Reported-by: Josh Poimboeuf
    Tested-by: Josh Poimboeuf
    Reviewed-by: Josh Poimboeuf
    Signed-off-by: Jessica Yu

    Jessica Yu
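
    The filtering idea can be illustrated outside the kernel. A minimal
    sketch, assuming a Linux userspace with <elf.h>; the helper names
    next_defined_symbol() and count_defined_symbols() are hypothetical and
    are not the functions changed by the patch:

```c
#include <elf.h>
#include <stddef.h>

/*
 * Return the index of the first defined symbol at or after `start`,
 * or `num` if there is none. Undefined symbols have st_shndx == SHN_UNDEF.
 */
static size_t next_defined_symbol(const Elf64_Sym *symtab, size_t num, size_t start)
{
	for (size_t i = start; i < num; i++) {
		if (symtab[i].st_shndx != SHN_UNDEF)
			return i;
	}
	return num;
}

/* Count the symbols a lookup would consider once SHN_UNDEF entries are skipped. */
static size_t count_defined_symbols(const Elf64_Sym *symtab, size_t num)
{
	size_t count = 0;

	for (size_t i = next_defined_symbol(symtab, num, 0); i < num;
	     i = next_defined_symbol(symtab, num, i + 1))
		count++;
	return count;
}
```

    A lookup that walks the table through next_defined_symbol() never
    returns an entry that is still waiting for later resolution.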
     

16 Jun, 2018

1 commit


07 Jun, 2018

2 commits

  • Pull overflow updates from Kees Cook:
    "This adds the new overflow checking helpers and adds them to the
    2-factor argument allocators. And this adds the saturating size
    helpers and does a treewide replacement for the struct_size() usage.
    Additionally this adds the overflow testing modules to make sure
    everything works.

    I'm still working on the treewide replacements for allocators with
    "simple" multiplied arguments:

    *alloc(a * b, ...) -> *alloc_array(a, b, ...)

    and

    *zalloc(a * b, ...) -> *calloc(a, b, ...)

    as well as the more complex cases, but that's separable from this
    portion of the series. I expect to have the rest sent before -rc1
    closes; there are a lot of messy cases to clean up.

    Summary:

    - Introduce arithmetic overflow test helper functions (Rasmus)

    - Use overflow helpers in 2-factor allocators (Kees, Rasmus)

    - Introduce overflow test module (Rasmus, Kees)

    - Introduce saturating size helper functions (Matthew, Kees)

    - Treewide use of struct_size() for allocators (Kees)"

    * tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    treewide: Use struct_size() for devm_kmalloc() and friends
    treewide: Use struct_size() for vmalloc()-family
    treewide: Use struct_size() for kmalloc()-family
    device: Use overflow helpers for devm_kmalloc()
    mm: Use overflow helpers in kvmalloc()
    mm: Use overflow helpers in kmalloc_array*()
    test_overflow: Add memory allocation overflow tests
    overflow.h: Add allocation size calculation helpers
    test_overflow: Report test failures
    test_overflow: macrofy some more, do more tests for free
    lib: add runtime test of check_*_overflow functions
    compiler.h: enable builtin overflow checkers and add fallback code

    Linus Torvalds
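
    The "2-factor allocator" change can be sketched in userspace with the
    compiler builtin that the overflow helpers wrap. This is an illustration
    assuming GCC or Clang; alloc_array() is a hypothetical stand-in, not the
    kernel's kmalloc_array():

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate n elements of `size` bytes each.
 * Returns NULL when n * size does not fit in size_t instead of
 * silently allocating a wrapped-around, too-small buffer.
 */
static void *alloc_array(size_t n, size_t size)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, size, &bytes))
		return NULL;
	return malloc(bytes);
}
```

    The open-coded form malloc(n * size) would wrap for large n and return
    a buffer smaller than the caller expects; the checked form fails the
    allocation instead.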
     
  • One of the more common cases of allocation size calculations is finding
    the size of a structure that has a zero-sized array at the end, along
    with memory for some number of elements for that array. For example:

    struct foo {
        int stuff;
        void *entry[];
    };

    instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

    Instead of leaving these open-coded and prone to type mistakes, we can
    now use the new struct_size() helper:

    instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

    This patch makes the changes for kmalloc()-family (and kvmalloc()-family)
    uses. It was done via automatic conversion with manual review for the
    "CHECKME" non-standard cases noted below, using the following Coccinelle
    script:

    // pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len *
    // sizeof *pkey_cache->table, GFP_KERNEL);
    @@
    identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
    expression GFP;
    identifier VAR, ELEMENT;
    expression COUNT;
    @@

    - alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
    + alloc(struct_size(VAR, ELEMENT, COUNT), GFP)

    // mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
    @@
    identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
    expression GFP;
    identifier VAR, ELEMENT;
    expression COUNT;
    @@

    - alloc(sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
    + alloc(struct_size(VAR, ELEMENT, COUNT), GFP)

    // Same pattern, but can't trivially locate the trailing element name,
    // or variable name.
    @@
    identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
    expression GFP;
    expression SOMETHING, COUNT, ELEMENT;
    @@

    - alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
    + alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)

    Signed-off-by: Kees Cook

    Kees Cook
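
    A userspace approximation of the helper shows why it is safer than the
    open-coded sum. This is a sketch assuming GCC or Clang builtins; the
    names flex_size() and STRUCT_SIZE() are hypothetical and the kernel's
    struct_size() macro is implemented differently in detail:

```c
#include <stddef.h>
#include <stdint.h>

/* header + n * elem, saturating to SIZE_MAX if either step overflows. */
static size_t flex_size(size_t header, size_t elem, size_t n)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, elem, &bytes) ||
	    __builtin_add_overflow(bytes, header, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Size of *p plus n trailing elements of its flexible array `member`. */
#define STRUCT_SIZE(p, member, n) \
	flex_size(sizeof(*(p)), sizeof((p)->member[0]), (n))

struct foo {
	int stuff;
	void *entry[];
};
```

    Because the result saturates at SIZE_MAX, an allocator handed an
    overflowing request fails rather than returning a short buffer. The
    element type is taken from the member itself, so a mismatch between the
    array type and a hand-written sizeof cannot occur.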
     

12 May, 2018

1 commit

  • load_module() creates W+X mappings via __vmalloc_node_range() (from
    layout_and_allocate()->move_module()->module_alloc()) by using
    PAGE_KERNEL_EXEC. These mappings are later cleaned up via
    "call_rcu_sched(&freeinit->rcu, do_free_init)" from do_init_module().

    This is a problem because call_rcu_sched() queues work, which can be run
    after debug_checkwx() is run, resulting in a race condition. If hit,
    the race results in a nasty splat about insecure W+X mappings, which
    results in a poor user experience as these are not the mappings that
    debug_checkwx() is intended to catch.

    This issue is observed on multiple arm64 platforms, and has been
    artificially triggered on an x86 platform.

    Address the race by flushing the queued work before running the
    arch-defined mark_rodata_ro() which then calls debug_checkwx().

    Link: http://lkml.kernel.org/r/1525103946-29526-1-git-send-email-jhugo@codeaurora.org
    Fixes: e1a58320a38d ("x86/mm: Warn on W^X mappings")
    Signed-off-by: Jeffrey Hugo
    Reported-by: Timur Tabi
    Reported-by: Jan Glauber
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Acked-by: Will Deacon
    Acked-by: Laura Abbott
    Cc: Mark Rutland
    Cc: Ard Biesheuvel
    Cc: Catalin Marinas
    Cc: Stephen Smalley
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeffrey Hugo
     

19 Apr, 2018

1 commit

  • Reading file /proc/modules shows the correct address:
    [root@s35lp76 ~]# cat /proc/modules | egrep '^qeth_l2'
    qeth_l2 94208 1 - Live 0x000003ff80401000

    and reading file /sys/module/qeth_l2/sections/.text
    [root@s35lp76 ~]# cat /sys/module/qeth_l2/sections/.text
    0x0000000018ea8363
    displays a random address.

    This breaks the perf tool which uses this address on s390
    to calculate start of .text section in memory.

    Fix this by printing the correct (unhashed) address.

    Thanks to Jessica Yu for helping on this.

    Fixes: ef0010a30935 ("vsprintf: don't use 'restricted_pointer()' when not restricting")
    Cc: stable@vger.kernel.org # v4.15+
    Suggested-by: Linus Torvalds
    Signed-off-by: Thomas Richter
    Cc: Jessica Yu
    Signed-off-by: Jessica Yu

    Thomas Richter
     

17 Apr, 2018

2 commits


03 Apr, 2018

1 commit

  • Pull removal of obsolete architecture ports from Arnd Bergmann:
    "This removes the entire architecture code for blackfin, cris, frv,
    m32r, metag, mn10300, score, and tile, including the associated device
    drivers.

    I have been working with the (former) maintainers for each one to
    ensure that my interpretation was right and the code is definitely
    unused in mainline kernels. Many had fond memories of working on the
    respective ports to start with and getting them included in upstream,
    but also saw no point in keeping the port alive without any users.

    In the end, it seems that while the eight architectures are extremely
    different, they all suffered the same fate: There was one company in
    charge of an SoC line, a CPU microarchitecture and a software
    ecosystem, which was more costly than licensing newer off-the-shelf
    CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
    seems that all the SoC product lines are still around, but have not
    used the custom CPU architectures for several years at this point. In
    contrast, CPU instruction sets that remain popular and have actively
    maintained kernel ports tend to all be used across multiple licensees.

    [ See the new nds32 port merged in the previous commit for the next
    generation of "one company in charge of an SoC line, a CPU
    microarchitecture and a software ecosystem" - Linus ]

    The removal came out of a discussion that is now documented at
    https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
    marking any ports as deprecated but remove them all at once after I
    made sure that they are all unused. Some architectures (notably tile,
    mn10300, and blackfin) are still being shipped in products with old
    kernels, but those products will never be updated to newer kernel
    releases.

    After this series, we still have a few architectures without mainline
    gcc support:

    - unicore32 and hexagon both have very outdated gcc releases, but the
    maintainers promised to work on providing something newer. At least
    in case of hexagon, this will only be llvm, not gcc.

    - openrisc, risc-v and nds32 are still in the process of finishing
    their support or getting it added to mainline gcc in the first
    place. They all have patched gcc-7.3 ports that work to some
    degree, but complete upstream support won't happen before gcc-8.1.
    Csky posted their first kernel patch set last week, their situation
    will be similar

    [ Palmer Dabbelt points out that RISC-V support is in mainline gcc
    since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]"

    This really says it all:

    2498 files changed, 95 insertions(+), 467668 deletions(-)

    * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
    MAINTAINERS: UNICORE32: Change email account
    staging: iio: remove iio-trig-bfin-timer driver
    tty: hvc: remove tile driver
    tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
    serial: remove tile uart driver
    serial: remove m32r_sio driver
    serial: remove blackfin drivers
    serial: remove cris/etrax uart drivers
    usb: Remove Blackfin references in USB support
    usb: isp1362: remove blackfin arch glue
    usb: musb: remove blackfin port
    usb: host: remove tilegx platform glue
    pwm: remove pwm-bfin driver
    i2c: remove bfin-twi driver
    spi: remove blackfin related host drivers
    watchdog: remove bfin_wdt driver
    can: remove bfin_can driver
    mmc: remove bfin_sdh driver
    input: misc: remove blackfin rotary driver
    input: keyboard: remove bf54x driver
    ...

    Linus Torvalds
     

16 Mar, 2018

1 commit

  • The CONFIG_MPU option was only defined on blackfin, and that architecture
    is now being removed, so the respective code can be simplified.

    A lot of other microcontrollers have an MPU, but I suspect that if we
    want to bring that support back, we'd do it differently anyway.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

09 Mar, 2018

1 commit

  • otherwise kernel can oops later in seq_release() due to dereferencing null
    file->private_data which is only set if seq_open() succeeds.

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    IP: seq_release+0xc/0x30
    Call Trace:
    close_pdeo+0x37/0xd0
    proc_reg_release+0x5d/0x60
    __fput+0x9d/0x1d0
    ____fput+0x9/0x10
    task_work_run+0x75/0x90
    do_exit+0x252/0xa00
    do_group_exit+0x36/0xb0
    SyS_exit_group+0xf/0x10

    Fixes: 516fb7f2e73d ("/proc/module: use the same logic as /proc/kallsyms for address exposure")
    Cc: Jessica Yu
    Cc: Linus Torvalds
    Cc: stable@vger.kernel.org # 4.15+
    Signed-off-by: Leon Yu
    Signed-off-by: Jessica Yu

    Leon Yu
     

08 Feb, 2018

1 commit


02 Feb, 2018

1 commit

  • Pull printk updates from Petr Mladek:

    - Add a console_msg_format command line option:

    The value "default" keeps the old "[time stamp] text\n" format. The
    value "syslog" switches to the syslog-like "<log_level>[timestamp] text"
    format.

    This feature was requested by people doing regression tests, for
    example, 0day robot. They want to have both filtered and full logs
    at hand.

    - Reduce the risk of softlockup:

    Pass the console owner in a busy loop.

    This is a new approach to the old problem. It was first proposed by
    Steven Rostedt at Kernel Summit 2017. It marks a context in which
    the console_lock owner calls console drivers and could not sleep.
    On the other side, printk() callers could detect this state and use
    a busy wait instead of a simple console_trylock(). Finally, the
    console_lock owner checks if there is a busy waiter at the end of
    the special context and eventually passes the console_lock to the
    waiter.

    The hand-off works surprisingly well and helps in many situations.
    Well, there is still a possibility of the softlockup, for example,
    when the flood of messages stops and the last owner still has too
    much to flush.

    There is an increasing number of people having problems with
    printk-related softlockups. We might eventually need a better
    solution. Anyway, this looks like a good start and a promising
    direction.

    - Do not allow scheduling in console_unlock() called from printk():

    This reverts an older controversial commit. The reschedule helped
    to avoid softlockups. But it also slowed down the console output.
    This patch is obsoleted by the new console waiter logic described
    above. In fact, the reschedule made the hand-off less effective.

    - Deprecate the "%pf" and "%pF" format specifiers:

    They were needed on ia64, ppc64 and parisc64 to dereference function
    descriptors and show the real function address. This is now done
    transparently by the "%ps" and "%pS" format specifiers.

    Sergey Senozhatsky found that all the function descriptors were in
    a special elf section and could be easily detected.

    - Remove print_symbol() API:

    It has been obsoleted by the "%pS" format specifier, and this change
    helped to remove a few continuous lines and a less intuitive old API.

    - Remove redundant memsets:

    Sergey removed unnecessary memset when processing printk.devkmsg
    command line option.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
    printk: drop redundant devkmsg_log_str memsets
    printk: Never set console_may_schedule in console_trylock()
    printk: Hide console waiter logic into helpers
    printk: Add console owner and waiter logic to load balance console writes
    kallsyms: remove print_symbol() function
    checkpatch: add pF/pf deprecation warning
    symbol lookup: introduce dereference_symbol_descriptor()
    parisc64: Add .opd based function descriptor dereference
    powerpc64: Add .opd based function descriptor dereference
    ia64: Add .opd based function descriptor dereference
    sections: split dereference_function_descriptor()
    openrisc: Fix conflicting types for _exext and _stext
    lib: do not use print_symbol()
    irq debug: do not use print_symbol()
    sysfs: do not use print_symbol()
    drivers: do not use print_symbol()
    x86: do not use print_symbol()
    unicore32: do not use print_symbol()
    sh: do not use print_symbol()
    mn10300: do not use print_symbol()
    ...

    Linus Torvalds
     

01 Feb, 2018

1 commit

  • Pull networking updates from David Miller:

    1) Significantly shrink the core networking routing structures. Result
    of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf

    2) Add netdevsim driver for testing various offloads, from Jakub
    Kicinski.

    3) Support cross-chip FDB operations in DSA, from Vivien Didelot.

    4) Add a 2nd listener hash table for TCP, similar to what was done for
    UDP. From Martin KaFai Lau.

    5) Add eBPF based queue selection to tun, from Jason Wang.

    6) Lockless qdisc support, from John Fastabend.

    7) SCTP stream interleave support, from Xin Long.

    8) Smoother TCP receive autotuning, from Eric Dumazet.

    9) Lots of erspan tunneling enhancements, from William Tu.

    10) Add true function call support to BPF, from Alexei Starovoitov.

    11) Add explicit support for GRO HW offloading, from Michael Chan.

    12) Support extack generation in more netlink subsystems. From Alexander
    Aring, Quentin Monnet, and Jakub Kicinski.

    13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
    Russell King.

    14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.

    15) Many improvements and simplifications to the NFP driver bpf JIT,
    from Jakub Kicinski.

    16) Support for ipv6 non-equal cost multipath routing, from Ido
    Schimmel.

    17) Add resource abstraction to devlink, from Arkadi Sharshevsky.

    18) Packet scheduler classifier shared filter block support, from Jiri
    Pirko.

    19) Avoid locking in act_csum, from Davide Caratti.

    20) devinet_ioctl() simplifications from Al Viro.

    21) More TCP bpf improvements from Lawrence Brakmo.

    22) Add support for onlink ipv6 route flag, similar to ipv4, from David
    Ahern.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
    tls: Add support for encryption using async offload accelerator
    ip6mr: fix stale iterator
    net/sched: kconfig: Remove blank help texts
    openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
    tcp_nv: fix potential integer overflow in tcpnv_acked
    r8169: fix RTL8168EP take too long to complete driver initialization.
    qmi_wwan: Add support for Quectel EP06
    rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
    ipmr: Fix ptrdiff_t print formatting
    ibmvnic: Wait for device response when changing MAC
    qlcnic: fix deadlock bug
    tcp: release sk_frag.page in tcp_disconnect
    ipv4: Get the address of interface correctly.
    net_sched: gen_estimator: fix lockdep splat
    net: macb: Handle HRESP error
    net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
    ipv6: addrconf: break critical section in addrconf_verify_rtnl()
    ipv6: change route cache aging logic
    i40e/i40evf: Update DESC_NEEDED value to reflect larger value
    bnxt_en: cleanup DIM work on device shutdown
    ...

    Linus Torvalds
     

30 Jan, 2018

1 commit

  • Pull x86/pti updates from Thomas Gleixner:
    "Another set of melted spectrum related changes:

    - Code simplifications and cleanups for RSB and retpolines.

    - Make the indirect calls in KVM speculation safe.

    - Whitelist CPUs which are known not to speculate from Meltdown and
    prepare for the new CPUID flag which tells the kernel that a CPU is
    not affected.

    - A less rigorous variant of the module retpoline check which merely
    warns when a non-retpoline protected module is loaded and reflects
    that fact in the sysfs file.

    - Prepare for Indirect Branch Prediction Barrier support.

    - Prepare for exposure of the Speculation Control MSRs to guests, so
    guest OSes which depend on those "features" can use them. Includes
    a blacklist of the broken microcodes. The actual exposure of the
    MSRs through KVM is still being worked on"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/speculation: Simplify indirect_branch_prediction_barrier()
    x86/retpoline: Simplify vmexit_fill_RSB()
    x86/cpufeatures: Clean up Spectre v2 related CPUID flags
    x86/cpu/bugs: Make retpoline module warning conditional
    x86/bugs: Drop one "mitigation" from dmesg
    x86/nospec: Fix header guards names
    x86/alternative: Print unadorned pointers
    x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support
    x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
    x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
    x86/msr: Add definitions for new speculation control MSRs
    x86/cpufeatures: Add AMD feature bits for Speculation Control
    x86/cpufeatures: Add Intel feature bits for Speculation Control
    x86/cpufeatures: Add CPUID_7_EDX CPUID leaf
    module/retpoline: Warn about missing retpoline in module
    KVM: VMX: Make indirect call speculation safe
    KVM: x86: Make indirect calls in emulator speculation safe

    Linus Torvalds
     

26 Jan, 2018

1 commit

  • There's a risk that a kernel which has full retpoline mitigations becomes
    vulnerable when a module gets loaded that hasn't been compiled with the
    right compiler or the right option.

    To enable detection of that mismatch at module load time, add a module info
    string "retpoline" at build time when the module was compiled with
    retpoline support. This only covers compiled C source, but assembler source
    or prebuilt object files are not checked.

    If a retpoline enabled kernel detects a non retpoline protected module at
    load time, print a warning and report it in the sysfs vulnerability file.

    [ tglx: Massaged changelog ]

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Cc: David Woodhouse
    Cc: gregkh@linuxfoundation.org
    Cc: torvalds@linux-foundation.org
    Cc: jeyu@kernel.org
    Cc: arjan@linux.intel.com
    Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org

    Andi Kleen
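
    The load-time check amounts to looking for a tag in the module info
    strings. A minimal userspace sketch, assuming the info data is a blob of
    consecutive NUL-terminated "tag=value" strings; modinfo_get() and
    module_has_retpoline() are hypothetical names, not the kernel's
    functions:

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `tag` in a blob of consecutive NUL-terminated "tag=value"
 * strings. Returns a pointer to the value, or NULL if the tag is absent.
 */
static const char *modinfo_get(const char *blob, size_t size, const char *tag)
{
	size_t taglen = strlen(tag);
	const char *p = blob;
	const char *end = blob + size;

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		size_t len;

		if (!nul)
			break;		/* unterminated trailing data, stop */
		len = (size_t)(nul - p);
		if (len > taglen && strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
			return p + taglen + 1;
		p = nul + 1;		/* skip the string and its NUL */
	}
	return NULL;
}

/* Nonzero when the module advertises that it was built with retpoline support. */
static int module_has_retpoline(const char *blob, size_t size)
{
	return modinfo_get(blob, size, "retpoline") != NULL;
}
```

    A loader following the commit's approach would warn, rather than refuse
    to load, when module_has_retpoline() returns 0 on a retpoline-enabled
    kernel.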
     

16 Jan, 2018

1 commit

  • ftrace_module_init() happens after dynamic_debug_setup(), so its cleanup
    should run only for failures that occur after that point. In the current
    implementation, however, the cleanup is done at the free_module label,
    i.e. even though ftrace is not initialized, many failure cases still call
    ftrace_release_mod(), which then needlessly traverses the whole list.

    This patch moves ftrace_release_mod() from the free_module label to the
    ddebug_cleanup label, which is the best possible location. The other
    solution is to add a new label for ftrace_release_mod(), but since
    ftrace_module_init() does not return a value, the minimal change is to
    place it at the ddebug_cleanup label.

    Signed-off-by: Namit Gupta
    Reviewed-by: Steven Rostedt (VMware)
    Signed-off-by: Jessica Yu

    Namit Gupta
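
    The label ordering can be illustrated with a small goto-unwind sketch.
    This is a hypothetical model, not load_module() itself: the fail_point
    values and the recorded strings only stand in for the real setup and
    cleanup steps.

```c
#include <stddef.h>

enum fail_point { FAIL_NONE, FAIL_EARLY, FAIL_LATE };

/* Records which cleanup actions ran, in order. */
struct trace {
	const char *ran[4];
	size_t n;
};

static void note(struct trace *t, const char *what)
{
	t->ran[t->n++] = what;
}

/*
 * Model of a loader with two unwind labels. A failure before ftrace
 * init jumps past the ftrace cleanup; a later failure runs it.
 */
static int load(enum fail_point fail, struct trace *t)
{
	/* module memory would be allocated here */
	if (fail == FAIL_EARLY)
		goto free_module;	/* ftrace was never initialized */

	/* dynamic debug setup, then ftrace init, would happen here */
	if (fail == FAIL_LATE)
		goto ddebug_cleanup;

	return 0;

ddebug_cleanup:
	note(t, "ftrace_release");
	note(t, "ddebug_remove");
free_module:
	note(t, "free_module");
	return -1;
}
```

    Each label undoes only what had been set up when the jump was taken, so
    an early failure no longer pays for an ftrace cleanup that has nothing
    to release.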
     

13 Jan, 2018

1 commit

  • The error-injection framework is not limited to use by kprobes or
    BPF. Other kernel subsystems, e.g. livepatch or ftrace, can use it
    freely for checking the safety of error injection.
    So this separates the error-injection framework from kprobes.

    Some changes have been made:

    - The word "kprobe" is removed from all APIs/structures.
    - BPF_ALLOW_ERROR_INJECTION() is renamed to
    ALLOW_ERROR_INJECTION() since it is not limited to BPF either.
    - CONFIG_FUNCTION_ERROR_INJECTION is the config item for this
    feature. It is automatically enabled if the arch supports the
    error injection feature for kprobes, ftrace, etc.

    Signed-off-by: Masami Hiramatsu
    Reviewed-by: Josef Bacik
    Signed-off-by: Alexei Starovoitov

    Masami Hiramatsu
     

09 Jan, 2018

1 commit

  • There are two format specifiers to print out a pointer in symbolic
    format: '%pS/%ps' and '%pF/%pf'. On most architectures, the two
    mean exactly the same thing, but some architectures (ia64, ppc64,
    parisc64) use an indirect pointer for C function pointers, where
    the function pointer points to a function descriptor (which in
    turn contains the actual pointer to the code). The '%pF/%pf', when
    used appropriately, automatically does the appropriate function
    descriptor dereference on such architectures.

    The "when used appropriately" part is tricky. Basically this is
    a subtle ABI detail, specific to some platforms, that made it to
    the API level and people can be unaware of it and miss the whole
    "we need to dereference the function" business out. [1] proves
    that point (note that it fixes only '%pF' and '%pS', there might
    be '%pf' and '%ps' cases as well).

    It appears that we can handle everything within the affected
    arches and make '%pS/%ps' smart enough to retire '%pF/%pf'.
    Function descriptors live in .opd elf section and all affected
    arches (ia64, ppc64, parisc64) handle it properly for kernel
    and modules. So we, technically, can decide if the dereference
    is needed by simply looking at the pointer: if it belongs to
    .opd section then we need to dereference it.

    The kernel and modules have their own .opd sections, obviously,
    that's why we need to split dereference_function_descriptor()
    and use separate kernel and module dereference arch callbacks.

    This patch does the first step, it
    a) adds dereference_kernel_function_descriptor() function.
    b) adds a weak alias to dereference_module_function_descriptor()
    function.

    So, for the time being, we will have:
    1) dereference_function_descriptor()
    A generic function that simply dereferences the pointer. There are a
    bunch of places that call it: kgdbts, init/main.c, extable, etc.

    2) dereference_kernel_function_descriptor()
    A function to call on kernel symbols that does kernel .opd section
    address range test.

    3) dereference_module_function_descriptor()
    A function to call on modules' symbols that does modules' .opd
    section address range test.

    [1] https://marc.info/?l=linux-kernel&m=150472969730573

    Link: http://lkml.kernel.org/r/20171109234830.5067-2-sergey.senozhatsky@gmail.com
    To: Fenghua Yu
    To: Benjamin Herrenschmidt
    To: Paul Mackerras
    To: Michael Ellerman
    To: James Bottomley
    Cc: Andrew Morton
    Cc: Jessica Yu
    Cc: Steven Rostedt
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Tested-by: Tony Luck #ia64
    Tested-by: Santosh Sivaraj #powerpc
    Tested-by: Helge Deller #parisc64
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     

13 Dec, 2017

1 commit

  • Using BPF, we can override kprobed functions and return arbitrary
    values. Obviously, this can be a bit unsafe, so make this feature opt-in
    for functions. Simply tag a function with KPROBE_ERROR_INJECT_SYMBOL in
    order to give BPF access to that function for error injection purposes.
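
    The opt-in idea can be sketched in plain C as an allow-list that is
    consulted before a return value is overridden. This is a userspace
    illustration, not the kernel mechanism: error_inject_list,
    within_error_inject_list(), call_with_override() and the two device
    functions are hypothetical, and the list is filled in by hand rather
    than by a tagging macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*target_fn)(void);

/* Two ordinary functions; only open_device() is opted in below. */
static int open_device(void) { return 0; }
static int read_device(void) { return 4; }

/* Hypothetical opt-in table, written out by hand for this sketch. */
static const target_fn error_inject_list[] = {
	open_device,
};

/* Return true if fn was opted in to error injection. */
static bool within_error_inject_list(target_fn fn)
{
	size_t n = sizeof(error_inject_list) / sizeof(error_inject_list[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (error_inject_list[i] == fn)
			return true;
	return false;
}

/* Call fn, but return forced_rc instead when injection is requested
 * and fn is on the opt-in list. */
static int call_with_override(target_fn fn, bool inject, int forced_rc)
{
	assert(fn != NULL);
	if (inject && within_error_inject_list(fn))
		return forced_rc;
	return fn();
}
```

    Functions that are not on the list keep their real return value even
    when injection is requested.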

    Signed-off-by: Josef Bacik
    Acked-by: Ingo Molnar
    Signed-off-by: Alexei Starovoitov

    Josef Bacik
     

30 Nov, 2017

1 commit

  • The conditional kallsym hex printing used a special fixed-width '%lx'
    output (KALLSYM_FMT) in preparation for the hashing of %p, but that
    series ended up adding a %px specifier to help with the conversions.

    Use it, and avoid the "print pointer as an unsigned long" code.
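
    For comparison, the "print pointer as an unsigned long" style looks
    roughly like the userspace sketch below. It is an approximation only:
    '%px' is a kernel printk extension that userspace printf does not
    have, and format_addr_fixed() is a hypothetical helper, not the
    removed kernel code.

```c
#include <assert.h>
#include <stdio.h>

/* Format addr as zero-padded hex, two digits per byte of unsigned long,
 * in the spirit of a fixed-width '%lx' format. Returns the number of
 * characters snprintf() reports. */
static int format_addr_fixed(char *buf, size_t len, unsigned long addr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%0*lx",
			(int)(2 * sizeof(unsigned long)), addr);
}
```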

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

18 Nov, 2017

1 commit

  • Pull tracing updates from

    - allow module init functions to be traced

    - clean up some unused or not used by config events (saves space)

    - clean up of trace histogram code

    - add support for preempt and interrupt enabled/disable events

    - other various clean ups

    * tag 'trace-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits)
    tracing, thermal: Hide cpu cooling trace events when not in use
    tracing, thermal: Hide devfreq trace events when not in use
    ftrace: Kill FTRACE_OPS_FL_PER_CPU
    perf/ftrace: Small cleanup
    perf/ftrace: Fix function trace events
    perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function")
    tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_on
    tracing, memcg, vmscan: Hide trace events when not in use
    tracing/xen: Hide events that are not used when X86_PAE is not defined
    tracing: mark trace_test_buffer as __maybe_unused
    printk: Remove superfluous memory barriers from printk_safe
    ftrace: Clear hashes of stale ips of init memory
    tracing: Add support for preempt and irq enable/disable events
    tracing: Prepare to add preempt and irq trace events
    ftrace/kallsyms: Have /proc/kallsyms show saved mod init functions
    ftrace: Add freeing algorithm to free ftrace_mod_maps
    ftrace: Save module init functions kallsyms symbols for tracing
    ftrace: Allow module init functions to be traced
    ftrace: Add a ftrace_free_mem() function for modules to use
    tracing: Reimplement log2
    ...

    Linus Torvalds
     

16 Nov, 2017

1 commit

  • Pull module updates from Jessica Yu:
    "Summary of modules changes for the 4.15 merge window:

    - treewide module_param_call() cleanup, fix up set/get function
    prototype mismatches, from Kees Cook

    - minor code cleanups"

    * tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    module: Do not paper over type mismatches in module_param_call()
    treewide: Fix function prototypes for module_param_call()
    module: Prepare to convert all module_param_call() prototypes
    kernel/module: Delete an error message for a failed memory allocation in add_module_usage()

    Linus Torvalds
     

14 Nov, 2017

1 commit

  • …morris/linux-security

    Pull security subsystem integrity updates from James Morris:
    "There is a mixture of bug fixes, code cleanup, preparatory code for
    new functionality and new functionality.

    Commit 26ddabfe96bb ("evm: enable EVM when X509 certificate is
    loaded") enabled EVM without loading a symmetric key, but was limited
    to defining the x509 certificate pathname at build. Included in this
    set of patches is the ability of enabling EVM, without loading the EVM
    symmetric key, from userspace. New is the ability to prevent the
    loading of an EVM symmetric key."

    * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    ima: Remove redundant conditional operator
    ima: Fix bool initialization/comparison
    ima: check signature enforcement against cmdline param instead of CONFIG
    module: export module signature enforcement status
    ima: fix hash algorithm initialization
    EVM: Only complain about a missing HMAC key once
    EVM: Allow userspace to signal an RSA key has been loaded
    EVM: Include security.apparmor in EVM measurements
    ima: call ima_file_free() prior to calling fasync
    integrity: use kernel_read_file_from_path() to read x509 certs
    ima: always measure and audit files in policy
    ima: don't remove the securityfs policy file
    vfs: fix mounting a filesystem with i_version

    Linus Torvalds
     

13 Nov, 2017

2 commits

  • The (alleged) users of the module addresses are the same: kernel
    profiling.

    So just expose the same helper and format macros, and unify the logic.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • This code goes back to the historical bitkeeper tree commit 3f7b0672086
    ("Module section offsets in /sys/module"), where Jonathan Corbet wanted
    to show people how to debug loadable modules.

    See

    https://lwn.net/Articles/88052/

    from June 2004.

    To expose the required load address information, Jonathan added the
    sections subdirectory for every module in /sys/module, and made them
    S_IRUGO - readable by everybody.

    It was a more innocent time, plus those S_IRxxx macro names are a lot
    more confusing than the octal numbers are, so maybe it wasn't even
    intentional. But here we are, thirteen years later, and I'll just change
    it to S_IRUSR instead.

    Let's see if anybody even notices.

    Cc: Jonathan Corbet
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

09 Nov, 2017

1 commit

  • A static variable, sig_enforce, is used as a status variable to indicate
    the real value of CONFIG_MODULE_SIG_FORCE: once that option is set, the
    variable holds true, but if the option is not set, the variable holds
    whatever value is given by the module.sig_enforce kernel cmdline param:
    true when =1 and false when =0 or not present.

    Because this cmdline param takes precedence over the CONFIG value when
    the latter is not set, other places in the kernel could misbehave, since
    they would have only the CONFIG_MODULE_SIG_FORCE value to rely on.
    Exporting this status variable allows the kernel to rely on the effective
    value of module signature enforcement, whether it comes from the CONFIG
    value or the cmdline param.
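
    The resulting logic can be sketched in userspace C as follows. The
    names are assumptions made for illustration: MODULE_SIG_FORCE_BUILTIN
    stands in for the build-time option, parse_sig_enforce() is a
    hypothetical command-line parser, and the accessor name
    is_module_sig_enforced() is assumed rather than taken from the text
    above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for the build-time option; define as 1 to model a kernel
 * built with CONFIG_MODULE_SIG_FORCE. */
#ifndef MODULE_SIG_FORCE_BUILTIN
#define MODULE_SIG_FORCE_BUILTIN 0
#endif

/* Status variable: starts out as the build-time value. */
static bool sig_enforce = MODULE_SIG_FORCE_BUILTIN;

/* Hypothetical parser for "module.sig_enforce=<0|1>" on a command line.
 * The parameter only matters when the build-time option is off. */
static void parse_sig_enforce(const char *cmdline)
{
	const char *key = "module.sig_enforce=";
	const char *p;

	assert(cmdline != NULL);
	if (MODULE_SIG_FORCE_BUILTIN)
		return;	/* the build-time setting cannot be turned off */
	p = strstr(cmdline, key);
	sig_enforce = p != NULL && p[strlen(key)] == '1';
}

/* Accessor: callers see the effective value, not just the build-time
 * one. */
static bool is_module_sig_enforced(void)
{
	return sig_enforce;
}
```

    Code that checks only the build-time macro would miss the case where
    enforcement was switched on from the command line; going through the
    accessor avoids that.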

    Signed-off-by: Bruno E. O. Meneguele
    Signed-off-by: Mimi Zohar

    Bruno E. O. Meneguele
     

19 Oct, 2017

1 commit


06 Oct, 2017

2 commits

  • If function tracing is active when the module init functions are freed, then
    store them to be referenced by kallsyms. Module init functions can now be
    traced on module load, but without the saved symbols their trace entries
    were useless:

    ># echo ':mod:snd_seq' > set_ftrace_filter
    ># echo function > current_tracer
    ># modprobe snd_seq
    ># cat trace
    # tracer: function
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |
    modprobe-2786 [000] .... 3189.037874: 0xffffffffa0860000

    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |
    modprobe-2463 [002] .... 174.243237: alsa_seq_init

    Steven Rostedt (VMware)
     
  • Allow for module init sections to be traced as well as core kernel init
    sections. Now that filters for module functions can be stored until the
    modules are loaded, it makes sense to be able to trace them.

    Cc: Jessica Yu
    Cc: Rusty Russell
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)