07 Jun, 2020

1 commit

  • Pull integrity updates from Mimi Zohar:
    "The main changes are extending the TPM 2.0 PCR banks with bank
    specific file hashes, calculating the "boot_aggregate" based on other
    TPM PCR banks, using the default IMA hash algorithm, instead of SHA1,
    as the basis for the cache hash table key, and preventing the mprotect
    syscall from circumventing an IMA mmap appraise policy rule.

    - In preparation for extending TPM 2.0 PCR banks with bank specific
    digests, commit 0b6cf6b97b7e ("tpm: pass an array of
    tpm_extend_digest structures to tpm_pcr_extend()") modified
    tpm_pcr_extend(). The original SHA1 file digests were
    padded/truncated, before being extended into the other TPM PCR
    banks. This pull request calculates and extends the TPM PCR banks
    with bank specific file hashes completing the above change.

    - The "boot_aggregate", the first IMA measurement list record, is the
    "trusted boot" link between the pre-boot environment and the
    running OS. With TPM 2.0, the "boot_aggregate" record is not
    limited to being based on the SHA1 TPM PCR bank, but can be
    calculated based on any enabled bank, assuming the hash algorithm
    is also enabled in the kernel.

    Other changes include the following and five other bug fixes/code
    clean up:

    - supporting both a SHA1 and a larger "boot_aggregate" digest in a
    custom template format containing both the SHA1 ('d') and
    larger digest ('d-ng') fields.

    - Initial hash table key fix, but additional changes would be good"

    * tag 'integrity-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
    ima: Directly free *entry in ima_alloc_init_template() if digests is NULL
    ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()
    ima: Directly assign the ima_default_policy pointer to ima_rules
    ima: verify mprotect change is consistent with mmap policy
    evm: Fix possible memory leak in evm_calc_hmac_or_hash()
    ima: Set again build_ima_appraise variable
    ima: Remove redundant policy rule set in add_rules()
    ima: Fix ima digest hash table key calculation
    ima: Use ima_hash_algo for collision detection in the measurement list
    ima: Calculate and extend PCR with digests in ima_template_entry
    ima: Allocate and initialize tfm for each PCR bank
    ima: Switch to dynamically allocated buffer for template digests
    ima: Store template digest directly in ima_template_entry
    ima: Evaluate error in init_ima()
    ima: Switch to ima_hash_algo for boot aggregate

    Linus Torvalds
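
The difference between extending a bank with a padded/truncated SHA1 digest and extending it with a bank specific digest can be sketched as follows. This is a minimal model rather than the kernel code: the function names are hypothetical, the extend operation is modelled as new PCR = H(old PCR || digest), and hashing PCRs 0-7 for the "boot_aggregate" is an assumption of this sketch.

```python
import hashlib
from typing import Sequence


def pcr_extend(bank_alg: str, pcr_value: bytes, digest: bytes) -> bytes:
    """Model a TPM extend: new PCR = H(old PCR || digest)."""
    size = hashlib.new(bank_alg).digest_size
    if len(digest) != size:
        raise ValueError("digest length must match the bank's hash size")
    return hashlib.new(bank_alg, pcr_value + digest).digest()


def pad_or_truncate(digest: bytes, size: int) -> bytes:
    """Earlier behaviour: zero-pad or truncate a SHA1 digest to the bank size."""
    return digest[:size].ljust(size, b"\0")


def boot_aggregate(bank_alg: str, pcrs: Sequence[bytes]) -> bytes:
    """Hash the concatenated PCR values of one bank (PCRs 0-7 assumed here)."""
    return hashlib.new(bank_alg, b"".join(pcrs[:8])).digest()


data = b"example measurement data"  # hypothetical input
sha256_pcr = bytes(32)              # a PCR in the SHA256 bank, initially all zeros

# Padded SHA1 digest extended into the SHA256 bank.
old_style = pcr_extend(
    "sha256", sha256_pcr, pad_or_truncate(hashlib.sha1(data).digest(), 32)
)
# Bank specific digest extended into the SHA256 bank.
new_style = pcr_extend("sha256", sha256_pcr, hashlib.sha256(data).digest())
```

With bank specific digests, a verifier can replay the SHA256 bank using SHA256 values throughout instead of relying on zero-padded SHA1 values.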
     

05 Jun, 2020

7 commits

  • To support multiple template digests, the static array entry->digest has
    been replaced with a dynamically allocated array in commit aa724fe18a8a
    ("ima: Switch to dynamically allocated buffer for template digests"). The
    array is allocated in ima_alloc_init_template() and if the returned pointer
    is NULL, ima_free_template_entry() is called.

    However, (*entry)->template_desc is not yet initialized while it is used by
    ima_free_template_entry(). This patch fixes the issue by directly freeing
    *entry without calling ima_free_template_entry().

    Fixes: aa724fe18a8a ("ima: Switch to dynamically allocated buffer for template digests")
    Reported-by: syzbot+223310b454ba6b75974e@syzkaller.appspotmail.com
    Signed-off-by: Roberto Sassu
    Signed-off-by: Mimi Zohar

    Roberto Sassu
     
  • Merge yet more updates from Andrew Morton:

    - More MM work. 100ish more to go. Mike Rapoport's "mm: remove
    __ARCH_HAS_5LEVEL_HACK" series should fix the current ppc issue

    - Various other little subsystems

    * emailed patches from Andrew Morton : (127 commits)
    lib/ubsan.c: fix gcc-10 warnings
    tools/testing/selftests/vm: remove duplicate headers
    selftests: vm: pkeys: fix multilib builds for x86
    selftests: vm: pkeys: use the correct page size on powerpc
    selftests/vm/pkeys: override access right definitions on powerpc
    selftests/vm/pkeys: test correct behaviour of pkey-0
    selftests/vm/pkeys: introduce a sub-page allocator
    selftests/vm/pkeys: detect write violation on a mapped access-denied-key page
    selftests/vm/pkeys: associate key on a mapped page and detect write violation
    selftests/vm/pkeys: associate key on a mapped page and detect access violation
    selftests/vm/pkeys: improve checks to determine pkey support
    selftests/vm/pkeys: fix assertion in test_pkey_alloc_exhaust()
    selftests/vm/pkeys: fix number of reserved powerpc pkeys
    selftests/vm/pkeys: introduce powerpc support
    selftests/vm/pkeys: introduce generic pkey abstractions
    selftests: vm: pkeys: use the correct huge page size
    selftests/vm/pkeys: fix alloc_random_pkey() to make it really random
    selftests/vm/pkeys: fix assertion in pkey_disable_set/clear()
    selftests/vm/pkeys: fix pkey_disable_clear()
    selftests: vm: pkeys: add helpers for pkey bits
    ...

    Linus Torvalds
     
  • For a kvmalloc'ed data object that contains sensitive information like
    cryptographic keys, we need to make sure that the buffer is always cleared
    before freeing it. Using memset() alone for buffer clearing may not
    provide certainty as the compiler may compile it away. To be sure, the
    special memzero_explicit() has to be used.

    This patch introduces a new kvfree_sensitive() for freeing those sensitive
    data objects allocated by kvmalloc(). The relevant places where
    kvfree_sensitive() can be used are modified to use it.

    Fixes: 4f0882491a14 ("KEYS: Avoid false positive ENOMEM error on key read")
    Suggested-by: Linus Torvalds
    Signed-off-by: Waiman Long
    Signed-off-by: Andrew Morton
    Reviewed-by: Eric Biggers
    Acked-by: David Howells
    Cc: Jarkko Sakkinen
    Cc: James Morris
    Cc: "Serge E. Hallyn"
    Cc: Joe Perches
    Cc: Matthew Wilcox
    Cc: David Rientjes
    Cc: Uladzislau Rezki
    Link: http://lkml.kernel.org/r/20200407200318.11711-1-longman@redhat.com
    Signed-off-by: Linus Torvalds

    Waiman Long
     
  • Pull execve updates from Eric Biederman:
    "Last cycle for the Nth time I ran into bugs and quality of
    implementation issues related to exec that could not easily be
    fixed because of the way exec is implemented. So I have been digging
    into exec and cleaning up what I can.

    I don't think I have exec sorted out enough to fix the issues I
    started with but I have made some headway this cycle with 4 sets of
    changes.

    - promised cleanups after introducing exec_update_mutex

    - trivial cleanups for exec

    - control flow simplifications

    - remove the recomputation of bprm->cred

    The net result is code that is a bit easier to understand and work
    with and a decrease in the number of lines of code (if you don't count
    the added tests)"

    * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (24 commits)
    exec: Compute file based creds only once
    exec: Add a per bprm->file version of per_clear
    binfmt_elf_fdpic: fix execfd build regression
    selftests/exec: Add binfmt_script regression test
    exec: Remove recursion from search_binary_handler
    exec: Generic execfd support
    exec/binfmt_script: Don't modify bprm->buf and then return -ENOEXEC
    exec: Move the call of prepare_binprm into search_binary_handler
    exec: Allow load_misc_binary to call prepare_binprm unconditionally
    exec: Convert security_bprm_set_creds into security_bprm_repopulate_creds
    exec: Factor security_bprm_creds_for_exec out of security_bprm_set_creds
    exec: Teach prepare_exec_creds how exec treats uids & gids
    exec: Set the point of no return sooner
    exec: Move handling of the point of no return to the top level
    exec: Run sync_mm_rss before taking exec_update_mutex
    exec: Fix spelling of search_binary_handler in a comment
    exec: Move the comment from above de_thread to above unshare_sighand
    exec: Rename flush_old_exec begin_new_exec
    exec: Move most of setup_new_exec into flush_old_exec
    exec: In setup_new_exec cache current in the local variable me
    ...

    Linus Torvalds
     
  • Pull proc updates from Eric Biederman:
    "This has four sets of changes:

    - modernize proc to support multiple private instances

    - ensure we see the exit of each process tid exactly once

    - remove has_group_leader_pid

    - use pids not tasks in posix-cpu-timers lookup

    Alexey updated proc so each mount of proc uses a new superblock. This
    allows people to actually use mount options with proc with no fear of
    messing up another mount of proc. Given the kernel's internal mounts
    of proc for things like uml this was a real problem, and resulted in
    Android's hidepid mount options being ignored and introducing security
    issues.

    The rest of the changes are small cleanups and fixes that came out of
    my work to allow this change to proc. In essence it is swapping the
    pids in de_thread during exec which removes a special case the code
    had to handle. Then updating the code to stop handling that special
    case"

    * 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    proc: proc_pid_ns takes super_block as an argument
    remove the no longer needed pid_alive() check in __task_pid_nr_ns()
    posix-cpu-timers: Replace __get_task_for_clock with pid_for_clock
    posix-cpu-timers: Replace cpu_timer_pid_type with clock_pid_type
    posix-cpu-timers: Extend rcu_read_lock removing task_struct references
    signal: Remove has_group_leader_pid
    exec: Remove BUG_ON(has_group_leader_pid)
    posix-cpu-timer: Unify the now redundant code in lookup_task
    posix-cpu-timer: Tidy up group_leader logic in lookup_task
    proc: Ensure we see the exit of each process tid exactly once
    rculist: Add hlists_swap_heads_rcu
    proc: Use PIDTYPE_TGID in next_tgid
    Use proc_pid_ns() to get pid_namespace from the proc superblock
    proc: use named enums for better readability
    proc: use human-readable values for hidepid
    docs: proc: add documentation for "hidepid=4" and "subset=pid" options and new mount behavior
    proc: add option to mount only a pids subset
    proc: instantiate only pids that we can ptrace on 'hidepid=4' mount option
    proc: allow to mount many instances of proc in one pid namespace
    proc: rename struct proc_fs_info to proc_fs_opts

    Linus Torvalds
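
The human-readable hidepid values can be illustrated with a small parser. This is a sketch only: the name-to-number mapping (off=0, noaccess=1, invisible=2, ptraceable=4) reflects my understanding of the documentation added by this series and should be treated as an assumption, and parse_hidepid() is a hypothetical helper, not a kernel function.

```python
# Assumed mapping of named hidepid values to their numeric equivalents.
HIDEPID_VALUES = {
    "off": 0,
    "noaccess": 1,
    "invisible": 2,
    "ptraceable": 4,
}


def parse_hidepid(value: str) -> int:
    """Accept either a name or a number and return the numeric value."""
    if value in HIDEPID_VALUES:
        return HIDEPID_VALUES[value]
    try:
        number = int(value)
    except ValueError:
        raise ValueError(f"unknown hidepid value: {value!r}") from None
    if number not in HIDEPID_VALUES.values():
        raise ValueError(f"unsupported hidepid value: {number}")
    return number
```

Because each mount of proc now has its own superblock, an option such as hidepid=invisible on one mount no longer affects other mounts in the same pid namespace.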
     
  • Pull smack updates from Casey Schaufler:
    "Clean out dead code and repair an out-of-bounds warning"

    * tag 'Smack-for-5.8' of git://github.com/cschaufler/smack-next:
    Smack: Remove unused inline function smk_ad_setfield_u_fs_path_mnt
    Smack:- Remove redundant inode_smack cache
    Smack:- Remove mutex lock "smk_lock" from inode_smack
    Smack: slab-out-of-bounds in vsscanf
    smack: remove redundant structure variable from header.
    smack: avoid unused 'sip' variable warning

    Linus Torvalds
     
  • Pull keyring updates from David Howells:

    - Fix a documentation warning.

    - Replace a zero-length array with a flexible one

    - Make the big_key key type use ChaCha20Poly1305 and use the crypto
    algorithm directly rather than going through the crypto layer.

    - Implement the update op for the big_key type.

    * tag 'keys-next-20200602' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Implement update for the big_key type
    security/keys: rewrite big_key crypto to use library interface
    KEYS: Replace zero-length array with flexible-array
    Documentation: security: core.rst: add missing argument

    Linus Torvalds
     

04 Jun, 2020

3 commits

  • Pull networking updates from David Miller:

    1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

    2) Add GSO partial support to igc, from Sasha Neftin.

    3) Several cleanups and improvements to r8169 from Heiner Kallweit.

    4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

    5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

    6) Support GRO via gro_cells in DSA layer, from Alexander Lobakin.

    7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

    8) Add sriov and vf support to hinic, from Luo bin.

    9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

    10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

    11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

    12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

    13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

    14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

    15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

    16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

    17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

    18) Several RISCV bpf jit optimizations, from Luke Nelson.

    19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

    20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

    21) Add BPF iterators, from Yonghong Song.

    22) Add cable test infrastructure, including ethtool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

    23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

    24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

    25) Add CAP_BPF, from Alexei Starovoitov.

    26) Support terse dumps in the packet scheduler, from Vlad Buslov.

    27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

    28) Add devm_register_netdev(), from Bartosz Golaszewski.

    29) Minimize qdisc resets, from Cong Wang.

    30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
    selftests: net: ip_defrag: ignore EPERM
    net_failover: fixed rollback in net_failover_open()
    Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
    Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
    vmxnet3: allow rx flow hash ops only when rss is enabled
    hinic: add set_channels ethtool_ops support
    selftests/bpf: Add a default $(CXX) value
    tools/bpf: Don't use $(COMPILE.c)
    bpf, selftests: Use bpf_probe_read_kernel
    s390/bpf: Use bcr 0,%0 as tail call nop filler
    s390/bpf: Maintain 8-byte stack alignment
    selftests/bpf: Fix verifier test
    selftests/bpf: Fix sample_cnt shared between two threads
    bpf, selftests: Adapt cls_redirect to call csum_level helper
    bpf: Add csum_level helper for fixing up csum levels
    bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
    sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
    crypto/chtls: IPv6 support for inline TLS
    Crypto/chcr: Fixes a coccinile check error
    Crypto/chcr: Fixes compilations warnings
    ...

    Linus Torvalds
     
  • If the template field 'd' is chosen and the digest to be added to the
    measurement entry was not calculated with SHA1 or MD5, it is
    recalculated with SHA1, by using the passed file descriptor. However, this
    cannot be done for boot_aggregate, because there is no file descriptor.

    This patch adds a call to ima_calc_boot_aggregate() in
    ima_eventdigest_init(), so that the digest can be recalculated also for the
    boot_aggregate entry.

    Cc: stable@vger.kernel.org # 3.13.x
    Fixes: 3ce1217d6cd5d ("ima: define template fields library and new helpers")
    Reported-by: Takashi Iwai
    Signed-off-by: Roberto Sassu
    Signed-off-by: Mimi Zohar

    Roberto Sassu
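
The fallback described above can be sketched roughly as follows. The function and parameter names are hypothetical, and the logic is a simplified model of the described behaviour, not the kernel implementation.

```python
import hashlib
from typing import Callable, Optional

# Algorithms whose digests fit the legacy 'd' field without recalculation.
LEGACY_ALGOS = {"sha1", "md5"}


def eventdigest_d(
    algo: str,
    digest: bytes,
    file_data: Optional[bytes] = None,
    recalc_boot_aggregate: Optional[Callable[[str], bytes]] = None,
) -> bytes:
    """Return the digest to store in the 'd' field."""
    if algo in LEGACY_ALGOS:
        return digest
    if file_data is not None:
        # Regular entries: recalculate with SHA1 from the file contents.
        return hashlib.sha1(file_data).digest()
    if recalc_boot_aggregate is not None:
        # boot_aggregate has no file, so ask for a SHA1 boot aggregate instead.
        return recalc_boot_aggregate("sha1")
    raise ValueError("no way to recalculate the digest with SHA1")
```

The recalc_boot_aggregate callback stands in for the call to ima_calc_boot_aggregate() that the patch adds.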
     
  • This patch prevents the following oops:

    [ 10.771813] BUG: kernel NULL pointer dereference, address: 0000000000000
    [...]
    [ 10.779790] RIP: 0010:ima_match_policy+0xf7/0xb80
    [...]
    [ 10.798576] Call Trace:
    [ 10.798993] ? ima_lsm_policy_change+0x2b0/0x2b0
    [ 10.799753] ? inode_init_owner+0x1a0/0x1a0
    [ 10.800484] ? _raw_spin_lock+0x7a/0xd0
    [ 10.801592] ima_must_appraise.part.0+0xb6/0xf0
    [ 10.802313] ? ima_fix_xattr.isra.0+0xd0/0xd0
    [ 10.803167] ima_must_appraise+0x4f/0x70
    [ 10.804004] ima_post_path_mknod+0x2e/0x80
    [ 10.804800] do_mknodat+0x396/0x3c0

    It occurs when there is a failure during IMA initialization, and
    ima_init_policy() is not called. IMA hooks still call ima_match_policy()
    but ima_rules is NULL. This patch prevents the crash by directly assigning
    the ima_default_policy pointer to ima_rules when ima_rules is defined. This
    wouldn't alter the existing behavior, as ima_rules is always set at the end
    of ima_init_policy().

    Cc: stable@vger.kernel.org # 3.7.x
    Fixes: 07f6a79415d7d ("ima: add appraise action keywords and default rules")
    Reported-by: Takashi Iwai
    Signed-off-by: Roberto Sassu
    Signed-off-by: Mimi Zohar

    Roberto Sassu
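
The idea behind the fix can be shown with a small model. The names below are hypothetical stand-ins (DEFAULT_POLICY for ima_default_policy, PolicyState.rules for ima_rules), and a Python TypeError plays the role of the NULL pointer dereference.

```python
from typing import Callable, List, Optional

Rule = Callable[[str], bool]

# Stand-in for ima_default_policy: an empty list until policy init fills it.
DEFAULT_POLICY: List[Rule] = []


class PolicyState:
    def __init__(self, assign_default_early: bool) -> None:
        # With the fix, rules points at the default list from the start.
        # Without it, rules stays unset until policy initialization runs.
        self.rules: Optional[List[Rule]] = (
            DEFAULT_POLICY if assign_default_early else None
        )

    def match(self, event: str) -> bool:
        # Walking an unset rule list fails, like the oops above.
        return any(rule(event) for rule in self.rules)


fixed = PolicyState(assign_default_early=True)
result = fixed.match("mknod")  # no rules loaded, so nothing matches
```

If initialization fails before any rules are loaded, hooks that consult the policy see an empty list rather than an invalid pointer.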
     

03 Jun, 2020

5 commits

  • Pull lockdown update from James Morris:
    "An update for the security subsystem to allow unprivileged users
    to see the status of the lockdown feature. From Jeremy Cline"

    Also an added comment to describe CAP_SETFCAP.

    * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    capabilities: add description for CAP_SETFCAP
    lockdown: Allow unprivileged users to see lockdown status

    Linus Torvalds
     
  • Pull SELinux updates from Paul Moore:
    "The highlights:

    - A number of improvements to various SELinux internal data
    structures to help improve performance. We move the role
    transitions into a hash table. In the context structure we shift
    from hashing the context string (aka SELinux label) to the
    structure itself, when it is valid. This last change not only
    offers a speedup, but it helps us simplify the code some as well.

    - Add a new SELinux policy version which allows for a more space
    efficient way of storing the filename transitions in the binary
    policy. Given the default Fedora SELinux policy with the unconfined
    module enabled, this change drops the policy size from ~7.6MB to
    ~3.3MB. The kernel policy load time dropped as well.

    - Some fixes to the error handling code in the policy parser to
    properly return error codes when things go wrong"

    * tag 'selinux-pr-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
    selinux: netlabel: Remove unused inline function
    selinux: do not allocate hashtabs dynamically
    selinux: fix return value on error in policydb_read()
    selinux: simplify range_write()
    selinux: fix error return code in policydb_read()
    selinux: don't produce incorrect filename_trans_count
    selinux: implement new format of filename transitions
    selinux: move context hashing under sidtab
    selinux: hash context structure directly
    selinux: store role transitions in a hash table
    selinux: drop unnecessary smp_load_acquire() call
    selinux: fix warning Comparison to bool

    Linus Torvalds
     
  • Pull tomoyo update from Tetsuo Handa:
    "One patch for suppressing coccicheck's warning"

    * tag 'tomoyo-pr-20200601' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
    tomoyo: use true for bool variable

    Linus Torvalds
     
  • Implement the ->update op for the big_key type.

    Signed-off-by: David Howells
    Acked-by: Jason A. Donenfeld

    David Howells
     
  • A while back, I noticed that the crypto and crypto API usage in big_keys
    were entirely broken in multiple ways, so I rewrote it. Now, I'm
    rewriting it again, but this time using the simpler ChaCha20Poly1305
    library function. This makes the file considerably simpler; the
    diffstat alone should justify this commit. It also should be faster,
    since it no longer requires a mutex around the "aead api object" (nor
    allocations), allowing us to encrypt multiple items in parallel. We also
    benefit from being able to pass any type of pointer, so we can get rid
    of the ridiculously complex custom page allocator that big_key really
    doesn't need.

    [DH: Change the select CRYPTO_LIB_CHACHA20POLY1305 to a depends on as
    select doesn't propagate and the build can end up with an =y depending
    on some =m pieces.

    The depends on CRYPTO also had to be removed otherwise the configurator
    complains about a recursive dependency.]

    Cc: Andy Lutomirski
    Cc: Greg KH
    Cc: Linus Torvalds
    Cc: kernel-hardening@lists.openwall.com
    Reviewed-by: Eric Biggers
    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David Howells

    Jason A. Donenfeld
     

02 Jun, 2020

3 commits

  • Pull uaccess/access_ok updates from Al Viro:
    "Removals of trivially pointless access_ok() calls.

    Note: the fiemap stuff was removed from the series, since they are
    duplicates with part of ext4 series carried in Ted's tree"

    * 'uaccess.access_ok' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vmci_host: get rid of pointless access_ok()
    hfi1: get rid of pointless access_ok()
    usb: get rid of pointless access_ok() calls
    lpfc_debugfs: get rid of pointless access_ok()
    efi_test: get rid of pointless access_ok()
    drm_read(): get rid of pointless access_ok()
    via-pmu: don't bother with access_ok()
    drivers/crypto/ccp/sev-dev.c: get rid of pointless access_ok()
    omapfb: get rid of pointless access_ok() calls
    amifb: get rid of pointless access_ok() calls
    drivers/fpga/dfl-afu-dma-region.c: get rid of pointless access_ok()
    drivers/fpga/dfl-fme-pr.c: get rid of pointless access_ok()
    cm4000_cs.c cmm_ioctl(): get rid of pointless access_ok()
    nvram: drop useless access_ok()
    n_hdlc_tty_read(): remove pointless access_ok()
    tomoyo_write_control(): get rid of pointless access_ok()
    btrfs_ioctl_send(): don't bother with access_ok()
    fat_dir_ioctl(): hadn't needed that access_ok() for more than a decade...
    dlmfs_file_write(): get rid of pointless access_ok()

    Linus Torvalds
     
  • Pull perf updates from Ingo Molnar:
    "Kernel side changes:

    - Add AMD Fam17h RAPL support

    - Introduce CAP_PERFMON to kernel and user space

    - Add Zhaoxin CPU support

    - Misc fixes and cleanups

    Tooling changes:

    - perf record:

    Introduce '--switch-output-event' to use arbitrary events to be
    setup and read from a side band thread and, when they take place a
    signal be sent to the main 'perf record' thread, reusing the core
    for '--switch-output' to take perf.data snapshots from the ring
    buffer used for '--overwrite', e.g.:

    # perf record --overwrite -e sched:* \
    --switch-output-event syscalls:*connect* \
    workload

    will take perf.data.YYYYMMDDHHMMSS snapshots up to around the
    connect syscalls.

    Add '--num-synthesize-threads' option to control degree of
    parallelism of the synthesize_mmap() code which is scanning
    /proc/PID/task/PID/maps and can be time consuming. This mimics
    pre-existing behaviour in 'perf top'.

    - perf bench:

    Add a multi-threaded synthesize benchmark and kallsyms parsing
    benchmark.

    - Intel PT support:

    Stitch LBR records from multiple samples to get deeper backtraces,
    there are caveats, see the csets for details.

    Allow using Intel PT to synthesize callchains for regular events.

    Add support for synthesizing branch stacks for regular events
    (cycles, instructions, etc) from Intel PT data.

    Misc changes:

    - Updated perf vendor events for power9 and Coresight.

    - Add flamegraph.py script via 'perf flamegraph'

    - Misc other changes, fixes and cleanups - see the Git log for details

    Also, since over the last couple of years perf tooling has matured and
    decoupled from the kernel perf changes to a large degree, going
    forward Arnaldo is going to send perf tooling changes via direct pull
    requests"

    * tag 'perf-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (163 commits)
    perf/x86/rapl: Add AMD Fam17h RAPL support
    perf/x86/rapl: Make perf_probe_msr() more robust and flexible
    perf/x86/rapl: Flip logic on default events visibility
    perf/x86/rapl: Refactor to share the RAPL code between Intel and AMD CPUs
    perf/x86/rapl: Move RAPL support to common x86 code
    perf/core: Replace zero-length array with flexible-array
    perf/x86: Replace zero-length array with flexible-array
    perf/x86/intel: Add more available bits for OFFCORE_RESPONSE of Intel Tremont
    perf/x86/rapl: Add Ice Lake RAPL support
    perf flamegraph: Use /bin/bash for report and record scripts
    perf cs-etm: Move definition of 'traceid_list' global variable from header file
    libsymbols kallsyms: Move hex2u64 out of header
    libsymbols kallsyms: Parse using io api
    perf bench: Add kallsyms parsing
    perf: cs-etm: Update to build with latest opencsd version.
    perf symbol: Fix kernel symbol address display
    perf inject: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
    perf annotate: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
    perf trace: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
    perf script: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
    ...

    Linus Torvalds
     
  • Pull crypto updates from Herbert Xu:
    "API:
    - Introduce crypto_shash_tfm_digest() and use it wherever possible.
    - Fix use-after-free and race in crypto_spawn_alg.
    - Add support for parallel and batch requests to crypto_engine.

    Algorithms:
    - Update jitter RNG for SP800-90B compliance.
    - Always use jitter RNG as seed in drbg.

    Drivers:
    - Add Arm CryptoCell driver cctrng.
    - Add support for SEV-ES to the PSP driver in ccp"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (114 commits)
    crypto: hisilicon - fix driver compatibility issue with different versions of devices
    crypto: engine - do not requeue in case of fatal error
    crypto: cavium/nitrox - Fix a typo in a comment
    crypto: hisilicon/qm - change debugfs file name from qm_regs to regs
    crypto: hisilicon/qm - add DebugFS for xQC and xQE dump
    crypto: hisilicon/zip - add debugfs for Hisilicon ZIP
    crypto: hisilicon/hpre - add debugfs for Hisilicon HPRE
    crypto: hisilicon/sec2 - add debugfs for Hisilicon SEC
    crypto: hisilicon/qm - add debugfs to the QM state machine
    crypto: hisilicon/qm - add debugfs for QM
    crypto: stm32/crc32 - protect from concurrent accesses
    crypto: stm32/crc32 - don't sleep in runtime pm
    crypto: stm32/crc32 - fix multi-instance
    crypto: stm32/crc32 - fix run-time self test issue.
    crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
    crypto: hisilicon/zip - Use temporary sqe when doing work
    crypto: hisilicon - add device error report through abnormal irq
    crypto: hisilicon - remove codes of directly report device errors through MSI
    crypto: hisilicon - QM memory management optimization
    crypto: hisilicon - unify initial value assignment into QM
    ...

    Linus Torvalds
     

01 Jun, 2020

1 commit

  • xdp_umem.c had overlapping changes between the 64-bit math fix
    for the calculation of npgs and the removal of the zerocopy
    memory type which got rid of the chunk_size_nohdr member.

    The mlx5 Kconfig conflict is a case where we just take the
    net-next copy of the Kconfig entry dependency as it takes on
    the ESWITCH dependency by one level of indirection which is
    what the 'net' conflicting change is trying to ensure.

    Signed-off-by: David S. Miller

    David S. Miller
     

30 May, 2020

2 commits

  • Move the computation of creds from prepare_binfmt into begin_new_exec
    so that the creds need only be computed once. This is just code
    reorganization; no semantic changes of any kind are made.

    Moving the computation is safe. I have looked through the kernel and
    verified none of the binfmts look at bprm->cred directly, and that
    there are no helpers that look at bprm->cred indirectly. Which means
    that it is not a problem to compute the bprm->cred later in the
    execution flow as it is not used until it becomes current->cred.

    A new function bprm_creds_from_file is added to contain the work that
    needs to be done. bprm_creds_from_file first determines which file,
    bprm->executable or more likely bprm->file, the bprm->cred
    will be computed from.

    The function bprm_fill_uid is updated to receive the file instead of
    accessing bprm->file. The now unnecessary work needed to reset the
    bprm->cred->euid, and bprm->cred->egid is removed from bprm_fill_uid.
    A small comment is added to document that bprm_fill_uid now only deals
    with the work to handle suid and sgid files. The default case is already
    handled by prepare_exec_creds.

    The function security_bprm_repopulate_creds is renamed
    security_bprm_creds_from_file and now is explicitly passed the file
    from which to compute the creds. The documentation of the
    bprm_creds_from_file security hook is updated to explain when the hook
    is called and what it needs to do. The file is passed from
    cap_bprm_creds_from_file into get_file_caps so that the caps are
    computed for the appropriate file. The now unnecessary work in
    cap_bprm_creds_from_file to reset the ambient capabilities has been
    removed. A small comment to document that the work of
    cap_bprm_creds_from_file is to read capabilities from the file's
    security attribute and derive capabilities from the fact the
    user had uid 0 has been added.

    Reviewed-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • There is a small bug in the code that recomputes parts of bprm->cred
    for every bprm->file. The code never recomputes the part of
    clear_dangerous_personality_flags it is responsible for.

    Which means that in practice if someone creates a sgid script
    the interpreter will not be able to use any of:
    READ_IMPLIES_EXEC
    ADDR_NO_RANDOMIZE
    ADDR_COMPAT_LAYOUT
    MMAP_PAGE_ZERO.

    This accidental clearing of personality flags probably does
    not matter in practice because no one has complained,
    but it does make the code more difficult to understand.

    Further, remaining bug-compatible prevents the recomputation from being
    removed and replaced by simply computing bprm->cred once from the
    final bprm->file.

    Making this change removes the last behavior difference between
    computing bprm->cred from the final file and recomputing
    bprm->cred several times. This allows the behavior change
    to be justified for its own reasons, and allows any bug hunts
    looking into why the behavior changed to wind up here instead
    of in the code that will follow that computes bprm->cred
    from the final bprm->file.

    This small logic bug appears to have existed since the code
    started clearing dangerous personality bits.

    History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
    Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support")
    Reviewed-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

29 May, 2020

1 commit


28 May, 2020

4 commits

  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • This is a bug fix and one of two places where I have found that the
    result of calling security_bprm_repopulate_creds more than once on
    different bprm->files depends on all of the bprm->files not just the
    file bprm->file.

    I intend to fix both of those cases and then modify the code to
    only call security_bprm_repopulate_creds on the final bprm file.

    So merge this change in so I hopefully reduce conflicts for others
    and I make it possible to build on top of this change.

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Pull cgroup fixes from Tejun Heo:

    - Reverted stricter synchronization for cgroup recursive stats which
    was prepping it for event counter usage which never got merged. The
    change was causing performance regressions in some cases.

    - Restore bpf-based device-cgroup operation even when cgroup1 device
    cgroup is disabled.

    - An out-param init fix.

    * 'for-5.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    device_cgroup: Cleanup cgroup eBPF device filter code
    xattr: fix uninitialized out-param
    Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"

    Linus Torvalds
     
  • Pull execve fix from Eric Biederman:
    "While working on my exec cleanups I found a bug in exec that winds up
    miscomputing the ambient credentials during exec. Andy appears to have
    been confused as to why credentials are computed for both the
    script and the interpreter.

    From the original patch description:

    [3] Linux very confusingly processes both the script and the
    interpreter if applicable, for reasons that elude me. The results
    from thinking about a script's file capabilities and/or setuid
    bits are mostly discarded.

    The only value in struct cred that gets changed in cap_bprm_set_creds
    that I could find that might persist between the script and the
    interpreter was cap_ambient. Which is fixed with this trivial change"

    * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    exec: Always set cap_ambient in cap_bprm_set_creds

    Linus Torvalds
     

27 May, 2020

1 commit

  • An invariant of cap_bprm_set_creds is that every field in the new cred
    structure that cap_bprm_set_creds might set, needs to be set every
    time to ensure the field does not get a stale value.

    The field cap_ambient is not set every time cap_bprm_set_creds is
    called, which means that if there is a suid or sgid script with an
    interpreter that has neither the suid nor the sgid bits set the
    interpreter should be able to accept ambient credentials.
    Unfortunately because cap_ambient is not reset to its original value
    the interpreter can not accept ambient credentials.

    Given that the ambient capability set is expected to be controlled by
    the caller, I don't think this is particularly serious. But it is
    definitely worth fixing so the code works correctly.

    I have tested to verify my reading of the code is correct and the
    interpreter of a sgid script can receive ambient capabilities with this
    change and cannot receive ambient capabilities without this change.

    Cc: stable@vger.kernel.org
    Cc: Andy Lutomirski
    Fixes: 58319057b784 ("capabilities: ambient capabilities")
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
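
The invariant described in the commit above can be illustrated with a small userspace model. This is only a sketch: `struct model_cred`, `model_set_creds_buggy()`, `model_set_creds_fixed()` and `model_exec_script()` are hypothetical names invented for illustration, and the model assumes that the hook clears the ambient set for a suid/sgid file and is called once for the script and once for the interpreter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct cred discussed above. */
struct model_cred {
    unsigned long cap_ambient;
};

typedef void (*model_hook_fn)(struct model_cred *new_cred,
                              const struct model_cred *old_cred,
                              bool file_is_privileged);

/* Buggy variant: cap_ambient is only written for a privileged file, so a
 * value cleared for the script is still cleared for the interpreter. */
static void model_set_creds_buggy(struct model_cred *new_cred,
                                  const struct model_cred *old_cred,
                                  bool file_is_privileged)
{
    assert(new_cred && old_cred);
    if (file_is_privileged)
        new_cred->cap_ambient = 0;
}

/* Fixed variant: the field is set on every call, starting from the
 * caller's original value, so no stale value can survive. */
static void model_set_creds_fixed(struct model_cred *new_cred,
                                  const struct model_cred *old_cred,
                                  bool file_is_privileged)
{
    assert(new_cred && old_cred);
    new_cred->cap_ambient = old_cred->cap_ambient;
    if (file_is_privileged)
        new_cred->cap_ambient = 0;
}

/* Run the hook for the script and then for its interpreter, and return
 * the ambient set the interpreter ends up with. */
static unsigned long model_exec_script(model_hook_fn hook,
                                       unsigned long caller_ambient,
                                       bool script_is_privileged,
                                       bool interp_is_privileged)
{
    struct model_cred old_cred = { caller_ambient };
    struct model_cred new_cred = old_cred;

    hook(&new_cred, &old_cred, script_is_privileged);
    hook(&new_cred, &old_cred, interp_is_privileged);
    return new_cred.cap_ambient;
}
```

In this model, a sgid script with an unprivileged interpreter loses the caller's ambient set under the buggy variant and keeps it under the fixed one.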
     

25 May, 2020

1 commit


24 May, 2020

1 commit

  • Pull networking fixes from David Miller:

    1) Fix RCU warnings in ipv6 multicast router code, from Madhuparna
    Bhowmik.

    2) Nexthop attributes aren't being checked properly because of
    mis-initialized iterator, from David Ahern.

    3) Revert ip_idents_reserve() change as it caused performance
    regressions and was just working around what is really a UBSAN bug
    in the compiler. From Yuqi Jin.

    4) Read MAC address properly from ROM in bmac driver (double iteration
    proceeds past end of address array), from Jeremy Kerr.

    5) Add Microsoft Surface device IDs to r8152, from Marc Payne.

    6) Prevent reference to freed SKB in __netif_receive_skb_core(), from
    Boris Sukholitko.

    7) Fix ACK discard behavior in rxrpc, from David Howells.

    8) Preserve flow hash across packet scrubbing in wireguard, from Jason
    A. Donenfeld.

    9) Cap option length properly for SO_BINDTODEVICE in AX25, from Eric
    Dumazet.

    10) Fix encryption error checking in kTLS code, from Vadim Fedorenko.

    11) Missing BPF prog ref release in flow dissector, from Jakub Sitnicki.

    12) dst_cache must be used with BH disabled in tipc, from Eric Dumazet.

    13) Fix use after free in mlxsw driver, from Jiri Pirko.

    14) Order kTLS key destruction properly in mlx5 driver, from Tariq
    Toukan.

    15) Check devm_platform_ioremap_resource() return value properly in
    several drivers, from Tiezhu Yang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
    net: smsc911x: Fix runtime PM imbalance on error
    net/mlx4_core: fix a memory leak bug.
    net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
    net: phy: mscc: fix initialization of the MACsec protocol mode
    net: stmmac: don't attach interface until resume finishes
    net: Fix return value about devm_platform_ioremap_resource()
    net/mlx5: Fix error flow in case of function_setup failure
    net/mlx5e: CT: Correctly get flow rule
    net/mlx5e: Update netdev txq on completions during closure
    net/mlx5: Annotate mutex destroy for root ns
    net/mlx5: Don't maintain a case of del_sw_func being null
    net/mlx5: Fix cleaning unmanaged flow tables
    net/mlx5: Fix memory leak in mlx5_events_init
    net/mlx5e: Fix inner tirs handling
    net/mlx5e: kTLS, Destroy key object after destroying the TIS
    net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
    net/mlx5: Avoid processing commands before cmdif is ready
    net/mlx5: Fix a race when moving command interface to events mode
    net/mlx5: Add command entry handling completion
    rxrpc: Fix a memory leak in rxkad_verify_response()
    ...

    Linus Torvalds
     

23 May, 2020

1 commit

  • Files can be mmap'ed read/write and later changed to execute to circumvent
    IMA's mmap appraise policy rules. Due to locking issues (mmap semaphore
    would be taken prior to i_mutex), files can not be measured or appraised at
    this point. Eliminate this integrity gap, by denying the mprotect
    PROT_EXEC change, if an mmap appraise policy rule exists.

    On mprotect change success, return 0. On failure, return -EACCES.

    Reviewed-by: Lakshmi Ramasubramanian
    Signed-off-by: Mimi Zohar

    Mimi Zohar
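
The decision described in the commit above can be sketched as a pure function. This is a simplified userspace model, not the kernel's implementation: `model_mprotect_check()` and its parameters are hypothetical, and the real check also has to look at the mapping's backing file and the loaded policy.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>

/*
 * Hypothetical model of the mprotect check.
 *
 * mapping_is_executable:     the mapping already has execute permission
 * new_prot:                  protection bits requested via mprotect()
 * mmap_appraise_rule_exists: an mmap appraise policy rule applies to the file
 *
 * Returns 0 if the change is allowed, -EACCES if it is denied.
 */
static int model_mprotect_check(bool mapping_is_executable, int new_prot,
                                bool mmap_appraise_rule_exists)
{
    /* Only a change that newly adds execute permission is of interest. */
    if (mapping_is_executable || !(new_prot & PROT_EXEC))
        return 0;

    /* The file cannot be measured or appraised here, so deny instead. */
    if (mmap_appraise_rule_exists)
        return -EACCES;

    return 0;
}
```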
     

22 May, 2020

3 commits

  • In the implementation of aa_audit_rule_init(), when aa_label_parse()
    fails the allocated memory for rule is released using
    aa_audit_rule_free(). But after this release, the return statement
    tries to access the label field of the rule which results in
    use-after-free. Before releasing the rule, copy errNo and return it
    after release.

    Fixes: 52e8c38001d8 ("apparmor: Fix memory leak of rule on error exit path")
    Signed-off-by: Navid Emamdoost
    Signed-off-by: John Johansen

    Navid Emamdoost
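
The pattern behind the fix above (copy the error out of the object before freeing it) can be sketched in userspace C. The names `struct model_rule`, `model_rule_free()` and `model_rule_init()` are hypothetical and only mirror the shape of the problem; they are not the AppArmor functions.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical rule object; label_err stands in for an error value that
 * the parse step stores inside the rule. */
struct model_rule {
    int label_err;
};

static void model_rule_free(struct model_rule *rule)
{
    free(rule);
}

/* parse_err simulates the result of parsing the label (0 on success). */
static int model_rule_init(int parse_err, struct model_rule **out)
{
    struct model_rule *rule = calloc(1, sizeof(*rule));

    if (!rule)
        return -ENOMEM;

    rule->label_err = parse_err;
    if (rule->label_err) {
        /* Copy the error first: reading rule->label_err after the free
         * below would be a use-after-free. */
        int err = rule->label_err;

        model_rule_free(rule);
        return err;
    }

    *out = rule;
    return 0;
}
```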
     
  • policy_update() invokes begin_current_label_crit_section(), which
    returns a reference of the updated aa_label object to "label" with
    increased refcount.

    When policy_update() returns, "label" becomes invalid, so the refcount
    should be decreased to keep refcount balanced.

    The reference counting issue happens in one exception handling path of
    policy_update(). When aa_may_manage_policy() returns not NULL, the
    refcnt increased by begin_current_label_crit_section() is not decreased,
    causing a refcnt leak.

    Fix this issue by jumping to "end_section" label when
    aa_may_manage_policy() returns not NULL.

    Fixes: 5ac8c355ae00 ("apparmor: allow introspecting the loaded policy pre internal transform")
    Signed-off-by: Xiyu Yang
    Signed-off-by: Xin Tan
    Signed-off-by: John Johansen

    Xiyu Yang
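
The fix above follows the common single-exit cleanup pattern: every path taken after the reference is acquired leaves through the label that releases it. A minimal userspace sketch, with hypothetical names (`struct model_label`, `model_begin_section()`, `model_end_section()`, `model_policy_update()`) standing in for the AppArmor functions:

```c
#include <assert.h>

/* Hypothetical refcounted object. */
struct model_label {
    int refcount;
};

/* Take a reference for the duration of a critical section. */
static struct model_label *model_begin_section(struct model_label *l)
{
    l->refcount++;
    return l;
}

/* Drop the reference taken by model_begin_section(). */
static void model_end_section(struct model_label *l)
{
    assert(l->refcount > 0);
    l->refcount--;
}

/* permission_err simulates the result of the permission check
 * (0 when the caller may manage policy, a negative errno otherwise). */
static int model_policy_update(struct model_label *current_label,
                               int permission_err)
{
    struct model_label *label = model_begin_section(current_label);
    int error = permission_err;

    /* Returning directly here would leak the reference; jump to the
     * common exit instead. */
    if (error)
        goto end_section;

    /* ... the actual policy update would happen here ... */

end_section:
    model_end_section(label);
    return error;
}
```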
     
  • aa_change_profile() invokes aa_get_current_label(), which returns
    a reference of the current task's label.

    According to the comment of aa_get_current_label(), the returned
    reference must be put with aa_put_label().
    However, when the original object pointed by "label" becomes
    unreachable because aa_change_profile() returns or a new object
    is assigned to "label", reference count increased by
    aa_get_current_label() is not decreased, causing a refcnt leak.

    Fix this by calling aa_put_label() before aa_change_profile() return
    and dropping unnecessary aa_get_current_label().

    Fixes: 9fcf78cca198 ("apparmor: update domain transitions that are subsets of confinement at nnp")
    Signed-off-by: Xiyu Yang
    Signed-off-by: Xin Tan
    Signed-off-by: John Johansen

    Xiyu Yang
     

21 May, 2020

3 commits

  • Rename bprm->cap_elevated to bprm->active_secureexec and initialize it
    in prepare_binprm instead of in cap_bprm_set_creds. Initializing
    bprm->active_secureexec in prepare_binprm allows multiple
    implementations of security_bprm_repopulate_creds to play nicely with
    each other.

    Rename security_bprm_set_creds to security_bprm_repopulate_creds to
    emphasize that this path recomputes part of bprm->cred. This
    recomputation avoids the time of check vs time of use problems that
    are inherent in unix #! interpreters.

    In short two renames and a move in the location of initializing
    bprm->active_secureexec.

    Link: https://lkml.kernel.org/r/87o8qkzrxp.fsf_-_@x220.int.ebiederm.org
    Acked-by: Linus Torvalds
    Reviewed-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
    secid_to_secctx is not stackable, and since the BPF LSM registers this
    hook by default, the call_int_hook logic, which "bails-on-fail", is
    not suitable: it causes issues when other LSMs register this hook and
    eventually breaks Audit.

    In order to fix this, directly iterate over the security hooks instead
    of using call_int_hook as suggested in:

    https://lore.kernel.org/bpf/9d0eb6c6-803a-ff3a-5603-9ad6d9edfc00@schaufler-ca.com/#t

    Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks")
    Fixes: 625236ba3832 ("security: Fix the default value of secid_to_secctx hook")
    Reported-by: Alexei Starovoitov
    Signed-off-by: KP Singh
    Signed-off-by: Alexei Starovoitov
    Acked-by: James Morris
    Link: https://lore.kernel.org/bpf/20200520125616.193765-1-kpsingh@chromium.org

    KP Singh
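
The difference between the two calling conventions in the commit above can be modelled in a few lines of userspace C. This is a sketch under assumptions: the hook list is a plain array, `MODEL_RET_DEFAULT` is assumed to be -EOPNOTSUPP, the default-returning hook is assumed to run first, and all `model_*` names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed default return value of a hook that does not implement the call. */
#define MODEL_RET_DEFAULT (-EOPNOTSUPP)

typedef int (*model_hook_fn)(unsigned int secid);

/* A hook that is registered by default but does nothing useful. */
static int model_default_hook(unsigned int secid)
{
    (void)secid;
    return MODEL_RET_DEFAULT;
}

/* A hook from an LSM that really implements the operation. */
static int model_real_hook(unsigned int secid)
{
    (void)secid;
    return 0;
}

/* "Bail on fail": stop at the first hook that returns non-zero. */
static int model_call_bail_on_fail(const model_hook_fn *hooks, size_t n,
                                   unsigned int secid)
{
    int rc = MODEL_RET_DEFAULT;

    for (size_t i = 0; i < n; i++) {
        rc = hooks[i](secid);
        if (rc != 0)
            break;
    }
    return rc;
}

/* Direct iteration: skip hooks that only return the default value and
 * return the first real answer. */
static int model_call_skip_default(const model_hook_fn *hooks, size_t n,
                                   unsigned int secid)
{
    for (size_t i = 0; i < n; i++) {
        int rc = hooks[i](secid);

        if (rc != MODEL_RET_DEFAULT)
            return rc;
    }
    return MODEL_RET_DEFAULT;
}
```

With the default hook listed before the real one, the first function reports -EOPNOTSUPP even though a working implementation exists, while the second reaches it.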
     
  • Today security_bprm_set_creds has several implementations:
    apparmor_bprm_set_creds, cap_bprm_set_creds, selinux_bprm_set_creds,
    smack_bprm_set_creds, and tomoyo_bprm_set_creds.

    Except for cap_bprm_set_creds they all test bprm->called_set_creds and
    return immediately if it is true. The function cap_bprm_set_creds
    ignores bprm->called_set_creds entirely.

    Create a new LSM hook security_bprm_creds_for_exec that is called just
    before prepare_binprm in __do_execve_file, resulting in a LSM hook
    that is called exactly once for the entirety of exec. Move the bits
    of security_bprm_set_creds that only want to be called once per exec
    into security_bprm_creds_for_exec, leaving only cap_bprm_set_creds
    behind.

    Remove bprm->called_set_creds, as all of its former users have been
    moved to security_bprm_creds_for_exec.

    Add or update comments as appropriate to bring them up to date and
    to reflect this change.

    Link: https://lkml.kernel.org/r/87v9kszrzh.fsf_-_@x220.int.ebiederm.org
    Acked-by: Linus Torvalds
    Acked-by: Casey Schaufler # For the LSM and Smack bits
    Reviewed-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

19 May, 2020

2 commits

  • syzbot found that

    touch /proc/testfile

    causes NULL pointer dereference at tomoyo_get_local_path()
    because inode of the dentry is NULL.

    Before c59f415a7cb6, Tomoyo received pid_ns from proc's s_fs_info
    directly. Since proc_pid_ns() can only work with inode, using it in
    the tomoyo_get_local_path() was wrong.

    To avoid creating more functions for getting proc_ns, change the
    argument type of the proc_pid_ns() function. Then, Tomoyo can use
    the existing super_block to get pid_ns.

    Link: https://lkml.kernel.org/r/0000000000002f0c7505a5b0e04c@google.com
    Link: https://lkml.kernel.org/r/20200518180738.2939611-1-gladkov.alexey@gmail.com
    Reported-by: syzbot+c1af344512918c61362c@syzkaller.appspotmail.com
    Fixes: c59f415a7cb6 ("Use proc_pid_ns() to get pid_namespace from the proc superblock")
    Signed-off-by: Alexey Gladkov
    Signed-off-by: Eric W. Biederman

    Alexey Gladkov
     
  • Pull integrity fixes from Mimi Zohar:
    "A couple of miscellaneous bug fixes for the integrity subsystem:

    IMA:

    - Properly modify the open flags in order to calculate the file hash.

    - On systems requiring the IMA policy to be signed, the policy is
    loaded differently. Don't differentiate between "enforce" and
    either "log" or "fix" modes in how the policy is loaded.

    EVM:

    - Two patches to fix an EVM race condition, normally the result of
    attempting to load an unsupported hash algorithm.

    - Use the lockless RCU version for walking an append only list"

    * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
    evm: Fix a small race in init_desc()
    evm: Fix RCU list related warnings
    ima: Fix return value of ima_write_policy()
    evm: Check also if *tfm is an error pointer in init_desc()
    ima: Set file->f_mode instead of file->f_flags in ima_calc_file_hash()

    Linus Torvalds
     

15 May, 2020

1 commit

  • Split BPF operations that are allowed under CAP_SYS_ADMIN into
    combination of CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN.
    For backward compatibility include them in CAP_SYS_ADMIN as well.

    The end result provides a simple safety model for applications that use BPF:
    - to load tracing program types
    BPF_PROG_TYPE_{KPROBE, TRACEPOINT, PERF_EVENT, RAW_TRACEPOINT, etc}
    use CAP_BPF and CAP_PERFMON
    - to load networking program types
    BPF_PROG_TYPE_{SCHED_CLS, XDP, SK_SKB, etc}
    use CAP_BPF and CAP_NET_ADMIN

    There are a few exceptions from this rule:
    - bpf_trace_printk() is allowed in networking programs, but it's using
    tracing mechanism, hence this helper needs additional CAP_PERFMON
    if networking program is using this helper.
    - BPF_F_ZERO_SEED flag for hash/lru map is allowed under CAP_SYS_ADMIN only
    to discourage production use.
    - BPF HW offload is allowed under CAP_SYS_ADMIN.
    - bpf_probe_write_user() is allowed under CAP_SYS_ADMIN only.

    CAPs are not checked at attach/detach time with two exceptions:
    - loading BPF_PROG_TYPE_CGROUP_SKB is allowed for unprivileged users,
    hence CAP_NET_ADMIN is required at attach time.
    - flow_dissector detach doesn't check prog FD at detach,
    hence CAP_NET_ADMIN is required at detach time.

    CAP_SYS_ADMIN is required to iterate BPF objects (progs, maps, links) via get_next_id
    command and convert them to file descriptor via GET_FD_BY_ID command.
    This restriction guarantees that multiple tasks with CAP_BPF are not able to
    affect each other. That leads to clean isolation of tasks. For example:
    task A with CAP_BPF and CAP_NET_ADMIN loads and attaches a firewall via bpf_link.
    task B with the same capabilities cannot detach that firewall unless
    task A explicitly passed link FD to task B via scm_rights or bpffs.
    CAP_SYS_ADMIN can still detach/unload everything.

    Two networking user apps with CAP_SYS_ADMIN and CAP_NET_ADMIN can
    accidentally mess with each other's programs and maps.
    Two networking user apps with CAP_NET_ADMIN and CAP_BPF cannot affect each other.

    CAP_NET_ADMIN + CAP_BPF allows networking programs access only packet data.
    Such networking progs cannot access arbitrary kernel memory or leak pointers.

    bpftool, bpftrace, bcc tools binaries should NOT be installed with
    CAP_BPF and CAP_PERFMON, since unpriv users will be able to read kernel secrets.
    But users with these two permissions will be able to use these tracing tools.

    CAP_PERFMON is least secure, since it allows kprobes and kernel memory access.
    CAP_NET_ADMIN can stop network traffic via iproute2.
    CAP_BPF is the safest from security point of view and harmless on its own.

    Having CAP_BPF and/or CAP_NET_ADMIN is not enough to write into arbitrary map
    and if that map is used by firewall-like bpf prog.
    CAP_BPF allows many bpf prog_load commands in parallel. The verifier
    may consume large amount of memory and significantly slow down the system.

    Existing unprivileged BPF operations are not affected.
    In particular unprivileged users are allowed to load socket_filter and cg_skb
    program types and to create array, hash, prog_array, map-in-map map types.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200513230355.7858-2-alexei.starovoitov@gmail.com

    Alexei Starovoitov
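
The load-time rules in the commit above can be summarised with a small userspace model. This is a sketch, not kernel code: capabilities are represented as a plain bitmask, the `model_*` names and the two-way program classification are hypothetical, and only the numeric capability values are taken from the Linux UAPI header `linux/capability.h`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers as defined in linux/capability.h. */
#define MODEL_CAP_NET_ADMIN 12
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_PERFMON   38
#define MODEL_CAP_BPF       39

#define MODEL_CAP_BIT(c) ((uint64_t)1 << (c))

/* Simplified program classes from the commit message. */
enum model_prog_class {
    MODEL_PROG_TRACING,     /* KPROBE, TRACEPOINT, PERF_EVENT, ... */
    MODEL_PROG_NETWORKING   /* SCHED_CLS, XDP, SK_SKB, ... */
};

static bool model_has_cap(uint64_t caps, int cap)
{
    return (caps & MODEL_CAP_BIT(cap)) != 0;
}

/* For backward compatibility CAP_SYS_ADMIN still grants everything the
 * finer-grained capabilities grant. */
static bool model_capable(uint64_t caps, int cap)
{
    return model_has_cap(caps, cap) ||
           model_has_cap(caps, MODEL_CAP_SYS_ADMIN);
}

/* Tracing programs need CAP_BPF and CAP_PERFMON; networking programs
 * need CAP_BPF and CAP_NET_ADMIN. */
static bool model_may_load(uint64_t caps, enum model_prog_class cls)
{
    if (!model_capable(caps, MODEL_CAP_BPF))
        return false;

    switch (cls) {
    case MODEL_PROG_TRACING:
        return model_capable(caps, MODEL_CAP_PERFMON);
    case MODEL_PROG_NETWORKING:
        return model_capable(caps, MODEL_CAP_NET_ADMIN);
    }
    return false;
}
```

The model leaves out the exceptions listed in the commit message (for example bpf_trace_printk() in networking programs and the operations that remain CAP_SYS_ADMIN only).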