13 Feb, 2018

1 commit

  • Changes since v1:
    Added changes in these files:
    drivers/infiniband/hw/usnic/usnic_transport.c
    drivers/staging/lustre/lnet/lnet/lib-socket.c
    drivers/target/iscsi/iscsi_target_login.c
    drivers/vhost/net.c
    fs/dlm/lowcomms.c
    fs/ocfs2/cluster/tcp.c
    security/tomoyo/network.c

    Before:
    All these functions either return a negative error indicator,
    or store length of sockaddr into "int *socklen" parameter
    and return zero on success.

    "int *socklen" parameter is awkward. For example, if caller does not
    care, it still needs to provide on-stack storage for the value
    it does not need.

    None of the many FOO_getname() functions of various protocols
    ever used old value of *socklen. They always just overwrite it.

    This change drops this parameter, and makes all these functions, on success,
    return length of sockaddr. It's always >= 0 and can be differentiated
    from an error.

    Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

    rpc_sockname() lost "int buflen" parameter, since its only use was
    to be passed to kernel_getsockname() as &buflen and subsequently
    not used in any way.

    Userspace API is not changed.
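
    The change in calling convention can be sketched in plain C. This is a
    userspace illustration only: struct demo_addr and the demo_* functions
    are hypothetical stand-ins, not the kernel's getname() implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct sockaddr; not the kernel type. */
struct demo_addr {
	unsigned short family;
	char data[14];
};

/* Old convention: 0 on success with the length stored in *len,
 * negative errno on error. */
static int demo_getname_old(int connected, struct demo_addr *addr, int *len)
{
	assert(addr != NULL && len != NULL);
	if (!connected)
		return -ENOTCONN;
	memset(addr, 0, sizeof(*addr));
	addr->family = 2;
	*len = (int)sizeof(*addr);
	return 0;
}

/* New convention: length (always >= 0) on success, negative errno on error. */
static int demo_getname_new(int connected, struct demo_addr *addr)
{
	assert(addr != NULL);
	if (!connected)
		return -ENOTCONN;
	memset(addr, 0, sizeof(*addr));
	addr->family = 2;
	return (int)sizeof(*addr);
}

/* A caller that ignores the length no longer needs a dummy variable. */
static int demo_family(int connected)
{
	struct demo_addr addr;
	int err = demo_getname_new(connected, &addr);

	if (err < 0)	/* "if (err)" would now treat success as failure */
		return err;
	return addr.family;
}
```

    With the new convention a positive return value is a length, so callers
    that previously tested "if (err)" have to test "if (err < 0)" instead.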

        text    data    bss      dec     hex filename
    30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
    30108109 2633612 873672 33615393 200ee21 vmlinux.o

    Signed-off-by: Denys Vlasenko
    CC: David S. Miller
    CC: linux-kernel@vger.kernel.org
    CC: netdev@vger.kernel.org
    CC: linux-bluetooth@vger.kernel.org
    CC: linux-decnet-user@lists.sourceforge.net
    CC: linux-wireless@vger.kernel.org
    CC: linux-rdma@vger.kernel.org
    CC: linux-sctp@vger.kernel.org
    CC: linux-nfs@vger.kernel.org
    CC: linux-x25@vger.kernel.org
    Signed-off-by: David S. Miller

    Denys Vlasenko
     

12 Feb, 2018

1 commit

  • This is the mindless scripted replacement of kernel use of POLL*
    variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
    L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
    for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

    with de-mangling cleanups yet to come.

    NOTE! On almost all architectures, the EPOLL* constants have the same
    values as the POLL* constants do. But the keyword here is "almost".
    For various bad reasons they aren't the same, and epoll() doesn't
    actually work quite correctly in some cases due to this on Sparc et al.

    The next patch from Al will sort out the final differences, and we
    should be all done.

    Scripted-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

08 Feb, 2018

1 commit

  • Pull inode->i_version cleanup from Jeff Layton:
    "Goffredo went ahead and sent a patch to rename this function, and
    reverse its sense, as we discussed last week.

    The patch is very straightforward and I figure it's probably best to
    go ahead and merge this to get the API as settled as possible"

    * tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
    iversion: Rename make inode_cmp_iversion{+raw} to inode_eq_iversion{+raw}

    Linus Torvalds
     

07 Feb, 2018

1 commit

  • There are several functions that do find_task_by_vpid() followed by
    get_task_struct(). We can use a helper function instead.
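
    The pattern being factored out can be modelled in userspace as below.
    This is only a sketch: struct demo_task and every demo_* function are
    hypothetical stand-ins, and the "lock" merely tracks nesting to show
    that the lookup and the reference grab happen inside one critical
    section (the kernel lookup is normally done under rcu_read_lock()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct demo_task {
	int pid;
	int refcount;
};

static struct demo_task demo_tasks[] = { { 1, 1 }, { 42, 1 } };

/* Stand-ins for the read-side lock; they only track nesting depth. */
static int demo_lock_depth;
static void demo_read_lock(void)   { demo_lock_depth++; }
static void demo_read_unlock(void) { demo_lock_depth--; }

/* Lookup only: the result is stable only while the lock is held. */
static struct demo_task *demo_find_task(int pid)
{
	assert(demo_lock_depth > 0);
	for (size_t i = 0; i < sizeof(demo_tasks) / sizeof(demo_tasks[0]); i++)
		if (demo_tasks[i].pid == pid)
			return &demo_tasks[i];
	return NULL;
}

static void demo_get_task(struct demo_task *t) { t->refcount++; }
static void demo_put_task(struct demo_task *t) { t->refcount--; }

/* Combined helper: look up and take a reference in one critical section. */
static struct demo_task *demo_find_get_task(int pid)
{
	struct demo_task *t;

	demo_read_lock();
	t = demo_find_task(pid);
	if (t)
		demo_get_task(t);
	demo_read_unlock();
	return t;
}
```

    Callers then only deal with a NULL check and a matching put, instead of
    repeating the lock, lookup, and reference steps at every call site.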

    Link: http://lkml.kernel.org/r/1509602027-11337-1-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

04 Feb, 2018

1 commit

  • Pull hardened usercopy whitelisting from Kees Cook:
    "Currently, hardened usercopy performs dynamic bounds checking on slab
    cache objects. This is good, but still leaves a lot of kernel memory
    available to be copied to/from userspace in the face of bugs.

    To further restrict what memory is available for copying, this creates
    a way to whitelist specific areas of a given slab cache object for
    copying to/from userspace, allowing much finer granularity of access
    control.

    Slab caches that are never exposed to userspace can declare no
    whitelist for their objects, thereby keeping them unavailable to
    userspace via dynamic copy operations. (Note, an implicit form of
    whitelisting is the use of constant sizes in usercopy operations and
    get_user()/put_user(); these bypass all hardened usercopy checks since
    these sizes cannot change at runtime.)

    This new check is WARN-by-default, so any mistakes can be found over
    the next several releases without breaking anyone's system.

    The series has roughly the following sections:
    - remove %p and improve reporting with offset
    - prepare infrastructure and whitelist kmalloc
    - update VFS subsystem with whitelists
    - update SCSI subsystem with whitelists
    - update network subsystem with whitelists
    - update process memory with whitelists
    - update per-architecture thread_struct with whitelists
    - update KVM with whitelists and fix ioctl bug
    - mark all other allocations as not whitelisted
    - update lkdtm for more sensible test coverage"

    * tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
    lkdtm: Update usercopy tests for whitelisting
    usercopy: Restrict non-usercopy caches to size 0
    kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
    kvm: whitelist struct kvm_vcpu_arch
    arm: Implement thread_struct whitelist for hardened usercopy
    arm64: Implement thread_struct whitelist for hardened usercopy
    x86: Implement thread_struct whitelist for hardened usercopy
    fork: Provide usercopy whitelisting for task_struct
    fork: Define usercopy region in thread_stack slab caches
    fork: Define usercopy region in mm_struct slab caches
    net: Restrict unwhitelisted proto caches to size 0
    sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
    sctp: Define usercopy region in SCTP proto slab cache
    caif: Define usercopy region in caif proto slab cache
    ip: Define usercopy region in IP proto slab cache
    net: Define usercopy region in struct proto slab cache
    scsi: Define usercopy region in scsi_sense_cache slab cache
    cifs: Define usercopy region in cifs_request slab cache
    vxfs: Define usercopy region in vxfs_inode slab cache
    ufs: Define usercopy region in ufs_inode_cache slab cache
    ...

    Linus Torvalds
     

02 Feb, 2018

2 commits

  • Intermittently security.ima is not being written for new files. This
    patch re-initializes the new slab iint->atomic_flags field before
    freeing it.

    Fixes: commit 0d73a55208e9 ("ima: re-introduce own integrity cache lock")
    Signed-off-by: Mimi Zohar
    Signed-off-by: James Morris

    Mimi Zohar
     
  • Pull char/misc driver updates from Greg KH:
    "Here is the big pull request for char/misc drivers for 4.16-rc1.

    There's a lot of stuff in here. Three new driver subsystems were added
    for various types of hardware busses:

    - siox
    - slimbus
    - soundwire

    as well as a new vboxguest subsystem for the VirtualBox hypervisor
    drivers.

    There's also big updates from the FPGA subsystem, lots of Android
    binder fixes, the usual handful of hyper-v updates, and lots of other
    smaller driver updates.

    All of these have been in linux-next for a long time, with no reported
    issues"

    * tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (155 commits)
    char: lp: use true or false for boolean values
    android: binder: use VM_ALLOC to get vm area
    android: binder: Use true and false for boolean values
    lkdtm: fix handle_irq_event symbol for INT_HW_IRQ_EN
    EISA: Delete error message for a failed memory allocation in eisa_probe()
    EISA: Whitespace cleanup
    misc: remove AVR32 dependencies
    virt: vbox: Add error mapping for VERR_INVALID_NAME and VERR_NO_MORE_FILES
    soundwire: Fix a signedness bug
    uio_hv_generic: fix new type mismatch warnings
    uio_hv_generic: fix type mismatch warnings
    auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
    uio_hv_generic: add rescind support
    uio_hv_generic: check that host supports monitor page
    uio_hv_generic: create send and receive buffers
    uio: document uio_hv_generic regions
    doc: fix documentation about uio_hv_generic
    vmbus: add monitor_id and subchannel_id to sysfs per channel
    vmbus: fix ABI documentation
    uio_hv_generic: use ISR callback method
    ...

    Linus Torvalds
     

01 Feb, 2018

6 commits

  • The function inode_cmp_iversion{+raw} is counter-intuitive, because it
    returns true when the counters are different and false when these are equal.

    Rename it to inode_eq_iversion{+raw}, which returns true when
    the counters are equal and false otherwise.
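
    The reversed sense can be illustrated with a small userspace model.
    struct demo_inode and the demo_* functions are hypothetical; only the
    direction of the comparison is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode carrying a change counter. */
struct demo_inode {
	uint64_t version;
};

/* Old sense: "cmp" returned true when the counters DIFFER. */
static bool demo_cmp_iversion(const struct demo_inode *inode, uint64_t old)
{
	assert(inode != NULL);
	return inode->version != old;
}

/* New sense: "eq" returns true when the counters are EQUAL. */
static bool demo_eq_iversion(const struct demo_inode *inode, uint64_t old)
{
	assert(inode != NULL);
	return inode->version == old;
}

/* A cached value is still valid while the counter is unchanged. */
static bool demo_cache_is_valid(const struct demo_inode *inode, uint64_t cached)
{
	return demo_eq_iversion(inode, cached);
}
```

    With the "eq" name the call site reads the way the result behaves, so a
    check such as demo_cache_is_valid() needs no negation.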

    Signed-off-by: Goffredo Baroncelli
    Signed-off-by: Jeff Layton

    Goffredo Baroncelli
     
  • Pull networking updates from David Miller:

    1) Significantly shrink the core networking routing structures. Result
    of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf

    2) Add netdevsim driver for testing various offloads, from Jakub
    Kicinski.

    3) Support cross-chip FDB operations in DSA, from Vivien Didelot.

    4) Add a 2nd listener hash table for TCP, similar to what was done for
    UDP. From Martin KaFai Lau.

    5) Add eBPF based queue selection to tun, from Jason Wang.

    6) Lockless qdisc support, from John Fastabend.

    7) SCTP stream interleave support, from Xin Long.

    8) Smoother TCP receive autotuning, from Eric Dumazet.

    9) Lots of erspan tunneling enhancements, from William Tu.

    10) Add true function call support to BPF, from Alexei Starovoitov.

    11) Add explicit support for GRO HW offloading, from Michael Chan.

    12) Support extack generation in more netlink subsystems. From Alexander
    Aring, Quentin Monnet, and Jakub Kicinski.

    13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
    Russell King.

    14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.

    15) Many improvements and simplifications to the NFP driver bpf JIT,
    from Jakub Kicinski.

    16) Support for ipv6 non-equal cost multipath routing, from Ido
    Schimmel.

    17) Add resource abstraction to devlink, from Arkadi Sharshevsky.

    18) Packet scheduler classifier shared filter block support, from Jiri
    Pirko.

    19) Avoid locking in act_csum, from Davide Caratti.

    20) devinet_ioctl() simplifications from Al Viro.

    21) More TCP bpf improvements from Lawrence Brakmo.

    22) Add support for onlink ipv6 route flag, similar to ipv4, from David
    Ahern.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
    tls: Add support for encryption using async offload accelerator
    ip6mr: fix stale iterator
    net/sched: kconfig: Remove blank help texts
    openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
    tcp_nv: fix potential integer overflow in tcpnv_acked
    r8169: fix RTL8168EP take too long to complete driver initialization.
    qmi_wwan: Add support for Quectel EP06
    rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
    ipmr: Fix ptrdiff_t print formatting
    ibmvnic: Wait for device response when changing MAC
    qlcnic: fix deadlock bug
    tcp: release sk_frag.page in tcp_disconnect
    ipv4: Get the address of interface correctly.
    net_sched: gen_estimator: fix lockdep splat
    net: macb: Handle HRESP error
    net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
    ipv6: addrconf: break critical section in addrconf_verify_rtnl()
    ipv6: change route cache aging logic
    i40e/i40evf: Update DESC_NEEDED value to reflect larger value
    bnxt_en: cleanup DIM work on device shutdown
    ...

    Linus Torvalds
     
  • Pull selinux updates from Paul Moore:
    "A small pull request this time, just three patches, and one of these
    is just a comment update (swap the FSF physical address for a URL).

    The other two patches are small bug fixes found by syzbot/syzkaller;
    the individual patch descriptions should tell you all you ever wanted
    to know"

    * tag 'selinux-pr-20180130' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
    selinux: skip bounded transition processing if the policy isn't loaded
    selinux: ensure the context is NUL terminated in security_context_to_sid_core()
    security: replace FSF address with web source in license notices

    Linus Torvalds
     
  • Pull tpm updates from James Morris:

    - reduce polling delays in tpm_tis

    - support retrieving TPM 2.0 Event Log through EFI before
    ExitBootServices

    - replace tpm-rng.c with a hwrng device managed by the driver for each
    TPM device

    - TPM resource manager synthesizes TPM_RC_COMMAND_CODE response instead
    of returning -EINVAL for unknown TPM commands. This makes user space
    more sound.

    - CLKRUN fixes:

    * Keep #CLKRUN disabled through the entire TPM command/response flow

    * Check whether #CLKRUN is enabled before disabling and enabling it
    again because enabling it breaks PS/2 devices on a system where it
    is disabled

    * 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    tpm: remove unused variables
    tpm: remove unused data fields from I2C and OF device ID tables
    tpm: only attempt to disable the LPC CLKRUN if is already enabled
    tpm: follow coding style for variable declaration in tpm_tis_core_init()
    tpm: delete the TPM_TIS_CLK_ENABLE flag
    tpm: Update MAINTAINERS for Jason Gunthorpe
    tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd()
    tpm_tis: Move ilb_base_addr to tpm_tis_data
    tpm2-cmd: allow more attempts for selftest execution
    tpm: return a TPM_RC_COMMAND_CODE response if command is not implemented
    tpm: Move Linux RNG connection to hwrng
    tpm: use struct tpm_chip for tpm_chip_find_get()
    tpm: parse TPM event logs based on EFI table
    efi: call get_event_log before ExitBootServices
    tpm: add event log format version
    tpm: rename event log provider files
    tpm: move tpm_eventlog.h outside of drivers folder
    tpm: use tpm_msleep() value as max delay
    tpm: reduce tpm polling delay in tpm_tis_core
    tpm: move wait_for_tpm_stat() to respective driver files

    Linus Torvalds
     
  • Pull smack updates from James Morris:
    "Two minor fixes"

    * 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    Smack: Privilege check on key operations
    Smack: fix dereferenced before check

    Linus Torvalds
     
  • …morris/linux-security

    Pull integrity updates from James Morris:
    "This contains a mixture of bug fixes, code cleanup, and new
    functionality. Of note is the integrity cache locking fix, file change
    detection, and support for a new EVM portable and immutable signature
    type.

    The re-introduction of the integrity cache lock (iint) fixes the
    problem of attempting to take the i_rwsem shared a second time, when
    it was previously taken exclusively. Defining atomic flags resolves
    the original iint/i_rwsem circular locking - accessing the file data
    vs. modifying the file metadata. Although it fixes the O_DIRECT
    problem as well, a subsequent patch is needed to remove the explicit
    O_DIRECT prevention.

    For performance reasons, detecting when a file has changed and needs
    to be re-measured, re-appraised, and/or re-audited, was limited to
    after the last writer has closed, and only if the file data has
    changed. Detecting file change is based on i_version. For filesystems
    that do not support i_version, remote filesystems, or userspace
    filesystems, the file was measured, appraised and/or audited once and
    never re-evaluated. Now local filesystems, which do not support
    i_version or are not mounted with the i_version option, assume the
    file has changed and are required to re-evaluate the file. This change
    does not address detecting file change on remote or userspace
    filesystems.

    Unlike file data signatures, which can be included and distributed in
    software packages (eg. rpm, deb), the existing EVM signature, which
    protects the file metadata, could not be included in software
    packages, as it includes file system specific information (eg. i_ino,
    possibly the UUID). This pull request defines a new EVM portable and
    immutable file metadata signature format, which can be included in
    software packages"

    * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    ima/policy: fix parsing of fsuuid
    ima: Use i_version only when filesystem supports it
    integrity: remove unneeded initializations in integrity_iint_cache entries
    ima: log message to module appraisal error
    ima: pass filename to ima_rdwr_violation_check()
    ima: Fix line continuation format
    ima: support new "hash" and "dont_hash" policy actions
    ima: re-introduce own integrity cache lock
    EVM: Add support for portable signature format
    EVM: Allow userland to permit modification of EVM-protected metadata
    ima: relax requiring a file signature for new files with zero length

    Linus Torvalds
     

31 Jan, 2018

2 commits

  • Pull poll annotations from Al Viro:
    "This introduces a __bitwise type for POLL### bitmap, and propagates
    the annotations through the tree. Most of that stuff is as simple as
    'make ->poll() instances return __poll_t and do the same to local
    variables used to hold the future return value'.

    Some of the obvious brainos found in process are fixed (e.g. POLLIN
    misspelled as POLL_IN). At that point the amount of sparse warnings is
    low and most of them are for genuine bugs - e.g. ->poll() instance
    deciding to return -EINVAL instead of a bitmap. I hadn't touched those
    in this series - it's large enough as it is.

    Another problem it has caught was eventpoll() ABI mess; select.c and
    eventpoll.c assumed that corresponding POLL### and EPOLL### were
    equal. That's true for some, but not all of them - EPOLL### are
    arch-independent, but POLL### are not.

    The last commit in this series separates userland POLL### values from
    the (now arch-independent) kernel-side ones, converting between them
    in the few places where they are copied to/from userland. AFAICS, this
    is the least disruptive fix preserving poll(2) ABI and making epoll()
    work on all architectures.

    As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
    it will trigger only on what would've triggered EPOLLWRBAND on other
    architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
    at all on sparc. With this patch they should work consistently on all
    architectures"
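
    The userland/kernel conversion described in the pull message can be
    sketched as a per-bit translation. Everything below is illustrative:
    the DEMO_* values are assumptions modelled on a sparc-like layout and
    the generic EPOLL layout, only three differing bits are handled, and
    the demo_* functions are hypothetical, not the kernel's helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arch-independent kernel-side values (assumed for this sketch). */
#define DEMO_EPOLLWRNORM 0x0100u
#define DEMO_EPOLLWRBAND 0x0200u
#define DEMO_EPOLLRDHUP  0x2000u

/* sparc-like userland values that differ (assumed for this sketch). */
#define DEMO_POLLOUT     0x0004u
#define DEMO_POLLWRNORM  DEMO_POLLOUT
#define DEMO_POLLWRBAND  0x0100u
#define DEMO_POLLRDHUP   0x0800u

/* Low-byte bits assumed to be identical in both encodings. */
#define DEMO_COMMON_MASK  0x00ffu
/* Bits this sketch knows about in each encoding. */
#define DEMO_USER_KNOWN   (DEMO_COMMON_MASK | DEMO_POLLWRBAND | DEMO_POLLRDHUP)
#define DEMO_KERNEL_KNOWN (DEMO_COMMON_MASK | DEMO_EPOLLWRNORM | \
			   DEMO_EPOLLWRBAND | DEMO_EPOLLRDHUP)

struct demo_map {
	uint16_t user;
	uint16_t kernel;
};

static const struct demo_map demo_maps[] = {
	{ DEMO_POLLWRNORM, DEMO_EPOLLWRNORM },
	{ DEMO_POLLWRBAND, DEMO_EPOLLWRBAND },
	{ DEMO_POLLRDHUP,  DEMO_EPOLLRDHUP  },
};
#define DEMO_NMAPS (sizeof(demo_maps) / sizeof(demo_maps[0]))

/* Userland bitmap -> kernel bitmap, one flag at a time. */
static uint16_t demo_user_to_kernel(uint16_t user)
{
	uint16_t k = (uint16_t)(user & DEMO_COMMON_MASK);

	assert((user & ~DEMO_USER_KNOWN) == 0);
	for (size_t i = 0; i < DEMO_NMAPS; i++)
		if (user & demo_maps[i].user)
			k |= demo_maps[i].kernel;
	return k;
}

/* Kernel bitmap -> userland bitmap, one flag at a time. */
static uint16_t demo_kernel_to_user(uint16_t kernel)
{
	uint16_t u = (uint16_t)(kernel & DEMO_COMMON_MASK);

	assert((kernel & ~DEMO_KERNEL_KNOWN) == 0);
	for (size_t i = 0; i < DEMO_NMAPS; i++)
		if (kernel & demo_maps[i].kernel)
			u |= demo_maps[i].user;
	return u;
}
```

    Under these assumed values DEMO_EPOLLWRNORM and DEMO_POLLWRBAND share
    the same bit, which is the kind of collision the message describes when
    the two encodings are treated as interchangeable.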

    * 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
    make kernel-side POLL... arch-independent
    eventpoll: no need to mask the result of epi_item_poll() again
    eventpoll: constify struct epoll_event pointers
    debugging printk in sg_poll() uses %x to print POLL... bitmap
    annotate poll(2) guts
    9p: untangle ->poll() mess
    ->si_band gets POLL... bitmap stored into a user-visible long field
    ring_buffer_poll_wait() return value used as return value of ->poll()
    the rest of drivers/*: annotate ->poll() instances
    media: annotate ->poll() instances
    fs: annotate ->poll() instances
    ipc, kernel, mm: annotate ->poll() instances
    net: annotate ->poll() instances
    apparmor: annotate ->poll() instances
    tomoyo: annotate ->poll() instances
    sound: annotate ->poll() instances
    acpi: annotate ->poll() instances
    crypto: annotate ->poll() instances
    block: annotate ->poll() instances
    x86: annotate ->poll() instances
    ...

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "The main RCU changes in this cycle were:

    - Updates to use cond_resched() instead of cond_resched_rcu_qs()
    where feasible (currently everywhere except in kernel/rcu and in
    kernel/torture.c). Also a couple of fixes to avoid sending IPIs to
    offline CPUs.

    - Updates to simplify RCU's dyntick-idle handling.

    - Updates to remove almost all uses of smp_read_barrier_depends() and
    read_barrier_depends().

    - Torture-test updates.

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
    torture: Save a line in stutter_wait(): while -> for
    torture: Eliminate torture_runnable and perf_runnable
    torture: Make stutter less vulnerable to compilers and races
    locking/locktorture: Fix num reader/writer corner cases
    locking/locktorture: Fix rwsem reader_delay
    torture: Place all torture-test modules in one MAINTAINERS group
    rcutorture/kvm-build.sh: Skip build directory check
    rcutorture: Simplify functions.sh include path
    rcutorture: Simplify logging
    rcutorture/kvm-recheck-*: Improve result directory readability check
    rcutorture/kvm.sh: Support execution from any directory
    rcutorture/kvm.sh: Use consistent help text for --qemu-args
    rcutorture/kvm.sh: Remove unused variable, `alldone`
    rcutorture: Remove unused script, config2frag.sh
    rcutorture/configinit: Fix build directory error message
    rcutorture: Preempt RCU-preempt readers more vigorously
    torture: Reduce #ifdefs for preempt_schedule()
    rcu: Remove have_rcu_nocb_mask from tree_plugin.h
    rcu: Add comment giving debug strategy for double call_rcu()
    tracing, rcu: Hide trace event rcu_nocb_wake when not used
    ...

    Linus Torvalds
     

30 Jan, 2018

1 commit

  • Pull inode->i_version rework from Jeff Layton:
    "This pile of patches is a rework of the inode->i_version field. We
    have traditionally incremented that field on every inode data or
    metadata change. Typically this increment needs to be logged on disk
    even when nothing else has changed, which is rather expensive.

    It turns out though that none of the consumers of that field actually
    require this behavior. The only real requirement for all of them is
    that it be different iff the inode has changed since the last time the
    field was checked.

    Given that, we can optimize away most of the i_version increments and
    avoid dirtying inode metadata when the only change is to the i_version
    and no one is querying it. Queries of the i_version field are rather
    rare, so we can help write performance under many common workloads.

    This patch series converts existing accesses of the i_version field to
    a new API, and then converts all of the in-kernel filesystems to use
    it. The last patch in the series then converts the backend
    implementation to a scheme that optimizes away a large portion of the
    metadata updates when no one is looking at it.

    In my own testing this series significantly helps performance with
    small I/O sizes. I also got this email for Christmas this year from
    the kernel test robot (a 244% r/w bandwidth improvement with XFS over
    DAX, with 4k writes):

    https://lkml.org/lkml/2017/12/25/8

    A few of the earlier patches in this pile are also flowing to you via
    other trees (mm, integrity, and nfsd trees in particular)".
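
    One way to get "increment only when someone has looked" is to keep a
    "queried" flag next to the counter. The sketch below is a userspace
    model using C11 atomics; the flag placement in the low bit, struct
    demo_inode, and the demo_* names are assumptions for illustration and
    are not claimed to be the series' exact implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUERIED   1ULL	/* low bit: value was read since the last bump */
#define DEMO_INCREMENT 2ULL	/* the counter lives in the remaining bits */

/* Hypothetical stand-in for an inode with a version word. */
struct demo_inode {
	_Atomic uint64_t i_version;
};

/* Bump the counter only if forced or if someone queried it since the
 * last bump. Returns true when the caller would have to log the change. */
static bool demo_maybe_inc_iversion(struct demo_inode *inode, bool force)
{
	uint64_t cur, next;

	assert(inode != NULL);
	cur = atomic_load(&inode->i_version);
	do {
		if (!force && !(cur & DEMO_QUERIED))
			return false;	/* nobody looked: skip the update */
		next = (cur & ~DEMO_QUERIED) + DEMO_INCREMENT;
	} while (!atomic_compare_exchange_weak(&inode->i_version, &cur, next));
	return true;
}

/* Read the counter and mark it as queried so the next change is recorded. */
static uint64_t demo_query_iversion(struct demo_inode *inode)
{
	uint64_t cur;

	assert(inode != NULL);
	cur = atomic_load(&inode->i_version);
	while (!(cur & DEMO_QUERIED) &&
	       !atomic_compare_exchange_weak(&inode->i_version, &cur,
					     cur | DEMO_QUERIED))
		;	/* cur was refreshed by the failed exchange; retry */
	return cur >> 1;
}
```

    In this model, back-to-back changes with no reader in between cost at
    most one counter update, while a reader still sees a different value
    whenever the inode changed after its previous query.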

    * tag 'iversion-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: (22 commits)
    fs: handle inode->i_version more efficiently
    btrfs: only dirty the inode in btrfs_update_time if something was changed
    xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing
    fs: only set S_VERSION when updating times if necessary
    IMA: switch IMA over to new i_version API
    xfs: convert to new i_version API
    ufs: use new i_version API
    ocfs2: convert to new i_version API
    nfsd: convert to new i_version API
    nfs: convert to new i_version API
    ext4: convert to new i_version API
    ext2: convert to new i_version API
    exofs: switch to new i_version API
    btrfs: convert to new i_version API
    afs: convert to new i_version API
    affs: convert to new i_version API
    fat: convert to new i_version API
    fs: don't take the i_lock in inode_inc_iversion
    fs: new API for handling inode->i_version
    ntfs: remove i_version handling
    ...

    Linus Torvalds
     

29 Jan, 2018

1 commit


19 Jan, 2018

1 commit

  • The switch to uuid_t inverted the logic of the verification that
    &entry->fsuuid is zero during parsing of the "fsuuid=" rule. Instead of
    making sure the &entry->fsuuid field is not overwritten, we bail out for
    a perfectly correct rule.

    Fixes: 787d8c530af7 ("ima/policy: switch to use uuid_t")

    Signed-off-by: Mike Rapoport
    Cc: stable@vger.kernel.org
    Signed-off-by: Mimi Zohar

    Mike Rapoport
     

17 Jan, 2018

1 commit


16 Jan, 2018

1 commit

  • This introduces CONFIG_HARDENED_USERCOPY_FALLBACK to control the
    behavior of hardened usercopy whitelist violations. By default, whitelist
    violations will continue to WARN() so that any bad or missing usercopy
    whitelists can be discovered without being too disruptive.

    If this config is disabled at build time or a system is booted with
    "slab_common.usercopy_fallback=0", usercopy whitelists will BUG() instead
    of WARN(). This is useful for admins that want to use usercopy whitelists
    immediately.
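
    The decision this option controls can be modelled as follows. This is
    a userspace sketch: struct demo_cache, the verdict enum, and
    demo_check_copy() are hypothetical, and the real checks warn or BUG
    rather than return a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_verdict {
	DEMO_ALLOW,		/* copy lies inside the whitelisted region */
	DEMO_WARN_ALLOW,	/* violation, fallback enabled: warn and allow */
	DEMO_REJECT		/* violation, no fallback (or out of bounds) */
};

/* Hypothetical slab cache description with a whitelisted window. */
struct demo_cache {
	size_t object_size;
	size_t useroffset;	/* start of the whitelisted region */
	size_t usersize;	/* length of the whitelisted region */
};

static enum demo_verdict demo_check_copy(const struct demo_cache *c,
					 size_t offset, size_t n, bool fallback)
{
	assert(c != NULL);

	/* Fully inside the whitelisted window: always fine. */
	if (offset >= c->useroffset &&
	    offset - c->useroffset <= c->usersize &&
	    n <= c->usersize - (offset - c->useroffset))
		return DEMO_ALLOW;

	/* Outside the window but inside the object: the fallback decides. */
	if (fallback && offset <= c->object_size &&
	    n <= c->object_size - offset)
		return DEMO_WARN_ALLOW;

	return DEMO_REJECT;
}
```

    With the fallback on, a missing whitelist shows up as a warning and the
    copy still succeeds; with it off, the same copy is refused outright.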

    Suggested-by: Matthew Garrett
    Signed-off-by: Kees Cook

    Kees Cook
     

15 Jan, 2018

1 commit

  • Pull x86 pti updates from Thomas Gleixner:
    "This contains:

    - a PTI bugfix to avoid setting reserved CR3 bits when PCID is
    disabled. This seems to cause issues on a virtual machine at least
    and is incorrect according to the AMD manual.

    - a PTI bugfix which disables the perf BTS facility if PTI is
    enabled. The BTS AUX buffer is not globally visible and causes the
    CPU to fault when the mapping disappears on switching CR3 to user
    space. A full fix which restores BTS on PTI is non-trivial and will
    be worked on.

    - PTI bugfixes for EFI and trusted boot which make sure that the user
    space visible page table entries have the NX bit cleared

    - removal of dead code in the PTI pagetable setup functions

    - add PTI documentation

    - add a selftest for vsyscall to verify that the kernel actually
    implements what it advertises.

    - a sysfs interface to expose vulnerability and mitigation
    information so there is a coherent way for users to retrieve the
    status.

    - the initial spectre_v2 mitigations, aka retpoline:

    + The necessary ASM thunk and compiler support

    + The ASM variants of retpoline and the conversion of affected ASM
    code

    + Make LFENCE serializing on AMD so it can be used as speculation
    trap

    + The RSB fill after vmexit

    - initial objtool support for retpoline

    As I said in the status mail this is most of the set of patches
    which should go into 4.15 except two straightforward patches still on
    hold:

    - the retpoline add on of LFENCE which waits for ACKs

    - the RSB fill after context switch

    Both should be ready to go early next week and with that we'll have
    covered the major holes of spectre_v2 and go back to normality"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
    x86,perf: Disable intel_bts when PTI
    security/Kconfig: Correct the Documentation reference for PTI
    x86/pti: Fix !PCID and sanitize defines
    selftests/x86: Add test_vsyscall
    x86/retpoline: Fill return stack buffer on vmexit
    x86/retpoline/irq32: Convert assembler indirect jumps
    x86/retpoline/checksum32: Convert assembler indirect jumps
    x86/retpoline/xen: Convert Xen hypercall indirect jumps
    x86/retpoline/hyperv: Convert assembler indirect jumps
    x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
    x86/retpoline/entry: Convert entry assembler indirect jumps
    x86/retpoline/crypto: Convert crypto assembler indirect jumps
    x86/spectre: Add boot time option to select Spectre v2 mitigation
    x86/retpoline: Add initial retpoline support
    objtool: Allow alternatives to be ignored
    objtool: Detect jumps to retpoline thunks
    x86/pti: Make unpoison of pgd for trusted boot work for real
    x86/alternatives: Fix optimize_nops() checking
    sysfs/cpu: Fix typos in vulnerability documentation
    x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
    ...

    Linus Torvalds
     

14 Jan, 2018

1 commit

  • When the config option for PTI was added a reference to documentation was
    added as well. But the documentation did not exist at that point. The final
    documentation has a different file name.

    Fix it up to point to the proper file.

    Fixes: 385ce0ea ("x86/mm/pti: Add Kconfig")
    Signed-off-by: W. Trevor King
    Signed-off-by: Thomas Gleixner
    Cc: Dave Hansen
    Cc: linux-mm@kvack.org
    Cc: linux-security-module@vger.kernel.org
    Cc: James Morris
    Cc: "Serge E. Hallyn"
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/3009cc8ccbddcd897ec1e0cb6dda524929de0d14.1515799398.git.wking@tremily.us

    W. Trevor King
     

13 Jan, 2018

2 commits

  • The intended behaviour in apparmor profile matching is to flag a
    conflict if two profiles match equally well. However, right now a
    conflict is generated if another profile has the same match length even
    if that profile doesn't actually match. Fix the logic so we only
    generate a conflict if the profiles match.
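
    The corrected selection logic can be sketched in userspace as below.
    struct demo_profile and demo_attach() are hypothetical; the "matches"
    field stands in for actually running the profile's match against the
    name being attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a profile and the outcome of matching it. */
struct demo_profile {
	const char *name;
	int match_len;	/* how specific the profile's pattern is */
	bool matches;	/* whether the pattern really matches the name */
};

/* Returns the index of the best matching profile, or -1 when nothing
 * matches or when two equally specific profiles both match. */
static int demo_attach(const struct demo_profile *p, size_t n)
{
	int best = -1, best_len = 0;
	bool conflict = false;

	assert(p != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (p[i].match_len < best_len)
			continue;
		/* The fix: check the match before considering a conflict. */
		if (!p[i].matches)
			continue;
		if (best >= 0 && p[i].match_len == best_len) {
			conflict = true;
			continue;
		}
		best = (int)i;
		best_len = p[i].match_len;
		conflict = false;
	}
	return conflict ? -1 : best;
}
```

    A profile that merely has the same length as the current candidate but
    does not match is skipped, so it can no longer trigger a conflict.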

    Fixes: 844b8292b631 ("apparmor: ensure that undecidable profile attachments fail")
    Cc: Stable
    Signed-off-by: Matthew Garrett
    Signed-off-by: John Johansen

    Matthew Garrett
     
  • Given a label with a profile stack of
    A//&B or A//&C ...

    A ptrace rule should be able to specify a generic trace pattern with
    a rule like

    ptrace trace A//&**,

    however this is failing because while the correct label match routine
    is called, it is being done post label decomposition so it is always
    being done against a profile instead of the stacked label.

    To fix this, refactor the cross check to pass the full peer label in to
    the label_match.

    Fixes: 290f458a4f16 ("apparmor: allow ptrace checks to be finer grained than just capability")
    Cc: Stable
    Reported-by: Matthew Garrett
    Tested-by: Matthew Garrett
    Signed-off-by: John Johansen

    John Johansen
     

11 Jan, 2018

1 commit

  • Smack: Privilege check on key operations

    Operations on key objects are subjected to Smack policy
    even if the process is privileged. This is inconsistent
    with the general behavior of Smack and may cause issues
    with authentication by privileged daemons. This patch
    allows processes with CAP_MAC_OVERRIDE to access keys
    even if the Smack rules indicate otherwise.

    Reported-by: Jose Bollo
    Signed-off-by: Casey Schaufler

    Casey Schaufler
     

09 Jan, 2018

1 commit


08 Jan, 2018

2 commits

  • Device number (the character device index) is not a stable identifier
    for a TPM chip. That is the reason why every call site passes
    TPM_ANY_NUM to tpm_chip_find_get().

    This commit changes the API so that a struct tpm_chip instance is
    passed instead, with NULL meaning the default chip. In addition, this
    commit brings the documentation up to date with the implementation.

    Suggested-by: Jason Gunthorpe (@chip_num -> @chip part)
    Signed-off-by: Jarkko Sakkinen
    Reviewed-by: Jason Gunthorpe
    Tested-by: PrasannaKumar Muralidharan

    Jarkko Sakkinen
     
  • …git/jj/linux-apparmor

    Pull apparmor fix from John Johansen:
    "This fixes a regression when the kernel feature set is reported as
    supporting mount and policy is pinned to a feature set that does not
    support mount mediation"

    * tag 'apparmor-pr-2018-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
    apparmor: fix regression in mount mediation when feature set is pinned

    Linus Torvalds
     

06 Jan, 2018

1 commit

  • When the mount code was refactored for Labels it was not correctly
    updated to check whether policy supported mediation of the mount
    class. This causes a regression when the kernel feature set is
    reported as supporting mount and policy is pinned to a feature set
    that does not support mount mediation.

    BugLink: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882697#41
    Fixes: 2ea3ffb7782a ("apparmor: add mount mediation")
    Reported-by: Fabian Grünbichler
    Cc: Stable
    Signed-off-by: John Johansen

    John Johansen
     

04 Jan, 2018

1 commit

  • Pull x86 page table isolation fixes from Thomas Gleixner:
    "A couple of urgent fixes for PTI:

    - Fix a PTE mismatch between the user and kernel visible mappings of the
    cpu entry area (they differ in the GLB bit), which causes a TLB mismatch
    MCE on older AMD K8 machines

    - Fix the misplaced CR3 switch in the SYSCALL compat entry code which
    causes access to unmapped kernel memory resulting in double faults.

    - Fix the section mismatch of the cpu_tss_rw percpu storage caused by
    using a different mechanism for declaration and definition.

    - Two fixes for dumpstack which help to decode entry stack issues
    better

    - Enable PTI by default in Kconfig. We should have done that earlier,
    but it slipped through the cracks.

    - Exclude AMD from the PTI enforcement. Not necessarily a fix, but if
    AMD is so confident that they are not affected, then we should not
    burden users with the overhead"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/process: Define cpu_tss_rw in same section as declaration
    x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
    x86/dumpstack: Print registers for first stack frame
    x86/dumpstack: Fix partial register dumps
    x86/pti: Make sure the user/kernel PTEs match
    x86/cpu, x86/pti: Do not enable PTI on AMD processors
    x86/pti: Enable PTI by default

    Linus Torvalds
     

03 Jan, 2018

2 commits

  • This really wants to be enabled by default. Users who know what they are
    doing can disable it either in the config or on the kernel command line.

    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
     
  • …k/linux-rcu into core/rcu

    Pull RCU updates from Paul E. McKenney:

    - Updates to use cond_resched() instead of cond_resched_rcu_qs()
    where feasible (currently everywhere except in kernel/rcu and
    in kernel/torture.c). Also a couple of fixes to avoid sending
    IPIs to offline CPUs.

    - Updates to simplify RCU's dyntick-idle handling.

    - Updates to remove almost all uses of smp_read_barrier_depends()
    and read_barrier_depends().

    - Miscellaneous fixes.

    - Torture-test updates.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

02 Jan, 2018

2 commits

  • We want the fixes in here as well.

    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • If userspace attempted to set a "security.capability" xattr shorter than
    4 bytes (e.g. 'setfattr -n security.capability -v x file'), then
    cap_convert_nscap() read past the end of the buffer containing the xattr
    value because it accessed the ->magic_etc field without verifying that
    the xattr value is long enough to contain that field.

    Fix it by validating the xattr value size first.
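
    A minimal sketch of "validate the size, then read the field". The
    struct and function names are hypothetical, and the -1 return value is
    a placeholder for whatever error code the real function uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the capability xattr header; only the
 * leading 32-bit magic_etc field matters for this sketch.
 */
struct cap_header_sketch {
	uint32_t magic_etc;
};

/*
 * Read magic_etc from an untrusted buffer.
 * Returns 0 and stores the field in *out on success, or -1 if the
 * buffer is too short to contain the field.
 */
int read_magic_etc(const void *value, size_t size, uint32_t *out)
{
	struct cap_header_sketch hdr;

	/* Validate the size first, before touching any field. */
	if (value == NULL || size < sizeof(hdr.magic_etc))
		return -1;

	/* memcpy avoids alignment assumptions about the caller's buffer. */
	memcpy(&hdr.magic_etc, value, sizeof(hdr.magic_etc));
	*out = hdr.magic_etc;
	return 0;
}
```

    A one-byte value such as the "x" from the setfattr example is rejected
    by the size check instead of being read as a 4-byte field.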

    This bug was found using syzkaller with KASAN. The KASAN report was as
    follows (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498
    Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852

    CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0xe3/0x195 lib/dump_stack.c:53
    print_address_description+0x73/0x260 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x235/0x350 mm/kasan/report.c:409
    cap_convert_nscap+0x514/0x630 security/commoncap.c:498
    setxattr+0x2bd/0x350 fs/xattr.c:446
    path_setxattr+0x168/0x1b0 fs/xattr.c:472
    SYSC_setxattr fs/xattr.c:487 [inline]
    SyS_setxattr+0x36/0x50 fs/xattr.c:483
    entry_SYSCALL_64_fastpath+0x18/0x85

    Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
    Cc: # v4.14+
    Signed-off-by: Eric Biggers
    Reviewed-by: Serge Hallyn
    Signed-off-by: James Morris

    Eric Biggers
     

30 Dec, 2017

1 commit

  • Pull x86 page table isolation updates from Thomas Gleixner:
    "This is the final set of enabling page table isolation on x86:

    - Infrastructure patches for handling the extra page tables.

    - Patches which map the various bits and pieces which are required to
    get in and out of user space into the user space visible page
    tables.

    - The required changes to have CR3 switching in the entry/exit code.

    - Optimizations for the CR3 switching along with documentation of how
    the ASID/PCID mechanism works.

    - Updates to dump pagetables to cover the user space page tables for
    W+X scans and extra debugfs files to analyze both the kernel and
    the user space visible page tables

    The whole functionality is compile time controlled via a config switch
    and can be turned on/off on the command line as well"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
    x86/ldt: Make the LDT mapping RO
    x86/mm/dump_pagetables: Allow dumping current pagetables
    x86/mm/dump_pagetables: Check user space page table for WX pages
    x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
    x86/mm/pti: Add Kconfig
    x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
    x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
    x86/mm: Use INVPCID for __native_flush_tlb_single()
    x86/mm: Optimize RESTORE_CR3
    x86/mm: Use/Fix PCID to optimize user/kernel switches
    x86/mm: Abstract switching CR3
    x86/mm: Allow flushing for future ASID switches
    x86/pti: Map the vsyscall page if needed
    x86/pti: Put the LDT in its own PGD if PTI is on
    x86/mm/64: Make a full PGD-entry size hole in the memory map
    x86/events/intel/ds: Map debug buffers in cpu_entry_area
    x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
    x86/mm/pti: Map ESPFIX into user space
    x86/mm/pti: Share entry text PMD
    x86/entry: Align entry text section to PMD boundary
    ...

    Linus Torvalds
     

28 Dec, 2017

1 commit


24 Dec, 2017

1 commit

  • Finally allow CONFIG_PAGE_TABLE_ISOLATION to be enabled.

    PARAVIRT generally requires that the kernel not manage its own page tables.
    It also means that the hypervisor and kernel must agree wholeheartedly
    about what format the page tables are in and what they contain.
    PAGE_TABLE_ISOLATION, unfortunately, changes the rules and they
    can not be used together.

    I've seen conflicting feedback from maintainers lately about whether they
    want the Kconfig magic to go first or last in a patch series. It's going
    last here because the partially-applied series leads to kernels that can
    not boot in a bunch of cases. I did a run through the entire series with
    CONFIG_PAGE_TABLE_ISOLATION=y to look for build errors, though.

    [ tglx: Removed SMP and !PARAVIRT dependencies as they no longer exist ]

    Signed-off-by: Dave Hansen
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Cc: linux-mm@kvack.org
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

18 Dec, 2017

3 commits

  • As done for /proc/kcore in

    commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")

    this adds a bounce buffer when reading memory via /dev/mem. This
    is needed to allow kernel text memory to be read out when built with
    CONFIG_HARDENED_USERCOPY (which refuses to read out kernel text) and
    without CONFIG_STRICT_DEVMEM (which would have refused to read any RAM
    contents at all).
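
    The bounce-buffer idea can be sketched in plain C. This is a userspace
    illustration under stated assumptions: copy_via_bounce() and
    BOUNCE_SIZE are hypothetical, and both steps use memcpy() here, whereas
    the real code uses kernel copy helpers that are not shown.

```c
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 64 /* hypothetical chunk size for this sketch */

/*
 * Copy len bytes from src to dst through an intermediate bounce buffer,
 * one chunk at a time. Returns the number of bytes copied.
 */
size_t copy_via_bounce(void *dst, const void *src, size_t len)
{
	unsigned char bounce[BOUNCE_SIZE];
	const unsigned char *s = src;
	unsigned char *d = dst;
	size_t done = 0;

	while (done < len) {
		size_t chunk = len - done;

		if (chunk > BOUNCE_SIZE)
			chunk = BOUNCE_SIZE;

		/* Step 1: source -> bounce buffer. */
		memcpy(bounce, s + done, chunk);
		/* Step 2: bounce buffer -> destination. */
		memcpy(d + done, bounce, chunk);
		done += chunk;
	}
	return done;
}
```

    The point of the two-step copy is that the final copy to the
    destination always reads from the bounce buffer, never directly from
    the original source.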

    Since this build configuration isn't common (most systems with
    CONFIG_HARDENED_USERCOPY also have CONFIG_STRICT_DEVMEM), this also tries
    to inform Kconfig about the recommended settings.

    This patch is modified from Brad Spengler/PaX Team's changes to /dev/mem
    code in the last public patch of grsecurity/PaX based on my understanding
    of the code. Changes or omissions from the original code are mine and
    don't reflect the original grsecurity/PaX code.

    Reported-by: Michael Holzheu
    Fixes: f5509cc18daa ("mm: Hardened usercopy")
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     
  • i_version is only supported by a filesystem when the SB_I_VERSION
    flag is set. This patch tests for the SB_I_VERSION flag before using
    i_version. If we can't use i_version to detect a file change then we
    must assume the file has changed in the last_writer path and remeasure
    it.

    On filesystems without i_version support IMA used to measure a file
    only once and didn't detect any changes to a file. With this patch
    IMA now works properly on these filesystems.
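
    The decision described above can be sketched as a small predicate.
    needs_remeasure() and its parameters are hypothetical; the boolean
    stands in for the SB_I_VERSION test and the two counters for the stored
    and current i_version values.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a file must be remeasured in the last-writer path.
 * fs_has_i_version: stands in for "SB_I_VERSION is set"
 * cached_version / current_version: hypothetical stored vs. current
 * i_version values
 */
bool needs_remeasure(bool fs_has_i_version,
		     uint64_t cached_version, uint64_t current_version)
{
	/*
	 * Without i_version support there is no way to tell that the
	 * file is unchanged, so assume it has changed.
	 */
	if (!fs_has_i_version)
		return true;
	return cached_version != current_version;
}
```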

    Signed-off-by: Sascha Hauer
    Reviewed-by: Jeff Layton
    Signed-off-by: Mimi Zohar

    Sascha Hauer
     
  • The init_once routine memsets the whole object to 0, and then
    explicitly sets some of the fields to 0 again. Just remove the explicit
    initializations.
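
    A sketch of why the explicit assignments are redundant. The struct and
    its field names are hypothetical and only illustrate the pattern.

```c
#include <string.h>

/* Hypothetical object; field names are illustrative only. */
struct cache_obj_sketch {
	unsigned long flags;
	int count;
	void *ptr;
};

/* Zero the whole object once; no per-field zeroing is needed afterwards. */
void init_once_sketch(void *obj)
{
	struct cache_obj_sketch *o = obj;

	memset(o, 0, sizeof(*o));
	/* Assigning o->flags = 0 or o->count = 0 here would be redundant. */
}
```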

    Signed-off-by: Jeff Layton
    Signed-off-by: Mimi Zohar

    Jeff Layton