21 Nov, 2018

1 commit

  • commit d0ffb805b729322626639336986bc83fc2e60871 upstream.

    Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags
    using arbitrary flags. Because BOTHER is not defined, the general
    Linux code doesn't allow setting arbitrary baud rates, and because
    CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in
    drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037.

    Resolve both problems by #defining BOTHER to 037 on Alpha.

    However, userspace still needs to know if setting BOTHER is actually
    safe given legacy kernels (does anyone actually care about that on
    Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even
    though they use the same structure. Define struct termios2 just for
    compatibility; it is the exact same structure as struct termios. In a
    future patchset, this will be cleaned up so the uapi headers are
    usable from libc.

    Signed-off-by: H. Peter Anvin (Intel)
    Cc: Jiri Slaby
    Cc: Al Viro
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Thomas Gleixner
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Eugene Syromiatnikov
    Cc:
    Cc:
    Cc: Johan Hovold
    Cc: Alan Cox
    Cc:
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin (Intel)
     

30 May, 2018

2 commits

  • [ Upstream commit 472e8c55cf6622d1c112dc2bc777f68bbd4189db ]

    Successful RMW operations are supposed to be fully ordered, but
    Alpha's xchg() and cmpxchg() do not meet this requirement.

    Will Deacon noticed the bug:

    > So MP using xchg:
    >
    > WRITE_ONCE(x, 1)
    > xchg(y, 1)
    >
    > smp_load_acquire(y) == 1
    > READ_ONCE(x) == 0
    >
    > would be allowed.

    ... which thus violates the above requirement.

    Fix it by adding a leading smp_mb() to the xchg() and cmpxchg() implementations.

    Reported-by: Will Deacon
    Signed-off-by: Andrea Parri
    Acked-by: Paul E. McKenney
    Cc: Alan Stern
    Cc: Andrew Morton
    Cc: Ivan Kokshaysky
    Cc: Linus Torvalds
    Cc: Matt Turner
    Cc: Peter Zijlstra
    Cc: Richard Henderson
    Cc: Thomas Gleixner
    Cc: linux-alpha@vger.kernel.org
    Link: http://lkml.kernel.org/r/1519291488-5752-1-git-send-email-parri.andrea@gmail.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Andrea Parri
     
  • [ Upstream commit cb13b424e986aed68d74cbaec3449ea23c50e167 ]

    Continuing along with the fight against smp_read_barrier_depends() [1]
    (or rather, against its improper use), add an unconditional barrier to
    cmpxchg. This guarantees that dependency ordering is preserved when a
    dependency is headed by an unsuccessful cmpxchg. As it turns out, the
    change could enable further simplification of LKMM as proposed in [2].

    [1] https://marc.info/?l=linux-kernel&m=150884953419377&w=2
    https://marc.info/?l=linux-kernel&m=150884946319353&w=2
    https://marc.info/?l=linux-kernel&m=151215810824468&w=2
    https://marc.info/?l=linux-kernel&m=151215816324484&w=2

    [2] https://marc.info/?l=linux-kernel&m=151881978314872&w=2

    Signed-off-by: Andrea Parri
    Acked-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Cc: Alan Stern
    Cc: Ivan Kokshaysky
    Cc: Linus Torvalds
    Cc: Matt Turner
    Cc: Richard Henderson
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-alpha@vger.kernel.org
    Link: http://lkml.kernel.org/r/1519152356-4804-1-git-send-email-parri.andrea@gmail.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Andrea Parri
     

17 Feb, 2018

1 commit

  • commit 84e455361ec97ea6037d31d42a2955628ea2094b upstream.

    Fix the typo (mixed up arguments) in the EXC macro in the futex
    definitions introduced by commit ca282f697381 (alpha: add a
    helper for emitting exception table entries).

    Signed-off-by: Michael Cree
    Signed-off-by: Matt Turner
    Signed-off-by: Greg Kroah-Hartman

    Michael Cree
     

02 Nov, 2017

3 commits

  • Many user space API headers have licensing information, which is either
    incomplete, badly formatted or just a shorthand for referring to the
    license under which the file is supposed to be. This makes it hard for
    compliance tools to determine the correct license.

    Update these files with an SPDX license identifier. The identifier was
    chosen based on the license information in the file.

    GPL/LGPL licensed headers get the matching GPL/LGPL SPDX license
    identifier with the added 'WITH Linux-syscall-note' exception, which is
    the officially assigned exception identifier for the kernel syscall
    exception:

    NOTE! This copyright does *not* cover user programs that use kernel
    services by normal system calls - this is merely considered normal use
    of the kernel, and does *not* fall under the heading of "derived work".

    This exception makes it possible to include GPL headers into non GPL
    code, without confusing license compliance tools.

    Headers which have either explicit dual licensing or are just licensed
    under a non GPL license are updated with the corresponding SPDX
    identifier and the GPLv2 with syscall exception identifier. The format
    is:
    ((GPL-2.0 WITH Linux-syscall-note) OR SPDX-ID-OF-OTHER-LICENSE)

    SPDX license identifiers are a legally binding shorthand, which can be
    used instead of the full boiler plate text. The update does not remove
    existing license information as this has to be done on a case by case
    basis and the copyright holders might have to be consulted. This will
    happen in a separate step.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne. See the previous patch in this series for the
    methodology of how this patch was researched.

    Reviewed-by: Kate Stewart
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • Many user space API headers are missing licensing information, which
    makes it hard for compliance tools to determine the correct license.

    By default are files without license information under the default
    license of the kernel, which is GPLV2. Marking them GPLV2 would exclude
    them from being included in non GPLV2 code, which is obviously not
    intended. The user space API headers fall under the syscall exception
    which is in the kernels COPYING file:

    NOTE! This copyright does *not* cover user programs that use kernel
    services by normal system calls - this is merely considered normal use
    of the kernel, and does *not* fall under the heading of "derived work".

    otherwise syscall usage would not be possible.

    Update the files which contain no license information with an SPDX
    license identifier. The chosen identifier is 'GPL-2.0 WITH
    Linux-syscall-note' which is the officially assigned identifier for the
    Linux syscall exception. SPDX license identifiers are a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne. See the previous patch in this series for the
    methodology of how this patch was researched.

    Reviewed-by: Kate Stewart
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

04 Oct, 2017

1 commit

  • The build of alpha allmodconfig is giving error:

    arch/alpha/include/asm/mmu_context.h: In function 'ev5_switch_mm':
    arch/alpha/include/asm/mmu_context.h:160:2: error:
    implicit declaration of function 'task_thread_info';
    did you mean 'init_thread_info'? [-Werror=implicit-function-declaration]

    The file 'mmu_context.h' needed an extra header file.

    Link: http://lkml.kernel.org/r/1505668810-7497-1-git-send-email-sudipm.mukherjee@gmail.com
    Signed-off-by: Sudip Mukherjee
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sudip Mukherjee
     

12 Sep, 2017

1 commit

  • Pull namespace updates from Eric Biederman:
    "Life has been busy and I have not gotten half as much done this round
    as I would have liked. I delayed it so that a minor conflict
    resolution with the mips tree could spend a little time in linux-next
    before I sent this pull request.

    This includes two long delayed user namespace changes from Kirill
    Tkhai. It also includes a very useful change from Serge Hallyn that
    allows the security capability attribute to be used inside of user
    namespaces. The practical effect of this is people can now untar
    tarballs and install rpms in user namespaces. It had been suggested to
    generalize this and encode some of the namespace information
    information in the xattr name. Upon close inspection that makes the
    things that should be hard easy and the things that should be easy
    more expensive.

    Then there is my bugfix/cleanup for signal injection that removes the
    magic encoding of the siginfo union member from the kernel internal
    si_code. The mips folks reported the case where I had used FPE_FIXME
    me is impossible so I have remove FPE_FIXME from mips, while at the
    same time including a return statement in that case to keep gcc from
    complaining about unitialized variables.

    I almost finished the work to get make copy_siginfo_to_user a trivial
    copy to user. The code is available at:

    git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3

    But I did not have time/energy to get the code posted and reviewed
    before the merge window opened.

    I was able to see that the security excuse for just copying fields
    that we know are initialized doesn't work in practice there are buggy
    initializations that don't initialize the proper fields in siginfo. So
    we still sometimes copy unitialized data to userspace"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    Introduce v3 namespaced file capabilities
    mips/signal: In force_fcr31_sig return in the impossible case
    signal: Remove kernel interal si_code magic
    fcntl: Don't use ambiguous SIG_POLL si_codes
    prctl: Allow local CAP_SYS_ADMIN changing exe_file
    security: Use user_namespace::level to avoid redundant iterations in cap_capable()
    userns,pidns: Verify the userns for new pid namespaces
    signal/testing: Don't look for __SI_FAULT in userspace
    signal/mips: Document a conflict with SI_USER with SIGFPE
    signal/sparc: Document a conflict with SI_USER with SIGFPE
    signal/ia64: Document a conflict with SI_USER with SIGFPE
    signal/alpha: Document a conflict with SI_USER for SIGTRAP

    Linus Torvalds
     

09 Sep, 2017

1 commit

  • Alpha already had an optimised fill-memory-with-16-bit-quantity
    assembler routine called memsetw(). It has a slightly different calling
    convention from memset16() in that it takes a byte count, not a count of
    words. That's the same convention used by ARM's __memset routines, so
    rename Alpha's routine to match and add a memset16() wrapper around it.
    Then convert Alpha's scr_memsetw() to call memset16() instead of
    memsetw().

    Link: http://lkml.kernel.org/r/20170720184539.31609-6-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: "Martin K. Petersen"
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Michael Ellerman
    Cc: Minchan Kim
    Cc: Ralf Baechle
    Cc: Russell King
    Cc: Sam Ravnborg
    Cc: Sergey Senozhatsky
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

07 Sep, 2017

4 commits

  • Merge updates from Andrew Morton:

    - various misc bits

    - DAX updates

    - OCFS2

    - most of MM

    * emailed patches from Andrew Morton : (119 commits)
    mm,fork: introduce MADV_WIPEONFORK
    x86,mpx: make mpx depend on x86-64 to free up VMA flag
    mm: add /proc/pid/smaps_rollup
    mm: hugetlb: clear target sub-page last when clearing huge page
    mm: oom: let oom_reap_task and exit_mmap run concurrently
    swap: choose swap device according to numa node
    mm: replace TIF_MEMDIE checks by tsk_is_oom_victim
    mm, oom: do not rely on TIF_MEMDIE for memory reserves access
    z3fold: use per-cpu unbuddied lists
    mm, swap: don't use VMA based swap readahead if HDD is used as swap
    mm, swap: add sysfs interface for VMA based swap readahead
    mm, swap: VMA based swap readahead
    mm, swap: fix swap readahead marking
    mm, swap: add swap readahead hit statistics
    mm/vmalloc.c: don't reinvent the wheel but use existing llist API
    mm/vmstat.c: fix wrong comment
    selftests/memfd: add memfd_create hugetlbfs selftest
    mm/shmem: add hugetlbfs support to memfd_create()
    mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookups
    mm/vmalloc.c: halve the number of comparisons performed in pcpu_get_vm_areas()
    ...

    Linus Torvalds
     
  • Introduce MADV_WIPEONFORK semantics, which result in a VMA being empty
    in the child process after fork. This differs from MADV_DONTFORK in one
    important way.

    If a child process accesses memory that was MADV_WIPEONFORK, it will get
    zeroes. The address ranges are still valid, they are just empty.

    If a child process accesses memory that was MADV_DONTFORK, it will get a
    segmentation fault, since those address ranges are no longer valid in
    the child after fork.

    Since MADV_DONTFORK also seems to be used to allow very large programs
    to fork in systems with strict memory overcommit restrictions, changing
    the semantics of MADV_DONTFORK might break existing programs.

    MADV_WIPEONFORK only works on private, anonymous VMAs.

    The use case is libraries that store or cache information, and want to
    know that they need to regenerate it in the child process after fork.

    Examples of this would be:
    - systemd/pulseaudio API checks (fail after fork) (replacing a getpid
    check, which is too slow without a PID cache)
    - PKCS#11 API reinitialization check (mandated by specification)
    - glibc's upcoming PRNG (reseed after fork)
    - OpenSSL PRNG (reseed after fork)

    The security benefits of a forking server having a re-inialized PRNG in
    every child process are pretty obvious. However, due to libraries
    having all kinds of internal state, and programs getting compiled with
    many different versions of each library, it is unreasonable to expect
    calling programs to re-initialize everything manually after fork.

    A further complication is the proliferation of clone flags, programs
    bypassing glibc's functions to call clone directly, and programs calling
    unshare, causing the glibc pthread_atfork hook to not get called.

    It would be better to have the kernel take care of this automatically.

    The patch also adds MADV_KEEPONFORK, to undo the effects of a prior
    MADV_WIPEONFORK.

    This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:

    https://man.openbsd.org/minherit.2

    [akpm@linux-foundation.org: numerically order arch/parisc/include/uapi/asm/mman.h #defines]
    Link: http://lkml.kernel.org/r/20170811212829.29186-3-riel@redhat.com
    Signed-off-by: Rik van Riel
    Reported-by: Florian Weimer
    Reported-by: Colm MacCártaigh
    Reviewed-by: Mike Kravetz
    Cc: "H. Peter Anvin"
    Cc: "Kirill A. Shutemov"
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Ingo Molnar
    Cc: Helge Deller
    Cc: Kees Cook
    Cc: Matthew Wilcox
    Cc: Thomas Gleixner
    Cc: Will Drewry
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
     
  • A non-default huge page size can be encoded in the flags argument of the
    mmap system call. The definitions for these encodings are in arch
    specific header files. However, all architectures use the same values.

    Consolidate all the definitions in the primary user header file
    (uapi/linux/mman.h). Include definitions for all known huge page sizes.
    Use the generic encoding definitions in hugetlb_encode.h as the basis
    for these definitions.

    Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.com
    Signed-off-by: Mike Kravetz
    Acked-by: Michal Hocko
    Cc: Andi Kleen
    Cc: Andrea Arcangeli
    Cc: Aneesh Kumar K.V
    Cc: Anshuman Khandual
    Cc: Arnd Bergmann
    Cc: Davidlohr Bueso
    Cc: Matthew Wilcox
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • Pull networking updates from David Miller:

    1) Support ipv6 checksum offload in sunvnet driver, from Shannon
    Nelson.

    2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
    Dumazet.

    3) Allow generic XDP to work on virtual devices, from John Fastabend.

    4) Add bpf device maps and XDP_REDIRECT, which can be used to build
    arbitrary switching frameworks using XDP. From John Fastabend.

    5) Remove UFO offloads from the tree, gave us little other than bugs.

    6) Remove the IPSEC flow cache, from Florian Westphal.

    7) Support ipv6 route offload in mlxsw driver.

    8) Support VF representors in bnxt_en, from Sathya Perla.

    9) Add support for forward error correction modes to ethtool, from
    Vidya Sagar Ravipati.

    10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.

    11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.

    12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.

    13) Add new jump instructions to eBPF, from Daniel Borkmann.

    14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.

    15) Support XDP in tap driver, from Jason Wang.

    16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

    17) Add Huawei hinic ethernet driver.

    18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
    i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
    i40e: avoid NVM acquire deadlock during NVM update
    drivers: net: xgene: Remove return statement from void function
    drivers: net: xgene: Configure tx/rx delay for ACPI
    drivers: net: xgene: Read tx/rx delay for ACPI
    rocker: fix kcalloc parameter order
    rds: Fix non-atomic operation on shared flag variable
    net: sched: don't use GFP_KERNEL under spin lock
    vhost_net: correctly check tx avail during rx busy polling
    net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
    rxrpc: Make service connection lookup always check for retry
    net: stmmac: Delete dead code for MDIO registration
    gianfar: Fix Tx flow control deactivation
    cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
    cxgb4: Fix pause frame count in t4_get_port_stats
    cxgb4: fix memory leak
    tun: rename generic_xdp to skb_xdp
    tun: reserve extra headroom only when XDP is set
    net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
    net: dsa: bcm_sf2: Advertise number of egress queues
    ...

    Linus Torvalds
     

06 Sep, 2017

1 commit

  • Pull alpha updates from Matt Turner:
    "This contains some small clean up patches I've neglected, and some
    build improvements from Ben Hutchings"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
    alpha: math-emu: Fix modular build
    alpha: Restore symbol versions for symbols exported from assembly
    alpha: defconfig: Cleanup from old Kconfig options
    alpha: use kobj_to_dev()
    alpha: squash lines for immediate return
    alpha: kernel: Use vma_pages()
    alpha: silence a buffer overflow warning
    alpha: marvel: make use of raw_spinlock variants
    alpha: cleanup: remove __NR_sys_epoll_*, leave __NR_epoll_*
    alpha: use generic fb.h

    Linus Torvalds
     

05 Sep, 2017

5 commits

  • Add so that genksyms knows the types of
    these symbols and can generate CRCs for them.

    Fixes: 00fc0e0dda62 ("alpha: move exports to actual definitions")
    Signed-off-by: Ben Hutchings
    Signed-off-by: Matt Turner

    Ben Hutchings
     
  • The alpha/marvel code currently implements an irq_chip for handling
    interrupts; due to how irq_chip handling is done, it's necessary for the
    irq_chip methods to be invoked from hardirq context, even on a a
    real-time kernel. Because the spinlock_t type becomes a "sleeping"
    spinlock w/ RT kernels, it is not suitable to be used with irq_chips.

    A quick audit of the operations under the lock reveal that they do only
    minimal, bounded work, and are therefore safe to do under a raw spinlock.

    Signed-off-by: Julia Cartwright
    Signed-off-by: Matt Turner

    Julia Cartwright
     
  • __NR_sys_epoll_create and friends are alpha-specific
    while __NR_epoll_create is a generic name for other
    arches.

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergei Trofimovich
    Signed-off-by: Matt Turner

    Sergei Trofimovich
     
  • The arch uses a verbatim copy of the asm-generic version and does not
    add any own implemntations to the header, so use asm-generic/fb.h
    instead of duplicating code.

    Signed-off-by: Tobias Klauser
    Signed-off-by: Matt Turner

    Tobias Klauser
     
  • Pull locking updates from Ingo Molnar:

    - Add 'cross-release' support to lockdep, which allows APIs like
    completions, where it's not the 'owner' who releases the lock, to be
    tracked. It's all activated automatically under
    CONFIG_PROVE_LOCKING=y.

    - Clean up (restructure) the x86 atomics op implementation to be more
    readable, in preparation of KASAN annotations. (Dmitry Vyukov)

    - Fix static keys (Paolo Bonzini)

    - Add killable versions of down_read() et al (Kirill Tkhai)

    - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)

    - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)

    - Remove smp_mb__before_spinlock() and convert its usages, introduce
    smp_mb__after_spinlock() (Peter Zijlstra)

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
    locking/lockdep/selftests: Fix mixed read-write ABBA tests
    sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
    acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
    locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
    smp: Avoid using two cache lines for struct call_single_data
    locking/lockdep: Untangle xhlock history save/restore from task independence
    locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
    futex: Remove duplicated code and fix undefined behaviour
    Documentation/locking/atomic: Finish the document...
    locking/lockdep: Fix workqueue crossrelease annotation
    workqueue/lockdep: 'Fix' flush_work() annotation
    locking/lockdep/selftests: Add mixed read-write ABBA tests
    mm, locking/barriers: Clarify tlb_flush_pending() barriers
    locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
    locking/lockdep: Explicitly initialize wq_barrier::done::map
    locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
    locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
    locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
    locking/refcounts, x86/asm: Implement fast refcount overflow protection
    locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
    ...

    Linus Torvalds
     

04 Sep, 2017

2 commits

  • Pull RCU updates from Ingo Molnad:
    "The main RCU related changes in this cycle were:

    - Removal of spin_unlock_wait()
    - SRCU updates
    - RCU torture-test updates
    - RCU Documentation updates
    - Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant
    - Miscellaneous RCU fixes
    - CPU-hotplug fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
    arch: Remove spin_unlock_wait() arch-specific definitions
    locking: Remove spin_unlock_wait() generic definitions
    drivers/ata: Replace spin_unlock_wait() with lock/unlock pair
    ipc: Replace spin_unlock_wait() with lock/unlock pair
    exit: Replace spin_unlock_wait() with lock/unlock pair
    completion: Replace spin_unlock_wait() with lock/unlock pair
    doc: Set down RCU's scheduling-clock-interrupt needs
    doc: No longer allowed to use rcu_dereference on non-pointers
    doc: Add RCU files to docbook-generation files
    doc: Update memory-barriers.txt for read-to-write dependencies
    doc: Update RCU documentation
    membarrier: Provide expedited private command
    rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter()
    rcu: Add warning to rcu_idle_enter() for irqs enabled
    rcu: Make rcu_idle_enter() rely on callers disabling irqs
    rcu: Add assertions verifying blocked-tasks list
    rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit()
    rcu: Add TPS() protection for _rcu_barrier_trace strings
    rcu: Use idle versions of swait to make idle-hack clear
    swait: Add idle variants which don't contribute to load average
    ...

    Linus Torvalds
     
  • Conflicts:
    mm/page_alloc.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Sep, 2017

1 commit


30 Aug, 2017

3 commits

  • This fixes compiler errors in perf such as:

    tests/attr.c: In function 'store_event':
    tests/attr.c:66:27: error: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__u64 {aka long unsigned int}' [-Werror=format=]
    snprintf(path, PATH_MAX, "%s/event-%d-%llu-%d", dir,
    ^

    Signed-off-by: Ben Hutchings
    Tested-by: Michael Cree
    Cc: stable@vger.kernel.org
    Signed-off-by: Matt Turner

    Ben Hutchings
     
  • Commit 3cc2dac5be3f ("drivers/video/fbdev/atyfb: Replace MTRR UC hole
    with strong UC") introduces calls to ioremap_wc and ioremap_uc. This
    causes build failures with alpha:allmodconfig. Map the missing functions
    to ioremap_nocache.

    Fixes: 3cc2dac5be3f ("drivers/video/fbdev/atyfb:
    Replace MTRR UC hole with strong UC")
    Cc: Paul Gortmaker
    Cc: Luis R. Rodriguez
    Signed-off-by: Guenter Roeck
    Signed-off-by: Matt Turner

    Guenter Roeck
     
  • Signed-off-by: Richard Henderson
    Signed-off-by: Matt Turner

    Richard Henderson
     

26 Aug, 2017

1 commit

  • There is code duplicated over all architecture's headers for
    futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
    and comparison of the result.

    Remove this duplication and leave up to the arches only the needed
    assembly which is now in arch_futex_atomic_op_inuser.

    This effectively distributes the Will Deacon's arm64 fix for undefined
    behaviour reported by UBSAN to all architectures. The fix was done in
    commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with
    FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.

    And as suggested by Thomas, check for negative oparg too, because it was
    also reported to cause undefined behaviour report.

    Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
    remove pointless access_ok() checks") as access_ok there returns true.
    We introduce it back to the helper for the sake of simplicity (it gets
    optimized away anyway).

    Signed-off-by: Jiri Slaby
    Signed-off-by: Thomas Gleixner
    Acked-by: Russell King
    Acked-by: Michael Ellerman (powerpc)
    Acked-by: Heiko Carstens [s390]
    Acked-by: Chris Metcalf [for tile]
    Reviewed-by: Darren Hart (VMware)
    Reviewed-by: Will Deacon [core/arm64]
    Cc: linux-mips@linux-mips.org
    Cc: Rich Felker
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: peterz@infradead.org
    Cc: Benjamin Herrenschmidt
    Cc: Max Filippov
    Cc: Paul Mackerras
    Cc: sparclinux@vger.kernel.org
    Cc: Jonas Bonn
    Cc: linux-s390@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: Yoshinori Sato
    Cc: linux-hexagon@vger.kernel.org
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Catalin Marinas
    Cc: Matt Turner
    Cc: linux-snps-arc@lists.infradead.org
    Cc: Fenghua Yu
    Cc: Arnd Bergmann
    Cc: linux-xtensa@linux-xtensa.org
    Cc: Stefan Kristiansson
    Cc: openrisc@lists.librecores.org
    Cc: Ivan Kokshaysky
    Cc: Stafford Horne
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Richard Henderson
    Cc: Chris Zankel
    Cc: Michal Simek
    Cc: Tony Luck
    Cc: linux-parisc@vger.kernel.org
    Cc: Vineet Gupta
    Cc: Ralf Baechle
    Cc: Richard Kuo
    Cc: linux-alpha@vger.kernel.org
    Cc: Martin Schwidefsky
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: "David S. Miller"
    Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz

    Jiri Slaby
     

17 Aug, 2017

1 commit

  • There is no agreed-upon definition of spin_unlock_wait()'s semantics,
    and it appears that all callers could do just as well with a lock/unlock
    pair. This commit therefore removes the underlying arch-specific
    arch_spin_unlock_wait() for all architectures providing them.

    Signed-off-by: Paul E. McKenney
    Cc:
    Cc: Peter Zijlstra
    Cc: Alan Stern
    Cc: Andrea Parri
    Cc: Linus Torvalds
    Acked-by: Will Deacon
    Acked-by: Boqun Feng

    Paul E. McKenney
     

04 Aug, 2017

1 commit

  • The send call ignores unknown flags. Legacy applications may already
    unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a
    socket opts in to zerocopy.

    Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing.
    Processes can also query this socket option to detect kernel support
    for the feature. Older kernels will return ENOPROTOOPT.

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

25 Jul, 2017

1 commit

  • struct siginfo is a union and the kernel since 2.4 has been hiding a union
    tag in the high 16bits of si_code using the values:
    __SI_KILL
    __SI_TIMER
    __SI_POLL
    __SI_FAULT
    __SI_CHLD
    __SI_RT
    __SI_MESGQ
    __SI_SYS

    While this looks plausible on the surface, in practice this situation has
    not worked well.

    - Injected positive signals are not copied to user space properly
    unless they have these magic high bits set.

    - Injected positive signals are not reported properly by signalfd
    unless they have these magic high bits set.

    - These kernel internal values leaked to userspace via ptrace_peek_siginfo

    - It was possible to inject these kernel internal values and cause the
    the kernel to misbehave.

    - Kernel developers got confused and expected these kernel internal values
    in userspace in kernel self tests.

    - Kernel developers got confused and set si_code to __SI_FAULT which
    is SI_USER in userspace which causes userspace to think an ordinary user
    sent the signal and that it was not kernel generated.

    - The values make it impossible to reorganize the code to transform
    siginfo_copy_to_user into a plain copy_to_user. As si_code must
    be massaged before being passed to userspace.

    So remove these kernel internal si codes and make the kernel code simpler
    and more maintainable.

    To replace these kernel internal magic si_codes introduce the helper
    function siginfo_layout, that takes a signal number and an si_code and
    computes which union member of siginfo is being used. Have
    siginfo_layout return an enumeration so that gcc will have enough
    information to warn if a switch statement does not handle all of union
    members.

    A couple of architectures have a messed up ABI that defines signal
    specific duplications of SI_USER which causes more special cases in
    siginfo_layout than I would like. The good news is only problem
    architectures pay the cost.

    Update all of the code that used the previous magic __SI_ values to
    use the new SIL_ values and to call siginfo_layout to get those
    values. Escept where not all of the cases are handled remove the
    defaults in the switch statements so that if a new case is missed in
    the future the lack will show up at compile time.

    Modify the code that copies siginfo si_code to userspace to just copy
    the value and not cast si_code to a short first. The high bits are no
    longer used to hold a magic union member.

    Fixup the siginfo header files to stop including the __SI_ values in
    their constants and for the headers that were missing it to properly
    update the number of si_codes for each signal type.

    The fixes to copy_siginfo_from_user32 implementations has the
    interesting property that several of them perviously should never have
    worked as the __SI_ values they depended up where kernel internal.
    With that dependency gone those implementations should work much
    better.

    The idea of not passing the __SI_ values out to userspace and then
    not reinserting them has been tested with criu and criu worked without
    changes.

    Ref: 2.4.0-test1
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

20 Jul, 2017

1 commit

  • Setting si_code to __SI_FAULT results in a userspace seeing
    an si_code of 0. This is the same si_code as SI_USER. Posix
    and common sense requires that SI_USER not be a signal specific
    si_code. As such this use of 0 for the si_code is a pretty
    horribly broken ABI.

    Given that alpha is on it's last legs I don't know that it is worth
    fixing this, but it is worth documenting what is going on so that
    no one decides to copy this bad decision.

    This was introduced during the 2.5 development cycle so this
    mess has had a long time for people to be able to depend upon it.

    v2: Added FPE_FIXME for alpha as Helge Deller pointed out
    with his alternate patch one of the cases is SIGFPE not SIGTRAP.

    Cc: Helge Deller
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: linux-alpha@vger.kernel.org
    Acked-by: Richard Henderson
    History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
    Ref: 0a635c7a84cf ("Fill in siginfo_t.")
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

17 Jul, 2017

1 commit

  • This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag
    because it doesn't copy anything from/to userspace to access the
    argument.

    Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl")
    Signed-off-by: Gleb Fotengauer-Malinovskiy
    Acked-by: Aleksa Sarai
    Acked-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Gleb Fotengauer-Malinovskiy
     

07 Jul, 2017

2 commits

  • Pull user access str* updates from Al Viro:
    "uaccess str...() dead code removal"

    * 'uaccess.strlen' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    s390 keyboard.c: don't open-code strndup_user()
    mips: get rid of unused __strnlen_user()
    get rid of unused __strncpy_from_user() instances
    kill strlen_user()

    Linus Torvalds
     
  • Pull misc compat stuff updates from Al Viro:
    "This part is basically untangling various compat stuff. Compat
    syscalls moved to their native counterparts, getting rid of quite a
    bit of double-copying and/or set_fs() uses. A lot of field-by-field
    copyin/copyout killed off.

    - kernel/compat.c is much closer to containing just the
    copyin/copyout of compat structs. Not all compat syscalls are gone
    from it yet, but it's getting there.

    - ipc/compat_mq.c killed off completely.

    - block/compat_ioctl.c cleaned up; floppy compat ioctls moved to
    drivers/block/floppy.c where they belong. Yes, there are several
    drivers that implement some of the same ioctls. Some are m68k and
    one is 32bit-only pmac. drivers/block/floppy.c is the only one in
    that bunch that can be built on biarch"

    * 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    mqueue: move compat syscalls to native ones
    usbdevfs: get rid of field-by-field copyin
    compat_hdio_ioctl: get rid of set_fs()
    take floppy compat ioctls to sodding floppy.c
    ipmi: get rid of field-by-field __get_user()
    ipmi: get COMPAT_IPMICTL_RECEIVE_MSG in sync with the native one
    rt_sigtimedwait(): move compat to native
    select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()
    put_compat_rusage(): switch to copy_to_user()
    sigpending(): move compat to native
    getrlimit()/setrlimit(): move compat to native
    times(2): move compat to native
    compat_{get,put}_bitmap(): use unsafe_{get,put}_user()
    fb_get_fscreeninfo(): don't bother with do_fb_ioctl()
    do_sigaltstack(): lift copying to/from userland into callers
    take compat_sys_old_getrlimit() to native syscall
    trim __ARCH_WANT_SYS_OLD_GETRLIMIT

    Linus Torvalds
     

06 Jul, 2017

1 commit

  • Pull networking updates from David Miller:
    "Reasonably busy this cycle, but perhaps not as busy as in the 4.12
    merge window:

    1) Several optimizations for UDP processing under high load from
    Paolo Abeni.

    2) Support pacing internally in TCP when using the sch_fq packet
    scheduler for this is not practical. From Eric Dumazet.

    3) Support mutliple filter chains per qdisc, from Jiri Pirko.

    4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

    5) Add batch dequeueing to vhost_net, from Jason Wang.

    6) Flesh out more completely SCTP checksum offload support, from
    Davide Caratti.

    7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
    Neira Ayuso, and Matthias Schiffer.

    8) Add devlink support to nfp driver, from Simon Horman.

    9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
    Prabhu.

    10) Add stack depth tracking to BPF verifier and use this information
    in the various eBPF JITs. From Alexei Starovoitov.

    11) Support XDP on qed device VFs, from Yuval Mintz.

    12) Introduce BPF PROG ID for better introspection of installed BPF
    programs. From Martin KaFai Lau.

    13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

    14) For loads, allow narrower accesses in bpf verifier checking, from
    Yonghong Song.

    15) Support MIPS in the BPF selftests and samples infrastructure, the
    MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
    Daney.

    16) Support kernel based TLS, from Dave Watson and others.

    17) Remove completely DST garbage collection, from Wei Wang.

    18) Allow installing TCP MD5 rules using prefixes, from Ivan
    Delalande.

    19) Add XDP support to Intel i40e driver, from Björn Töpel

    20) Add support for TC flower offload in nfp driver, from Simon
    Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
    Kicinski, and Bert van Leeuwen.

    21) IPSEC offloading support in mlx5, from Ilan Tayari.

    22) Add HW PTP support to macb driver, from Rafal Ozieblo.

    23) Networking refcount_t conversions, From Elena Reshetova.

    24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
    for tuning the TCP sockopt settings of a group of applications,
    currently via CGROUPs"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
    net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
    dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
    cxgb4: Support for get_ts_info ethtool method
    cxgb4: Add PTP Hardware Clock (PHC) support
    cxgb4: time stamping interface for PTP
    nfp: default to chained metadata prepend format
    nfp: remove legacy MAC address lookup
    nfp: improve order of interfaces in breakout mode
    net: macb: remove extraneous return when MACB_EXT_DESC is defined
    bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
    bpf: fix return in load_bpf_file
    mpls: fix rtm policy in mpls_getroute
    net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
    net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
    net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
    net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
    net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
    net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
    net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
    net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
    ...

    Linus Torvalds
     

21 Jun, 2017

1 commit

  • This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
    retrieve the auxiliary groups of the remote peer. It is designed to
    naturally extend SO_PEERCRED. That is, the underlying data is from the
    same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
    is, if the provided buffer is too small, ERANGE is returned and @optlen
    is updated. Otherwise, the information is copied, @optlen is set to the
    actual size, and 0 is returned.

    While SO_PEERCRED (and thus `struct ucred') already returns the primary
    group, it lacks the auxiliary group vector. However, nearly all access
    controls (including kernel side VFS and SYSVIPC, but also user-space
    polkit, DBus, ...) consider the entire set of groups, rather than just
    the primary group. But this is currently not possible with pure
    SO_PEERCRED. Instead, user-space has to work around this and query the
    system database for the auxiliary groups of a UID retrieved via
    SO_PEERCRED.

    Unfortunately, there is no race-free way to query the auxiliary groups
    of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
    solution is to use getgrouplist(3p), which itself falls back to NSS and
    whatever is configured in nsswitch.conf(3). This effectively checks
    which groups we *would* assign to the user if it logged in *now*. On
    normal systems it is as easy as reading /etc/group, but with NSS it can
    resort to quering network databases (eg., LDAP), using IPC or network
    communication.

    Long story short: Whenever we want to use auxiliary groups for access
    checks on IPC, we need further IPC to talk to the user/group databases,
    rather than just relying on SO_PEERCRED and the incoming socket. This
    is unfortunate, and might even result in dead-locks if the database
    query uses the same IPC as the original request.

    So far, those recursions / dead-locks have been avoided by using
    primitive IPC for all crucial NSS modules. However, we want to avoid
    re-inventing the wheel for each NSS module that might be involved in
    user/group queries. Hence, we would preferably make DBus (and other IPC
    that supports access-management based on groups) work without resorting
    to the user/group database. This new SO_PEERGROUPS ioctl would allow us
    to make dbus-daemon work without ever calling into NSS.

    Cc: Michal Sekletar
    Cc: Simon McVittie
    Reviewed-by: Tom Gundersen
    Signed-off-by: David Herrmann
    Signed-off-by: David S. Miller

    David Herrmann
     

09 Jun, 2017

1 commit

  • When opening the slave end of a PTY, it is not possible for userspace to
    safely ensure that /dev/pts/$num is actually a slave (in cases where the
    mount namespace in which devpts was mounted is controlled by an
    untrusted process). In addition, there are several unresolvable
    race conditions if userspace were to attempt to detect attacks through
    stat(2) and other similar methods [in addition it is not clear how
    userspace could detect attacks involving FUSE].

    Resolve this by providing an interface for userpace to safely open the
    "peer" end of a PTY file descriptor by using the dentry cached by
    devpts. Since it is not possible to have an open master PTY without
    having its slave exposed in /dev/pts this interface is safe. This
    interface currently does not provide a way to get the master pty (since
    it is not clear whether such an interface is safe or even useful).

    Cc: Christian Brauner
    Cc: Valentin Rothberg
    Signed-off-by: Aleksa Sarai
    Signed-off-by: Greg Kroah-Hartman

    Aleksa Sarai
     

28 May, 2017

1 commit


22 May, 2017

1 commit


16 May, 2017

1 commit