20 Feb, 2019

2 commits

  • commit bfc913682464f45bc4d6044084e370f9048de9d5 upstream.

    Eiger machine vector definition has nr_irqs 128, and working 2.6.26
    boot shows SCSI getting IRQ-s 64 and 65. Current kernel boot fails
    because Symbios SCSI fails to request IRQ-s and does not find the disks.
    It has been broken at least since 3.18 - the earliest I could test with
    my gcc-5.

    The headers have moved around and possibly another order of defines has
    worked in the past - but since 128 seems to be correct and used, fix
    arch/alpha/include/asm/irq.h to have NR_IRQS=128 for Eiger.

    This fixes 4.19-rc7 boot on my Force Flexor A264 (Eiger subarch).

    Cc: stable@vger.kernel.org # v3.18+
    Signed-off-by: Meelis Roos
    Signed-off-by: Matt Turner
    Signed-off-by: Greg Kroah-Hartman

    Meelis Roos
     
  • commit 491af60ffb848b59e82f7c9145833222e0bf27a5 upstream.

    Fix page fault handling code to fixup r16-r18 registers.
    Before the patch code had off-by-two registers bug.
    This bug caused overwriting of ps,pc,gp registers instead
    of fixing intended r16,r17,r18 (see `struct pt_regs`).

    More details:

    Initially Dmitry noticed a kernel bug as a failure
    on strace test suite. Test passes unmapped userspace
    pointer to io_submit:

    ```c
    #include
    #include
    #include
    #include
    int main(void)
    {
    unsigned long ctx = 0;
    if (syscall(__NR_io_setup, 1, &ctx))
    err(1, "io_setup");
    const size_t page_size = sysconf(_SC_PAGESIZE);
    const size_t size = page_size * 2;
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (MAP_FAILED == ptr)
    err(1, "mmap(%zu)", size);
    if (munmap(ptr, size))
    err(1, "munmap");
    syscall(__NR_io_submit, ctx, 1, ptr + page_size);
    syscall(__NR_io_destroy, ctx);
    return 0;
    }
    ```

    Running this test causes kernel to crash when handling page fault:

    ```
    Unable to handle kernel paging request at virtual address ffffffffffff9468
    CPU 3
    aio(26027): Oops 0
    pc = [] ra = [] ps = 0000 Not tainted
    pc is at sys_io_submit+0x108/0x200
    ra is at sys_io_submit+0x6c/0x200
    v0 = fffffc00c58e6300 t0 = fffffffffffffff2 t1 = 000002000025e000
    t2 = fffffc01f159fef8 t3 = fffffc0001009640 t4 = fffffc0000e0f6e0
    t5 = 0000020001002e9e t6 = 4c41564e49452031 t7 = fffffc01f159c000
    s0 = 0000000000000002 s1 = 000002000025e000 s2 = 0000000000000000
    s3 = 0000000000000000 s4 = 0000000000000000 s5 = fffffffffffffff2
    s6 = fffffc00c58e6300
    a0 = fffffc00c58e6300 a1 = 0000000000000000 a2 = 000002000025e000
    a3 = 00000200001ac260 a4 = 00000200001ac1e8 a5 = 0000000000000001
    t8 = 0000000000000008 t9 = 000000011f8bce30 t10= 00000200001ac440
    t11= 0000000000000000 pv = fffffc00006fd320 at = 0000000000000000
    gp = 0000000000000000 sp = 00000000265fd174
    Disabling lock debugging due to kernel taint
    Trace:
    [] entSys+0xa4/0xc0
    ```

    Here `gp` has invalid value. `gp is s overwritten by a fixup for the
    following page fault handler in `io_submit` syscall handler:

    ```
    __se_sys_io_submit
    ...
    ldq a1,0(t1)
    bne t0,4280
    ```

    After a page fault `t0` should contain -EFALUT and `a1` is 0.
    Instead `gp` was overwritten in place of `a1`.

    This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18`
    (aka `a0-a2`).

    I think the bug went unnoticed for a long time as `gp` is one
    of scratch registers. Any kernel function call would re-calculate `gp`.

    Dmitry tracked down the bug origin back to 2.1.32 kernel version
    where trap_a{0,1,2} fields were inserted into struct pt_regs.
    And even before that `dpf_reg()` contained off-by-one error.

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Reported-and-reviewed-by: "Dmitry V. Levin"
    Cc: stable@vger.kernel.org # v2.1.32+
    Bug: https://bugs.gentoo.org/672040
    Signed-off-by: Sergei Trofimovich
    Signed-off-by: Matt Turner
    Signed-off-by: Greg Kroah-Hartman

    Sergei Trofimovich
     

21 Nov, 2018

1 commit

  • commit d0ffb805b729322626639336986bc83fc2e60871 upstream.

    Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags
    using arbitrary flags. Because BOTHER is not defined, the general
    Linux code doesn't allow setting arbitrary baud rates, and because
    CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in
    drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037.

    Resolve both problems by #defining BOTHER to 037 on Alpha.

    However, userspace still needs to know if setting BOTHER is actually
    safe given legacy kernels (does anyone actually care about that on
    Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even
    though they use the same structure. Define struct termios2 just for
    compatibility; it is the exact same structure as struct termios. In a
    future patchset, this will be cleaned up so the uapi headers are
    usable from libc.

    Signed-off-by: H. Peter Anvin (Intel)
    Cc: Jiri Slaby
    Cc: Al Viro
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Thomas Gleixner
    Cc: Kate Stewart
    Cc: Philippe Ombredanne
    Cc: Eugene Syromiatnikov
    Cc:
    Cc:
    Cc: Johan Hovold
    Cc: Alan Cox
    Cc:
    Signed-off-by: Greg Kroah-Hartman

    H. Peter Anvin (Intel)
     

25 Aug, 2018

1 commit

  • Pull namespace fixes from Eric Biederman:
    "This is a set of four fairly obvious bug fixes:

    - a switch from d_find_alias to d_find_any_alias because the xattr
    code perversely takes a dentry

    - two mutex vs copy_to_user fixes from Jann Horn

    - a fix to use a sanitized size not the size userspace passed in from
    Christian Brauner"

    * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    getxattr: use correct xattr length
    sys: don't hold uts_sem while accessing userspace memory
    userns: move user access out of the mutex
    cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias()

    Linus Torvalds
     

18 Aug, 2018

1 commit

  • Use new return type vm_fault_t for fault handler. For now, this is just
    documenting that the function returns a VM_FAULT value rather than an
    errno. Once all instances are converted, vm_fault_t will become a
    distinct type.

    Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    In this patch all the caller of handle_mm_fault() are changed to return
    vm_fault_t type.

    Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
    Signed-off-by: Souptick Joarder
    Cc: Matthew Wilcox
    Cc: Richard Henderson
    Cc: Tony Luck
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Richard Kuo
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: James Hogan
    Cc: Ley Foon Tan
    Cc: Jonas Bonn
    Cc: James E.J. Bottomley
    Cc: Benjamin Herrenschmidt
    Cc: Palmer Dabbelt
    Cc: Yoshinori Sato
    Cc: David S. Miller
    Cc: Richard Weinberger
    Cc: Guan Xuetao
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: "Levin, Alexander (Sasha Levin)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     

16 Aug, 2018

3 commits

  • Pull networking updates from David Miller:
    "Highlights:

    - Gustavo A. R. Silva keeps working on the implicit switch fallthru
    changes.

    - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
    Luca Coelho.

    - Re-enable ASPM in r8169, from Kai-Heng Feng.

    - Add virtual XFRM interfaces, which avoids all of the limitations of
    existing IPSEC tunnels. From Steffen Klassert.

    - Convert GRO over to use a hash table, so that when we have many
    flows active we don't traverse a long list during accumluation.

    - Many new self tests for routing, TC, tunnels, etc. Too many
    contributors to mention them all, but I'm really happy to keep
    seeing this stuff.

    - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.

    - Lots of cleanups and fixes in L2TP code from Guillaume Nault.

    - Add IPSEC offload support to netdevsim, from Shannon Nelson.

    - Add support for slotting with non-uniform distribution to netem
    packet scheduler, from Yousuk Seung.

    - Add UDP GSO support to mlx5e, from Boris Pismenny.

    - Support offloading of Team LAG in NFP, from John Hurley.

    - Allow to configure TX queue selection based upon RX queue, from
    Amritha Nambiar.

    - Support ethtool ring size configuration in aquantia, from Anton
    Mikaev.

    - Support DSCP and flowlabel per-transport in SCTP, from Xin Long.

    - Support list based batching and stack traversal of SKBs, this is
    very exciting work. From Edward Cree.

    - Busyloop optimizations in vhost_net, from Toshiaki Makita.

    - Introduce the ETF qdisc, which allows time based transmissions. IGB
    can offload this in hardware. From Vinicius Costa Gomes.

    - Add parameter support to devlink, from Moshe Shemesh.

    - Several multiplication and division optimizations for BPF JIT in
    nfp driver, from Jiong Wang.

    - Lots of prepatory work to make more of the packet scheduler layer
    lockless, when possible, from Vlad Buslov.

    - Add ACK filter and NAT awareness to sch_cake packet scheduler, from
    Toke Høiland-Jørgensen.

    - Support regions and region snapshots in devlink, from Alex Vesker.

    - Allow to attach XDP programs to both HW and SW at the same time on
    a given device, with initial support in nfp. From Jakub Kicinski.

    - Add TLS RX offload and support in mlx5, from Ilya Lesokhin.

    - Use PHYLIB in r8169 driver, from Heiner Kallweit.

    - All sorts of changes to support Spectrum 2 in mlxsw driver, from
    Ido Schimmel.

    - PTP support in mv88e6xxx DSA driver, from Andrew Lunn.

    - Make TCP_USER_TIMEOUT socket option more accurate, from Jon
    Maxwell.

    - Support for templates in packet scheduler classifier, from Jiri
    Pirko.

    - IPV6 support in RDS, from Ka-Cheong Poon.

    - Native tproxy support in nf_tables, from Máté Eckl.

    - Maintain IP fragment queue in an rbtree, but optimize properly for
    in-order frags. From Peter Oskolkov.

    - Improvde handling of ACKs on hole repairs, from Yuchung Cheng"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
    bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
    hv/netvsc: Fix NULL dereference at single queue mode fallback
    net: filter: mark expected switch fall-through
    xen-netfront: fix warn message as irq device name has '/'
    cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
    net: dsa: mv88e6xxx: missing unlock on error path
    rds: fix building with IPV6=m
    inet/connection_sock: prefer _THIS_IP_ to current_text_addr
    net: dsa: mv88e6xxx: bitwise vs logical bug
    net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
    ieee802154: hwsim: using right kind of iteration
    net: hns3: Add vlan filter setting by ethtool command -K
    net: hns3: Set tx ring' tc info when netdev is up
    net: hns3: Remove tx ring BD len register in hns3_enet
    net: hns3: Fix desc num set to default when setting channel
    net: hns3: Fix for phy link issue when using marvell phy driver
    net: hns3: Fix for information of phydev lost problem when down/up
    net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
    net: hns3: Add support for serdes loopback selftest
    bnxt_en: take coredump_record structure off stack
    ...

    Linus Torvalds
     
  • Pull Kconfig consolidation from Masahiro Yamada:
    "Consolidation of Kconfig files by Christoph Hellwig.

    Move the source statements of arch-independent Kconfig files instead
    of duplicating the includes in every arch/$(SRCARCH)/Kconfig"

    * tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kconfig: add a Memory Management options" menu
    kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt
    kconfig: use a menu in arch/Kconfig to reduce clutter
    kconfig: include kernel/Kconfig.preempt from init/Kconfig
    Kconfig: consolidate the "Kernel hacking" menu
    kconfig: include common Kconfig files from top-level Kconfig
    kconfig: remove duplicate SWAP symbol defintions
    um: create a proper drivers Kconfig
    um: cleanup Kconfig files
    um: stop abusing KBUILD_KCONFIG

    Linus Torvalds
     
  • Pull Kbuild updates from Masahiro Yamada:

    - verify depmod is installed before modules_install

    - support build salt in case build ids must be unique between builds

    - allow users to specify additional host compiler flags via HOST*FLAGS,
    and rename internal variables to KBUILD_HOST*FLAGS

    - update buildtar script to drop vax support, add arm64 support

    - update builddeb script for better debarch support

    - document the pit-fall of if_changed usage

    - fix parallel build of UML with O= option

    - make 'samples' target depend on headers_install to fix build errors

    - remove deprecated host-progs variable

    - add a new coccinelle script for refcount_t vs atomic_t check

    - improve double-test coccinelle script

    - misc cleanups and fixes

    * tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits)
    coccicheck: return proper error code on fail
    Coccinelle: doubletest: reduce side effect false positives
    kbuild: remove deprecated host-progs variable
    kbuild: make samples really depend on headers_install
    um: clean up archheaders recipe
    kbuild: add %asm-generic to no-dot-config-targets
    um: fix parallel building with O= option
    scripts: Add Python 3 support to tracing/draw_functrace.py
    builddeb: Add automatic support for sh{3,4}{,eb} architectures
    builddeb: Add automatic support for riscv* architectures
    builddeb: Add automatic support for m68k architecture
    builddeb: Add automatic support for or1k architecture
    builddeb: Add automatic support for sparc64 architecture
    builddeb: Add automatic support for mips{,64}r6{,el} architectures
    builddeb: Add automatic support for mips64el architecture
    builddeb: Add automatic support for ppc64 and powerpcspe architectures
    builddeb: Introduce functions to simplify kconfig tests in set_debarch
    builddeb: Drop check for 32-bit s390
    builddeb: Change architecture detection fallback to use dpkg-architecture
    builddeb: Skip architecture detection when KBUILD_DEBARCH is set
    ...

    Linus Torvalds
     

14 Aug, 2018

1 commit

  • Pull locking/atomics update from Thomas Gleixner:
    "The locking, atomics and memory model brains delivered:

    - A larger update to the atomics code which reworks the ordering
    barriers, consolidates the atomic primitives, provides the new
    atomic64_fetch_add_unless() primitive and cleans up the include
    hell.

    - Simplify cmpxchg() instrumentation and add instrumentation for
    xchg() and cmpxchg_double().

    - Updates to the memory model and documentation"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
    locking/atomics: Rework ordering barriers
    locking/atomics: Instrument cmpxchg_double*()
    locking/atomics: Instrument xchg()
    locking/atomics: Simplify cmpxchg() instrumentation
    locking/atomics/x86: Reduce arch_cmpxchg64*() instrumentation
    tools/memory-model: Rename litmus tests to comply to norm7
    tools/memory-model/Documentation: Fix typo, smb->smp
    sched/Documentation: Update wake_up() & co. memory-barrier guarantees
    locking/spinlock, sched/core: Clarify requirements for smp_mb__after_spinlock()
    sched/core: Use smp_mb() in wake_woken_function()
    tools/memory-model: Add informal LKMM documentation to MAINTAINERS
    locking/atomics/Documentation: Describe atomic_set() as a write operation
    tools/memory-model: Make scripts executable
    tools/memory-model: Remove ACCESS_ONCE() from model
    tools/memory-model: Remove ACCESS_ONCE() from recipes
    locking/memory-barriers.txt/kokr: Update Korean translation to fix broken DMA vs. MMIO ordering example
    MAINTAINERS: Add Daniel Lustig as an LKMM reviewer
    tools/memory-model: Fix ISA2+pooncelock+pooncelock+pombonce name
    tools/memory-model: Add litmus test for full multicopy atomicity
    locking/refcount: Always allow checked forms
    ...

    Linus Torvalds
     

11 Aug, 2018

1 commit

  • Holding uts_sem as a writer while accessing userspace memory allows a
    namespace admin to stall all processes that attempt to take uts_sem.
    Instead, move data through stack buffers and don't access userspace memory
    while uts_sem is held.

    Cc: stable@vger.kernel.org
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Jann Horn
    Signed-off-by: Eric W. Biederman

    Jann Horn
     

02 Aug, 2018

3 commits

  • Almost all architectures include it. Add a ARCH_NO_PREEMPT symbol to
    disable preempt support for alpha, hexagon, non-coldfire m68k and
    user mode Linux.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Masahiro Yamada

    Christoph Hellwig
     
  • Move the source of lib/Kconfig.debug and arch/$(ARCH)/Kconfig.debug to
    the top-level Kconfig. For two architectures that means moving their
    arch-specific symbols in that menu into a new arch Kconfig.debug file,
    and for a few more creating a dummy file so that we can include it
    unconditionally.

    Also move the actual 'Kernel hacking' menu to lib/Kconfig.debug, where
    it belongs.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Masahiro Yamada

    Christoph Hellwig
     
  • Instead of duplicating the source statements in every architecture just
    do it once in the toplevel Kconfig file.

    Note that with this the inclusion of arch/$(SRCARCH/Kconfig moves out of
    the top-level Kconfig into arch/Kconfig so that don't violate ordering
    constraits while keeping a sensible menu structure.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Masahiro Yamada

    Christoph Hellwig
     

25 Jul, 2018

2 commits

  • Currently architectures can override __atomic_op_*() to define the barriers
    used before/after a relaxed atomic when used to build acquire/release/fence
    variants.

    This has the unfortunate property of requiring the architecture to define the
    full wrapper for the atomics, rather than just the barriers they care about,
    and gets in the way of generating atomics which can be easily read.

    Instead, this patch has architectures define an optional set of barriers:

    * __atomic_acquire_fence()
    * __atomic_release_fence()
    * __atomic_pre_full_fence()
    * __atomic_post_full_fence()

    ... which uses to build the wrappers.

    It would be nice if we could undef these, along with the __atomic_op_*()
    wrappers, but that would break the cmpxchg() wrappers, which are written
    in preprocessor. Undefs would have been nice, but alas.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrea Parri
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: andy.shevchenko@gmail.com
    Cc: arnd@arndb.de
    Cc: aryabinin@virtuozzo.com
    Cc: catalin.marinas@arm.com
    Cc: dvyukov@google.com
    Cc: glider@google.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: peter@hurleysoftware.com
    Link: http://lkml.kernel.org/r/20180716113017.3909-7-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • David S. Miller
     

23 Jul, 2018

1 commit

  • kernel_wait4() expects a userland address for status - it's only
    rusage that goes as a kernel one (and needs a copyout afterwards)

    [ Also, fix the prototype of kernel_wait4() to have that __user
    annotation - Linus ]

    Fixes: 92ebce5ac55d ("osf_wait4: switch to kernel_wait4()")
    Cc: stable@kernel.org # v4.13+
    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

18 Jul, 2018

1 commit


17 Jul, 2018

1 commit


04 Jul, 2018

1 commit

  • This patch introduces SO_TXTIME. User space enables this option in
    order to pass a desired future transmit time in a CMSG when calling
    sendmsg(2). The argument to this socket option is a 8-bytes long struct
    provided by the uapi header net_tstamp.h defined as:

    struct sock_txtime {
    clockid_t clockid;
    u32 flags;
    };

    Note that new fields were added to struct sock by filling a 2-bytes
    hole found in the struct. For that reason, neither the struct size or
    number of cachelines were altered.

    Signed-off-by: Richard Cochran
    Signed-off-by: Jesus Sanchez-Palencia
    Signed-off-by: David S. Miller

    Richard Cochran
     

21 Jun, 2018

7 commits

  • The conditional inc/dec ops differ for atomic_t and atomic64_t:

    - atomic_inc_unless_positive() is optional for atomic_t, and doesn't exist for atomic64_t.
    - atomic_dec_unless_negative() is optional for atomic_t, and doesn't exist for atomic64_t.
    - atomic_dec_if_positive is optional for atomic_t, and is mandatory for atomic64_t.

    Let's make these consistently optional for both. At the same time, let's
    clean up the existing fallbacks to use atomic_try_cmpxchg().

    The instrumented atomics are updated accordingly.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Acked-by: Peter Zijlstra (Intel)
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/lkml/20180621121321.4761-18-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • Many of the inc/dec ops are mandatory, but for most architectures inc/dec are
    simply trivial wrappers around their corresponding add/sub ops.

    Let's make all the inc/dec ops optional, so that we can get rid of these
    boilerplate wrappers.

    The instrumented atomics are updated accordingly.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Palmer Dabbelt
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/lkml/20180621121321.4761-17-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • Some of the atomics return the result of a test applied after the atomic
    operation, and almost all architectures implement these as trivial
    wrappers around the underlying atomic. Specifically:

    * _inc_and_test(v) is (_inc_return(v) == 0)
    * _dec_and_test(v) is (_dec_return(v) == 0)
    * _sub_and_test(i, v) is (_sub_return(i, v) == 0)
    * _add_negative(i, v) is (_add_return(i, v) < 0)

    Rather than have these definitions duplicated in all architectures, with
    minor inconsistencies in formatting and documentation, let's make these
    operations optional, with default fallbacks as above. Implementations
    must now provide a preprocessor symbol.

    The instrumented atomics are updated accordingly.

    Both x86 and m68k have custom implementations, which are left as-is,
    given preprocessor symbols to avoid being overridden.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Acked-by: Geert Uytterhoeven
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Palmer Dabbelt
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/lkml/20180621121321.4761-16-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • As a step towards unifying the atomic/atomic64/atomic_long APIs, this
    patch converts the arch/alpha implementation of atomic64_add_unless() into
    an implementation of atomic64_fetch_add_unless().

    A wrapper in will build atomic_add_unless() atop of
    this, provided it is given a preprocessor definition.

    No functional change is intended as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Acked-by: Peter Zijlstra (Intel)
    Cc: Boqun Feng
    Cc: Ivan Kokshaysky
    Cc: Linus Torvalds
    Cc: Matt Turner
    Cc: Richard Henderson
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/lkml/20180621121321.4761-10-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • Several architectures these have a near-identical implementation based
    on atomic_read() and atomic_cmpxchg() which we can instead define in
    , so let's do so, using something close to the existing
    x86 implementation with try_cmpxchg().

    Where an architecture provides its own atomic_fetch_add_unless(), it
    must define a preprocessor symbol for it. The instrumented atomics are
    updated accordingly.

    Note that arch/arc's existing atomic_fetch_add_unless() had redundant
    barriers, as these are already present in its atomic_cmpxchg()
    implementation.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Geert Uytterhoeven
    Reviewed-by: Will Deacon
    Acked-by: Geert Uytterhoeven
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Palmer Dabbelt
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Vineet Gupta
    Link: https://lore.kernel.org/lkml/20180621121321.4761-7-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • We define a trivial fallback for atomic_inc_not_zero(), but don't do
    the same for atomic64_inc_not_zero(), leading most architectures to
    define the same boilerplate.

    Let's add a fallback in , and remove the redundant
    implementations. Note that atomic64_add_unless() is always defined in
    , and promotes its arguments to the requisite types, so
    we need not do this explicitly.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Palmer Dabbelt
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/lkml/20180621121321.4761-6-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • While __atomic_add_unless() was originally intended as a building-block
    for atomic_add_unless(), it's now used in a number of places around the
    kernel. It's the only common atomic operation named __atomic*(), rather
    than atomic_*(), and for consistency it would be better named
    atomic_fetch_add_unless().

    This lack of consistency is slightly confusing, and gets in the way of
    scripting atomics. Given that, let's clean things up and promote it to
    an official part of the atomics API, in the form of
    atomic_fetch_add_unless().

    This patch converts definitions and invocations over to the new name,
    including the instrumented version, using the following script:

    ----
    git grep -w __atomic_add_unless | while read line; do
    sed -i '{s/\/atomic_fetch_add_unless/}' "${line%%:*}";
    done
    git grep -w __arch_atomic_add_unless | while read line; do
    sed -i '{s/\/arch_atomic_fetch_add_unless/}' "${line%%:*}";
    done
    ----

    Note that we do not have atomic{64,_long}_fetch_add_unless(), which will
    be introduced by later patches.

    There should be no functional change as a result of this patch.

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Acked-by: Geert Uytterhoeven
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Palmer Dabbelt
    Cc: Boqun Feng
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: https://lore.kernel.org/lkml/20180621121321.4761-2-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     

13 Jun, 2018

1 commit

  • Alpha provides a custom implementation of dec_and_lock(). The functions
    is split into two parts:
    - atomic_add_unless() + return 0 (fast path in assembly)
    - remaining part including locking (slow path in C)

    Comparing the result of the alpha implementation with the generic
    implementation compiled by gcc it looks like the fast path is optimized
    by avoiding a stack frame (and reloading the GP), register store and all
    this. This is only done in the slowpath.
    After marking the slowpath (atomic_dec_and_lock_1()) as "noinline" and
    doing the slowpath in C (the atomic_add_unless(atomic, -1, 1) part) I
    noticed differences in the resulting assembly:
    - the GP is still reloaded
    - atomic_add_unless() adds more memory barriers compared to the custom
    assembly
    - the custom assembly here does "load, sub, beq" while
    atomic_add_unless() does "load, cmpeq, add, bne". This is okay because
    it compares against zero after subtraction while the generic code
    compares against 1 before.

    I'm not sure if avoiding the stack frame (and GP reloading) brings a lot
    in terms of performance. Regarding the different barriers, Peter
    Zijlstra says:

    |refcount decrement needs to be a RELEASE operation, such that all the
    |load/stores to the object happen before we decrement the refcount.
    |
    |Otherwise things like:
    |
    | obj->foo = 5;
    | refcnt_dec(&obj->ref);
    |
    |can be re-ordered, which then allows fun scenarios like:
    |
    | CPU0 CPU1
    |
    | refcnt_dec(&obj->ref);
    | if (dec_and_test(&obj->ref))
    | free(obj);
    | obj->foo = 5; // oops UaF
    |
    |
    |This means (for alpha) that there should be a memory barrier _before_
    |the decrement, however the dec_and_lock asm thing only has one _after_,
    |which, per the above, is too late.
    |
    |The generic version using add_unless will result in memory barrier
    |before and after (because that is the rule for atomic ops with a return
    |value) which is strictly too many barriers for the refcount story, but
    |who knows what other ordering requirements code has.

    Remove the custom alpha implementation of dec_and_lock() and if it is an
    issue (performance wise) then the fast path could still be inlined.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: linux-alpha@vger.kernel.org
    Link: https://lkml.kernel.org/r/20180606115918.GG12198@hirez.programming.kicks-ass.net
    Link: https://lkml.kernel.org/r20180612161621.22645-2-bigeasy@linutronix.de

    Sebastian Andrzej Siewior
     

07 Jun, 2018

1 commit

  • Pull Kbuild updates from Masahiro Yamada:

    - improve fixdep to coalesce consecutive slashes in dep-files

    - fix some issues of the maintainer string generation in deb-pkg script

    - remove unused CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX and clean-up
    several tools and linker scripts

    - clean-up modpost

    - allow to enable the dead code/data elimination for PowerPC in EXPERT
    mode

    - improve two coccinelle scripts for better performance

    - pass endianness and machine size flags to sparse for all architecture

    - misc fixes

    * tag 'kbuild-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
    kbuild: add machine size to CHECKFLAGS
    kbuild: add endianness flag to CHEKCFLAGS
    kbuild: $(CHECK) doesnt need NOSTDINC_FLAGS twice
    scripts: Fixed printf format mismatch
    scripts/tags.sh: use `find` for $ALLSOURCE_ARCHS generation
    coccinelle: deref_null: improve performance
    coccinelle: mini_lock: improve performance
    powerpc: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selected
    kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selectable if enabled
    kbuild: LD_DEAD_CODE_DATA_ELIMINATION no -ffunction-sections/-fdata-sections for module build
    kbuild: Fix asm-generic/vmlinux.lds.h for LD_DEAD_CODE_DATA_ELIMINATION
    modpost: constify *modname function argument where possible
    modpost: remove redundant is_vmlinux() test
    modpost: use strstarts() helper more widely
    modpost: pass struct elf_info pointer to get_modinfo()
    checkpatch: remove VMLINUX_SYMBOL() check
    vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()
    kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
    export.h: remove code for prefixing symbols with underscore
    depmod.sh: remove symbol prefix support
    ...

    Linus Torvalds
     

05 Jun, 2018

5 commits

  • Pull time/Y2038 updates from Thomas Gleixner:

    - Consolidate SySV IPC UAPI headers

    - Convert SySV IPC to the new COMPAT_32BIT_TIME mechanism

    - Cleanup the core interfaces and standardize on the ktime_get_* naming
    convention.

    - Convert the X86 platform ops to timespec64

    - Remove the ugly temporary timespec64 hack

    * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
    x86: Convert x86_platform_ops to timespec64
    timekeeping: Add more coarse clocktai/boottime interfaces
    timekeeping: Add ktime_get_coarse_with_offset
    timekeeping: Standardize on ktime_get_*() naming
    timekeeping: Clean up ktime_get_real_ts64
    timekeeping: Remove timespec64 hack
    y2038: ipc: Redirect ipc(SEMTIMEDOP, ...) to compat_ksys_semtimedop
    y2038: ipc: Enable COMPAT_32BIT_TIME
    y2038: ipc: Use __kernel_timespec
    y2038: ipc: Report long times to user space
    y2038: ipc: Use ktime_get_real_seconds consistently
    y2038: xtensa: Extend sysvipc data structures
    y2038: powerpc: Extend sysvipc data structures
    y2038: sparc: Extend sysvipc data structures
    y2038: parisc: Extend sysvipc data structures
    y2038: mips: Extend sysvipc data structures
    y2038: arm64: Extend sysvipc compat data structures
    y2038: s390: Remove unneeded ipc uapi header files
    y2038: ia64: Remove unneeded ipc uapi header files
    y2038: alpha: Remove unneeded ipc uapi header files
    ...

    Linus Torvalds
     
  • Pull timers and timekeeping updates from Thomas Gleixner:

    - Core infrastucture work for Y2038 to address the COMPAT interfaces:

    + Add a new Y2038 safe __kernel_timespec and use it in the core
    code

    + Introduce config switches which allow to control the various
    compat mechanisms

    + Use the new config switch in the posix timer code to control the
    32bit compat syscall implementation.

    - Prevent bogus selection of CPU local clocksources which causes an
    endless reselection loop

    - Remove the extra kthread in the clocksource code which has no value
    and just adds another level of indirection

    - The usual bunch of trivial updates, cleanups and fixlets all over the
    place

    - More SPDX conversions

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    clocksource/drivers/mxs_timer: Switch to SPDX identifier
    clocksource/drivers/timer-imx-tpm: Switch to SPDX identifier
    clocksource/drivers/timer-imx-gpt: Switch to SPDX identifier
    clocksource/drivers/timer-imx-gpt: Remove outdated file path
    clocksource/drivers/arc_timer: Add comments about locking while read GFRC
    clocksource/drivers/mips-gic-timer: Add pr_fmt and reword pr_* messages
    clocksource/drivers/sprd: Fix Kconfig dependency
    clocksource: Move inline keyword to the beginning of function declarations
    timer_list: Remove unused function pointer typedef
    timers: Adjust a kernel-doc comment
    tick: Prefer a lower rating device only if it's CPU local device
    clocksource: Remove kthread
    time: Change nanosleep to safe __kernel_* types
    time: Change types to new y2038 safe __kernel_* types
    time: Fix get_timespec64() for y2038 safe compat interfaces
    time: Add new y2038 safe __kernel_timespec
    posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIME
    time: Introduce CONFIG_COMPAT_32BIT_TIME
    time: Introduce CONFIG_64BIT_TIME in architectures
    compat: Enable compat_get/put_timespec64 always
    ...

    Linus Torvalds
     
  • …iederm/user-namespace

    Pull siginfo updates from Eric Biederman:
    "This set of changes close the known issues with setting si_code to an
    invalid value, and with not fully initializing struct siginfo. There
    remains work to do on nds32, arc, unicore32, powerpc, arm, arm64, ia64
    and x86 to get the code that generates siginfo into a simpler and more
    maintainable state. Most of that work involves refactoring the signal
    handling code and thus careful code review.

    Also not included is the work to shrink the in kernel version of
    struct siginfo. That depends on getting the number of places that
    directly manipulate struct siginfo under control, as it requires the
    introduction of struct kernel_siginfo for the in kernel things.

    Overall this set of changes looks like it is making good progress, and
    with a little luck I will be wrapping up the siginfo work next
    development cycle"

    * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
    signal/sh: Stop gcc warning about an impossible case in do_divide_error
    signal/mips: Report FPE_FLTUNK for undiagnosed floating point exceptions
    signal/um: More carefully relay signals in relay_signal.
    signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR}
    signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code
    signal/signalfd: Add support for SIGSYS
    signal/signalfd: Remove __put_user from signalfd_copyinfo
    signal/xtensa: Use force_sig_fault where appropriate
    signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
    signal/um: Use force_sig_fault where appropriate
    signal/sparc: Use force_sig_fault where appropriate
    signal/sparc: Use send_sig_fault where appropriate
    signal/sh: Use force_sig_fault where appropriate
    signal/s390: Use force_sig_fault where appropriate
    signal/riscv: Replace do_trap_siginfo with force_sig_fault
    signal/riscv: Use force_sig_fault where appropriate
    signal/parisc: Use force_sig_fault where appropriate
    signal/parisc: Use force_sig_mceerr where appropriate
    signal/openrisc: Use force_sig_fault where appropriate
    signal/nios2: Use force_sig_fault where appropriate
    ...

    Linus Torvalds
     
  • Pull documentation updates from Jonathan Corbet:
    "There's been a fair amount of work in the docs tree this time around,
    including:

    - Extensive RST conversions and organizational work in the
    memory-management docs thanks to Mike Rapoport.

    - An update of Documentation/features from Andrea Parri and a script
    to keep it updated.

    - Various LICENSES updates from Thomas, along with a script to check
    SPDX tags.

    - Work to fix dangling references to documentation files; this
    involved a fair number of one-liner comment changes outside of
    Documentation/

    ... and the usual list of documentation improvements, typo fixes, etc"

    * tag 'docs-4.18' of git://git.lwn.net/linux: (103 commits)
    Documentation: document hung_task_panic kernel parameter
    docs/admin-guide/mm: add high level concepts overview
    docs/vm: move ksm and transhuge from "user" to "internals" section.
    docs: Use the kerneldoc comments for memalloc_no*()
    doc: document scope NOFS, NOIO APIs
    docs: update kernel versions and dates in tables
    docs/vm: transhuge: split userspace bits to admin-guide/mm/transhuge
    docs/vm: transhuge: minor updates
    docs/vm: transhuge: change sections order
    Documentation: arm: clean up Marvell Berlin family info
    Documentation: gpio: driver: Fix a typo and some odd grammar
    docs: ranoops.rst: fix location of ramoops.txt
    scripts/documentation-file-ref-check: rewrite it in perl with auto-fix mode
    docs: uio-howto.rst: use a code block to solve a warning
    mm, THP, doc: Add document for thp_swpout/thp_swpout_fallback
    w1: w1_io.c: fix a kernel-doc warning
    Documentation/process/posting: wrap text at 80 cols
    docs: admin-guide: add cgroup-v2 documentation
    Revert "Documentation/features/vm: Remove arch support status file for 'pte_special'"
    Documentation: refcount-vs-atomic: Update reference to LKMM doc.
    ...

    Linus Torvalds
     
  • Pull dma-mapping updates from Christoph Hellwig:

    - replace the force_dma flag with a dma_configure bus method. (Nipun
    Gupta, although one patch is іncorrectly attributed to me due to a
    git rebase bug)

    - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai)

    - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the
    right thing for bounce buffering.

    - move dma-debug initialization to common code, and apply a few
    cleanups to the dma-debug code.

    - cleanup the Kconfig mess around swiotlb selection

    - swiotlb comment fixup (Yisheng Xie)

    - a trivial swiotlb fix. (Dan Carpenter)

    - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt)

    - add a new generic dma-noncoherent dma_map_ops implementation and use
    it for arc, c6x and nds32.

    - improve scatterlist validity checking in dma-debug. (Robin Murphy)

    - add a struct device quirk to limit the dma-mask to 32-bit due to
    bridge/system issues, and switch x86 to use it instead of a local
    hack for VIA bridges.

    - handle devices without a dma_mask more gracefully in the dma-direct
    code.

    * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits)
    dma-direct: don't crash on device without dma_mask
    nds32: use generic dma_noncoherent_ops
    nds32: implement the unmap_sg DMA operation
    nds32: consolidate DMA cache maintainance routines
    x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag
    x86/pci-dma: remove the explicit nodac and allowdac option
    x86/pci-dma: remove the experimental forcesac boot option
    Documentation/x86: remove a stray reference to pci-nommu.c
    core, dma-direct: add a flag 32-bit dma limits
    dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs
    dma-debug: check scatterlist segments
    c6x: use generic dma_noncoherent_ops
    arc: use generic dma_noncoherent_ops
    arc: fix arc_dma_{map,unmap}_page
    arc: fix arc_dma_sync_sg_for_{cpu,device}
    arc: simplify arc_dma_sync_single_for_{cpu,device}
    dma-mapping: provide a generic dma-noncoherent implementation
    dma-mapping: simplify Kconfig dependencies
    riscv: add swiotlb support
    riscv: only enable ZONE_DMA32 for 64-bit
    ...

    Linus Torvalds
     

01 Jun, 2018

1 commit

  • By default, sparse assumes a 64bit machine when compiled on x86-64
    and 32bit when compiled on anything else.

    This can of course create all sort of problems for the other archs, like
    issuing false warnings ('shift too big (32) for type unsigned long'), or
    worse, failing to emit legitimate warnings.

    Fix this by adding the -m32/-m64 flag, depending on CONFIG_64BIT,
    to CHECKFLAGS in the main Makefile (and so for all archs).
    Also, remove the now unneeded -m32/-m64 in arch specific Makefiles.

    Signed-off-by: Luc Van Oostenryck
    Signed-off-by: Masahiro Yamada

    Luc Van Oostenryck
     

23 May, 2018

3 commits

  • memory-barriers.txt has been updated with the following requirement.

    "When using writel(), a prior wmb() is not needed to guarantee that the
    cache coherent memory writes have completed before writing to the MMIO
    region."

    Current writeX() and iowriteX() implementations on alpha are not
    satisfying this requirement as the barrier is after the register write.

    Move mb() in writeX() and iowriteX() functions to guarantee that HW
    observes memory changes before performing register operations.

    Signed-off-by: Sinan Kaya
    Reported-by: Arnd Bergmann
    Signed-off-by: Matt Turner

    Sinan Kaya
     
  • Remove the dma_ops indirection.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Matt Turner

    Christoph Hellwig
     
  • The generic dma_direct implementation does the same thing as the alpha
    pci-noop implementation, just with more bells and whistles. And unlike
    the current code it at least has a theoretical chance to actually compile.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Matt Turner

    Christoph Hellwig
     

09 May, 2018

3 commits