09 May, 2017

16 commits

  • Pull more tracing updates from Steven Rostedt:
    "These are three simple changes.

    The first one is just a switch from using strcpy() to strlcpy().
    Someone thought that it may cause an overflow bug, but since it only
    copies comms into a pre-allocated array of TASK_COMM_LEN, and no comm
    should ever be bigger than that or lack a terminating nul character,
    this change is more of a safety precaution than fixing anything that
    is actually broken.

    The other two changes are simply cleaning and optimizing some code"

    * tag 'trace-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    ftrace: Simplify ftrace_match_record() even more
    ftrace: Remove an unneeded condition
    tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()

    Linus Torvalds
     
  • Pull PCI updates from Bjorn Helgaas:

    - add framework for supporting PCIe devices in Endpoint mode (Kishon
    Vijay Abraham I)

    - use non-postable PCI config space mappings when possible (Lorenzo
    Pieralisi)

    - clean up and unify mmap of PCI BARs (David Woodhouse)

    - export and unify Function Level Reset support (Christoph Hellwig)

    - avoid FLR for Intel 82579 NICs (Sasha Neftin)

    - add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)

    - short-circuit config access failures for disconnected devices (Keith
    Busch)

    - remove D3 sleep delay when possible (Adrian Hunter)

    - freeze PME scan before suspending devices (Lukas Wunner)

    - stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)

    - disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)

    - add arch-specific alignment control to improve device passthrough by
    avoiding multiple BARs in a page (Yongji Xie)

    - add sysfs sriov_drivers_autoprobe to control VF driver binding
    (Bodong Wang)

    - allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)

    - fix crashes when unbinding host controllers that don't support
    removal (Brian Norris)

    - add driver for MicroSemi Switchtec management interface (Logan
    Gunthorpe)

    - add driver for Faraday Technology FTPCI100 host bridge (Linus
    Walleij)

    - add i.MX7D support (Andrey Smirnov)

    - use generic MSI support for Aardvark (Thomas Petazzoni)

    - make Rockchip driver modular (Brian Norris)

    - advertise 128-byte Read Completion Boundary support for Rockchip
    (Shawn Lin)

    - advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)

    - convert atomic_t to refcount_t in HV driver (Elena Reshetova)

    - add CPU IRQ affinity in HV driver (K. Y. Srinivasan)

    - fix PCI bus removal in HV driver (Long Li)

    - add support for ThunderX2 DMA alias topology (Jayachandran C)

    - add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)

    - add ITE 8893 bridge DMA alias quirk (Jarod Wilson)

    - restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
    (Manish Jaggi)

    * tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
    PCI: Don't allow unbinding host controllers that aren't prepared
    ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
    MAINTAINERS: Add PCI Endpoint maintainer
    Documentation: PCI: Add userguide for PCI endpoint test function
    tools: PCI: Add sample test script to invoke pcitest
    tools: PCI: Add a userspace tool to test PCI endpoint
    Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
    misc: Add host side PCI driver for PCI test function device
    PCI: Add device IDs for DRA74x and DRA72x
    dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
    PCI: dwc: dra7xx: Workaround for errata id i870
    dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
    PCI: dwc: dra7xx: Add EP mode support
    PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
    dt-bindings: PCI: Add DT bindings for PCI designware EP mode
    PCI: dwc: designware: Add EP mode support
    Documentation: PCI: Add binding documentation for pci-test endpoint function
    ixgbe: Use pcie_flr() instead of duplicating it
    IB/hfi1: Use pcie_flr() instead of duplicating it
    PCI: imx6: Fix spelling mistake: "contol" -> "control"
    ...

    Linus Torvalds
     
  • Pull tty/serial updates from Greg KH:
    "Here is the "big" TTY/Serial patch updates for 4.12-rc1

    Not a lot of new things here, the normal number of serial driver
    updates and additions, tiny bugs fixed, and some core files split up
    to make future changes a bit easier for Nicolas's "tiny-tty" work.

    All of these have been in linux-next for a while"

    * tag 'tty-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (62 commits)
    serial: small Makefile reordering
    tty: split job control support into a file of its own
    tty: move baudrate handling code to a file of its own
    console: move console_init() out of tty_io.c
    serial: 8250_early: Add earlycon support for Palmchip UART
    tty: pl011: use "qdf2400_e44" as the earlycon name for QDF2400 E44
    vt: make mouse selection of non-ASCII consistent
    vt: set mouse selection word-chars to gpm's default
    imx-serial: Reduce RX DMA startup latency when opening for reading
    serial: omap: suspend device on probe errors
    serial: omap: fix runtime-pm handling on unbind
    tty: serial: omap: add UPF_BOOT_AUTOCONF flag for DT init
    serial: samsung: Remove useless spinlock
    serial: samsung: Add missing checks for dma_map_single failure
    serial: samsung: Use right device for DMA-mapping calls
    serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off
    tty: fix comment typo s/repsonsible/responsible/
    tty: amba-pl011: Fix spurious TX interrupts
    serial: xuartps: Enable clocks in the pm disable case also
    serial: core: Re-use struct uart_port {name} field
    ...

    Linus Torvalds
     
  • struct timespec is not y2038 safe on 32 bit machines and needs to be
    replaced by struct timespec64 in order to represent times beyond year
    2038 on such machines.

    Fix all the timestamp representation in struct trace_hwlat and all the
    corresponding implementations.

    Link: http://lkml.kernel.org/r/1491613030-11599-3-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Deepa Dinamani
    Acked-by: Steven Rostedt (VMware)
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Deepa Dinamani
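
    The range problem described above can be sketched in plain C. This is
    an illustration only, not kernel code: struct ts32 and struct ts64 are
    hypothetical stand-ins for the 32-bit struct timespec layout and for
    struct timespec64, and ts64_to_ts32() is an invented helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct timespec on a 32-bit machine. */
struct ts32 {
    int32_t tv_sec;   /* largest value is 2147483647 s: 2038-01-19 03:14:07 UTC */
    int32_t tv_nsec;
};

/* Hypothetical stand-in for struct timespec64. */
struct ts64 {
    int64_t tv_sec;   /* 64-bit seconds on every architecture */
    int32_t tv_nsec;
};

/* Narrow a 64-bit timestamp; returns false when the seconds do not fit. */
static bool ts64_to_ts32(const struct ts64 *in, struct ts32 *out)
{
    assert(in != NULL && out != NULL);
    if (in->tv_sec < INT32_MIN || in->tv_sec > INT32_MAX)
        return false;
    out->tv_sec = (int32_t)in->tv_sec;
    out->tv_nsec = in->tv_nsec;
    return true;
}
```

    A timestamp one second past the 32-bit limit is rejected by the helper,
    which is the kind of value only the 64-bit layout can carry.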
     
  • set_memory_* functions have moved to set_memory.h. Switch to this
    explicitly.

    Link: http://lkml.kernel.org/r/1488920133-27229-13-git-send-email-labbott@redhat.com
    Signed-off-by: Laura Abbott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     
  • set_memory_* functions have moved to set_memory.h. Switch to this
    explicitly.

    Link: http://lkml.kernel.org/r/1488920133-27229-12-git-send-email-labbott@redhat.com
    Signed-off-by: Laura Abbott
    Acked-by: Jessica Yu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     
  • __vmalloc* allows users to provide gfp flags for the underlying
    allocation. This API is quite popular

    $ git grep "=[[:space:]]__vmalloc\|return[[:space:]]*__vmalloc" | wc -l
    77

    The only problem is that many people are not aware that they really want
    to give __GFP_HIGHMEM along with other flags because there is really no
    reason to consume precious lowmemory on CONFIG_HIGHMEM systems for pages
    which are mapped to the kernel vmalloc space. About half of users don't
    use this flag, though. This signals that we make the API unnecessarily
    complex.

    This patch simply uses __GFP_HIGHMEM implicitly when allocating pages to
    be mapped to the vmalloc space. Current users which add __GFP_HIGHMEM
    are simplified and drop the flag.

    Link: http://lkml.kernel.org/r/20170307141020.29107-1-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Reviewed-by: Matthew Wilcox
    Cc: Al Viro
    Cc: Vlastimil Babka
    Cc: David Rientjes
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • in_interrupt() semantics are confusing and wrong for most users as it
    also returns true when bh is disabled. Thus we open coded a proper
    check for interrupts in __sanitizer_cov_trace_pc() with a lengthy
    explanatory comment.

    Use the new in_task() predicate instead.

    Link: http://lkml.kernel.org/r/20170321091026.139655-1-dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Kefeng Wang
    Cc: James Morse
    Cc: Alexander Popov
    Cc: Andrey Konovalov
    Cc: Hillf Danton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
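
    The difference between the two predicates can be modeled in userspace.
    The bit layout below is written from memory of the kernel's
    preempt_count() fields and should be read as an assumption; the model_*
    functions and bh_disabled_count() are hypothetical names, not kernel
    API.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: softirq count in bits 8-15, hardirq in 16-19, NMI in bit 20. */
#define SOFTIRQ_SHIFT 8
#define HARDIRQ_SHIFT 16
#define NMI_SHIFT     20

#define SOFTIRQ_MASK  (0xffu << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK  (0x0fu << HARDIRQ_SHIFT)
#define NMI_MASK      (0x01u << NMI_SHIFT)

#define SOFTIRQ_OFFSET         (1u << SOFTIRQ_SHIFT)  /* added while serving a softirq */
#define SOFTIRQ_DISABLE_OFFSET (2u * SOFTIRQ_OFFSET)  /* added each time bh is disabled */

/* Counter value of a task that has disabled bh `depth` times. */
static unsigned int bh_disabled_count(unsigned int depth)
{
    assert(depth >= 1 && depth <= 127);  /* must stay inside the softirq field */
    return depth * SOFTIRQ_DISABLE_OFFSET;
}

/* Models in_interrupt(): any bit in the softirq, hardirq or NMI fields. */
static bool model_in_interrupt(unsigned int count)
{
    return (count & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0;
}

/* Models in_task(): ignores "bh disabled", only looks at "serving softirq". */
static bool model_in_task(unsigned int count)
{
    return (count & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)) == 0;
}
```

    With bh disabled once in task context, model_in_interrupt() is true
    even though no interrupt is being served, while model_in_task() still
    reports task context.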
     
  • The elapsed time, user CPU time and system CPU time for the thread group
    status request are presently left at zero. Fill these in.

    [akpm@linux-foundation.org: run ktime_get_ns() a single time]
    [akpm@linux-foundation.org: include linux/sched/cputime.h for task_cputime()]
    Link: http://lkml.kernel.org/r/1488508424-12322-1-git-send-email-xiao.zhang@windriver.com
    Signed-off-by: Zhang Xiao
    Cc: Balbir Singh
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhang Xiao
     
  • pid_ns_for_children set by a task is known only to the task itself, and
    it's impossible to identify it from outside.

    It's a big problem for checkpoint/restore software like CRIU, because it
    can't correctly handle tasks that do setns(CLONE_NEWPID) in the process of
    their work.

    This patch solves the problem by exposing pid_ns_for_children in the ns
    directory in the standard way, under the name "pid_for_children":

    ~# ls /proc/5531/ns -l | grep pid
    lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid -> pid:[4026531836]
    lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid_for_children -> pid:[4026532286]

    Link: http://lkml.kernel.org/r/149201123914.6007.2187327078064239572.stgit@localhost.localdomain
    Signed-off-by: Kirill Tkhai
    Cc: Andrei Vagin
    Cc: Andreas Gruenbacher
    Cc: Kees Cook
    Cc: Michael Kerrisk
    Cc: Al Viro
    Cc: Oleg Nesterov
    Cc: Paul Moore
    Cc: Eric Biederman
    Cc: Andy Lutomirski
    Cc: Ingo Molnar
    Cc: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Tkhai
     
  • alloc_pidmap() advances pid_namespace::last_pid. When the first pid
    allocation fails, the next created process will have pid 2 and
    pid_ns_prepare_proc() won't be called. So pid_namespace::proc_mnt will
    never be initialized (not to mention that there won't be a child
    reaper).

    I saw a crash stack for such a case on kernel 3.10:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: proc_flush_task+0x8f/0x1b0
    Call Trace:
    release_task+0x3f/0x490
    wait_consider_task.part.10+0x7ff/0xb00
    do_wait+0x11f/0x280
    SyS_wait4+0x7d/0x110

    We may fix this by restoring last_pid to 0 or by prohibiting further
    allocations. Since there was a similar issue in Oleg Nesterov's commit
    314a8ad0f18a ("pidns: fix free_pid() to handle the first fork failure"),
    and it was fixed by prohibiting allocation, let's do the same.

    Link: http://lkml.kernel.org/r/149201021004.4863.6762095011554287922.stgit@localhost.localdomain
    Signed-off-by: Kirill Tkhai
    Acked-by: Cyrill Gorcunov
    Cc: Andrei Vagin
    Cc: Andreas Gruenbacher
    Cc: Kees Cook
    Cc: Michael Kerrisk
    Cc: Al Viro
    Cc: Oleg Nesterov
    Cc: Paul Moore
    Cc: Eric Biederman
    Cc: Andy Lutomirski
    Cc: Ingo Molnar
    Cc: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Tkhai
     
  • Get rid of multiple definitions of append_elf_note() & final_note()
    functions. Reuse these functions compiled under CONFIG_CRASH_CORE. Also,
    define Elf_Word and use it instead of generic u32 or the more specific
    Elf64_Word.

    Link: http://lkml.kernel.org/r/149035342324.6881.11667840929850361402.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Acked-by: Tony Luck
    Cc: Fenghua Yu
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini
     
  • Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
    reuse crashkernel parameter for fadump", v4.

    Traditionally, kdump is used to save vmcore in case of a crash. Some
    architectures like powerpc can save vmcore using architecture specific
    support instead of kexec/kdump mechanism. Such architecture specific
    support also needs to reserve memory, to be used by dump capture kernel.
    crashkernel parameter can be reused for memory reservation by such
    architecture specific infrastructure.

    This patchset removes the dependency on CONFIG_KEXEC for the crashkernel
    parameter and vmcoreinfo related code, as both can be reused without kexec
    support. Also, the crashkernel parameter is reused instead of
    fadump_reserve_mem to reserve memory for fadump.

    The first patch moves crashkernel parameter parsing and vmcoreinfo
    related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE. The
    second patch reuses the definitions of append_elf_note() & final_note()
    functions under CONFIG_CRASH_CORE in IA64 arch code. The third patch
    removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
    in powerpc. The next patch reuses crashkernel parameter for reserving
    memory for fadump, instead of the fadump_reserve_mem parameter. This
    has the advantage of using all syntaxes crashkernel parameter supports,
    for fadump as well. The last patch updates fadump kernel documentation
    about use of crashkernel parameter.

    This patch (of 5):

    Traditionally, kdump is used to save vmcore in case of a crash. Some
    architectures like powerpc can save vmcore using architecture specific
    support instead of kexec/kdump mechanism. Such architecture specific
    support also needs to reserve memory, to be used by dump capture kernel.
    crashkernel parameter can be reused for memory reservation by such
    architecture specific infrastructure.

    But currently, code related to vmcoreinfo and parsing of crashkernel
    parameter is built under CONFIG_KEXEC_CORE. This patch introduces
    CONFIG_CRASH_CORE and moves the above mentioned code under this config,
    allowing code reuse without dependency on CONFIG_KEXEC. There is no
    functional change with this patch.

    Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Cc: Fenghua Yu
    Cc: Tony Luck
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini
     
  • With virtually mapped stacks, kernel stacks are allocated via vmalloc.

    In the current implementation, two stacks per cpu can be cached when
    tasks are freed, and the cached stacks are used again in task
    duplications. But the cached stacks may remain unfreed even when the cpu
    is offline. Adding a cpu hotplug callback that frees the cached stacks
    when a cpu goes offline keeps the pages of the cached stacks from being
    wasted.

    Link: http://lkml.kernel.org/r/1487076043-17802-1-git-send-email-hoeun.ryu@gmail.com
    Signed-off-by: Hoeun Ryu
    Reviewed-by: Thomas Gleixner
    Acked-by: Michal Hocko
    Cc: Ingo Molnar
    Cc: Andy Lutomirski
    Cc: Kees Cook
    Cc: "Eric W. Biederman"
    Cc: Oleg Nesterov
    Cc: Mateusz Guzik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hoeun Ryu
     
  • When I was running my testcase which may block hundreds of threads on fs
    locks, I got a lockup due to output from debug_show_all_locks() added by
    commit b2d4c2edb2e4 ("locking/hung_task: Show all locks").

    For example, if 1000 threads were blocked in TASK_UNINTERRUPTIBLE state
    and 500 out of 1000 threads hold some lock, debug_show_all_locks() from
    the for_each_process_thread() loop will report locks held by 500 threads
    1000 times. This is too much noise.

    In order to make sure rcu_lock_break() is called frequently, we should
    avoid calling debug_show_all_locks() from the for_each_process_thread()
    loop, because debug_show_all_locks() effectively runs its own
    for_each_process_thread() loop. Let's defer calling debug_show_all_locks()
    until just before panic() or after leaving the for_each_process_thread()
    loop.

    Link: http://lkml.kernel.org/r/1489296834-60436-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Signed-off-by: Tetsuo Handa
    Reviewed-by: Vegard Nossum
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
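
    The deferral pattern and the numbers from the example above can be
    sketched as a small counting model. Nothing here is kernel code:
    show_all_locks() only counts the reports a real dump would print, and
    both scan functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long reports;  /* lock reports a dump would have printed */

/* Stand-in for debug_show_all_locks(): one report per lock-holding thread. */
static void show_all_locks(unsigned long holders)
{
    reports += holders;
}

/* Old behaviour: dump all locks once for every hung thread found. */
static unsigned long scan_dump_per_task(unsigned long hung, unsigned long holders)
{
    assert(holders <= hung);  /* model assumption: holders are among the hung threads */
    reports = 0;
    for (unsigned long i = 0; i < hung; i++)
        show_all_locks(holders);
    return reports;
}

/* New behaviour: remember that a dump is wanted, do it once after the loop. */
static unsigned long scan_dump_deferred(unsigned long hung, unsigned long holders)
{
    bool show_locks = false;

    assert(holders <= hung);
    reports = 0;
    for (unsigned long i = 0; i < hung; i++)
        show_locks = true;        /* only record the request inside the loop */
    if (show_locks)
        show_all_locks(holders);  /* single dump after leaving the loop */
    return reports;
}
```

    For 1000 hung threads of which 500 hold locks, the per-task variant
    counts 500000 reports and the deferred variant counts 500.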
     
  • do_proc_dointvec_jiffies_conv() uses LONG_MAX/HZ as the max value to
    avoid overflow. But *valp is of type int, so the conversion can still
    overflow.

    For example,

    echo 2147483647 > ./sys/net/ipv4/tcp_keepalive_time

    Then,

    cat ./sys/net/ipv4/tcp_keepalive_time

    The output is "-1", which is not expected.

    Now use INT_MAX/HZ as the max value instead of LONG_MAX/HZ to fix it.

    Link: http://lkml.kernel.org/r/1490109532-9228-1-git-send-email-fgao@ikuai8.com
    Signed-off-by: Gao Feng
    Cc: Arnaldo Carvalho de Melo
    Cc: Ingo Molnar
    Cc: Alexey Dobriyan
    Cc: Eric Dumazet
    Cc: Josh Poimboeuf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gao Feng
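
    The arithmetic behind the "-1" can be worked through in userspace. The
    sketch below assumes HZ is 1000 and a 64-bit long (modeled with
    int64_t); all function names are hypothetical and only mirror the idea
    of the conversion, not the kernel's actual code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep the low 32 bits of v and reinterpret them as a two's complement int. */
static int32_t wrap_to_int32(uint64_t v)
{
    uint32_t low = (uint32_t)v;

    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)(low - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* Write path model: seconds * hz is squeezed into an int. */
static int32_t model_store(int64_t seconds, int hz)
{
    assert(seconds >= 0 && hz > 0);
    return wrap_to_int32((uint64_t)seconds * (uint64_t)hz);
}

/* Read path model: the stored jiffies are divided by hz again. */
static int32_t model_read(int32_t stored, int hz)
{
    return stored / hz;
}

/* Old limit, with long modeled as 64 bits. */
static bool accepted_old(int64_t seconds, int hz)
{
    return seconds <= INT64_MAX / hz;
}

/* New limit: the product is guaranteed to fit in an int. */
static bool accepted_new(int64_t seconds, int hz)
{
    return seconds <= INT_MAX / hz;
}
```

    With hz = 1000, writing 2147483647 passes the old check, the product
    2147483647000 wraps to -1000 in 32 bits, and reading it back gives -1.
    The new limit rejects that value and accepts at most 2147483 seconds.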
     

06 May, 2017

2 commits

  • Pull powerpc updates from Michael Ellerman:
    "Highlights include:

    - Larger virtual address space on 64-bit server CPUs. By default we
    use a 128TB virtual address space, but a process can request access
    to the full 512TB by passing a hint to mmap().

    - Support for the new Power9 "XIVE" interrupt controller.

    - TLB flushing optimisations for the radix MMU on Power9.

    - Support for CAPI cards on Power9, using the "Coherent Accelerator
    Interface Architecture 2.0".

    - The ability to configure the mmap randomisation limits at build and
    runtime.

    - Several small fixes and cleanups to the kprobes code, as well as
    support for KPROBES_ON_FTRACE.

    - Major improvements to handling of system reset interrupts,
    correctly treating them as NMIs, giving them a dedicated stack and
    using a new hypervisor call to trigger them, all of which should
    aid debugging and robustness.

    - Many fixes and other minor enhancements.

    Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple,
    Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton
    Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt,
    Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy,
    Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy,
    Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin,
    Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J
    Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew
    R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver
    O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell
    Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C.
    Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar,
    Yang Shi"

    * tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
    powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it
    powerpc/powernv: Fix TCE kill on NVLink2
    powerpc/mm/radix: Drop support for CPUs without lockless tlbie
    powerpc/book3s/mce: Move add_taint() later in virtual mode
    powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body
    powerpc/smp: Document irq enable/disable after migrating IRQs
    powerpc/mpc52xx: Don't select user-visible RTAS_PROC
    powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset()
    powerpc/eeh: Clean up and document event handling functions
    powerpc/eeh: Avoid use after free in eeh_handle_special_event()
    cxl: Mask slice error interrupts after first occurrence
    cxl: Route eeh events to all drivers in cxl_pci_error_detected()
    cxl: Force context lock during EEH flow
    powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST
    powerpc/xmon: Teach xmon oops about radix vectors
    powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids
    powerpc/pseries: Enable VFIO
    powerpc/powernv: Fix iommu table size calculation hook for small tables
    powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc
    powerpc: Add arch/powerpc/tools directory
    ...

    Linus Torvalds
     
  • Pull namespace updates from Eric Biederman:
    "This is a set of small fixes that were mostly stumbled over during
    more significant development. This proc fix and the fix to
    posix-timers are the most significant of the lot.

    There is a lot of good development going on but unfortunately it
    didn't quite make the merge window"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    proc: Fix unbalanced hard link numbers
    signal: Make kill_proc_info static
    rlimit: Properly call security_task_setrlimit
    signal: Remove unused definition of sig_user_definied
    ia64: Remove unused IA64_TASK_SIGHAND_OFFSET and IA64_SIGHAND_SIGLOCK_OFFSET
    ipc: Remove unused declaration of recompute_msgmni
    posix-timers: Correct sanity check in posix_cpu_nsleep
    sysctl: Remove dead register_sysctl_root

    Linus Torvalds
     

04 May, 2017

13 commits

  • Dan Carpenter sent a patch to remove a check in ftrace_match_record()
    because the logic of the code made the check redundant. I looked deeper into
    the code, and made the following logic table, with the three variables and
    the result of the original code.

    modname   mod_matches   exclude_mod   result
    -------   -----------   -----------   ------
       0           0             0        return 0
       0           0             1        func_match
       0           1             *        < cannot exist >
       1           0             0        return 0
       1           0             1        func_match
       1           1             0        func_match
       1           1             1        return 0

    Notice that when mod_matches == exclude_mod, the result is always to
    return 0, and when mod_matches != exclude_mod, then the result is to test
    the function. This means we only need to test whether mod_matches is
    equal to exclude_mod.

    Cc: Dan Carpenter
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
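
    The equivalence claimed by the logic table can be checked exhaustively.
    The sketch below transcribes the table into an array and compares it
    with the single comparison; the enum and function names are
    hypothetical and do not come from ftrace_match_record() itself.

```c
#include <assert.h>
#include <stdbool.h>

enum outcome { RETURN_0, FUNC_MATCH };

/*
 * The logic table from the commit message, indexed as
 * [modname][mod_matches][exclude_mod]. The rows with modname == 0 and
 * mod_matches == 1 cannot exist; their entries are placeholders that
 * are never read.
 */
static const enum outcome table[2][2][2] = {
    { { RETURN_0, FUNC_MATCH }, { RETURN_0,   RETURN_0 } },  /* modname == 0 */
    { { RETURN_0, FUNC_MATCH }, { FUNC_MATCH, RETURN_0 } },  /* modname == 1 */
};

static enum outcome original_logic(bool modname, bool mod_matches, bool exclude_mod)
{
    assert(modname || !mod_matches);  /* the "cannot exist" rows */
    return table[modname][mod_matches][exclude_mod];
}

/* The simplification: only compare mod_matches with exclude_mod. */
static enum outcome simplified_logic(bool mod_matches, bool exclude_mod)
{
    return (mod_matches == exclude_mod) ? RETURN_0 : FUNC_MATCH;
}

/* Walk every row that can exist and compare both versions. */
static bool logic_agrees(void)
{
    for (int modname = 0; modname <= 1; modname++)
        for (int mod_matches = 0; mod_matches <= modname; mod_matches++)
            for (int exclude_mod = 0; exclude_mod <= 1; exclude_mod++)
                if (original_logic(modname, mod_matches, exclude_mod) !=
                    simplified_logic(mod_matches, exclude_mod))
                    return false;
    return true;
}
```

    Calling logic_agrees() covers all six reachable rows, so modname never
    needs to be consulted once mod_matches has been computed.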
     
  • We know that "mod_matches" is true here so there is no need to check
    again.

    Link: http://lkml.kernel.org/r/20170331152130.GA4947@mwanda

    Signed-off-by: Dan Carpenter
    Signed-off-by: Steven Rostedt (VMware)

    Dan Carpenter
     
  • strcpy() is inherently not safe, and strlcpy() should be used instead.
    __trace_find_cmdline() uses strcpy() because the comms saved must have a
    terminating nul character, but it doesn't hurt to add the extra protection
    of using strlcpy() instead of strcpy().

    Link: http://lkml.kernel.org/r/1493806274-13936-1-git-send-email-amit.pundir@linaro.org

    Signed-off-by: Amey Telawane
    [AmitP: Cherry-picked this commit from CodeAurora kernel/msm-3.10
    https://source.codeaurora.org/quic/la/kernel/msm-3.10/commit/?id=2161ae9a70b12cf18ac8e5952a20161ffbccb477]
    Signed-off-by: Amit Pundir
    [ Updated change log and removed the "- 1" from len parameter ]
    Signed-off-by: Steven Rostedt (VMware)

    Amey Telawane
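
    What the bounded copy buys can be shown outside the kernel. Since
    strlcpy() is not available in every C library, the sketch below uses
    bounded_copy(), a hypothetical helper with the same contract: copy at
    most size - 1 bytes, always terminate, and return the source length.
    TASK_COMM_LEN is given the kernel's value of 16.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TASK_COMM_LEN 16

/*
 * Hypothetical userspace stand-in for strlcpy(). The return value is
 * strlen(src), so a result >= size tells the caller that the copy was
 * truncated.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';  /* always NUL-terminated, unlike an overrunning strcpy() */
    }
    return len;
}
```

    Passing the full buffer size, as the "- 1" removal noted in the change
    log suggests, is enough because the helper reserves the last byte for
    the terminator itself.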
     
  • Pull modules updates from Jessica Yu:

    - Minor code cleanups

    - Fix section alignment for .init_array

    * tag 'modules-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    kallsyms: Use bounded strnchr() when parsing string
    module: Unify the return value type of try_module_get
    module: set .init_array alignment to 8

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "New features for this release:

    - Pretty much a full rewrite of the processing of function plugins.
    i.e. echo do_IRQ:stacktrace > set_ftrace_filter

    - The rewrite was needed to allow plugins to be unique to tracing
    instances. i.e. mkdir instances/foo; cd instances/foo;
    echo do_IRQ:stacktrace > set_ftrace_filter
    The old way was written in a very hacky manner. This removes a lot of
    those hacks.

    - New "function-fork" tracing option. When set, the children of
    processes whose pids are listed in the set_ftrace_pid file are added
    to that file when those processes fork.

    - Exposure of "maxactive" for kretprobe in kprobe_events

    - Allow for builtin init functions to be traced by the function
    tracer (via the kernel command line). Module init function tracing
    will come in the next release.

    - Added more selftests, and have selftests also test in an instance"

    * tag 'trace-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (60 commits)
    ring-buffer: Return reader page back into existing ring buffer
    selftests: ftrace: Allow some event trigger tests to run in an instance
    selftests: ftrace: Have some basic tests run in a tracing instance too
    selftests: ftrace: Have event tests also run in an tracing instance
    selftests: ftrace: Make func_event_triggers and func_traceonoff_triggers tests do instances
    selftests: ftrace: Allow some tests to be run in a tracing instance
    tracing/ftrace: Allow for instances to trigger their own stacktrace probes
    tracing/ftrace: Allow for the traceonoff probe be unique to instances
    tracing/ftrace: Enable snapshot function trigger to work with instances
    tracing/ftrace: Allow instances to have their own function probes
    tracing/ftrace: Add a better way to pass data via the probe functions
    ftrace: Dynamically create the probe ftrace_ops for the trace_array
    tracing: Pass the trace_array into ftrace_probe_ops functions
    tracing: Have the trace_array hold the list of registered func probes
    ftrace: If the hash for a probe fails to update then free what was initialized
    ftrace: Have the function probes call their own function
    ftrace: Have each function probe use its own ftrace_ops
    ftrace: Have unregister_ftrace_function_probe_func() return a value
    ftrace: Add helper function ftrace_hash_move_and_update_ops()
    ftrace: Remove data field from ftrace_func_probe structure
    ...

    Linus Torvalds
     
  • Pull printk updates from Petr Mladek:

    - There is a situation when early console is not deregistered because
    the preferred one matches a wrong entry. It caused messages to appear
    twice.

    This is the 2nd attempt to fix it. The first one was wrong, see the
    commit c6c7d83b9c9e ('Revert "console: don't prefer first registered
    if DT specifies stdout-path"').

    The fix is coupled with some small code clean up. Well, the console
    registration code would deserve a big one. We need to think about it.

    - Do not lose information about the preemptive context when the console
    semaphore is re-taken.

    - Do not block CPU hotplug when someone else is already pushing
    messages to the console.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    printk: fix double printing with earlycon
    printk: rename selected_console -> preferred_console
    printk: fix name/type/scope of preferred_console var
    printk: Correctly handle preemption in console_unlock()
    printk: use console_trylock() in console_cpu_notify()

    Linus Torvalds
     
  • Merge misc updates from Andrew Morton:

    - a few misc things

    - most of MM

    - KASAN updates

    * emailed patches from Andrew Morton: (102 commits)
    kasan: separate report parts by empty lines
    kasan: improve double-free report format
    kasan: print page description after stacks
    kasan: improve slab object description
    kasan: change report header
    kasan: simplify address description logic
    kasan: change allocation and freeing stack traces headers
    kasan: unify report headers
    kasan: introduce helper functions for determining bug type
    mm: hwpoison: call shake_page() after try_to_unmap() for mlocked page
    mm: hwpoison: call shake_page() unconditionally
    mm/swapfile.c: fix swap space leak in error path of swap_free_entries()
    mm/gup.c: fix access_ok() argument type
    mm/truncate: avoid pointless cleancache_invalidate_inode() calls.
    mm/truncate: bail out early from invalidate_inode_pages2_range() if mapping is empty
    fs/block_dev: always invalidate cleancache in invalidate_bdev()
    fs: fix data invalidation in the cleancache during direct IO
    zram: reduce load operation in page_same_filled
    zram: use zram_free_page instead of open-coded
    zram: introduce zram data accessor
    ...

    Linus Torvalds
     
  • GFP_NOFS context is used for the following 5 reasons currently:

    - to prevent deadlocks when the lock held by the allocation
    context would be needed during the memory reclaim

    - to prevent stack overflows during the reclaim because the
    allocation is performed from a deep context already

    - to prevent lockups when the allocation context depends on other
    reclaimers to make forward progress indirectly

    - just in case because this would be safe from the fs POV

    - silence lockdep false positives

    Unfortunately overuse of this allocation context brings some problems to
    the MM. Memory reclaim is much weaker (especially during heavy FS
    metadata workloads), and the OOM killer cannot be invoked because the MM
    layer doesn't have enough information about how much memory is freeable
    by the FS layer.

    In many cases it is far from clear why the weaker context is even used
    and so it might be used unnecessarily. We would like to get rid of
    those as much as possible. One way to do that is to use the flag in
    scopes rather than isolated cases. Such a scope is declared when really
    necessary, tracked per task and all the allocation requests from within
    the context will simply inherit the GFP_NOFS semantic.

    Not only is this easier to understand and maintain, because there are
    far fewer problematic contexts than specific allocation requests, it
    also helps code paths where the FS layer interacts with other layers (e.g.
    crypto, security modules, MM etc...) and there is no easy way to convey
    the allocation context between the layers.

    Introduce memalloc_nofs_{save,restore} API to control the scope of
    GFP_NOFS allocation context. This is basically copying
    memalloc_noio_{save,restore} API we have for other restricted allocation
    context GFP_NOIO. The PF_MEMALLOC_NOFS flag already exists and it is
    just an alias for PF_FSTRANS which has been xfs specific until recently.
    There are no PF_FSTRANS users anymore, so let's just drop it.

    PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
    implicitly, the same way PF_MEMALLOC_NOIO drops __GFP_IO. memalloc_noio_flags
    is renamed to current_gfp_context because it now cares about both
    PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts. Xfs code paths preserve
    their semantic. kmem_flags_convert() doesn't need to evaluate the flag
    anymore.

    This patch shouldn't introduce any functional changes.

    Let's hope that filesystems will drop direct GFP_NOFS (resp. ~__GFP_FS)
    usage as much as possible and only use properly documented
    memalloc_nofs_{save,restore} checkpoints where they are appropriate.

    [akpm@linux-foundation.org: fix comment typo, reflow comment]
    Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Dave Chinner
    Cc: Theodore Ts'o
    Cc: Chris Mason
    Cc: David Sterba
    Cc: Jan Kara
    Cc: Brian Foster
    Cc: Darrick J. Wong
    Cc: Nikolay Borisov
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • The current implementation of the reclaim lockup detection can lead to
    false positives. These do happen in practice and usually lead to tweaking
    the code to silence lockdep by using GFP_NOFS even though the context can
    use __GFP_FS just fine.

    See

    http://lkml.kernel.org/r/20160512080321.GA18496@dastard

    as an example.

    =================================
    [ INFO: inconsistent lock state ]
    4.5.0-rc2+ #4 Tainted: G O
    ---------------------------------
    inconsistent {RECLAIM_FS-ON-R} -> {IN-RECLAIM_FS-W} usage.
    kswapd0/543 [HC0[0]:SC0[0]:HE1:SE1] takes:

    (&xfs_nondir_ilock_class){++++-+}, at: xfs_ilock+0x177/0x200 [xfs]

    {RECLAIM_FS-ON-R} state was registered at:
    mark_held_locks+0x79/0xa0
    lockdep_trace_alloc+0xb3/0x100
    kmem_cache_alloc+0x33/0x230
    kmem_zone_alloc+0x81/0x120 [xfs]
    xfs_refcountbt_init_cursor+0x3e/0xa0 [xfs]
    __xfs_refcount_find_shared+0x75/0x580 [xfs]
    xfs_refcount_find_shared+0x84/0xb0 [xfs]
    xfs_getbmap+0x608/0x8c0 [xfs]
    xfs_vn_fiemap+0xab/0xc0 [xfs]
    do_vfs_ioctl+0x498/0x670
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x12/0x6f

    CPU0
    ----
    lock(&xfs_nondir_ilock_class);

    lock(&xfs_nondir_ilock_class);

    *** DEADLOCK ***

    3 locks held by kswapd0/543:

    stack backtrace:
    CPU: 0 PID: 543 Comm: kswapd0 Tainted: G O 4.5.0-rc2+ #4
    Call Trace:
    lock_acquire+0xd8/0x1e0
    down_write_nested+0x5e/0xc0
    xfs_ilock+0x177/0x200 [xfs]
    xfs_reflink_cancel_cow_range+0x150/0x300 [xfs]
    xfs_fs_evict_inode+0xdc/0x1e0 [xfs]
    evict+0xc5/0x190
    dispose_list+0x39/0x60
    prune_icache_sb+0x4b/0x60
    super_cache_scan+0x14f/0x1a0
    shrink_slab.part.63.constprop.79+0x1e9/0x4e0
    shrink_zone+0x15e/0x170
    kswapd+0x4f1/0xa80
    kthread+0xf2/0x110
    ret_from_fork+0x3f/0x70

    To quote Dave:
    "Ignoring whether reflink should be doing anything or not, that's a
    "xfs_refcountbt_init_cursor() gets called both outside and inside
    transactions" lockdep false positive case. The problem here is lockdep
    has seen this allocation from within a transaction, hence a GFP_NOFS
    allocation, and now it's seeing it in a GFP_KERNEL context. Also note
    that we have an active reference to this inode.

    So, because the reclaim annotations overload the interrupt level
    detections and it's seen the inode ilock been taken in reclaim
    ("interrupt") context, this triggers a reclaim context warning where
    it thinks it is unsafe to do this allocation in GFP_KERNEL context
    holding the inode ilock..."

    This sounds like a fundamental problem of the reclaim lock detection.
    It is really impossible to annotate such a special usecase IMHO unless
    the reclaim lockup detection is reworked completely. Until then it is
    much better to provide an "I know what I am doing" flag and mark
    problematic places. This would prevent abuse of the GFP_NOFS flag,
    which has a runtime effect even on configurations which have lockdep
    disabled.

    Introduce __GFP_NOLOCKDEP flag which tells the lockdep gfp tracking to
    skip the current allocation request.
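
    A minimal user-space sketch of the idea, assuming made-up flag values
    and a hypothetical lockdep_would_check() helper. The real tracking code
    looks at more conditions than the two modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's gfp bits differ. */
typedef unsigned int gfp_t;
#define __GFP_FS        0x1u
#define __GFP_NOLOCKDEP 0x2u

/*
 * Hypothetical model of the gfp tracking decision: should this
 * allocation be recorded as "may recurse into FS reclaim"?
 */
static bool lockdep_would_check(gfp_t gfp_mask)
{
    /* A request without __GFP_FS cannot recurse into the filesystem. */
    if (!(gfp_mask & __GFP_FS))
        return false;

    /* The caller has declared this request a known false positive. */
    if (gfp_mask & __GFP_NOLOCKDEP)
        return false;

    return true;
}
```

    Unlike switching the request to GFP_NOFS, the opt-out bit leaves
    __GFP_FS set, so the runtime reclaim behaviour of the allocation is
    unchanged.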

    While we are at it also make sure that the radix tree doesn't
    accidentally override tags stored in the upper part of the gfp_mask.

    Link: http://lkml.kernel.org/r/20170306131408.9828-3-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Suggested-by: Peter Zijlstra
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Vlastimil Babka
    Cc: Dave Chinner
    Cc: Theodore Ts'o
    Cc: Chris Mason
    Cc: David Sterba
    Cc: Jan Kara
    Cc: Brian Foster
    Cc: Darrick J. Wong
    Cc: Nikolay Borisov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • Patch series "scope GFP_NOFS api", v5.

    This patch (of 7):

    Commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
    during memory allocation") added the memalloc_noio_(save|restore)
    functions to enable people to modify the MM behavior by disabling I/O
    during memory allocation.

    This was further extended in commit 934f3072c17c ("mm: clear __GFP_FS
    when PF_MEMALLOC_NOIO is set").

    memalloc_noio_* functions prevent allocation paths recursing back into
    the filesystem without explicitly changing the flags for every
    allocation site.

    However, lockdep hasn't been keeping up with the changes and it entirely
    misses handling the memalloc_noio adjustments. Instead, it is left to
    the callers of __lockdep_trace_alloc to call the function after they
    have shaved the respective GFP flags, which can lead to false positives:

    =================================
    [ INFO: inconsistent lock state ]
    4.10.0-nbor #134 Not tainted
    ---------------------------------
    inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
    fsstress/3365 [HC0[0]:SC0[0]:HE1:SE1] takes:
    (&xfs_nondir_ilock_class){++++?.}, at: xfs_ilock+0x141/0x230
    {IN-RECLAIM_FS-W} state was registered at:
    __lock_acquire+0x62a/0x17c0
    lock_acquire+0xc5/0x220
    down_write_nested+0x4f/0x90
    xfs_ilock+0x141/0x230
    xfs_reclaim_inode+0x12a/0x320
    xfs_reclaim_inodes_ag+0x2c8/0x4e0
    xfs_reclaim_inodes_nr+0x33/0x40
    xfs_fs_free_cached_objects+0x19/0x20
    super_cache_scan+0x191/0x1a0
    shrink_slab+0x26f/0x5f0
    shrink_node+0xf9/0x2f0
    kswapd+0x356/0x920
    kthread+0x10c/0x140
    ret_from_fork+0x31/0x40
    irq event stamp: 173777
    hardirqs last enabled at (173777): __local_bh_enable_ip+0x70/0xc0
    hardirqs last disabled at (173775): __local_bh_enable_ip+0x37/0xc0
    softirqs last enabled at (173776): _xfs_buf_find+0x67a/0xb70
    softirqs last disabled at (173774): _xfs_buf_find+0x5db/0xb70

    other info that might help us debug this:
    Possible unsafe locking scenario:

    CPU0
    ----
    lock(&xfs_nondir_ilock_class);

    lock(&xfs_nondir_ilock_class);

    *** DEADLOCK ***

    4 locks held by fsstress/3365:
    #0: (sb_writers#10){++++++}, at: mnt_want_write+0x24/0x50
    #1: (&sb->s_type->i_mutex_key#12){++++++}, at: vfs_setxattr+0x6f/0xb0
    #2: (sb_internal#2){++++++}, at: xfs_trans_alloc+0xfc/0x140
    #3: (&xfs_nondir_ilock_class){++++?.}, at: xfs_ilock+0x141/0x230

    stack backtrace:
    CPU: 0 PID: 3365 Comm: fsstress Not tainted 4.10.0-nbor #134
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    Call Trace:
    kmem_cache_alloc_node_trace+0x3a/0x2c0
    vm_map_ram+0x2a1/0x510
    _xfs_buf_map_pages+0x77/0x140
    xfs_buf_get_map+0x185/0x2a0
    xfs_attr_rmtval_set+0x233/0x430
    xfs_attr_leaf_addname+0x2d2/0x500
    xfs_attr_set+0x214/0x420
    xfs_xattr_set+0x59/0xb0
    __vfs_setxattr+0x76/0xa0
    __vfs_setxattr_noperm+0x5e/0xf0
    vfs_setxattr+0xae/0xb0
    setxattr+0x15e/0x1a0
    path_setxattr+0x8f/0xc0
    SyS_lsetxattr+0x11/0x20
    entry_SYSCALL_64_fastpath+0x23/0xc6

    Let's fix this by making lockdep explicitly do the shaving of respective
    GFP flags.
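
    The difference can be modelled in user space as below. The flag values,
    task_flags and the two trace_alloc_*() helpers are assumptions made for
    illustration. The real hook checks more than __GFP_FS.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; not the kernel's. */
typedef unsigned int gfp_t;
#define __GFP_IO   0x1u
#define __GFP_FS   0x2u
#define GFP_KERNEL (__GFP_IO | __GFP_FS)

#define PF_MEMALLOC_NOIO 0x1u

/* Stand-in for current->flags. */
static unsigned int task_flags;

/* Drop the bits that a memalloc_noio scope forbids. */
static gfp_t memalloc_noio_flags(gfp_t flags)
{
    if (task_flags & PF_MEMALLOC_NOIO)
        flags &= ~(__GFP_IO | __GFP_FS);
    return flags;
}

/* Before: the hook trusts the caller to have shaved the mask already. */
static bool trace_alloc_old(gfp_t gfp_mask)
{
    return (gfp_mask & __GFP_FS) != 0;
}

/* After: the hook shaves the mask itself, so no caller can forget. */
static bool trace_alloc_new(gfp_t gfp_mask)
{
    gfp_mask = memalloc_noio_flags(gfp_mask);
    return (gfp_mask & __GFP_FS) != 0;
}
```

    With PF_MEMALLOC_NOIO set, a caller passing an unshaved GFP_KERNEL mask
    is reported as FS-recursing by the old variant but not by the new one.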

    Fixes: 934f3072c17c ("mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set")
    Link: http://lkml.kernel.org/r/20170306131408.9828-2-mhocko@kernel.org
    Signed-off-by: Nikolay Borisov
    Signed-off-by: Michal Hocko
    Acked-by: Peter Zijlstra (Intel)
    Cc: Dave Chinner
    Cc: Theodore Ts'o
    Cc: Chris Mason
    Cc: David Sterba
    Cc: Jan Kara
    Cc: Brian Foster
    Cc: Darrick J. Wong
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nikolay Borisov
     
  • Pull drm updates from Dave Airlie:
    "This is the main drm pull request for v4.12. Apart from two fixes
    pulls, everything should have been in drm-next for at least 2 weeks.

    The biggest thing in here is AMD released the public headers for their
    upcoming VEGA GPUs. These as always are quite a sizeable chunk of
    header files. They've also added initial non-display support for those
    GPUs, though they aren't available in production yet.

    Otherwise it's pretty much normal.

    New bridge drivers:
    - megachips-stdpxxxx-ge-b850v3-fw LVDS->DP++
    - generic LVDS bridge support.

    Core:
    - Displayport link train failure reporting to userspace
    - debugfs interface cleaned up
    - subsystem TODO in kerneldoc now
    - Extended fbdev support (flipping and vblank wait)
    - drm_platform removed
    - EDP CRC support in helper
    - HF-VSDB SCDC support in EDID parser
    - Lots of code cleanups and header extraction
    - Thunderbolt external GPU awareness
    - Atomic helper improvements
    - Documentation improvements

    panel:
    - Sitronix and Samsung new panel support

    amdgpu:
    - Preliminary vega10 support
    - Multi-level page table support
    - GPU sensor support for userspace
    - PRT support for sparse buffers
    - SR-IOV improvements
    - Non-contig VRAM CPU mapping

    i915:
    - Atomic modesetting enabled by default on Gen5+
    - LSPCON improvements
    - Atomic state handling for cdclk
    - GPU reset improvements
    - In-kernel unit tests
    - Geminilake improvements and color manager support
    - Designware i2c fixes
    - vblank evasion improvements
    - Hotplug safe connector iterators
    - GVT scheduler QoS support
    - GVT Kabylake support

    nouveau:
    - Acceleration support for Pascal (GP10x).
    - Rearchitecture of code handling proprietary signed firmware
    - Fix GTX 970 with odd MMU configuration
    - GP10B support
    - GP107 acceleration support

    vmwgfx:
    - Atomic modesetting support for vmwgfx

    omapdrm:
    - Support for render nodes
    - Refactor omapdss code
    - Fix some probe ordering issues
    - Fix too dark RGB565 rendering

    sunxi:
    - prelim rework for multiple pipes.

    mali-dp:
    - Color management support
    - Plane scaling
    - Power management improvements

    imx-drm:
    - Prefetch Resolve Engine/Gasket on i.MX6QP
    - Deferred plane disabling
    - Separate alpha support

    mediatek:
    - Mediatek SoC MT2701 support

    rcar-du:
    - Gen3 HDMI support

    msm:
    - 4k support for newer chips
    - OPP bindings for gpu
    - prep work for per-process pagetables

    vc4:
    - HDMI audio support
    - fixes

    qxl:
    - minor fixes.

    dw-hdmi:
    - PHY improvements
    - CSC fixes
    - Amlogic GX SoC support"

    * tag 'drm-for-v4.12' of git://people.freedesktop.org/~airlied/linux: (1778 commits)
    drm/nouveau/fb/gf100-: Fix 32 bit wraparound in new ram detection
    drm/nouveau/secboot/gm20b: fix the error return code in gm20b_secboot_tegra_read_wpr()
    drm/nouveau/kms: Increase max retries in scanout position queries.
    drm/nouveau/bios/bitP: check that table is long enough for optional pointers
    drm/nouveau/fifo/nv40: no ctxsw for pre-nv44 mpeg engine
    drm: mali-dp: use div_u64 for expensive 64-bit divisions
    drm/i915: Confirm the request is still active before adding it to the await
    drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio
    drm/i915/selftests: Allocate inode/file dynamically
    drm/i915: Fix system hang with EI UP masked on Haswell
    drm/i915: checking for NULL instead of IS_ERR() in mock selftests
    drm/i915: Perform link quality check unconditionally during long pulse
    drm/i915: Fix use after free in lpe_audio_platdev_destroy()
    drm/i915: Use the right mapping_gfp_mask for final shmem allocation
    drm/i915: Make legacy cursor updates more unsynced
    drm/i915: Apply a cond_resched() to the saturated signaler
    drm/i915: Park the signaler before sleeping
    drm: mali-dp: Check the mclk rate and allow up/down scaling
    drm: mali-dp: Enable image enhancement when scaling
    drm: mali-dp: Add plane upscaling support
    ...

    Linus Torvalds
     
  • Pull fsnotify updates from Jan Kara:
    "The branch contains mainly a rework of fsnotify infrastructure fixing
    a shortcoming where we waited for a response to fanotify permission
    events with the SRCU read lock held, so when the process consuming events
    was slow to respond the kernel stalled.

    It also contains several cleanups of unnecessary indirections in
    fsnotify framework and a bugfix from Amir fixing leakage of kernel
    internal errno to userspace"

    * 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (37 commits)
    fanotify: don't expose EOPENSTALE to userspace
    fsnotify: remove a stray unlock
    fsnotify: Move ->free_mark callback to fsnotify_ops
    fsnotify: Add group pointer in fsnotify_init_mark()
    fsnotify: Drop inode_mark.c
    fsnotify: Remove fsnotify_find_{inode|vfsmount}_mark()
    fsnotify: Remove fsnotify_detach_group_marks()
    fsnotify: Rename fsnotify_clear_marks_by_group_flags()
    fsnotify: Inline fsnotify_clear_{inode|vfsmount}_mark_group()
    fsnotify: Remove fsnotify_recalc_{inode|vfsmount}_mask()
    fsnotify: Remove fsnotify_set_mark_{,ignored_}mask_locked()
    fanotify: Release SRCU lock when waiting for userspace response
    fsnotify: Pass fsnotify_iter_info into handle_event handler
    fsnotify: Provide framework for dropping SRCU lock in ->handle_event
    fsnotify: Remove special handling of mark destruction on group shutdown
    fsnotify: Detach mark from object list when last reference is dropped
    fsnotify: Move queueing of mark for destruction into fsnotify_put_mark()
    inotify: Do not drop mark reference under idr_lock
    fsnotify: Free fsnotify_mark_connector when there is no mark attached
    fsnotify: Lock object list with connector lock
    ...

    Linus Torvalds
     
  • Pull audit updates from Paul Moore:
    "Fourteen audit patches for v4.12 that span the full range of fixes,
    new features, and internal cleanups.

    We have patches to move to 64-bit timestamps, convert refcounts from
    atomic_t to refcount_t, track PIDs using the pid struct instead of
    pid_t, convert our own private audit buffer cache to a standard
    kmem_cache, log kernel module names when they are unloaded, and
    normalize the NETFILTER_PKT to make the userspace folks happier.

    From a fixes perspective, the most important is likely the auditd
    connection tracking RCU fix; it was a rather brain dead bug that I'll
    take the blame for, but thankfully it didn't seem to affect many
    people (only one report).

    I think the patch subject lines and commit descriptions do a pretty
    good job of explaining the details and why the changes are important
    so I'll point you there instead of duplicating it here; as usual, if
    you have any questions you know where to find us.

    We also manage to take out more code than we put in this time, that
    always makes me happy :)"

    * 'stable-4.12' of git://git.infradead.org/users/pcmoore/audit:
    audit: fix the RCU locking for the auditd_connection structure
    audit: use kmem_cache to manage the audit_buffer cache
    audit: Use timespec64 to represent audit timestamps
    audit: store the auditd PID as a pid struct instead of pid_t
    audit: kernel generated netlink traffic should have a portid of 0
    audit: combine audit_receive() and audit_receive_skb()
    audit: convert audit_watch.count from atomic_t to refcount_t
    audit: convert audit_tree.count from atomic_t to refcount_t
    audit: normalize NETFILTER_PKT
    netfilter: use consistent ipv4 network offset in xt_AUDIT
    audit: log module name on delete_module
    audit: remove unnecessary semicolon in audit_watch_handle_event()
    audit: remove unnecessary semicolon in audit_mark_handle_event()
    audit: remove unnecessary semicolon in audit_field_valid()

    Linus Torvalds
     

03 May, 2017

6 commits

  • Pull security subsystem updates from James Morris:
    "Highlights:

    IMA:
    - provide ">" and " of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (98 commits)
    tpm: Fix reference count to main device
    tpm_tis: convert to using locality callbacks
    tpm: fix handling of the TPM 2.0 event logs
    tpm_crb: remove a cruft constant
    keys: select CONFIG_CRYPTO when selecting DH / KDF
    apparmor: Make path_max parameter readonly
    apparmor: fix parameters so that the permission test is bypassed at boot
    apparmor: fix invalid reference to index variable of iterator line 836
    apparmor: use SHASH_DESC_ON_STACK
    security/apparmor/lsm.c: set debug messages
    apparmor: fix boolreturn.cocci warnings
    Smack: Use GFP_KERNEL for smk_netlbl_mls().
    smack: fix double free in smack_parse_opts_str()
    KEYS: add SP800-56A KDF support for DH
    KEYS: Keyring asymmetric key restrict method with chaining
    KEYS: Restrict asymmetric key linkage using a specific keychain
    KEYS: Add a lookup_restriction function for the asymmetric key type
    KEYS: Add KEYCTL_RESTRICT_KEYRING
    KEYS: Consistent ordering for __key_link_begin and restrict check
    KEYS: Add an optional lookup_restriction hook to key_type
    ...

    Linus Torvalds
     
  • Pull trivial tree updates from Jiri Kosina.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    tty: fix comment for __tty_alloc_driver()
    init/main: properly align the multi-line comment
    init/main: Fix double "the" in comment
    Fix dead URLs to ftp.kernel.org
    drivers: Clean up duplicated email address
    treewide: Fix typo in xml/driver-api/basics.xml
    tools/testing/selftests/powerpc: remove redundant CFLAGS in Makefile: "-Wall -O2 -Wall" -> "-O2 -Wall"
    selftests/timers: Spelling s/privledges/privileges/
    HID: picoLCD: Spelling s/REPORT_WRTIE_MEMORY/REPORT_WRITE_MEMORY/
    net: phy: dp83848: Fix Typo
    UBI: Fix typos
    Documentation: ftrace.txt: Correct nice value of 120 priority
    net: fec: Fix typo in error msg and comment
    treewide: Fix typos in printk

    Linus Torvalds
     
  • Pull livepatch updates from Jiri Kosina:

    - a per-task consistency model is being added for architectures that
    support reliable stack dumping (extending this currently rather
    trivial set is in the works).

    This extends the nature of the types of patches that can be applied
    by live patching infrastructure. The code stems from the design
    proposal made [1] back in November 2014. It's a hybrid of SUSE's
    kGraft and RH's kpatch, combining advantages of both: it uses
    kGraft's per-task consistency and syscall barrier switching combined
    with kpatch's stack trace switching. There are also a number of
    fallback options which make it quite flexible.

    Most of the heavy lifting was done by Josh Poimboeuf with help from
    Miroslav Benes and Petr Mladek

    [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz

    - module load time patch optimization from Zhou Chengming

    - a few assorted small fixes
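
    As a rough user-space illustration of the per-task model described in
    the first item: every task carries its own patch state, a task is
    switched only when a reliable stack trace shows it is not inside the
    function being patched, and crossing back to user space serves as a
    fallback switch point. All names here (struct task, call_func() and
    friends) are hypothetical and do not reflect the kernel's data
    structures.

```c
#include <assert.h>
#include <stdbool.h>

enum patch_state { TASK_UNPATCHED, TASK_PATCHED };

/* Hypothetical per-task view; the kernel keeps this in the task struct. */
struct task {
    enum patch_state state;
    bool stack_reliable;         /* can the stack trace be trusted?        */
    bool patched_func_on_stack;  /* is the task inside the old function?   */
};

static int old_func(void) { return 1; }
static int new_func(void) { return 2; }

/* Each task sees the old or the new function depending on its own state. */
static int call_func(const struct task *t)
{
    return t->state == TASK_PATCHED ? new_func() : old_func();
}

/* Stack trace switching: only safe if the old function is not in use. */
static bool try_switch_task(struct task *t)
{
    if (!t->stack_reliable || t->patched_func_on_stack)
        return false;
    t->state = TASK_PATCHED;
    return true;
}

/* Syscall barrier switching: no kernel frames remain at this point. */
static void on_return_to_user(struct task *t)
{
    t->patched_func_on_stack = false;
    t->state = TASK_PATCHED;
}
```

    A task that cannot be switched by the stack check simply keeps running
    the old function until it reaches the barrier, so no task ever sees a
    mix of old and new code within one call chain.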

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: add missing printk newlines
    livepatch: Cancel transition a safe way for immediate patches
    livepatch: Reduce the time of finding module symbols
    livepatch: make klp_mutex proper part of API
    livepatch: allow removal of a disabled patch
    livepatch: add /proc/<pid>/patch_state
    livepatch: change to a per-task consistency model
    livepatch: store function sizes
    livepatch: use kstrtobool() in enabled_store()
    livepatch: move patching functions into patch.c
    livepatch: remove unnecessary object loaded check
    livepatch: separate enabled and patched states
    livepatch/s390: add TIF_PATCH_PENDING thread flag
    livepatch/s390: reorganize TIF thread flag bits
    livepatch/powerpc: add TIF_PATCH_PENDING thread flag
    livepatch/x86: add TIF_PATCH_PENDING thread flag
    livepatch: create temporary klp_update_patch_state() stub
    x86/entry: define _TIF_ALLWORK_MASK flags explicitly
    stacktrace/x86: add function for detecting reliable stack traces

    Linus Torvalds
     
  • Pull networking updates from David Miller:
    "Here are some highlights from the 2065 networking commits that
    happened this development cycle:

    1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)

    2) Add a generic XDP driver, so that anyone can test XDP even if they
    lack a networking device whose driver has explicit XDP support
    (me).

    3) Sparc64 now has an eBPF JIT too (me)

    4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
    Starovoitov)

    5) Make netfilter network namespace teardown less expensive (Florian
    Westphal)

    6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)

    7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)

    8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)

    9) Multiqueue support in stmmac driver (Joao Pinto)

    10) Remove TCP timewait recycling, it never really could possibly work
    well in the real world and timestamp randomization really zaps any
    hint of usability this feature had (Soheil Hassas Yeganeh)

    11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
    Aleksandrov)

    12) Add socket busy poll support to epoll (Sridhar Samudrala)

    13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
    and several others)

    14) IPSEC hw offload infrastructure (Steffen Klassert)"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
    tipc: refactor function tipc_sk_recv_stream()
    tipc: refactor function tipc_sk_recvmsg()
    net: thunderx: Optimize page recycling for XDP
    net: thunderx: Support for XDP header adjustment
    net: thunderx: Add support for XDP_TX
    net: thunderx: Add support for XDP_DROP
    net: thunderx: Add basic XDP support
    net: thunderx: Cleanup receive buffer allocation
    net: thunderx: Optimize CQE_TX handling
    net: thunderx: Optimize RBDR descriptor handling
    net: thunderx: Support for page recycling
    ipx: call ipxitf_put() in ioctl error path
    net: sched: add helpers to handle extended actions
    qed*: Fix issues in the ptp filter config implementation.
    qede: Fix concurrency issue in PTP Tx path processing.
    stmmac: Add support for SIMATIC IOT2000 platform
    net: hns: fix ethtool_get_strings overflow in hns driver
    tcp: fix wraparound issue in tcp_lp
    bpf, arm64: fix jit branch offset related to ldimm64
    bpf, arm64: implement jiting of BPF_XADD
    ...

    Linus Torvalds
     
  • Pull crypto updates from Herbert Xu:
    "Here is the crypto update for 4.12:

    API:
    - Add batch registration for acomp/scomp
    - Change acomp testing to non-unique compressed result
    - Extend algorithm name limit to 128 bytes
    - Require setkey before accept(2) in algif_aead

    Algorithms:
    - Add support for deflate rfc1950 (zlib)

    Drivers:
    - Add accelerated crct10dif for powerpc
    - Add crc32 in stm32
    - Add sha384/sha512 in ccp
    - Add 3des/gcm(aes) for v5 devices in ccp
    - Add Queue Interface (QI) backend support in caam
    - Add new Exynos RNG driver
    - Add ThunderX ZIP driver
    - Add driver for hardware random generator on MT7623 SoC"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (101 commits)
    crypto: stm32 - Fix OF module alias information
    crypto: algif_aead - Require setkey before accept(2)
    crypto: scomp - add support for deflate rfc1950 (zlib)
    crypto: scomp - allow registration of multiple scomps
    crypto: ccp - Change ISR handler method for a v5 CCP
    crypto: ccp - Change ISR handler method for a v3 CCP
    crypto: crypto4xx - rename ce_ring_contol to ce_ring_control
    crypto: testmgr - Allow ecb(cipher_null) in FIPS mode
    Revert "crypto: arm64/sha - Add constant operand modifier to ASM_EXPORT"
    crypto: ccp - Disable interrupts early on unload
    crypto: ccp - Use only the relevant interrupt bits
    hwrng: mtk - Add driver for hardware random generator on MT7623 SoC
    dt-bindings: hwrng: Add Mediatek hardware random generator bindings
    crypto: crct10dif-vpmsum - Fix missing preempt_disable()
    crypto: testmgr - replace compression known answer test
    crypto: acomp - allow registration of multiple acomps
    hwrng: n2 - Use devm_kcalloc() in n2rng_probe()
    crypto: chcr - Fix error handling related to 'chcr_alloc_shash'
    padata: get_next is never NULL
    crypto: exynos - Add new Exynos RNG driver
    ...

    Linus Torvalds
     
  • Pull splice updates from Al Viro:
    "These actually missed the last cycle; the branch itself is from last
    December"

    * 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    make nr_pages calculation in default_file_splice_read() a bit less ugly
    splice/tee/vmsplice: validate flags
    splice_pipe_desc: kill ->flags
    remove spd_release_page()

    Linus Torvalds
     

02 May, 2017

3 commits

  • Cong Wang correctly pointed out that the RCU read locking of the
    auditd_connection struct was wrong; this patch corrects this by
    adopting a more traditional, and correct, RCU locking model.

    This patch is heavily based on an earlier prototype by Cong Wang.

    Cc: # 4.11.x-
    Reported-by: Cong Wang
    Signed-off-by: Cong Wang
    Signed-off-by: Paul Moore

    Paul Moore
     
  • The audit subsystem implemented its own buffer cache mechanism which
    is a bit silly these days when we could use the kmem_cache construct.

    Some credit is due to Florian Westphal for originally proposing that
    we remove the audit cache implementation in favor of simple
    kmalloc()/kfree() calls, but I would rather have a dedicated slab
    cache to ease debugging and future stats/performance work.

    Cc: Florian Westphal
    Reviewed-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Paul Moore
     
  • struct timespec is not y2038 safe.
    Audit timestamps are recorded in string format into
    an audit buffer for a given context.
    These mark the entry timestamps for the syscalls.
    Use y2038 safe struct timespec64 to represent the times.
    The log strings can handle this transition as strings can
    hold up to 1024 characters.
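
    A small user-space sketch of why the wider type matters. struct ts64,
    overflows_32bit() and format_stamp() are illustrative names, and the
    format string is only loosely modelled on an audit record prefix. It is
    an assumption, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for a 64-bit timestamp type. */
struct ts64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/* True if the value cannot be held in a signed 32-bit seconds field. */
static int overflows_32bit(int64_t sec)
{
    return sec > INT32_MAX || sec < INT32_MIN;
}

/* Write "audit(sec.msec:serial)" into buf; returns the snprintf() result. */
static int format_stamp(char *buf, size_t len, struct ts64 t, unsigned int serial)
{
    return snprintf(buf, len, "audit(%lld.%03ld:%u)",
                    (long long)t.tv_sec, t.tv_nsec / 1000000, serial);
}
```

    The first second past the signed 32-bit limit is 2147483648; the
    formatted stamp for it is still only a few dozen characters, far below
    the 1024 character limit mentioned above.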

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Acked-by: Paul Moore
    Acked-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Deepa Dinamani