07 May, 2019

2 commits

  • Pull power management updates from Rafael Wysocki:
    "These fix the (Intel-specific) Performance and Energy Bias Hint (EPB)
    handling and expose it to user space via sysfs, fix and clean up
    several cpufreq drivers, add support for two new chips to the qoriq
    cpufreq driver, fix, simplify and clean up the cpufreq core and the
    schedutil governor, add support for "CPU" domains to the generic power
    domains (genpd) framework and provide low-level PSCI firmware support
    for that feature, fix the exynos cpuidle driver and fix a couple of
    issues in the devfreq subsystem and clean it up.

    Specifics:

    - Fix the handling of Performance and Energy Bias Hint (EPB) on Intel
    processors and expose it to user space via sysfs to avoid having to
    access it through the generic MSR I/F (Rafael Wysocki).

    - Improve the handling of global turbo changes made by the platform
    firmware in the intel_pstate driver (Rafael Wysocki).

    - Convert some slow-path static_cpu_has() callers to boot_cpu_has()
    in cpufreq (Borislav Petkov).

    - Fix the frequency calculation loop in the armada-37xx cpufreq
    driver (Gregory CLEMENT).

    - Fix possible object reference leaks in multiple cpufreq drivers
    (Wen Yang).

    - Fix kerneldoc comment in the centrino cpufreq driver (dongjian).

    - Clean up the ACPI and maple cpufreq drivers (Viresh Kumar, Mohan
    Kumar).

    - Add support for lx2160a and ls1028a to the qoriq cpufreq driver
    (Vabhav Sharma, Yuantian Tang).

    - Fix kobject memory leak in the cpufreq core (Viresh Kumar).

    - Simplify the IOwait boosting in the schedutil cpufreq governor and
    rework the TSC cpufreq notifier on x86 (Rafael Wysocki).

    - Clean up the cpufreq core and statistics code (Yue Hu, Kyle Lin).

    - Improve the cpufreq documentation, add SPDX license tags to some PM
    documentation files and unify copyright notices in them (Rafael
    Wysocki).

    - Add support for "CPU" domains to the generic power domains (genpd)
    framework and provide low-level PSCI firmware support for that
    feature (Ulf Hansson).

    - Rearrange the PSCI firmware support code and add support for
    SYSTEM_RESET2 to it (Ulf Hansson, Sudeep Holla).

    - Improve genpd support for devices in multiple power domains (Ulf
    Hansson).

    - Unify target residency for the AFTR and coupled AFTR states in the
    exynos cpuidle driver (Marek Szyprowski).

    - Introduce new helper routine in the operating performance points
    (OPP) framework (Andrew-sh.Cheng).

    - Add support for passing on-die termination (ODT) and auto power
    down parameters from the kernel to Trusted Firmware-A (TF-A) to the
    rk3399_dmc devfreq driver (Enric Balletbo i Serra).

    - Add tracing to devfreq (Lukasz Luba).

    - Make the exynos-bus devfreq driver suspend all devices on system
    shutdown (Marek Szyprowski).

    - Fix a few minor issues in the devfreq subsystem and clean it up
    somewhat (Enric Balletbo i Serra, MyungJoo Ham, Rob Herring,
    Saravana Kannan, Yangtao Li).

    - Improve system wakeup diagnostics (Stephen Boyd).

    - Rework filesystem sync messages emitted during system suspend and
    hibernation (Harry Pan)"

    * tag 'pm-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits)
    cpufreq: Fix kobject memleak
    cpufreq: armada-37xx: fix frequency calculation for opp
    cpufreq: centrino: Fix centrino_setpolicy() kerneldoc comment
    cpufreq: qoriq: add support for lx2160a
    x86: tsc: Rework time_cpufreq_notifier()
    PM / Domains: Allow to attach a CPU via genpd_dev_pm_attach_by_id|name()
    PM / Domains: Search for the CPU device outside the genpd lock
    PM / Domains: Drop unused in-parameter to some genpd functions
    PM / Domains: Use the base device for driver_deferred_probe_check_state()
    cpufreq: qoriq: Add ls1028a chip support
    PM / Domains: Enable genpd_dev_pm_attach_by_id|name() for single PM domain
    PM / Domains: Allow OF lookup for multi PM domain case from ->attach_dev()
    PM / Domains: Don't kfree() the virtual device in the error path
    cpufreq: Move ->get callback check outside of __cpufreq_get()
    PM / Domains: remove unnecessary unlikely()
    cpufreq: Remove needless bios_limit check in show_bios_limit()
    drivers/cpufreq/acpi-cpufreq.c: This fixes the following checkpatch warning
    firmware/psci: add support for SYSTEM_RESET2
    PM / devfreq: add tracing for scheduling work
    trace: events: add devfreq trace event file
    ...

    Linus Torvalds
     
  • Pull timer updates from Ingo Molnar:
    "This cycle had the following changes:

    - Timer tracing improvements (Anna-Maria Gleixner)

    - Continued tasklet reduction work: remove the hrtimer_tasklet
    (Thomas Gleixner)

    - Fix CPU hotplug remove race in the tick-broadcast mask handling
    code (Thomas Gleixner)

    - Force upper bound for setting CLOCK_REALTIME, to fix ABI
    inconsistencies with handling values that are close to the maximum
    supported and the vagueness of when uptime related wraparound might
    occur. Make the consistent maximum the year 2232 across all
    relevant ABIs and APIs. (Thomas Gleixner)

    - various cleanups and smaller fixes"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tick: Fix typos in comments
    tick/broadcast: Fix warning about undefined tick_broadcast_oneshot_offline()
    timekeeping: Force upper bound for setting CLOCK_REALTIME
    timer/trace: Improve timer tracing
    timer/trace: Replace deprecated vsprintf pointer extension %pf by %ps
    timer: Move trace point to get proper index
    tick/sched: Update tick_sched struct documentation
    tick: Remove outgoing CPU from broadcast masks
    timekeeping: Consistently use unsigned int for seqcount snapshot
    softirq: Remove tasklet_hrtimer
    xfrm: Replace hrtimer tasklet with softirq hrtimer
    mac80211_hwsim: Replace hrtimer tasklet with softirq hrtimer
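
    The CLOCK_REALTIME upper bound mentioned in the pull message above can be
    reproduced with a little arithmetic. The following is only a sketch under
    stated assumptions: time is kept as a signed 64-bit nanosecond count, and
    a 30-year margin is reserved for uptime so that realtime plus uptime
    cannot wrap. The names NSEC_PER_SEC_SKETCH, UPTIME_RESERVE_SEC,
    settod_sec_max(), settod_is_valid() and approx_year() are hypothetical and
    are not taken from the patch itself.

```c
#include <stdint.h>

/* Assumption: nanoseconds are stored in a signed 64-bit integer. */
#define NSEC_PER_SEC_SKETCH 1000000000LL
/* Assumption: 30 years (of 365 days) are reserved for uptime. */
#define UPTIME_RESERVE_SEC  (30LL * 365 * 24 * 3600)

/* Largest number of seconds since the epoch that may be set. */
int64_t settod_sec_max(void)
{
    int64_t ktime_sec_max = INT64_MAX / NSEC_PER_SEC_SKETCH;

    return ktime_sec_max - UPTIME_RESERVE_SEC;
}

/* Accept a new wall-clock time only if it lies inside the bound. */
int settod_is_valid(int64_t tv_sec, long tv_nsec)
{
    if (tv_sec < 0 || tv_sec > settod_sec_max())
        return 0;
    return tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC_SKETCH;
}

/* Rough calendar year, using the mean Gregorian year of 31556952 s. */
int approx_year(int64_t sec)
{
    return 1970 + (int)(sec / 31556952LL);
}
```

    Under these assumptions, approx_year(settod_sec_max()) lands in 2232, which
    matches the maximum quoted above. The exact comparison performed at the
    limit is a guess here.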

    Linus Torvalds
     

16 Apr, 2019

1 commit

  • The patch adds a new file with trace events for the devfreq
    framework. They are used for performance analysis of the framework.
    It also updates the MAINTAINERS file, adding a new entry for the
    devfreq maintainers.

    Signed-off-by: Lukasz Luba
    Reviewed-by: Chanwoo Choi
    Signed-off-by: MyungJoo Ham

    Lukasz Luba
     

05 Apr, 2019

1 commit

  • At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
    function call syscall_get_arguments() implemented in x86 was horribly
    written and not optimized for the standard case of passing in 0 and 6 for
    the starting index and the number of system call arguments to get. When looking at
    all the users of this function, I discovered that all instances pass in only
    0 and 6 for these arguments. Instead of having this function handle
    different cases that are never used, simply rewrite it to return the first 6
    arguments of a system call.

    This should help out the performance of tracing system calls by ptrace,
    ftrace and perf.
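
    The simplification can be pictured with a small user-space sketch. It is
    not the kernel code: struct fake_regs, get_args_ranged() and
    get_args_all() are hypothetical names, and the register order (di, si,
    dx, r10, r8, r9) is the usual x86-64 system call convention, assumed here
    for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for the saved registers; only the six x86-64
 * system call argument registers are modelled. */
struct fake_regs {
    unsigned long di, si, dx, r10, r8, r9;
};

/* Old shape: the caller picks a start index i and a count n, so the
 * function has to validate and handle an arbitrary range. */
void get_args_ranged(const struct fake_regs *regs,
                     unsigned int i, unsigned int n,
                     unsigned long *args)
{
    const unsigned long all[6] = {
        regs->di, regs->si, regs->dx, regs->r10, regs->r8, regs->r9
    };

    if (i >= 6 || n > 6 - i)
        return; /* out-of-range request: copy nothing */
    memcpy(args, &all[i], n * sizeof(args[0]));
}

/* New shape: always return the first six arguments, with no range
 * handling left to get wrong. */
void get_args_all(const struct fake_regs *regs, unsigned long *args)
{
    args[0] = regs->di;
    args[1] = regs->si;
    args[2] = regs->dx;
    args[3] = regs->r10;
    args[4] = regs->r8;
    args[5] = regs->r9;
}
```

    Since every caller passed 0 and 6, get_args_ranged(regs, 0, 6, args) and
    get_args_all(regs, args) fill the array identically in this sketch.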

    Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org

    Cc: Oleg Nesterov
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Dominik Brodowski
    Cc: Dave Martin
    Cc: "Dmitry V. Levin"
    Cc: x86@kernel.org
    Cc: linux-snps-arc@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-mips@vger.kernel.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux-arch@vger.kernel.org
    Acked-by: Paul Burton # MIPS parts
    Acked-by: Max Filippov # For xtensa changes
    Acked-by: Will Deacon # For the arm64 bits
    Reviewed-by: Thomas Gleixner # for x86
    Reviewed-by: Dmitry V. Levin
    Reported-by: Andy Lutomirski
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (Red Hat)
     

25 Mar, 2019

2 commits

  • Timers are added to the timer wheel off by one. This is required in
    case a timer is queued directly before incrementing jiffies to prevent
    early timer expiry.

    When reading a timer trace and relying only on the expiry time of the timer
    in the timer_start trace point and on the now in the timer_expire_entry
    trace point, it seems that the timer fires late. With the current
    timer_expire_entry trace point information only now=jiffies is printed but
    not the value of base->clk. This makes it impossible to draw a conclusion
    about the index of base->clk and makes it impossible to examine timer problems
    without additional trace points.

    Therefore add the base->clk value to the timer_expire_entry trace
    point, to be able to calculate the index the timer base is located at
    during collecting expired timers.

    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: peterz@infradead.org
    Cc: Steven Rostedt
    Link: https://lkml.kernel.org/r/20190321120921.16463-5-anna-maria@linutronix.de

    Anna-Maria Gleixner
     
  • Since commit 04b8eb7a4ccd ("symbol lookup: introduce
    dereference_symbol_descriptor()") %pf is deprecated, because %ps is smart
    enough to handle function pointer dereference on platforms where such a
    dereference is required.

    While at it add proper line breaks to stay in the 80 character limit.

    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: peterz@infradead.org
    Cc: Steven Rostedt
    Link: https://lkml.kernel.org/r/20190321120921.16463-4-anna-maria@linutronix.de

    Anna-Maria Gleixner
     

17 Mar, 2019

1 commit

  • Pull NFS client bugfixes from Trond Myklebust:
    "Highlights include:

    Bugfixes:
    - Fix an Oops in SUNRPC back channel tracepoints
    - Fix a SUNRPC client regression when handling oversized replies
    - Fix the minimal size for SUNRPC reply buffer allocation
    - rpc_decode_header() must always return a non-zero value on error
    - Fix a typo in pnfs_update_layout()

    Cleanup:
    - Remove redundant check for the reply length in call_decode()"

    * tag 'nfs-for-5.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    SUNRPC: Remove redundant check for the reply length in call_decode()
    SUNRPC: Handle the SYSTEM_ERR rpc error
    SUNRPC: rpc_decode_header() must always return a non-zero value on error
    SUNRPC: Use the ENOTCONN error on socket disconnect
    SUNRPC: Fix the minimal size for reply buffer allocation
    SUNRPC: Fix a client regression when handling oversized replies
    pNFS: Fix a typo in pnfs_update_layout
    fix null pointer deref in tracepoints in back channel

    Linus Torvalds
     

16 Mar, 2019

1 commit

  • Pull f2fs updates from Jaegeuk Kim:
    "We've continued mainly to fix bugs in this round, as f2fs has been
    shipped in more devices. Especially, we've focused on stabilizing
    checkpoint=disable feature, and provided some interfaces for QA.

    Enhancements:
    - expose FS_NOCOW_FL for pin_file
    - run discard jobs at unmount time with timeout
    - tune discarding thread to avoid idling which consumes power
    - some checking codes to address vulnerabilities
    - give random value to i_generation
    - shutdown with more flags for QA

    Bug fixes:
    - clean up stale objects when mount is failed along with
    checkpoint=disable
    - fix system being stuck due to wrong count by atomic writes
    - handle some corrupted disk cases
    - fix a deadlock in f2fs_read_inline_dir

    We've also added some minor build error fixes and clean-up patches"

    * tag 'f2fs-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (53 commits)
    f2fs: set pin_file under CAP_SYS_ADMIN
    f2fs: fix to avoid deadlock in f2fs_read_inline_dir()
    f2fs: fix to adapt small inline xattr space in __find_inline_xattr()
    f2fs: fix to do sanity check with inode.i_inline_xattr_size
    f2fs: give some messages for inline_xattr_size
    f2fs: don't trigger read IO for beyond EOF page
    f2fs: fix to add refcount once page is tagged PG_private
    f2fs: remove wrong comment in f2fs_invalidate_page()
    f2fs: fix to use kvfree instead of kzfree
    f2fs: print more parameters in trace_f2fs_map_blocks
    f2fs: trace f2fs_ioc_shutdown
    f2fs: fix to avoid deadlock of atomic file operations
    f2fs: fix to dirty inode for i_mode recovery
    f2fs: give random value to i_generation
    f2fs: no need to take page lock in readdir
    f2fs: fix to update iostat correctly in IPU path
    f2fs: fix encrypted page memory leak
    f2fs: make fault injection covering __submit_flush_wait()
    f2fs: fix to retry fill_super only if recovery failed
    f2fs: silence VM_WARN_ON_ONCE in mempool_alloc
    ...

    Linus Torvalds
     

15 Mar, 2019

1 commit

  • Pull dmaengine updates from Vinod Koul:

    - dmatest updates for modularizing common struct and code

    - remove SG support for VDMA xilinx IP and updates to driver

    - Update to dw driver to support Intel iDMA controllers multi-block
    support

    - tegra updates for proper reporting of residue

    - Add Snow Ridge ioatdma device id and support for IOATDMA v3.4

    - struct_size() usage and useless LIST_HEAD cleanups in subsystem.

    - qDMA controller driver for Layerscape SoCs

    - stm32-dma PM Runtime support

    - And usual updates to imx-sdma, sprd, Documentation, fsl-edma,
    bcm2835, qcom_hidma etc

    * tag 'dmaengine-5.1-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (81 commits)
    dmaengine: imx-sdma: fix consistent dma test failures
    dmaengine: imx-sdma: add a test for imx8mq multi sdma devices
    dmaengine: imx-sdma: add clock ratio 1:1 check
    dmaengine: dmatest: move test data alloc & free into functions
    dmaengine: dmatest: add short-hand `buf_size` var in dmatest_func()
    dmaengine: dmatest: wrap src & dst data into a struct
    dmaengine: ioatdma: support latency tolerance report (LTR) for v3.4
    dmaengine: ioatdma: add descriptor pre-fetch support for v3.4
    dmaengine: ioatdma: disable DCA enabling on IOATDMA v3.4
    dmaengine: ioatdma: Add Snow Ridge ioatdma device id
    dmaengine: sprd: Change channel id to slave id for DMA cell specifier
    dt-bindings: dmaengine: sprd: Change channel id to slave id for DMA cell specifier
    dmaengine: mv_xor: Use correct device for DMA API
    Documentation :dmaengine: clarify DMA desc. pointer after submission
    Documentation: dmaengine: fix dmatest.rst warning
    dmaengine: k3dma: Add support for dma-channel-mask
    dmaengine: k3dma: Delete axi_config
    dmaengine: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware
    Documentation: bindings: dma: Add binding for dma-channel-mask
    Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp
    ...

    Linus Torvalds
     

13 Mar, 2019

4 commits

  • Print more parameters in trace_f2fs_map_blocks for a better map_blocks trace.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This patch adds tracing for f2fs_ioc_shutdown.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Pull NFS client updates from Trond Myklebust:
    "Highlights include:

    Stable fixes:
    - Fixes for NFS I/O request leakages
    - Fix error handling paths in the NFS I/O recoalescing code
    - Reinitialise NFSv4.1 sequence results before retransmitting a
    request
    - Fix a soft lockup in the delegation recovery code
    - Bulk destroy of layouts needs to be safe w.r.t. umount
    - Prevent thundering herd issues when the SUNRPC socket is not
    connected
    - Respect RPC call timeouts when retrying transmission

    Features:
    - Convert rpc auth layer to use xdr_streams
    - Config option to disable insecure RPCSEC_GSS crypto types
    - Reduce size of RPC receive buffers
    - Readdirplus optimization by cache mechanism
    - Convert SUNRPC socket send code to use iov_iter()
    - SUNRPC micro-optimisations to avoid indirect calls
    - Add support for the pNFS LAYOUTERROR operation and use it with the
    pNFS/flexfiles driver
    - Add trace events to report non-zero NFS status codes
    - Various removals of unnecessary dprintks

    Bugfixes and cleanups:
    - Fix a number of sparse warnings and documentation format warnings
    - Fix nfs_parse_devname to not modify its argument
    - Fix potential corruption of page being written through pNFS/blocks
    - fix xfstest generic/099 failures on nfsv3
    - Avoid NFSv4.1 "false retries" when RPC calls are interrupted
    - Abort I/O early if the pNFS/flexfiles layout segment was
    invalidated
    - Avoid unnecessary pNFS/flexfiles layout invalidations"

    * tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits)
    SUNRPC: Take the transport send lock before binding+connecting
    SUNRPC: Micro-optimise when the task is known not to be sleeping
    SUNRPC: Check whether the task was transmitted before rebind/reconnect
    SUNRPC: Remove redundant calls to RPC_IS_QUEUED()
    SUNRPC: Clean up
    SUNRPC: Respect RPC call timeouts when retrying transmission
    SUNRPC: Fix up RPC back channel transmission
    SUNRPC: Prevent thundering herd when the socket is not connected
    SUNRPC: Allow dynamic allocation of back channel slots
    NFSv4.1: Bump the default callback session slot count to 16
    SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc
    NFS/flexfiles: Clean up mirror DS initialisation
    NFS/flexfiles: Remove dead code in ff_layout_mirror_valid()
    NFS/flexfile: Simplify nfs4_ff_layout_select_ds_stateid()
    NFS/flexfile: Simplify nfs4_ff_layout_ds_version()
    NFS/flexfiles: Simplify ff_layout_get_ds_cred()
    NFS/flexfiles: Simplify nfs4_ff_find_or_create_ds_client()
    NFS/flexfiles: Simplify nfs4_ff_layout_select_ds_fh()
    NFS/flexfiles: Speed up read failover when DSes are down
    NFS/flexfiles: Don't invalidate DS deviceids for being unresponsive
    ...

    Linus Torvalds
     
  • Backchannel doesn't have the rq_task->tk_clientid pointer set.

    Otherwise this can lead to the following oops:
    localhost login: [ 111.385319] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
    [ 111.388073] #PF error: [normal kernel read fault]
    [ 111.389452] PGD 80000000290d8067 P4D 80000000290d8067 PUD 75f25067 PMD 0
    [ 111.391224] Oops: 0000 [#1] SMP PTI
    [ 111.392151] CPU: 0 PID: 3533 Comm: NFSv4 callback Not tainted 5.0.0-rc7+ #1
    [ 111.393787] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
    [ 111.396340] RIP: 0010:trace_event_raw_event_xprt_enq_xmit+0x6f/0xf0 [sunrpc]
    [ 111.397974] Code: 00 00 00 48 89 ee 48 89 e7 e8 bd 0a 85 d7 48 85 c0 74 4a 41 0f b7 94 24 e0 00 00 00 48 89 e7 89 50 08 49 8b 94 24 a8 00 00 00 52 04 89 50 0c 49 8b 94 24 c0 00 00 00 8b 92 a8 00 00 00 0f ca
    [ 111.402215] RSP: 0018:ffffb98743263cf8 EFLAGS: 00010286
    [ 111.403406] RAX: ffffa0890fc3bc88 RBX: 0000000000000003 RCX: 0000000000000000
    [ 111.405057] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb98743263cf8
    [ 111.406656] RBP: ffffa0896f5368f0 R08: 0000000000000246 R09: 0000000000000000
    [ 111.408437] R10: ffffe19b01c01500 R11: 0000000000000000 R12: ffffa08977d28a00
    [ 111.410210] R13: 0000000000000004 R14: ffffa089315303f0 R15: ffffa08931530000
    [ 111.411856] FS: 0000000000000000(0000) GS:ffffa0897bc00000(0000) knlGS:0000000000000000
    [ 111.413699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 111.415068] CR2: 0000000000000004 CR3: 000000002ac90004 CR4: 00000000001606f0
    [ 111.416745] Call Trace:
    [ 111.417339] xprt_request_enqueue_transmit+0x2b6/0x4a0 [sunrpc]
    [ 111.418709] ? rpc_task_need_encode+0x40/0x40 [sunrpc]
    [ 111.419957] call_bc_transmit+0xd5/0x170 [sunrpc]
    [ 111.421067] __rpc_execute+0x7e/0x3f0 [sunrpc]
    [ 111.422177] rpc_run_bc_task+0x78/0xd0 [sunrpc]
    [ 111.423212] bc_svc_process+0x281/0x340 [sunrpc]
    [ 111.424325] nfs41_callback_svc+0x130/0x1c0 [nfsv4]
    [ 111.425430] ? remove_wait_queue+0x60/0x60
    [ 111.426398] kthread+0xf5/0x130
    [ 111.427155] ? nfs_callback_authenticate+0x50/0x50 [nfsv4]
    [ 111.428388] ? kthread_bind+0x10/0x10
    [ 111.429270] ret_from_fork+0x1f/0x30

    localhost login: [ 467.462259] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
    [ 467.464411] #PF error: [normal kernel read fault]
    [ 467.465445] PGD 80000000728c1067 P4D 80000000728c1067 PUD 728c0067 PMD 0
    [ 467.466980] Oops: 0000 [#1] SMP PTI
    [ 467.467759] CPU: 0 PID: 3517 Comm: NFSv4 callback Not tainted 5.0.0-rc7+ #1
    [ 467.469393] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
    [ 467.471840] RIP: 0010:trace_event_raw_event_xprt_transmit+0x7c/0xf0 [sunrpc]
    [ 467.473392] Code: f6 48 85 c0 74 4b 49 8b 94 24 98 00 00 00 48 89 e7 0f b7 92 e0 00 00 00 89 50 08 49 8b 94 24 98 00 00 00 48 8b 92 a8 00 00 00 52 04 89 50 0c 41 8b 94 24 a8 00 00 00 0f ca 89 50 10 41 8b 94
    [ 467.477605] RSP: 0018:ffffabe7434fbcd0 EFLAGS: 00010282
    [ 467.478793] RAX: ffff99720fc3bce0 RBX: 0000000000000003 RCX: 0000000000000000
    [ 467.480409] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffabe7434fbcd0
    [ 467.482011] RBP: ffff99726f631948 R08: 0000000000000246 R09: 0000000000000000
    [ 467.483591] R10: 0000000070000000 R11: 0000000000000000 R12: ffff997277dfcc00
    [ 467.485226] R13: 0000000000000000 R14: 0000000000000000 R15: ffff99722fecdca8
    [ 467.486830] FS: 0000000000000000(0000) GS:ffff99727bc00000(0000) knlGS:0000000000000000
    [ 467.488596] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 467.489931] CR2: 0000000000000004 CR3: 00000000270e6006 CR4: 00000000001606f0
    [ 467.491559] Call Trace:
    [ 467.492128] xprt_transmit+0x303/0x3f0 [sunrpc]
    [ 467.493143] ? rpc_task_need_encode+0x40/0x40 [sunrpc]
    [ 467.494328] call_bc_transmit+0x49/0x170 [sunrpc]
    [ 467.495379] __rpc_execute+0x7e/0x3f0 [sunrpc]
    [ 467.496451] rpc_run_bc_task+0x78/0xd0 [sunrpc]
    [ 467.497467] bc_svc_process+0x281/0x340 [sunrpc]
    [ 467.498507] nfs41_callback_svc+0x130/0x1c0 [nfsv4]
    [ 467.499751] ? remove_wait_queue+0x60/0x60
    [ 467.500686] kthread+0xf5/0x130
    [ 467.501438] ? nfs_callback_authenticate+0x50/0x50 [nfsv4]
    [ 467.502640] ? kthread_bind+0x10/0x10
    [ 467.503454] ret_from_fork+0x1f/0x30

    Signed-off-by: Olga Kornievskaia
    Signed-off-by: Trond Myklebust

    Olga Kornievskaia
     

11 Mar, 2019

1 commit

  • Pull networking fixes from David Miller:
    "First batch of fixes in the new merge window:

    1) Double dst_cache free in act_tunnel_key, from Wenxu.

    2) Avoid NULL deref in IN_DEV_MFORWARD() by failing early in the
    ip_route_input_rcu() path, from Paolo Abeni.

    3) Fix appletalk compile regression, from Arnd Bergmann.

    4) If SLAB objects reach the TCP sendpage method we are in serious
    trouble, so put a debugging check there. From Vasily Averin.

    5) Memory leak in hsr layer, from Mao Wenan.

    6) Only test GSO type on GSO packets, from Willem de Bruijn.

    7) Fix crash in xsk_diag_put_umem(), from Eric Dumazet.

    8) Fix VNIC mailbox length in nfp, from Dirk van der Merwe.

    9) Fix race in ipv4 route exception handling, from Xin Long.

    10) Missing DMA memory barrier in hns3 driver, from Jian Shen.

    11) Use after free in __tcf_chain_put(), from Vlad Buslov.

    12) Handle inet_csk_reqsk_queue_add() failures, from Guillaume Nault.

    13) Return value correction when ip_mc_may_pull() fails, from Eric
    Dumazet.

    14) Use after free in x25_device_event(), also from Eric"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
    gro_cells: make sure device is up in gro_cells_receive()
    vxlan: test dev->flags & IFF_UP before calling gro_cells_receive()
    net/x25: fix use-after-free in x25_device_event()
    isdn: mISDNinfineon: fix potential NULL pointer dereference
    net: hns3: fix to stop multiple HNS reset due to the AER changes
    ip: fix ip_mc_may_pull() return value
    net: keep refcount warning in reqsk_free()
    net: stmmac: Avoid one more sometimes uninitialized Clang warning
    net: dsa: mv88e6xxx: Set correct interface mode for CPU/DSA ports
    rxrpc: Fix client call queueing, waiting for channel
    tcp: handle inet_csk_reqsk_queue_add() failures
    net: ethernet: sun: Zero initialize class in default case in niu_add_ethtool_tcam_entry
    8139too : Add support for U.S. Robotics USR997901A 10/100 Cardbus NIC
    fou, fou6: avoid uninit-value in gue_err() and gue6_err()
    net: sched: fix potential use-after-free in __tcf_chain_put()
    vhost: silence an unused-variable warning
    vsock/virtio: fix kernel panic from virtio_transport_reset_no_sock
    connector: fix unsafe usage of ->real_parent
    vxlan: do not need BH again in vxlan_cleanup()
    net: hns3: add dma_rmb() for rx description
    ...

    Linus Torvalds
     

10 Mar, 2019

1 commit

  • Pull media updates from Mauro Carvalho Chehab:

    - remove sensor drivers that got converted from soc_camera

    - remaining soc_camera drivers got moved to staging

    - some documentation cleanups and improvements

    - the imx staging driver now supports imx7

    - the ov9640, mt9m001 and mt9m111 got converted from soc_camera

    - the vim2m driver now does what a m2m convert driver expects to do

    - epoll() fixes on media subsystems

    - several drivers fixes, typos, cleanups and improvements

    * tag 'media/v5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (346 commits)
    media: dvb/earth-pt1: fix wrong initialization for demod blocks
    media: vim2m: Address some coding style issues
    media: vim2m: don't use BUG()
    media: vim2m: speedup passthrough copy
    media: vim2m: add an horizontal scaler
    media: vim2m: don't accept YUYV anymore as output format
    media: vim2m: add vertical linear scaler
    media: vim2m: better handle cap/out buffers with different sizes
    media: vim2m: use different framesizes for bayer formats
    media: vim2m: add support for VIDIOC_ENUM_FRAMESIZES
    media: vim2m: ensure that width is multiple of two
    media: vim2m: improve debug messages
    media: vim2m: add bayer capture formats
    media: a few more typos at staging, pci, platform, radio and usb
    media: Documentation: fix several typos
    media: staging: fix several typos
    media: include: fix several typos
    media: common: fix several typos
    media: v4l2-core: fix several typos
    media: usb: fix several typos
    ...

    Linus Torvalds
     

09 Mar, 2019

3 commits

  • rxrpc_disconnect_client_call() reads the call's connection ID protocol
    value (call->cid) as part of that function's variable declarations. This
    is bad because it's not inside the locked section and so may race with
    someone granting use of the channel to the call.

    This manifests as an assertion failure (see below) where the call in the
    presumed channel (0 because call->cid wasn't set when we read it) doesn't
    match the call attached to the channel we were actually granted (if 1, 2 or
    3).

    Fix this by moving the read and dependent calculations inside of the
    channel_lock section. Also, only set the channel number and pointer
    variables if cid is not zero (ie. unset).
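
    The shape of the fix can be sketched in user space with a mutex standing
    in for channel_lock. This is an illustration, not the rxrpc code: struct
    conn_sketch, struct call_sketch, disconnect_call() and CHANNEL_MASK are
    hypothetical, and the mask value assumes four channels per connection
    (0 to 3), as the description above suggests.

```c
#include <pthread.h>
#include <stddef.h>

/* Assumption: four channels per connection, selected by the low bits. */
#define CHANNEL_MASK 3u

struct conn_sketch {
    pthread_mutex_t channel_lock;
    void *chan_call[4]; /* call currently attached to each channel */
};

struct call_sketch {
    unsigned int cid; /* 0 means no channel has been granted yet */
    struct conn_sketch *conn;
};

/* Returns 1 if the call was attached to a channel and got detached,
 * 0 otherwise. */
int disconnect_call(struct call_sketch *call)
{
    struct conn_sketch *conn = call->conn;
    unsigned int cid, channel;
    int detached = 0;

    pthread_mutex_lock(&conn->channel_lock);

    /* Read cid only while holding the lock, so that a concurrent
     * channel grant cannot change it between the read and its use. */
    cid = call->cid;
    if (cid) {
        /* Derive the channel number only when cid is actually set. */
        channel = cid & CHANNEL_MASK;
        if (conn->chan_call[channel] == call) {
            conn->chan_call[channel] = NULL;
            detached = 1;
        }
    }

    pthread_mutex_unlock(&conn->channel_lock);
    return detached;
}
```

    Reading call->cid in the variable declarations, before taking the lock,
    would reintroduce the window in which the presumed channel (0) differs
    from the one actually granted.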

    This problem can be induced by injecting an occasional error in
    rxrpc_wait_for_channel() before the call to schedule().

    Make two further changes also:

    (1) Add a trace for wait failure in rxrpc_connect_call().

    (2) Drop channel_lock before BUG'ing in the case of the assertion failure.

    The failure causes a trace akin to the following:

    rxrpc: Assertion failed - 18446612685268945920(0xffff8880beab8c00) == 18446612685268621312(0xffff8880bea69800) is false
    ------------[ cut here ]------------
    kernel BUG at net/rxrpc/conn_client.c:824!
    ...
    RIP: 0010:rxrpc_disconnect_client_call+0x2bf/0x99d
    ...
    Call Trace:
    rxrpc_connect_call+0x902/0x9b3
    ? wake_up_q+0x54/0x54
    rxrpc_new_client_call+0x3a0/0x751
    ? rxrpc_kernel_begin_call+0x141/0x1bc
    ? afs_alloc_call+0x1b5/0x1b5
    rxrpc_kernel_begin_call+0x141/0x1bc
    afs_make_call+0x20c/0x525
    ? afs_alloc_call+0x1b5/0x1b5
    ? __lock_is_held+0x40/0x71
    ? lockdep_init_map+0xaf/0x193
    ? lockdep_init_map+0xaf/0x193
    ? __lock_is_held+0x40/0x71
    ? yfs_fs_fetch_data+0x33b/0x34a
    yfs_fs_fetch_data+0x33b/0x34a
    afs_fetch_data+0xdc/0x3b7
    afs_read_dir+0x52d/0x97f
    afs_dir_iterate+0xa0/0x661
    ? iterate_dir+0x63/0x141
    iterate_dir+0xa2/0x141
    ksys_getdents64+0x9f/0x11b
    ? filldir+0x111/0x111
    ? do_syscall_64+0x3e/0x1a0
    __x64_sys_getdents64+0x16/0x19
    do_syscall_64+0x7d/0x1a0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Fixes: 45025bceef17 ("rxrpc: Improve management and caching of client connection objects")
    Signed-off-by: David Howells
    Reviewed-by: Marc Dionne
    Signed-off-by: David S. Miller

    David Howells
     
  • Pull i2c updates from Wolfram Sang:

    - the I2C core gained helpers to assist drivers in handling their
    suspended state, and drivers were converted to use it

    - two new fault-injectors for stress-testing

    - bigger refactoring and feature improvements for the ocores,
    sh_mobile, and tegra drivers

    - platform_data removal for the at24 EEPROM driver

    - ... and various improvements and bugfixes all over the subsystem

    * 'i2c/for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (69 commits)
    i2c: Allow recovery of the initial IRQ by an I2C client device.
    i2c: ocores: turn incomplete kdoc into a comment
    i2c: designware: Do not allow i2c_dw_xfer() calls while suspended
    i2c: tegra: Only display error messages if DMA setup fails
    i2c: gpio: fault-injector: add 'inject_panic' injector
    i2c: gpio: fault-injector: add 'lose_arbitration' injector
    i2c: tegra: remove multi-master support
    i2c: tegra: remove master fifo support on tegra186
    i2c: tegra: change phrasing, "fallbacking" to "falling back"
    i2c: expand minor range when registering chrdev region
    i2c: aspeed: Add multi-master use case support
    i2c: core-smbus: don't trace smbus_reply data on errors
    i2c: ocores: Add support for bus clock via platform data
    i2c: ocores: Add support for IO mapper registers.
    i2c: ocores: checkpatch fixes
    i2c: ocores: add SPDX tag
    i2c: ocores: add polling interface
    i2c: ocores: do not handle IRQ if IF is not set
    i2c: ocores: stop transfer on timeout
    i2c: tegra: add i2c interface timing support
    ...

    Linus Torvalds
     
  • Pull drm updates from Dave Airlie:
    "This is the main drm pull request for the 5.1 merge window.

    The big changes I'd highlight are:
    - nouveau has HMM support now, there is finally an in-tree user so we
    can quieten down the rip it out people.
    - i915 now enables fastboot by default on Skylake+
    - Displayport Multistream support has been refactored and should
    hopefully be more reliable.

    Core:
    - header cleanups aiming towards removing drmP.h
    - dma-buf fence seqnos to 64-bits
    - common helper for DP mst hotplug for radeon,i915,amdgpu + new
    refcounting scheme
    - MST i2c improvements
    - drm_syncobj_cb removal
    - ARM FB compression fourcc
    - P010 + P016 fourcc
    - allwinner tiled format modifier
    - i2c over aux I2C_M_STOP support
    - DRM_AUTH handling fixes

    TTM:
    - ref/unref renaming

    New driver:
    - ARM komeda display driver

    scheduler:
    - refactor mirror list handling
    - rework hw fence processing
    - 0 run queue entity fix

    bridge:
    - TI DS90C185 LVDS bridge
    - thc631lvdm83d bridge improvements
    - cadence + allwinner DSI ported to generic phy

    panels:
    - Sitronix ST7701 panel
    - Kingdisplay KD097D04
    - LeMaker BL035-RGB-002
    - PDA 91-00156-A0
    - Innolux EE101IA-01D

    i915:
    - Enable fastboot by default on SKL+/VLV/CHV
    - Export RPCS configuration for ICL media driver
    - Coffeelake PCI ID
    - CNL clocks setup fixes
    - ACPI/PMIC support for MIPI/DSI
    - Per-engine WA init for all engines
    - Shrinker locking fixes
    - Kerneldoc updates
    - Lots of ring improvements and reset fixes
    - Coffeelake GVT Support
    - VFIO GVT EDID Region support
    - runtime PM wakeref tracking
    - ILK->IVB primary plane enable delays
    - userptr mutex locking fixes
    - DSI fixes
    - LVDS/TV cleanups
    - HW readout fixes
    - LUT robustness fixes
    - ICL display and watermark fixes
    - gem mmap race fix

    amdgpu:
    - add scheduled dependencies interface
    - DCC on scanout surfaces
    - vega10/20 BACO support
    - Multiple IH rings on soc15
    - XGMI locking fixes
    - DC i2c/aux cleanups
    - runtime SMU debug interface
    - Kexec improvements
    - SR-IOV fixes
    - DC freesync + ABM fixes
    - GDS fixes
    - GPUVM fixes
    - vega20 PCIE DPM switching fixes
    - Context priority handling fixes

    radeon:
    - fix missing break in evergreen parser

    nouveau:
    - SVM support via HMM

    msm:
    - QCOM Compressed modifier support

    exynos:
    - s5pv210 rotator support

    imx:
    - zpos property support
    - pending update fixes

    v3d:
    - cache flush improvements

    vc4:
    - reflection support
    - HDMI overscan support

    tegra:
    - CEC refactoring
    - HDMI audio fixes
    - Tegra186 prep work
    - SOR crossbar device tree fixes

    sun4i:
    - implicit fencing support
    - YUV and scaler support improvements
    - A23 support
    - tiling fixes

    atmel-hlcdc:
    - clipping and rotation property fixes

    qxl:
    - BO and PRIME improvements
    - generic fbdev emulation

    dw-hdmi:
    - HDMI 2.0 2160p
    - YUV420 output

    rockchip:
    - implicit fencing support
    - reflection properties

    virtio-gpu:
    - use generic fbdev emulation

    tilcdc:
    - cpufreq vs crtc init fix

    rcar-du:
    - R8A774C0 support
    - D3/E3 RGB output routing fixes and DPAD0 support
    - RA87744 LVDS support

    bochs:
    - atomic and generic fbdev emulation
    - ID mismatch error on bochs load

    meson:
    - remove firmware fbs"

    * tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm: (1130 commits)
    drm/amd/display: Use vrr friendly pageflip throttling in DC.
    drm/imx: only send commit done event when all state has been applied
    drm/imx: allow building under COMPILE_TEST
    drm/imx: imx-tve: depend on COMMON_CLK
    drm/imx: ipuv3-plane: add zpos property
    drm/imx: ipuv3-plane: add function to query atomic update status
    gpu: ipu-v3: prg: add function to get channel configure status
    gpu: ipu-v3: pre: add double buffer status readback
    drm/amdgpu: Bump amdgpu version for context priority override.
    drm/amdgpu/powerplay: fix typo in BACO header guards
    drm/amdgpu/powerplay: fix return codes in BACO code
    drm/amdgpu: add missing license on baco files
    drm/bochs: Fix the ID mismatch error
    drm/nouveau/dmem: use dma addresses during migration copies
    drm/nouveau/dmem: use physical vram addresses during migration copies
    drm/nouveau/dmem: extend copy function to allow direct use of physical addresses
    drm/nouveau/svm: new ioctl to migrate process memory to GPU memory
    drm/nouveau/dmem: device memory helpers for SVM
    drm/nouveau/svm: initial support for shared virtual memory
    drm/nouveau: prepare for enabling svm with existing userspace interfaces
    ...

    Linus Torvalds
     

08 Mar, 2019

1 commit

  • Pull btrfs updates from David Sterba:
    "This contains usual mix of new features, core changes and fixes; full
    list below. I'm planning a second pull request with a few more fixes
    that arrived recently but too close to the merge window; I will send it
    next week.

    New features:

    - support zstd compression levels

    - new ioctl to unregister a device from the module (ie. reverse of
    device scan)

    - scrub prints a message to log when it's about to start or finish

    Core changes:

    - qgroups can now skip part of a tree that does not get updated
    during relocation, because this does not affect the quota
    accounting, estimated speedup in run time is about 20%

    - the compression workspace management had to be enhanced due to zstd
    requirements

    - various enospc fixes, when there's high fragmentation the
    over-reservation can cause ENOSPC that might not happen after a
    flush, in such cases try to wait if the situation improves

    Fixes:

    - various ioctls could overwrite previous return value if
    copy_to_user fails, fix this so the original error is reported

    - more reclaim vs GFP_KERNEL fixes

    - other cleanups and refactoring

    - fix a (valid) lockdep warning in a test when device replace is
    destroying worker threads

    - make qgroup async transaction commit more aggressive, this avoids
    some 'quota limit reached' errors if there are not enough data to
    trigger transaction in order to flush

    - fix deadlock between snapshot deletion and quotas when backref
    walking is called from context that already holds the same locks

    - fsync fixes:
    - fix fsync after succession of renames of different files
    - fix fsync after succession of renames and unlink/rmdir"

    * tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (92 commits)
    btrfs: Remove unnecessary casts in btrfs_read_root_item
    Btrfs: remove assertion when searching for a key in a node/leaf
    Btrfs: add missing error handling after doing leaf/node binary search
    btrfs: drop the lock on error in btrfs_dev_replace_cancel
    btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
    btrfs: init csum_list before possible free
    Btrfs: remove no longer needed range length checks for deduplication
    Btrfs: fix fsync after succession of renames and unlink/rmdir
    Btrfs: fix fsync after succession of renames of different files
    btrfs: honor path->skip_locking in backref code
    btrfs: qgroup: Make qgroup async transaction commit more aggressive
    btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record
    btrfs: scrub: remove unused nocow worker pointer
    btrfs: scrub: add assertions for worker pointers
    btrfs: scrub: convert scrub_workers_refcnt to refcount_t
    btrfs: scrub: add scrub_lock lockdep check in scrub_workers_get
    btrfs: scrub: fix circular locking dependency warning
    btrfs: fix comment its device list mutex not volume lock
    btrfs: extent_io: Kill the forward declaration of flush_write_bio
    btrfs: Fix grossly misleading argument names in extent io search
    ...

    Linus Torvalds
     

06 Mar, 2019

1 commit

  • Pull networking updates from David Miller:
    "Here we go, another merge window full of networking and #ebpf changes:

    1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
    range without dealing with floods of ARP traffic, from Linus
    Lüssing.

    2) Throttle buffered multicast packet transmission in mt76, from
    Felix Fietkau.

    3) Support adaptive interrupt moderation in ice, from Brett Creeley.

    4) A lot of struct_size conversions, from Gustavo A. R. Silva.

    5) Add peek/push/pop commands to bpftool, as well as bash completion,
    from Stanislav Fomichev.

    6) Optimize sk_msg_clone(), from Vakul Garg.

    7) Add SO_BINDTOIFINDEX, from David Herrmann.

    8) Be more conservative with local resends due to local congestion,
    from Yuchung Cheng.

    9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.

    10) Add health buffer support to devlink, from Eran Ben Elisha.

    11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.

    12) Add statistics to basic packet scheduler filter, from Cong Wang.

    13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.

    14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.

    15) Add 3ad stats to bonding, from Nikolay Aleksandrov.

    16) Lots of probing improvements for bpftool, from Quentin Monnet.

    17) Various nfp driver #ebpf JIT improvements from Jakub Kicinski.

    18) Allow #ebpf programs to access gso_segs from skb shared info, from
    Eric Dumazet.

    19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.

    20) Support 22260 iwlwifi devices, from Luca Coelho.

    21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.

    22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.

    23) Add spinlock support to #ebpf, from Alexei Starovoitov.

    24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.

    25) Add device information API to devlink, from Jakub Kicinski.

    26) Add new timestamping socket options which are y2038 safe, from
    Deepa Dinamani.

    27) Add RX checksum offloading for various sh_eth chips, from Sergei
    Shtylyov.

    28) Flow offload infrastructure, from Pablo Neira Ayuso.

    29) Numerous cleanups, improvements, and bug fixes to the PHY layer
    and many drivers from Heiner Kallweit.

    30) Lots of changes to try and make packet scheduler classifiers run
    lockless as much as possible, from Vlad Buslov.

    31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.

    32) Add concurrency tests to tc-tests infrastructure, from Vlad
    Buslov.

    33) Add hwmon support to aquantia, from Heiner Kallweit.

    34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.

    And I would be remiss if I didn't thank the various major networking
    subsystem maintainers for integrating much of this work before I even
    saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
    Johannes Berg, Kalle Valo, and many others. Thank you!"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
    net/sched: avoid unused-label warning
    net: ignore sysctl_devconf_inherit_init_net without SYSCTL
    phy: mdio-mux: fix Kconfig dependencies
    net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
    net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
    selftest/net: Remove duplicate header
    sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
    net/mlx5e: Update tx reporter status in case channels were successfully opened
    devlink: Add support for direct reporter health state update
    devlink: Update reporter state to error even if recover aborted
    sctp: call iov_iter_revert() after sending ABORT
    team: Free BPF filter when unregistering netdev
    ip6mr: Do not call __IP6_INC_STATS() from preemptible context
    isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
    net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
    cxgb4/chtls: Prefix adapter flags with CXGB4
    net-sysfs: Switch to bitmap_zalloc()
    mellanox: Switch to bitmap_zalloc()
    bpf: add test cases for non-pointer sanitiation logic
    mlxsw: i2c: Extend initialization by querying resources data
    ...

    Linus Torvalds
     

05 Mar, 2019

1 commit

  • It is possible that a reporter state will be updated due to a recover flow
    which is not triggered by a devlink health related operation, but as a side
    effect of some other operation in the system.

    Expose devlink health API for a direct update of a reporter status.

    Move devlink_health_reporter_state enum definition to devlink.h so it could
    be used from drivers as a parameter of devlink_health_reporter_state_update.

    In addition, add trace_devlink_health_reporter_state_update to provide user
    notification for reporter state change.

    Signed-off-by: Eran Ben Elisha
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eran Ben Elisha
     

04 Mar, 2019

1 commit


25 Feb, 2019

3 commits

  • …s_qgroup_extent_record

    [BUG]
    Btrfs/139 will fail with a high probability if the testing machine (VM)
    has only 2G RAM.

    As a result, the final write succeeds when it should fail with EDQUOT,
    and the fs ends up exceeding its quota limit by 16K.

    The simplified reproducer will be: (needs a 2G ram VM)

    $ mkfs.btrfs -f $dev
    $ mount $dev $mnt

    $ btrfs subv create $mnt/subv
    $ btrfs quota enable $mnt
    $ btrfs quota rescan -w $mnt
    $ btrfs qgroup limit -e 1G $mnt/subv

    $ for i in $(seq -w 1 8); do
          xfs_io -f -c "pwrite 0 128M" $mnt/subv/file_$i > /dev/null
          echo "file $i written" > /dev/kmsg
      done
    $ sync
    $ btrfs qgroup show -pcre --raw $mnt

    The last pwrite will not trigger EDQUOT and final 'qgroup show' will
    show something like:

    qgroupid         rfer         excl     max_rfer     max_excl parent  child
    --------         ----         ----     --------     -------- ------  -----
    0/5             16384        16384         none         none ---     ---
    0/256      1073758208   1073758208         none   1073741824 ---     ---

    And 1073758208 is larger than 1073741824.

    [CAUSE]
    It's a bug in btrfs qgroup data reserved space management.

    For quota limit, we must ensure that:
    reserved (data + metadata) + rfer/excl <= limit

    Since rfer/excl is only updated at transaction commit time, reserved
    space needs special care.

    One important part of reserved space is data, and for a new data extent
    written to disk, we still need to take the reserved space until
    rfer/excl numbers get updated.

    Originally when an ordered extent finishes, we migrate the reserved
    qgroup data space from extent_io tree to delayed ref head of the data
    extent, expecting delayed ref will only be cleaned up at commit
    transaction time.

    However, on a machine with little RAM, memory pressure can cause dirty
    pages to be flushed back to disk without committing a transaction.

    The related events will be something like:

    file 1 written
    btrfs_finish_ordered_io: ino=258 ordered offset=0 len=54947840
    btrfs_finish_ordered_io: ino=258 ordered offset=54947840 len=5636096
    btrfs_finish_ordered_io: ino=258 ordered offset=61153280 len=57344
    btrfs_finish_ordered_io: ino=258 ordered offset=61210624 len=8192
    btrfs_finish_ordered_io: ino=258 ordered offset=60583936 len=569344
    cleanup_ref_head: num_bytes=54947840
    cleanup_ref_head: num_bytes=5636096
    cleanup_ref_head: num_bytes=569344
    cleanup_ref_head: num_bytes=57344
    cleanup_ref_head: num_bytes=8192
    ^^^^^^^^^^^^^^^^ This will free qgroup data reserved space
    file 2 written
    ...
    file 8 written
    cleanup_ref_head: num_bytes=8192
    ...
    btrfs_commit_transaction <<< the only transaction committed during
    the test

    When file 2 is written, we have already freed 128M reserved qgroup data
    space for ino 258. Thus later write won't trigger EDQUOT.

    This allows us to write more data beyond qgroup limit.

    In my 2G ram VM, it could reach about 1.2G before hitting EDQUOT.

    [FIX]
    By moving reserved qgroup data space from btrfs_delayed_ref_head to
    btrfs_qgroup_extent_record, we can ensure that reserved qgroup data
    space won't be freed half way before commit transaction, thus fix the
    problem.
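
    The effect described above can be illustrated with a toy userspace
    model. This is only a sketch of the accounting rule
    "reserved + excl <= limit", not btrfs code: the QgroupModel class, its
    fields and the run() helper are hypothetical names, and a ninth 128M
    write is used so that the limit is actually crossed.

```python
# Toy model only: a userspace sketch of the accounting rule, not btrfs code.
MiB = 1024 * 1024


class QgroupModel:
    def __init__(self, limit, hold_until_commit):
        self.limit = limit
        self.excl = 0        # committed usage, updated only at commit time
        self.reserved = 0    # space reserved for data not yet accounted
        self.pending = 0     # on disk, but not yet reflected in excl
        self.hold_until_commit = hold_until_commit

    def write(self, nbytes):
        # Invariant from the commit message: reserved + excl <= limit.
        if self.reserved + self.excl + nbytes > self.limit:
            return False     # the write would fail with EDQUOT
        self.reserved += nbytes
        return True

    def flush_without_commit(self):
        # Memory pressure writes dirty pages back with no transaction commit.
        if not self.hold_until_commit:
            # Buggy behaviour: the reservation is dropped while excl is stale.
            self.pending += self.reserved
            self.reserved = 0

    def commit(self):
        # Only here do the rfer/excl style numbers catch up.
        self.excl += self.reserved + self.pending
        self.reserved = 0
        self.pending = 0


def run(hold_until_commit, writes=9):
    q = QgroupModel(limit=1024 * MiB, hold_until_commit=hold_until_commit)
    accepted = 0
    for _ in range(writes):
        if q.write(128 * MiB):
            accepted += 1
        q.flush_without_commit()
    q.commit()
    return accepted, q.excl


buggy = run(hold_until_commit=False)
fixed = run(hold_until_commit=True)
```

    In this model the "buggy" run accepts every write and ends above the
    limit, while the "fixed" run rejects the write that would cross it.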

    Fixes: f64d5ca86821 ("btrfs: delayed_ref: Add new function to record reserved space into delayed ref")
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>

    Qu Wenruo
     
  • We've done this forever because of the voodoo around knowing how much
    space we have. However, we have better ways of doing this now, and on
    normal file systems we'll easily have a global reserve of 512MiB, and
    since metadata chunks are usually 1GiB that means we'll allocate
    metadata chunks more readily. Instead use the actual used amount when
    determining if we need to allocate a chunk or not.

    This has a side effect for mixed block group fs'es where we are no
    longer allocating enough chunks for the data/metadata requirements. To
    deal with this add an ALLOC_CHUNK_FORCE step to the flushing state
    machine. This will only get used if we've already made a full loop
    through the flushing machinery and tried committing the transaction.

    If we have then we can try and force a chunk allocation since we likely
    need it to make progress. This resolves issues I was seeing with
    the mixed bg tests in xfstests without the new flushing state.
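
    A rough sketch of the ordering described above, assuming a simplified
    state list. Only ALLOC_CHUNK_FORCE is named in this message; the other
    state names and the flush_sequence() helper are illustrative
    assumptions, and the real flushing code has more states and conditions.

```python
# Hypothetical, simplified state list; the real flushing code has more states.
FLUSH_STATES = [
    "FLUSH_DELAYED_ITEMS",
    "FLUSH_DELALLOC",
    "ALLOC_CHUNK",
    "ALLOC_CHUNK_FORCE",
    "COMMIT_TRANS",
]


def flush_sequence(cycles):
    """Return the states visited over `cycles` full passes.

    ALLOC_CHUNK_FORCE is skipped until one full pass, including a
    transaction commit, has already completed.
    """
    visited = []
    for cycle in range(cycles):
        for state in FLUSH_STATES:
            if state == "ALLOC_CHUNK_FORCE" and cycle == 0:
                continue
            visited.append(state)
    return visited


first_pass = flush_sequence(1)
two_passes = flush_sequence(2)
```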

    Reviewed-by: Nikolay Borisov
    Signed-off-by: Josef Bacik
    [ merged with patch "add ALLOC_CHUNK_FORCE to the flushing code" ]
    Signed-off-by: David Sterba

    Josef Bacik
     
  • Hit the new tracepoint once the vregion migration ends.

    Signed-off-by: Jiri Pirko
    Signed-off-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Jiri Pirko
     

24 Feb, 2019

1 commit


18 Feb, 2019

3 commits

  • Linux 5.0-rc7

    * tag 'v5.0-rc7': (1667 commits)
    Linux 5.0-rc7
    Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK
    Input: st-keyscan - fix potential zalloc NULL dereference
    Input: apanel - switch to using brightness_set_blocking()
    powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
    efi/arm: Revert "Defer persistent reservations until after paging_init()"
    arm64, mm, efi: Account for GICv3 LPI tables in static memblock reserve table
    sunrpc: fix 4 more call sites that were using stack memory with a scatterlist
    include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
    Compiler Attributes: add support for __copy (gcc >= 9)
    lib/crc32.c: mark crc32_le_base/__crc32c_le_base aliases as __pure
    auxdisplay: ht16k33: fix potential user-after-free on module unload
    x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls
    i2c: bcm2835: Clear current buffer pointers and counts after a transfer
    i2c: cadence: Fix the hold bit setting
    drm: Use array_size() when creating lease
    dm thin: fix bug where bio that overwrites thin block ignores FUA
    Revert "exec: load_script: don't blindly truncate shebang string"
    Revert "gfs2: read journal in large chunks to locate the head"
    net: ethernet: freescale: set FEC ethtool regs version
    ...

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Backmerging for nouveau and imx that needed some fixes for next pulls.

    Signed-off-by: Dave Airlie

    Dave Airlie
     
  • The goal here is to trace neigh state changes covering all possible
    neigh update paths. Plus have a specific trace point in neigh_update
    to cover flags sent to neigh_update.

    Signed-off-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Roopa Prabhu
     

15 Feb, 2019

1 commit

  • If an smbus transfer fails, there's no guarantee that the output
    buffer was written. So, avoid trying to show the output buffer when
    tracing after an error. This was 'mostly harmless', but would trip
    up kasan checking if left-over cruft in byte 0 is a large length,
    causing us to read from unwritten memory.
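
    The idea can be sketched in userspace as follows. This is not the
    kernel tracepoint code: format_reply_trace() and its output format are
    hypothetical, and the buffer layout (byte 0 holding the payload length)
    is assumed from the description above.

```python
def format_reply_trace(result, data):
    """Build a trace line for an SMBus-style reply.

    `result` is 0 on success or a negative errno. `data` is the output
    buffer, where byte 0 is assumed to hold the payload length.
    """
    if result < 0:
        # On error the buffer may never have been written, so byte 0 is
        # not a trustworthy length. Report the error and nothing else.
        return f"result={result}"
    length = data[0]
    payload = bytes(data[1:1 + length])
    return f"result={result} len={length} data={payload.hex()}"


# Left-over cruft in byte 0 (255) is ignored because the transfer failed.
failed = format_reply_trace(-5, bytearray([255, 0, 0]))
ok = format_reply_trace(0, bytearray([2, 0xAB, 0xCD, 0]))
```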

    Signed-off-by: John Sperbeck
    Reviewed-by: Steven Rostedt (VMware)
    Signed-off-by: Wolfram Sang

    John Sperbeck
     

14 Feb, 2019

6 commits

  • prepare_reply_buffer() and its NFSv4 equivalents expose the details
    of the RPC header and the auth slack values to upper layer
    consumers, creating a layering violation, and duplicating code.

    Remedy these issues by adding a new RPC client API that hides those
    details from upper layers in a common helper function.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Add infrastructure for trace points in the RPC_AUTH_GSS kernel
    module, and add a few sample trace points. These report exceptional
    or unexpected events, and observe the assignment of GSS sequence
    numbers.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • - Recover some instruction count because I'm about to introduce a
    few xdr_inline_decode call sites
    - Replace dprintk() call sites with trace points
    - Reduce the hot path so it fits in fewer cachelines

    I've also renamed it rpc_decode_header() to match everything else
    in the RPC client.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Modernize and harden the code path that constructs each RPC Call
    message.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • We don't want READ payloads that are partially in the head iovec and
    in the page buffer because this requires pull-up, which can be
    expensive.

    The NFS/RPC client tries hard to predict the size of the head iovec
    so that the incoming READ data payload lands only in the page
    vector, but it doesn't always get it right. To help diagnose such
    problems, add a trace point in the logic that decodes READ-like
    operations that reports whether pull-up is being done.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • This can help field troubleshooting without needing the overhead of
    a full network capture (ie, tcpdump).

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

13 Feb, 2019

1 commit


09 Feb, 2019

1 commit


08 Feb, 2019

2 commits

  • Upon error discovery, every driver can report it to the devlink health
    mechanism via devlink_health_report function, using the appropriate
    reporter registered to it. Driver can pass error specific context which
    will be delivered to it as part of the dump / recovery callbacks.

    Once an error is reported, devlink health will do the following actions:
    * A log is sent to the kernel trace events buffer
    * Health status and statistics are being updated for the reporter instance
    * Object dump is being taken and stored at the reporter instance (as long
    as there is no other dump which is already stored)
    * Auto recovery attempt is being done. Depends on:
    - Auto Recovery configuration
    - Grace period vs. Time since last recover
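
    The sequence of actions listed above can be sketched as a small
    userspace model. It is not the devlink implementation: the Reporter
    class, its fields and health_report() are hypothetical names, time is
    passed in as plain seconds, and the grace period is assumed to be
    measured from the last successful recovery.

```python
class Reporter:
    def __init__(self, auto_recover, grace_period, recover_cb):
        self.auto_recover = auto_recover
        self.grace_period = grace_period  # seconds
        self.recover_cb = recover_cb      # driver callback, returns bool
        self.error_count = 0
        self.recovery_count = 0
        self.healthy = True
        self.dump = None
        self.last_recovery = None
        self.trace = []


def health_report(reporter, msg, ctx, now):
    """Handle one reported error; return True if auto recovery succeeded."""
    reporter.trace.append(msg)        # 1. log to the trace buffer
    reporter.error_count += 1         # 2. update status and statistics
    reporter.healthy = False
    if reporter.dump is None:         # 3. store a dump unless one exists
        reporter.dump = {"msg": msg, "ctx": ctx, "time": now}
    if not reporter.auto_recover:     # 4. auto recovery, if configured
        return False
    if (reporter.last_recovery is not None
            and now - reporter.last_recovery < reporter.grace_period):
        return False                  # still inside the grace period
    if reporter.recover_cb(ctx):
        reporter.last_recovery = now
        reporter.recovery_count += 1
        reporter.healthy = True
        return True
    return False


r = Reporter(auto_recover=True, grace_period=10, recover_cb=lambda ctx: True)
first = health_report(r, "tx timeout", {"queue": 1}, now=100)
second = health_report(r, "tx timeout", {"queue": 2}, now=105)
third = health_report(r, "tx timeout", {"queue": 3}, now=111)
```

    Here the second report lands inside the grace period, so no recovery
    is attempted, and the stored dump still belongs to the first report.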

    Signed-off-by: Eran Ben Elisha
    Reviewed-by: Moshe Shemesh
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eran Ben Elisha
     
  • The CDMA push buffer can currently only handle opcodes that take a
    single word parameter. However, the host1x implementation on Tegra186
    and later supports opcodes that require multiple words as parameters.

    Unfortunately the way the push buffer is structured, these wide opcodes
    cannot simply be composed of two regular opcodes because that could
    result in the wide opcode being split across the end of the push buffer
    and the final RESTART opcode required to wrap the push buffer around
    would break the wide opcode.

    One way to fix this would be to remove the concept of slots to simplify
    push buffer operations. However, that's not entirely trivial and should
    be done in a separate patch. For now, simply use a different function
    to push four-word opcodes into the push buffer. Technically only three
    words are pushed, with the fourth word used as padding to preserve the
    2-word alignment required by the slots abstraction. The fourth word is
    always a NOP opcode.

    Additional care must be taken when the end of the push buffer is
    reached. If a four-word opcode doesn't fit into the push buffer without
    being split by the boundary, NOP opcodes will be introduced and the new
    wide opcode placed at the beginning of the push buffer.
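
    The padding and wrap-around behaviour described above can be modelled
    in a few lines. This is a sketch only: the PushBuffer class and its
    methods are hypothetical, NOP is a placeholder value rather than the
    real host1x encoding, and the model ignores the consumer side and the
    RESTART opcode.

```python
NOP = 0x0  # placeholder value for illustration, not the real encoding


class PushBuffer:
    def __init__(self, size_words):
        # Slots are two words wide, so the size must be even.
        assert size_words % 2 == 0 and size_words >= 4
        self.words = [NOP] * size_words
        self.pos = 0

    def push(self, op1, op2):
        """Push one regular two-word slot."""
        self.words[self.pos:self.pos + 2] = [op1, op2]
        self.pos = (self.pos + 2) % len(self.words)

    def push_wide(self, op1, op2, op3):
        """Push a three-word opcode, padded to four words with a NOP."""
        if self.pos + 4 > len(self.words):
            # Not enough room before the boundary: pad the tail with NOPs
            # and restart at the beginning so the opcode is never split.
            while self.pos != 0:
                self.push(NOP, NOP)
        self.words[self.pos:self.pos + 4] = [op1, op2, op3, NOP]
        self.pos = (self.pos + 4) % len(self.words)


pb = PushBuffer(8)
pb.push(1, 2)
pb.push(3, 4)
pb.push(5, 6)                  # write position is now 6 of 8
pb.push_wide(0xA, 0xB, 0xC)    # does not fit, so it lands at offset 0
```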

    Signed-off-by: Thierry Reding

    Thierry Reding