18 Apr, 2019

1 commit

  • The Interactive governor has lived in Android sources for a very long
    time and this commit is based on the code present in the following branch:

    https://android.googlesource.com/kernel/common android-4.4

    The Interactive governor is designed for latency-sensitive workloads,
    such as interactive user interfaces on mobile phones and tablets.
    The interactive governor aims to be significantly more responsive,
    ramping the CPU up quickly when CPU-intensive activity begins.

    Existing governors sample CPU load at a particular rate, typically every
    X ms and then update the frequency from a work-handler. This can lead
    to under-powering UI threads for the period of time during which the
    user begins interacting with a previously-idle system until the next
    sample period happens.

    The 'interactive' governor uses a different approach.

    A real-time thread is used for scaling up, giving the remaining tasks
    the CPU performance benefit, unlike existing governors which are more
    likely to schedule ramp-up work to occur after your performance starved
    tasks have completed.

    The Android version of interactive governor also checks whether to scale
    the CPU frequency up soon after coming out of idle. When the CPU comes
    out of idle, the governor checks whether the CPU sampling is overdue.
    If yes, it immediately starts the sampling. Otherwise, the utilization
    hooks from the scheduler handle the sampling later. If the CPU is very
    busy from exiting idle to when the evaluation happens, then it assumes
    that the CPU is under-powered and ramps it to MAX speed.

    If the CPU was not sufficiently busy to immediately ramp to MAX speed,
    then the governor evaluates the CPU load since the last speed
    adjustment, choosing the highest value between that longer-term load or
    the short-term load since idle exit to determine the CPU speed to ramp
    to.
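
    The idle-exit decision described above can be sketched as follows. This
    is a minimal Python model, not the governor's code: the function name,
    the go_max_load threshold and the proportional load-to-frequency mapping
    are assumptions made for illustration.

```python
def pick_target_khz(short_term_load, long_term_load, max_khz, go_max_load=85):
    """Model of the idle-exit decision described in the commit message.

    Loads are percentages (0-100). go_max_load and the proportional
    load-to-frequency mapping are hypothetical, for illustration only.
    """
    if short_term_load >= go_max_load:
        # Very busy since idle exit: assume under-powered, go straight to MAX.
        return max_khz
    # Otherwise use the higher of the short-term and longer-term loads.
    load = max(short_term_load, long_term_load)
    return max_khz * load // 100


# Busy burst right after idle exit, then two moderate-load cases.
print(pick_target_khz(95, 10, 2_000_000))
print(pick_target_khz(40, 60, 2_000_000))
print(pick_target_khz(50, 20, 2_000_000))
```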

    Idle notifiers will be handled later and are not included for now.

    The core of this code was written and has been maintained (in Android
    repositories) by Mike Chan and Todd Poynor over a long period of time.

    Vireshk has made changes to the governor to align it with the current
    practices followed with mainline governors, like using utilization hooks
    from the scheduler and handling kobject (for governor's sysfs directory)
    in a race free manner. And of course this included general cleanup of
    the governor as well.

    Signed-off-by: Mike Chan
    Signed-off-by: Todd Poynor
    Signed-off-by: Viresh Kumar
    Signed-off-by: Vipul Kumar

    Viresh Kumar
     

17 Jan, 2019

1 commit

  • commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream.

    If a node has NFSv4.1+ mounts inside several net namespaces,
    it can lead to a use-after-free in svc_process_common()

    svc_process_common()
    /* Setup reply header */
    rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE

    svc_process_common() can use an incorrect rqstp->rq_xprt:
    its caller, bc_svc_process(), takes it from serv->sv_bc_xprt.
    The problem is that serv is a global structure but sv_bc_xprt
    is assigned per net namespace.

    According to Trond, the whole "let's set up rqstp->rq_xprt
    for the back channel" is nothing but a giant hack in order
    to work around the fact that svc_process_common() uses it
    to find the xpt_ops, and perform a couple of (meaningless
    for the back channel) tests of xpt_flags.

    All we really need in svc_process_common() is to be able to run
    rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()

    J. Bruce Fields points out that this xpo_prep_reply_hdr() call
    is an awfully roundabout way just to do "svc_putnl(resv, 0);"
    in the tcp case.

    This patch does not initialize rqstp->rq_xprt in bc_svc_process();
    it now calls svc_process_common() with rqstp->rq_xprt = NULL.

    To adjust the reply header, svc_process_common() just checks
    rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for the tcp case.

    To handle the rqstp->rq_xprt = NULL case in functions called from
    svc_process_common(), the patch introduces a net namespace pointer,
    svc_rqst->rq_bc_net, and adjusts the SVC_NET() definition.
    Some other functions were also adapted to properly handle the described case.
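
    The two adjustments described above can be modelled roughly as follows.
    This is a Python sketch of the idea only, not the kernel code: the
    classes and the list standing in for the reply buffer are assumptions,
    and only the field and function names are borrowed from the message.

```python
from dataclasses import dataclass
from typing import Optional

IPPROTO_TCP = 6
IPPROTO_UDP = 17


@dataclass
class Xprt:
    xpt_net: str  # stand-in for the transport's net namespace


@dataclass
class Rqst:
    rq_prot: int
    rq_xprt: Optional[Xprt] = None   # NULL for back-channel requests
    rq_bc_net: Optional[str] = None  # namespace used when rq_xprt is NULL


def svc_net(rqstp):
    """Model of the adjusted SVC_NET(): fall back to rq_bc_net."""
    if rqstp.rq_xprt is not None:
        return rqstp.rq_xprt.xpt_net
    return rqstp.rq_bc_net


def prep_reply_hdr(rqstp):
    """Decide from rq_prot alone, without touching rq_xprt->xpt_ops."""
    reply = []
    if rqstp.rq_prot == IPPROTO_TCP:
        reply.append(0)  # models svc_putnl(resv, 0) in the tcp case
    return reply


backchannel = Rqst(rq_prot=IPPROTO_TCP, rq_bc_net="netns-a")
print(svc_net(backchannel), prep_reply_hdr(backchannel))
```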

    Signed-off-by: Vasily Averin
    Cc: stable@vger.kernel.org
    Fixes: 23c20ecd4475 ("NFS: callback up - users counting cleanup")
    Signed-off-by: J. Bruce Fields
    v2: added lost extern svc_tcp_prep_reply_hdr()
    Signed-off-by: Vasily Averin
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
     

10 Jan, 2019

1 commit

  • commit fde872682e175743e0c3ef939c89e3c6008a1529 upstream.

    Some time back, nfsd switched from calling vfs_fsync() to using a new
    commit_metadata() hook in export_operations(). If the file system did
    not provide a commit_metadata() hook, it fell back to using
    sync_inode_metadata(). Unfortunately, that doesn't work on all file
    systems. In particular, it doesn't work on ext4 due to how the inode
    gets journalled --- the VFS writeback code will not always call
    ext4_write_inode().

    So we need to provide our own ext4_nfs_commit_metadata() method which
    calls ext4_write_inode() directly.
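
    The hook-with-fallback dispatch described above can be pictured with a
    small Python model. The dictionary standing in for export_operations and
    the log list are assumptions for illustration; the real code is C and
    differs in detail.

```python
def commit_metadata(inode, export_ops, log):
    """Call the file system's commit_metadata hook if it provides one,
    otherwise fall back to the generic path."""
    hook = export_ops.get("commit_metadata")
    if hook is not None:
        return hook(inode, log)
    log.append("sync_inode_metadata")  # generic fallback
    return 0


def ext4_nfs_commit_metadata(inode, log):
    """Model of the new ext4 hook: write the inode directly."""
    log.append("ext4_write_inode")
    return 0


log = []
commit_metadata("inode-1", {}, log)  # no hook: fallback
commit_metadata("inode-1", {"commit_metadata": ext4_nfs_commit_metadata}, log)
print(log)
```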

    Google-Bug-Id: 121195940
    Signed-off-by: Theodore Ts'o
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     

08 Dec, 2018

1 commit

  • commit 3054426dc68e5d63aa6a6e9b91ac4ec78e3f3805 upstream.

    commit 3f5fe9fef5b2 ("sched/debug: Fix task state recording/printout")
    tried to fix the problem introduced by a previous commit efb40f588b43
    ("sched/tracing: Fix trace_sched_switch task-state printing"). However
    the prev_state output in sched_switch is still broken.

    task_state_index() uses fls() which considers the LSB as 1. Left
    shifting 1 by this value gives an incorrect mapping to the task state.
    Fix this by decrementing the value returned by __get_task_state()
    before shifting.
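
    The off-by-one can be reproduced with a small Python model. The fls()
    below mirrors the kernel helper's 1-based result; the two wrapper
    functions are hypothetical names used only to contrast the broken and
    fixed shifts.

```python
TASK_INTERRUPTIBLE = 0x01
TASK_UNINTERRUPTIBLE = 0x02


def fls(x):
    """1-based index of the most significant set bit, 0 when x == 0."""
    return x.bit_length()


def buggy_state_bit(state):
    # Shifting by the 1-based index lands one bit too high.
    return 1 << fls(state)


def fixed_state_bit(state):
    # Decrement the index before shifting; keep 0 for "no bit set".
    idx = fls(state)
    return (1 << (idx - 1)) if idx else 0


print(hex(buggy_state_bit(TASK_UNINTERRUPTIBLE)))
print(hex(fixed_state_bit(TASK_UNINTERRUPTIBLE)))
```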

    Link: http://lkml.kernel.org/r/1540882473-1103-1-git-send-email-pkondeti@codeaurora.org

    Cc: stable@vger.kernel.org
    Fixes: 3f5fe9fef5b2 ("sched/debug: Fix task state recording/printout")
    Signed-off-by: Pavankumar Kondeti
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Pavankumar Kondeti
     

11 Oct, 2018

1 commit

  • David Howells says:

    ====================
    rxrpc: Fix packet reception code

    Here is a set of patches that prepares for and fixes problems in rxrpc's
    packet reception code. The serious problems are:

    (A) There's a window between binding the socket and setting the data_ready
    hook in which packets can find their way into the UDP socket's receive
    queues.

    (B) The skb_recv_udp() will return an error (and clear the error state) if
    there was an error on the Tx side. rxrpc doesn't handle this.

    (C) The rxrpc data_ready handler doesn't fully drain the UDP receive
    queue.

    (D) The rxrpc data_ready handler assumes it is called in a non-reentrant
    state.

    The second patch fixes (A) - (C); the third patch renders (B) and (C)
    non-issues by using the encap_rcv hook instead of data_ready - and the
    final patch fixes (D). That last is the most complex.

    The preparatory patches are:

    (1) Fix some places that are doing things in the wrong net namespace.

    (2) Stop taking the rcu read lock as it's held by the IP input routine in
    the call chain.

    (3) Only end the Tx phase if *we* rotated the final packet out of the Tx
    buffer.

    (4) Don't assume that the call state won't change after dropping the
    call_state lock.

    (5) Only take receive window and MTU size parameters from an ACK packet if
    it's the latest ACK packet.

    (6) Record connection-level abort information correctly.

    (7) Fix a trace line.

    And then there are three main patches - note that these are mixed in with
    the preparatory patches somewhat:

    (1) Fix the setup window (A), skb_recv_udp() error check (B) and packet
    drainage (C).

    (2) Switch to using the encap_rcv instead of data_ready to cut out the
    effects of the UDP read queues and get the packets delivered directly.

    (3) Add more locking into the various packet input paths to defend against
    re-entrance (D).
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

09 Oct, 2018

1 commit


06 Oct, 2018

1 commit

  • Ingo writes:
    "scheduler fixes:

    These fixes address a rather involved performance regression between
    v4.17->v4.19 in the sched/numa auto-balancing code. Since distros
    really need this fix we accelerated it to sched/urgent for a faster
    upstream merge.

    NUMA scheduling and balancing performance is now largely back to
    v4.17 levels, without reintroducing the NUMA placement bugs that
    v4.18 and v4.19 fixed.

    Many thanks to Srikar Dronamraju, Mel Gorman and Jirka Hladky, for
    reporting, testing, re-testing and solving this rather complex set of
    bugs."

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/numa: Migrate pages to local nodes quicker early in the lifetime of a task
    mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration
    sched/numa: Avoid task migration for small NUMA improvement
    mm/migrate: Use spin_trylock() while resetting rate limit
    sched/numa: Limit the conditions where scan period is reset
    sched/numa: Reset scan rate whenever task moves across nodes
    sched/numa: Pass destination CPU as a parameter to migrate_task_rq
    sched/numa: Stop multiple tasks from moving to the CPU at the same time

    Greg Kroah-Hartman
     

02 Oct, 2018

1 commit

  • Rate limiting of page migrations due to automatic NUMA balancing was
    introduced to mitigate the worst-case scenario of migrating at high
    frequency due to false sharing or slowly ping-ponging between nodes.
    Since then, a lot of effort was spent on correctly identifying these
    pages and avoiding unnecessary migrations and the safety net may no longer
    be required.

    Jirka Hladky reported a regression in 4.17 due to a scheduler patch that
    avoids spreading STREAM tasks wide prematurely. However, once the task
    was properly placed, it delayed migrating the memory due to rate limiting.
    Increasing the limit fixed the problem for him.

    Currently, the limit is hard-coded and does not account for the real
    capabilities of the hardware. Even if an estimate was attempted, it would
    not properly account for the number of memory controllers and it could
    not account for the amount of bandwidth used for normal accesses. Rather
    than fudging, this patch simply eliminates the rate limiting.

    However, Jirka reports that a STREAM configuration using multiple
    processes achieved similar performance to 4.16. In local tests, this patch
    improved performance of STREAM relative to the baseline but it is somewhat
    machine-dependent. Most workloads show little or no performance difference,
    implying that there is not a heavy reliance on the throttling mechanism
    and it is safe to remove.

    STREAM on 2-socket machine
                           4.19.0-rc5             4.19.0-rc5
                           numab-v1r1       noratelimit-v1r1
    MB/sec copy     43298.52 (   0.00%)    44673.38 (   3.18%)
    MB/sec scale    30115.06 (   0.00%)    31293.06 (   3.91%)
    MB/sec add      32825.12 (   0.00%)    34883.62 (   6.27%)
    MB/sec triad    32549.52 (   0.00%)    34906.60 (   7.24%)
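
    The percentages in the table are relative gains over the numab-v1r1
    baseline. A short Python check of that arithmetic, using only the
    figures from the table (the helper name is an assumption):

```python
def pct_gain(baseline, patched):
    """Relative change of patched over baseline, in percent, 2 decimals."""
    return round((patched - baseline) / baseline * 100, 2)


# (baseline MB/sec, noratelimit MB/sec) as listed in the table.
stream = {
    "copy": (43298.52, 44673.38),
    "scale": (30115.06, 31293.06),
    "add": (32825.12, 34883.62),
    "triad": (32549.52, 34906.60),
}

for kernel, (base, new) in stream.items():
    print(f"{kernel:6s} {pct_gain(base, new):5.2f}%")
```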

    Signed-off-by: Mel Gorman
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: Jirka Hladky
    Cc: Linus Torvalds
    Cc: Linux-MM
    Cc: Srikar Dronamraju
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20181001100525.29789-2-mgorman@techsingularity.net
    Signed-off-by: Ingo Molnar

    Mel Gorman
     

28 Sep, 2018

1 commit

  • Fix error distribution by immediately delivering the errors to all the
    affected calls rather than deferring them to a worker thread. The problem
    with the latter is that retries and things can happen in the meantime when we
    want to stop that sooner.

    To this end:

    (1) Stop the error distributor from removing calls from the error_targets
    list so that peer->lock isn't needed to synchronise against other adds
    and removals.

    (2) Require the peer's error_targets list to be accessed with RCU, thereby
    avoiding the need to take peer->lock over distribution.

    (3) Don't attempt to affect a call's state if it is already marked complete.

    Signed-off-by: David Howells

    David Howells
     

25 Aug, 2018

1 commit


21 Aug, 2018

1 commit

  • Pull tracing updates from Steven Rostedt:

    - Restructure of lockdep and latency tracers

    This is the biggest change. Joel Fernandes restructured the hooks
    from irqs and preemption disabling and enabling. He got rid of a lot
    of the preprocessor #ifdef mess that they caused.

    He turned both lockdep and the latency tracers to use trace events
    inserted in the preempt/irqs disabling paths. But unfortunately,
    these started to cause issues in corner cases. Thus, parts of the
    code were reverted back to where lockdep and the latency tracers just
    get called directly (without using the trace events). But because the
    original change cleaned up the code very nicely we kept that, as well
    as the trace events for preempt and irqs disabling, but they are
    limited to not being called in NMIs.

    - Have trace events use SRCU for "rcu idle" calls. This was required
    for the preempt/irqs off trace events. But it also had to not allow
    them to be called in NMI context. Waiting till Paul makes an NMI safe
    SRCU API.

    - New notrace SRCU API to allow trace events to use SRCU.

    - Addition of mcount-nop option support

    - SPDX headers replacing GPL templates.

    - Various other fixes and clean ups.

    - Some fixes are marked for stable, but were not fully tested before
    the merge window opened.

    * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
    tracing: Fix SPDX format headers to use C++ style comments
    tracing: Add SPDX License format tags to tracing files
    tracing: Add SPDX License format to bpf_trace.c
    blktrace: Add SPDX License format header
    s390/ftrace: Add -mfentry and -mnop-mcount support
    tracing: Add -mcount-nop option support
    tracing: Avoid calling cc-option -mrecord-mcount for every Makefile
    tracing: Handle CC_FLAGS_FTRACE more accurately
    Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()
    Uprobes: Simplify uprobe_register() body
    tracepoints: Free early tracepoints after RCU is initialized
    uprobes: Use synchronize_rcu() not synchronize_sched()
    tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister()
    ftrace: Remove unused pointer ftrace_swapper_pid
    tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage"
    tracing/irqsoff: Handle preempt_count for different configs
    tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"
    tracing: irqsoff: Account for additional preempt_disable
    trace: Use rcu_dereference_raw for hooks from trace-event subsystem
    tracing/kprobes: Fix within_notrace_func() to check only notrace functions
    ...

    Linus Torvalds
     

20 Aug, 2018

1 commit

  • Pull networking fixes from David Miller:

    1) Fix races in IPVS, from Tan Hu.

    2) Missing unbind in matchall classifier, from Hangbin Liu.

    3) Missing act_ife action release, from Vlad Buslov.

    4) Cure lockdep splats in ila, from Cong Wang.

    5) veth queue leak on link delete, from Toshiaki Makita.

    6) Disable isdn's IIOCDBGVAR ioctl, it exposes kernel addresses. From
    Kees Cook.

    7) RCU usage fixup in XDP, from Tariq Toukan.

    8) Two TCP ULP fixes from Daniel Borkmann.

    9) r8169 needs REALTEK_PHY as a Kconfig dependency, from Heiner
    Kallweit.

    10) Always take tcf_lock with BH disabled, otherwise we can deadlock
    with rate estimator code paths. From Vlad Buslov.

    11) Don't use MSI-X on RTL8106e r8169 chips, they don't resume properly.
    From Jian-Hong Pan.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
    ip6_vti: fix creating fallback tunnel device for vti6
    ip_vti: fix a null pointer deferrence when create vti fallback tunnel
    r8169: don't use MSI-X on RTL8106e
    net: lan743x_ptp: convert to ktime_get_clocktai_ts64
    net: sched: always disable bh when taking tcf_lock
    ip6_vti: simplify stats handling in vti6_xmit
    bpf: fix redirect to map under tail calls
    r8169: add missing Kconfig dependency
    tools/bpf: fix bpf selftest test_cgroup_storage failure
    bpf, sockmap: fix sock_map_ctx_update_elem race with exist/noexist
    bpf, sockmap: fix map elem deletion race with smap_stop_sock
    bpf, sockmap: fix leakage of smap_psock_map_entry
    tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattach
    tcp, ulp: add alias for all ulp modules
    bpf: fix a rcu usage warning in bpf_prog_array_copy_core()
    samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM
    net/xdp: Fix suspicious RCU usage warning
    net/mlx5e: Delete unneeded function argument
    Documentation: networking: ti-cpsw: correct cbs parameters for Eth1 100Mb
    isdn: Disable IIOCDBGVAR
    ...

    Linus Torvalds
     

19 Aug, 2018

1 commit

  • Pull char/misc driver updates from Greg KH:
    "Here is the big set of char/misc drivers for 4.19-rc1

    There is a lot here, much more than normal, seems like everyone is
    writing new driver subsystems these days... Anyway, major things here
    are:

    - new FSI driver subsystem, yet-another-powerpc low-level hardware
    bus

    - gnss, finally an in-kernel GPS subsystem to try to tame all of the
    crazy out-of-tree drivers that have been floating around for years,
    combined with some really hacky userspace implementations. This is
    only for GNSS receivers, but you have to start somewhere, and this
    is great to see.

    Other than that, there are new slimbus drivers, new coresight drivers,
    new fpga drivers, and loads of DT bindings for all of these and
    existing drivers.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
    android: binder: Rate-limit debug and userspace triggered err msgs
    fsi: sbefifo: Bump max command length
    fsi: scom: Fix NULL dereference
    misc: mic: SCIF Fix scif_get_new_port() error handling
    misc: cxl: changed asterisk position
    genwqe: card_base: Use true and false for boolean values
    misc: eeprom: assignment outside the if statement
    uio: potential double frees if __uio_register_device() fails
    eeprom: idt_89hpesx: clean up an error pointer vs NULL inconsistency
    misc: ti-st: Fix memory leak in the error path of probe()
    android: binder: Show extra_buffers_size in trace
    firmware: vpd: Fix section enabled flag on vpd_section_destroy
    platform: goldfish: Retire pdev_bus
    goldfish: Use dedicated macros instead of manual bit shifting
    goldfish: Add missing includes to goldfish.h
    mux: adgs1408: new driver for Analog Devices ADGS1408/1409 mux
    dt-bindings: mux: add adi,adgs1408
    Drivers: hv: vmbus: Cleanup synic memory free path
    Drivers: hv: vmbus: Remove use of slow_virt_to_phys()
    Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
    ...

    Linus Torvalds
     

18 Aug, 2018

1 commit

  • Commits 109980b894e9 ("bpf: don't select potentially stale ri->map
    from buggy xdp progs") and 7c3001313396 ("bpf: fix ri->map_owner
    pointer on bpf_prog_realloc") tried to mitigate that buggy programs
    using bpf_redirect_map() helper call do not leave stale maps behind.
    Idea was to add a map_owner cookie into the per CPU struct redirect_info
    which was set to prog->aux by the prog making the helper call as a
    proof that the map is not stale since the prog is implicitly holding
    a reference to it. This owner cookie could later on get compared with
    the program calling into BPF whether they match and therefore the
    redirect could proceed with processing the map safely.

    In (obvious) hindsight, this approach breaks down when tail calls are
    involved since the original caller's prog->aux pointer does not have
    to match the one from one of the progs out of the tail call chain,
    and therefore the xdp buffer will be dropped instead of redirected.
    A way around that would be to fix the issue differently (which also
    allows to remove related work in fast path at the same time): once
    the life-time of a redirect map has come to its end we use its map
    free callback where we need to wait on synchronize_rcu() for current
    outstanding xdp buffers and remove such a map pointer from the
    redirect info if found to be present. At that time no program is
    using this map anymore so we simply invalidate the map pointers to
    NULL iff they previously pointed to that instance while making sure
    that the redirect path only reads out the map once.
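
    The invalidation step described above can be modelled in a few lines of
    Python. This is a sketch under simplifying assumptions: a plain list
    stands in for the per-CPU redirect info, the function names are
    illustrative, and the synchronize_rcu() wait is not modelled.

```python
class RedirectInfo:
    """Stand-in for one CPU's redirect info."""

    def __init__(self, current_map=None):
        self.map = current_map


def map_free(dead_map, per_cpu_info):
    """On map teardown, clear only the pointers that refer to this map."""
    for ri in per_cpu_info:
        if ri.map is dead_map:
            ri.map = None


def redirect_target(ri):
    """Read ri.map once into a local and use only the local afterwards."""
    m = ri.map
    return m["name"] if m is not None else None


map_a = {"name": "devmap-a"}
map_b = {"name": "devmap-b"}
cpus = [RedirectInfo(map_a), RedirectInfo(map_b), RedirectInfo()]

map_free(map_a, cpus)
print([redirect_target(ri) for ri in cpus])
```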

    Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
    Fixes: 109980b894e9 ("bpf: don't select potentially stale ri->map from buggy xdp progs")
    Reported-by: Sebastiano Miano
    Signed-off-by: Daniel Borkmann
    Acked-by: John Fastabend
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

16 Aug, 2018

1 commit

  • Pull networking updates from David Miller:
    "Highlights:

    - Gustavo A. R. Silva keeps working on the implicit switch fallthru
    changes.

    - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
    Luca Coelho.

    - Re-enable ASPM in r8169, from Kai-Heng Feng.

    - Add virtual XFRM interfaces, which avoids all of the limitations of
    existing IPSEC tunnels. From Steffen Klassert.

    - Convert GRO over to use a hash table, so that when we have many
    flows active we don't traverse a long list during accumulation.

    - Many new self tests for routing, TC, tunnels, etc. Too many
    contributors to mention them all, but I'm really happy to keep
    seeing this stuff.

    - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.

    - Lots of cleanups and fixes in L2TP code from Guillaume Nault.

    - Add IPSEC offload support to netdevsim, from Shannon Nelson.

    - Add support for slotting with non-uniform distribution to netem
    packet scheduler, from Yousuk Seung.

    - Add UDP GSO support to mlx5e, from Boris Pismenny.

    - Support offloading of Team LAG in NFP, from John Hurley.

    - Allow to configure TX queue selection based upon RX queue, from
    Amritha Nambiar.

    - Support ethtool ring size configuration in aquantia, from Anton
    Mikaev.

    - Support DSCP and flowlabel per-transport in SCTP, from Xin Long.

    - Support list based batching and stack traversal of SKBs, this is
    very exciting work. From Edward Cree.

    - Busyloop optimizations in vhost_net, from Toshiaki Makita.

    - Introduce the ETF qdisc, which allows time based transmissions. IGB
    can offload this in hardware. From Vinicius Costa Gomes.

    - Add parameter support to devlink, from Moshe Shemesh.

    - Several multiplication and division optimizations for BPF JIT in
    nfp driver, from Jiong Wang.

    - Lots of preparatory work to make more of the packet scheduler layer
    lockless, when possible, from Vlad Buslov.

    - Add ACK filter and NAT awareness to sch_cake packet scheduler, from
    Toke Høiland-Jørgensen.

    - Support regions and region snapshots in devlink, from Alex Vesker.

    - Allow to attach XDP programs to both HW and SW at the same time on
    a given device, with initial support in nfp. From Jakub Kicinski.

    - Add TLS RX offload and support in mlx5, from Ilya Lesokhin.

    - Use PHYLIB in r8169 driver, from Heiner Kallweit.

    - All sorts of changes to support Spectrum 2 in mlxsw driver, from
    Ido Schimmel.

    - PTP support in mv88e6xxx DSA driver, from Andrew Lunn.

    - Make TCP_USER_TIMEOUT socket option more accurate, from Jon
    Maxwell.

    - Support for templates in packet scheduler classifier, from Jiri
    Pirko.

    - IPV6 support in RDS, from Ka-Cheong Poon.

    - Native tproxy support in nf_tables, from Máté Eckl.

    - Maintain IP fragment queue in an rbtree, but optimize properly for
    in-order frags. From Peter Oskolkov.

    - Improve handling of ACKs on hole repairs, from Yuchung Cheng"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
    bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
    hv/netvsc: Fix NULL dereference at single queue mode fallback
    net: filter: mark expected switch fall-through
    xen-netfront: fix warn message as irq device name has '/'
    cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
    net: dsa: mv88e6xxx: missing unlock on error path
    rds: fix building with IPV6=m
    inet/connection_sock: prefer _THIS_IP_ to current_text_addr
    net: dsa: mv88e6xxx: bitwise vs logical bug
    net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
    ieee802154: hwsim: using right kind of iteration
    net: hns3: Add vlan filter setting by ethtool command -K
    net: hns3: Set tx ring' tc info when netdev is up
    net: hns3: Remove tx ring BD len register in hns3_enet
    net: hns3: Fix desc num set to default when setting channel
    net: hns3: Fix for phy link issue when using marvell phy driver
    net: hns3: Fix for information of phydev lost problem when down/up
    net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
    net: hns3: Add support for serdes loopback selftest
    bnxt_en: take coredump_record structure off stack
    ...

    Linus Torvalds
     

15 Aug, 2018

2 commits

  • Pull sound updates from Takashi Iwai:
    "It's been busy summer weeks and hence lots of changes, partly for a
    few new drivers and partly for a wide range of fixes.

    Here are highlights:

    ALSA Core:
    - Fix rawmidi buffer management, code cleanup / refactoring
    - Fix the SG-buffer page handling with incorrect fallback size
    - Fix the stall at virmidi trigger callback with a large buffer; also
    offloading and code-refactoring along with it
    - Various ALSA sequencer code cleanups

    ASoC:
    - Deploy the standard snd_pcm_stop_xrun() helper in several drivers
    - Support for providing name prefixes to generic component nodes
    - Quite a few fixes for DPCM as it gains a bit wider use and more
    robust testing
    - Generalization of the DIO2125 support to a simple amplifier driver
    - Accessory detection support for the audio graph card
    - DT support for PXA AC'97 devices
    - Quirks for a number of new x86 systems
    - Support for AM Logic Meson, Everest ES7154, Intel systems with
    RT5682, Qualcomm QDSP6 and WCD9335, Realtek RT5682 and TI TAS5707

    HD-audio:
    - Code refactoring in HD-audio ext codec codes to drop own classes;
    preliminary works for the upcoming legacy codec support
    - Generalized DRM audio component for the upcoming radeon / amdgpu
    support
    - Unification of mic mute-LED and GPIO support for various codecs
    - Further improvement of CA0132 codec support including Recon3D
    - Proper vga_switcheroo handling for AMD i-GPU
    - Update of model list in documentation
    - Fixups for another HP Spectre x360, Conexant codecs, power-save
    blacklist update

    USB-audio:
    - Fix the invalid sample rate setup with external clock
    - Support of UAC3 selector units and processing units
    - Basic UAC3 power-domain support
    - Support for Encore mDSD and Thesycon-based DSD devices
    - Preparation for future complete callback changes

    Firewire:
    - Add support for MOTU Traveler

    Misc:
    - The endianness notation fixes in various drivers
    - Add fall-through comment in lots of drivers
    - Various sparse warning fixes, e.g. about PCM format types"

    * tag 'sound-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (529 commits)
    ASoC: adav80x: mark expected switch fall-through
    ASoC: da7219: Add delays to capture path to remove DC offset noise
    ALSA: usb-audio: Mark expected switch fall-through
    ALSA: mixart: Mark expected switch fall-through
    ALSA: opl3: Mark expected switch fall-through
    ALSA: hda/ca0132 - Add exit commands for Recon3D
    ALSA: hda/ca0132 - Change mixer controls for Recon3D
    ALSA: hda/ca0132 - Add Recon3D input and output select commands
    ALSA: hda/ca0132 - Add DSP setup defaults for Recon3D
    ALSA: hda/ca0132 - Add Recon3D startup functions and setup
    ALSA: hda/ca0132 - Add bool variable to enable/disable pci region2 mmio
    ALSA: hda/ca0132 - Add Recon3D pincfg
    ALSA: hda/ca0132 - Add quirk ID and enum for Recon3D
    ALSA: hda/ca0132 - Add alt_functions unsolicited response
    ALSA: hda/ca0132 - Clean up ca0132_init function.
    ALSA: hda/ca0132 - Create mmio gpio function to make code clearer
    ASoC: wm_adsp: Make DSP name configurable by codec driver
    ASoC: wm_adsp: Declare firmware controls from codec driver
    ASoC: max98373: Added software reset register to readable registers
    ASoC: wm_adsp: Correct DSP pointer for preloader control
    ...

    Linus Torvalds
     
  • Pull power management updates from Rafael Wysocki:
    "These add a new framework for CPU idle time injection, to be used by
    all of the idle injection code in the kernel in the future, fix some
    issues and add a number of relatively small extensions in multiple
    places.

    Specifics:

    - Add a new framework for CPU idle time injection (Daniel Lezcano).

    - Add AVS support to the armada-37xx cpufreq driver (Gregory
    CLEMENT).

    - Add support for current CPU frequency reporting to the ACPI CPPC
    cpufreq driver (George Cherian).

    - Rework the cooling device registration in the imx6q/thermal driver
    (Bastian Stender).

    - Make the pcc-cpufreq driver refuse to work with dynamic scaling
    governors on systems with many CPUs to avoid scalability issues
    with it (Rafael Wysocki).

    - Fix the intel_pstate driver to report different maximum CPU
    frequencies on systems where they really are different and to
    ignore the turbo active ratio if hardware-managed P-states (HWP)
    are in use; make it use the match_string() helper (Xie Yisheng,
    Srinivas Pandruvada).

    - Fix a minor deferred probe issue in the qcom-kryo cpufreq driver
    (Niklas Cassel).

    - Add a tracepoint for the tracking of frequency limits changes (from
    Android) to the cpufreq core (Ruchi Kandoi).

    - Fix a circular lock dependency between CPU hotplug and sysfs
    locking in the cpufreq core reported by lockdep (Waiman Long).

    - Avoid excessive error reports on driver registration failures in
    the ARM cpuidle driver (Sudeep Holla).

    - Add a new device links flag to the driver core to make links go
    away automatically on supplier driver removal (Vivek Gautam).

    - Eliminate potential race condition between system-wide power
    management transitions and system shutdown (Pingfan Liu).

    - Add a quirk to save NVS memory on system suspend for the ASUS 1025C
    laptop (Willy Tarreau).

    - Make more systems use suspend-to-idle (instead of ACPI S3) by
    default (Tristian Celestin).

    - Get rid of stack VLA usage in the low-level hibernation code on
    64-bit x86 (Kees Cook).

    - Fix error handling in the hibernation core and mark an expected
    fall-through switch in it (Chengguang Xu, Gustavo Silva).

    - Extend the generic power domains (genpd) framework to support
    attaching a device to a power domain by name (Ulf Hansson).

    - Fix device reference counting and user limits initialization in the
    devfreq core (Arvind Yadav, Matthias Kaehlcke).

    - Fix a few issues in the rk3399_dmc devfreq driver and improve its
    documentation (Enric Balletbo i Serra, Lin Huang, Nick Milner).

    - Drop a redundant error message from the exynos-ppmu devfreq driver
    (Markus Elfring)"

    * tag 'pm-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (35 commits)
    PM / reboot: Eliminate race between reboot and suspend
    PM / hibernate: Mark expected switch fall-through
    cpufreq: intel_pstate: Ignore turbo active ratio in HWP
    cpufreq: Fix a circular lock dependency problem
    cpu/hotplug: Add a cpus_read_trylock() function
    x86/power/hibernate_64: Remove VLA usage
    cpufreq: trace frequency limits change
    cpufreq: intel_pstate: Show different max frequency with turbo 3 and HWP
    cpufreq: pcc-cpufreq: Disable dynamic scaling on many-CPU systems
    cpufreq: qcom-kryo: Silently error out on EPROBE_DEFER
    cpufreq / CPPC: Add cpuinfo_cur_freq support for CPPC
    cpufreq: armada-37xx: Add AVS support
    dt-bindings: marvell: Add documentation for the Armada 3700 AVS binding
    PM / devfreq: rk3399_dmc: Fix duplicated opp table on reload.
    PM / devfreq: Init user limits from OPP limits, not viceversa
    PM / devfreq: rk3399_dmc: fix spelling mistakes.
    PM / devfreq: rk3399_dmc: do not print error when get supply and clk defer.
    dt-bindings: devfreq: rk3399_dmc: move interrupts to be optional.
    PM / devfreq: rk3399_dmc: remove wait for dcf irq event.
    dt-bindings: clock: add rk3399 DDR3 standard speed bins.
    ...

    Linus Torvalds
     

14 Aug, 2018

3 commits

  • Pull btrfs updates from David Sterba:
    "Mostly fixes and cleanups, nothing big, though the notable thing is
    the inserted/deleted lines delta -1124.

    User visible changes:
    - allow defrag on opened read-only files that have rw permissions;
    similar to what dedupe will allow on such files

    Core changes:
    - tree checker improvements, reported by fuzzing:
    * more checks for: block group items, essential trees
    * chunk type validation
    * mount time cross-checks that physical and logical chunks match
    * switch more error codes to EUCLEAN aka EFSCORRUPTED

    Fixes:
    - fsync corner case fixes

    - fix send failure when root has deleted files still open

    - send, fix incorrect file layout after hole punching beyond eof

    - fix races between mount and device scan ioctl, found by fuzzing

    - fix deadlock when delayed iput is called from writeback on the same
    inode; rare but has been observed in practice, also removes code

    - fix pinned byte accounting, using the right percpu helpers; this
    should avoid some write IO inefficiency during low space conditions

    - don't remove block group that still has pinned bytes

    - reset on-disk device stats value after replace, otherwise this
    would report stale values for the new device

    Cleanups:
    - time64_t/timespec64 cleanups

    - remove remaining dead code in scrub handling NOCOW extents after
    disabling it in previous cycle

    - simplify fsync regarding ordered extents logic and remove all the
    related code

    - remove redundant arguments in order to reduce stack space
    consumption

    - remove support for V0 type of extents, not in use since 2.6.30

    - remove several unused structure members

    - fewer indirect function calls by inlining some callbacks

    - qgroup rescan timing fixes

    - vfs: iget cleanups"

    * tag 'for-4.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (182 commits)
    btrfs: revert fs_devices state on error of btrfs_init_new_device
    btrfs: Exit gracefully when chunk map cannot be inserted to the tree
    btrfs: Introduce mount time chunk dev extent mapping check
    btrfs: Verify that every chunk has corresponding block group at mount time
    btrfs: Check that each block group has corresponding chunk at mount time
    Btrfs: send, fix incorrect file layout after hole punching beyond eof
    btrfs: Use wrapper macro for rcu string to remove duplicate code
    btrfs: simplify btrfs_iget
    btrfs: lift make_bad_inode into btrfs_iget
    btrfs: simplify IS_ERR/PTR_ERR checks
    btrfs: btrfs_iget never returns an is_bad_inode inode
    btrfs: replace: Reset on-disk dev stats value after replace
    btrfs: extent-tree: Remove unused __btrfs_free_block_rsv
    btrfs: backref: Use ERR_CAST to return error code
    btrfs: Remove redundant btrfs_release_path from btrfs_unlink_subvol
    btrfs: Remove root parameter from btrfs_unlink_subvol
    btrfs: Remove fs_info from btrfs_add_root_ref
    btrfs: Remove fs_info from btrfs_del_root_ref
    btrfs: Remove fs_info from btrfs_del_root
    btrfs: Remove fs_info from btrfs_delete_delayed_dir_index
    ...

    Linus Torvalds
     
  • Pull file locking updates from Jeff Layton:
    "Just a couple of patches from Konstantin to fix /proc/locks when the
    process that set the lock has exited, and a new tracepoint for the
    flock() codepath. Also threw in mailmap entries for my addresses and a
    comment cleanup"

    * tag 'locks-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
    locks: remove misleading obsolete comment
    mailmap: remap some of my email addresses to kernel.org address
    locks: add tracepoint in flock codepath
    fs/lock: show locks taken by processes from another pidns
    fs/lock: skip lock owner pid translation in case we are in init_pid_ns

    Linus Torvalds
     
  • There is an unaligned access in the structure
    'trace_event_raw_fib_table_lookup'.

    In include/trace/events/fib.h, there is a memory operation that casts
    the 'src' data member to a pointer and then stores a value through
    that pointer:

    p32 = (__be32 *) __entry->src;
    *p32 = flp->saddr;

    The offset of 'src' in structure trace_event_raw_fib_table_lookup is not
    four-byte aligned. Some architectures do not permit unaligned
    accesses, so they pay the price of handling the access in an
    exception handler.

    Adjust the layout of the structure to avoid this case.
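
    As a rough illustration of the problem described above, the sketch
    below uses two hypothetical structures (entry_before and entry_after,
    not the real trace_event_raw_fib_table_lookup layout) to show how field
    order alone decides whether a 4-byte address array starts on a 4-byte
    boundary. The store_addr() helper is an assumption added for
    illustration; it shows memcpy() as an alignment-safe way to store the
    value and is not the change made by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout for illustration only: three bytes of smaller
 * fields come first, so the address array lands on an odd offset. */
struct entry_before {
    uint16_t flags;   /* offset 0 */
    uint8_t  scope;   /* offset 2 */
    uint8_t  src[4];  /* offset 3: not a multiple of 4 */
    uint8_t  dst[4];  /* offset 7 */
};

/* Same fields reordered so both address arrays start on a 4-byte
 * boundary relative to the start of the structure. */
struct entry_after {
    uint8_t  src[4];  /* offset 0 */
    uint8_t  dst[4];  /* offset 4 */
    uint16_t flags;   /* offset 8 */
    uint8_t  scope;   /* offset 10 */
};

/* Returns non-zero when an offset is a multiple of four bytes. */
static int is_aligned4(size_t offset)
{
    return offset % 4 == 0;
}

/* Alignment-safe store: memcpy() makes no assumption about the
 * alignment of the destination, unlike a cast to a 32-bit pointer. */
static void store_addr(uint8_t *field, uint32_t addr)
{
    memcpy(field, &addr, sizeof(addr));
}
```

    Checking offsetof(struct entry_before, src) against
    offsetof(struct entry_after, src) with is_aligned4() shows the
    difference the reordering makes.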

    Fixes: 9f323973c915 ("net/ipv4: Udate fib_table_lookup tracepoint")
    Signed-off-by: Zong Li
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Zong Li
     

07 Aug, 2018

1 commit


06 Aug, 2018

2 commits

  • We used to call btrfs_file_extent_inline_len() to get the uncompressed
    data size of an inlined extent.

    However, this function hides evil. For a compressed extent, it has no
    choice but to read ram_bytes directly from btrfs_file_extent_item.
    For an uncompressed extent, it uses the item size to calculate the real
    data size and ignores ram_bytes completely.

    In fact, because of this behavior, when ram_bytes is corrupted the kernel
    btrfs_print_leaf() can't even print the actual ram_bytes to expose the bug.

    Since we have the tree-checker to verify all EXTENT_DATA items, such a
    mismatch can be detected pretty easily, thus we can trust ram_bytes
    without the evil btrfs_file_extent_inline_len().
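
    A minimal sketch of the behavior described above, under stated
    assumptions: struct inline_extent, INLINE_HEADER_SIZE,
    legacy_inline_len() and ram_bytes_is_consistent() are hypothetical
    stand-ins written for illustration and are not the btrfs on-disk
    format or kernel API. It shows how a helper that derives the length
    from the item size can hide a corrupted ram_bytes, and how a
    checker-style comparison exposes it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical size of the header that precedes the inline data. */
#define INLINE_HEADER_SIZE 21u

/* Simplified stand-in for an inline file extent item. */
struct inline_extent {
    int      compressed;  /* non-zero if the inline data is compressed */
    uint64_t ram_bytes;   /* recorded uncompressed data size */
    uint32_t item_size;   /* total item size, header included */
};

/* Old-style helper as the text describes it: ram_bytes is consulted
 * only for compressed extents; otherwise the item size decides. */
static uint64_t legacy_inline_len(const struct inline_extent *e)
{
    if (e->compressed)
        return e->ram_bytes;
    return e->item_size - INLINE_HEADER_SIZE;
}

/* Checker-style validation: for an uncompressed extent, ram_bytes
 * must equal the data size implied by the item size. */
static int ram_bytes_is_consistent(const struct inline_extent *e)
{
    if (e->compressed)
        return 1;  /* cannot be cross-checked against the item size */
    return e->ram_bytes == (uint64_t)(e->item_size - INLINE_HEADER_SIZE);
}
```

    With an uncompressed item whose ram_bytes is wrong,
    legacy_inline_len() still returns a plausible length, while
    ram_bytes_is_consistent() reports the mismatch. Once such a check
    runs on every item, reading ram_bytes directly is sufficient.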

    Signed-off-by: Qu Wenruo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba

    Qu Wenruo
     
  • This is no longer used anywhere, remove all of it.

    Signed-off-by: Josef Bacik
    Reviewed-by: Filipe Manana
    Signed-off-by: David Sterba

    Josef Bacik
     

01 Aug, 2018

3 commits

  • Trace notifications from the softirq side of the socket to the
    process-context side.

    Signed-off-by: David Howells

    David Howells
     
  • Fix the ACK proposal tracepoint outcomes list: one entry is an empty
    string, which gets rendered as a hex number string, so make it a
    non-empty string instead.

    Signed-off-by: David Howells

    David Howells
     
  • Trace successful packet transmission (kernel_sendmsg() succeeded, that is)
    in AF_RXRPC. We can share the enum that defines the transmission points
    with the trace_rxrpc_tx_fail() tracepoint, so rename its constants to be
    applicable to both.

    Also, save the internal call->debug_id in the rxrpc_channel struct so that
    it can be used in retransmission trace lines.

    Signed-off-by: David Howells

    David Howells
     

31 Jul, 2018

1 commit

  • This patch detaches the preemptirq tracepoints from the tracers and
    keeps them separate.

    Advantages:
    * Lockdep and irqsoff events can now run in parallel since they no longer
    have their own calls.

    * This unifies the use case of adding hooks to an irqsoff and irqson
    event, and a preemptoff and preempton event.
    3 users of the events exist:
    - Lockdep
    - irqsoff and preemptoff tracers
    - irqs and preempt trace events

    The unification cleans up several ifdefs and makes the code in the
    preempt and irqsoff tracers simpler. It gets rid of all the horrific
    ifdeferry around PROVE_LOCKING and makes configuration of the different
    users of the tracepoints easier to understand. It also gets rid of the
    time_* function calls that the lockdep hooks used to call into the
    preemptirq tracer, which are not needed anymore. The negative delta in
    lines of code in this patch is quite large too.

    In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
    as a single point for registering probes onto the tracepoints. With
    this, the web of config options for preempt/irq toggle tracepoints and
    its users becomes:

    PREEMPT_TRACER   PREEMPTIRQ_EVENTS  IRQSOFF_TRACER PROVE_LOCKING
          |                 |     \         |           |
          \    (selects)    /      \        \ (selects) /
         TRACE_PREEMPT_TOGGLE       ---->   TRACE_IRQFLAGS
                         \                  /
                          \ (depends on)   /
                        PREEMPTIRQ_TRACEPOINTS

    Other than the performance tests mentioned in the previous patch, I also
    ran the locking API test suite. I verified that all test cases pass.

    I also injected issues by not registering lockdep probes onto the
    tracepoints, and I see failures, which confirms that the probes are
    indeed working.

    This series + lockdep probes not registered (just to inject errors):
    [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] sirq-safe-A => hirqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + irqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] soft-safe-A + irqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + irqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] soft-safe-A + irqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |

    With this series + lockdep probes registered, all locking tests pass:

    [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/12: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/21: ok | ok | ok |
    [ 0.000000] hard-safe-A + irqs-on/12: ok | ok | ok |
    [ 0.000000] soft-safe-A + irqs-on/12: ok | ok | ok |
    [ 0.000000] hard-safe-A + irqs-on/21: ok | ok | ok |
    [ 0.000000] soft-safe-A + irqs-on/21: ok | ok | ok |
    [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |

    Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.org

    Acked-by: Peter Zijlstra (Intel)
    Reviewed-by: Namhyung Kim
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Joel Fernandes (Google)
     

26 Jul, 2018

1 commit

  • systrace, which is used for tracing on Android systems, has carried a
    patch for many years in the Android tree that traces when the cpufreq
    limits change. With the help of this information, systrace can know
    when the policy limits change and can visually display the data. Let's
    add upstream support for the same.

    Signed-off-by: Ruchi Kandoi
    Signed-off-by: Joel Fernandes (Google)
    Acked-by: Viresh Kumar
    Acked-by: Steven Rostedt (VMware)
    Signed-off-by: Rafael J. Wysocki

    Ruchi Kandoi
     

23 Jul, 2018

2 commits


13 Jul, 2018

10 commits