04 Mar, 2021

6 commits

  • If tracing is disabled for some reason (traceoff_on_warning, command line,
    etc), the ftrace selftests are guaranteed to fail, as their results are
    defined by trace data in the ring buffers. If the ring buffers are turned
    off, the tests will fail, due to lack of data.

    Because tracing being disabled is for a specific reason (warning, user
    decided to, etc), it does not make sense to enable tracing to run the self
    tests, as the test output may corrupt the reason for the tracing to be
    disabled.

    Instead, simply skip the self tests and report that they are being skipped
    due to tracing being disabled.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
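
    The skip behavior described in this commit can be pictured with a small
    userspace sketch. It is not the kernel code: tracing_is_disabled,
    run_one_selftest() and run_selftest_or_skip() are hypothetical names
    chosen for illustration, and the test body is a stand-in that only
    passes when trace data could have reached the ring buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's "tracing has been disabled" state. */
static bool tracing_is_disabled;

enum selftest_result { SELFTEST_PASSED, SELFTEST_FAILED, SELFTEST_SKIPPED };

/* Hypothetical test body: it can only pass if trace data was recorded. */
static bool run_one_selftest(void)
{
    return !tracing_is_disabled;
}

/*
 * Skip the test instead of re-enabling tracing, so that whatever was in the
 * buffers when tracing was turned off is left untouched.
 */
static enum selftest_result run_selftest_or_skip(const char *name)
{
    assert(name != NULL);

    if (tracing_is_disabled) {
        printf("Selftest for %s skipped due to tracing disabled\n", name);
        return SELFTEST_SKIPPED;
    }
    return run_one_selftest() ? SELFTEST_PASSED : SELFTEST_FAILED;
}
```

    Reporting the skip explicitly keeps a skipped test distinguishable from
    a failed one in the boot log.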
     
  • kmemleak report:
    unreferenced object 0xc5a6f708 (size 8):
    comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
    hex dump (first 8 bytes):
    00 c1 3d 60 14 83 1f 8a ..=`....
    backtrace:
    [] __kmalloc_track_caller+0x2a6/0x460
    [] kstrndup+0x37/0x70
    [] argv_split+0x1c/0x120
    [] __create_synth_event+0x192/0xb00
    [] create_synth_event+0xbb/0x150
    [] create_dyn_event+0x5c/0xb0
    [] trace_parse_run_command+0xa7/0x140
    [] dyn_event_write+0x10/0x20
    [] vfs_write+0xa9/0x3c0
    [] ksys_write+0x89/0xc0
    [] __ia32_sys_write+0x15/0x20
    [] __do_fast_syscall_32+0x45/0x80
    [] do_fast_syscall_32+0x29/0x60
    [] do_SYSENTER_32+0x15/0x20
    [] entry_SYSENTER_32+0xa9/0xfc
    unreferenced object 0xc5a6f078 (size 8):
    comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
    hex dump (first 8 bytes):
    08 f7 a6 c5 00 00 00 00 ........
    backtrace:
    [] __kmalloc+0x2b6/0x470
    [] argv_split+0x82/0x120
    [] __create_synth_event+0x192/0xb00
    [] create_synth_event+0xbb/0x150
    [] create_dyn_event+0x5c/0xb0
    [] trace_parse_run_command+0xa7/0x140
    [] dyn_event_write+0x10/0x20
    [] vfs_write+0xa9/0x3c0
    [] ksys_write+0x89/0xc0
    [] __ia32_sys_write+0x15/0x20
    [] __do_fast_syscall_32+0x45/0x80
    [] do_fast_syscall_32+0x29/0x60
    [] do_SYSENTER_32+0x15/0x20
    [] entry_SYSENTER_32+0xa9/0xfc

    In __create_synth_event(), while iterating over field/type arguments,
    argv_split() returns an array of at least 2 elements even when zero
    arguments (argc=0) are passed, e.g. when there is a double delimiter
    or the string ends with a delimiter.

    To fix this, call argv_free() even when argc=0.

    Link: https://lkml.kernel.org/r/20210304094521.GA1826@cosmos

    Signed-off-by: Vamshi K Sthambamkadi
    Signed-off-by: Steven Rostedt (VMware)

    Vamshi K Sthambamkadi
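
    The leak pattern can be reproduced in userspace with a minimal sketch.
    split_words() and free_words() below are hypothetical analogues of
    argv_split() and argv_free(), written only for illustration: like the
    kernel helper, the splitter allocates a string copy and a pointer array
    even when it finds no words. The live_allocs counter is an assumption
    used to make the leak visible without kmemleak.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Number of allocations not yet freed; stands in for kmemleak here. */
static int live_allocs;

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

/*
 * Split str on whitespace. Two allocations are made even when no word is
 * found (*argcp == 0): a private copy of the string and the pointer array.
 */
static char **split_words(const char *str, int *argcp)
{
    size_t len = strlen(str);
    char *copy, **argv, *p;
    int argc = 0;

    assert(argcp != NULL);
    copy = tracked_alloc(len + 1);
    if (!copy)
        return NULL;
    memcpy(copy, str, len + 1);

    /* At most (len + 1) / 2 words, plus the hidden slot and the NULL. */
    argv = tracked_alloc((len / 2 + 3) * sizeof(*argv));
    if (!argv) {
        tracked_free(copy);
        return NULL;
    }
    argv[0] = copy;             /* hidden slot that owns the string copy */
    argv++;
    for (p = strtok(copy, " \t\n"); p; p = strtok(NULL, " \t\n"))
        argv[argc++] = p;
    argv[argc] = NULL;
    *argcp = argc;
    return argv;
}

static void free_words(char **argv)
{
    argv--;                     /* step back to the hidden slot */
    tracked_free(argv[0]);
    tracked_free(argv);
}

/* Returns 0 if the field had words, -1 if it was empty or on error. */
static int parse_field(const char *field, int free_when_empty)
{
    int argc;
    char **argv = split_words(field, &argc);

    if (!argv)
        return -1;
    if (argc == 0) {
        if (free_when_empty)    /* the fix: free even when argc == 0 */
            free_words(argv);
        return -1;
    }
    free_words(argv);
    return 0;
}
```

    Calling parse_field("", 0) leaves two allocations behind, matching the
    two unreferenced objects in the report; parse_field("", 1) leaves none.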
     
  • When CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is enabled, and the time
    stamps are detected as not being valid, it reports information about the
    write stamp, but does not show the before_stamp which is still useful
    information. Also, it should give a warning once, such that tests detect
    this happening.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Part of the logic of the new time stamp code depends on the before_stamp and
    the write_stamp to be different if the write_stamp does not match the last
    event on the buffer, as it will be used to calculate the delta of the next
    event written on the buffer.

    The discard logic depends on this, as the next event to come in needs to
    inject a full timestamp as it can not rely on the last event timestamp in
    the buffer because it is unknown due to events after it being discarded. But
    by changing the write_stamp back to the time before it, it forces the next
    event to use a full time stamp, instead of relying on it.

    The issue came when a full time stamp was used for the event, and
    rb_time_delta() returns zero in that case. The update to the write_stamp
    (which subtracts delta) made it not change. Then when the event is removed
    from the buffer, because the before_stamp and write_stamp still match, the
    next event written would calculate its delta from the write_stamp, but that
    would be wrong as the write_stamp is of the time of the event that was
    discarded.

    In the case that the delta change being made to write_stamp is zero, set the
    before_stamp to zero as well, and this will force the next event to inject a
    full timestamp and not use the current write_stamp.

    Cc: stable@vger.kernel.org
    Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp")
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
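
    The interaction between discard and the two stamps can be followed in a
    deliberately simplified, single-threaded model. This is a sketch under
    strong assumptions, not the lockless kernel implementation: struct ring,
    ring_write(), ring_discard_last() and replay() are hypothetical names,
    there is no nesting or interruption, and timestamps are assumed to be
    non-zero. A write records a delta when before_stamp equals write_stamp
    and a full timestamp otherwise; a discard subtracts the event's delta,
    which is zero for a full timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_MAX_EVENTS 8

struct event {
    bool absolute;      /* true: value is a full timestamp */
    uint64_t value;     /* full timestamp, or delta from the previous event */
};

struct ring {
    uint64_t before_stamp;
    uint64_t write_stamp;
    struct event events[RING_MAX_EVENTS];
    int count;
};

static void ring_write(struct ring *rb, uint64_t now)
{
    struct event *e;

    assert(rb->count < RING_MAX_EVENTS);
    e = &rb->events[rb->count++];
    if (rb->before_stamp == rb->write_stamp) {
        e->absolute = false;            /* write_stamp is trusted */
        e->value = now - rb->write_stamp;
    } else {
        e->absolute = true;             /* inject a full timestamp */
        e->value = now;
    }
    rb->before_stamp = now;
    rb->write_stamp = now;
}

static void ring_discard_last(struct ring *rb, bool apply_fix)
{
    struct event *e;
    uint64_t delta;

    assert(rb->count > 0);
    e = &rb->events[--rb->count];
    delta = e->absolute ? 0 : e->value; /* zero for a full timestamp */

    rb->write_stamp -= delta;
    if (apply_fix && delta == 0)
        rb->before_stamp = 0;           /* force the stamps to differ */
}

/* What a reader reconstructs as the time of the last event. */
static uint64_t ring_last_time(const struct ring *rb)
{
    uint64_t t = 0;

    for (int i = 0; i < rb->count; i++)
        t = rb->events[i].absolute ? rb->events[i].value
                                   : t + rb->events[i].value;
    return t;
}

/* Write, discard a delta event, discard a full-stamp event, write again. */
static uint64_t replay(bool apply_fix)
{
    struct ring rb = {0};

    ring_write(&rb, 100);
    ring_write(&rb, 150);
    ring_discard_last(&rb, apply_fix);  /* stamps now differ */
    ring_write(&rb, 200);               /* recorded as a full timestamp */
    ring_discard_last(&rb, apply_fix);  /* delta is zero here */
    ring_write(&rb, 300);
    return ring_last_time(&rb);
}
```

    In this model replay(false) makes the reader see the event written at
    300 as happening at 200, because its delta was taken from the discarded
    event; replay(true) yields 300.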
     
  • It's "cond_resched()" not "cond_sched()".

    Link: https://lkml.kernel.org/r/1863065.aFVDpXsuPd@devpool47

    Signed-off-by: Rolf Eike Beer
    Signed-off-by: Steven Rostedt (VMware)

    Rolf Eike Beer
     
  • A declaration of function "int trace_empty(struct trace_iterator *iter)"
    shows up twice in the header file kernel/trace/trace.h

    Link: https://lkml.kernel.org/r/20210304092348.208033-1-y.karadz@gmail.com

    Signed-off-by: Yordan Karadzhov (VMware)
    Signed-off-by: Steven Rostedt (VMware)

    Yordan Karadzhov (VMware)
     

01 Mar, 2021

1 commit

  • Pull more block updates from Jens Axboe:
    "A few stragglers (and one due to me missing it originally), and fixes
    for changes in this merge window mostly. In particular:

    - blktrace cleanups (Chaitanya, Greg)

    - Kill dead blk_pm_* functions (Bart)

    - Fixes for the bio alloc changes (Christoph)

    - Fix for the partition changes (Christoph, Ming)

    - Fix for turning off iopoll with polled IO inflight (Jeffle)

    - nbd disconnect fix (Josef)

    - loop fsync error fix (Mauricio)

    - kyber update depth fix (Yang)

    - max_sectors alignment fix (Mikulas)

    - Add bio_max_segs helper (Matthew)"

    * tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
    block: Add bio_max_segs
    blktrace: fix documentation for blk_fill_rw()
    block: memory allocations in bounce_clone_bio must not fail
    block: remove the gfp_mask argument to bounce_clone_bio
    block: fix bounce_clone_bio for passthrough bios
    block-crypto-fallback: use a bio_set for splitting bios
    block: fix logging on capacity change
    blk-settings: align max_sectors on "logical_block_size" boundary
    block: reopen the device in blkdev_reread_part
    block: don't skip empty device in in disk_uevent
    blktrace: remove debugfs file dentries from struct blk_trace
    nbd: handle device refs for DESTROY_ON_DISCONNECT properly
    kyber: introduce kyber_depth_updated()
    loop: fix I/O error on fsync() in detached loop devices
    block: fix potential IO hang when turning off io_poll
    block: get rid of the trace rq insert wrapper
    blktrace: fix blk_rq_merge documentation
    blktrace: fix blk_rq_issue documentation
    blktrace: add blk_fill_rwbs documentation comment
    block: remove superfluous param in blk_fill_rwbs()
    ...

    Linus Torvalds
     

27 Feb, 2021

1 commit

  • Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3.

    This patchset adds a tracepoint, error_report_end, that is to be used by
    KFENCE, KASAN, and potentially other bug detection tools, when they print
    an error report. One of the possible use cases is userspace collection of
    kernel error reports: interested parties can subscribe to the tracing
    event via tracefs, and get notified when an error report occurs.

    This patch (of 3):

    Introduce error_report_end tracepoint. It can be used in debugging tools
    like KASAN, KFENCE, etc. to provide extensions to the error reporting
    mechanisms (e.g. allow tests to hook into error reporting, ease error report
    collection from production kernels). Another benefit would be making use
    of ftrace for debugging or benchmarking the tools themselves.

    Should we need it, the tracepoint name leaves us with the possibility to
    introduce a complementary error_report_start tracepoint in the future.

    Link: https://lkml.kernel.org/r/20210121131915.1331302-1-glider@google.com
    Link: https://lkml.kernel.org/r/20210121131915.1331302-2-glider@google.com
    Signed-off-by: Alexander Potapenko
    Suggested-by: Marco Elver
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Ingo Molnar
    Cc: Petr Mladek
    Cc: Steven Rostedt
    Cc: Sergey Senozhatsky
    Cc: Greg Kroah-Hartman
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     

24 Feb, 2021

6 commits

  • Add the missing ":" after the rwbs function parameter documentation. This fixes
    the following warning:

    ./kernel/trace/blktrace.c:1877: warning: Function parameter or member 'rwbs' not described in 'blk_fill_rwbs'

    Reported-by: Stephen Rothwell
    Fixes: 1f83bb4b4914 ("blktrace: add blk_fill_rwbs documentation comment")
    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Jens Axboe

    Chaitanya Kulkarni
     
  • Pull more clang LTO updates from Kees Cook:
    "Clang LTO x86 enablement.

    Full disclosure: while this has _not_ been in linux-next (since it
    initially looked like the objtool dependencies weren't going to make
    v5.12), it has been under daily build and runtime testing by Sami for
    quite some time. These x86 portions have been discussed on lkml, with
    Peter, Josh, and others helping nail things down.

    The bulk of the changes are to get objtool working happily. The rest
    of the x86 enablement is very small.

    Summary:

    - Generate __mcount_loc in objtool (Peter Zijlstra)

    - Support running objtool against vmlinux.o (Sami Tolvanen)

    - Clang LTO enablement for x86 (Sami Tolvanen)"

    Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
    Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/

    * tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    kbuild: lto: force rebuilds when switching CONFIG_LTO
    x86, build: allow LTO to be selected
    x86, cpu: disable LTO for cpu.c
    x86, vdso: disable LTO only for vDSO
    kbuild: lto: postpone objtool
    objtool: Split noinstr validation from --vmlinux
    x86, build: use objtool mcount
    tracing: add support for objtool mcount
    objtool: Don't autodetect vmlinux.o
    objtool: Fix __mcount_loc generation with Clang's assembler
    objtool: Add a pass for generating __mcount_loc

    Linus Torvalds
     
  • This change adds build support for using objtool to generate
    __mcount_loc sections.

    Signed-off-by: Sami Tolvanen

    Sami Tolvanen
     
  • Pull module updates from Jessica Yu:

    - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These
    export types were introduced between 2006 - 2008. All of the
    unused symbols have been long removed and gpl future symbols were
    converted to gpl quite a long time ago, and I don't believe these
    export types have been used ever since. So, I think it should be safe
    to retire those export types now (Christoph Hellwig)

    - Refactor and clean up some aged code cruft in the module loader
    (Christoph Hellwig)

    - Build {,module_}kallsyms_on_each_symbol only when livepatching is
    enabled, as it is the only caller (Christoph Hellwig)

    - Unexport find_module() and module_mutex and fix the last module
    callers to not rely on these anymore. Make module_mutex internal to
    the module loader (Christoph Hellwig)

    - Harden ELF checks on module load and validate ELF structures before
    checking the module signature (Frank van der Linden)

    - Fix undefined symbol warning for clang (Fangrui Song)

    - Fix smatch warning (Dan Carpenter)

    * tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    module: potential uninitialized return in module_kallsyms_on_each_symbol()
    module: remove EXPORT_UNUSED_SYMBOL*
    module: remove EXPORT_SYMBOL_GPL_FUTURE
    module: move struct symsearch to module.c
    module: pass struct find_symbol_args to find_symbol
    module: merge each_symbol_section into find_symbol
    module: remove each_symbol_in_section
    module: mark module_mutex static
    kallsyms: only build {,module_}kallsyms_on_each_symbol when required
    kallsyms: refactor {,module_}kallsyms_on_each_symbol
    module: use RCU to synchronize find_module
    module: unexport find_module and module_mutex
    drm: remove drm_fb_helper_modinit
    powerpc/powernv: remove get_cxl_module
    module: harden ELF info handling
    module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols

    Linus Torvalds
     
  • Pull clang LTO updates from Kees Cook:
    "Clang Link Time Optimization.

    This is built on the work done preparing for LTO by arm64 folks,
    tracing folks, etc. This includes the core changes as well as the
    remaining pieces for arm64 (LTO has been the default build method on
    Android for about 3 years now, as it is the prerequisite for the
    Control Flow Integrity protections).

    While x86 LTO enablement is done, it depends on some pending objtool
    clean-ups. It's possible that I'll send a "part 2" pull request for
    LTO that includes x86 support.

    For merge log posterity, and as detailed in commit dc5723b02e52
    ("kbuild: add support for Clang LTO"), here is the tl;dr to do an LTO
    build:

    make LLVM=1 LLVM_IAS=1 defconfig
    scripts/config -e LTO_CLANG_THIN
    make LLVM=1 LLVM_IAS=1

    (To do a cross-compile of arm64, add "CROSS_COMPILE=aarch64-linux-gnu-"
    and "ARCH=arm64" to the "make" command lines.)

    Summary:

    - Clang LTO build infrastructure and arm64-specific enablement (Sami
    Tolvanen)

    - Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)"

    * tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    kbuild: prevent CC_FLAGS_LTO self-bloating on recursive rebuilds
    arm64: allow LTO to be selected
    arm64: disable recordmcount with DYNAMIC_FTRACE_WITH_REGS
    arm64: vdso: disable LTO
    drivers/misc/lkdtm: disable LTO for rodata.o
    efi/libstub: disable LTO
    scripts/mod: disable LTO for empty.c
    modpost: lto: strip .lto from module names
    PCI: Fix PREL32 relocations for LTO
    init: lto: fix PREL32 relocations
    init: lto: ensure initcall ordering
    kbuild: lto: add a default list of used symbols
    kbuild: lto: merge module sections
    kbuild: lto: limit inlining
    kbuild: lto: fix module versioning
    kbuild: add support for Clang LTO
    tracing: move function tracer options to Kconfig

    Linus Torvalds
     
  • These debugfs dentries do not need to be saved for anything as the whole
    directory and everything in it is properly cleaned up when the parent
    directory is removed. So remove them from struct blk_trace and don't
    save them when created as it's not necessary.

    Cc: Jens Axboe
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    Cc: linux-block@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Jens Axboe

    Greg Kroah-Hartman
     

23 Feb, 2021

1 commit

  • Pull tracing updates from Steven Rostedt:

    - Update to the way irqs and preemption are tracked via the trace event
    PC field

    - Fix handling of unregistering an event failing due to a memory
    allocation failure. This is only triggered by failure injection, as an
    allocation of less than a page is pretty much guaranteed to succeed.

    - Do not show the useless "filter" or "enable" files for the "ftrace"
    trace system, as they have no effect on doing anything.

    - Add a warning if kprobes are registered more than once.

    - Synthetic events now have their fields parsed by semicolons. Old
    formats without semicolons will still work, but new features will
    require them.

    - New option to allow trace events to show %p without hashing in trace
    file. The trace file can only be read by root, and reading the raw
    event buffer did not have any pointers hashed, so this does not
    expose anything new.

    - New directory in tools called tools/tracing, which holds a new tool that
    reads sequential latency reports from the ftrace latency tracers.

    - Other minor fixes and cleanups.

    * tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
    kprobes: Fix to delay the kprobes jump optimization
    tracing/tools: Add the latency-collector to tools directory
    tracing: Make hash-ptr option default
    tracing: Add ptr-hash option to show the hashed pointer value
    tracing: Update the stage 3 of trace event macro comment
    tracing: Show real address for trace event arguments
    selftests/ftrace: Add '!event' synthetic event syntax check
    selftests/ftrace: Update synthetic event syntax errors
    tracing: Add a backward-compatibility check for synthetic event creation
    tracing: Update synth command errors
    tracing: Rework synthetic event command parsing
    tracing/dynevent: Delegate parsing to create function
    kprobes: Warn if the kprobe is reregistered
    ftrace: Remove unused ftrace_force_update()
    tracepoints: Code clean up
    tracepoints: Do not punish non static call users
    tracepoints: Remove unnecessary "data_args" macro parameter
    tracing: Do not create "enable" or "filter" files for ftrace event subsystem
    kernel: trace: preemptirq_delay_test: add cpu affinity
    tracepoint: Do not fail unregistering a probe due to memory failure
    ...

    Linus Torvalds
     

22 Feb, 2021

3 commits

  • blk_fill_rwbs() is an exported function, add kernel style documentation
    comment.

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Damien Le Moal
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Chaitanya Kulkarni
     
  • The last parameter of the function blk_fill_rwbs() was added in
    5782138e47 ("tracing/events: convert block trace points to
    TRACE_EVENT()") in order to signal a read request. Use of that parameter
    was replaced with a switch case on REQ_OP_READ in
    1b9a9ab78b0 ("blktrace: use op accessors"), but the parameter was never
    removed.

    Remove the unused parameter and adjust the respective call sites.

    Fixes: 1b9a9ab78b0 ("blktrace: use op accessors")
    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Damien Le Moal
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Chaitanya Kulkarni
     
  • Pull core block updates from Jens Axboe:
    "Another nice round of removing more code than what is added, mostly
    due to Christoph's relentless pursuit of tech debt removal/cleanups.
    This pull request contains:

    - Two series of BFQ improvements (Paolo, Jan, Jia)

    - Block iov_iter improvements (Pavel)

    - bsg error path fix (Pan)

    - blk-mq scheduler improvements (Jan)

    - -EBUSY discard fix (Jan)

    - bvec allocation improvements (Ming, Christoph)

    - bio allocation and init improvements (Christoph)

    - Store bdev pointer in bio instead of gendisk + partno (Christoph)

    - Block trace point cleanups (Christoph)

    - hard read-only vs read-only split (Christoph)

    - Block based swap cleanups (Christoph)

    - Zoned write granularity support (Damien)

    - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)"

    * tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits)
    mm: simplify swapdev_block
    sd_zbc: clear zone resources for non-zoned case
    block: introduce blk_queue_clear_zone_settings()
    zonefs: use zone write granularity as block size
    block: introduce zone_write_granularity limit
    block: use blk_queue_set_zoned in add_partition()
    nullb: use blk_queue_set_zoned() to setup zoned devices
    nvme: cleanup zone information initialization
    block: document zone_append_max_bytes attribute
    block: use bi_max_vecs to find the bvec pool
    md/raid10: remove dead code in reshape_request
    block: mark the bio as cloned in bio_iov_bvec_set
    block: set BIO_NO_PAGE_REF in bio_iov_bvec_set
    block: remove a layer of indentation in bio_iov_iter_get_pages
    block: turn the nr_iovecs argument to bio_alloc* into an unsigned short
    block: remove the 1 and 4 vec bvec_slabs entries
    block: streamline bvec_alloc
    block: factor out a bvec_alloc_gfp helper
    block: move struct biovec_slab to bio.c
    block: reuse BIO_INLINE_VECS for integrity bvecs
    ...

    Linus Torvalds
     

21 Feb, 2021

1 commit

  • Pull networking updates from David Miller:
    "Here is what we have this merge window:

    1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik.

    2) Add RSS multi group support to octeontx2-pf driver, from Geetha
    Sowjanya.

    3) Add support for KS8851 PHY. From Marek Vasut.

    4) Add support for GarfieldPeak bluetooth controller from Kiran K.

    5) Add support for half-duplex tcan4x5x can controllers.

    6) Add batch skb rx processing to bcm63xx_enet, from Sieng Piaw
    Liew.

    7) Rework RX port offload infrastructure, particularly wrt. UDP
    tunneling, from Jakub Kicinski.

    8) Add BCM72116 PHY support, from Florian Fainelli.

    9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir
    Oltean.

    10) Add support for picosecond rx delay in dwmac-meson8b chips. From
    Martin Blumenstingl.

    11) Support TSO on xfrm interfaces from Eyal Birger.

    12) Add support for MP_PRIO to mptcp stack, from Geliang Tang.

    13) Support BCM4908 integrated switch, from Rafał Miłecki.

    14) Support for directly accessing kernel module variables via module
    BTF info, from Andrii Nakryiko.

    15) Add DASH (Desktop and mobile Architecture for System Hardware)
    support to r8169 driver, from Heiner Kallweit.

    16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron.

    17) Add support for 100 base0x SFP devices, from Bjarni Jonasson.

    18) Support link aggregation in DSA, from Tobias Waldekranz.

    19) Support for bitwise atomics in bpf, from Brendan Jackman.

    20) SmartEEE support in at803x driver, from Russell King.

    21) Add support for flow based tunneling to GTP, from Pravin B Shelar.

    22) Allow arbitrary number of interconnects in ipa, from Alex Elder.

    23) TLS RX offload for bonding, from Tariq Toukan.

    24) RX decap offload support in mac80211, from Felix Fietkau.

    25) devlink health support in octeontx2-af, from George Cherian.

    26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung

    27) Delegated actions support in mptcp, from Paolo Abeni.

    28) Support receive timestamping when doing zerocopy tcp receive. From
    Arjun Ray.

    29) HTB offload support for mlx5, from Maxim Mikityanskiy.

    30) UDP GRO forwarding, from Maxim Mikityanskiy.

    31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach.

    32) Weighted random twos choice algorithm for ipvs, from Darby Payne.

    33) Fix netdev registration deadlock, from Johannes Berg.

    34) Various conversions to new tasklet api, from Emil Renner Berthing.

    35) Bulk skb allocations in veth, from Lorenzo Bianconi.

    36) New ethtool interface for lane setting, from Danielle Ratson.

    37) Offload failure notifications for routes, from Amit Cohen.

    38) BCM4908 support, from Rafał Miłecki.

    39) Support several new iwlwifi chips, from Ihab Zhaika.

    40) Flow director support for ipv6 in i40e, from Przemyslaw Patynowski.

    41) Support for mhi protocols, from Loic Poulain.

    42) Optimize bpf program stats.

    43) Implement RFC6056, for better port randomization, from Eric
    Dumazet.

    44) hsr tag offloading support from George McCollister.

    45) Netpoll support in qede, from Bhaskar Upadhaya.

    46) 200/400g speed support in bonding 3ad mode, from Nikolay
    Aleksandrov.

    47) Netlink event support in mptcp, from Florian Westphal.

    48) Better skbuff caching, from Alexander Lobakin.

    49) MRP (Media Redundancy Protocol) offloading in DSA and a few
    drivers, from Horatiu Vultur.

    50) mqprio support in mvneta, from Maxime Chevallier.

    51) Remove of_phy_attach, no longer needed, from Florian Fainelli"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits)
    octeontx2-pf: Fix otx2_get_fecparam()
    octeontx2-pf: cn10k: Prevent harmless double shift bugs
    net: stmmac: Add PCI bus info to ethtool driver query output
    ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
    ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
    ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
    ptp: ptp_clockmatrix: Clean-up dev_*() messages.
    ptp: ptp_clockmatrix: Remove unused header declarations.
    ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
    ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
    net: stmmac: dwmac-sun8i: Add a shutdown callback
    net: stmmac: dwmac-sun8i: Minor probe function cleanup
    net: stmmac: dwmac-sun8i: Use reset_control_reset
    net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
    net: stmmac: dwmac-sun8i: Return void from PHY unpower
    r8169: use macro pm_ptr
    net: mdio: Remove of_phy_attach()
    net: mscc: ocelot: select PACKING in the Kconfig
    net: re-solve some conflicts after net -> net-next merge
    net: dsa: tag_rtl4_a: Support also egress tags
    ...

    Linus Torvalds
     

17 Feb, 2021

1 commit

  • Daniel Borkmann says:

    ====================
    pull-request: bpf-next 2021-02-16

    The following pull-request contains BPF updates for your *net-next* tree.

    There's a small merge conflict between 7eeba1706eba ("tcp: Add receive timestamp
    support for receive zerocopy.") from net-next tree and 9cacf81f8161 ("bpf: Remove
    extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:

    [...]
    lock_sock(sk);
    err = tcp_zerocopy_receive(sk, &zc, &tss);
    err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
    &zc, &len, err);
    release_sock(sk);
    [...]

    We've added 116 non-merge commits during the last 27 day(s) which contain
    a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).

    The main changes are:

    1) Adds support of pointers to types with known size among global function
    args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.

    2) Add bpf_iter for task_vma which can be used to generate information similar
    to /proc/pid/maps, from Song Liu.

    3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
    rewriting bind user ports from BPF side below the ip_unprivileged_port_start
    range, both from Stanislav Fomichev.

    4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
    as well as per-cpu maps for the latter, from Alexei Starovoitov.

    5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
    for sleepable programs, both from KP Singh.

    6) Extend verifier to enable variable offset read/write access to the BPF
    program stack, from Andrei Matei.

    7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
    query device MTU from programs, from Jesper Dangaard Brouer.

    8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
    tracing programs, from Florent Revest.

    9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
    otherwise too many passes are required, from Gary Lin.

    10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
    verification both related to zero-extension handling, from Ilya Leoshkevich.

    11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.

    12) Batch of AF_XDP selftest cleanups and small performance improvement
    for libbpf's xsk map redirect for newer kernels, from Björn Töpel.

    13) Follow-up BPF doc and verifier improvements around atomics with
    BPF_FETCH, from Brendan Jackman.

    14) Permit zero-sized data sections e.g. if ELF .rodata section contains
    read-only data from local variables, from Yonghong Song.

    15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

13 Feb, 2021

3 commits

  • task_file and task_vma iter programs have access to file->f_path. Enable
    bpf_d_path to print paths of these files.

    Signed-off-by: Song Liu
    Signed-off-by: Alexei Starovoitov
    Acked-by: Yonghong Song
    Link: https://lore.kernel.org/bpf/20210212183107.50963-3-songliubraving@fb.com

    Song Liu
     
  • Pull tracing fix from Steven Rostedt:
    "Fix buffer overflow in trace event filter.

    It was reported that if a trace event was larger than a page and was
    filtered, that it caused memory corruption. The reason is that
    filtered events first go into a buffer to test the filter before being
    written into the ring buffer. Unfortunately, this write did not check
    the size"

    * tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Check length before giving out the filter buffer

    Linus Torvalds
     
  • Since the original behavior of the trace events is to hash the %p pointers,
    make that the default, and require developers to enable the option in
    order to have them unhashed.

    Cc: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

12 Feb, 2021

4 commits

  • This needs a new helper that:
    - can work in a sleepable context (using sock_gen_cookie)
    - takes a struct sock pointer and checks that it's not NULL

    Signed-off-by: Florent Revest
    Signed-off-by: Alexei Starovoitov
    Acked-by: KP Singh
    Acked-by: Andrii Nakryiko
    Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org

    Florent Revest
     
  • Add tracefs/options/hash-ptr option to show hashed pointer
    value by %p in event printk format string.

    For security reasons, normal printk shows the hashed
    pointer value (encrypted by a random number) with %p in the printk
    buffer to hide the real address. But tracefs/trace always
    shows the real address for debugging. To bridge those outputs, add an
    option to switch the output format. Ftrace users can use it
    to find the hashed value corresponding to the real address
    in the trace log.

    Link: https://lkml.kernel.org/r/160277372504.29307.14909828808982012211.stgit@devnote2

    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     
  • To help debugging the kernel, show the real address for trace event arguments
    in tracefs/trace{,pipe} instead of the hashed pointer value.

    Since the ftrace human-readable format uses vsprintf(), all %p are
    translated to hash values instead of pointer addresses.

    However, when debugging the kernel, the raw address value gives a
    hint when comparing with the memory mapping in the kernel.
    (Those are sometimes used with the crash log, which is not hashed either.)
    So convert %p to %px when calling trace_seq_printf().

    Moreover, hashing does not improve security here, because tracefs
    can be used only by the root user and the raw address values are readable
    from the tracefs/percpu/cpu*/trace_pipe_raw file.

    Link: https://lkml.kernel.org/r/160277370703.29307.5134475491761971203.stgit@devnote2

    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     
  • When filters are used by trace events, a page is allocated on each CPU and
    used to copy the trace event fields to this page before writing to the ring
    buffer. The reason to use the filter and not write directly into the ring
    buffer is because a filter may discard the event and there's more overhead
    on discarding from the ring buffer than the extra copy.

    The problem here is that there is no check against the size being allocated
    when using this page. If an event asks for more than a page size while being
    filtered, it will get only a page, leading to the caller writing more than
    what was allocated.

    Check the length of the request, and if it is more than PAGE_SIZE minus the
    header, default back to allocating from the ring buffer directly. The ring
    buffer may reject the event if it's too big anyway, but it won't overflow.

    Link: https://lore.kernel.org/ath10k/1612839593-2308-1-git-send-email-wgong@codeaurora.org/

    Cc: stable@vger.kernel.org
    Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
    Reported-by: Wen Gong
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
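
The length check described in the commit above can be illustrated with a small userspace sketch. This is not the kernel code: the page size, the header size, and the names `pick_reserve_path`, `SKETCH_PAGE_SIZE`, and `SKETCH_HDR_SIZE` are assumptions made up for illustration. It only shows the decision rule stated in the message: a filtered event uses the temporary per-CPU page unless its length is more than the page size minus the header, in which case it falls back to the ring buffer.

```c
#include <stddef.h>

/* Assumed values for illustration only; the kernel uses PAGE_SIZE and
 * the real size of its ring buffer event header. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_HDR_SIZE  16u

enum reserve_path { USE_TEMP_PAGE, USE_RING_BUFFER };

/* Decide where an event of `len` bytes is staged.
 * Unfiltered events always go straight to the ring buffer.
 * Filtered events use the temp page only when they fit in it. */
static enum reserve_path pick_reserve_path(int filtered, size_t len)
{
    if (filtered && len <= SKETCH_PAGE_SIZE - SKETCH_HDR_SIZE)
        return USE_TEMP_PAGE;

    /* Too large for the temp page (or no filter): let the ring
     * buffer accept or reject the event itself. */
    return USE_RING_BUFFER;
}
```

With these assumed sizes, a filtered 100-byte event is staged in the temp page, while a filtered 4096-byte event is sent to the ring buffer path instead of overrunning the page.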
     

11 Feb, 2021

1 commit

  • Pull networking fixes from David Miller:
    "Another pile of networking fixes:

    1) ath9k build error fix from Arnd Bergmann

    2) dma memory leak fix in mediatek driver from Lorenzo Bianconi.

    3) bpf int3 kprobe fix from Alexei Starovoitov.

    4) bpf stackmap integer overflow fix from Bui Quang Minh.

    5) Add usb device ids for Cinterion MV31 to qmi_wwan driver, from
    Christoph Schemmel.

    6) Don't update deleted entry in xt_recent netfilter module, from
    Jozsef Kadlecsik.

    7) Use after free in nftables, fix from Pablo Neira Ayuso.

    8) Header checksum fix in flowtable from Sven Auhagen.

    9) Validate user controlled length in qrtr code, from Sabyrzhan
    Tasbolatov.

    10) Fix race in xen/netback, from Juergen Gross.

    11) New device ID in cxgb4, from Raju Rangoju.

    12) Fix ring locking in rxrpc release call, from David Howells.

    13) Don't return LAPB error codes from x25_open(), from Xie He.

    14) Missing error returns in gsi_channel_setup() from Alex Elder.

    15) Get skb_copy_and_csum_datagram working properly with odd segment
    sizes, from Willem de Bruijn.

    16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean.

    17) Do teardown on probe failure in DSA, from Vladimir Oltean.

    18) Fix compilation failures of txtimestamp selftest, from Vadim
    Fedorenko.

    19) Limit rx per-napi gro queue size to fix latency regression, from
    Eric Dumazet.

    20) dpaa_eth xdp fixes from Camelia Groza.

    21) Missing txq mode update when switching CBS off, in stmmac driver,
    from Mohammad Athari Bin Ismail.

    22) Failover pending logic fix in ibmvnic driver, from Sukadev
    Bhattiprolu.

    23) Null deref fix in vmw_vsock, from Norbert Slusarek.

    24) Missing verdict update in xdp paths of ena driver, from Shay
    Agroskin.

    25) seq_file iteration fix in sctp from Neil Brown.

    26) bpf 32-bit src register truncation fix on div/mod, from Daniel
    Borkmann.

    27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann.

    28) Fix locking in vsock_shutdown(), from Stefano Garzarella.

    29) Various missing index bound checks in hns3 driver, from Yufeng Mo.

    30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from
    Vladimir Oltean.

    31) Don't mix up stp and mrp port states in bridge layer, from Horatiu
    Vultur.

    32) Fix locking during netif_tx_disable(), from Edwin Peer"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
    bpf: Fix 32 bit src register truncation on div/mod
    bpf: Fix verifier jmp32 pruning decision logic
    bpf: Fix verifier jsgt branch analysis on max bound
    vsock: fix locking in vsock_shutdown()
    net: hns3: add a check for index in hclge_get_rss_key()
    net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
    net: hns3: add a check for queue_id in hclge_reset_vf_queue()
    net: dsa: felix: implement port flushing on .phylink_mac_link_down
    switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT
    bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state
    net: watchdog: hold device global xmit lock during tx disable
    netfilter: nftables: relax check for stateful expressions in set definition
    netfilter: conntrack: skip identical origin tuple in same zone only
    vsock/virtio: update credit only if socket is not closed
    net: fix iteration for sctp transport seq_files
    net: ena: Update XDP verdict upon failure
    net/vmw_vsock: improve locking in vsock_connect_timeout()
    net/vmw_vsock: fix NULL pointer dereference
    ibmvnic: Clear failover_pending if unable to schedule
    net: stmmac: set TxQ mode back to DCB after disabling CBS
    ...

    Linus Torvalds
     

10 Feb, 2021

4 commits

  • The synthetic event parsing rework now requires semicolons between
    synthetic event fields. That requirement breaks existing users who
    might already have used the old synthetic event command format, so
    this adds an inner loop that can parse more than one field, if
    present, between semicolons. For each field, parse_synth_field()
    checks in which version that field was introduced, using
    check_field_version(). The caller, __create_synth_event(), can then use
    that version information to determine whether or not to enforce the
    requirement on the command as a whole.

    In the future, if/when new features are added, the requirement will be
    that any field/string containing the new feature must use semicolons,
    and the check_field_version() check can then check for those and
    enforce it. Using a version number allows this scheme to be extended
    if necessary.

    Link: https://lkml.kernel.org/r/74fcc500d561b40ce91c5ee94818c70c6b0c9330.1612208610.git.zanussi@kernel.org

    [ zanussi: added check_field_version() comment from rostedt@goodmis.org ]
    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt (VMware)

    Tom Zanussi
     
  • Since array types are handled differently, errors referencing them
    also need to be handled differently. Add and use a new
    INVALID_ARRAY_SPEC error. Also add INVALID_CMD and INVALID_DYN_CMD to
    catch and display the correct form for badly-formed commands, which
    can also be used in place of CMD_INCOMPLETE, which is removed, and
    remove CMD_TOO_LONG, since it's no longer used.

    Link: https://lkml.kernel.org/r/b9dd434dc6458dcff11adc6ed616fe93a8794770.1612208610.git.zanussi@kernel.org

    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt (VMware)

    Tom Zanussi
     
  • Now that command parsing has been delegated to the create functions
    and we're no longer constrained by argv_split(), we can modify the
    synthetic event command parser to better match the higher-level
    structure of the synthetic event commands, which is basically an event
    name followed by a set of semicolon-separated fields.

    Since we're also now passed the raw command, we can also save it
    directly and can get rid of save_cmdstr().

    Link: https://lkml.kernel.org/r/cb9e2be92d992ce59f2b4f132264a5d467f3933f.1612208610.git.zanussi@kernel.org

    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt (VMware)

    Tom Zanussi
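
The command structure named in the commit above (an event name followed by semicolon-separated fields) can be sketched in userspace as follows. This is a simplified illustration, not the kernel parser: the function name `count_synth_fields` and the sample event name used in the note below are hypothetical, and the sketch only counts fields rather than validating types or names.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the non-empty ';'-separated fields that follow the event name
 * in a command such as "name type1 field1; type2 field2". */
static size_t count_synth_fields(const char *cmd)
{
    const char *p = cmd;
    size_t n = 0;

    /* Skip leading blanks, then the event name itself. */
    while (isspace((unsigned char)*p))
        p++;
    while (*p && !isspace((unsigned char)*p))
        p++;

    /* Walk the remainder one ';'-delimited chunk at a time. */
    while (*p) {
        int has_text = 0;

        while (*p && *p != ';') {
            if (!isspace((unsigned char)*p))
                has_text = 1;
            p++;
        }
        n += has_text;      /* ignore chunks that are only whitespace */
        if (*p == ';')
            p++;
    }
    return n;
}
```

For a command like "wakeup_lat u64 lat; pid_t pid; char comm[16]" the sketch finds three fields; a trailing semicolon does not add an empty one.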
     
  • Delegate command parsing to each create function so that the
    command syntax can be customized.

    This requires changes to the kprobe/uprobe/synthetic event handling,
    which are also included here.

    Link: https://lkml.kernel.org/r/e488726f49cbdbc01568618f8680584306c4c79f.1612208610.git.zanussi@kernel.org

    Signed-off-by: Masami Hiramatsu
    [ zanussi@kernel.org: added synthetic event modifications ]
    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     

08 Feb, 2021

1 commit

  • Allow for a RCU-sched critical section around find_module, following
    the lower level find_module_all helper, and switch the two callers
    outside of module.c to use such a RCU-sched critical section instead
    of module_mutex.

    Reviewed-by: Petr Mladek
    Acked-by: Miroslav Benes
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jessica Yu

    Christoph Hellwig
     

06 Feb, 2021

2 commits

  • The ftrace event subsystem is only created for showing the format files of
    events created by the ftrace tracers, which are not trace events. The ftrace
    subsystem currently has both the "enable" and "filter" files that in other
    subsystems are used to enable/disable all events within the subsystem or set
    a filter for all the subsystem events.

    As ftrace subsystem events do not use enable or filter operations, these
    files are useless in the ftrace subsystem. Remove them.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • The file /sys/kernel/tracing/events/enable is used to enable all events by
    echoing in "1", or to disable all events by echoing in "0". To know if all
    events are enabled, disabled, or some are enabled but not all of them,
    reading the file should show either "1" (all enabled), "0" (all disabled), or
    "X" (some enabled but not all of them). This works the same as the "enable"
    files in the individual system directories (like tracing/events/sched/enable).

    But when all events are enabled, the top level "enable" file shows "X". The
    reason is that it's checking the "ftrace" events, which are special events
    that only exist for their format files. These include the format for the
    function tracer events, that are enabled when the function tracer is
    enabled, but not by the "enable" file. The check includes these events,
    which will always be disabled, and even though all true events are enabled,
    the top level "enable" file will show "X" instead of "1".

    To fix this, have the check test the event's flags to see if it has the
    "IGNORE_ENABLE" flag set, and if so, not test it.

    Cc: stable@vger.kernel.org
    Fixes: 553552ce1796c ("tracing: Combine event filter_active and enable into single flags field")
    Reported-by: "Yordan Karadzhov (VMware)"
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
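
The "1" / "0" / "X" rule and the fix described in the commit above can be sketched in userspace. This is an illustration under assumptions, not the kernel implementation: the struct `sketch_event`, the flag bits `EV_ENABLED` and `EV_IGNORE_ENABLE`, and the function `enable_status` are hypothetical stand-ins for the kernel's event flags and status check.

```c
#include <stddef.h>

/* Hypothetical flag bits standing in for the kernel's event flags. */
#define EV_ENABLED       (1u << 0)
#define EV_IGNORE_ENABLE (1u << 1)

struct sketch_event {
    unsigned int flags;
};

/* Return '1' if every counted event is enabled, '0' if none is,
 * and 'X' if the counted events are mixed. Events flagged with
 * EV_IGNORE_ENABLE are skipped, which is the behavior the fix adds. */
static char enable_status(const struct sketch_event *ev, size_t n)
{
    int seen_on = 0, seen_off = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].flags & EV_IGNORE_ENABLE)
            continue;       /* format-only events do not count */
        if (ev[i].flags & EV_ENABLED)
            seen_on = 1;
        else
            seen_off = 1;
    }

    if (seen_on && seen_off)
        return 'X';
    return seen_on ? '1' : '0';
}
```

Without the `EV_IGNORE_ENABLE` skip, a set of fully enabled events plus one format-only event would report 'X', which is the symptom the commit describes.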
     

03 Feb, 2021

5 commits

  • The commit 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
    converted the do_int3 handler to be "NMI-like".
    That made the old if (in_nmi()) check abort execution of bpf programs
    attached to a kprobe when the kprobe fires via int3
    (for example, when the kprobe is placed in the middle of a function).
    Remove the check to restore user-visible behavior.

    Fixes: 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
    Reported-by: Nikolay Borisov
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Tested-by: Nikolay Borisov
    Reviewed-by: Masami Hiramatsu
    Link: https://lore.kernel.org/bpf/20210203070636.70926-1-alexei.starovoitov@gmail.com

    Alexei Starovoitov
     
  • The kernel thread executing the test can run on any cpu, which might be
    a different cpu from the one the latency tracer is running on. As a
    result, the big latency caused by the preemptirq delay test can't be
    detected.

    Therefore, add the argument cpu_affinity, which is passed to the test
    to ensure it runs on the same cpu as the latency tracer.

    e.g.
    cyclictest -p 90 -m -c 0 -i 1000 -a 3
    modprobe preemptirq_delay_test test_mode=preempt delay=500 \
    burst_size=3 cpu_affinity=3

    Link: https://lkml.kernel.org/r/1611797713-20965-1-git-send-email-chensong_2000@189.cn

    Signed-off-by: Song Chen
    Signed-off-by: Steven Rostedt (VMware)

    Song Chen
     
  • Defining DEBUG should only be done in development.
    So remove DEBUG.

    Link: https://lkml.kernel.org/r/20210115153348.131791-1-trix@redhat.com

    Signed-off-by: Tom Rix
    Reviewed-by: Karol Herbst
    Signed-off-by: Steven Rostedt (VMware)

    Tom Rix
     
  • Add description for trace_array_put() parameter.

    kernel/trace/trace.c:464: warning: Function parameter or member 'this_tr' not described in 'trace_array_put'

    Link: https://lkml.kernel.org/r/20210112111202.23508-1-huobean@gmail.com

    Signed-off-by: Bean Huo
    [ Merged as one of the original fixes was already fixed by someone else ]
    Signed-off-by: Steven Rostedt (VMware)

    Bean Huo
     
  • s/controling/controlling/p

    Link: https://lkml.kernel.org/r/20210112045008.29834-1-unixbhaskar@gmail.com

    Signed-off-by: Bhaskar Chowdhury
    Acked-by: Randy Dunlap
    Signed-off-by: Steven Rostedt (VMware)

    Bhaskar Chowdhury