15 Mar, 2021

2 commits

  • Extend psample to report the following attributes when available:

    * Output traffic class as a 16-bit value
    * Output traffic class occupancy in bytes as a 64-bit value
    * End-to-end latency of the packet in nanosecond resolution
    * Software timestamp in nanosecond resolution (always available)
    * Packet's protocol. Needed for packet dissection in user space (always
      available)

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Currently, callers of psample_sample_packet() pass three metadata
    attributes: ingress port, egress port and truncated size. Subsequent
    patches are going to add more attributes (e.g., egress queue occupancy),
    which also need an indication of whether they are valid.

    Encapsulate packet metadata in a struct in order to keep the number of
    arguments reasonable.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
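
A minimal user-space sketch of the pattern the two commits above describe: the per-packet metadata travels in one struct, and each optional field carries its own validity bit. The struct name, field names and the helper `demo_count_optional_attrs()` are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the metadata struct; not the kernel definition. */
struct demo_psample_metadata {
	uint32_t trunc_size;
	int in_ifindex;
	int out_ifindex;
	uint16_t out_tc;        /* output traffic class */
	uint64_t out_tc_occ;    /* traffic class occupancy, in bytes */
	uint64_t latency;       /* end-to-end latency, in nanoseconds */
	unsigned int out_tc_valid:1;
	unsigned int out_tc_occ_valid:1;
	unsigned int latency_valid:1;
};

/* Count how many optional attributes would be reported for one sample. */
static int demo_count_optional_attrs(const struct demo_psample_metadata *md)
{
	return md->out_tc_valid + md->out_tc_occ_valid + md->latency_valid;
}
```

A caller would fill in only the fields its hardware provides and set the matching `*_valid` bit, so adding a new attribute later does not change any function signature.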
     

26 Feb, 2021

1 commit

  • Currently, the psample netlink skb is allocated with a size that does
    not account for the nested 'PSAMPLE_ATTR_TUNNEL' attribute and the
    padding required for the 64-bit attribute 'PSAMPLE_TUNNEL_KEY_ATTR_ID'.
    This can result in failure to add attributes to the netlink skb due
    to insufficient tail room. The following error message is printed to
    the kernel log: "Could not create psample log message".

    Fix this by adjusting the allocation size to take into account the
    nested attribute and the padding.

    Fixes: d8bed686ab96 ("net: psample: Add tunnel support")
    CC: Yotam Gigi
    Reviewed-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: Chris Mi
    Link: https://lore.kernel.org/r/20210225075145.184314-1-cmi@nvidia.com
    Signed-off-by: Jakub Kicinski

    Chris Mi
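
The size arithmetic behind this fix can be sketched in user space as follows. The `DEMO_*` macros and `demo_*` helpers are local stand-ins modeled on the netlink attribute layout (4-byte header, 4-byte alignment). The sketch assumes the worst case, where a zero-length pad attribute precedes a 64-bit attribute, and uses an 8-byte tunnel id as the only nested child.

```c
#include <assert.h>

#define DEMO_NLA_ALIGNTO 4
#define DEMO_NLA_ALIGN(len) \
	(((len) + DEMO_NLA_ALIGNTO - 1) & ~(DEMO_NLA_ALIGNTO - 1))
#define DEMO_NLA_HDRLEN 4 /* attribute header: two 16-bit fields */

/* Header plus payload, rounded up to the attribute alignment. */
static int demo_nla_total_size(int payload)
{
	return DEMO_NLA_ALIGN(DEMO_NLA_HDRLEN + payload);
}

/* 64-bit attribute, assuming a zero-length pad attribute may precede it. */
static int demo_nla_total_size_64bit(int payload)
{
	return demo_nla_total_size(payload) + demo_nla_total_size(0);
}

/* A nest is itself an attribute whose payload is the sum of its children. */
static int demo_tunnel_nest_size(void)
{
	int children = demo_nla_total_size_64bit(8); /* 64-bit tunnel id */

	return demo_nla_total_size(children);
}
```

Under these assumptions the nest costs 20 bytes, of which only 8 are the tunnel id itself. An allocation that counts just the payload leaves the nest header, the attribute header and the pad attribute without room.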
     

28 Jan, 2021

1 commit

  • These Kconfig files are included from net/Kconfig, inside the
    if NET ... endif.

    Remove 'depends on NET', which is therefore already met.

    Signed-off-by: Masahiro Yamada
    Link: https://lore.kernel.org/r/20210125232026.106855-1-masahiroy@kernel.org
    Signed-off-by: Jakub Kicinski

    Masahiro Yamada
     

03 Oct, 2020

1 commit


24 May, 2020

1 commit

  • Fix psample build error when CONFIG_INET is not set/enabled by
    bracketing the tunnel code in #ifdef CONFIG_INET / #endif.

    ../net/psample/psample.c: In function ‘__psample_ip_tun_to_nlattr’:
    ../net/psample/psample.c:216:25: error: implicit declaration of function ‘ip_tunnel_info_opts’; did you mean ‘ip_tunnel_info_opts_set’? [-Werror=implicit-function-declaration]

    Signed-off-by: Randy Dunlap
    Cc: Yotam Gigi
    Cc: Cong Wang
    Signed-off-by: David S. Miller

    Randy Dunlap
     

22 May, 2020

1 commit

  • Currently, psample can only send the packet bits after decapsulation,
    so the tunnel information is lost. Add tunnel support.

    If the sampled packet has no tunnel info, the behavior is the same as
    before. If it has, add a nested metadata field named PSAMPLE_ATTR_TUNNEL
    and include the tunnel subfields if applicable.

    Increase the metadata length for sampled packets that carry tunnel info.
    If new subfields of tunnel info are included in the future, update the
    metadata length accordingly.

    Signed-off-by: Chris Mi
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Chris Mi
     

27 Nov, 2019

1 commit

  • We need to calculate the skb size correctly, otherwise we risk triggering
    skb_over_panic[1]. The issue is that data_len is added to the skb in a
    netlink attribute, but we don't account for its header size (nlattr, 4
    bytes) and alignment. We account for it correctly when calculating the
    total size in the > PSAMPLE_MAX_PACKET_SIZE comparison, but not when
    allocating after that. The fix is simple: use nla_total_size() for
    data_len when allocating.

    To reproduce:
    $ tc qdisc add dev eth1 clsact
    $ tc filter add dev eth1 egress matchall action sample rate 1 group 1 trunc 129
    $ mausezahn eth1 -b bcast -a rand -c 1 -p 129
    < skb_over_panic BUG(), tail is 4 bytes past skb->end >

    [1] Trace:
    [ 50.459526][ T3480] skbuff: skb_over_panic: text:(____ptrval____) len:196 put:136 head:(____ptrval____) data:(____ptrval____) tail:0xc4 end:0xc0 dev:
    [ 50.474339][ T3480] ------------[ cut here ]------------
    [ 50.481132][ T3480] kernel BUG at net/core/skbuff.c:108!
    [ 50.486059][ T3480] invalid opcode: 0000 [#1] PREEMPT SMP
    [ 50.489463][ T3480] CPU: 3 PID: 3480 Comm: mausezahn Not tainted 5.4.0-rc7 #108
    [ 50.492844][ T3480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
    [ 50.496551][ T3480] RIP: 0010:skb_panic+0x79/0x7b
    [ 50.498261][ T3480] Code: bc 00 00 00 41 57 4c 89 e6 48 c7 c7 90 29 9a 83 4c 8b 8b c0 00 00 00 50 8b 83 b8 00 00 00 50 ff b3 c8 00 00 00 e8 ae ef c0 fe 0b e8 2f df c8 fe 48 8b 55 08 44 89 f6 4c 89 e7 48 c7 c1 a0 22
    [ 50.504111][ T3480] RSP: 0018:ffffc90000447a10 EFLAGS: 00010282
    [ 50.505835][ T3480] RAX: 0000000000000087 RBX: ffff888039317d00 RCX: 0000000000000000
    [ 50.507900][ T3480] RDX: 0000000000000000 RSI: ffffffff812716e1 RDI: 00000000ffffffff
    [ 50.509820][ T3480] RBP: ffffc90000447a60 R08: 0000000000000001 R09: 0000000000000000
    [ 50.511735][ T3480] R10: ffffffff81d4f940 R11: 0000000000000000 R12: ffffffff834a22b0
    [ 50.513494][ T3480] R13: ffffffff82c10433 R14: 0000000000000088 R15: ffffffff838a8084
    [ 50.515222][ T3480] FS: 00007f3536462700(0000) GS:ffff88803eac0000(0000) knlGS:0000000000000000
    [ 50.517135][ T3480] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 50.518583][ T3480] CR2: 0000000000442008 CR3: 000000003b222000 CR4: 00000000000006e0
    [ 50.520723][ T3480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 50.522709][ T3480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 50.524450][ T3480] Call Trace:
    [ 50.525214][ T3480] skb_put.cold+0x1b/0x1b
    [ 50.526171][ T3480] psample_sample_packet+0x1d3/0x340
    [ 50.527307][ T3480] tcf_sample_act+0x178/0x250
    [ 50.528339][ T3480] tcf_action_exec+0xb1/0x190
    [ 50.529354][ T3480] mall_classify+0x67/0x90
    [ 50.530332][ T3480] tcf_classify+0x72/0x160
    [ 50.531286][ T3480] __dev_queue_xmit+0x3db/0xd50
    [ 50.532327][ T3480] dev_queue_xmit+0x18/0x20
    [ 50.533299][ T3480] packet_sendmsg+0xee7/0x2090
    [ 50.534331][ T3480] sock_sendmsg+0x54/0x70
    [ 50.535271][ T3480] __sys_sendto+0x148/0x1f0
    [ 50.536252][ T3480] ? tomoyo_file_ioctl+0x23/0x30
    [ 50.537334][ T3480] ? ksys_ioctl+0x5e/0xb0
    [ 50.540068][ T3480] __x64_sys_sendto+0x2a/0x30
    [ 50.542810][ T3480] do_syscall_64+0x73/0x1f0
    [ 50.545383][ T3480] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 50.548477][ T3480] RIP: 0033:0x7f35357d6fb3
    [ 50.551020][ T3480] Code: 48 8b 0d 18 90 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 d3 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 eb f6 ff ff 48 89 04 24
    [ 50.558547][ T3480] RSP: 002b:00007ffe0c7212c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [ 50.561870][ T3480] RAX: ffffffffffffffda RBX: 0000000001dac010 RCX: 00007f35357d6fb3
    [ 50.565142][ T3480] RDX: 0000000000000082 RSI: 0000000001dac2a2 RDI: 0000000000000003
    [ 50.568469][ T3480] RBP: 00007ffe0c7212f0 R08: 00007ffe0c7212d0 R09: 0000000000000014
    [ 50.571731][ T3480] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000082
    [ 50.574961][ T3480] R13: 0000000001dac2a2 R14: 0000000000000001 R15: 0000000000000003
    [ 50.578170][ T3480] Modules linked in: sch_ingress virtio_net
    [ 50.580976][ T3480] ---[ end trace 61a515626a595af6 ]---

    CC: Yotam Gigi
    CC: Jiri Pirko
    CC: Jamal Hadi Salim
    CC: Simon Horman
    CC: Roopa Prabhu
    Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
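
The arithmetic from the reproducer can be checked with a small user-space sketch. The helpers below are hypothetical stand-ins for the netlink sizing rules (4-byte attribute header, 4-byte alignment), not the kernel functions themselves.

```c
#include <assert.h>

#define DEMO_NLA_HDRLEN 4

/* Round up to the 4-byte netlink attribute alignment. */
static int demo_nla_align(int len)
{
	return (len + 3) & ~3;
}

/* Bytes an attribute with this payload actually occupies in the skb. */
static int demo_attr_bytes(int payload)
{
	return demo_nla_align(DEMO_NLA_HDRLEN + payload);
}

/* Bytes left unaccounted for when only the raw payload length is reserved. */
static int demo_unreserved_bytes(int payload)
{
	return demo_attr_bytes(payload) - payload;
}
```

For the 129-byte truncation in the reproducer, `demo_attr_bytes(129)` gives 136, which matches `put:136` in the trace, while reserving only the raw 129 bytes leaves 7 bytes unaccounted for. The trace reports the tail only 4 bytes past the end (0xc4 versus 0xc0); allocation rounding presumably absorbed the rest.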
     

16 Sep, 2019

1 commit

  • With the recent patch set that removed the rtnl lock dependency from the
    cls hardware offload API, the rtnl lock is only taken when reading action
    data and can be released after action-specific data is parsed into an
    intermediate representation. However, the sample action's psample group is
    passed by pointer without first obtaining a reference to it. This makes it
    possible to concurrently overwrite the action and deallocate the object
    pointed to by the psample_group pointer after the rtnl lock is released
    but before the driver has finished using the pointer.

    To prevent this race condition, obtain a reference to the psample group
    while it is used by the flow_action infra. Extend the psample API with the
    function psample_group_take(), which increments the psample group
    reference counter. Extend struct tc_action_ops with a new
    get_psample_group() API. Implement the API for action sample using
    psample_group_take() and the already existing psample_group_put() as a
    destructor. Use it in tc_setup_flow_action() to take a reference to the
    psample group pointed to by entry->sample.psample_group and release it in
    tc_cleanup_flow_action().

    Disable bh when taking psample_groups_lock. The lock is now taken while
    holding action tcf_lock that is used by data path and requires bh to be
    disabled, so doing the same for psample_groups_lock is necessary to
    preserve SOFTIRQ-irq-safety.

    Fixes: 918190f50eb6 ("net: sched: flower: don't take rtnl lock for cls hw offloads API")
    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
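
The take/put pairing described above can be sketched in single-threaded user-space C. The `demo_group` type and its helpers are hypothetical. The kernel versions additionally serialize on psample_groups_lock, which this sketch does not model.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical group object; locking is deliberately left out. */
struct demo_group {
	unsigned int group_num;
	unsigned int refcount;
};

static struct demo_group *demo_group_create(unsigned int group_num)
{
	struct demo_group *g = calloc(1, sizeof(*g));

	if (g) {
		g->group_num = group_num;
		g->refcount = 1; /* the creator's reference */
	}
	return g;
}

/* Pin the group for a second user, e.g. a driver holding the pointer. */
static void demo_group_take(struct demo_group *g)
{
	g->refcount++;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int demo_group_put(struct demo_group *g)
{
	if (--g->refcount == 0) {
		free(g);
		return 1;
	}
	return 0;
}
```

With the extra reference held, an overwrite of the action that drops the creator's reference no longer frees the group. The object goes away only when the last holder calls the put helper.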
     

29 Aug, 2019

1 commit

  • Action sample doesn't properly handle the psample_group pointer in the
    overwrite case. The following issues need to be fixed:

    - In tcf_sample_init(), RCU_INIT_POINTER() is used to set
      s->psample_group, even though we are neither setting the pointer to
      NULL nor preventing concurrent readers from accessing the pointer in
      some way. Use rcu_swap_protected() instead to safely reset the pointer.

    - The old value of s->psample_group is not released or deallocated in any
      way, which results in a resource leak. Use psample_group_put() on the
      non-NULL value obtained with rcu_swap_protected().

    - The function psample_group_put(), which releases the reference to the
      struct psample_group pointed to by the RCU pointer s->psample_group,
      doesn't respect the RCU grace period when deallocating it. Extend
      struct psample_group with an rcu head and use kfree_rcu() when freeing
      it.

    Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action")
    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
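
The "swap, then release the old value" step from the first two items can be sketched as below. This is a single-threaded illustration with hypothetical names. It does not model RCU publication or the grace period, which is what the third item addresses in the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct demo_group {
	int refcount;
};

struct demo_action {
	struct demo_group *group; /* readers would dereference this */
};

/* Install new_group and hand the previous pointer back to the caller. */
static struct demo_group *demo_swap_group(struct demo_action *act,
					  struct demo_group *new_group)
{
	struct demo_group *old = act->group;

	act->group = new_group;
	return old;
}

/* Overwrite path: swap first, then drop the old reference if there was one. */
static void demo_overwrite(struct demo_action *act,
			   struct demo_group *new_group)
{
	struct demo_group *old = demo_swap_group(act, new_group);

	if (old)
		old->refcount--;
}
```

Because the swap returns the previous pointer, the overwrite path always has the old group in hand and cannot leak it.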
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit


28 Apr, 2019

1 commit

  • Add options to strictly validate messages and dump messages. Sometimes
    validating dump messages non-strictly may be required, so add an option
    for that as well.

    Since none of this can really be applied to existing commands,
    set the options everywhere using the following spatch:

    @@
    identifier ops;
    expression X;
    @@
    struct genl_ops ops[] = {
    ...,
    {
    .cmd = X,
    + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
    ...
    },
    ...
    };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
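
The opt-out semantics can be illustrated with a small user-space sketch. The `DEMO_*` flags mirror the names in the spatch, but their numeric values and the `demo_genl_ops` struct are assumptions made for illustration only.

```c
#include <assert.h>

/* Local stand-ins mirroring the flag names in the commit; values assumed. */
enum demo_validate_flags {
	DEMO_DONT_VALIDATE_STRICT = 1 << 0,
	DEMO_DONT_VALIDATE_DUMP   = 1 << 1,
};

struct demo_genl_ops {
	int cmd;
	unsigned int validate; /* opt-out bits; 0 means fully strict */
};

/* Strict validation of regular requests applies unless opted out. */
static int demo_do_is_strict(const struct demo_genl_ops *ops)
{
	return !(ops->validate & DEMO_DONT_VALIDATE_STRICT);
}

/* Dump requests are validated unless the command opts out. */
static int demo_dump_is_validated(const struct demo_genl_ops *ops)
{
	return !(ops->validate & DEMO_DONT_VALIDATE_DUMP);
}
```

An existing command converted by the spatch carries both opt-out bits and keeps its old behavior. A new command that leaves `.validate` at zero gets strict validation by default.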
     

01 Nov, 2017

1 commit

  • For the time being I will be available at my private email address.
    Update both the MAINTAINERS file and the individual modules'
    MODULE_AUTHOR directives with the new address.

    Signed-off-by: Yotam Gigi
    Signed-off-by: Yuval Mintz
    Signed-off-by: David S. Miller

    Yotam Gigi
     

16 Jun, 2017

1 commit

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions (skb_put, __skb_put and pskb_put) return void *
    and remove all the casts across the tree, adding a (u8 *) cast only
    where the unsigned char pointer was used directly, all done with the
    following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    which actually doesn't cover pskb_put since there are only three
    users overall.

    A handful of stragglers were converted manually, notably a macro in
    drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
    instances in net/bluetooth/hci_sock.c. In the former file, I also
    had to fix one whitespace problem spatch introduced.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
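
Why a `void *` return removes the casts can be shown with a toy buffer in user space. `demo_buf` and `demo_put()` are hypothetical and only model the tail advance of a put operation, not a real skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy buffer; only models the tail that a put operation advances. */
struct demo_buf {
	uint8_t data[64];
	size_t len;
};

/* Reserve len bytes at the tail; returns void * so callers need no cast. */
static void *demo_put(struct demo_buf *b, size_t len)
{
	void *tail = b->data + b->len;

	assert(b->len + len <= sizeof(b->data)); /* stand-in for the overrun check */
	b->len += len;
	return tail;
}

static void demo_fill(struct demo_buf *b)
{
	uint8_t *p = demo_put(b, 2);       /* void * converts implicitly */

	p[0] = 0xab;
	p[1] = 0xcd;
	*(uint8_t *)demo_put(b, 1) = 0xff; /* cast only for a direct byte store */
}
```

The two statements in `demo_fill()` correspond to the two spatch rules: plain assignments lose their cast, and only a direct dereference of the return value needs an explicit byte-pointer cast.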
     

25 Jan, 2017

1 commit

  • Add a general way for kernel modules to sample packets, without being tied
    to any specific subsystem. This netlink channel can be used by tc,
    iptables, etc. and allows packet sampling in the kernel to be standardized.

    For every sampled packet, the psample module adds the following metadata
    fields:

    PSAMPLE_ATTR_IIFINDEX - the packet's input ifindex, if applicable

    PSAMPLE_ATTR_OIFINDEX - the packet's output ifindex, if applicable

    PSAMPLE_ATTR_ORIGSIZE - the packet's original size, in case it has been
    truncated during sampling

    PSAMPLE_ATTR_SAMPLE_GROUP - the packet's sample group, which is set by the
    user who initiated the sampling. This field allows the user to
    differentiate between several samplers working simultaneously and
    filter the packets relevant to them

    PSAMPLE_ATTR_GROUP_SEQ - sequence counter of last sent packet. The
    sequence is kept for each group

    PSAMPLE_ATTR_SAMPLE_RATE - the sampling rate used for sampling the packets

    PSAMPLE_ATTR_DATA - the actual packet bits

    The sampled packets are sent to the PSAMPLE_NL_MCGRP_SAMPLE multicast
    group. In addition, add the GET_GROUPS netlink command which allows the
    user to see the current sample groups, their refcount and sequence number.
    This command currently supports only netlink dump mode.

    Signed-off-by: Yotam Gigi
    Signed-off-by: Jiri Pirko
    Reviewed-by: Jamal Hadi Salim
    Reviewed-by: Simon Horman
    Signed-off-by: David S. Miller

    Yotam Gigi
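
How such metadata fields sit next to the packet bits in a message can be sketched as a sequence of type-length-value attributes. The code below is a user-space illustration: the `DEMO_ATTR_*` ids and their numeric values are assumptions, and `demo_put_attr()` is a hypothetical helper that follows the netlink attribute layout (16-bit length, 16-bit type, payload padded to 4 bytes).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative attribute ids; the numeric values are assumptions. */
enum demo_attr {
	DEMO_ATTR_IIFINDEX,
	DEMO_ATTR_SAMPLE_GROUP,
	DEMO_ATTR_DATA,
};

/* Same layout as a netlink attribute header: length, then type. */
struct demo_nlattr {
	uint16_t nla_len;  /* header + payload, excluding padding */
	uint16_t nla_type;
};

/* Append one attribute; returns the new offset, or -1 if it does not fit. */
static int demo_put_attr(uint8_t *buf, int cap, int off, uint16_t type,
			 const void *payload, uint16_t len)
{
	struct demo_nlattr hdr = { (uint16_t)(sizeof(hdr) + len), type };
	int total = ((int)sizeof(hdr) + len + 3) & ~3;

	if (off + total > cap)
		return -1;
	memset(buf + off, 0, total);           /* zero the padding bytes */
	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + sizeof(hdr), payload, len);
	return off + total;
}
```

A 4-byte sample group followed by 5 bytes of packet data occupies 8 + 12 = 20 bytes. Each attribute pays for its header and its padding, which is the overhead the later sizing fixes in this history account for.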