15 Jan, 2020

4 commits

  • Allow hardware drivers to call macsec_pn_wrapped to notify when a
    PN rolls over. Some drivers might use an interrupt to implement this.

    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • MACsec offloading to underlying hardware devices is disabled by default
    (the software implementation is used). This patch adds support for
    changing this setting through the MACsec netlink interface. Many checks
    are done when enabling offloading on a given MACsec interface as there
    are limitations (it must be supported by the hardware, only a single
    interface can be offloaded on a given physical device at a time, rules
    can't be moved for now).

    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • This patch introduces the MACsec hardware offloading infrastructure.

    The main idea here is to re-use the logic and data structures of the
    software MACsec implementation. This avoids duplicating definitions
    and structures storing the same kind of information. It also allows
    using a unified genetlink interface for both MACsec implementations (so
    that the same userspace tool, `ip macsec`, is used with the same
    arguments). At the moment, MACsec offloading cannot be disabled if an
    interface supports it.

    The MACsec configuration is passed to device drivers supporting it
    through macsec_ops which are called from the MACsec genl helpers. Those
    functions call the macsec ops of PHY and Ethernet drivers in two steps:
    a preparation one, and a commit one. The first step is allowed to fail
    and should be used to check if a provided configuration is compatible
    with the features provided by a MACsec engine, while the second step is
    not allowed to fail and should only be used to enable a given MACsec
    configuration. Two extra calls are made: when a virtual MACsec interface
    is created and when it is deleted, so that the hardware driver can stay
    in sync.
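
    The two-step call described above can be modelled in a few lines. This
    is only a userspace sketch, not the kernel code: ToyMacsecEngine, its
    max_sas limit, the dict-based context and the -1 return value are all
    hypothetical stand-ins for a driver, its hardware limits, the offload
    context and a negative errno.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMacsecEngine:
    """Hypothetical offload engine with a fixed number of SA slots."""
    max_sas: int = 2
    active_sas: list = field(default_factory=list)

    def add_txsa(self, ctx):
        if ctx["prepare"]:
            # Preparation step: validate only, do not touch engine state.
            if len(self.active_sas) >= self.max_sas:
                return -1  # stands in for a negative errno
            return 0
        # Commit step: must not fail, it only applies the configuration.
        self.active_sas.append(ctx["sa"])
        return 0


def offload(op, ctx):
    """Run one op in two steps and stop if the preparation step fails."""
    ctx["prepare"] = True
    ret = op(ctx)
    if ret:
        return ret
    ctx["prepare"] = False
    ret = op(ctx)
    # In this model a failing commit is treated as a driver bug.
    assert ret == 0, "commit step is not allowed to fail"
    return 0
```

    With max_sas=1, a first offload(eng.add_txsa, {"sa": "sa0"}) succeeds,
    while a second call is rejected during preparation and leaves the
    engine state untouched.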

    The Rx and Tx handlers are modified to take into account the special case
    where the MACsec transformation happens in the hardware, whether in a PHY
    or in a MAC, as the packets seen by the networking stack on both the
    physical and MACsec virtual interface are exactly the same. This leads
    to some limitations: the hardware and software implementations can't be
    used on the same physical interface, as the policies would be impossible
    to fulfill (such as strict validation of the frames). Also only a single
    virtual MACsec interface can be offloaded to a physical port supporting
    hardware offloading as it would be impossible to guess onto which
    interface a given packet should go (for ingress traffic).

    Another limitation as of now is that the counters and statistics are not
    reported back from the hardware to the software MACsec implementation.
    This isn't an issue when using offloaded MACsec transformations, but it
    should be added in the future so that the MACsec state can be reported
    to the user (which would also improve debugging).

    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • This patch moves some structure, type and identifier definitions into a
    MACsec specific header. This patch does not modify how the MACsec code
    runs and only moves things around. This is a preparation for the
    future MACsec hardware offloading support, which will re-use those
    definitions outside macsec.c.

    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     

25 Oct, 2019

3 commits

  • This patch removes variables and a callback that are related to the
    nested device structure.
    Devices that can be nested have their own nest_level variable that
    represents the depth of nested devices.
    In the previous patch, new {lower/upper}_level variables were added, and
    they replace the old private nest_level variable.
    So, this patch removes all 'nest_level' variables.

    In order to avoid lockdep warnings, ->ndo_get_lock_subclass() was added
    to get the lockdep subclass value, which is actually the lower nested
    depth value. But now, devices use dynamic lockdep keys to avoid lockdep
    warnings instead of the subclass.
    So, this patch removes the ->ndo_get_lock_subclass() callback.

    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller

    Taehee Yoo
     
  • When a macsec interface is created, it increases a refcnt to a lower
    device (real device). When the macsec interface is deleted, the refcnt is
    decreased in macsec_free_netdev(), which is the ->priv_destructor() of
    the macsec interface.

    The problem scenario is this.
    When nested macsec interfaces are exiting, the exit routine of the
    macsec module leaks a refcnt.

    Test commands:
    ip link add dummy0 type dummy
    ip link add macsec0 link dummy0 type macsec
    ip link add macsec1 link macsec0 type macsec
    modprobe -rv macsec

    [ 208.629433] unregister_netdevice: waiting for macsec0 to become free. Usage count = 1

    The steps of the exit routine of the macsec module are below.
    1. Calls ->dellink() in __rtnl_link_unregister().
    2. Checks refcnt and waits for refcnt to be 0 if refcnt is not 0 in
    netdev_run_todo().
    3. Calls ->priv_destructor() in netdev_run_todo().

    Step 2 checks refcnt, but step 3 decreases refcnt.
    So, step 2 waits forever.

    This patch makes the macsec module not hold a separate refcnt of the
    lower device, because it already holds a refcnt of the lower device
    through netdev_upper_dev_link().
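
    The ordering problem can be reproduced with a toy model. This is a
    sketch under stated assumptions: Dev, run_todo() and the bounded
    max_polls loop are hypothetical, the real wait has no such bound, and
    the reference taken by netdev_upper_dev_link() is assumed to be dropped
    at ->dellink() time, so it is not modelled here.

```python
class Dev:
    """Toy net device that may hold an extra reference on its lower device."""

    def __init__(self, name, lower=None, hold_lower=False):
        self.name = name
        self.lower = lower
        self.hold_lower = hold_lower
        self.refcnt = 0
        if lower is not None and hold_lower:
            lower.refcnt += 1  # extra reference taken at creation time

    def priv_destructor(self):
        if self.lower is not None and self.hold_lower:
            self.lower.refcnt -= 1  # only dropped here, after the wait


def run_todo(todo, max_polls=3):
    """Process devices in order: wait for refcnt == 0, then destroy.

    Returns the name of the device the loop gets stuck on, or None.
    """
    for dev in todo:
        polls = 0
        while dev.refcnt != 0:
            polls += 1
            if polls >= max_polls:  # the real loop would keep waiting
                return dev.name
        dev.priv_destructor()
    return None
```

    With hold_lower=True for macsec0 and macsec1, run_todo() gets stuck on
    macsec0, because the reference is only released by the destructor of
    macsec1, which runs later. With hold_lower=False the list drains.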

    Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller

    Taehee Yoo
     
  • Some interface types can be nested.
    (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc.)
    These interface types should set a lockdep class because, without a
    lockdep class key, lockdep always warns about nonexistent circular
    locking.

    In the current code, these interfaces have their own lockdep class keys
    and manage them themselves, so there is a lot of duplicated code around
    /drivers/net and /net/.
    This patch adds new generic lockdep keys and some helper functions for
    them.

    This patch makes the following changes.
    a) Add lockdep class keys in struct net_device
    - qdisc_running, xmit, addr_list, qdisc_busylock
    - these keys are used as dynamic lockdep keys.
    b) When net_device is being allocated, lockdep keys are registered.
    - alloc_netdev_mqs()
    c) When net_device is being freed, lockdep keys are unregistered.
    - free_netdev()
    d) Add generic lockdep key helper functions
    - netdev_register_lockdep_key()
    - netdev_unregister_lockdep_key()
    - netdev_update_lockdep_key()
    e) Remove unnecessary generic lockdep macros and functions
    f) Remove unnecessary lockdep code of each interface.

    After this patch, interface modules don't need to maintain
    their own lockdep keys.

    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller

    Taehee Yoo
     

26 Sep, 2019

1 commit

  • Fei Liu reported a crash when running netperf on a topology with a
    macsec device over veth:

    [ 448.919128] refcount_t: underflow; use-after-free.
    [ 449.090460] Call trace:
    [ 449.092895] refcount_sub_and_test+0xb4/0xc0
    [ 449.097155] tcp_wfree+0x2c/0x150
    [ 449.100460] ip_rcv+0x1d4/0x3a8
    [ 449.103591] __netif_receive_skb_core+0x554/0xae0
    [ 449.108282] __netif_receive_skb+0x28/0x78
    [ 449.112366] netif_receive_skb_internal+0x54/0x100
    [ 449.117144] napi_gro_complete+0x70/0xc0
    [ 449.121054] napi_gro_flush+0x6c/0x90
    [ 449.124703] napi_complete_done+0x50/0x130
    [ 449.128788] gro_cell_poll+0x8c/0xa8
    [ 449.132351] net_rx_action+0x16c/0x3f8
    [ 449.136088] __do_softirq+0x128/0x320

    The issue was caused by the skb's truesize being changed in
    tcp/skb_gro_receive() without its sk's sk_wmem_alloc being increased.
    Later, when the skb is freed and its truesize is subtracted from its
    sk's sk_wmem_alloc in tcp_wfree(), an underflow occurs.

    macsec is calling gro_cells_receive() to receive a packet, which
    actually requires skb->sk to be NULL. However when macsec dev is
    over veth, it's possible the skb->sk is still set if the skb was
    not unshared or expanded from the peer veth.

    ip_rcv() calls skb_orphan() to drop the skb's sk for tproxy, but
    that is too late for macsec's call to gro_cells_receive(). So
    fix it by dropping the skb's sk earlier on the rx path of macsec.
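
    The accounting mismatch can be illustrated with a small model. It is a
    sketch only: Sock, Skb, gro_merge(), free_skb() and receive() are
    hypothetical Python stand-ins, the sizes 100 and 50 are arbitrary, and a
    negative Python integer stands in for the refcount underflow.

```python
class Sock:
    def __init__(self):
        self.wmem_alloc = 0  # bytes charged to the sending socket


class Skb:
    def __init__(self, truesize, sk=None):
        self.truesize = truesize
        self.sk = sk
        if sk is not None:
            sk.wmem_alloc += truesize  # charged at transmit time


def skb_orphan(skb):
    """Uncharge the owner and detach it from the skb."""
    if skb.sk is not None:
        skb.sk.wmem_alloc -= skb.truesize
        skb.sk = None


def gro_merge(head, other):
    """Merging grows truesize without charging head.sk."""
    head.truesize += other.truesize


def free_skb(skb):
    """tcp_wfree()-style uncharge using the current truesize."""
    if skb.sk is not None:
        skb.sk.wmem_alloc -= skb.truesize


def receive(orphan_first):
    sk = Sock()
    head = Skb(100, sk)  # skb still owned by the sending socket
    if orphan_first:
        skb_orphan(head)  # what the fix does early on the rx path
    gro_merge(head, Skb(50))
    free_skb(head)
    return sk.wmem_alloc
```

    receive(False) ends with a negative counter because more is uncharged
    than was ever charged, while receive(True) ends at zero.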

    Fixes: 5491e7c6b1a9 ("macsec: enable GRO and RPS on macsec devices")
    Reported-by: Xiumei Mu
    Reported-by: Fei Liu
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

03 Jul, 2019

2 commits


31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

28 Apr, 2019

3 commits

  • Add options to strictly validate messages and dump messages.
    Sometimes validating dump messages non-strictly may be required,
    so add an option for that as well.

    Since none of this can really be applied to existing commands,
    set the options everywhere using the following spatch:

    @@
    identifier ops;
    expression X;
    @@
    struct genl_ops ops[] = {
    ...,
     {
            .cmd = X,
    +       .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
            ...
     },
    ...
    };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • We currently have two levels of strict validation:

    1) liberal (default)
    - undefined (type >= max) & NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted
    - garbage at end of message accepted
    2) strict (opt-in)
    - NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted

    Split out parsing strictness into four different options:
    * TRAILING - check that there's no trailing data after parsing
    attributes (in message or nested)
    * MAXTYPE - reject attrs > max known type
    * UNSPEC - reject attributes with NLA_UNSPEC policy entries
    * STRICT_ATTRS - strictly validate attribute size

    The default for future things should be *everything*.
    The current *_strict() is a combination of TRAILING and MAXTYPE,
    and is renamed to _deprecated_strict().
    The current regular parsing has none of this, and is renamed to
    *_parse_deprecated().
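
    How the four options combine can be sketched with a toy validator. The
    flag names follow the list above, but everything else is an assumption
    made for illustration: the numeric flag values, the representation of
    attributes as (type, payload) pairs, and a policy given as a dict from
    type to expected length where a missing entry plays the role of
    NLA_UNSPEC.

```python
from enum import IntFlag


class Validate(IntFlag):
    TRAILING = 1
    MAXTYPE = 2
    UNSPEC = 4
    STRICT_ATTRS = 8


LIBERAL = Validate(0)
DEPRECATED_STRICT = Validate.TRAILING | Validate.MAXTYPE
STRICT = (Validate.TRAILING | Validate.MAXTYPE |
          Validate.UNSPEC | Validate.STRICT_ATTRS)


def validate(attrs, policy, maxtype, trailing, flags):
    """Return True if the toy message is accepted under `flags`."""
    if trailing and (flags & Validate.TRAILING):
        return False  # garbage after the last attribute
    for typ, payload in attrs:
        if typ > maxtype:
            if flags & Validate.MAXTYPE:
                return False
            continue  # liberal parsing skips unknown attributes
        expected = policy.get(typ)
        if expected is None:
            if flags & Validate.UNSPEC:
                return False
            continue
        if len(payload) < expected:
            return False  # too short is rejected in every mode
        if len(payload) > expected and (flags & Validate.STRICT_ATTRS):
            return False
    return True
```

    In this model LIBERAL accepts a message with an oversized attribute, an
    unknown type and trailing bytes, DEPRECATED_STRICT rejects the unknown
    type and the trailing bytes, and STRICT additionally rejects oversized
    attributes and attributes without a policy entry.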

    Additionally it allows us to selectively set one of the new flags
    even on old policies. Notably, the UNSPEC flag could be useful in
    this case, since it can be arranged (by filling in the policy) to
    not be an incompatible userspace ABI change, but would then going
    forward prevent forgetting attribute entries. Similar can apply
    to the POLICY flag.

    We end up with the following renames:
    * nla_parse -> nla_parse_deprecated
    * nla_parse_strict -> nla_parse_deprecated_strict
    * nlmsg_parse -> nlmsg_parse_deprecated
    * nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
    * nla_parse_nested -> nla_parse_nested_deprecated
    * nla_validate_nested -> nla_validate_nested_deprecated

    Using spatch, of course:
    @@
    expression TB, MAX, HEAD, LEN, POL, EXT;
    @@
    -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
    +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression TB, MAX, NLA, POL, EXT;
    @@
    -nla_parse_nested(TB, MAX, NLA, POL, EXT)
    +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

    @@
    expression START, MAX, POL, EXT;
    @@
    -nla_validate_nested(START, MAX, POL, EXT)
    +nla_validate_nested_deprecated(START, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, MAX, POL, EXT;
    @@
    -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
    +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

    For this patch, don't actually add the strict, non-renamed versions
    yet so that it breaks compile if I get it wrong.

    Also, while at it, make nla_validate and nla_parse go down to a
    common __nla_validate_parse() function to avoid code duplication.

    Ultimately, this allows us to have very strict validation for every
    new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
    next patch, while existing things will continue to work as is.

    In effect then, this adds fully strict validation for any new command.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
    netlink based interfaces (including recently added ones) are still not
    setting it in kernel generated messages. Without the flag, message parsers
    not aware of attribute semantics (e.g. wireshark dissector or libmnl's
    mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
    the structure of their contents.

    Unfortunately we cannot just add the flag everywhere as there may be
    userspace applications which check nlattr::nla_type directly rather than
    through a helper masking out the flags. Therefore the patch renames
    nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
    as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
    are rewritten to use nla_nest_start().
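
    The compatibility concern comes down to bit masking. The sketch below
    uses the flag and mask values from the netlink uapi header; the two
    nest_start helpers are simplified Python stand-ins that only compute
    the type field, and MY_ATTR is a hypothetical attribute number.

```python
NLA_F_NESTED = 1 << 15
NLA_F_NET_BYTEORDER = 1 << 14
# nla_type is a 16-bit field, so keep the mask within 16 bits.
NLA_TYPE_MASK = ~(NLA_F_NESTED | NLA_F_NET_BYTEORDER) & 0xFFFF

MY_ATTR = 3  # hypothetical attribute number


def nla_type(raw):
    """What a masking helper returns for a raw nla_type value."""
    return raw & NLA_TYPE_MASK


def nest_start_noflag(attrtype):
    """Old behaviour: the type field carries no flag."""
    return attrtype


def nest_start(attrtype):
    """New wrapper: the type field carries NLA_F_NESTED."""
    return attrtype | NLA_F_NESTED
```

    A parser that compares the raw field with MY_ATTR stops matching once
    the flag is set, whereas one that goes through nla_type() keeps
    working, which is why existing callers were moved to the _noflag
    variant.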

    Except for changes in include/net/netlink.h, the patch was generated using
    this semantic patch:

    @@ expression E1, E2; @@
    -nla_nest_start(E1, E2)
    +nla_nest_start_noflag(E1, E2)

    @@ expression E1, E2; @@
    -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
    +nla_nest_start(E1, E2)

    Signed-off-by: Michal Kubecek
    Acked-by: Jiri Pirko
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Michal Kubecek
     

02 Apr, 2019

1 commit

  • seen with debug config:
    drivers/net/macsec.c: In function 'dump_secy':
    drivers/net/macsec.c:2597: warning: the frame size of 2216 bytes is larger
    than 2048 bytes [-Wframe-larger-than=]

    just mark it with noinline_for_stack, this is netlink dump code.

    v2: use 'static noinline_for_stack int' consistently

    Cc: Sabrina Dubroca
    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Florian Westphal
     

22 Mar, 2019

1 commit

  • Since maxattr is common, the policy can't really differ sanely,
    so make it common as well.

    The only user that did in fact manage to make a non-common policy
    is taskstats, which has to be really careful about it (since it's
    still using a common maxattr!). This is no longer supported, but
    we can fake it using pre_doit.

    This reduces the size of e.g. nl80211.o (which has lots of commands):

       text    data     bss     dec     hex filename
     398745   14323    2240  415308   6564c net/wireless/nl80211.o (before)
     397913   14331    2240  414484   65314 net/wireless/nl80211.o (after)
    --------------------------------
       -832      +8       0    -824

    Which is obviously just 8 bytes for each command, and an added 8
    bytes for the new policy pointer. I'm not sure why the ops list is
    counted as .text though.

    Most of the code transformations were done using the following spatch:
    @ops@
    identifier OPS;
    expression POLICY;
    @@
    struct genl_ops OPS[] = {
    ...,
     {
    -       .policy = POLICY,
     },
    ...
    };

    @@
    identifier ops.OPS;
    expression ops.POLICY;
    identifier fam;
    expression M;
    @@
    struct genl_family fam = {
            .ops = OPS,
            .maxattr = M,
    +       .policy = POLICY,
            ...
    };

    This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing
    the cb->data as ops, which we want to change in a later genl patch.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

29 Oct, 2018

2 commits

  • Currently, the kernel doesn't let the administrator set a macsec device
    up unless its lower device is currently up. This is inconsistent, as a
    macsec device that is up won't automatically go down when its lower
    device goes down.

    Now that linkstate propagation works, there's really no reason for this
    limitation, so let's remove it.

    Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
    Reported-by: Radu Rendec
    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     
  • Like all other virtual devices (macvlan, vlan), the operstate of a
    macsec device should match the state of its lower device. This is done
    by calling netif_stacked_transfer_operstate from its netdevice notifier.

    We also need to call netif_stacked_transfer_operstate when a new macsec
    device is created, so that its operstate is set properly. This is only
    relevant when we try to bring the device up directly when we create it.

    Radu Rendec proposed a similar patch, inspired by the 802.1q driver,
    that included changing the administrative state of the macsec device,
    instead of just the operstate. This version is similar to what the
    macvlan driver does, and updates only the operstate.
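
    The difference between the two approaches can be shown with a toy
    model. This is a simplified sketch, not the kernel helper: NetDev and
    its three boolean fields are hypothetical, and transfer_operstate()
    only mirrors the idea of copying link state while leaving the
    administrative state alone.

```python
from dataclasses import dataclass


@dataclass
class NetDev:
    admin_up: bool = False  # what the administrator configured
    carrier: bool = False   # link state
    dormant: bool = False


def transfer_operstate(lower, dev):
    """Copy link state from the lower device; admin state is untouched."""
    dev.dormant = lower.dormant
    dev.carrier = lower.carrier
```

    After the lower device loses carrier, calling
    transfer_operstate(lower, dev) clears dev.carrier while dev.admin_up
    keeps whatever value the administrator set.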

    Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
    Reported-by: Radu Rendec
    Reported-by: Patrick Talbert
    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

22 Sep, 2018

1 commit


16 Apr, 2018

1 commit

  • This patch is just wrong, sorry. I was trying to fix a static checker
    warning and misread the code. The reference taken in macsec_newlink()
    is released in macsec_free_netdev() when the netdevice is destroyed.

    This reverts commit 5dcd8400884cc4a043a6d4617e042489e5d566a9.

    Reported-by: Laura Abbott
    Fixes: 5dcd8400884c ("macsec: missing dev_put() on error in macsec_newlink()")
    Signed-off-by: Dan Carpenter
    Acked-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Dan Carpenter
     

23 Mar, 2018

1 commit


23 Jan, 2018

1 commit

  • Commit ccfdec908922 ("macsec: Add support for GCM-AES-256 cipher suite")
    changed a few values in the uapi headers for MACsec.

    Because of existing userspace implementations, we need to preserve the
    value of MACSEC_DEFAULT_CIPHER_ID. Not doing that resulted in
    wpa_supplicant segfaults when a secure channel was created using the
    default cipher. Thus, swap MACSEC_DEFAULT_CIPHER_{ID,ALT} back to their
    original values.

    Changing the maximum length of the MACSEC_SA_ATTR_KEY attribute is
    unnecessary, as the previous value (MACSEC_MAX_KEY_LEN, which was 128B)
    is large enough to carry 32-byte keys. This patch reverts
    MACSEC_MAX_KEY_LEN to 128B and restores the old length check on
    MACSEC_SA_ATTR_KEY.

    Fixes: ccfdec908922 ("macsec: Add support for GCM-AES-256 cipher suite")
    Signed-off-by: Davide Caratti
    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

10 Jan, 2018

1 commit

  • This adds support for the GCM-AES-256 cipher suite as specified in
    IEEE 802.1AEbn-2011. The prepared cipher suite selection mechanism is used,
    with GCM-AES-128 being the default cipher suite as defined in the standard.

    Signed-off-by: Felix Walter
    Cc: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Felix Walter
     

16 Nov, 2017

1 commit

  • According to the description, the first argument of genlmsg_nlhdr()
    points to what genlmsg_put() returns, i.e. the beginning of the user
    header. Therefore we should only subtract the size of the genetlink
    header and the netlink message header, not the user header.

    This also means we don't need to pass the pointer to the genetlink
    family, and the same is true for genl_dump_check_consistent(), which is
    the only caller of genlmsg_nlhdr(). (Note that at the moment, these
    functions are only used for families which do not have a user header,
    so they are not affected.)
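
    The offset arithmetic can be checked with plain integers. In this
    sketch the header sizes are the aligned sizes of struct nlmsghdr
    (16 bytes) and struct genlmsghdr (4 bytes); the function names, the
    use of integer offsets instead of pointers, and the 8-byte user header
    are assumptions made for illustration.

```python
NLMSG_ALIGNTO = 4


def nlmsg_align(n):
    return (n + NLMSG_ALIGNTO - 1) & ~(NLMSG_ALIGNTO - 1)


NLMSG_HDRLEN = nlmsg_align(16)  # sizeof(struct nlmsghdr)
GENL_HDRLEN = nlmsg_align(4)    # sizeof(struct genlmsghdr)


def user_hdr_offset(msg_start):
    """Offset a genlmsg_put()-like helper returns: start of user header."""
    return msg_start + NLMSG_HDRLEN + GENL_HDRLEN


def nlhdr_old(user_hdr, family_hdrsize):
    """Pre-fix computation: also subtracts the user header size."""
    return user_hdr - family_hdrsize - GENL_HDRLEN - NLMSG_HDRLEN


def nlhdr_fixed(user_hdr):
    """Post-fix computation: only the two fixed headers are subtracted."""
    return user_hdr - GENL_HDRLEN - NLMSG_HDRLEN
```

    For a message starting at offset 100, both variants agree when the
    family has no user header, but with an 8-byte user header the old
    computation lands 8 bytes before the real start of the message.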

    Fixes: 670dc2833d14 ("netlink: advertise incomplete dumps")
    Signed-off-by: Michal Kubecek
    Reviewed-by: Johannes Berg
    Signed-off-by: David S. Miller

    Michal Kubecek
     

22 Oct, 2017

4 commits

  • There were quite a few overlapping sets of changes here.

    Daniel's bug fix for off-by-ones in the new BPF branch instructions,
    along with the added allowances for "data_end > ptr + x" forms
    collided with the metadata additions.

    Along with those three changes came verifier test cases, which in
    their final form I tried to group together properly. If I had just
    trimmed GIT's conflict tags as-is, this would have split up the
    meta tests unnecessarily.

    In the socketmap code, a set of preemption disabling changes
    overlapped with the rename of bpf_compute_data_end() to
    bpf_compute_data_pointers().

    Changes were made to the mv88e6060.c driver set addr method
    which got removed in net-next.

    The hyperv transport socket layer had a locking change in 'net'
    which overlapped with a change of socket state macro usage
    in 'net-next'.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to use-after-free situations and be exploitable.

    The variable macsec_tx_sa.refcnt is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.
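
    The properties listed above can be captured in a small model. This is
    a sketch of the semantics only: the Refcount class is hypothetical, the
    SATURATED value is illustrative rather than the kernel's constant, and
    where the kernel would warn on underflow this model raises an
    exception.

```python
class Refcount:
    SATURATED = 2**32 - 1  # illustrative, not the kernel's exact value

    def __init__(self, n=1):
        self.v = n

    def inc_not_zero(self):
        """Take a reference unless the object is already dead."""
        if self.v == 0:
            return False
        if self.v != self.SATURATED:
            self.v += 1  # once SATURATED is reached the value sticks
        return True

    def dec_and_test(self):
        """Drop a reference; True means the caller must free the object."""
        if self.v == self.SATURATED:
            return False  # a saturated counter is never freed
        if self.v == 0:
            raise RuntimeError("refcount underflow")
        self.v -= 1
        return self.v == 0
```

    A counter that has dropped to zero refuses further increments, and a
    saturated counter neither wraps around nor reaches zero, which is what
    closes the overflow and underflow paths to a use-after-free.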

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: David S. Miller

    Elena Reshetova
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to use-after-free situations and be exploitable.

    The variable macsec_rx_sc.refcnt is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: David S. Miller

    Elena Reshetova
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to use-after-free situations and be exploitable.

    The variable macsec_rx_sa.refcnt is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: David S. Miller

    Elena Reshetova
     

12 Oct, 2017

1 commit


05 Oct, 2017

1 commit


23 Aug, 2017

1 commit


27 Jun, 2017

3 commits


16 Jun, 2017

1 commit

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions return void * and remove all the casts across
    the tree, adding a (u8 *) cast only where the unsigned char pointer
    was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

    Note that the last part there converts from push(...)[0] to the
    more idiomatic *(u8 *)push(...).

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

15 Jun, 2017

1 commit


08 Jun, 2017

1 commit

  • Network devices can allocate resources and private memory using
    netdev_ops->ndo_init(). However, the release of these resources
    can occur in one of two different places.

    Either netdev_ops->ndo_uninit() or netdev->destructor().

    The decision of which operation frees the resources depends upon
    whether it is necessary for all netdev refs to be released before it
    is safe to perform the freeing.

    netdev_ops->ndo_uninit() presumably can occur right after the
    NETDEV_UNREGISTER notifier completes and the unicast and multicast
    address lists are flushed.

    netdev->destructor(), on the other hand, does not run until the
    netdev references all go away.

    Further complicating the situation is that netdev->destructor()
    almost universally also does a free_netdev().

    This creates a problem for the logic in register_netdevice(),
    because all callers of register_netdevice() manage the freeing
    of the netdev, and invoke free_netdev(dev) if register_netdevice()
    fails.

    If netdev_ops->ndo_init() succeeds, but something else fails inside
    of register_netdevice(), it does call netdev_ops->ndo_uninit(). But
    it is not able to invoke netdev->destructor().

    This is because netdev->destructor() will do a free_netdev() and
    then the caller of register_netdevice() will do the same.

    However, this means that the resources that would normally be released
    by netdev->destructor() will not be.

    Over the years drivers have added local hacks to deal with this, by
    invoking their destructor parts by hand when register_netdevice()
    fails.

    Many drivers do not try to deal with this, and instead we have leaks.

    Let's close this hole by formalizing the distinction between what
    private things need to be freed up by netdev->destructor() and whether
    the driver needs unregister_netdevice() to perform the free_netdev().

    netdev->priv_destructor() performs all actions to free up the private
    resources that used to be freed by netdev->destructor(), except for
    free_netdev().

    netdev->needs_free_netdev is a boolean that indicates whether
    free_netdev() should be done at the end of unregister_netdevice().

    Now, register_netdevice() can sanely release all resources after
    netdev_ops->ndo_init() succeeds, by invoking both
    netdev_ops->ndo_uninit() and netdev->priv_destructor().

    And at the end of unregister_netdevice(), we invoke
    netdev->priv_destructor() and optionally call free_netdev().
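
    The resulting split of responsibilities can be sketched as follows.
    This is a toy model, not the kernel code: the NetDev class, its freed
    counter, the fail_late switch and the -1 return value are hypothetical,
    and only the ordering of the calls is meant to match the description.

```python
class NetDev:
    def __init__(self, needs_free_netdev):
        self.needs_free_netdev = needs_free_netdev
        self.private_allocated = False
        self.freed = 0  # number of free_netdev() calls on this device

    def ndo_init(self):
        self.private_allocated = True

    def ndo_uninit(self):
        pass

    def priv_destructor(self):
        self.private_allocated = False  # never frees the netdev itself


def free_netdev(dev):
    dev.freed += 1


def register_netdevice(dev, fail_late=False):
    dev.ndo_init()
    if fail_late:
        # Error path: release private state, leave freeing to the caller.
        dev.ndo_uninit()
        dev.priv_destructor()
        return -1
    return 0


def unregister_netdevice(dev):
    dev.ndo_uninit()
    dev.priv_destructor()
    if dev.needs_free_netdev:
        free_netdev(dev)
```

    When registration fails late, the private state is released and the
    caller's own free_netdev() is the only free, so nothing leaks and
    nothing is freed twice.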

    Signed-off-by: David S. Miller

    David S. Miller
     

05 Jun, 2017

1 commit


22 May, 2017

1 commit

  • The macsec implementation shouldn't account for rx/tx packets that are
    dropped in the netdev framework. The netdev framework itself accounts
    for such packets by atomically updating the rx_dropped and tx_dropped
    fields of struct net_device. Later on, when the stats for a macsec
    link are retrieved, the packets dropped in the netdev framework are
    added in dev_get_stats() after it calls macsec_get_stats64() in macsec.c.
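
    The double counting this avoids can be shown with plain arithmetic. The
    two functions below are hypothetical Python stand-ins for the driver
    callback and the core helper, and the drop counts are arbitrary.

```python
def driver_stats(own_drops, core_drops, include_core):
    """Drops reported by the driver's stats callback."""
    return own_drops + (core_drops if include_core else 0)


def dev_get_stats(own_drops, core_drops, include_core):
    """The core adds its own counter after asking the driver."""
    return driver_stats(own_drops, core_drops, include_core) + core_drops
```

    With 2 drops in the driver and 3 in the core, a driver that also
    reports the core drops yields a total of 8, while one that reports only
    its own yields the correct total of 5.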

    Signed-off-by: Girish Moodalbail
    Signed-off-by: David S. Miller

    Girish Moodalbail
     

27 Apr, 2017

1 commit