30 May, 2018

1 commit

  • [ Upstream commit 1e46ef1762bb2e52f0f996131a4d16ed4e9fd065 ]

    __tcf_ipt_init() can fail after the idr has been successfully reserved.
    When this happens, subsequent attempts to configure xt/ipt rules using
    the same idr value systematically fail with -ENOSPC:

    # tc action add action xt -j LOG --log-prefix test1 index 100
    tablename: mangle hook: NF_IP_POST_ROUTING
    target: LOG level warning prefix "test1" index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    Command "(null)" is unknown, try "tc actions help".
    # tc action add action xt -j LOG --log-prefix test1 index 100
    tablename: mangle hook: NF_IP_POST_ROUTING
    target: LOG level warning prefix "test1" index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    Command "(null)" is unknown, try "tc actions help".
    # tc action add action xt -j LOG --log-prefix test1 index 100
    tablename: mangle hook: NF_IP_POST_ROUTING
    target: LOG level warning prefix "test1" index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    ...

    Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
    in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
    when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
    to avoid NULL pointer dereference.

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

09 Nov, 2017

1 commit

  • This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59.
    If we hold that refcnt, the netns can never be destroyed until
    all actions are destroyed by user, this breaks our netns design
    which we expect all actions are destroyed when we destroy the
    whole netns.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

03 Nov, 2017

1 commit

  • TC actions have been destroyed asynchronously for a long time,
    previously in a RCU callback and now in a workqueue. If we
    don't hold a refcnt for its netns, we could use the per netns
    data structure, struct tcf_idrinfo, after it has been freed by
    netns workqueue.

    Hold refcnt to ensure netns destroy happens after all actions
    are gone.

    Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions")
    Reported-by: Lucas Bates
    Tested-by: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

31 Aug, 2017

1 commit

  • Typically, each TC filter has its own action. All the actions of the
    same type are saved in its hash table. But the hash buckets are too
    small that it degrades to a list. And the performance is greatly
    affected. For example, it takes about 0m11.914s to insert 64K rules.
    If we convert the hash table to IDR, it only takes about 0m1.500s.
    The improvement is huge.

    But please note that the test result is based on previous patch that
    cls_flower uses IDR.

    Signed-off-by: Chris Mi
    Signed-off-by: Jiri Pirko
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Chris Mi
     

19 Aug, 2017

1 commit

  • As we know in some target's checkentry it may dereference par.entryinfo
    to check entry stuff inside. But when sched action calls xt_check_target,
    par.entryinfo is set with NULL. It would cause kernel panic when calling
    some targets.

    It can be reproduce with:
    # tc qd add dev eth1 ingress handle ffff:
    # tc filter add dev eth1 parent ffff: u32 match u32 0 0 action xt \
    -j ECN --ecn-tcp-remove

    It could also crash kernel when using target CLUSTERIP or TPROXY.

    By now there's no proper value for par.entryinfo in ipt_init_target,
    but it can not be set with NULL. This patch is to void all these
    panics by setting it with an ipt_entry obj with all members = 0.

    Note that this issue has been there since the very beginning.

    Signed-off-by: Xin Long
    Acked-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Xin Long
     

10 Aug, 2017

1 commit

  • Commit 55917a21d0cc ("netfilter: x_tables: add context to know if
    extension runs from nft_compat") introduced a member nft_compat to
    xt_tgchk_param structure.

    But it didn't set it's value for ipt_init_target. With unexpected
    value in par.nft_compat, it may return unexpected result in some
    target's checkentry.

    This patch is to set all it's fields as 0 and only initialize the
    non-zero fields in ipt_init_target.

    v1->v2:
    As Wang Cong's suggestion, fix it by setting all it's fields as
    0 and only initializing the non-zero fields.

    Fixes: 55917a21d0cc ("netfilter: x_tables: add context to know if extension runs from nft_compat")
    Suggested-by: Cong Wang
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

09 Aug, 2017

1 commit

  • Now xt_tgchk_param par in ipt_init_target is a local varibale,
    par.net is not initialized there. Later when xt_check_target
    calls target's checkentry in which it may access par.net, it
    would cause kernel panic.

    Jaroslav found this panic when running:

    # ip link add TestIface type dummy
    # tc qd add dev TestIface ingress handle ffff:
    # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \
    action xt -j CONNMARK --set-mark 4

    This patch is to pass net param into ipt_init_target and set
    par.net with it properly in there.

    v1->v2:
    As Wang Cong pointed, I missed ipt_net_id != xt_net_id, so fix
    it by also passing net_id to __tcf_ipt_init.
    v2->v3:
    Missed the fixes tag, so add it.

    Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put")
    Reported-by: Jaroslav Aster
    Signed-off-by: Xin Long
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Xin Long
     

14 Apr, 2017

1 commit


18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into an zero based array and
    thus is unsigned entity. Using negative value is out-of-bound
    access by definition.

    2)
    On x86_64 unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added or subtracted to pointers
    are preffered to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
    g(p[i]);
    }

    roughly translates to

    movsx rsi, esi
    mov rdi, [rsi+...]
    call g

    MOVSX is 3 byte instruction which isn't necessary if the variable is
    unsigned because x86_64 is zero extending by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
    ...
    ptr = ng->ptr[id - 1];
    ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a semmingly random artefact of code generation with register
    allocator being used differently. gcc decides that some variable
    needs to live in new r8+ registers and every access now requires REX
    prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
    used which is longer than [r8]

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function old new delta
    nfsd4_lock 3886 3959 +73
    tipc_link_build_proto_msg 1096 1140 +44
    mac80211_hwsim_new_radio 2776 2808 +32
    tipc_mon_rcv 1032 1058 +26
    svcauth_gss_legacy_init 1413 1429 +16
    tipc_bcbase_select_primary 379 392 +13
    nfsd4_exchange_id 1247 1260 +13
    nfsd4_setclientid_confirm 782 793 +11
    ...
    put_client_renew_locked 494 480 -14
    ip_set_sockfn_get 730 716 -14
    geneve_sock_add 829 813 -16
    nfsd4_sequence_done 721 703 -18
    nlmclnt_lookup_host 708 686 -22
    nfsd4_lockt 1085 1063 -22
    nfs_get_client 1077 1050 -27
    tcf_bpf_init 1106 1076 -30
    nfsd4_encode_fattr 5997 5930 -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

03 Nov, 2016

1 commit


26 Jul, 2016

1 commit

  • struct tc_action is confusing, currently we use it for two purposes:
    1) Pass in arguments and carry out results from helper functions
    2) A generic representation for tc actions

    The first one is error-prone, since we need to make sure we don't
    miss anything. This patch aims to get rid of this use, by moving
    tc_action into tcf_common, so that they are allocated together
    in hashtable and can be cast'ed easily.

    And together with the following patch, we could really make
    tc_action a generic representation for all tc actions and each
    type of action can inherit from it.

    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

30 Jun, 2016

1 commit


16 Jun, 2016

2 commits


15 Jun, 2016

1 commit


08 Jun, 2016

4 commits


16 May, 2016

1 commit

  • The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
    because we no longer have a per-netns conntrack hash.

    The ip_gre.c conflict as well as the iwlwifi ones were cases of
    overlapping changes.

    Conflicts:
    drivers/net/wireless/intel/iwlwifi/mvm/tx.c
    net/ipv4/ip_gre.c
    net/netfilter/nf_conntrack_core.c

    Signed-off-by: David S. Miller

    David S. Miller
     

11 May, 2016

1 commit

  • This was broken and is fixed with this patch.

    //add an ipt action and give it an instance id of 1
    sudo tc actions add action ipt -j mark --set-mark 2 index 1
    //create a filter which binds to ipt action id 1
    sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
    match ip dst 17.0.0.1/32 flowid 1:10 action ipt index 1

    Message before bug fix was:
    RTNETLINK answers: Invalid argument
    We have an error talking to the kernel

    Signed-off-by: Jamal Hadi Salim
    Reviewed-by: Cong Wang
    Signed-off-by: David S. Miller

    Jamal Hadi Salim
     

27 Apr, 2016

1 commit


09 Mar, 2016

1 commit


07 Mar, 2016

1 commit

  • Before calling the destroy() or target() callbacks, the family parameter
    field has to be initialized. Otherwise at least the LOG target will
    refuse to work and upon removal oops the kernel.

    Cc: Jamal Hadi Salim
    Signed-off-by: Phil Sutter
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Phil Sutter
     

26 Feb, 2016

1 commit

  • Currently tc actions are stored in a per-module hashtable,
    therefore are visible to all network namespaces. This is
    probably the last part of the tc subsystem which is not
    aware of netns now. This patch makes them per-netns,
    several tc action API's need to be adjusted for this.

    The tc action API code is ugly due to historical reasons,
    we need to refactor that code in the future.

    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     

19 Sep, 2015

1 commit


09 Jul, 2015

1 commit

  • Reuse existing percpu infrastructure John Fastabend added for qdisc.

    This patch adds a new cpustats parameter to tcf_hash_create() and all
    actions pass false, meaning this patch should have no effect yet.

    Signed-off-by: Eric Dumazet
    Cc: Alexei Starovoitov
    Acked-by: Jamal Hadi Salim
    Acked-by: John Fastabend
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Nov, 2014

1 commit


13 Feb, 2014

3 commits


22 Jan, 2014

1 commit


20 Jan, 2014

1 commit


17 Jan, 2014

1 commit

  • In tcf_register_action() we check either ->type or ->kind to see if
    there is an existing action registered, but ipt action registers two
    actions with same type but different kinds. They should have different
    types too.

    Cc: Jamal Hadi Salim
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     

14 Jan, 2014

1 commit


07 Jan, 2014

1 commit


28 Dec, 2013

1 commit

  • This is a bug fix. The existing code tries to kill many
    birds with one stone: Handling binding of actions to
    filters, new actions and replacing of action
    attributes. A simple test case to illustrate:

    XXXX
    moja@fe1:~$ sudo tc actions add action drop index 12
    moja@fe1:~$ actions get action gact index 12
    action order 1: gact action drop
    random type none pass val 0
    index 12 ref 1 bind 0
    moja@fe1:~$ sudo tc actions replace action ok index 12
    moja@fe1:~$ actions get action gact index 12
    action order 1: gact action drop
    random type none pass val 0
    index 12 ref 2 bind 0
    XXXX

    The above shows the refcounf being wrongly incremented on replace.
    There are more complex scenarios with binding of actions to filters
    that i am leaving out that didnt work as well...

    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Jamal Hadi Salim
     

21 Dec, 2013

1 commit

  • This patch fixes:
    1) pass mask rather than size to tcf_hashinfo_init()
    2) the cleanup should be in reversed order in mirred_cleanup_module()

    Reported-by: Eric Dumazet
    Fixes: 369ba56787d7469c0afd ("net_sched: init struct tcf_hashinfo at register time")
    Cc: Eric Dumazet
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     

19 Dec, 2013

1 commit


06 Dec, 2013

1 commit