28 Aug, 2019

1 commit

  • The net pointer in struct xt_tgdtor_param is not explicitly
    initialized, so it is still NULL when it is dereferenced.
    We therefore have to find a way to pass the correct net pointer to
    ipt_destroy_target().

    The best way I find is just saving the net pointer inside the per
    netns struct tcf_idrinfo, which could make this patch smaller.

    Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
    Reported-and-tested-by: itugrok@yahoo.com
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

06 Aug, 2019

1 commit

  • Currently the init call of every action (except ipt) sets up its 'parm'
    structure as a direct pointer to nla data in the skb. This leads to a
    race condition when some of the filter actions were initialized
    successfully (and were assigned an idr action index that was written
    directly into nla data), but then were deleted and retried (due to a
    following action module missing or a classifier-initiated retry), in
    which case the action init code tries to insert the action into the idr
    with the index that was assigned on the previous iteration. During the
    retry the index can be reused by another action that was inserted
    concurrently, which causes unintended action sharing between filters.
    To fix the described race condition, save the action idr index to a
    temporary stack-allocated variable instead of nla data.
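
    A minimal user-space sketch of the difference between writing the
    chosen index back into the message and keeping it in a local variable.
    All names here (demo_parm, demo_alloc_index and so on) are hypothetical
    stand-ins, not the kernel's structures or helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-action 'parm' struct carried in the
 * netlink attribute payload; only the index field matters here. */
struct demo_parm {
    uint32_t index;   /* 0 means "pick an index for me" */
};

/* Hypothetical allocator: returns the requested index, or the next free
 * one when the request is 0. */
static uint32_t demo_next_free = 1;

static uint32_t demo_alloc_index(uint32_t requested)
{
    return requested ? requested : demo_next_free++;
}

/* Racy pattern: the chosen index is written back into the message, so a
 * retry of the same message asks for that exact index again. */
uint32_t demo_init_writeback(struct demo_parm *parm)
{
    assert(parm);
    parm->index = demo_alloc_index(parm->index);
    return parm->index;
}

/* Fixed pattern: the index lives in a local variable and the message
 * payload stays exactly as user space sent it. */
uint32_t demo_init_local(const struct demo_parm *parm)
{
    uint32_t index;

    assert(parm);
    index = demo_alloc_index(parm->index);   /* stack copy only */
    return index;
}
```

    With the second variant a retried init still sees index 0 in the
    message and asks for a fresh index rather than one that may have been
    handed to another action in the meantime.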

    Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
    Signed-off-by: Dmytro Linkin
    Signed-off-by: Vlad Buslov
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Dmytro Linkin
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

28 Apr, 2019

1 commit

  • We currently have two levels of strict validation:

    1) liberal (default)
    - undefined (type >= max) & NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted
    - garbage at end of message accepted
    2) strict (opt-in)
    - NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted

    Split out parsing strictness into four different options:
    * TRAILING - check that there's no trailing data after parsing
    attributes (in message or nested)
    * MAXTYPE - reject attrs > max known type
    * UNSPEC - reject attributes with NLA_UNSPEC policy entries
    * STRICT_ATTRS - strictly validate attribute size

    The default for future things should be *everything*.
    The current *_strict() is a combination of TRAILING and MAXTYPE,
    and is renamed to _deprecated_strict().
    The current regular parsing has none of this, and is renamed to
    *_parse_deprecated().

    Additionally it allows us to selectively set one of the new flags
    even on old policies. Notably, the UNSPEC flag could be useful in
    this case, since it can be arranged (by filling in the policy) to
    not be an incompatible userspace ABI change, but would then going
    forward prevent forgetting attribute entries. Similar can apply
    to the POLICY flag.
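
    The split into independent options can be pictured as a set of bit
    flags. The sketch below is illustrative only: the DEMO_* names and the
    two helper functions are assumptions, and treating STRICT_ATTRS as
    "length must match exactly" is one reading of the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag names for the four options listed above. */
enum demo_validate {
    DEMO_VALIDATE_LIBERAL      = 0,
    DEMO_VALIDATE_TRAILING     = 1 << 0,  /* no trailing data after attributes */
    DEMO_VALIDATE_MAXTYPE      = 1 << 1,  /* reject attrs > max known type */
    DEMO_VALIDATE_UNSPEC       = 1 << 2,  /* reject NLA_UNSPEC policy entries */
    DEMO_VALIDATE_STRICT_ATTRS = 1 << 3,  /* strictly validate attribute size */
};

/* The old *_strict() behaviour: TRAILING and MAXTYPE only. */
#define DEMO_VALIDATE_DEPRECATED_STRICT \
    (DEMO_VALIDATE_TRAILING | DEMO_VALIDATE_MAXTYPE)

/* The default for new users: everything. */
#define DEMO_VALIDATE_STRICT \
    (DEMO_VALIDATE_TRAILING | DEMO_VALIDATE_MAXTYPE | \
     DEMO_VALIDATE_UNSPEC | DEMO_VALIDATE_STRICT_ATTRS)

/* An attribute type above the known maximum passes only when the
 * MAXTYPE check is off. */
bool demo_accept_type(unsigned int validate, int type, int maxtype)
{
    assert(maxtype >= 0);
    if (type > maxtype)
        return !(validate & DEMO_VALIDATE_MAXTYPE);
    return true;
}

/* A too-short attribute is always rejected; a too-long one passes only
 * when STRICT_ATTRS is off. */
bool demo_accept_len(unsigned int validate, int len, int expected)
{
    if (len < expected)
        return false;
    if (len > expected)
        return !(validate & DEMO_VALIDATE_STRICT_ATTRS);
    return true;
}
```

    Because each check is its own bit, an old policy can opt in to a single
    flag without switching on the rest.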

    We end up with the following renames:
    * nla_parse -> nla_parse_deprecated
    * nla_parse_strict -> nla_parse_deprecated_strict
    * nlmsg_parse -> nlmsg_parse_deprecated
    * nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
    * nla_parse_nested -> nla_parse_nested_deprecated
    * nla_validate_nested -> nla_validate_nested_deprecated

    Using spatch, of course:
    @@
    expression TB, MAX, HEAD, LEN, POL, EXT;
    @@
    -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
    +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression TB, MAX, NLA, POL, EXT;
    @@
    -nla_parse_nested(TB, MAX, NLA, POL, EXT)
    +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

    @@
    expression START, MAX, POL, EXT;
    @@
    -nla_validate_nested(START, MAX, POL, EXT)
    +nla_validate_nested_deprecated(START, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, MAX, POL, EXT;
    @@
    -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
    +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

    For this patch, don't actually add the strict, non-renamed versions
    yet so that it breaks compile if I get it wrong.

    Also, while at it, make nla_validate and nla_parse go down to a
    common __nla_validate_parse() function to avoid code duplication.

    Ultimately, this allows us to have very strict validation for every
    new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
    next patch, while existing things will continue to work as is.

    In effect then, this adds fully strict validation for any new command.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

22 Mar, 2019

2 commits

  • the following script:

    # tc qdisc add dev crash0 clsact
    # tc filter add dev crash0 egress matchall \
    > action simple sdata hello pass index 90
    # tc actions replace action simple \
    > sdata world goto chain 42 index 90 cookie c1a0c1a0
    # tc action show action simple

    had the following output:

    Error: Failed to init TC action chain.
    We have an error talking to the kernel
    total acts 1

    action order 0: Simple
    index 90 ref 2 bind 1
    cookie c1a0c1a0

    Then, the first packet transmitted by crash0 made the kernel crash:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    #PF error: [normal kernel read fault]
    PGD 800000006a6fb067 P4D 800000006a6fb067 PUD 6aed6067 PMD 0
    Oops: 0000 [#1] SMP PTI
    CPU: 2 PID: 3241 Comm: kworker/2:0 Not tainted 5.0.0-rc4.gotochain_crash+ #536
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    Workqueue: ipv6_addrconf addrconf_dad_work
    RIP: 0010:tcf_action_exec+0xb8/0x100
    Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
    RSP: 0018:ffffbe6781763ad0 EFLAGS: 00010246
    RAX: 000000002000002a RBX: ffff9e59bdb80e00 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff9e59b4716738 RDI: ffff9e59ab12d140
    RBP: ffffbe6781763b70 R08: 0000000000000234 R09: 0000000000aaaaaa
    R10: 0000000000000000 R11: ffff9e59b247cd50 R12: ffff9e59b112f100
    R13: ffff9e59b112f108 R14: 0000000000000001 R15: ffff9e59ab12d0c0
    FS: 0000000000000000(0000) GS:ffff9e59b4700000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000006af92004 CR4: 00000000001606e0
    Call Trace:
    tcf_classify+0x58/0x120
    __dev_queue_xmit+0x40a/0x890
    ? ndisc_next_option+0x50/0x50
    ? ___neigh_create+0x4d5/0x680
    ? ip6_finish_output2+0x1b5/0x590
    ip6_finish_output2+0x1b5/0x590
    ? ip6_output+0x68/0x110
    ip6_output+0x68/0x110
    ? nf_hook.constprop.28+0x79/0xc0
    ndisc_send_skb+0x248/0x2e0
    ndisc_send_ns+0xf8/0x200
    ? addrconf_dad_work+0x389/0x4b0
    addrconf_dad_work+0x389/0x4b0
    ? __switch_to_asm+0x34/0x70
    ? process_one_work+0x195/0x380
    ? addrconf_dad_completed+0x370/0x370
    process_one_work+0x195/0x380
    worker_thread+0x30/0x390
    ? process_one_work+0x380/0x380
    kthread+0x113/0x130
    ? kthread_park+0x90/0x90
    ret_from_fork+0x35/0x40
    Modules linked in: act_simple veth ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep mbcache snd_hda_core jbd2 snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd snd_timer glue_helper snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_net ttm net_failover virtio_console virtio_blk failover drm crc32c_intel serio_raw floppy ata_piix libata virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
    CR2: 0000000000000000

    Validating the control action within tcf_simple_init() proved to fix the
    above issue. A TDC selftest is added to verify the correct behavior.

    Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain")
    Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
     
  • - pass a pointer to struct tcf_proto in each action's init() handler,
    to allow validating the control action, checking whether the chain
    exists and (eventually) refcounting it.
    - remove code that validates the control action after a successful call
    to the action's init() handler, and replace it with a test that forbids
    addition of actions having 'goto_chain' and NULL goto_chain pointer at
    the same time.
    - add tcf_action_check_ctrlact(), that will validate the control action
    and eventually allocate the action 'goto_chain' within the init()
    handler.
    - add tcf_action_set_ctrlact(), that will assign the control action and
    swap the current 'goto_chain' pointer with the new given one.

    This disallows 'goto_chain' on actions that don't initialize it properly
    in their init() handler, i.e. calling tcf_action_check_ctrlact() after
    successful IDR reservation and then calling tcf_action_set_ctrlact()
    to assign 'goto_chain' and 'tcf_action' consistently.

    By doing this, the kernel no longer leaks refcounts when a valid
    'goto chain' handle is replaced in TC actions. That leak caused
    kmemleak splats like the following one:

    # tc chain add dev dd0 chain 42 ingress protocol ip flower \
    > ip_proto tcp action drop
    # tc chain add dev dd0 chain 43 ingress protocol ip flower \
    > ip_proto udp action drop
    # tc filter add dev dd0 ingress matchall \
    > action gact goto chain 42 index 66
    # tc filter replace dev dd0 ingress matchall \
    > action gact goto chain 43 index 66
    # echo scan >/sys/kernel/debug/kmemleak

    unreferenced object 0xffff93c0ee09f000 (size 1024):
    comm "tc", pid 2565, jiffies 4295339808 (age 65.426s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] tc_ctl_chain+0x3d2/0x4c0
    [] rtnetlink_rcv_msg+0x263/0x2d0
    [] netlink_rcv_skb+0x4a/0x110
    [] netlink_unicast+0x1a0/0x250
    [] netlink_sendmsg+0x2c1/0x3c0
    [] sock_sendmsg+0x36/0x40
    [] ___sys_sendmsg+0x280/0x2f0
    [] __sys_sendmsg+0x5e/0xa0
    [] do_syscall_64+0x5b/0x180
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [] 0xffffffffffffffff
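
    The check-then-set split described in this commit can be sketched in
    user space as follows. This is a simplified model under stated
    assumptions: the demo_* names, the bit encoding of 'goto chain' and
    the fixed-size chain table are all hypothetical, not the kernel's
    tcf_action_check_ctrlact()/tcf_action_set_ctrlact() code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding: one high bit marks 'goto chain', the low bits
 * carry the chain index. */
#define DEMO_GOTO_CHAIN 0x20000000
#define DEMO_CHAIN_MASK 0x0fffffff
#define DEMO_MAX_CHAINS 64

struct demo_chain  { unsigned int refcnt; };
struct demo_action { int control; struct demo_chain *goto_chain; };

static struct demo_chain demo_chains[DEMO_MAX_CHAINS];

/* Step 1: validate the control action and take a chain reference.
 * The action itself is not touched, so failure needs no rollback. */
int demo_check_ctrlact(int control, struct demo_chain **newchain)
{
    *newchain = NULL;
    if (control & DEMO_GOTO_CHAIN) {
        unsigned int idx = (unsigned int)control & DEMO_CHAIN_MASK;

        if (idx >= DEMO_MAX_CHAINS)
            return -1;              /* chain does not exist */
        demo_chains[idx].refcnt++;
        *newchain = &demo_chains[idx];
    }
    return 0;
}

/* Step 2: commit. Assign the control action and swap in the new chain;
 * the old chain is returned so the caller can drop its reference. */
struct demo_chain *demo_set_ctrlact(struct demo_action *a, int control,
                                    struct demo_chain *newchain)
{
    struct demo_chain *old = a->goto_chain;

    a->control = control;
    a->goto_chain = newchain;
    return old;
}

/* Release the reference held on a chain, if any. */
void demo_chain_put(struct demo_chain *chain)
{
    if (chain) {
        assert(chain->refcnt > 0);
        chain->refcnt--;
    }
}
```

    Replacing 'goto chain 42' with 'goto chain 43' then becomes a check,
    a set, and a put on the returned old chain, which leaves no stale
    reference on chain 42.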

    Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain")
    Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
     

11 Feb, 2019

2 commits

  • Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
    example, use TCA_ID_GACT instead of TCA_ACT_GACT. This aligns with
    TCA_ID_POLICE and also differentiates these identifiers, used in the
    struct tc_action_ops type field, from other macros starting with
    TCA_ACT_.

    To make things clearer, we name the enum defining the TCA_ID_*
    identifiers and also change the "type" field of struct tc_action_ops
    to id.

    Signed-off-by: Eli Cohen
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eli Cohen
     
  • Move all the TC identifiers to one place, to the same enum that defines
    the identifier of the police action. This makes it easier to choose
    numbers for new actions since they are now defined in one place. We
    preserve the original values for binary compatibility. New IDs should be
    added inside the enum.

    Signed-off-by: Eli Cohen
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eli Cohen
     

01 Sep, 2018

1 commit


22 Aug, 2018

1 commit

  • All ops->delete() wants is getting the tn->idrinfo, but we already
    have tc_action before calling ops->delete(), and tc_action has
    a pointer ->idrinfo.

    More importantly, each type of action does the same thing, that is,
    just calling tcf_idr_delete_index().

    So it can be just removed.

    Fixes: b409074e6693 ("net: sched: add 'delete' function to action ops")
    Cc: Jiri Pirko
    Cc: Vlad Buslov
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

14 Aug, 2018

1 commit


12 Aug, 2018

1 commit


08 Jul, 2018

5 commits

  • Implement function that atomically checks if action exists and either takes
    reference to it, or allocates idr slot for action index to prevent
    concurrent allocations of actions with same index. Use EBUSY error pointer
    to indicate that idr slot is reserved.

    Implement cleanup helper function that removes temporary error pointer from
    idr. (in case of error between idr allocation and insertion of newly
    created action to specified index)

    Refactor all action init functions to insert new action to idr using this
    API.
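
    A single-threaded sketch of the check-or-reserve idea, assuming a
    small fixed table in place of the idr and a sentinel pointer in place
    of the EBUSY error pointer. The demo_* names are hypothetical, and the
    locking that makes the real operation atomic is left out; returning
    -EBUSY to the caller is a simplification chosen for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_SLOTS 8

struct demo_action { unsigned int refcnt; };

/* Sentinel marking a slot that is reserved but not yet populated. */
static struct demo_action demo_busy;
#define DEMO_BUSY (&demo_busy)

static struct demo_action *demo_table[DEMO_SLOTS];

/* Returns 1 and takes a reference if the action exists, 0 if the slot
 * was free and is now reserved for the caller, -EBUSY if another
 * caller holds the reservation, -EINVAL for a bad index. */
int demo_check_alloc(unsigned int index, struct demo_action **out)
{
    struct demo_action *p;

    if (index >= DEMO_SLOTS)
        return -EINVAL;
    p = demo_table[index];
    if (p == DEMO_BUSY)
        return -EBUSY;
    if (p) {
        p->refcnt++;
        *out = p;
        return 1;
    }
    demo_table[index] = DEMO_BUSY;   /* reserve the slot */
    *out = NULL;
    return 0;
}

/* Replace the reservation with the newly created action. */
void demo_insert(unsigned int index, struct demo_action *a)
{
    assert(index < DEMO_SLOTS && demo_table[index] == DEMO_BUSY);
    a->refcnt = 1;
    demo_table[index] = a;
}

/* Error path: drop the reservation so the index can be used again. */
void demo_cleanup(unsigned int index)
{
    if (index < DEMO_SLOTS && demo_table[index] == DEMO_BUSY)
        demo_table[index] = NULL;
}
```

    An init function would call demo_check_alloc() first, build the action
    only when it returns 0, and finish with either demo_insert() on
    success or demo_cleanup() on failure.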

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Return from action init function with reference to action taken,
    even when overwriting existing action.

    Action init API initializes its fourth argument (pointer to pointer to tc
    action) to either existing action with same index or newly created action.
    In case of existing index (and bind argument is zero), init function returns
    without incrementing action reference counter. Caller of action init then
    proceeds working with action, without actually holding reference to it.
    This means that action could be deleted concurrently.

    Change action init behavior to always take reference to action before
    returning successfully, in order to protect from concurrent deletion.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extend action ops with 'delete' function. Each action type implements
    its own delete function that doesn't depend on rtnl lock.

    Implement delete function that is required to delete actions without
    holding rtnl lock. Use action API function that atomically deletes action
    only if it is still in action idr. This implementation prevents concurrent
    threads from deleting same action twice.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Add additional 'rtnl_held' argument to act API init functions. It is
    required to implement actions that need to release rtnl lock before loading
    kernel module and reacquire it afterwards.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Change type of action reference counter to refcount_t.

    Change type of action bind counter to atomic_t.
    This type is used to allow decrementing bind counter without testing
    for 0 result.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     

09 Jun, 2018

1 commit

  • use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA
    netlink attribute, in case it is less than SIMP_MAX_DATA and it does not
    end with '\0' character.
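
    The hazard and the bounded copy can be illustrated in user space.
    demo_attr_strlcpy() below is a hypothetical approximation written for
    this note, not the kernel's nla_strlcpy(); it takes the source length
    explicitly because the attribute payload may lack a terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most dstsize - 1 bytes from a length-delimited source and
 * always NUL-terminate the destination. Returns the source length
 * (without a trailing NUL), so truncation can be detected. */
size_t demo_attr_strlcpy(char *dst, size_t dstsize,
                         const char *src, size_t srclen)
{
    assert(dst && src);

    /* Ignore a terminating NUL that is part of the payload. */
    if (srclen > 0 && src[srclen - 1] == '\0')
        srclen--;

    if (dstsize > 0) {
        size_t len = srclen >= dstsize ? dstsize - 1 : srclen;

        memcpy(dst, src, len);   /* never reads past srclen bytes */
        dst[len] = '\0';
    }
    return srclen;
}
```

    The copy is bounded by both the payload length and the destination
    size, so a short, unterminated payload cannot cause a read beyond the
    attribute.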

    v2: fix errors in the commit message, thanks Hangbin Liu

    Fixes: fa1b1cff3d06 ("net_cls_act: Make act_simple use of netlink policy.")
    Signed-off-by: Davide Caratti
    Reviewed-by: Simon Horman
    Signed-off-by: David S. Miller

    Davide Caratti
     

28 Mar, 2018

1 commit


23 Mar, 2018

1 commit

  • Fun set of conflict resolutions here...

    For the mac80211 stuff, these were fortunately just parallel
    adds. Trivially resolved.

    In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
    function phy_disable_interrupts() earlier in the file, whilst in
    'net-next' the phy_error() call from this function was removed.

    In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
    'rt_table_id' member of rtable collided with a bug fix in 'net' that
    added a new struct member "rt_mtu_locked" which needs to be copied
    over here.

    The mlxsw driver conflict consisted of net-next separating
    the span code and definitions into separate files, whilst
    a 'net' bug fix made some changes to that moved code.

    The mlx5 infiniband conflict resolution was quite non-trivial,
    the RDMA tree's merge commit was used as a guide here, and
    here are their notes:

    ====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch. This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
    drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524
    (IB/mlx5: Fix cleanup order on unload) added to for-rc and
    commit b5ca15ad7e61 (IB/mlx5: Add proper representors support)
    added as part of the devel cycle both needed to modify the
    init/de-init functions used by mlx5. To support the new
    representors, the new functions added by the cleanup patch
    needed to be made non-static, and the init/de-init list
    added by the representors patch needed to be modified to
    match the init/de-init list changes made by the cleanup
    patch.
    Updates:
    drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
    prototypes added by representors patch to reflect new function
    names as changed by cleanup patch
    drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
    stage list to match new order from cleanup patch
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Mar, 2018

1 commit

  • If the kernel fails to duplicate 'sdata', creation of a new action fails
    with -ENOMEM. However, subsequent attempts to install the same action
    using the same value of 'index' systematically fail with -ENOSPC, and
    that value of 'index' is no longer usable by act_simple until rmmod /
    insmod of act_simple.ko is done:

    # tc actions add action simple sdata hello index 100
    # tc actions list action simple

    action order 0: Simple
    index 100 ref 1 bind 0
    # tc actions flush action simple
    # tc actions add action simple sdata hello index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    # tc actions flush action simple
    # tc actions add action simple sdata hello index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    # tc actions add action simple sdata hello index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    ...

    Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
    in place of tcf_idr_cleanup().

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Suggested-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
     

28 Feb, 2018

1 commit

  • These pernet_operations are from net/sched directory, and they call only
    tc_action_net_init() and tc_action_net_exit():

    bpf_net_ops
    connmark_net_ops
    csum_net_ops
    gact_net_ops
    ife_net_ops
    ipt_net_ops
    xt_net_ops
    mirred_net_ops
    nat_net_ops
    pedit_net_ops
    police_net_ops
    sample_net_ops
    simp_net_ops
    skbedit_net_ops
    skbmod_net_ops
    tunnel_key_net_ops
    vlan_net_ops

    1) tc_action_net_init() just allocates and initializes per-net memory.
    2) There should be no in-flight packets at the time of the
    tc_action_net_exit() call, and no other pernet_operations sending
    packets to the dying net (except netlink). So it seems they can be
    marked as async.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
     

17 Feb, 2018

4 commits


14 Dec, 2017

1 commit


06 Dec, 2017

1 commit


09 Nov, 2017

1 commit

  • This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59.
    If we hold that refcnt, the netns can never be destroyed until
    all actions are destroyed by the user. This breaks our netns design,
    in which we expect all actions to be destroyed when we destroy the
    whole netns.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

03 Nov, 2017

1 commit

  • TC actions have been destroyed asynchronously for a long time,
    previously in a RCU callback and now in a workqueue. If we
    don't hold a refcnt for its netns, we could use the per netns
    data structure, struct tcf_idrinfo, after it has been freed by
    netns workqueue.

    Hold refcnt to ensure netns destroy happens after all actions
    are gone.

    Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions")
    Reported-by: Lucas Bates
    Tested-by: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

31 Aug, 2017

1 commit

  • Typically, each TC filter has its own action. All the actions of the
    same type are saved in that type's hash table. But the hash table has
    so few buckets that it degrades to a list, and performance suffers
    greatly. For example, it takes about 0m11.914s to insert 64K rules.
    If we convert the hash table to IDR, it only takes about 0m1.500s.
    The improvement is huge.

    But please note that the test result is based on the previous patch,
    in which cls_flower uses IDR.

    Signed-off-by: Chris Mi
    Signed-off-by: Jiri Pirko
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Chris Mi
     

14 Apr, 2017

1 commit


18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into a zero-based array and
    thus is an unsigned entity. Using a negative value is an out-of-bounds
    access by definition.

    2)
    On x86_64, unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added to or subtracted from pointers
    are preferred to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
            g(p[i]);
    }

    roughly translates to

    movsx rsi, esi
    mov rdi, [rsi+...]
    call g

    MOVSX is a 3-byte instruction which isn't necessary if the variable is
    unsigned because x86_64 is zero extending by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
            ...
            ptr = ng->ptr[id - 1];
            ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a seemingly random artefact of code generation with the register
    allocator being used differently. gcc decides that some variable
    needs to live in the new r8+ registers and every access now requires a
    REX prefix. Or it is shifted into r12, so the [r12+0] addressing mode has
    to be used, which is longer than [r8].

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function old new delta
    nfsd4_lock 3886 3959 +73
    tipc_link_build_proto_msg 1096 1140 +44
    mac80211_hwsim_new_radio 2776 2808 +32
    tipc_mon_rcv 1032 1058 +26
    svcauth_gss_legacy_init 1413 1429 +16
    tipc_bcbase_select_primary 379 392 +13
    nfsd4_exchange_id 1247 1260 +13
    nfsd4_setclientid_confirm 782 793 +11
    ...
    put_client_renew_locked 494 480 -14
    ip_set_sockfn_get 730 716 -14
    geneve_sock_add 829 813 -16
    nfsd4_sequence_done 721 703 -18
    nlmclnt_lookup_host 708 686 -22
    nfsd4_lockt 1085 1063 -22
    nfs_get_client 1077 1050 -27
    tcf_bpf_init 1106 1076 -30
    nfsd4_encode_fattr 5997 5930 -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

26 Jul, 2016

1 commit

  • struct tc_action is confusing, currently we use it for two purposes:
    1) Pass in arguments and carry out results from helper functions
    2) A generic representation for tc actions

    The first one is error-prone, since we need to make sure we don't
    miss anything. This patch aims to get rid of this use, by moving
    tc_action into tcf_common, so that they are allocated together
    in hashtable and can be cast'ed easily.

    And together with the following patch, we could really make
    tc_action a generic representation for all tc actions and each
    type of action can inherit from it.

    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

16 Jun, 2016

1 commit


08 Jun, 2016

3 commits


16 May, 2016

1 commit

  • The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
    because we no longer have a per-netns conntrack hash.

    The ip_gre.c conflict as well as the iwlwifi ones were cases of
    overlapping changes.

    Conflicts:
    drivers/net/wireless/intel/iwlwifi/mvm/tx.c
    net/ipv4/ip_gre.c
    net/netfilter/nf_conntrack_core.c

    Signed-off-by: David S. Miller

    David S. Miller
     

11 May, 2016

1 commit

  • The process below was broken and is fixed with this patch.

    //add a simple action and give it an instance id of 1
    sudo tc actions add action simple sdata "foobar" index 1
    //create a filter which binds to simple action id 1
    sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
    match ip dst 17.0.0.1/32 flowid 1:10 action simple index 1

    Message before fix was:
    RTNETLINK answers: Invalid argument
    We have an error talking to the kernel

    Signed-off-by: Jamal Hadi Salim
    Reviewed-by: Cong Wang
    Signed-off-by: David S. Miller

    Jamal Hadi Salim