16 Apr, 2017

1 commit


14 Apr, 2017

1 commit


17 Mar, 2017

1 commit

  • refcount_t type and corresponding API (see include/linux/refcount.h)
    should be used instead of atomic_t when the variable is used as
    a reference counter. This avoids accidental refcounter
    overflows that might lead to use-after-free situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: Pablo Neira Ayuso

    Reshetova, Elena
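
To illustrate why a dedicated reference-counter type helps, here is a minimal userspace sketch. It is not the kernel's refcount_t implementation: the names demo_ref, plain_inc, saturating_inc and saturating_dec_and_test are hypothetical, the counter is not atomic, and the saturate-at-maximum policy is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in; NOT the kernel's refcount_t. */
struct demo_ref {
	unsigned int count;
};

/* Plain counter: wraps from UINT_MAX to 0, so a live object suddenly
 * looks unreferenced and may be freed while still in use. */
static void plain_inc(struct demo_ref *r)
{
	r->count++;
}

/* Saturating counter: sticks at UINT_MAX instead of wrapping. */
static void saturating_inc(struct demo_ref *r)
{
	if (r->count != UINT_MAX)
		r->count++;
}

/* Returns true when the last reference was dropped. A saturated
 * counter is never released: the object is leaked, not reused. */
static bool saturating_dec_and_test(struct demo_ref *r)
{
	assert(r->count != 0);	/* underflow is a caller bug */
	if (r->count == UINT_MAX)
		return false;
	return --r->count == 0;
}
```

The design trade-off in this sketch is that an overflow turns into a memory leak rather than a use-after-free.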
     

04 Feb, 2017

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter updates for net-next

    The following patchset contains Netfilter updates for your net-next
    tree, they are:

    1) Stash ctinfo 3-bit field into pointer to nf_conntrack object from
    sk_buff so we only access one single cacheline in the conntrack
    hotpath. Patchset from Florian Westphal.

    2) Don't leak pointer to internal structures when exporting x_tables
    ruleset back to userspace, from Willem de Bruijn. This includes new
    helper functions to copy data to userspace such as xt_data_to_user()
    as well as conversions of our ip_tables, ip6_tables and arp_tables
    clients to use it. Not surprisingly, ebtables requires an ad-hoc
    update. There is also a new field in x_tables extensions to indicate
    the amount of bytes that we copy to userspace.

    3) Add nf_log_all_netns sysctl: This new knob allows you to enable
    logging via nf_log infrastructure for all existing netnamespaces.
    Given the effort to provide pernet syslog has been discontinued,
    let's provide a way to restore logging using netfilter kernel logging
    facilities in trusted environments. Patch from Michal Kubecek.

    4) Validate SCTP checksum from conntrack helper, from Davide Caratti.

    5) Merge UDPlite conntrack and NAT helpers into UDP, this was mostly
    a copy&paste from the original helper, from Florian Westphal.

    6) Reset netfilter state when duplicating packets, also from Florian.

    7) Remove unnecessary check for broadcast in IPv6 in pkttype match and
    nft_meta, from Liping Zhang.

    8) Add missing code to deal with loopback packets from nft_meta when
    used by the netdev family, also from Liping.

    9) Several cleanups on nf_tables, one to remove an unnecessary check from
    the netlink control plane path to add table, set and stateful objects
    and code consolidation when unregistering chain hooks, from Gao Feng.

    10) Fix harmless reference counter underflow in IPVS that, however,
    results in problems with the introduction of the new refcount_t
    type, from David Windsor.

    11) Enable LIBCRC32C from nf_ct_sctp instead of nf_nat_sctp,
    from Davide Caratti.

    12) Missing documentation on nf_tables uapi header, from Liping Zhang.

    13) Use rb_entry() helper in xt_connlimit, from Geliang Tang.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Jan, 2017

1 commit

  • We can't access c->pde if CONFIG_PROC_FS is disabled:

    net/ipv4/netfilter/ipt_CLUSTERIP.c: In function 'clusterip_config_find_get':
    net/ipv4/netfilter/ipt_CLUSTERIP.c:147:9: error: 'struct clusterip_config' has no member named 'pde'

    This moves the check inside of another #ifdef.

    Fixes: 6c5d5cfbe3c5 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Pablo Neira Ayuso

    Arnd Bergmann
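
The build failure described above comes from touching a struct member that only exists under a config option. A minimal sketch of the pattern follows, assuming a hypothetical DEMO_PROC_FS macro in place of CONFIG_PROC_FS and a hypothetical demo_config struct. It is not the actual ipt_CLUSTERIP code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Uncomment to model a build with procfs support (hypothetical macro). */
/* #define DEMO_PROC_FS 1 */

struct demo_config {
	int refcount;
#ifdef DEMO_PROC_FS
	void *pde;	/* member only exists when procfs is built in */
#endif
};

/* Every access to ->pde must sit inside the same #ifdef as the member,
 * otherwise builds without the option fail with "no member named 'pde'". */
static bool demo_config_usable(const struct demo_config *c)
{
#ifdef DEMO_PROC_FS
	if (c->pde == NULL)
		return false;	/* proc node not created yet */
#endif
	return c->refcount > 0;
}
```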
     

10 Jan, 2017

1 commit

  • In matches and targets that define a kernel-only tail to their
    xt_match and xt_target data structs, add a field .usersize that
    specifies up to where data is to be shared with userspace.

    Performed a search for comment "Used internally by the kernel" to find
    relevant matches and targets. Manually inspected the structs to derive
    a valid offsetof.

    Signed-off-by: Willem de Bruijn
    Signed-off-by: Pablo Neira Ayuso

    Willem de Bruijn
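
As a rough illustration of the .usersize idea, the sketch below derives the shareable size with offsetof and copies only that prefix. The struct demo_match_info, the DEMO_USERSIZE macro and demo_data_to_user() are hypothetical names, memcpy stands in for a real copy to userspace, and zero-filling the tail is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match data: user-visible fields first, kernel-only tail last. */
struct demo_match_info {
	unsigned int limit;	/* set by userspace */
	unsigned int flags;	/* set by userspace */
	void *kernel_state;	/* Used internally by the kernel */
};

/* Number of leading bytes that may be shared with userspace. */
#define DEMO_USERSIZE offsetof(struct demo_match_info, kernel_state)

/* Stand-in for a copy to userspace: copy the shareable prefix and
 * zero the rest so the kernel pointer never leaves the kernel. */
static void demo_data_to_user(void *dst, const struct demo_match_info *src,
			      size_t dst_len)
{
	assert(dst_len >= sizeof(*src));
	memcpy(dst, src, DEMO_USERSIZE);
	memset((char *)dst + DEMO_USERSIZE, 0, dst_len - DEMO_USERSIZE);
}
```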
     

23 Dec, 2016

1 commit

  • Currently, when adding an ipt_CLUSTERIP rule, duplicate configs are only
    checked for in clusterip_config_find_get(). After that check, another
    thread may still insert a config with the same IP, which leaves the
    duplicate check to proc_create_data().

    It is more reasonable for ipt_CLUSTERIP to check for duplicate configs
    itself than to rely on the procfs duplicate file check. Previously, when
    procfs allowed files with duplicate names in a directory, this could even
    crash the kernel because of a use-after-free.

    This patch checks for a duplicate config under the protection of the
    clusterip_net lock when initializing a new config, and corrects the
    returned error.

    Note that it also moves proc file node creation after adding the new
    config: proc_create_data() may sleep, so it cannot be called under the
    clusterip_net lock. clusterip_config_find_get() returns NULL if c->pde is
    NULL, to make sure the config cannot be used until the proc file node
    creation is done.

    Suggested-by: Marcelo Ricardo Leitner
    Signed-off-by: Xin Long
    Signed-off-by: Pablo Neira Ayuso

    Xin Long
     

05 Dec, 2016

1 commit

  • currently aliased to try_module_get/_put.
    Will be changed in next patch when we add functions to make use of ->net
    argument to store usercount per l3proto tracker.

    This is needed to avoid registering the conntrack hooks in all netns and
    later only enable connection tracking in those that need conntrack.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into a zero-based array and
    thus is an unsigned entity. Using a negative value is an
    out-of-bounds access by definition.

    2)
    On x86_64, unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added or subtracted to pointers
    are preferred to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
            g(p[i]);
    }

    roughly translates to

            movsx   rsi, esi
            mov     rdi, [rsi+...]
            call    g

    MOVSX is a 3-byte instruction which isn't necessary if the variable is
    unsigned, because x86_64 zero-extends by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
            ...
            ptr = ng->ptr[id - 1];
            ...
    }

    And this function is used a lot, so those sign extensions add up.

    The patch snips ~1730 bytes off an allyesconfig kernel (without all the
    junk messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately, some functions actually grow bigger.
    This is a seemingly random artefact of code generation, with the register
    allocator being used differently. gcc decides that some variable
    needs to live in the new r8+ registers and every access now requires a REX
    prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
    used, which is longer than [r8].

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function                      old   new  delta
    nfsd4_lock                   3886  3959    +73
    tipc_link_build_proto_msg    1096  1140    +44
    mac80211_hwsim_new_radio     2776  2808    +32
    tipc_mon_rcv                 1032  1058    +26
    svcauth_gss_legacy_init      1413  1429    +16
    tipc_bcbase_select_primary    379   392    +13
    nfsd4_exchange_id            1247  1260    +13
    nfsd4_setclientid_confirm     782   793    +11
    ...
    put_client_renew_locked       494   480    -14
    ip_set_sockfn_get             730   716    -14
    geneve_sock_add               829   813    -16
    nfsd4_sequence_done           721   703    -18
    nlmclnt_lookup_host           708   686    -22
    nfsd4_lockt                  1085  1063    -22
    nfs_get_client               1077  1050    -27
    tcf_bpf_init                 1106  1076    -30
    nfsd4_encode_fattr           5997  5930    -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
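
The signed versus unsigned index argument above can be tried out with a small sketch like the following. The function names lookup_signed and lookup_unsigned are hypothetical, and whether a sign-extension instruction is actually emitted depends on the compiler, its version and the optimization flags, so comparing the output of `gcc -O2 -S` for the two functions is the way to check.

```c
#include <assert.h>

/* Signed index, mirroring the old "int id" parameter: on x86_64 the
 * compiler may need to sign-extend the index to 64 bits before using it. */
static long lookup_signed(const long *slots, int id)
{
	assert(id > 0);
	return slots[id - 1];
}

/* Unsigned index: 32-bit operations on x86_64 already zero-extend,
 * so no separate extension instruction should be required. */
static long lookup_unsigned(const long *slots, unsigned int id)
{
	assert(id > 0);
	return slots[id - 1];
}
```

Both functions return the same value for every valid id. Only the generated code may differ.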
     

17 Oct, 2015

1 commit


19 Sep, 2015

1 commit


18 Sep, 2015

1 commit

  • Instead of saying "net = dev_net(state->in?state->in:state->out)"
    just say "state->net". Since that information is now available,
    this is much less confusing and much less error-prone.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

16 May, 2015

1 commit

  • Currently, we have four xtables extensions that cannot be used from the
    xt over nft compat layer. The problem is that they need real access to
    the full blown xt_entry to validate that the rule comes with the right
    dependencies. This check was introduced to overcome the lack of
    sufficient userspace dependency validation in iptables.

    To resolve this problem, this patch introduces a new field to the
    xt_tgchk_param structure that tells us if the extension is run from
    nft_compat context.

    The three affected extensions are:

    1) CLUSTERIP, this target has been superseded by xt_cluster. So just
    bail out by returning -EINVAL.

    2) TCPMSS. Relax the checking when used from nft_compat. If used with
    the wrong configuration, it will corrupt !syn packets by adding TCP
    MSS option.

    3) ebt_stp. Relax the check to make sure it uses the reserved
    destination MAC address for STP.

    Signed-off-by: Pablo Neira Ayuso
    Tested-by: Arturo Borrero Gonzalez

    Pablo Neira Ayuso
     

05 Apr, 2015

1 commit


06 Mar, 2015

1 commit

  • xt_cluster supersedes ipt_CLUSTERIP since it can also be used in
    gateway configurations (not only from the backend side).

    ipt_CLUSTERIP is also known to leak the netdev that it uses on
    device removal, which requires a rather large fix to work around
    the problem: http://patchwork.ozlabs.org/patch/358629/

    So let's deprecate this so we can probably kill this code in the
    future.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

24 Aug, 2014

1 commit


05 Nov, 2013

1 commit

  • Pablo Neira Ayuso says:

    ====================
    This is another batch containing Netfilter/IPVS updates for your net-next
    tree, they are:

    * Six patches to make the ipt_CLUSTERIP target support netnamespace,
    from Gao Feng.

    * Two cleanups for the nf_conntrack_acct infrastructure, introducing
    a new structure to encapsulate conntrack counters, from Holger
    Eitzenberger.

    * Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann.

    * Skip checksum recalculation in SCTP support for IPVS, also from
    Daniel Borkmann.

    * Fix behavioural change in xt_socket after IP early demux, from
    Florian Westphal.

    * Fix bogus large memory allocation in the bitmap port set type in ipset,
    from Jozsef Kadlecsik.

    * Fix possible compilation issues in the hash netnet set type in ipset,
    also from Jozsef Kadlecsik.

    * Define constants to identify netlink callback data in ipset dumps,
    again from Jozsef Kadlecsik.

    * Use sock_gen_put() in xt_socket to replace xt_socket_put_sk,
    from Eric Dumazet.

    * Improvements for the SH scheduler in IPVS, from Alexander Frolkin.

    * Remove extra delay due to unneeded rcu barrier in IPVS net namespace
    cleanup path, from Julian Anastasov.

    * Save some cycles in ip6t_REJECT by skipping checksum validation in
    packets leaving from our stack, from Stanislav Fomichev.

    * Fix IPVS_CMD_ATTR_MAX definition in IPVS, larger than required, from
    Julian Anastasov.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

17 Oct, 2013

6 commits


14 Oct, 2013

1 commit


02 May, 2013

1 commit

  • Supply a function (proc_remove()) to remove a proc entry (and any subtree
    rooted there) by proc_dir_entry pointer rather than by name and (optionally)
    root dir entry pointer. This allows us to eliminate all remaining pde->name
    accesses outside of procfs.

    Signed-off-by: David Howells
    Acked-by: Grant Likely
    cc: linux-acpi@vger.kernel.org
    cc: openipmi-developer@lists.sourceforge.net
    cc: devicetree-discuss@lists.ozlabs.org
    cc: linux-pci@vger.kernel.org
    cc: netdev@vger.kernel.org
    cc: netfilter-devel@vger.kernel.org
    cc: alsa-devel@alsa-project.org
    Signed-off-by: Al Viro

    David Howells
     

10 Apr, 2013

1 commit

  • The only part of proc_dir_entry the code outside of fs/proc
    really cares about is PDE(inode)->data. Provide a helper
    for that; static inline for now, eventually will be moved
    to fs/proc, along with the knowledge of struct proc_dir_entry
    layout.

    Signed-off-by: Al Viro

    Al Viro
     

23 Feb, 2013

1 commit


11 Dec, 2012

1 commit


16 May, 2012

1 commit


01 Nov, 2011

1 commit


02 Jul, 2011

1 commit

  • Make the case labels the same indent as the switch.

    git diff -w shows miscellaneous 80 column wrapping,
    comment reflowing and a comment for a useless gcc
    warning for an otherwise unused default: case.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

06 Jun, 2011

1 commit

  • The following warning is raised (along with other similar ones):

    net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’:
    net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’
    not in enumerated type ‘enum ip_conntrack_info’

    gcc complains when two enum values are added and the result is not
    itself an enumerated value:

    case IP_CT_RELATED+IP_CT_IS_REPLY:

    Add the missing enum values.

    Signed-off-by: Eric Dumazet
    CC: David Miller
    Signed-off-by: Pablo Neira Ayuso

    Eric Dumazet
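
The fix can be pictured with a small sketch in which the summed values get names of their own. The enum demo_ct_info, its enumerator names and demo_describe() are hypothetical, and the numeric ordering is an assumption chosen so that "related + is_reply" equals 4, the value named in the warning.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum modelled on the warning above; values are assumed. */
enum demo_ct_info {
	DEMO_CT_ESTABLISHED,	/* 0 */
	DEMO_CT_RELATED,	/* 1 */
	DEMO_CT_NEW,		/* 2 */
	DEMO_CT_IS_REPLY,	/* 3 */
	/* Named combinations, so a case label is a real enumerator. */
	DEMO_CT_ESTABLISHED_REPLY = DEMO_CT_ESTABLISHED + DEMO_CT_IS_REPLY,
	DEMO_CT_RELATED_REPLY = DEMO_CT_RELATED + DEMO_CT_IS_REPLY,
};

static const char *demo_describe(enum demo_ct_info info)
{
	switch (info) {
	case DEMO_CT_ESTABLISHED:
		return "established";
	case DEMO_CT_RELATED:
		return "related";
	case DEMO_CT_NEW:
		return "new";
	case DEMO_CT_ESTABLISHED_REPLY:
		return "established reply";
	case DEMO_CT_RELATED_REPLY:	/* instead of RELATED + IS_REPLY */
		return "related reply";
	default:
		return "unknown";
	}
}
```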
     

20 Mar, 2011

1 commit

  • 'buffer' string is copied from userspace. It is not checked whether it is
    zero terminated. This may lead to overflow inside of simple_strtoul().
    Changli Gao suggested to copy not more than user supplied 'size' bytes.

    It was introduced before the git epoch. Files "ipt_CLUSTERIP/*" are
    root writable only by default, however, on some setups permissions might be
    relaxed to e.g. network admin user.

    Signed-off-by: Vasiliy Kulikov
    Acked-by: Changli Gao
    Signed-off-by: Patrick McHardy

    Vasiliy Kulikov
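
A minimal userspace sketch of the bounded-copy idea follows. The helper demo_parse_node() and its 16-byte buffer are hypothetical, memcpy stands in for copying from userspace, and strtoul stands in for simple_strtoul(). The point is only that at most `size` bytes are copied and the result is always NUL-terminated before parsing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse a decimal number from a buffer that may lack a NUL terminator.
 * Hypothetical helper; not the actual ipt_CLUSTERIP proc write handler. */
static unsigned long demo_parse_node(const char *ubuf, size_t size)
{
	char buffer[16];
	size_t n = size < sizeof(buffer) - 1 ? size : sizeof(buffer) - 1;

	memcpy(buffer, ubuf, n);	/* never more than the caller supplied */
	buffer[n] = '\0';		/* guarantee termination before parsing */
	return strtoul(buffer, NULL, 10);
}
```

In the test below, the fourth byte of the input is deliberately left out of `size`, so it must not influence the parsed value.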
     

18 Jan, 2011

1 commit


20 Aug, 2010

1 commit


22 Jul, 2010

1 commit


15 Jun, 2010

1 commit

  • - clusterip_lock becomes a spinlock
    - lockless lookups
    - kfree() deferred after RCU grace period
    - rcu_barrier_bh() inserted in clusterip_tg_exit()

    v2)
    - As Patrick pointed out, we use atomic_inc_not_zero() in
    clusterip_config_find_get().
    - list_add_rcu() and list_del_rcu() variants are used.
    - atomic_dec_and_lock() used in clusterip_config_entry_put()

    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Eric Dumazet
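
The "increment unless zero" step that makes lockless lookups safe can be sketched in portable C11 atomics. This is an illustration of the technique, not the kernel's atomic_inc_not_zero(). The name demo_inc_not_zero is hypothetical, and the RCU list handling around it is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still alive (count != 0).
 * A lockless reader may find an entry whose last reference was just
 * dropped, so it must not resurrect a counter that already hit zero. */
static bool demo_inc_not_zero(atomic_int *refs)
{
	int old = atomic_load(refs);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(refs, &old, old + 1))
			return true;
	}
	return false;	/* object is on its way to being freed */
}
```

A lookup that gets false from this helper would treat the entry as not found.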
     

12 May, 2010

1 commit


20 Apr, 2010

1 commit


12 Apr, 2010

1 commit