29 Aug, 2020

1 commit

  • Frontend callback reports EAGAIN to nfnetlink to retry a command, this
    is used to signal that module autoloading is required. Unfortunately,
    nlmsg_unicast() reports EAGAIN in case the receiver socket buffer gets
    full, so it enters a busy-loop.

    This patch updates nfnetlink_unicast() to turn EAGAIN into ENOBUFS and
    to use nlmsg_unicast(). Remove the flags field in nfnetlink_unicast()
    since this is always MSG_DONTWAIT in the existing code which is exactly
    what nlmsg_unicast() passes to netlink_unicast() as parameter.

    Fixes: 96518518cc41 ("netfilter: add nftables")
    Reported-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

17 Jul, 2020

1 commit

  • Using uninitialized_var() is dangerous as it papers over real bugs[1]
    (or can in the future), and suppresses unrelated compiler warnings
    (e.g. "unused variable"). If the compiler thinks it is uninitialized,
    either simply initialize the variable or make compiler changes.

    In preparation for removing[2] the[3] macro[4], remove all remaining
    needless uses with the following script:

    git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
    xargs perl -pi -e \
    's/\buninitialized_var\(([^\)]+)\)/\1/g;
    s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

    drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
    pathological white-space.

    No outstanding warnings were found building allmodconfig with GCC 9.3.0
    for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
    alpha, and m68k.

    [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
    [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
    [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
    [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

    Reviewed-by: Leon Romanovsky # drivers/infiniband and mlx4/mlx5
    Acked-by: Jason Gunthorpe # IB
    Acked-by: Kalle Valo # wireless drivers
    Reviewed-by: Chao Yu # erofs
    Signed-off-by: Kees Cook

    Kees Cook
     

26 Aug, 2019

1 commit

  • Currently, there is no vlan information (e.g. when used with a vlan aware
    bridge) passed to userspache, HWHEADER will contain an 08 00 (ip) suffix
    even for tagged ip packets.

    Therefore, add an extra netlink attribute that passes the vlan information
    to userspace similarly to 15824ab29f for nfqueue.

    Signed-off-by: Michael Braun
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Michael Braun
     

13 Aug, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

22 Apr, 2019

1 commit

  • setting net.netfilter.nf_conntrack_timestamp=1 breaks xmit with fq
    scheduler. skb->tstamp might be "refreshed" using ktime_get_real(),
    but fq expects CLOCK_MONOTONIC.

    This patch removes all places in netfilter that check/set skb->tstamp:

    1. To fix the bogus "start" time seen with conntrack timestamping for
    outgoing packets, never use skb->tstamp and always use current time.
    2. In nfqueue and nflog, only use skb->tstamp for incoming packets,
    as determined by current hook (prerouting, input, forward).
    3. xt_time has to use system clock as well rather than skb->tstamp.
    We could still use skb->tstamp for prerouting/input/foward, but
    I see no advantage to make this conditional.

    Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
    Cc: Eric Dumazet
    Reported-by: Michal Soltys
    Signed-off-by: Florian Westphal
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

01 Dec, 2018

1 commit

  • Now that call_rcu()'s callback is not invoked until after bh-disable
    regions of code have completed (in addition to explicitly marked
    RCU read-side critical sections), call_rcu() can be used in place
    of call_rcu_bh(). Similarly, rcu_barrier() can be used in place of
    rcu_barrier_bh() and synchronize_rcu() in place of synchronize_rcu_bh().
    This commit therefore makes these changes.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Pablo Neira Ayuso

    Paul E. McKenney
     

07 Jun, 2018

1 commit

  • Pull networking updates from David Miller:

    1) Add Maglev hashing scheduler to IPVS, from Inju Song.

    2) Lots of new TC subsystem tests from Roman Mashak.

    3) Add TCP zero copy receive and fix delayed acks and autotuning with
    SO_RCVLOWAT, from Eric Dumazet.

    4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
    Brouer.

    5) Add ttl inherit support to vxlan, from Hangbin Liu.

    6) Properly separate ipv6 routes into their logically independant
    components. fib6_info for the routing table, and fib6_nh for sets of
    nexthops, which thus can be shared. From David Ahern.

    7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
    messages from XDP programs. From Nikita V. Shirokov.

    8) Lots of long overdue cleanups to the r8169 driver, from Heiner
    Kallweit.

    9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

    10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

    11) Plumb extack down into fib_rules, from Roopa Prabhu.

    12) Add Flower classifier offload support to igb, from Vinicius Costa
    Gomes.

    13) Add UDP GSO support, from Willem de Bruijn.

    14) Add documentation for eBPF helpers, from Quentin Monnet.

    15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

    16) Allow applications to be given the number of bytes available to read
    on a socket via a control message returned from recvmsg(), from
    Soheil Hassas Yeganeh.

    17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

    18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
    From Björn Töpel.

    19) Remove indirect load support from all of the BPF JITs and handle
    these operations in the verifier by translating them into native BPF
    instead. From Daniel Borkmann.

    20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

    21) Allow XDP programs to do lookups in the main kernel routing tables
    for forwarding. From David Ahern.

    22) Allow drivers to store hardware state into an ELF section of kernel
    dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

    23) Various RACK and loss detection improvements in TCP, from Yuchung
    Cheng.

    24) Add TCP SACK compression, from Eric Dumazet.

    25) Add User Mode Helper support and basic bpfilter infrastructure, from
    Alexei Starovoitov.

    26) Support ports and protocol values in RTM_GETROUTE, from Roopa
    Prabhu.

    27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
    Brouer.

    28) Add lots of forwarding selftests, from Petr Machata.

    29) Add generic network device failover driver, from Sridhar Samudrala.

    * ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
    strparser: Add __strp_unpause and use it in ktls.
    rxrpc: Fix terminal retransmission connection ID to include the channel
    net: hns3: Optimize PF CMDQ interrupt switching process
    net: hns3: Fix for VF mailbox receiving unknown message
    net: hns3: Fix for VF mailbox cannot receiving PF response
    bnx2x: use the right constant
    Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
    net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
    enic: fix UDP rss bits
    netdev-FAQ: clarify DaveM's position for stable backports
    rtnetlink: validate attributes in do_setlink()
    mlxsw: Add extack messages for port_{un, }split failures
    netdevsim: Add extack error message for devlink reload
    devlink: Add extack to reload and port_{un, }split operations
    net: metrics: add proper netlink validation
    ipmr: fix error path when ipmr_new_table fails
    ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
    net: hns3: remove unused hclgevf_cfg_func_mta_filter
    netfilter: provide udp*_lib_lookup for nf_tproxy
    qed*: Utilize FW 8.37.2.0
    ...

    Linus Torvalds
     

16 May, 2018

1 commit

  • Variants of proc_create{,_data} that directly take a struct seq_operations
    and deal with network namespaces in ->open and ->release. All callers of
    proc_create + seq_open_net converted over, and seq_{open,release}_net are
    removed entirely.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

19 Apr, 2018

1 commit


19 Jan, 2018

1 commit

  • /proc has been ignoring struct file_operations::owner field for 10 years.
    Specifically, it started with commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba
    ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where
    inode->i_fop is initialized with proxy struct file_operations for
    regular files:

    - if (de->proc_fops)
    - inode->i_fop = de->proc_fops;
    + if (de->proc_fops) {
    + if (S_ISREG(inode->i_mode))
    + inode->i_fop = &proc_reg_file_ops;
    + else
    + inode->i_fop = de->proc_fops;
    + }

    VFS stopped pinning module at this point.

    # ipvs
    Acked-by: Julian Anastasov
    Signed-off-by: Alexey Dobriyan
    Acked-by: Simon Horman
    Signed-off-by: Pablo Neira Ayuso

    Alexey Dobriyan
     

14 Dec, 2017

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    The follow patchset contains Netfilter fixes for your net tree,
    they are:

    1) Fix compilation warning in x_tables with clang due to useless
    redundant reassignment, from Colin Ian King.

    2) Add bugtrap to net_exit to catch uninitialized lists, patch
    from Vasily Averin.

    3) Fix out of bounds memory reads in H323 conntrack helper, this
    comes with an initial patch to remove replace the obscure
    CHECK_BOUND macro as a dependency. From Eric Sesterhenn.

    4) Reduce retransmission timeout when window is 0 in TCP conntrack,
    from Florian Westphal.

    6) ctnetlink clamp timeout to INT_MAX if timeout is too large,
    otherwise timeout wraps around and it results in killing the
    entry that is being added immediately.

    7) Missing CAP_NET_ADMIN checks in cthelper and xt_osf, due to
    no netns support. From Kevin Cernekee.

    8) Missing maximum number of instructions checks in xt_bpf, patch
    from Jann Horn.

    9) With no CONFIG_PROC_FS ipt_CLUSTERIP compilation breaks,
    patch from Arnd Bergmann.

    10) Missing netlink attribute policy in nftables exthdr, from
    Florian Westphal.

    11) Enable conntrack with IPv6 MASQUERADE rules, as a357b3f80bc8
    should have done in first place, from Konstantin Khlebnikov.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Nov, 2017

1 commit

  • This converts all remaining cases of the old setup_timer() API into using
    timer_setup(), where the callback argument is the structure already
    holding the struct timer_list. These should have no behavioral changes,
    since they just change which pointer is passed into the callback with
    the same available pointers after conversion. It handles the following
    examples, in addition to some other variations.

    Casting from unsigned long:

    void my_callback(unsigned long data)
    {
    struct something *ptr = (struct something *)data;
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, ptr);

    and forced object casts:

    void my_callback(struct something *ptr)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

    become:

    void my_callback(struct timer_list *t)
    {
    struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

    Direct function assignments:

    void my_callback(unsigned long data)
    {
    struct something *ptr = (struct something *)data;
    ...
    }
    ...
    ptr->my_timer.function = my_callback;

    have a temporary cast added, along with converting the args:

    void my_callback(struct timer_list *t)
    {
    struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

    And finally, callbacks without a data assignment:

    void my_callback(unsigned long data)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, 0);

    have their argument renamed to verify they're unused during conversion:

    void my_callback(struct timer_list *unused)
    {
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

    The conversion is done with the following Coccinelle script:

    spatch --very-quiet --all-includes --include-headers \
    -I ./arch/x86/include -I ./arch/x86/include/generated \
    -I ./include -I ./arch/x86/include/uapi \
    -I ./arch/x86/include/generated/uapi -I ./include/uapi \
    -I ./include/generated/uapi --include ./include/linux/kconfig.h \
    --dir . \
    --cocci-file ~/src/data/timer_setup.cocci

    @fix_address_of@
    expression e;
    @@

    setup_timer(
    -&(e)
    +&e
    , ...)

    // Update any raw setup_timer() usages that have a NULL callback, but
    // would otherwise match change_timer_function_usage, since the latter
    // will update all function assignments done in the face of a NULL
    // function initialization in setup_timer().
    @change_timer_function_usage_NULL@
    expression _E;
    identifier _timer;
    type _cast_data;
    @@

    (
    -setup_timer(&_E->_timer, NULL, _E);
    +timer_setup(&_E->_timer, NULL, 0);
    |
    -setup_timer(&_E->_timer, NULL, (_cast_data)_E);
    +timer_setup(&_E->_timer, NULL, 0);
    |
    -setup_timer(&_E._timer, NULL, &_E);
    +timer_setup(&_E._timer, NULL, 0);
    |
    -setup_timer(&_E._timer, NULL, (_cast_data)&_E);
    +timer_setup(&_E._timer, NULL, 0);
    )

    @change_timer_function_usage@
    expression _E;
    identifier _timer;
    struct timer_list _stl;
    identifier _callback;
    type _cast_func, _cast_data;
    @@

    (
    -setup_timer(&_E->_timer, _callback, _E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, &_callback, _E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, _callback, (_cast_data)_E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, (_cast_func)_callback, _E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E._timer, _callback, (_cast_data)_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, _callback, (_cast_data)&_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, &_callback, (_cast_data)_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
    +timer_setup(&_E._timer, _callback, 0);
    |
    _E->_timer@_stl.function = _callback;
    |
    _E->_timer@_stl.function = &_callback;
    |
    _E->_timer@_stl.function = (_cast_func)_callback;
    |
    _E->_timer@_stl.function = (_cast_func)&_callback;
    |
    _E._timer@_stl.function = _callback;
    |
    _E._timer@_stl.function = &_callback;
    |
    _E._timer@_stl.function = (_cast_func)_callback;
    |
    _E._timer@_stl.function = (_cast_func)&_callback;
    )

    // callback(unsigned long arg)
    @change_callback_handle_cast
    depends on change_timer_function_usage@
    identifier change_timer_function_usage._callback;
    identifier change_timer_function_usage._timer;
    type _origtype;
    identifier _origarg;
    type _handletype;
    identifier _handle;
    @@

    void _callback(
    -_origtype _origarg
    +struct timer_list *t
    )
    {
    (
    ... when != _origarg
    _handletype *_handle =
    -(_handletype *)_origarg;
    +from_timer(_handle, t, _timer);
    ... when != _origarg
    |
    ... when != _origarg
    _handletype *_handle =
    -(void *)_origarg;
    +from_timer(_handle, t, _timer);
    ... when != _origarg
    |
    ... when != _origarg
    _handletype *_handle;
    ... when != _handle
    _handle =
    -(_handletype *)_origarg;
    +from_timer(_handle, t, _timer);
    ... when != _origarg
    |
    ... when != _origarg
    _handletype *_handle;
    ... when != _handle
    _handle =
    -(void *)_origarg;
    +from_timer(_handle, t, _timer);
    ... when != _origarg
    )
    }

    // callback(unsigned long arg) without existing variable
    @change_callback_handle_cast_no_arg
    depends on change_timer_function_usage &&
    !change_callback_handle_cast@
    identifier change_timer_function_usage._callback;
    identifier change_timer_function_usage._timer;
    type _origtype;
    identifier _origarg;
    type _handletype;
    @@

    void _callback(
    -_origtype _origarg
    +struct timer_list *t
    )
    {
    + _handletype *_origarg = from_timer(_origarg, t, _timer);
    +
    ... when != _origarg
    - (_handletype *)_origarg
    + _origarg
    ... when != _origarg
    }

    // Avoid already converted callbacks.
    @match_callback_converted
    depends on change_timer_function_usage &&
    !change_callback_handle_cast &&
    !change_callback_handle_cast_no_arg@
    identifier change_timer_function_usage._callback;
    identifier t;
    @@

    void _callback(struct timer_list *t)
    { ... }

    // callback(struct something *handle)
    @change_callback_handle_arg
    depends on change_timer_function_usage &&
    !match_callback_converted &&
    !change_callback_handle_cast &&
    !change_callback_handle_cast_no_arg@
    identifier change_timer_function_usage._callback;
    identifier change_timer_function_usage._timer;
    type _handletype;
    identifier _handle;
    @@

    void _callback(
    -_handletype *_handle
    +struct timer_list *t
    )
    {
    + _handletype *_handle = from_timer(_handle, t, _timer);
    ...
    }

    // If change_callback_handle_arg ran on an empty function, remove
    // the added handler.
    @unchange_callback_handle_arg
    depends on change_timer_function_usage &&
    change_callback_handle_arg@
    identifier change_timer_function_usage._callback;
    identifier change_timer_function_usage._timer;
    type _handletype;
    identifier _handle;
    identifier t;
    @@

    void _callback(struct timer_list *t)
    {
    - _handletype *_handle = from_timer(_handle, t, _timer);
    }

    // We only want to refactor the setup_timer() data argument if we've found
    // the matching callback. This undoes changes in change_timer_function_usage.
    @unchange_timer_function_usage
    depends on change_timer_function_usage &&
    !change_callback_handle_cast &&
    !change_callback_handle_cast_no_arg &&
    !change_callback_handle_arg@
    expression change_timer_function_usage._E;
    identifier change_timer_function_usage._timer;
    identifier change_timer_function_usage._callback;
    type change_timer_function_usage._cast_data;
    @@

    (
    -timer_setup(&_E->_timer, _callback, 0);
    +setup_timer(&_E->_timer, _callback, (_cast_data)_E);
    |
    -timer_setup(&_E._timer, _callback, 0);
    +setup_timer(&_E._timer, _callback, (_cast_data)&_E);
    )

    // If we fixed a callback from a .function assignment, fix the
    // assignment cast now.
    @change_timer_function_assignment
    depends on change_timer_function_usage &&
    (change_callback_handle_cast ||
    change_callback_handle_cast_no_arg ||
    change_callback_handle_arg)@
    expression change_timer_function_usage._E;
    identifier change_timer_function_usage._timer;
    identifier change_timer_function_usage._callback;
    type _cast_func;
    typedef TIMER_FUNC_TYPE;
    @@

    (
    _E->_timer.function =
    -_callback
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E->_timer.function =
    -&_callback
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E->_timer.function =
    -(_cast_func)_callback;
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E->_timer.function =
    -(_cast_func)&_callback
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E._timer.function =
    -_callback
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E._timer.function =
    -&_callback;
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E._timer.function =
    -(_cast_func)_callback
    +(TIMER_FUNC_TYPE)_callback
    ;
    |
    _E._timer.function =
    -(_cast_func)&_callback
    +(TIMER_FUNC_TYPE)_callback
    ;
    )

    // Sometimes timer functions are called directly. Replace matched args.
    @change_timer_function_calls
    depends on change_timer_function_usage &&
    (change_callback_handle_cast ||
    change_callback_handle_cast_no_arg ||
    change_callback_handle_arg)@
    expression _E;
    identifier change_timer_function_usage._timer;
    identifier change_timer_function_usage._callback;
    type _cast_data;
    @@

    _callback(
    (
    -(_cast_data)_E
    +&_E->_timer
    |
    -(_cast_data)&_E
    +&_E._timer
    |
    -_E
    +&_E->_timer
    )
    )

    // If a timer has been configured without a data argument, it can be
    // converted without regard to the callback argument, since it is unused.
    @match_timer_function_unused_data@
    expression _E;
    identifier _timer;
    identifier _callback;
    @@

    (
    -setup_timer(&_E->_timer, _callback, 0);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, _callback, 0L);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E->_timer, _callback, 0UL);
    +timer_setup(&_E->_timer, _callback, 0);
    |
    -setup_timer(&_E._timer, _callback, 0);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, _callback, 0L);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_E._timer, _callback, 0UL);
    +timer_setup(&_E._timer, _callback, 0);
    |
    -setup_timer(&_timer, _callback, 0);
    +timer_setup(&_timer, _callback, 0);
    |
    -setup_timer(&_timer, _callback, 0L);
    +timer_setup(&_timer, _callback, 0);
    |
    -setup_timer(&_timer, _callback, 0UL);
    +timer_setup(&_timer, _callback, 0);
    |
    -setup_timer(_timer, _callback, 0);
    +timer_setup(_timer, _callback, 0);
    |
    -setup_timer(_timer, _callback, 0L);
    +timer_setup(_timer, _callback, 0);
    |
    -setup_timer(_timer, _callback, 0UL);
    +timer_setup(_timer, _callback, 0);
    )

    @change_callback_unused_data
    depends on match_timer_function_unused_data@
    identifier match_timer_function_unused_data._callback;
    type _origtype;
    identifier _origarg;
    @@

    void _callback(
    -_origtype _origarg
    +struct timer_list *unused
    )
    {
    ... when != _origarg
    }

    Signed-off-by: Kees Cook

    Kees Cook
     

20 Nov, 2017

1 commit


02 Aug, 2017

1 commit

  • The nf_loginfo structures are only passed as the seventh argument to
    nf_log_trace, which is declared as const or stored in a local const
    variable. Thus the nf_loginfo structures themselves can be const.

    Done with the help of Coccinelle.

    //
    @r disable optional_qualifier@
    identifier i;
    position p;
    @@
    static struct nf_loginfo i@p = { ... };

    @ok1@
    identifier r.i;
    expression list[6] es;
    position p;
    @@
    nf_log_trace(es,&i@p,...)

    @ok2@
    identifier r.i;
    const struct nf_loginfo *e;
    position p;
    @@
    e = &i@p

    @bad@
    position p != {r.p,ok1.p,ok2.p};
    identifier r.i;
    struct nf_loginfo e;
    @@
    e@i@p

    @depends on !bad disable optional_qualifier@
    identifier r.i;
    @@
    static
    +const
    struct nf_loginfo i = { ... };
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: Pablo Neira Ayuso

    Julia Lawall
     

30 Jun, 2017

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter updates for net-next

    The following patchset contains Netfilter updates for your net-next
    tree. This batch contains connection tracking updates for the cleanup
    iteration path, patches from Florian Westphal:

    X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set
    dying bit to let the CPU release them.

    X) Add nf_ct_iterate_destroy() to be used on module removal, to kill
    conntrack from all namespace.

    X) Restart iteration on hashtable resizing, since both may occur at
    the same time.

    X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT
    mapping on module removal.

    X) Use nf_ct_iterate_destroy() to remove conntrack entries helper
    module removal, from Liping Zhang.

    X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension
    if user requests this, also from Liping.

    X) Add net_ns_barrier() and use it from FTP helper, so make sure
    no concurrent namespace removal happens at the same time while
    the helper module is being removed.

    X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce
    module size. Same thing in nf_tables.

    Updates for the nf_tables infrastructure:

    X) Prepare usage of the extended ACK reporting infrastructure for
    nf_tables.

    X) Remove unnecessary forward declaration in nf_tables hash set.

    X) Skip set size estimation if number of element is not specified.

    X) Changes to accomodate a (faster) unresizable hash set implementation,
    for anonymous sets and dynamic size fixed sets with no timeouts.

    X) Faster lookup function for unresizable hash table for 2 and 4
    bytes key.

    And, finally, a bunch of asorted small updates and cleanups:

    X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe
    to device events and look up for index from the packet path, this
    is fixing an issue that is present since the very beginning, patch
    from Xin Long.

    X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal.

    X) Use ebt_invalid_target() whenever possible in the ebtables tree,
    from Gao Feng.

    X) Calm down compilation warning in nf_dup infrastructure, patch from
    stephen hemminger.

    X) Statify functions in nftables rt expression, also from stephen.

    X) Update Makefile to use canonical method to specify nf_tables-objs.
    From Jike Song.

    X) Use nf_conntrack_helpers_register() in amanda and H323.

    X) Space cleanup for ctnetlink, from linzhang.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Jun, 2017

1 commit

  • Pass down struct netlink_ext_ack as parameter to all of our nfnetlink
    subsystem callbacks, so we can work on follow up patches to provide
    finer grain error reporting using the new infrastructure that
    2d4bc93368f5 ("netlink: extended ACK reporting") provides.

    No functional change, just pass down this new object to callbacks.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

16 Jun, 2017

1 commit

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions (skb_put, __skb_put and pskb_put) return void *
    and remove all the casts across the tree, adding a (u8 *) cast only
    where the unsigned char pointer was used directly, all done with the
    following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    which actually doesn't cover pskb_put since there are only three
    users overall.

    A handful of stragglers were converted manually, notably a macro in
    drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
    instances in net/bluetooth/hci_sock.c. In the former file, I also
    had to fix one whitespace problem spatch introduced.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

01 May, 2017

1 commit

  • nf_log_unregister() (which is what gets called in the logger backends
    module exit paths) does a (required, module is removed) synchronize_rcu().

    But nf_log_unset() is only called from pernet exit handlers. It doesn't
    free any memory so there appears to be no need to call synchronize_rcu.

    v2: Liping Zhang points out that nf_log_unregister() needs to be called
    after pernet unregister, else rmmod would become unsafe.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

08 Apr, 2017

1 commit


07 Apr, 2017

1 commit


17 Mar, 2017

1 commit

  • refcount_t type and corresponding API (see include/linux/refcount.h)
    should be used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: Pablo Neira Ayuso

    Reshetova, Elena
     

26 Dec, 2016

1 commit

  • ktime is a union because the initial implementation stored the time in
    scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
    variant for 32bit machines. The Y2038 cleanup removed the timespec variant
    and switched everything to scalar nanoseconds. The union remained, but
    become completely pointless.

    Get rid of the union and just keep ktime_t as simple typedef of type s64.

    The conversion was done with coccinelle and some manual mopping up.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra

    Thomas Gleixner
     

05 Dec, 2016

1 commit


18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into an zero based array and
    thus is unsigned entity. Using negative value is out-of-bound
    access by definition.

    2)
    On x86_64 unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added or subtracted to pointers
    are preffered to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
    g(p[i]);
    }

    roughly translates to

    movsx rsi, esi
    mov rdi, [rsi+...]
    call g

    MOVSX is 3 byte instruction which isn't necessary if the variable is
    unsigned because x86_64 is zero extending by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
    ...
    ptr = ng->ptr[id - 1];
    ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a semmingly random artefact of code generation with register
    allocator being used differently. gcc decides that some variable
    needs to live in new r8+ registers and every access now requires REX
    prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
    used which is longer than [r8]

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function old new delta
    nfsd4_lock 3886 3959 +73
    tipc_link_build_proto_msg 1096 1140 +44
    mac80211_hwsim_new_radio 2776 2808 +32
    tipc_mon_rcv 1032 1058 +26
    svcauth_gss_legacy_init 1413 1429 +16
    tipc_bcbase_select_primary 379 392 +13
    nfsd4_exchange_id 1247 1260 +13
    nfsd4_setclientid_confirm 782 793 +11
    ...
    put_client_renew_locked 494 480 -14
    ip_set_sockfn_get 730 716 -14
    geneve_sock_add 829 813 -16
    nfsd4_sequence_done 721 703 -18
    nlmclnt_lookup_host 708 686 -22
    nfsd4_lockt 1085 1063 -22
    nfs_get_client 1077 1050 -27
    tcf_bpf_init 1106 1076 -30
    nfsd4_encode_fattr 5997 5930 -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

26 Oct, 2016

1 commit


25 Sep, 2016

1 commit


17 Aug, 2016

1 commit

  • Otherwise, if nfnetlink_log.ko is not loaded, we cannot add rules
    to log packets to the userspace when we specify it with arp family,
    such as:

    # nft add rule arp filter input log group 0
    :1:1-37: Error: Could not process rule: No such file or
    directory
    add rule arp filter input log group 0
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

24 Jun, 2016

1 commit

  • li->u.ulog.copy_len is currently ignored by the kernel, we should truncate
    the packet to either li->u.ulog.copy_len (if set) or copy_range before
    sending it to userspace. 0 is a valid input for copy_len, so add a new
    flag to indicate whether this was option was specified by the user or not.

    Add two flags to indicate whether nflog-size/copy_len was set or not.
    XT_NFLOG_F_COPY_LEN is for XT_NFLOG and NFLOG_F_COPY_LEN for nfnetlink_log

    On the userspace side, this was initially represented by the option
    nflog-range, this will be replaced by --nflog-size now. --nflog-range would
    still exist but does not do anything.

    Reported-by: Joe Dollard
    Reviewed-by: Josh Hunt
    Signed-off-by: Vishwanath Pai
    Signed-off-by: Pablo Neira Ayuso

    Vishwanath Pai
     

19 Feb, 2016

1 commit


08 Jan, 2016

1 commit


29 Dec, 2015

1 commit


09 Dec, 2015

1 commit

  • Change return type of nfulnl_set_timeout() and nfulnl_set_qthresh() to
    be void.

    This patch changes the return type of the static methods
    nfulnl_set_timeout() and nfulnl_set_qthresh() to be void, as there is no
    justification and no need for these methods to return int.

    Signed-off-by: Rami Rosen
    Signed-off-by: Pablo Neira Ayuso

    Rosen, Rami
     

25 Nov, 2015

1 commit

  • Various files are owned by root with 0440 permission. Reading them is
    impossible in an unprivileged user namespace, interfering with firewall
    tools. For instance, iptables-save relies on /proc/net/ip_tables_names
    contents to dump only loaded tables.

    This patch assigned ownership of the following files to root in the
    current namespace:

    - /proc/net/*_tables_names
    - /proc/net/*_tables_matches
    - /proc/net/*_tables_targets
    - /proc/net/nf_conntrack
    - /proc/net/nf_conntrack_expect
    - /proc/net/netfilter/nfnetlink_log

    A mapping for root must be available, so this order should be followed:

    unshare(CLONE_NEWUSER);
    /* Setup the mapping */
    unshare(CLONE_NEWNET);

    Signed-off-by: Philip Whineray
    Signed-off-by: Pablo Neira Ayuso

    Philip Whineray
     

11 Nov, 2015

1 commit

  • After a recent (correct) change, gcc started warning about the use
    of the 'flags' variable in nfulnl_recv_config()

    net/netfilter/nfnetlink_log.c: In function 'nfulnl_recv_config':
    net/netfilter/nfnetlink_log.c:320:14: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
    net/netfilter/nfnetlink_log.c:828:6: note: 'flags' was declared here

    The warning first shows up in ARM s3c2410_defconfig with gcc-4.3 or
    higher (including 5.2.1, which is the latest version I checked) I
    tried working around it by rearranging the code but had no success
    with that.

    As a last resort, this initializes the variable to zero, which shuts
    up the warning, but means that we don't get a warning if the code
    is ever changed in a way that actually causes the variable to be
    used without first being written.

    Signed-off-by: Arnd Bergmann
    Fixes: 8cbc870829ec ("netfilter: nfnetlink_log: validate dependencies to avoid breaking atomicity")
    Signed-off-by: Pablo Neira Ayuso

    Arnd Bergmann
     

17 Oct, 2015

1 commit


15 Oct, 2015

2 commits

  • Check that dependencies are fulfilled before updating the logger
    instance, otherwise we can leave things in intermediate state on errors
    in nfulnl_recv_config().

    [ Ken-ichirou reports that this is also fixing missing instance refcnt drop
    on error introduced in his patch 914eebf2f434 ("netfilter: nfnetlink_log:
    autoload nf_conntrack_netlink module NFQA_CFG_F_CONNTRACK config flag"). ]

    Signed-off-by: Pablo Neira Ayuso
    Tested-by: Ken-ichirou MATSUZAWA

    Pablo Neira
     
  • This patch consolidates the check for valid logger instance once we have
    passed the command handling:

    The config message that we receive may contain the following info:

    1) Command only: We always get a valid instance pointer if we just
    created it. In case that the instance is being destroyed or the
    command is unknown, we jump to exit path of nfulnl_recv_config().
    This patch doesn't modify this handling.

    2) Config only: In this case, the instance must always exist since the
    user is asking for configuration updates. If the instance doesn't exist
    this returns -ENODEV.

    3) No command and no configs are specified: This case is rare. The
    user is sending us a config message with neither commands nor
    config options. In this case, we have to check if the instance exists
    and bail out otherwise. Before this patch, it was possible to send a
    config message with no command and no config updates for an
    unexisting instance without triggering an error. So this is the only
    case that changes.

    Signed-off-by: Pablo Neira Ayuso
    Tested-by: Ken-ichirou MATSUZAWA

    Pablo Neira Ayuso
     

13 Oct, 2015

1 commit


05 Oct, 2015

1 commit

  • This patch enables to include the conntrack information together
    with the packet that is sent to user-space via NFLOG, then a
    user-space program can acquire NATed information by this NFULA_CT
    attribute.

    Including the conntrack information is optional, you can set it
    via NFULNL_CFG_F_CONNTRACK flag with the NFULA_CFG_FLAGS attribute
    like NFQUEUE.

    Signed-off-by: Ken-ichirou MATSUZAWA
    Signed-off-by: Pablo Neira Ayuso

    Ken-ichirou MATSUZAWA