30 Aug, 2011

1 commit

  • A userspace listener may send (bogus) NF_STOLEN verdict, which causes skb leak.

    This problem was previously fixed via
    64507fdbc29c3a622180378210ecea8659b14e40 (netfilter:
    nf_queue: fix NF_STOLEN skb leak) but this had to be reverted because
    NF_STOLEN can also be returned by a netfilter hook when iterating the
    rules in nf_reinject.

    Reject userspace NF_STOLEN verdict, as suggested by Michal Miroslaw.

    This is complementary to commit fad54440438a7c231a6ae347738423cbabc936d9
    (netfilter: avoid double free in nf_reinject).

    Cc: Julian Anastasov
    Cc: Eric Dumazet
    Signed-off-by: Florian Westphal
    Signed-off-by: Patrick McHardy

    Florian Westphal
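
The check described above can be sketched in user-space C. This is a minimal model, not the kernel patch itself: the helper name `user_verdict_is_acceptable` is hypothetical, and the constants mirror the values in `<linux/netfilter.h>` so that the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Verdict values mirrored from <linux/netfilter.h> for a standalone sketch. */
#define NF_DROP         0
#define NF_ACCEPT       1
#define NF_STOLEN       2
#define NF_QUEUE        3
#define NF_REPEAT       4
#define NF_STOP         5
#define NF_MAX_VERDICT  NF_STOP
#define NF_VERDICT_MASK 0x000000ffu

/*
 * Hypothetical helper: decide whether a verdict word received from a
 * userspace listener may be handed on to reinjection.
 * NF_STOLEN is refused because only a kernel hook can legitimately take
 * ownership of the skb; accepting it from userspace would leak the skb.
 */
bool user_verdict_is_acceptable(uint32_t verdict_word)
{
    uint32_t verdict = verdict_word & NF_VERDICT_MASK;

    if (verdict > NF_MAX_VERDICT)
        return false;
    if (verdict == NF_STOLEN)
        return false;
    return true;
}
```

A netfilter hook that returns NF_STOLEN during reinjection is unaffected by such a check, since it only filters what arrives from the listener.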
     

27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

22 Jul, 2011

1 commit


19 Jul, 2011

2 commits

  • Introduces a new nfnetlink type that applies a given
    verdict to all queued packets with an id
    Signed-off-by: Patrick McHardy

    Florian Westphal
     
  • Packet identifier is currently set up in nfqnl_build_packet_message(),
    using one atomic_inc_return().

    Problem is that since several cpus might concurrently call
    nfqnl_enqueue_packet() for the same queue, we can deliver packets to
    consumer in a non-monotonic way (packet N+1 being delivered before packet
    N).

    This patch moves the packet id setup from nfqnl_build_packet_message()
    to nfqnl_enqueue_packet() to guarantee correct delivery order.

    This also removes one atomic operation.

    Signed-off-by: Eric Dumazet
    CC: Florian Westphal
    CC: Pablo Neira Ayuso
    CC: Eric Leblond
    Signed-off-by: Patrick McHardy

    Eric Dumazet
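
The ordering argument can be illustrated with a small model: if the id is assigned inside the same critical section that appends the packet to the queue, delivery order and id order cannot diverge, and a plain increment replaces the atomic one. The types and names below (`queue_model`, `enqueue_packet`) are hypothetical, and the `lock_held` flag only stands in for the per-queue lock.

```c
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 8

struct queued_packet {
    uint32_t id;
};

struct queue_model {
    uint32_t id_sequence;   /* last id handed out; protected by the lock */
    int lock_held;          /* stand-in for the per-queue lock */
    struct queued_packet slots[QUEUE_CAPACITY];
    size_t count;
};

/*
 * Assign the packet id and append the packet in one critical section.
 * Returns 0 on success, -1 if the queue is full.
 */
int enqueue_packet(struct queue_model *q)
{
    int ret = -1;

    q->lock_held = 1;                    /* lock taken here in real code */
    if (q->count < QUEUE_CAPACITY) {
        struct queued_packet *p = &q->slots[q->count++];

        p->id = ++q->id_sequence;        /* plain increment, serialised by the lock */
        ret = 0;
    }
    q->lock_held = 0;                    /* lock released here in real code */
    return ret;
}
```

In this model the slot index and the id always advance together, which is the property the commit message asks for.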
     

18 Jul, 2011

1 commit

  • nfnetlink_queue operations on SMP are not efficient if several queues are
    used, because of nfnl_mutex contention when applications give packet
    verdict.

    Use new call_rcu field in struct nfnl_callback to advertise a callback
    that is called under rcu_read_lock instead of nfnl_mutex.

    On my 2x4x2 machine, I was able to reach 2.000.000 pps going through
    user land returning NF_ACCEPT verdicts without losses, instead of less
    than 500.000 pps before patch.

    Signed-off-by: Eric Dumazet
    CC: Florian Westphal
    CC: Eric Leblond
    Signed-off-by: Patrick McHardy

    Eric Dumazet
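
A rough sketch of the dispatch idea, assuming a callback table with two optional handlers: when a `call_rcu` handler is present it runs without the subsystem mutex, otherwise the mutex-protected `call` handler is used. Everything here is a user-space model; `nfnl_callback_model`, `dispatch` and the counters are hypothetical names, and the comments mark where the real locking primitives would sit.

```c
#include <stddef.h>

struct msg_model {
    int type;
};

typedef int (*nfnl_handler)(const struct msg_model *msg);

struct nfnl_callback_model {
    nfnl_handler call;       /* would run with the subsystem mutex held */
    nfnl_handler call_rcu;   /* would run under rcu_read_lock() only */
};

struct dispatch_stats {
    unsigned int mutex_acquisitions;
    unsigned int rcu_sections;
};

/* Example handler for the sketch: simply echoes the message type. */
int echo_type_handler(const struct msg_model *msg)
{
    return msg->type;
}

/* Prefer the lock-free path when the callback advertises one. */
int dispatch(const struct nfnl_callback_model *cb,
             const struct msg_model *msg,
             struct dispatch_stats *stats)
{
    if (cb->call_rcu) {
        stats->rcu_sections++;        /* rcu_read_lock() ... rcu_read_unlock() */
        return cb->call_rcu(msg);
    }
    if (!cb->call)
        return -1;
    stats->mutex_acquisitions++;      /* mutex lock ... unlock */
    return cb->call(msg);
}
```

With several queues served by different CPUs, only the second path serialises callers, which is consistent with the contention described in the message.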
     

16 Jun, 2011

1 commit

  • By default, when broadcast or multicast packets are sent from a local
    application, they are sent to the interface and then looped by the kernel
    to other local applications, going through netfilter hooks in the
    process.

    These looped packets have their MAC header removed from the skb by the
    kernel looping code. This confuses netfilter's netlink queue,
    netlink log and the legacy ip_queue, because they try to extract a
    hardware address from these packets but extract part of the IP
    header instead.

    This patch prevents NFQUEUE, NFLOG and ip_queue from including a MAC header
    if there is none in the packet.

    Signed-off-by: Nicolas Cavallari
    Signed-off-by: Patrick McHardy

    Nicolas Cavallari
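
One way to picture the guard is a packet model that records header offsets: when the MAC header offset equals the network header offset, there is no link-layer header to read, so no hardware address is reported. This is a sketch under assumptions: `packet_model` and both helpers are hypothetical, and an Ethernet layout (6-byte destination followed by 6-byte source) is assumed for the copy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HW_ADDR_LEN 6

struct packet_model {
    const unsigned char *data;   /* start of the buffer */
    size_t mac_header;           /* offset of the link-layer header */
    size_t network_header;       /* offset of the IP header */
};

/* A looped packet has no room between the two offsets. */
bool packet_has_mac_header(const struct packet_model *pkt)
{
    return pkt->mac_header != pkt->network_header;
}

/*
 * Copy the source hardware address into out.
 * Returns the number of bytes copied, or 0 if the packet carries no
 * MAC header (in which case out is left untouched).
 */
size_t copy_hw_source_addr(const struct packet_model *pkt,
                           unsigned char out[HW_ADDR_LEN])
{
    if (!packet_has_mac_header(pkt))
        return 0;
    /* Assumed Ethernet layout: destination first, then source. */
    memcpy(out, pkt->data + pkt->mac_header + HW_ADDR_LEN, HW_ADDR_LEN);
    return HW_ADDR_LEN;
}
```

Without the guard, the copy for a looped packet would read bytes of the IP header, which is the symptom the message describes.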
     

18 Jan, 2011

1 commit

  • Instead of returning -1 on error, return an error number to allow the
    caller to handle some errors differently.

    ECANCELED is used to indicate that the hook is going away and should be
    ignored.

    A followup patch will introduce more 'ignore this hook' conditions
    (depending on queue settings) and will move kfree_skb responsibility
    to the caller.

    Signed-off-by: Florian Westphal
    Signed-off-by: Patrick McHardy

    Florian Westphal
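
The benefit of distinct error numbers can be sketched as follows: the caller can tell "this hook is going away, move on" apart from other failures. The function names and the `hook_outcome` enum are hypothetical, and the use of `-ESRCH` for a missing handler is an assumption made for the example; only `-ECANCELED` is named in the message.

```c
#include <errno.h>

enum hook_outcome {
    PACKET_QUEUED,     /* packet was handed to the queue */
    TRY_NEXT_HOOK,     /* ignore this hook and continue the traversal */
    PACKET_DROPPED     /* any other failure */
};

/* Hypothetical stand-in for the queueing function. */
int queue_packet_model(int handler_registered, int hook_going_away)
{
    if (hook_going_away)
        return -ECANCELED;
    if (!handler_registered)
        return -ESRCH;          /* assumption: "no handler" error code */
    return 0;
}

/* The caller maps each error to a different action. */
enum hook_outcome handle_queue_result(int err)
{
    if (err == 0)
        return PACKET_QUEUED;
    if (err == -ECANCELED)
        return TRY_NEXT_HOOK;
    return PACKET_DROPPED;
}
```

With a bare -1 return, the second and third cases could not be distinguished by the caller.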
     

16 Jun, 2010

2 commits


10 Jun, 2010

1 commit


20 Apr, 2010

1 commit


08 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

12 Feb, 2010

1 commit


18 Jan, 2010

1 commit


13 Jan, 2010

1 commit


07 Nov, 2009

1 commit


25 Aug, 2009

1 commit


10 Jun, 2009

1 commit


20 Jul, 2008

1 commit


10 Jun, 2008

1 commit

  • - No need to perform data_len = 0 in the switch command, since data_len
    is initialized to 0 in the beginning of the ipq_build_packet_message()
    method.

    - {ip,ip6}_queue: We can reach nlmsg_failure only from one place; skb is
    sure to be NULL when getting there; since skb is NULL, there is no need
    to check this fact and call kfree_skb().

    Signed-off-by: Rami Rosen
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Rami Rosen
     

29 Apr, 2008

1 commit

  • While reinjecting *bigger* modified versions of IPv6 packets using
    libnetfilter_queue, things work fine on a 2.6.24 kernel (2.6.22 too)
    but I get the following on recent kernels (2.6.25, trace below is
    against today's net-2.6 git tree):

    skb_over_panic: text:c04fddb0 len:696 put:632 head:f7592c00 data:f7592c00 tail:0xf7592eb8 end:0xf7592e80 dev:eth0
    ------------[ cut here ]------------
    invalid opcode: 0000 [#1] PREEMPT
    Process sendd (pid: 3657, ti=f6014000 task=f77c31d0 task.ti=f6014000)
    Stack: c071e638 c04fddb0 000002b8 00000278 f7592c00 f7592c00 f7592eb8 f7592e80
    f763c000 f6bc5200 f7592c40 f6015c34 c04cdbfc f6bc5200 00000278 f6015c60
    c04fddb0 00000020 f72a10c0 f751b420 00000001 0000000a 000002b8 c065582c
    Call Trace:
    [] ? nfqnl_recv_verdict+0x1c0/0x2e0
    [] ? skb_put+0x3c/0x40
    [] ? nfqnl_recv_verdict+0x1c0/0x2e0
    [] ? nfnetlink_rcv_msg+0xf5/0x160
    [] ? nfnetlink_rcv_msg+0x1e/0x160
    [] ? nfnetlink_rcv_msg+0x0/0x160
    [] ? netlink_rcv_skb+0x77/0xa0
    [] ? nfnetlink_rcv+0x1c/0x30
    [] ? netlink_unicast+0x243/0x2b0
    [] ? memcpy_fromiovec+0x4a/0x70
    [] ? netlink_sendmsg+0x1c6/0x270
    [] ? sock_sendmsg+0xc4/0xf0
    [] ? set_next_entity+0x1d/0x50
    [] ? autoremove_wake_function+0x0/0x40
    [] ? __wake_up_common+0x3e/0x70
    [] ? n_tty_receive_buf+0x34f/0x1280
    [] ? __wake_up+0x68/0x70
    [] ? copy_from_user+0x37/0x70
    [] ? verify_iovec+0x2c/0x90
    [] ? sys_sendmsg+0x10a/0x230
    [] ? __dequeue_entity+0x2a/0xa0
    [] ? set_next_entity+0x1d/0x50
    [] ? pty_write+0x47/0x60
    [] ? tty_default_put_char+0x1b/0x20
    [] ? __wake_up+0x49/0x70
    [] ? tty_ldisc_deref+0x39/0x90
    [] ? tty_write+0x1a0/0x1b0
    [] ? sys_socketcall+0x7f/0x260
    [] ? sysenter_past_esp+0x6a/0x91
    [] ? snd_intel8x0m_probe+0x270/0x6e0
    =======================
    Code: 00 00 89 5c 24 14 8b 98 9c 00 00 00 89 54 24 0c 89 5c 24 10 8b 40 50 89 4c 24 04 c7 04 24 38 e6 71 c0 89 44 24 08 e8 c4 46 c5 ff 0b eb fe 55 89 e5 56 89 d6 53 89 c3 83 ec 0c 8b 40 50 39 d0
    EIP: [] skb_over_panic+0x5c/0x60 SS:ESP 0068:f6015bf8

    Looking at the code, I ended up in the nfqnl_mangle() function (called by
    nfqnl_recv_verdict()) which performs a call to skb_copy_expand() due to
    the increased size of data passed to the function. AFAICT, it should ask
    for 'diff' instead of 'diff - skb_tailroom(e->skb)'. Because the
    resulting sk_buff does not have enough space to support the skb_put(skb, diff)
    call a few lines later, this results in the call to skb_over_panic().

    The patch below asks for allocation of a copy with enough space for
    mangled packet and the same amount of headroom as old sk_buff. While
    looking at how the regression appeared (e2b58a67), I noticed the same
    pattern in ipq_mangle_ipv6() and ipq_mangle_ipv4(). The patch corrects
    those locations too.

    Tested with bigger reinjected IPv6 packets (nfqnl_mangle() path), things
    are ok (2.6.25 and today's net-2.6 git tree).

    Signed-off-by: Arnaud Ebalard
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Arnaud Ebalard
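
The sizing error can be checked with simple arithmetic on a buffer model. The sketch assumes the copy gets exactly the headroom and tailroom that were requested (the real allocator may round up, so this is the guaranteed minimum), and all names (`buf_model`, `grow_old`, `grow_fixed`) are hypothetical. The numbers in the usage note are chosen to agree with the trace above (len 696 after a put of 632, so 64 bytes before it); the old tailroom of 56 is an assumption that reproduces the 640-byte buffer seen there.

```c
#include <stdbool.h>
#include <stddef.h>

struct buf_model {
    size_t headroom;   /* bytes before the data */
    size_t len;        /* bytes of data */
    size_t tailroom;   /* free bytes after the data */
};

/* Model of a copy that gets exactly the requested head and tail room. */
struct buf_model copy_expand_model(const struct buf_model *old,
                                   size_t new_headroom, size_t new_tailroom)
{
    struct buf_model copy = { new_headroom, old->len, new_tailroom };
    return copy;
}

/* Model of appending n bytes: false where the kernel would panic. */
bool put_model(struct buf_model *b, size_t n)
{
    if (n > b->tailroom)
        return false;
    b->len += n;
    b->tailroom -= n;
    return true;
}

/* Old sizing: requests only diff minus the old tailroom (diff > tailroom). */
struct buf_model grow_old(const struct buf_model *old, size_t diff)
{
    return copy_expand_model(old, 0, diff - old->tailroom);
}

/* Fixed sizing: keeps the old headroom and requests the full diff. */
struct buf_model grow_fixed(const struct buf_model *old, size_t diff)
{
    return copy_expand_model(old, old->headroom, diff);
}
```

For a 64-byte packet with 56 bytes of tailroom growing by 632 bytes, the old sizing yields a copy with 576 bytes of tailroom, so appending 632 bytes fails, while the fixed sizing leaves exactly enough room.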
     

28 Mar, 2008

2 commits


26 Mar, 2008

1 commit


11 Mar, 2008

2 commits


20 Feb, 2008

1 commit

  • As reported by Tomas Simonaitis,
    inserting new data in skbs queued over {ip,ip6,nfnetlink}_queue
    triggers a SKB_LINEAR_ASSERT in skb_put().

    Going back through the git history, it seems this bug is present since
    at least 2.6.12-rc2, probably even since the removal of
    skb_linearize() for netfilter.

    Linearize non-linear skbs through skb_copy_expand() when enlarging
    them. Tested by Tomas, fixes bugzilla #9933.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

01 Feb, 2008

1 commit

  • CHECK net/netfilter/nf_conntrack_expect.c
    net/netfilter/nf_conntrack_expect.c:429:13: warning: context imbalance in 'exp_seq_start' - wrong count at exit
    net/netfilter/nf_conntrack_expect.c:441:13: warning: context imbalance in 'exp_seq_stop' - unexpected unlock
    CHECK net/netfilter/nf_log.c
    net/netfilter/nf_log.c:105:13: warning: context imbalance in 'seq_start' - wrong count at exit
    net/netfilter/nf_log.c:125:13: warning: context imbalance in 'seq_stop' - unexpected unlock
    CHECK net/netfilter/nfnetlink_queue.c
    net/netfilter/nfnetlink_queue.c:363:7: warning: symbol 'size' shadows an earlier one
    net/netfilter/nfnetlink_queue.c:217:9: originally declared here
    net/netfilter/nfnetlink_queue.c:847:13: warning: context imbalance in 'seq_start' - wrong count at exit
    net/netfilter/nfnetlink_queue.c:859:13: warning: context imbalance in 'seq_stop' - unexpected unlock

    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Eric Dumazet
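
The message records only the sparse output, not the change that silenced it. One common way to address "context imbalance" warnings on a start/stop pair is to annotate the functions with `__acquires()` and `__releases()`, which tell sparse that the lock is intentionally held on exit from one and dropped in the other. The sketch below is a user-space model under that assumption: the macros are defined locally (empty outside sparse), and `table_model`, `seq_start_model` and `seq_stop_model` are hypothetical names with a counter standing in for the lock.

```c
/* Outside sparse the annotations expand to nothing. */
#ifndef __acquires
# ifdef __CHECKER__
#  define __acquires(x) __attribute__((context(x, 0, 1)))
#  define __releases(x) __attribute__((context(x, 1, 0)))
# else
#  define __acquires(x)
#  define __releases(x)
# endif
#endif

struct table_model {
    int lock_depth;   /* stand-in for a lock: 1 while held */
    int entries;
};

/* Takes the lock and returns with it held, as a seq start callback does. */
int *seq_start_model(struct table_model *t)
    __acquires(t)
{
    t->lock_depth++;
    return &t->entries;
}

/* Drops the lock taken by seq_start_model(). */
void seq_stop_model(struct table_model *t)
    __releases(t)
{
    t->lock_depth--;
}
```

The "symbol 'size' shadows an earlier one" warning is a separate matter and is usually resolved by renaming or removing the inner declaration.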
     

29 Jan, 2008

10 commits