23 Dec, 2011

2 commits

  • Chris Boot reported crashes occurring in ipv6_select_ident().

    [ 461.457562] RIP: 0010:[] []
    ipv6_select_ident+0x31/0xa7

    [ 461.578229] Call Trace:
    [ 461.580742]
    [ 461.582870] [] ? udp6_ufo_fragment+0x124/0x1a2
    [ 461.589054] [] ? ipv6_gso_segment+0xc0/0x155
    [ 461.595140] [] ? skb_gso_segment+0x208/0x28b
    [ 461.601198] [] ? ipv6_confirm+0x146/0x15e
    [nf_conntrack_ipv6]
    [ 461.608786] [] ? nf_iterate+0x41/0x77
    [ 461.614227] [] ? dev_hard_start_xmit+0x357/0x543
    [ 461.620659] [] ? nf_hook_slow+0x73/0x111
    [ 461.626440] [] ? br_parse_ip_options+0x19a/0x19a
    [bridge]
    [ 461.633581] [] ? dev_queue_xmit+0x3af/0x459
    [ 461.639577] [] ? br_dev_queue_push_xmit+0x72/0x76
    [bridge]
    [ 461.646887] [] ? br_nf_post_routing+0x17d/0x18f
    [bridge]
    [ 461.653997] [] ? nf_iterate+0x41/0x77
    [ 461.659473] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.665485] [] ? nf_hook_slow+0x73/0x111
    [ 461.671234] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.677299] [] ?
    nf_bridge_update_protocol+0x20/0x20 [bridge]
    [ 461.684891] [] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
    [ 461.691520] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.697572] [] ? NF_HOOK.constprop.8+0x3c/0x56
    [bridge]
    [ 461.704616] [] ?
    nf_bridge_push_encap_header+0x1c/0x26 [bridge]
    [ 461.712329] [] ? br_nf_forward_finish+0x8a/0x95
    [bridge]
    [ 461.719490] [] ?
    nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
    [ 461.727223] [] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
    [ 461.734292] [] ? nf_iterate+0x41/0x77
    [ 461.739758] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.746203] [] ? nf_hook_slow+0x73/0x111
    [ 461.751950] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.758378] [] ? NF_HOOK.constprop.4+0x56/0x56
    [bridge]

    This is caused by bridge netfilter special dst_entry (fake_rtable), a
    special shared entry, where attaching an inetpeer makes no sense.

    Problem is present since commit 87c48fa3b46 (ipv6: make fragment
    identifications less predictable)

    Introduce the DST_NOPEER dst flag and make sure ipv6_select_ident() and
    __ip_select_ident() fall back to the 'no peer attached' handling.

    Reported-by: Chris Boot
    Tested-by: Chris Boot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
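
The fallback described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `struct dst`, `struct peer`, `select_ident()` and the flag value are hypothetical stand-ins, not the kernel's definitions. The idea shown is that a dst carrying the no-peer flag never touches per-peer state and uses a global counter instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel structures. */
#define DST_NOPEER 0x0040u          /* flag value chosen for the sketch */

struct peer {
    uint32_t next_id;               /* per-destination identification counter */
};

struct dst {
    unsigned int flags;
    struct peer *peer;
};

/* Shared counter used when no peer may be attached. */
static uint32_t fallback_counter;

/*
 * Pick a fragment identification. A dst flagged DST_NOPEER (such as a
 * shared fake entry) skips the peer path entirely and falls back to the
 * global counter, as does a missing dst or a dst without a peer.
 */
static uint32_t select_ident(struct dst *dst)
{
    if (dst && !(dst->flags & DST_NOPEER) && dst->peer)
        return dst->peer->next_id++;

    return ++fallback_counter;
}
```

With a flagged dst the peer counter is left untouched, which is the property the fix relies on for the shared bridge entry.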
     
  • Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol
    depended handlers) forgot the bridge netfilter case, adding a NULL
    dereference in ip_fragment().

    Reported-by: Chris Boot
    CC: Steffen Klassert
    Signed-off-by: Eric Dumazet
    Acked-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Eric Dumazet
     

18 Jul, 2011

3 commits


14 Jul, 2011

1 commit

  • Now that there is a one-to-one correspondence between neighbour
    and hh_cache entries, we no longer need:

    1) dynamic allocation
    2) attachment to dst->hh
    3) refcounting

    Initialization of the hh_cache entry is indicated by hh_len
    being non-zero, and such initialization is always done with
    the neighbour's lock held as a writer.

    Signed-off-by: David S. Miller

    David S. Miller
     

07 Jun, 2011

1 commit

  • As in commit 0972ddb237 (provide cow_metrics() methods to blackhole
    dst_ops), we must provide a cow_metrics method for the bridge's
    fake_dst_ops as well.

    This fixes a regression coming from commits 62fa8a846d7d (net: Implement
    read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
    initialize fake_rtable metrics)

    ip link set mybridge mtu 1234
    -->
    [ 136.546243] Pid: 8415, comm: ip Tainted: P
    2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
    /V1Sn
    [ 136.546256] EIP: 0060:[] EFLAGS: 00010202 CPU: 0
    [ 136.546268] EIP is at 0x0
    [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
    [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
    [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
    task.ti=f15c2000)
    [ 136.546297] Stack:
    [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
    ffffffa1 f15c3bbc
    [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
    ffffffa6 f15c3be4
    [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
    00000000 00000000
    [ 136.546343] Call Trace:
    [ 136.546359] [] ? br_change_mtu+0x5f/0x80 [bridge]
    [ 136.546372] [] dev_set_mtu+0x38/0x80
    [ 136.546381] [] do_setlink+0x1a7/0x860
    [ 136.546390] [] ? rtnl_fill_ifinfo+0x9bd/0xc70
    [ 136.546400] [] ? nla_parse+0x6e/0xb0
    [ 136.546409] [] rtnl_newlink+0x361/0x510
    [ 136.546420] [] ? vmalloc_sync_all+0x100/0x100
    [ 136.546429] [] ? error_code+0x5a/0x60
    [ 136.546438] [] ? rtnl_configure_link+0x80/0x80
    [ 136.546446] [] rtnetlink_rcv_msg+0xfa/0x210
    [ 136.546454] [] ? __rtnl_unlock+0x20/0x20
    [ 136.546463] [] netlink_rcv_skb+0x8e/0xb0
    [ 136.546471] [] rtnetlink_rcv+0x1c/0x30
    [ 136.546479] [] netlink_unicast+0x23a/0x280
    [ 136.546487] [] netlink_sendmsg+0x26b/0x2f0
    [ 136.546497] [] sock_sendmsg+0xc8/0x100
    [ 136.546508] [] ? __alloc_pages_nodemask+0xe1/0x750
    [ 136.546517] [] ? _copy_from_user+0x42/0x60
    [ 136.546525] [] ? verify_iovec+0x4c/0xc0
    [ 136.546534] [] sys_sendmsg+0x1c5/0x200
    [ 136.546542] [] ? __do_fault+0x310/0x410
    [ 136.546549] [] ? do_wp_page+0x1d6/0x6b0
    [ 136.546557] [] ? handle_pte_fault+0xe1/0x720
    [ 136.546565] [] ? sys_getsockname+0x7f/0x90
    [ 136.546574] [] ? handle_mm_fault+0xb1/0x180
    [ 136.546582] [] ? vmalloc_sync_all+0x100/0x100
    [ 136.546589] [] ? do_page_fault+0x173/0x3d0
    [ 136.546596] [] ? sys_recvmsg+0x3b/0x60
    [ 136.546605] [] sys_socketcall+0x293/0x2d0
    [ 136.546614] [] sysenter_do_call+0x12/0x26
    [ 136.546619] Code: Bad EIP value.
    [ 136.546627] EIP: [] 0x0 SS:ESP 0068:f15c3b48
    [ 136.546645] CR2: 0000000000000000
    [ 136.546652] ---[ end trace 6909b560e78934fa ]---

    Signed-off-by: Alexander Holler
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Alexander Holler
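
The trace above shows a jump to address 0x0, which is what calling a missing method pointer looks like. A minimal userspace sketch of the fix follows, assuming hypothetical simplified types (`struct dst`, `struct dst_ops`, `metric_set()`); the only point it makes is that the ops table supplies a cow_metrics hook that returns NULL, so a write to read-only metrics is skipped instead of calling through a NULL pointer.

```c
#include <stddef.h>
#include <stdint.h>

struct dst;

/* Hypothetical, simplified ops table; the real one has many more hooks. */
struct dst_ops {
    uint32_t *(*cow_metrics)(struct dst *dst);
};

struct dst {
    const struct dst_ops *ops;
    uint32_t *metrics;
    int read_only;                  /* metrics are shared and must not be written */
};

/* The fake entry never gets a private metrics copy, so report "no copy". */
static uint32_t *fake_cow_metrics(struct dst *dst)
{
    (void)dst;
    return NULL;
}

static const struct dst_ops fake_dst_ops = {
    .cow_metrics = fake_cow_metrics,
};

/*
 * Set one metric. For read-only metrics the ops hook is asked for a
 * writable copy; without a hook this call would jump to address 0.
 * Returns 1 if the value was written, 0 if the write was skipped.
 */
static int metric_set(struct dst *dst, int idx, uint32_t val)
{
    uint32_t *p = dst->metrics;

    if (dst->read_only)
        p = dst->ops->cow_metrics(dst);
    if (!p)
        return 0;

    p[idx] = val;
    return 1;
}
```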
     

25 May, 2011

1 commit


18 May, 2011

1 commit


14 May, 2011

1 commit

  • The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
    bridge: Reset IPCB when entering IP stack on NF_FORWARD
    broke forwarding of IPv6 packets in the bridge because it would
    call br_parse_ip_options with an IPv6 packet.

    Reported-by: Noah Meyerhans
    Signed-off-by: Stephen Hemminger
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

23 Apr, 2011

1 commit


13 Apr, 2011

1 commit

  • Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP
    stack) missed one IPCB init before calling ip_options_compile().

    Thanks to Scot Doyle for his tests and bug reports.

    Reported-by: Scot Doyle
    Signed-off-by: Eric Dumazet
    Cc: Hiroaki SHIMODA
    Acked-by: Bandan Das
    Acked-by: Stephen Hemminger
    Cc: Jan Lübbe
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Mar, 2011

1 commit

  • Whenever we enter the IP stack proper from bridge netfilter we
    need to ensure that the skb is in a form the IP stack expects
    it to be in.

    The entry point on NF_FORWARD did not meet the requirements of
    the IP stack, therefore leading to potential crashes/panics.

    This patch fixes the problem.

    Signed-off-by: Herbert Xu
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Herbert Xu
     

13 Mar, 2011

1 commit


03 Mar, 2011

1 commit


11 Dec, 2010

1 commit

  • The nf_pre_routing functions in bridging have collected two
    distinct ways of returning NF_DROP over the years, inline and
    via goto. There is no reason for preferring either one.

    So this patch arbitrarily picks the inline variant and converts
    all the gotos.

    Also removes a redundant comment.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

10 Dec, 2010

1 commit

  • Use helper functions to hide all direct accesses, especially writes,
    to dst_entry metrics values.

    This will allow us to:

    1) More easily change how the metrics are stored.

    2) Implement COW for metrics.

    In particular this will help us put metrics into the inetpeer
    cache if that is what we end up doing. We can make the _metrics
    member a pointer instead of an array, initially have it point
    at the read-only metrics in the FIB, and then on the first set
    grab an inetpeer entry and point the _metrics member there.

    Signed-off-by: David S. Miller
    Acked-by: Eric Dumazet

    David S. Miller
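
The direction this commit describes (accessors first, copy-on-write later) can be illustrated with a small userspace sketch. All names here (`struct dst`, `dst_init()`, `metric_get()`, `metric_set()`, `default_metrics`) are hypothetical, and the private copy is embedded in the struct purely to keep the example short; the commit itself only mentions an inetpeer entry as one possible home for the writable copy.

```c
#include <stdint.h>
#include <string.h>

#define METRIC_MAX 4

/* Shared, read-only defaults (stand-in for metrics kept in the FIB). */
static const uint32_t default_metrics[METRIC_MAX] = { 0, 1500, 0, 0 };

struct dst {
    const uint32_t *metrics;            /* shared defaults until the first write */
    uint32_t private_copy[METRIC_MAX];  /* writable storage, used after COW */
    int owns_copy;
};

static void dst_init(struct dst *d)
{
    d->metrics = default_metrics;
    d->owns_copy = 0;
}

/* Readers go through the accessor and never see how metrics are stored. */
static uint32_t metric_get(const struct dst *d, int idx)
{
    return d->metrics[idx];
}

/* The first write copies the shared values, then redirects the pointer. */
static void metric_set(struct dst *d, int idx, uint32_t val)
{
    if (!d->owns_copy) {
        memcpy(d->private_copy, d->metrics, sizeof(d->private_copy));
        d->metrics = d->private_copy;
        d->owns_copy = 1;
    }
    d->private_copy[idx] = val;
}
```

Because callers only use the two accessors, the storage could later change without touching them, which is the benefit the commit message lists first.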
     

18 Nov, 2010

1 commit


16 Nov, 2010

1 commit

  • The macro br_port_exists() is not enough protection when only
    RCU is being used. There is a tiny race where another CPU has cleared
    the port handler hook, but the bridge port flag might still be set.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

21 Oct, 2010

2 commits


12 Oct, 2010

1 commit

  • struct dst_ops tracks number of allocated dst in an atomic_t field,
    subject to high cache line contention in stress workload.

    Switch to a percpu_counter, to reduce the number of times we need to
    dirty a central location. Place it on a separate cache line to avoid
    dirtying read-only fields.

    Stress test :

    (Sending 160.000.000 UDP frames,
    IP route cache disabled, dual E5540 @2.53GHz,
    32bit kernel, FIB_TRIE, SLUB/NUMA)

    Before:

    real 0m51.179s
    user 0m15.329s
    sys 10m15.942s

    After:

    real 0m45.570s
    user 0m15.525s
    sys 9m56.669s

    With a small reordering of struct neighbour fields, subject of a
    following patch, (to separate refcnt from other read mostly fields)

    real 0m41.841s
    user 0m15.261s
    sys 8m45.949s

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
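
Why a per-CPU counter dirties the central location less often can be seen in a single-threaded sketch. This is not the kernel's percpu_counter implementation: `struct pcpu_counter`, `pcpu_add()`, `pcpu_sum()`, `NR_CPUS` and `BATCH` are hypothetical names and values, and the CPU index is passed explicitly instead of being taken from the running CPU.

```c
#define NR_CPUS 4
#define BATCH   32

struct pcpu_counter {
    long global;            /* central value, written only when a batch fills */
    long local[NR_CPUS];    /* per-CPU deltas, written on every update */
};

/* Add n on behalf of one CPU; fold into the global value once per batch. */
static void pcpu_add(struct pcpu_counter *c, int cpu, long n)
{
    long v = c->local[cpu] + n;

    if (v >= BATCH || v <= -BATCH) {
        c->global += v;     /* the only write to the shared location */
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Exact value: the global part plus every outstanding per-CPU delta. */
static long pcpu_sum(const struct pcpu_counter *c)
{
    long sum = c->global;

    for (int i = 0; i < NR_CPUS; i++)
        sum += c->local[i];
    return sum;
}
```

In this sketch 32 single increments from one CPU cause one write to the shared field rather than 32, at the cost of a slower exact read.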
     

20 Sep, 2010

1 commit

  • Related discussion here: http://lkml.org/lkml/2010/9/3/16

    Introduce a function br_parse_ip_options that will audit the
    skb and possibly refill IP options before a packet enters the
    IP stack. If no options are present, the function will zero out
    the skb cb area so that it is not misinterpreted as options by some
    unsuspecting IP layer routine. If packet consistency fails, drop it.

    Signed-off-by: Bandan Das
    Signed-off-by: David S. Miller

    Bandan Das
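
The kind of audit this commit describes can be sketched on a raw byte buffer. This is a simplified illustration, not the kernel function: `struct pkt` and `check_ip_header()` are hypothetical, the header checksum is not verified, and option parsing is left out. It shows the basic IPv4 consistency checks and the zeroing of the control block so stale bytes are not later read as options.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet wrapper; cb stands in for the skb control block. */
struct pkt {
    const uint8_t *data;
    size_t len;
    uint8_t cb[48];
};

/*
 * Audit an IPv4 header before handing the packet to IP-layer code.
 * Returns 0 if the packet looks consistent, -1 if it should be dropped.
 */
static int check_ip_header(struct pkt *p)
{
    if (p->len < 20)
        return -1;                      /* shorter than a minimal header */

    unsigned int version = p->data[0] >> 4;
    unsigned int ihl = p->data[0] & 0x0f;
    if (version != 4 || ihl < 5)
        return -1;

    size_t hdr_len = (size_t)ihl * 4;
    if (p->len < hdr_len)
        return -1;                      /* header runs past the buffer */

    size_t tot_len = ((size_t)p->data[2] << 8) | p->data[3];
    if (tot_len < hdr_len || p->len < tot_len)
        return -1;                      /* total length is inconsistent */

    /* Clear the control block so leftovers are not mistaken for options. */
    memset(p->cb, 0, sizeof(p->cb));

    /* With ihl > 5, real code would parse the options here (omitted). */
    return 0;
}
```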
     

02 Sep, 2010

1 commit


24 Aug, 2010

1 commit


08 Jul, 2010

2 commits


03 Jul, 2010

1 commit


02 Jul, 2010

1 commit

  • Support more fine-grained control of bridge netfilter iptables invocation
    by adding separate brnf_call_*tables parameters for each device using the
    sysfs interface. Packets are passed to layer 3 netfilter when either the
    global parameter or the per-bridge parameter is enabled.

    Acked-by: Stephen Hemminger
    Acked-by: David S. Miller
    Signed-off-by: Patrick McHardy

    Patrick McHardy
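
The rule in the last sentence of this commit is a simple OR of two switches. A minimal sketch, assuming hypothetical names (`brnf_call_iptables_global`, `struct bridge`, `should_call_iptables()`) rather than the kernel's actual variables:

```c
#include <stdbool.h>

/* Global switch (stand-in for the system-wide parameter). */
static bool brnf_call_iptables_global = true;

/* Per-bridge switch (stand-in for the new per-device sysfs parameter). */
struct bridge {
    bool nf_call_iptables;
};

/* Hand the packet to layer 3 netfilter if either switch is enabled. */
static bool should_call_iptables(const struct bridge *br)
{
    return brnf_call_iptables_global || br->nf_call_iptables;
}
```

One consequence of the OR: a per-bridge setting can only enable the hook for one bridge while the global switch is off; it cannot disable it while the global switch is on.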
     

16 Jun, 2010

2 commits


15 Jun, 2010

1 commit


11 Jun, 2010

1 commit


01 Jun, 2010

1 commit


13 May, 2010

1 commit

  • [ 4593.956206] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
    [ 4593.956219] IP: [] br_nf_forward_finish+0x154/0x170 [bridge]
    [ 4593.956232] PGD 195ece067 PUD 1ba005067 PMD 0
    [ 4593.956241] Oops: 0000 [#1] SMP
    [ 4593.956248] last sysfs file:
    /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:08/ATK0110:00/hwmon/hwmon0/temp2_label
    [ 4593.956253] CPU 3
    ...
    [ 4593.956380] Pid: 29512, comm: kvm Not tainted 2.6.34-rc7-net #195 P6T DELUXE/System Product Name
    [ 4593.956384] RIP: 0010:[] [] br_nf_forward_finish+0x154/0x170 [bridge]
    [ 4593.956395] RSP: 0018:ffff880001e63b78 EFLAGS: 00010246
    [ 4593.956399] RAX: 0000000000000608 RBX: ffff880057181700 RCX: ffff8801b813d000
    [ 4593.956402] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880057181700
    [ 4593.956406] RBP: ffff880001e63ba8 R08: ffff8801b9d97000 R09: ffffffffa0335650
    [ 4593.956410] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801b813d000
    [ 4593.956413] R13: ffffffff81ab3940 R14: ffff880057181700 R15: 0000000000000002
    [ 4593.956418] FS: 00007fc40d380710(0000) GS:ffff880001e60000(0000) knlGS:0000000000000000
    [ 4593.956422] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
    [ 4593.956426] CR2: 0000000000000018 CR3: 00000001ba1d7000 CR4: 00000000000026e0
    [ 4593.956429] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 4593.956433] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 4593.956437] Process kvm (pid: 29512, threadinfo ffff8801ba566000, task ffff8801b8003870)
    [ 4593.956441] Stack:
    [ 4593.956443] 0000000100000020 ffff880001e63ba0 ffff880001e63ba0 ffff880057181700
    [ 4593.956451] ffffffffa0335650 ffffffff81ab3940 ffff880001e63bd8 ffffffffa03350e6
    [ 4593.956462] ffff880001e63c40 000000000000024d ffff880057181700 0000000080000000
    [ 4593.956474] Call Trace:
    [ 4593.956478]
    [ 4593.956488] [] ? br_nf_forward_finish+0x0/0x170 [bridge]
    [ 4593.956496] [] NF_HOOK_THRESH+0x56/0x60 [bridge]
    [ 4593.956504] [] br_nf_forward_arp+0x112/0x120 [bridge]
    [ 4593.956511] [] nf_iterate+0x64/0xa0
    [ 4593.956519] [] ? br_forward_finish+0x0/0x60 [bridge]
    [ 4593.956524] [] nf_hook_slow+0x6c/0x100
    [ 4593.956531] [] ? br_forward_finish+0x0/0x60 [bridge]
    [ 4593.956538] [] ? __br_forward+0x0/0xc0 [bridge]
    [ 4593.956545] [] __br_forward+0x6d/0xc0 [bridge]
    [ 4593.956550] [] ? skb_clone+0x3e/0x70
    [ 4593.956557] [] deliver_clone+0x32/0x60 [bridge]
    [ 4593.956564] [] br_flood+0xa6/0xe0 [bridge]
    [ 4593.956571] [] ? __br_forward+0x0/0xc0 [bridge]

    Don't call nf_bridge_update_protocol() for ARP traffic as skb->nf_bridge isn't
    used in the ARP case.

    Reported-by: Stephen Hemminger
    Signed-off-by: Bart De Schuymer
    Signed-off-by: Patrick McHardy

    Bart De Schuymer
     

20 Apr, 2010

2 commits

  • The MTU for IP traffic encapsulated inside PPPoE traffic is smaller
    than the MTU of the Ethernet device (1500). Connection tracking
    gathers all IP packets and sometimes will refragment them in
    ip_fragment(). We then need to subtract the length of the
    encapsulating header from the mtu used in ip_fragment(). The check in
    br_nf_dev_queue_xmit() which determines if ip_fragment() has to be
    called is also updated for the PPPoE-encapsulated packets.
    nf_bridge_copy_header() is also updated to make sure the PPPoE data
    length field has the correct value.

    Signed-off-by: Bart De Schuymer
    Signed-off-by: Patrick McHardy

    Bart De Schuymer
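
The MTU adjustment described here is plain arithmetic. The sketch below uses hypothetical helpers (`frag_mtu()`, `needs_fragment()`) and assumes an 8-byte PPPoE session encapsulation (6-byte PPPoE header plus 2-byte PPP protocol field); it is not the code from the patch.

```c
/* Assumed encapsulation overhead: PPPoE header (6) + PPP protocol (2). */
#define PPPOE_SES_HLEN 8u

/* MTU available to the IP packet once the encapsulation is accounted for. */
static unsigned int frag_mtu(unsigned int dev_mtu, int is_pppoe)
{
    return is_pppoe ? dev_mtu - PPPOE_SES_HLEN : dev_mtu;
}

/* Fragment when the IP packet does not fit into the adjusted MTU. */
static int needs_fragment(unsigned int ip_len, unsigned int dev_mtu,
                          int is_pppoe)
{
    return ip_len > frag_mtu(dev_mtu, is_pppoe);
}
```

On a 1500-byte Ethernet device this leaves 1492 bytes for the IP packet, so a 1500-byte packet that fits unencapsulated must be fragmented when it travels inside PPPoE.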
     
  • Conflicts:
    Documentation/feature-removal-schedule.txt
    net/ipv6/netfilter/ip6t_REJECT.c
    net/netfilter/xt_limit.c

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     

15 Apr, 2010

2 commits

  • - fix IP DNAT on vlan- or pppoe-encapsulated traffic: The functions
    neigh_hh_output() or dst->neighbour->output() overwrite the complete
    Ethernet header, although we only need the destination MAC address.
    For encapsulated packets, they ended up overwriting the encapsulating
    header. The new code copies the Ethernet source MAC address and
    protocol number before calling dst->neighbour->output(). The Ethernet
    source MAC and protocol number are copied back in place in
    br_nf_pre_routing_finish_bridge_slow(). This also makes the IP DNAT
    more transparent because in the old scheme the source MAC of the
    bridge was copied into the source address in the Ethernet header. We
    also let skb->protocol equal ETH_P_IP resp. ETH_P_IPV6 during the
    execution of the PF_INET resp. PF_INET6 hooks.

    - Speed up IP DNAT by calling neigh_hh_bridge() instead of
    neigh_hh_output(): if dst->hh is available, we already know the MAC
    address so we can just copy it.

    Signed-off-by: Bart De Schuymer
    Signed-off-by: Patrick McHardy

    Bart De Schuymer
     
  • Remove br_netfilter.c::br_nf_local_out(). The function
    br_nf_local_out() was needed because the PF_BRIDGE::LOCAL_OUT hook
    could be called when IP DNAT happens on to-be-bridged traffic. The
    new scheme eliminates this mess.

    Signed-off-by: Bart De Schuymer
    Signed-off-by: Patrick McHardy

    Bart De Schuymer
     

13 Apr, 2010

1 commit