11 Apr, 2010

1 commit


09 Apr, 2010

1 commit

  • Commits 5051ebd275de672b807c28d93002c2fb0514a3c9 and
    5051ebd275de672b807c28d93002c2fb0514a3c9 ("ipv[46]: udp: optimize unicast RX
    path") broke some programs.

    After upgrading an L2TP server to 2.6.33 it started to fail, with tunnels going
    up and down, after the 10th tunnel came up. My modified rp-l2tp uses a global
    unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per
    tunnel after parameter negotiation.

    After ten sockets were open, and due to mixed-up parameters passed to
    udp[46]_lib_lookup2(), the kernel started to drop packets.

    Signed-off-by: Jorge Boncompte [DTI2]
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Jorge Boncompte [DTI2]
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

29 Mar, 2010

1 commit


27 Mar, 2010

2 commits

  • When cache is unresolved, c->mf[6]c_parent is set to 65535 and
    minvif, maxvif are not initialized, hence we must avoid
    parsing IIF and OIF.
    A second problem can happen when the user dumps a cache entry
    where a VIF, that was referenced at creation time, has been
    removed.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • When a dump is interrupted at the last device in a hash chain and
    then continued, "idx" won't get incremented past s_idx, so s_ip_idx
    is not reset when moving on to the next device. This means that, for all
    following devices, only the last n - s_ip_idx addresses are dumped.

    Tested-by: Pawel Staszewski
    Signed-off-by: Patrick McHardy

    Patrick McHardy
     

26 Mar, 2010

1 commit


25 Mar, 2010

1 commit

  • The order of the IPv6 raw table is currently reversed, which makes it
    impossible to use the NOTRACK target in IPv6: for example if someone enters

    ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK

    and if we receive fragmented packets then the first fragment will be
    untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
    subsequent fragments enter nf_ct_frag6_gather and reassembly will never
    successfully be finished.

    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Patrick McHardy

    Jozsef Kadlecsik
     

20 Mar, 2010

2 commits

  • mfc_parent of cache entries is used to index into the vif_table and is
    initialised from mfcctl->mfcc_parent. This can take values of up to 2^16-1,
    while the vif_table has only MAXVIFS (32) entries. The same problem
    affects ip6mr.

    Refuse invalid values to fix a potential out-of-bounds access. Unlike
    the other validity checks, this is checked in ipmr_mfc_add() instead of
    the setsockopt handler since it's unused in the delete path and might be
    uninitialized.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
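
The bounds check described in the commit above can be illustrated with a minimal user-space sketch. Only MAXVIFS (32) and the 2^16-1 upper range come from the commit message. The struct, the function names, and the -EINVAL return code are hypothetical and chosen for illustration; the actual kernel code may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAXVIFS 32  /* table size stated in the commit message */

/* Hypothetical stand-in for a vif_table entry. */
struct vif_sketch {
    int in_use;
};

static struct vif_sketch vif_table_sketch[MAXVIFS];

/*
 * Reject parent indexes that would fall outside the table.
 * A 16-bit parent field can hold values up to 65535, far past
 * the last valid slot (MAXVIFS - 1).
 * The error code is an assumption made for this sketch.
 */
static int check_mfc_parent(unsigned int parent)
{
    if (parent >= MAXVIFS)
        return -EINVAL;
    return 0;
}

/* Only index the table after the check has passed. */
static struct vif_sketch *lookup_parent_vif(unsigned int parent)
{
    if (check_mfc_parent(parent) != 0)
        return NULL;
    return &vif_table_sketch[parent];
}
```

A caller would run the check once when the entry is added, so that later lookups through the stored index stay within the table.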
     
  • As the only path leading to ip6_dst_check makes an indirect call
    through dst->ops, dst cannot be NULL in ip6_dst_check.

    This patch removes this check in case it misleads people who
    come across this code.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

14 Mar, 2010

1 commit

  • If we are managing IPv6 addresses using DHCP, it would be nice
    for user-space to be notified if an address configured through
    DHCP fails DAD. Otherwise user-space would have to poll to see
    whether DAD succeeds.

    This patch uses the existing notification mechanism and simply
    hooks it into the DAD failure code path.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

09 Mar, 2010

1 commit

  • Commit 6b03a53a (tcp: use limited socket backlog) added the possibility
    of dropping frames when backlog queue is full.

    Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the
    possibility of dropping frames when TTL is under a given limit.

    This patch adds new SNMP MIB entries, named TCPBacklogDrop and
    TCPMinTTLDrop, published in /proc/net/netstat on the TcpExt: line

    netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
    TCPBacklogDrop: 0
    TCPMinTTLDrop: 0

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
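
As a rough illustration of how such counters could be read without netstat, the sketch below looks up one counter by name, assuming the usual /proc/net/netstat layout of a "TcpExt:" line of names followed by a "TcpExt:" line of values. The function name and the sample strings in the usage note are hypothetical, not taken from the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find `key` in a names line and return the value at the same
 * position in the values line. Both lines are expected to start
 * with a prefix token such as "TcpExt:".
 * Returns 0 on success, -1 if the key is not found.
 */
static int lookup_counter(const char *names, const char *values,
                          const char *key, long long *out)
{
    char name[64];
    long long val;
    int n_used, v_used;

    /* Skip the prefix token on both lines. */
    if (sscanf(names, "%63s%n", name, &n_used) != 1)
        return -1;
    names += n_used;
    if (sscanf(values, "%63s%n", name, &v_used) != 1)
        return -1;
    values += v_used;

    /* Walk both lines in lockstep, one name and one value at a time. */
    while (sscanf(names, "%63s%n", name, &n_used) == 1 &&
           sscanf(values, "%lld%n", &val, &v_used) == 1) {
        if (strcmp(name, key) == 0) {
            *out = val;
            return 0;
        }
        names += n_used;
        values += v_used;
    }
    return -1;
}
```

For example, given the names line "TcpExt: SyncookiesSent TCPBacklogDrop TCPMinTTLDrop" and the values line "TcpExt: 0 7 3", looking up TCPBacklogDrop yields 7.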
     

08 Mar, 2010

1 commit

  • IPV6_PREFER_SRC_xxx definitions:
    | #define IPV6_PREFER_SRC_TMP 0x0001
    | #define IPV6_PREFER_SRC_PUBLIC 0x0002
    | #define IPV6_PREFER_SRC_COA 0x0004

    RT6_LOOKUP_F_xxx definitions:
    | #define RT6_LOOKUP_F_SRCPREF_TMP 0x00000008
    | #define RT6_LOOKUP_F_SRCPREF_PUBLIC 0x00000010
    | #define RT6_LOOKUP_F_SRCPREF_COA 0x00000020

    So, we can translate between these two groups by shift operation
    instead of multiple 'if's.

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
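
The translation described in the commit above can be sketched as follows. The six constants are copied from the commit message; each RT6_LOOKUP_F_SRCPREF_xxx value is the matching IPV6_PREFER_SRC_xxx value shifted left by 3 bits. The helper name, the mask macro, and the shift macro are hypothetical and are not taken from the patch itself.

```c
#include <assert.h>

/* Values as listed in the commit message. */
#define IPV6_PREFER_SRC_TMP          0x0001
#define IPV6_PREFER_SRC_PUBLIC       0x0002
#define IPV6_PREFER_SRC_COA          0x0004

#define RT6_LOOKUP_F_SRCPREF_TMP     0x00000008
#define RT6_LOOKUP_F_SRCPREF_PUBLIC  0x00000010
#define RT6_LOOKUP_F_SRCPREF_COA     0x00000020

/* Hypothetical names for this sketch. */
#define SRCPREF_MASK  (IPV6_PREFER_SRC_TMP | \
                       IPV6_PREFER_SRC_PUBLIC | \
                       IPV6_PREFER_SRC_COA)
#define SRCPREF_SHIFT 3

/*
 * Translate socket-level source preferences into route lookup
 * flags with one mask and one shift instead of three 'if's.
 * Bits outside the three known preferences are dropped.
 */
static unsigned int srcprefs_to_lookup_flags(unsigned int srcprefs)
{
    return (srcprefs & SRCPREF_MASK) << SRCPREF_SHIFT;
}
```

Because the three flags occupy adjacent bits in both groups, any combination of preferences translates in a single step.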
     

06 Mar, 2010

3 commits

  • sk_add_backlog -> __sk_add_backlog
    sk_add_backlog_limited -> sk_add_backlog

    Signed-off-by: Zhu Yi
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zhu Yi
     
  • Make udp adapt to the limited socket backlog change.

    Cc: "David S. Miller"
    Cc: Alexey Kuznetsov
    Cc: "Pekka Savola (ipv6)"
    Cc: Patrick McHardy
    Signed-off-by: Zhu Yi
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zhu Yi
     
  • Make tcp adapt to the limited socket backlog change.

    Cc: "David S. Miller"
    Cc: Alexey Kuznetsov
    Cc: "Pekka Savola (ipv6)"
    Cc: Patrick McHardy
    Signed-off-by: Zhu Yi
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zhu Yi
     

04 Mar, 2010

4 commits

  • This solves a potential race problem during the cleanup process.
    The issue is that addrconf_ifdown() needs to traverse address list,
    but then drop lock to call the notifier. The version in -next
    could get confused if add/delete happened during this window.
    Original code (2.6.32 and earlier) was okay because all addresses
    were always deleted.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • My recent change in net-next to retain permanent addresses caused a regression.
    Device refcount would not go to zero when device was unregistered because
    a leftover anycast reference would hold the ipv6 dev reference, which would
    hold device references...

    The correct procedure is to call notify chain when address is no longer
    available for use. When interface comes back DAD timer will notify
    back that address is available.

    Also, link local addresses should be purged when interface is brought
    down. The address might be changed.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • The Router Solicitation timer races with device state changes
    because it doesn't lock the device. Use local variable to avoid
    one repeated dereference.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • Timer code runs in bottom half, so there is no need for
    using _bh form of locking. Also check if device is not ready
    to avoid race with address that is no longer active.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

03 Mar, 2010

1 commit

  • When I merged the bundle creation code, I introduced a bogus
    flowi value in the bundle. Instead of getting from the caller,
    it was instead set to the flow in the route object, which is
    totally different.

    The end result is that the bundles we created never match, and
    we instead end up with an ever growing bundle list.

    Thanks to Jamal for finding this problem.

    Reported-by: Jamal Hadi Salim
    Signed-off-by: Herbert Xu
    Acked-by: Steffen Klassert
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Herbert Xu
     

27 Feb, 2010

2 commits


26 Feb, 2010

3 commits


25 Feb, 2010

6 commits


23 Feb, 2010

1 commit


20 Feb, 2010

2 commits

  • Yuck. It turns out that when we restart sysctls we were restarting
    with the values already changed. Which unfortunately meant that
    the second time through we thought there was no change and skipped
    all kinds of work, despite the fact that there was indeed a change.

    I have fixed this the simplest way possible by restoring the changed
    values when we restart the sysctl write.

    One of my coworkers spotted this bug when, after disabling forwarding
    on an interface, pings were still forwarded.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
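
The restart problem described in the commit above can be modelled in user space. This is a simplified sketch, not the kernel code: the struct, the function, the return codes, and the lock_available parameter are all hypothetical. It only shows why restoring the old value before a restart lets the retried write still detect a change.

```c
#include <assert.h>
#include <stdbool.h>

enum { WRITE_DONE = 0, WRITE_RESTART = 1 };

/* Hypothetical model of one sysctl-backed setting. */
struct setting_sketch {
    int value;
    int change_work_runs;  /* counts how often the "real work" ran */
};

/*
 * The new value is stored first, then a lock is needed to do the
 * follow-up work. If the lock is unavailable the write must be
 * restarted. Restoring the old value before returning means the
 * retry compares against the original value and still sees a change.
 */
static int write_setting(struct setting_sketch *s, int new_value,
                         bool lock_available)
{
    int old = s->value;

    s->value = new_value;
    if (!lock_available) {
        s->value = old;  /* restore before restarting */
        return WRITE_RESTART;
    }
    if (old != new_value)
        s->change_work_runs++;
    return WRITE_DONE;
}
```

Without the restore step, the second attempt would find old equal to new_value and skip the follow-up work, which matches the symptom reported in the commit.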
     
  • When an ICMPV6_PKT_TOOBIG message is received with an MTU below 1280,
    all further packets include a fragment header.

    Unlike regular defragmentation, conntrack also needs to "reassemble"
    those fragments in order to obtain a packet without the fragment
    header for connection tracking. Currently nf_conntrack_reasm checks
    whether a fragment has either IP6_MF set or an offset != 0, which
    makes it ignore those fragments.

    Remove the invalid check and make reassembly handle fragment queues
    containing only a single fragment.

    Reported-and-tested-by: Ulrich Weber
    Signed-off-by: Patrick McHardy

    Patrick McHardy
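
The check that the commit above removes can be illustrated with a small sketch. It assumes the standard IPv6 fragment header layout, where the 16-bit field holds the offset in the upper 13 bits and the more-fragments flag in the lowest bit. IP6_MF is named in the commit message; the IP6_OFFSET mask value and the function name are assumptions of this sketch, and the field is taken in host byte order for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define IP6_OFFSET 0xFFF8  /* upper 13 bits: fragment offset */
#define IP6_MF     0x0001  /* lowest bit: more fragments follow */

/*
 * Returns 1 when a fragment header has offset 0 and no
 * more-fragments flag, i.e. the whole packet sits in a single
 * fragment. A check of the form "IP6_MF set or offset != 0"
 * treats exactly these packets as non-fragments and skips them,
 * even though they still carry a fragment header.
 */
static int old_check_would_ignore(uint16_t frag_off_host_order)
{
    return (frag_off_host_order & (IP6_OFFSET | IP6_MF)) == 0;
}
```

A packet sent after a PKT_TOOBIG with an MTU below 1280 would have this field set to 0, so the old check skipped it and conntrack never saw the packet without its fragment header.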
     

19 Feb, 2010

3 commits


18 Feb, 2010

1 commit


17 Feb, 2010

1 commit