18 Jun, 2006

1 commit


13 Jun, 2006

1 commit


12 Jun, 2006

2 commits

  • From: Aki M Nyrhinen

    IMHO the current fix to the problem (in_flight underflow in reno)
    is incorrect. it treats the symptoms but ignores the problem. the
    problem is timing out packets other than the head packet when we
    don't have sack. i try to explain (sorry if explaining the obvious).

    with sack, scanning the retransmit queue for timed out packets is
    fine because we know which packets in our retransmit queue have been
    acked by the receiver.

    without sack, we know only how many packets in our retransmit queue the
    receiver has acknowledged, but no idea which packets.

    think of a "typical" slow-start overshoot case, where for example
    every third packet in a window gets lost because a router buffer gets
    full.

    with sack, we check for timeouts on those every third packet (as the
    rest have been sacked). the packet counting works out and if there
    is no reordering, we'll retransmit exactly the packets that were
    lost.

    without sack, however, we check for timeout on every packet and end up
    retransmitting consecutive packets in the retransmit queue. in our
    slow-start example, 2/3 of those retransmissions are unnecessary. these
    unnecessary retransmissions eat the congestion window and eventually
    prevent fast recovery from continuing, if enough packets were lost.

    Signed-off-by: David S. Miller

    Aki M Nyrhinen
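
The slow-start example in the message above can be put into numbers with a toy model. This is only a sketch, not kernel code: `struct pkt`, `drop_every_third` and `count_retransmits` are hypothetical names, and the model assumes that every packet still in the retransmit queue has timed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one flag per packet in the retransmit queue. */
struct pkt {
    bool lost;   /* dropped by the network */
};

/* Mark every third packet, starting with the head (0, 3, 6, ...), as lost. */
static void drop_every_third(struct pkt *q, size_t n)
{
    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        q[i].lost = (i % 3 == 0);
}

/*
 * Count the retransmissions triggered when every queued packet is
 * checked for timeout.  With SACK the sender skips packets it knows
 * have arrived; without SACK it cannot tell them apart, so each
 * packet in the queue is resent.
 */
static size_t count_retransmits(const struct pkt *q, size_t n, bool have_sack)
{
    size_t sent = 0;

    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (have_sack && !q[i].lost)
            continue;   /* SACKed: known to have arrived */
        sent++;
    }
    return sent;
}
```

For a queue of 9 packets with 3 losses, the SACK case resends 3 packets and the non-SACK case resends all 9, so 6 of the 9 retransmissions (2/3) are unnecessary, matching the ratio quoted in the message.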
     
  • A soft lockup existed in the handling of ack vector records.
    Specifically, when a tail of the list of ack vector records was
    removed, it was possible to end up iterating infinitely on an element
    of the tail.

    Signed-off-by: Andrea Bittau
    Signed-off-by: Ian McDonald
    Signed-off-by: David S. Miller

    Andrea Bittau
     

06 Jun, 2006

4 commits

  • There are several bugs in error handling in br_add_bridge:
    - when dev_alloc_name fails, allocated net_device is not freed
    - unregister_netdev is called when rtnl lock is held
    - free_netdev is called before netdev_run_todo has a chance to be run after
    unregistering net_device

    Signed-off-by: Jiri Benc
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Jiri Benc
     
  • The skb allocation may fail, which can result in a NULL pointer dereference
    in irlap_queue_xmit().

    Coverity CID: 434.

    Signed-off-by: Florin Malita
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Florin Malita
     
  • The /proc/sys/net/ethernet directory has been sitting empty for more than
    10 years! Time to eliminate it!

    Signed-off-by: Jes Sorensen
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Jes Sorensen
     
  • Trimming the head of an skb by calling skb_pull can cause the packet
    to become unaligned if the length pulled is odd. Since the length is
    entirely arbitrary for a FIN packet carrying data, this is actually
    quite common.

    Unaligned data is not the end of the world, but we should avoid it if
    it's easily done. In this case it is trivial. Since we're discarding
    all of the head data it doesn't matter whether we move skb->data forward
    or back.

    However, it is still possible to have unaligned skb->data in general.
    So network drivers should be prepared to handle it instead of crashing.

    This patch also adds an unlikely marking on len < headlen since partial
    ACKs on head data are extremely rare in the wild. As the return value
    of __pskb_trim_head is no longer ever NULL that has been removed.

    Signed-off-by: Herbert Xu 许志壬
    Signed-off-by: David S. Miller

    Herbert Xu 许志壬
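
The "forward or back" remark in the message above can be illustrated with a toy buffer. This is a sketch under assumptions, not the actual patch: `struct toy_buf` and its helpers are hypothetical stand-ins for the linear head of an skb, and the "backward" variant is one reading of the message, where the head is emptied by moving the tail back to the data pointer instead of advancing the data pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the linear ("head") part of a packet buffer. */
struct toy_buf {
    size_t data;  /* offset of the first valid byte */
    size_t tail;  /* offset one past the last valid byte */
};

static size_t headlen(const struct toy_buf *b)
{
    return b->tail - b->data;
}

/* Empty the head by advancing data to tail, as pulling headlen bytes would. */
static void discard_head_forward(struct toy_buf *b)
{
    assert(b->data <= b->tail);
    b->data = b->tail;
}

/* Empty the head by moving tail back to data; data keeps its alignment. */
static void discard_head_backward(struct toy_buf *b)
{
    assert(b->data <= b->tail);
    b->tail = b->data;
}

static bool data_aligned(const struct toy_buf *b, size_t align)
{
    assert(align != 0);
    return b->data % align == 0;
}
```

Starting from an aligned data offset with an odd head length, both helpers leave an empty head, but only the backward variant leaves the data offset aligned.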
     

03 Jun, 2006

1 commit


29 May, 2006

3 commits


27 May, 2006

2 commits


24 May, 2006

6 commits

  • Bridge will OOPS on removal if another application has the SAP open.
    The bridge SAP might be shared with other users, so we need
    to do reference counting on module removal rather than an explicit
    close/delete.

    Since a packet might arrive after or during removal, we need to clear
    the receive function handle, so LLC only hands it to the user (if any).

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • If kmalloc fails, error path leaks data allocated from asn1_oid_decode().

    Signed-off-by: Chris Wright
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Chris Wright
     
  • When parsing unknown sequence extensions, the "son"-pointer points behind
    the last known extension for this type; don't try to interpret it.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The condition "> H323_ERROR_STOP" can never be true since H323_ERROR_STOP
    is positive and is the highest possible return code, while real errors are
    negative. Fix the checks. Also only abort on real errors in some spots
    that were just interpreting any return value != 0 as an error.

    Fixes crashes caused by use of stale data after a parsing error occurred:

    BUG: unable to handle kernel paging request at virtual address bfffffff
    printing eip:
    c01aa0f8
    *pde = 1a801067
    *pte = 00000000
    Oops: 0000 [#1]
    PREEMPT
    Modules linked in: ip_nat_h323 ip_conntrack_h323 nfsd exportfs sch_sfq sch_red cls_fw sch_hfsc xt_length ipt_owner xt_MARK iptable_mangle nfs lockd sunrpc pppoe pppoxx
    CPU: 0
    EIP: 0060:[] Not tainted VLI
    EFLAGS: 00210646 (2.6.17-rc4 #8)
    EIP is at memmove+0x19/0x22
    eax: d77264e9 ebx: d77264e9 ecx: e88d9b17 edx: d77264e9
    esi: bfffffff edi: bfffffff ebp: de6a7680 esp: c0349db8
    ds: 007b es: 007b ss: 0068
    Process asterisk (pid: 3765, threadinfo=c0349000 task=da068540)
    Stack: 00000006 c0349e5e d77264e3 e09a2b4e e09a38a0 d7726052 d7726124 00000491
    00000006 00000006 00000006 00000491 de6a7680 d772601e d7726032 c0349f74
    e09a2dc2 00000006 c0349e5e 00000006 00000000 d76dda28 00000491 c0349f74
    Call Trace:
    [] mangle_contents+0x62/0xfe [ip_nat]
    [] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat]
    [] set_addr+0x74/0x14c [ip_nat_h323]
    [] process_setup+0x11b/0x29e [ip_conntrack_h323]
    [] process_setup+0x14c/0x29e [ip_conntrack_h323]
    [] process_q931+0x3c/0x142 [ip_conntrack_h323]
    [] q931_help+0xe0/0x144 [ip_conntrack_h323]
    ...

    Found by the PROTOS c07-h2250v4 testsuite.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
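
The dead comparison described in the message above is easy to see in isolation. The sketch below assumes concrete values: the message only states that H323_ERROR_STOP is positive and the highest return code and that real errors are negative, so the numeric values, the other enumerator names and both helper functions are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed return codes.  Only the sign of the errors and the position
 * of H323_ERROR_STOP come from the commit message; the rest is made up.
 */
enum {
    H323_ERROR_RANGE = -2,  /* hypothetical real error */
    H323_ERROR_BOUND = -1,  /* hypothetical real error */
    H323_ERROR_NONE  = 0,   /* hypothetical success code */
    H323_ERROR_STOP  = 1,   /* positive and highest */
};

/* Old-style check: no return code is greater than the highest one. */
static bool is_error_old(int ret)
{
    return ret > H323_ERROR_STOP;
}

/* Corrected check: real errors are the negative codes. */
static bool is_error_fixed(int ret)
{
    return ret < 0;
}
```

With these values the old check is false for every code, so parsing errors go unnoticed and stale data is used, while the corrected check flags exactly the negative codes.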
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
    [NETFILTER]: SNMP NAT: fix memory corruption
    [IRDA]: fixup type of ->lsap_state
    [IRDA]: fix 16/32 bit confusion
    [NET]: Fix "ntohl(ntohs" bugs
    [BNX2]: Use kmalloc instead of array
    [BNX2]: Fix bug in bnx2_nvram_write()
    [TG3]: Add some missing rx error counters

    Linus Torvalds
     
  • Both cause the 'entries' count in the export cache to be non-zero at module
    removal time, so unregistering that cache fails and results in an oops.

    1/ exp_pseudoroot (used for NFSv4 only) leaks a reference to an export
    entry.
    2/ sunrpc_cache_update doesn't increment the entries count when it adds
    an entry.

    Thanks to "david m. richter" for triggering the
    problem and finding one of the bugs.

    Cc: "david m. richter"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

23 May, 2006

3 commits

  • Fix memory corruption caused by snmp_trap_decode:

    - When snmp_trap_decode fails before the id and address are allocated,
    the pointers contain random memory, but are freed by the caller
    (snmp_parse_mangle).

    - When snmp_trap_decode fails after allocating just the ID, it tries
    to free both address and ID, but the address pointer still contains
    random memory. The caller frees both ID and random memory again.

    - When snmp_trap_decode fails after allocating both, it frees both,
    and the caller frees both again.

    The corruption can be triggered remotely when the ip_nat_snmp_basic
    module is loaded and traffic on port 161 or 162 is NATed.

    Found by multiple testcases of the trap-app and trap-enc groups of the
    PROTOS c06-snmpv1 testsuite.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
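
All three failure cases listed in the message above come from unclear ownership of the two allocations. One common way to avoid them is sketched below in userspace C; it is not the actual patch. `struct toy_trap`, `toy_trap_decode`, `toy_parse` and the `fail_at` parameter (which simulates a decode failure at a chosen stage) are hypothetical. The decoder starts with both pointers set to NULL, cleans up after itself on failure, and the caller frees only after success.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_trap {
    unsigned long *id;
    unsigned char *addr;
};

/*
 * fail_at simulates where decoding fails:
 * 0 = before any allocation, 1 = after id, 2 = after both, 3 = never.
 */
static int toy_trap_decode(struct toy_trap *t, int fail_at)
{
    assert(t != NULL);
    t->id = NULL;      /* never leave random memory in the pointers */
    t->addr = NULL;

    if (fail_at == 0)
        goto err;
    t->id = malloc(4 * sizeof(*t->id));
    if (t->id == NULL || fail_at == 1)
        goto err;
    t->addr = malloc(4);
    if (t->addr == NULL || fail_at == 2)
        goto err;
    return 0;

err:
    /* free(NULL) is a no-op, so partial allocation needs no special case. */
    free(t->addr);
    free(t->id);
    t->addr = NULL;
    t->id = NULL;
    return -1;
}

static int toy_parse(int fail_at)
{
    struct toy_trap t;

    if (toy_trap_decode(&t, fail_at) != 0)
        return -1;     /* nothing to free: decode already cleaned up */
    free(t.addr);
    free(t.id);
    return 0;
}
```

Each allocation is freed exactly once on every path, whichever stage fails.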
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

20 May, 2006

4 commits

  • Enable SO_LINGER functionality for 1-N style sockets. The socket API
    draft will be clarified to allow for this functionality. The linger
    settings will apply to all associations on a given socket.

    Signed-off-by: Vladislav Yasevich
    Signed-off-by: Sridhar Samudrala

    Vladislav Yasevich
     
  • If SCTP receives a badly formatted HB-ACK chunk, it is possible
    that we may access invalid memory and potentially have a buffer
    overflow. We should really make sure that the chunk format is
    what we expect, before attempting to touch the data.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: Sridhar Samudrala

    Vladislav Yasevich
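
The kind of format check the message above asks for can be sketched as follows. This is an illustration under assumptions, not the check added by the patch: the 4-byte chunk header (type, flags, 16-bit big-endian length that includes the header) follows the SCTP chunk layout, while `TOY_HB_INFO_LEN` and `toy_hb_ack_valid` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CHUNK_HDR_LEN 4u   /* type (1) + flags (1) + length (2) */
#define TOY_HB_INFO_LEN   16u  /* hypothetical size of the expected payload */

/*
 * Return true only if the chunk's claimed length fits inside the bytes
 * actually received and is large enough to hold the payload we expect.
 */
static bool toy_hb_ack_valid(const uint8_t *buf, size_t avail)
{
    size_t claimed;

    assert(buf != NULL || avail == 0);
    if (avail < TOY_CHUNK_HDR_LEN)
        return false;                       /* header itself is truncated */
    claimed = ((size_t)buf[2] << 8) | buf[3];
    if (claimed > avail)
        return false;                       /* claims more than we received */
    return claimed >= TOY_CHUNK_HDR_LEN + TOY_HB_INFO_LEN;
}
```

Only after a check of this kind succeeds would it be safe to read the payload fields.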
     
  • sctp_rcv().

    The goal is to hold the ref on the association/endpoint throughout the
    state-machine process. We accomplish this as follows:

    /* ref on the assoc/ep is taken during lookup */

    if owned_by_user(sk)
        sctp_add_backlog(skb, sk);
    else
        inqueue_push(skb, sk);

    /* drop the ref on the assoc/ep */

    However, in sctp_add_backlog() we take the ref on assoc/ep and hold it
    while the skb is on the backlog queue. This allows us to get rid of the
    sock_hold/sock_put in the lookup routines.

    Now sctp_backlog_rcv() needs to account for potential association move.
    In the unlikely event that association moved, we need to retest if the
    new socket is locked by user. If we don't do this, we may have two packets
    racing up the stack toward the same socket and we can't deal with it.
    If the new socket is still locked, we'll just add the skb to its backlog
    continuing to hold the ref on the association. This gets rid of the
    need to move packets from one backlog to another and it is also safe in
    case new packets arrive on the same backlog queue.

    The last step is to lock the new socket when we are moving the
    association to it. This is needed in case any new packets arrive on
    the association when it moved. We want these to go to the backlog since
    we would like to avoid the race between this new packet and a packet
    that may be sitting on the backlog queue of the old socket toward the
    same association.

    Signed-off-by: Vladislav Yasevich
    Signed-off-by: Sridhar Samudrala

    Vladislav Yasevich
     
  • Also fix some other cases where sk_err is not set for 1-1 style sockets.

    Signed-off-by: Sridhar Samudrala

    Sridhar Samudrala
     

19 May, 2006

5 commits


17 May, 2006

6 commits


13 May, 2006

1 commit

  • The classical IP over ATM code maintains its own IPv4
    ARP table, using the standard neighbour-table code. The
    neigh_table_init function adds this neighbour table to a linked list
    of all neighbour tables which is used by the functions neigh_delete(),
    neigh_add() and neightbl_set(), all called by the netlink code.

    Once the ATM neighbour table is added to the list, there are two
    tables with family == AF_INET there, and ARP entries sent via netlink
    go into the first table with matching family. This is indeterminate
    and often wrong.

    To see the bug, on a kernel with CLIP enabled, create a standard IPv4
    ARP entry by pinging an unused address on a local subnet. Then attempt
    to complete that entry by doing

    ip neigh replace lladdr nud reachable

    Looking at the ARP tables by using

    ip neigh show

    will reveal two ARP entries for the same address. One of these can be
    found in /proc/net/arp, and the other in /proc/net/atm/arp.

    This patch adds a new function, neigh_table_init_no_netlink() which
    does everything that neigh_table_init() does, except add the table to
    the netlink all-arp-tables chain. In addition neigh_table_init() has a
    check that all tables on the chain have a distinct address family.
    The init call in clip.c is changed to call
    neigh_table_init_no_netlink().

    Since ATM ARP tables are rather more complicated than can currently be
    handled by the available rtattrs in the netlink protocol, no
    functionality is lost by this patch, and non-ATM ARP manipulation via
    netlink is rescued. A more complete solution would involve a rtattr
    for ATM ARP entries and some way for the netlink code to give
    neigh_add and friends more information than just address family with
    which to find the correct ARP table.

    [ I've changed the assertion checking in neigh_table_init() to not
    use BUG_ON() while holding neigh_tbl_lock. Instead we remember that
    we found an existing tbl with the same family, and after dropping
    the lock we'll give a diagnostic kernel log message and a stack dump.
    -DaveM ]

    Signed-off-by: Simon Kelley
    Signed-off-by: David S. Miller

    Simon Kelley
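
The first-match problem and the fix described in the message above can be modelled with a small linked list. This is a userspace sketch, not the kernel code: every `toy_` name is hypothetical. It also rejects a duplicate family with -1 purely to make the check visible, whereas the message says the kernel only logs a diagnostic and a stack dump.

```c
#include <assert.h>
#include <stddef.h>

struct toy_tbl {
    int family;
    const char *name;
    struct toy_tbl *next;
};

static struct toy_tbl *toy_tables;   /* chain searched by netlink requests */

/* Put the table on the netlink chain unless its family is already there. */
static int toy_table_init(struct toy_tbl *t)
{
    assert(t != NULL);
    for (struct toy_tbl *p = toy_tables; p != NULL; p = p->next)
        if (p->family == t->family)
            return -1;   /* a lookup by family would become ambiguous */
    t->next = toy_tables;
    toy_tables = t;
    return 0;
}

/* Initialise the table but keep it off the netlink chain. */
static void toy_table_init_no_netlink(struct toy_tbl *t)
{
    assert(t != NULL);
    t->next = NULL;
}

/* What a netlink request does: take the first table with a matching family. */
static struct toy_tbl *toy_find(int family)
{
    for (struct toy_tbl *p = toy_tables; p != NULL; p = p->next)
        if (p->family == family)
            return p;
    return NULL;
}
```

With the second AF_INET table kept off the chain, a lookup by family can only ever return the standard ARP table.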
     

12 May, 2006

1 commit