13 May, 2011

5 commits

  • When removing the last vlan from a device, garp_uninit_applicant() calls
    synchronize_rcu() to make sure no user can still manipulate struct
    garp_applicant before we free it.

    Use call_rcu() instead, as a step to further net_device dismantle
    optimizations.

    Add the temporary garp_cleanup_module() function to make sure no pending
    call_rcu() callbacks are left at module unload time [ this will be removed
    when kfree_rcu() is available ]

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Resending this patch with a few changes.

    Avoid multiple queues when MSI or MSI-X not available

    Limit the number of Tx queues to 1 if MSI/MSI-X support is not configured
    in the kernel. This makes the number of tx and rx queues equal when
    MSI/MSI-X is not configured, thus providing better performance.

    Signed-off-by: Bhavesh Davda
    Signed-off-by: Shreyas N Bhatewara
    Signed-off-by: David S. Miller

    Shreyas Bhatewara
     
  • It's already known non-null above.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • This variable only needs initialization when cmsgs.info
    is NULL.

    Use memset to ensure padding is also zeroed so
    kernel doesn't leak any data.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
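
The padding concern in the commit above can be illustrated outside the kernel. This is a minimal userspace sketch, not the patched code: `struct demo_info` and both helpers are hypothetical names. It assumes a typical ABI where a `char` followed by an `int` leaves padding bytes between the two fields, and it relies on the compiler leaving those bytes untouched by the later field stores, which is the same assumption the kernel makes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure; on most ABIs there are padding bytes
 * between 'kind' and 'value'. */
struct demo_info {
    char kind;
    int  value;
};

/* Zero the whole object, padding included, then set the fields.
 * Assigning the fields one by one without the memset would leave
 * the padding bytes holding whatever was in memory before. */
static void demo_info_init(struct demo_info *info, char kind, int value)
{
    assert(info != NULL);
    memset(info, 0, sizeof(*info));
    info->kind = kind;
    info->value = value;
}

/* Returns 1 if every byte between 'kind' and 'value' is zero. */
static int demo_info_padding_is_zero(const struct demo_info *info)
{
    const unsigned char *p = (const unsigned char *)info;
    size_t i;

    for (i = offsetof(struct demo_info, kind) + sizeof(info->kind);
         i < offsetof(struct demo_info, value); i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}
```

If such a structure is later copied out byte for byte, the memset is what keeps stale memory contents from travelling along in the padding.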
     
  • While trying to remove useless synchronize_rcu() calls, I found that l2tp
    is indeed incorrectly using two such calls, and also bumps the tunnel
    refcount after list insertion.

    The tunnel refcount must be incremented before the tunnel is made publicly
    visible to rcu readers.

    This fix can be applied to 2.6.35+ and might need a backport for older
    kernels, since things were shuffled in commit fd558d186df2c
    (l2tp: Split pppol2tp patch into separate l2tp and ppp parts)

    Signed-off-by: Eric Dumazet
    CC: Paul E. McKenney
    CC: James Chapman
    Reviewed-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric Dumazet
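
The ordering rule stated in the commit above (take the reference first, publish second) can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the l2tp code: `struct demo_tunnel`, `demo_tunnel_list` and `demo_tunnel_publish()` are hypothetical, and a single writer is assumed (the kernel would serialize writers with a lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object kept on a singly linked list. */
struct demo_tunnel {
    atomic_int refcount;
    struct demo_tunnel *next;
};

/* List head that concurrent readers would load with acquire semantics. */
static _Atomic(struct demo_tunnel *) demo_tunnel_list;

/* Publish a tunnel. Both references (the creator's and the list's) are
 * taken BEFORE the pointer becomes reachable, so a reader that finds the
 * tunnel can never observe a refcount that is still too low. */
static void demo_tunnel_publish(struct demo_tunnel *t)
{
    assert(t != NULL);
    atomic_init(&t->refcount, 1);        /* creator's reference */
    atomic_fetch_add(&t->refcount, 1);   /* reference held by the list */
    t->next = atomic_load_explicit(&demo_tunnel_list, memory_order_relaxed);
    /* Release store: the refcount writes above are visible to any reader
     * that sees the new list head. */
    atomic_store_explicit(&demo_tunnel_list, t, memory_order_release);
}
```

Doing the increment after the store would open a window in which a reader could find the tunnel, drop what it believes is a spare reference, and free an object that is still on the list.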
     

12 May, 2011

2 commits


11 May, 2011

33 commits

  • David S. Miller
     
  • TTY layer expects 0 if the ldisc->open operation succeeded.

    Reported-by: Matvejchikov Ilya
    Signed-off-by: Oliver Hartkopp
    Signed-off-by: David S. Miller

    Oliver Hartkopp
     
  • Like other mobile broadband device ethernet interfaces, mark the LG
    VL600 with the 'wwan' devtype so userspace knows it needs additional
    configuration via the AT port before the interface can be used.

    Signed-off-by: Dan Williams
    Signed-off-by: David S. Miller

    Dan Williams
     
  • Unlike the standard case, disabled anti replay detection needs some
    nontrivial extra treatment on ESN. RFC 4303 states:

    Note: If a receiver chooses to not enable anti-replay for an SA, then
    the receiver SHOULD NOT negotiate ESN in an SA management protocol.
    Use of ESN creates a need for the receiver to manage the anti-replay
    window (in order to determine the correct value for the high-order
    bits of the ESN, which are employed in the ICV computation), which is
    generally contrary to the notion of disabling anti-replay for an SA.

    So, for now, return an error if an ESN state with disabled anti replay
    detection is inserted, and add the extra treatment later if we need it.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • As it is, we assign the outer mode's output function to the dst entry
    when we create the xfrm bundle. This leads to two problems in interfamily
    scenarios. First, we might insert ipv4 packets into ip6_fragment when
    called from xfrm6_output. The system crashes if we try to fragment an ipv4
    packet with ip6_fragment. This issue was introduced with git commit
    ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
    as needed). The second issue is that we might insert ipv4 packets into
    netfilter6, and vice versa, in interfamily scenarios.

    With this patch we assign the inner mode's output function to the dst entry
    when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
    mode is used and the right fragmentation and netfilter functions are called.
    We then switch to the outer mode with the output_finish functions.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • Commit 443457242beb (factorize sync-rcu call in
    unregister_netdevice_many) mistakenly removed one test from dev_close()

    The following actions trigger a BUG:

    modprobe bonding
    modprobe dummy
    ifconfig bond0 up
    ifenslave bond0 dummy0
    rmmod dummy

    dev_close() must not close a non IFF_UP device.

    With help from Frank Blaschka and Einar EL Lueck

    Reported-by: Frank Blaschka
    Reported-by: Einar EL Lueck
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
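
The rule restored by the commit above, that dev_close() must skip a device that is not IFF_UP, amounts to a guard of the following shape. This is a minimal userspace sketch with hypothetical names (`struct demo_dev`, `DEMO_IFF_UP`, `demo_dev_close()`); it is not the kernel function.

```c
/* Hypothetical stand-in for the IFF_UP interface flag. */
#define DEMO_IFF_UP 0x1u

/* Hypothetical device: a flags word plus a counter of stop() calls. */
struct demo_dev {
    unsigned int flags;
    int stop_calls;
};

/* Close the device only if it is currently up. Closing a device that is
 * already down must be a no-op, otherwise the teardown path would run
 * twice for the same device. */
static int demo_dev_close(struct demo_dev *dev)
{
    if (dev->flags & DEMO_IFF_UP) {
        dev->stop_calls++;            /* stands in for the stop work */
        dev->flags &= ~DEMO_IFF_UP;
    }
    return 0;
}
```

With the guard in place, a second close of the same device leaves `stop_calls` at 1.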
     
  • ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
    ip link set eth2.103 up
    rmmod tg3 # driver providing eth2

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] garp_request_leave+0x3e/0xc0 [garp]
    PGD 11d251067 PUD 11b9e0067 PMD 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
    CPU 0
    Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]

    Pid: 11494, comm: rmmod Tainted: G W 2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
    RIP: 0010:[] [] garp_request_leave+0x3e/0xc0 [garp]
    RSP: 0018:ffff88007a19bae8 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
    RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
    RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
    R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
    R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
    FS: 0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
    CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
    Stack:
    ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
    ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
    ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
    Call Trace:
    [] vlan_gvrp_request_leave+0x46/0x50 [8021q]
    [] vlan_dev_stop+0xb7/0xc0 [8021q]
    [] __dev_close_many+0x87/0xe0
    [] dev_close_many+0x87/0x110
    [] rollback_registered_many+0xa0/0x240
    [] unregister_netdevice_many+0x19/0x60
    [] vlan_device_event+0x53b/0x550 [8021q]
    [] ? ip6mr_device_event+0xa8/0xd0
    [] notifier_call_chain+0x53/0x80
    [] __raw_notifier_call_chain+0x9/0x10
    [] raw_notifier_call_chain+0x11/0x20
    [] call_netdevice_notifiers+0x32/0x60
    [] rollback_registered_many+0x10f/0x240
    [] rollback_registered+0x2f/0x40
    [] unregister_netdevice_queue+0x58/0x90
    [] unregister_netdev+0x1b/0x30
    [] tg3_remove_one+0x6f/0x10b [tg3]

    We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
    not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
    is called right after unregister_netdevice_queue(). In batch mode,
    unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Commit e67f88dd12f6 (net: dont hold rtnl mutex during netlink dump
    callbacks) switched rtnl protection to RCU, but we forgot to adjust two
    rcu_dereference() lockdep annotations:

    inet_get_link_af_size() or inet_fill_link_af() might be called with
    rcu_read_lock or rtnl held, so use rcu_dereference_rtnl()
    instead of rtnl_dereference()

    Reported-by: Valdis Kletnieks
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Rearrange xfrm4_dst_lookup() so that it works by calling a helper
    function __xfrm_dst_lookup() that takes an explicit flow key storage
    area as an argument.

    Use this new helper in xfrm4_get_saddr() so we can fetch the selected
    source address from the flow instead of from rt->rt_src

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Use an explicit flow key and fetch it from there.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Flow key is available, so fetch it from there.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We already track and pass around the correct flow key,
    so simply use it in udp_send_skb().

    Signed-off-by: David S. Miller

    David S. Miller
     
  • On input packets, rt->rt_src always equals ip_hdr(skb)->saddr

    Anything that mangles or otherwise changes the IP header must
    relookup the route found at skb_rtable(). Therefore this
    invariant must always hold true.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This eliminates an access to rt->rt_src.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Revises the algorithm governing the sending of link request messages
    to take into account the number of nodes each bearer is currently in
    contact with, and to ensure more rapid rediscovery of neighboring nodes
    if a bearer fails and then recovers.

    The discovery object now sends requests at least once a second if it
    is not in contact with any other nodes, and at least once a minute if
    it has at least one neighbor; if contact with the only neighbor is
    lost, the object immediately reverts to its initial rapid-fire search
    timing to accelerate the rediscovery process.

    In addition, the discovery object now stops issuing link request
    messages if it is in contact with the only neighboring node it is
    configured to communicate with, since further searching is unnecessary.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
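
The timing policy described in the commit above can be summarized as a small decision function. This is a simplified sketch, not TIPC's code: the names are hypothetical, the two intervals are simply the upper bounds quoted in the message (one second and one minute), and a return value of 0 is an assumed convention meaning "stop sending requests".

```c
/* Upper bounds taken from the commit message; 0 means "stop searching". */
enum {
    DEMO_DISC_STOP    = 0,
    DEMO_DISC_FAST_MS = 1000,    /* no neighbors: at least once a second */
    DEMO_DISC_SLOW_MS = 60000    /* some neighbors: at least once a minute */
};

/*
 * num_nodes:          neighbors the bearer is currently in contact with
 * single_node_domain: nonzero if the bearer is configured to talk to
 *                     exactly one neighboring node
 */
static unsigned int demo_disc_interval_ms(unsigned int num_nodes,
                                          int single_node_domain)
{
    if (num_nodes == 0)
        return DEMO_DISC_FAST_MS;   /* rapid search, also after contact loss */
    if (single_node_domain)
        return DEMO_DISC_STOP;      /* the only possible peer is found */
    return DEMO_DISC_SLOW_MS;
}
```

Because the function depends only on the current neighbor count, losing the last neighbor switches the result back to the fast interval on the next evaluation.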
     
  • Augments TIPC's discovery object to track the number of neighboring nodes
    having an active link to the associated bearer.

    This means tipc_disc_update_link_req() becomes either one of:

    tipc_disc_add_dest()
    or:
    tipc_disc_remove_dest()

    depending on the code flow direction of things.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Augments TIPC's discovery object to send its initial neighbor discovery
    request message as soon as the associated bearer is created, rather than
    waiting for its first periodic timeout to occur, thereby speeding up the
    discovery process. Also adds a check to suppress the initial request or
    subsequent requests if the bearer is blocked at the time the request is
    scheduled for transmission.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Modifies bearer creation and deletion code to improve handling of
    scenarios when a neighbor discovery object cannot be created. The
    creation routine now aborts the creation of a bearer if its discovery
    object cannot be created, and deletes the newly created bearer, rather
    than failing quietly and leaving an unusable bearer hanging around.

    Since the exit via the goto label really isn't a definitive failure
    in all cases, relabel it appropriately.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Create a helper routine to enqueue a chain of sk_buffs to a link's
    transmit queue. It improves readability and the new function is
    anticipated to be used more than just once in the future as well.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Rework TIPC's message sending routines to take advantage of the total
    data size value passed to them by the kernel socket infrastructure.
    This change eliminates the need for TIPC to compute the size of outgoing
    messages itself, as well as the check for an oversize message in
    tipc_msg_build(). In addition, the following change warrants an explanation:

    - res = send_packet(NULL, sock, &my_msg, 0);
    + res = send_packet(NULL, sock, &my_msg, bytes_to_send);

    Previously, the final argument to send_packet() was ignored (since the
    amount of data being sent was recalculated by a lower-level routine)
    and we could just pass in a dummy value (0). Now that the
    recalculation is being eliminated, the argument value being passed to
    send_packet() is significant and we have to supply the actual amount
    of data we want to send.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Adds checks to TIPC's socket send routines to promptly detect and
    abort attempts to send more than 66,000 bytes in a single TIPC
    message or more than 2**31-1 bytes in a single TIPC byte stream request.
    In addition, this ensures that the number of iovecs in a send request
    does not exceed the limits of a standard integer variable.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
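
The limits named in the commit above can be expressed as up-front checks. This is a sketch under assumptions: the function and macro names are hypothetical, and returning -EMSGSIZE is an assumed error convention, since the message does not say which error is reported.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Limits quoted in the commit message. */
#define DEMO_MAX_MSG_BYTES    66000u
#define DEMO_MAX_STREAM_BYTES 2147483647u   /* 2**31 - 1 */

/* Reject a single message that is too large, or a request whose iovec
 * count would not fit in an int. */
static int demo_check_msg(size_t total_len, size_t num_iovecs)
{
    if (total_len > DEMO_MAX_MSG_BYTES || num_iovecs > (size_t)INT_MAX)
        return -EMSGSIZE;
    return 0;
}

/* Reject a byte stream request that is too large, with the same
 * iovec-count check. */
static int demo_check_stream(size_t total_len, size_t num_iovecs)
{
    if (total_len > DEMO_MAX_STREAM_BYTES || num_iovecs > (size_t)INT_MAX)
        return -EMSGSIZE;
    return 0;
}
```

Checking before any buffer is built means an oversize request fails promptly, without allocating memory first.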
     
  • Enhances existing checks on the discovery domain associated with a TIPC
    bearer. A bearer can no longer be configured to accept links from itself
    only (which would be pointless), or to nodes outside its own cluster
    (since multi-cluster support has now been removed from TIPC). Also, the
    neighbor discovery routine now validates link setup requests against the
    configured discovery domain for the bearer, rather than simply ensuring
    the requesting node belongs to the node's own cluster.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • This allows them to be available for easy re-use in other places
    and avoids trivial mistakes caused by "count the f's and 0's".

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     
  • Modifies a TIPC send routine that did not discard the outgoing sk_buff
    if it was not transmitted because of link congestion; this eliminates
    the potential for buffer leakage in the many callers who did not clean up
    the unsent buffer. (The two routines that previously did discard the unsent
    buffer have been updated to eliminate their now-redundant clean up.)

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Sets the destination node field of an incoming multicast message
    to the receiving node's network address before handing off the message
    to each receiving port. This ensures that, in the event the destination
    port returns the message to the sender, the sender can identify which
    node the destination port belonged to.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Set the destination node and destination port fields of an outgoing
    multicast message header to zero; this is necessary to ensure that
    the receiving node can route the message properly if it was packed
    into a bundle due to link congestion. (Previously, there was a chance
    that the receiving node would send the unbundled message to a random
    node & port, rather than processing the message itself.)

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Ensures that all outgoing data messages have the "name lookup scope"
    field of their header set correctly; that is, named multicast messages
    now specify cluster-wide name lookup, while messages not using TIPC
    naming zero out the lookup field. (Previously, the lookup scope specified
    for these types of messages was inherited from the last message sent
    by the sending port.)

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Modifies the routine that fragments an existing message buffer to
    use similar logic to that used when generating fragments from an iovec.
    The routine now creates a complete chain of fragments and adds them to
    the link transmit queue as a unit, so that the link sends all fragments
    or none; this prevents the incomplete transmission of a fragmented
    message that might otherwise result because of link congestion or
    memory exhaustion. This change also ensures that the counter recording
    the number of fragmented messages sent by the link is now incremented
    only if the message is actually sent.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Eliminates code that restricts a link's counter of its fragmented
    messages to a 16-bit value, since the counter value is automatically
    restricted to this range when it is written into the message header.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
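
The reasoning in the commit above, that a counter needs no separate 16-bit clamp if the header write already masks it, can be sketched as follows. The field position (the upper 16 bits of a 32-bit header word) and the helper names are assumptions for illustration, not TIPC's actual header layout.

```c
#include <stdint.h>

/* Store 'counter' in the upper 16 bits of a header word. The mask makes
 * the write itself truncate the value, so the caller may let its counter
 * grow past 0xffff without any extra range check. */
static uint32_t demo_set_frag_id(uint32_t word, uint32_t counter)
{
    return (word & 0x0000ffffu) | ((counter & 0xffffu) << 16);
}

/* Read the 16-bit field back. */
static uint32_t demo_get_frag_id(uint32_t word)
{
    return word >> 16;
}
```

Writing a counter value of 0x12345 stores 0x2345 in the field and leaves the lower 16 bits of the word unchanged.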
     
  • Eliminates code that sets the link selector field in the header of
    fragmented messages, since this information is never referenced.
    (The unnecessary initialization was harmless as it was over-written
    by the fragmented message identifier value before the fragments were
    transmitted.)

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Eliminates optional code used to test TIPC's ability to recover
    from lost broadcast messages. This code duplicates functionality
    already provided by the network stack's QoS option "network emulator".

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Half of the #define entries in msg.h were down at the bottom
    of the header, instead of up at the top before any of the static
    inlines etc. Relocate them up to the top, to be consistent with
    the other normal linux header file layout conventions.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens
     
  • Gets rid of unused constants defining the types used in routing
    messages. These messages no longer exist in TIPC now that multicluster
    and multizone support has been eliminated.

    Signed-off-by: Allan Stephens
    Signed-off-by: Paul Gortmaker

    Allan Stephens