12 Dec, 2011

1 commit


10 Dec, 2011

1 commit


17 Nov, 2011

1 commit


02 Nov, 2011

1 commit

  • the tcp and udp code creates a set of struct file_operations at runtime
    while it can also be done at compile time, with the added benefit of then
    having these file operations be const.

    the trickiest part was to get the "THIS_MODULE" reference right; the naive
    method of declaring a struct in the place of registration would not work
    for this reason.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: David S. Miller

    Arjan van de Ven
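
As a rough illustration of the compile-time approach described above, here is a minimal user-space sketch. The type `demo_file_ops`, the `demo_module` identity and `demo_register()` are hypothetical stand-ins, not the kernel's `struct file_operations`, `THIS_MODULE` or registration API. It only shows that a table built with designated initializers can be `const` and that its owner field is fixed where the table is defined.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct file_operations. */
struct demo_file_ops {
    const void *owner;               /* plays the role of THIS_MODULE */
    int (*open)(const char *name);
    int (*release)(int handle);
};

static int demo_open(const char *name) { return name != NULL ? 3 : -1; }
static int demo_release(int handle) { return handle >= 0 ? 0 : -1; }

/* Hypothetical module identity, defined next to the table that uses it. */
static const char demo_module[] = "demo_proto";

/* Built at compile time, so the whole table can be const. */
static const struct demo_file_ops demo_seq_fops = {
    .owner   = demo_module,
    .open    = demo_open,
    .release = demo_release,
};

/* Hypothetical registration record: it only stores a pointer to the table. */
struct demo_registration {
    const char *name;
    const struct demo_file_ops *fops;
};

static struct demo_registration demo_register(const char *name,
                                              const struct demo_file_ops *fops)
{
    struct demo_registration reg = { .name = name, .fops = fops };
    return reg;
}
```

Because the owner is set where the table is defined rather than where it is registered, each protocol can supply its own identity, which is presumably the point of the "THIS_MODULE" remark.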
     

02 Mar, 2011

1 commit

  • This patch converts UDP to use the new ip_finish_skb API. This
    would then allow us to more easily use ip_make_skb, which allows
    UDP to run without a socket lock.

    Signed-off-by: Herbert Xu
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Herbert Xu
     

25 Jan, 2011

1 commit

  • Quoting Ben Hutchings: we presumably won't be defining features that
    can only be enabled on 64-bit architectures.

    Occurrences found by `grep -r` on net/, drivers/net, include/

    [ Move features and vlan_features next to each other in
    struct netdev, as per Eric Dumazet's suggestion -DaveM ]

    Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
     

11 Nov, 2010

1 commit

  • Robin Holt tried to boot a 16TB machine and found some limits were
    reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

    We can switch infrastructure to use "long" instead of "int", now that
    atomic_long_t primitives are available for free.

    Signed-off-by: Eric Dumazet
    Reported-by: Robin Holt
    Reviewed-by: Robin Holt
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Eric Dumazet
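
To make the limit concrete, the arithmetic can be sketched as below. This assumes 4 KiB pages and a 32-bit `int`, and uses `int64_t` to stand in for a 64-bit `long`. The helper names are hypothetical, and the message does not say exactly how the sysctl values are derived from memory size.

```c
#include <limits.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Number of pages needed to cover `bytes` of memory. */
static uint64_t demo_pages(uint64_t bytes)
{
    return bytes >> DEMO_PAGE_SHIFT;
}

/* 1 if a page count can be stored in an int without overflow. */
static int demo_fits_in_int(uint64_t pages)
{
    return pages <= (uint64_t)INT_MAX;
}

/* 1 if a page count can be stored in a 64-bit signed integer. */
static int demo_fits_in_int64(uint64_t pages)
{
    return pages <= (uint64_t)INT64_MAX;
}
```

With 16 TiB of memory there are 2^32 pages, so even a limit of half the pages (2^31) no longer fits in a 32-bit `int`.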
     

21 Oct, 2010

1 commit

  • Just like with IPv4, we need access to the UDP hash table to look up local
    sockets, but instead of exporting the global udp_table, export a lookup
    function.

    Signed-off-by: Balazs Scheidler
    Signed-off-by: KOVACS Krisztian
    Signed-off-by: Patrick McHardy

    Balazs Scheidler
     

09 Sep, 2010

1 commit

  • commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
    added a secondary hash on UDP, hashed on (local addr, local port).

    The problem is that the following sequence:

    fd = socket(...)
    connect(fd, &remote, ...)

    not only selects the remote endpoint (address and port), but also sets
    the local address, whereas the UDP stack inserted the socket into the
    secondary hash table while its local address was still INADDR_ANY (or
    the IPv6 equivalent).

    Sequence is :
    - autobind() : choose a random local port, insert socket in hash tables
    [while local address is INADDR_ANY]
    - connect() : set remote address and port, change local address to IP
    given by a route lookup.

    When an incoming UDP frame arrives, if more than 10 sockets are found in
    the primary hash table, we switch to the secondary table and fail to find
    the socket because its local address changed.

    One solution to this problem is to rehash datagram socket if needed.

    We add a new rehash(struct sock *) method in "struct proto", and
    implement this method for UDP v4 & v6, using a common helper.

    This rehashing only takes care of secondary hash table, since primary
    hash (based on local port only) is not changed.

    Reported-by: Krzysztof Piotr Oledzki
    Signed-off-by: Eric Dumazet
    Tested-by: Krzysztof Piotr Oledzki
    Signed-off-by: David S. Miller

    Eric Dumazet
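
The failure and the fix can be modelled with a small user-space sketch. It is a deliberate simplification under stated assumptions: `demo_hash2()` is a made-up hash, a socket just remembers which secondary slot it is filed under, and lookup tries the exact (address, port) key and then the wildcard key. None of these names are kernel functions.

```c
#include <stdint.h>

#define DEMO_SLOTS    8u
#define DEMO_ADDR_ANY 0u

struct demo_sock {
    uint32_t local_addr;
    uint16_t local_port;
    unsigned int slot2;   /* secondary slot the socket is filed under */
};

/* Hypothetical secondary hash on (local addr, local port). */
static unsigned int demo_hash2(uint32_t addr, uint16_t port)
{
    return (addr ^ port) % DEMO_SLOTS;
}

/* autobind: pick a port while the local address is still the wildcard. */
static void demo_autobind(struct demo_sock *sk, uint16_t port)
{
    sk->local_addr = DEMO_ADDR_ANY;
    sk->local_port = port;
    sk->slot2 = demo_hash2(sk->local_addr, sk->local_port);
}

/* connect: the route lookup sets the local address; optionally rehash. */
static void demo_connect(struct demo_sock *sk, uint32_t route_addr, int rehash)
{
    sk->local_addr = route_addr;
    if (rehash)
        sk->slot2 = demo_hash2(sk->local_addr, sk->local_port);
}

/* A socket is found under a key only if it sits in that key's slot and matches it. */
static int demo_match(const struct demo_sock *sk, uint32_t addr, uint16_t port)
{
    return sk->slot2 == demo_hash2(addr, port) &&
           sk->local_addr == addr && sk->local_port == port;
}

/* Secondary lookup: exact destination first, then the wildcard address. */
static int demo_lookup2(const struct demo_sock *sk, uint32_t daddr, uint16_t dport)
{
    return demo_match(sk, daddr, dport) || demo_match(sk, DEMO_ADDR_ANY, dport);
}
```

In this model a socket that connects without rehashing stays in the wildcard slot with a non-wildcard address, so neither lookup attempt matches it. Rehashing after the address change makes the exact lookup succeed.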
     

13 Jul, 2010

1 commit

  • remove useless blanks.

    Signed-off-by: Changli Gao
    ----
    include/net/inet_common.h | 55 ++++-------
    include/net/tcp.h | 222 +++++++++++++++++-----------------------------
    include/net/udp.h | 38 +++----
    3 files changed, 123 insertions(+), 192 deletions(-)
    Signed-off-by: David S. Miller

    Changli Gao
     

11 Nov, 2009

1 commit


09 Nov, 2009

2 commits

  • Extends udp_table to contain a secondary hash table.

    The socket anchor for this second hash is free, because UDP
    doesn't use skc_bind_node: we define a union to hold
    both skc_bind_node & a new hlist_nulls_node udp_portaddr_node

    udp_lib_get_port() inserts sockets into second hash chain
    (additional cost of one atomic op)

    udp_lib_unhash() deletes socket from second hash chain
    (additional cost of one atomic op)

    Note: no spinlock lockdep annotation is needed, because the
    lock for the secondary hash chain is always taken after the
    lock for the primary hash chain.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Adds a counter in udp_hslot to keep an accurate count
    of sockets present in chain.

    This will permit the upcoming UDP lookup algorithm to choose
    the shortest chain when the secondary hash is added.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
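
A minimal sketch of the per-chain counter idea, with hypothetical names (`demo_hslot`, `demo_shortest()`): the count is updated on every insert and delete, so a lookup can compare two chains without walking them. The exact selection rule used by the kernel is not stated in these two messages, so picking the strictly shorter chain is an assumption.

```c
#include <stddef.h>

/* Hypothetical chain element. */
struct demo_node {
    struct demo_node *next;
};

/* A chain head plus an accurate count of the sockets on it. */
struct demo_hslot {
    struct demo_node *head;
    unsigned int count;
};

/* Insert at the head and keep the counter in sync. */
static void demo_hslot_add(struct demo_hslot *slot, struct demo_node *n)
{
    n->next = slot->head;
    slot->head = n;
    slot->count++;
}

/* Remove the head element, if any, and keep the counter in sync. */
static struct demo_node *demo_hslot_pop(struct demo_hslot *slot)
{
    struct demo_node *n = slot->head;
    if (n != NULL) {
        slot->head = n->next;
        slot->count--;
    }
    return n;
}

/* Prefer the secondary chain only when it is strictly shorter. */
static struct demo_hslot *demo_shortest(struct demo_hslot *primary,
                                        struct demo_hslot *secondary)
{
    return secondary->count < primary->count ? secondary : primary;
}
```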
     

08 Oct, 2009

1 commit

  • UDP_HTABLE_SIZE was initially defined as 128, which is a bit small for
    several setups.

    4000 active UDP sockets -> 32 sockets per chain on average. An
    incoming frame has to look up all sockets to find the best match, so
    long chains hurt latency.

    Instead of a fixed-size hash table that can't be perfect for every
    need, let the UDP stack choose its table size at boot time like tcp/ip
    route, using the alloc_large_system_hash() helper

    Add an optional boot parameter, uhash_entries=x so that an admin can
    force a size between 256 and 65536 if needed, like thash_entries and
    rhash_entries.

    dmesg logs two new lines :
    [ 0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes)
    [ 0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes)

    Maximal size on 64-bit arches would be 65536 slots, i.e. 1 MByte with
    non-debugging spinlocks.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
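
The figures quoted in this message can be checked with a few lines of arithmetic. The helper names are hypothetical. The per-slot sizes (8 bytes in the dmesg sample, 16 bytes for the 64-bit maximum) are inferred from the numbers above rather than stated in the message, and the clamping helper only mirrors the 256 to 65536 range given for uhash_entries.

```c
#include <stddef.h>

/* Average chain length, rounded up. */
static unsigned int demo_avg_chain(unsigned int sockets, unsigned int slots)
{
    return (sockets + slots - 1) / slots;
}

/* Memory used by the table for a given per-slot size. */
static size_t demo_table_bytes(unsigned int slots, size_t slot_bytes)
{
    return (size_t)slots * slot_bytes;
}

/* Clamp a requested entry count to the range quoted for uhash_entries. */
static unsigned int demo_clamp_entries(unsigned int requested)
{
    if (requested < 256u)
        return 256u;
    if (requested > 65536u)
        return 65536u;
    return requested;
}
```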
     

01 Oct, 2009

1 commit

  • This provides safety against negative optlen at the type
    level instead of depending upon (sometimes non-trivial)
    checks against this sprinkled all over the place, in
    each and every implementation.

    Based upon work done by Arjan van de Ven and feedback
    from Linus Torvalds.

    Signed-off-by: David S. Miller

    David S. Miller
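
The difference between the two styles can be sketched as follows, assuming the type-level change means declaring the length as `unsigned int`. Both handlers are hypothetical and only copy an `int` option value. The signed variant needs its own negative-length check, while the unsigned variant cannot receive a negative length at all.

```c
#include <errno.h>
#include <string.h>

/* Signed length: every handler must remember the optlen < 0 check. */
static int demo_setopt_signed(int *dst, const void *optval, int optlen)
{
    if (optlen < 0)
        return -EINVAL;
    if ((size_t)optlen < sizeof(*dst))
        return -EINVAL;
    memcpy(dst, optval, sizeof(*dst));
    return 0;
}

/* Unsigned length: a negative value cannot be expressed in the type. */
static int demo_setopt_unsigned(int *dst, const void *optval, unsigned int optlen)
{
    if (optlen < sizeof(*dst))
        return -EINVAL;
    memcpy(dst, optval, sizeof(*dst));
    return 0;
}
```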
     

13 Jul, 2009

1 commit


11 Apr, 2009

1 commit

  • Commit b2f5e7cd3dee2ed721bf0675e1a1ddebb849aee6
    (ipv6: Fix conflict resolutions during ipv6 binding)
    introduced a regression where time-wait sockets were
    not treated correctly. This resulted in the following:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000062
    IP: [] ipv4_rcv_saddr_equal+0x61/0x70
    ...
    Call Trace:
    [] ipv6_rcv_saddr_equal+0x1bb/0x250 [ipv6]
    [] inet6_csk_bind_conflict+0x88/0xd0 [ipv6]
    [] inet_csk_get_port+0x1ee/0x400
    [] inet6_bind+0x1cf/0x3a0 [ipv6]
    [] ? sockfd_lookup_light+0x3c/0xd0
    [] sys_bind+0x89/0x100
    [] ? trace_hardirqs_on_thunk+0x3a/0x3c
    [] system_call_fastpath+0x16/0x1b

    Tested-by: Brian Haley
    Tested-by: Ed Tomlinson
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

25 Mar, 2009

1 commit


17 Nov, 2008

1 commit

  • This is a straightforward patch, using hlist_nulls infrastructure.

    RCUification already done on UDP two weeks ago.

    Using hlist_nulls permits us to avoid some memory barriers, both
    at lookup time and delete time.

    Patch is large because it adds new macros to include/net/sock.h.
    These macros will be used by TCP & DCCP in next patch.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
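
The message does not explain how a "nulls" list differs from a plain one. As a rough single-threaded sketch of the general idea: the end of each chain is marked by an odd value that encodes the chain's identity instead of NULL, so a reader can tell whether it finished on the chain it started from. All names here are hypothetical, and the pointer tagging relies on nodes being at least 2-byte aligned.

```c
#include <stdint.h>

struct demo_nulls_node {
    struct demo_nulls_node *next;
    int key;
};

/* Encode a chain identity as an odd "pointer" that terminates the list. */
static struct demo_nulls_node *demo_make_nulls(uintptr_t value)
{
    return (struct demo_nulls_node *)((value << 1) | 1u);
}

/* Real nodes are aligned, so an odd value can only be an end marker. */
static int demo_is_nulls(const struct demo_nulls_node *p)
{
    return (int)((uintptr_t)p & 1u);
}

static uintptr_t demo_nulls_value(const struct demo_nulls_node *p)
{
    return (uintptr_t)p >> 1;
}

/*
 * Walk a chain that is expected to end with the marker for `slot`.
 * Returns 1 if `key` was found, 0 if the walk ended on the expected
 * chain without a match, and -1 if it ended on a different chain, in
 * which case the caller should restart the lookup.
 */
static int demo_lookup(const struct demo_nulls_node *head, int key, uintptr_t slot)
{
    const struct demo_nulls_node *p;

    for (p = head; !demo_is_nulls(p); p = p->next) {
        if (p->key == key)
            return 1;
    }
    return demo_nulls_value(p) == slot ? 0 : -1;
}
```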
     

29 Oct, 2008

1 commit

  • UDP sockets are hashed in a 128-slot hash table.

    This hash table is protected by *one* rwlock.

    This rwlock is readlocked each time an incoming UDP message is handled.

    This rwlock is writelocked each time a socket must be inserted in
    hash table (bind time), or deleted from this table (close time)

    This is not scalable on SMP machines :

    1) Even in read mode, lock() and unlock() are atomic operations and
    must dirty a contended cache line, shared by all cpus.

    2) A writer might be starved if many readers are 'in flight'. This can
    happen on a machine with some NIC receiving many UDP messages. User
    process can be delayed a long time at socket creation/dismantle time.

    This patch prepares the RCU migration by introducing 'struct udp_table'
    and 'struct udp_hslot', and using one spinlock per chain, to reduce
    contention on the central rwlock.

    Introducing one spinlock per chain reduces latencies for port
    randomization on heavily loaded UDP servers. This also speeds up
    bindings to specific ports.

    udp_lib_unhash() was uninlined, as it was becoming too big.

    Some cleanups were done to ease review of following patch
    (RCUification of UDP Unicast lookups)

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
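
The one-lock-per-chain layout can be sketched in user space as below. This is an analogy under stated assumptions: POSIX mutexes stand in for kernel spinlocks, the slot is chosen by `port % 128`, and the `demo_` types are hypothetical rather than the real `struct udp_table` and `struct udp_hslot`.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_UDP_SLOTS 128u   /* slot count taken from the message above */

/* Hypothetical stand-in for a hashed socket. */
struct demo_node {
    unsigned short port;
    struct demo_node *next;
};

/* One chain plus its own lock. */
struct demo_hslot {
    struct demo_node *head;
    pthread_mutex_t lock;
};

struct demo_table {
    struct demo_hslot hash[DEMO_UDP_SLOTS];
};

static void demo_table_init(struct demo_table *t)
{
    for (size_t i = 0; i < DEMO_UDP_SLOTS; i++) {
        t->hash[i].head = NULL;
        pthread_mutex_init(&t->hash[i].lock, NULL);
    }
}

static struct demo_hslot *demo_slot(struct demo_table *t, unsigned short port)
{
    return &t->hash[port % DEMO_UDP_SLOTS];
}

/* Bind time: only the target chain is locked. */
static void demo_hash_insert(struct demo_table *t, struct demo_node *n)
{
    struct demo_hslot *slot = demo_slot(t, n->port);

    pthread_mutex_lock(&slot->lock);
    n->next = slot->head;
    slot->head = n;
    pthread_mutex_unlock(&slot->lock);
}

/* Close time: unlink the node from its chain; returns 1 if it was there. */
static int demo_hash_remove(struct demo_table *t, struct demo_node *n)
{
    struct demo_hslot *slot = demo_slot(t, n->port);
    int removed = 0;

    pthread_mutex_lock(&slot->lock);
    for (struct demo_node **pp = &slot->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            removed = 1;
            break;
        }
    }
    pthread_mutex_unlock(&slot->lock);
    return removed;
}

/* Returns 1 if some node on the chain uses exactly this port. */
static int demo_port_in_use(struct demo_table *t, unsigned short port)
{
    struct demo_hslot *slot = demo_slot(t, port);
    int found = 0;

    pthread_mutex_lock(&slot->lock);
    for (struct demo_node *n = slot->head; n != NULL; n = n->next) {
        if (n->port == port) {
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&slot->lock);
    return found;
}
```

Operations on ports that hash to different slots take different locks, so a bind on one chain does not wait for activity on another.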
     

08 Oct, 2008

2 commits


01 Oct, 2008

1 commit


18 Jul, 2008

2 commits


06 Jul, 2008

4 commits


13 Jun, 2008

1 commit


05 Jun, 2008

1 commit

  • IPv6 UDP sockets with IPv4-mapped addresses actually use udp_sendmsg
    to send the data. In this case ip_flush_pending_frames should be called
    instead of ip6_flush_pending_frames.

    Signed-off-by: Denis V. Lunev
    Signed-off-by: YOSHIFUJI Hideaki

    Denis V. Lunev
     

01 Apr, 2008

1 commit


29 Mar, 2008

4 commits


23 Mar, 2008

1 commit

  • After this we have only udp_lib_get_port to get the port and two
    stubs for ipv4 and ipv6. No difference in udp and udplite except
    for initialized h.udp_hash member.

    I tried to find a graceful way to drop the only difference between
    udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison
    routine), but adding one more callback on the struct proto didn't
    seem graceful either :( Maybe later.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
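
The shape described here, one shared routine plus two thin stubs that differ only in the address comparison, might look roughly like this. Everything is a hypothetical user-space model: the bound sockets are a plain array, the IPv6 comparison ignores wildcards, and a non-zero return means the port is taken.

```c
#include <stddef.h>
#include <string.h>

struct demo_sk {
    unsigned short port;
    unsigned int v4_addr;          /* 0 means "any address" */
    unsigned char v6_addr[16];
};

typedef int (*demo_saddr_cmp)(const struct demo_sk *a, const struct demo_sk *b);

/* Two IPv4 sockets clash if either is a wildcard or the addresses match. */
static int demo_v4_rcv_saddr_equal(const struct demo_sk *a, const struct demo_sk *b)
{
    return a->v4_addr == 0 || b->v4_addr == 0 || a->v4_addr == b->v4_addr;
}

/* Simplified IPv6 rule: exact address match only. */
static int demo_v6_rcv_saddr_equal(const struct demo_sk *a, const struct demo_sk *b)
{
    return memcmp(a->v6_addr, b->v6_addr, sizeof(a->v6_addr)) == 0;
}

/* Shared routine: the comparison is the only family-specific part. */
static int demo_lib_get_port(struct demo_sk *const *bound, size_t nbound,
                             struct demo_sk *sk, unsigned short port,
                             demo_saddr_cmp saddr_equal)
{
    for (size_t i = 0; i < nbound; i++) {
        if (bound[i]->port == port && saddr_equal(bound[i], sk))
            return 1;   /* conflict */
    }
    sk->port = port;
    return 0;
}

static int demo_v4_get_port(struct demo_sk *const *bound, size_t nbound,
                            struct demo_sk *sk, unsigned short port)
{
    return demo_lib_get_port(bound, nbound, sk, port, demo_v4_rcv_saddr_equal);
}

static int demo_v6_get_port(struct demo_sk *const *bound, size_t nbound,
                            struct demo_sk *sk, unsigned short port)
{
    return demo_lib_get_port(bound, nbound, sk, port, demo_v6_rcv_saddr_equal);
}
```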
     

21 Mar, 2008

2 commits


29 Jan, 2008

1 commit

  • 1) Cleanups (all functions are prefixed by sock_prot_inuse)

    sock_prot_inc_use(prot) -> sock_prot_inuse_add(prot, 1)
    sock_prot_dec_use(prot) -> sock_prot_inuse_add(prot,-1)
    sock_prot_inuse() -> sock_prot_inuse_get()

    New functions :

    sock_prot_inuse_init() and sock_prot_inuse_free() to abstract pcounter use.

    2) if CONFIG_PROC_FS=n, we can zap 'inuse' member from "struct proto",
    since nobody wants to read the inuse value.

    This saves 1372 bytes on i386/SMP and some cpu cycles.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
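
A small model of the renamed interface, assuming a per-CPU counter layout behind it (the message only says the init and free helpers "abstract pcounter use"). The `demo_` names, the explicit `cpu` argument and the fixed CPU count are hypothetical. The sketch shows why a single add function with a signed delta can replace separate increment and decrement helpers, and why the reader has to sum all slots.

```c
#define DEMO_NR_CPUS 4   /* assumption for the sketch */

struct demo_proto {
    const char *name;
    int inuse[DEMO_NR_CPUS];   /* one slot per CPU, each may go negative */
};

static void demo_prot_inuse_init(struct demo_proto *prot)
{
    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        prot->inuse[cpu] = 0;
}

/* A single entry point: +1 on socket creation, -1 on destruction. */
static void demo_prot_inuse_add(struct demo_proto *prot, int cpu, int val)
{
    prot->inuse[cpu] += val;
}

/* The total is only meaningful as the sum over all CPUs. */
static int demo_prot_inuse_get(const struct demo_proto *prot)
{
    int total = 0;

    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        total += prot->inuse[cpu];
    return total;
}
```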