29 Dec, 2011

1 commit


06 Dec, 2011

1 commit


01 Dec, 2011

2 commits


19 Nov, 2011

1 commit

  • ipv4: Remove all uses of LL_ALLOCATED_SPACE

    The macro LL_ALLOCATED_SPACE was ill-conceived. It applies the
    alignment to the sum of needed_headroom and needed_tailroom. As
    the amount that is then reserved for head room is needed_headroom
    with alignment, this means that the tail room left may be too small.

    This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv4
    with the macro LL_RESERVED_SPACE and direct reference to
    needed_tailroom.

    This also fixes the problem with needed_headroom changing between
    allocating the skb and reserving the head room.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
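
The tail-room shortfall described in this commit can be reproduced with plain arithmetic. The following is a minimal user-space sketch, not the kernel macros themselves: the 16-byte alignment unit, the rounding rule and every `demo_*` name are assumptions made for illustration.

```c
#include <assert.h>

#define DEMO_ALIGN 16u  /* assumed alignment unit */

/* Assumed rounding rule: clear the low bits, then add one full unit. */
unsigned int demo_align(unsigned int len)
{
    assert((DEMO_ALIGN & (DEMO_ALIGN - 1u)) == 0u);  /* power of two */
    return (len & ~(DEMO_ALIGN - 1u)) + DEMO_ALIGN;
}

/* Old scheme: align the sum of head room and tail room once. */
unsigned int demo_allocated_space(unsigned int head, unsigned int tail)
{
    return demo_align(head + tail);
}

/* Head room actually reserved: the aligned head room alone. */
unsigned int demo_reserved_space(unsigned int head)
{
    return demo_align(head);
}

/* Tail room left under the old scheme; can be smaller than `tail`. */
unsigned int demo_old_tail_left(unsigned int head, unsigned int tail)
{
    return demo_allocated_space(head, tail) - demo_reserved_space(head);
}

/* New scheme: aligned head room plus the tail room added directly. */
unsigned int demo_new_total(unsigned int head, unsigned int tail)
{
    return demo_reserved_space(head) + tail;
}
```

With 14 bytes of head room and 17 bytes of tail room, the old scheme allocates 32 bytes and reserves 16 for the head, leaving 16 bytes where 17 are needed. The new scheme asks for 16 + 17 = 33 bytes.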
     

14 Nov, 2011

1 commit

  • On Wednesday 09 November 2011 at 16:21 -0500, David Miller wrote:
    > From: David Miller
    > Date: Wed, 09 Nov 2011 16:16:44 -0500 (EST)
    >
    > > From: Eric Dumazet
    > > Date: Wed, 09 Nov 2011 12:14:09 +0100
    > >
    > >> unres_qlen is the number of frames we are able to queue per unresolved
    > >> neighbour. Its default value (3) was never changed and is responsible
    > >> for strange drops, especially if IP fragments are used, or multiple
    > >> sessions start in parallel. Even a single tcp flow can hit this limit.
    > > ...
    > >
    > > Ok, I've applied this, let's see what happens :-)
    >
    > Early answer, build fails.
    >
    > Please test build this patch with DECNET enabled and resubmit. The
    > decnet neigh layer still refers to the removed ->queue_len member.
    >
    > Thanks.

    Ouch, this was fixed on one machine yesterday, but not the other one I
    used this morning, sorry.

    [PATCH V5 net-next] neigh: new unresolved queue limits

    unres_qlen is the number of frames we are able to queue per unresolved
    neighbour. Its default value (3) was never changed and is responsible
    for strange drops, especially if IP fragments are used, or multiple
    sessions start in parallel. Even a single tcp flow can hit this limit.

    $ arp -d 192.168.20.108 ; ping -c 2 -s 8000 192.168.20.108
    PING 192.168.20.108 (192.168.20.108) 8000(8028) bytes of data.
    8008 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.322 ms

    Signed-off-by: David S. Miller

    Eric Dumazet
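
The ping in this commit message shows why a three-frame queue is too small once fragmentation is involved. The sketch below counts the fragments for that ping, assuming a 1500-byte MTU and a 20-byte IPv4 header without options (neither is stated in the commit); the `demo_*` names are hypothetical.

```c
#include <assert.h>

/* Number of IPv4 fragments needed to carry l4_bytes over a link with
 * the given MTU. Assumes a 20-byte IPv4 header without options. */
unsigned int demo_fragment_count(unsigned int l4_bytes, unsigned int mtu)
{
    unsigned int per_frag;

    assert(mtu >= 28u);             /* room for header plus 8 data bytes */
    per_frag = (mtu - 20u) & ~7u;   /* fragment payloads are multiples of 8 */
    return (l4_bytes + per_frag - 1u) / per_frag;
}

/* Frames that cannot be held by a queue limited to qlen frames. */
unsigned int demo_overflow(unsigned int frames, unsigned int qlen)
{
    return frames > qlen ? frames - qlen : 0u;
}
```

For `ping -s 8000` the ICMP message is 8008 bytes, which under these assumptions becomes 6 fragments. With an unresolved neighbour and a limit of 3 frames, 3 of them cannot be queued, so the first echo request is lost, which matches the missing icmp_seq=1 reply above.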
     

18 Jul, 2011

1 commit


17 Jul, 2011

2 commits


13 Jul, 2011

1 commit

  • Get rid of all of the useless and costly indirection
    by doing the neigh hash table lookup directly inside
    of the neighbour binding.

    Rename from arp_bind_neighbour to rt_bind_neighbour.

    Use new helpers {__,}ipv4_neigh_lookup()

    In rt_bind_neighbour() get rid of useless tests which
    are never true in the context this function is called,
    namely dev is never NULL and the dst->neighbour is
    always NULL.

    Signed-off-by: David S. Miller

    David Miller
     

11 Jul, 2011

1 commit


30 Mar, 2011

1 commit

  • My commit 6d55cb91a0020ac0 (gre: fix hard header destination
    address checking) broke multicast.

    The reason is that ip_gre used to get ipgre_header() calls with
    zero destination if we have NOARP or multicast destination. Instead
    the actual target was decided at ipgre_tunnel_xmit() time based on
    per-protocol dissection.

    Instead of allowing the "abuse" of ->header() calls with invalid
    destination, this creates multicast mappings for ip_gre. This also
    fixes "ip neigh show nud noarp" to display the proper multicast
    mappings used by the gre device.

    Reported-by: Doug Kehn
    Signed-off-by: Timo Teräs
    Acked-by: Doug Kehn
    Signed-off-by: David S. Miller

    Timo Teräs
     

13 Mar, 2011

1 commit


03 Mar, 2011

1 commit


25 Jan, 2011

1 commit

  • Commit 941666c2e3e0 "net: RCU conversion of dev_getbyhwaddr() and
    arp_ioctl()" introduced a regression, reported by Jamie Heilman.
    "arp -Ds 192.168.2.41 eth0 pub" triggered the ASSERT_RTNL() assert
    in pneigh_lookup()

    Removing RTNL requirement from arp_ioctl() was a mistake, just revert
    that part.

    Reported-by: Jamie Heilman
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

11 Jan, 2011

1 commit

  • IPv4 over firewire needs to be able to remove ARP entries
    from the ARP cache that belong to nodes that are removed, because
    IPv4 over firewire uses ARP packets for private information
    about nodes.

    This information becomes invalid as soon as a node drops
    off the bus, and when it reconnects, it is only possible
    to start talking to it after it has responded to an ARP packet.
    But the ARP cache prevents such packets from being sent.

    Signed-off-by: Maxim Levitsky
    Signed-off-by: David S. Miller

    Maxim Levitsky
     

09 Dec, 2010

1 commit

  • On Sunday 05 December 2010 at 09:19 +0100, Eric Dumazet wrote:

    > Hmm..
    >
    > If somebody can explain why RTNL is held in arp_ioctl() (and therefore
    > in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
    > that your patch can be applied.
    >
    > Right now it is not good, because RTNL won't necessarily be held when you
    > are going to call arp_invalidate() ?

    While doing this analysis, I found a refcount bug in llc, I'll send a
    patch for net-2.6

    Meanwhile, here is the patch for net-next-2.6

    Your patch then can be applied after mine.

    Thanks

    [PATCH] net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()

    dev_getbyhwaddr() was called under RTNL.

    Rename it to dev_getbyhwaddr_rcu() and change all its callers to now use
    RCU locking instead of RTNL.

    Change arp_ioctl() to use RCU instead of RTNL locking.

    Note: this fixes a dev refcount bug in llc

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Dec, 2010

1 commit

  • Only when dont_send is 0, arp_filter() is consulted, so we can simply
    assign the return value of arp_filter() to dont_send instead.

    Signed-off-by: Changli Gao
    Signed-off-by: David S. Miller

    Changli Gao
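
The simplification in this commit rests on the identity `0 | x == x`. The sketch below assumes the earlier code OR-ed the filter result into `dont_send` inside a `!dont_send` guard; that shape, the stub filter and all `demo_*` names are assumptions for illustration, not the kernel source.

```c
#include <assert.h>

/* Stand-in for arp_filter(); it simply returns the verdict it is given. */
int demo_arp_filter(int verdict)
{
    return verdict;
}

/* Assumed earlier form: OR the verdict into dont_send. */
int demo_before(int dont_send, int verdict)
{
    if (!dont_send)
        dont_send |= demo_arp_filter(verdict);
    return dont_send;
}

/* Simplified form: inside the guard dont_send is 0, so assign directly. */
int demo_after(int dont_send, int verdict)
{
    if (!dont_send)
        dont_send = demo_arp_filter(verdict);
    return dont_send;
}

/* Check every combination of inputs for identical results. */
void demo_filter_selfcheck(void)
{
    for (int dont_send = 0; dont_send <= 1; dont_send++)
        for (int verdict = 0; verdict <= 1; verdict++)
            assert(demo_before(dont_send, verdict) ==
                   demo_after(dont_send, verdict));
}
```

Both forms agree for every input, so the change is behaviour-preserving under the stated assumption.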
     

18 Nov, 2010

1 commit


12 Oct, 2010

1 commit

  • Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
    dirtying neighbour in stress situations (many different flows / dsts)

    Dirtying takes place because of read_lock(&n->lock) and n->used writes.

    Switching to a seqlock, and writing n->used only on jiffies changes
    permits less dirtying.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
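
The idea behind this commit can be sketched in user space with C11 atomics. This is not the kernel's seqlock API: it assumes a single writer, uses sequentially consistent atomics throughout for simplicity, and every `demo_*` name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_ALEN 6

struct demo_neigh {
    atomic_uint seq;                      /* even: stable, odd: write in progress */
    _Atomic unsigned char ha[DEMO_ALEN];  /* stand-in for neigh->ha[] */
    unsigned long used;                   /* stand-in for n->used */
};

void demo_init(struct demo_neigh *n)
{
    atomic_init(&n->seq, 0u);
    for (int i = 0; i < DEMO_ALEN; i++)
        atomic_init(&n->ha[i], 0);
    n->used = 0;
}

/* Single writer assumed; a real seqlock also serializes writers. */
void demo_write_ha(struct demo_neigh *n, const unsigned char *addr)
{
    assert((atomic_load(&n->seq) & 1u) == 0u);
    atomic_fetch_add(&n->seq, 1u);        /* sequence becomes odd */
    for (int i = 0; i < DEMO_ALEN; i++)
        atomic_store(&n->ha[i], addr[i]);
    atomic_fetch_add(&n->seq, 1u);        /* sequence is even again */
}

/* Readers never write shared state; they retry if a write overlapped. */
void demo_read_ha(struct demo_neigh *n, unsigned char *out)
{
    unsigned int start;

    do {
        start = atomic_load(&n->seq);
        for (int i = 0; i < DEMO_ALEN; i++)
            out[i] = atomic_load(&n->ha[i]);
    } while ((start & 1u) || atomic_load(&n->seq) != start);
}

/* Store the timestamp only when it changed. Returns 1 if it wrote. */
int demo_touch(struct demo_neigh *n, unsigned long now)
{
    if (n->used == now)
        return 0;
    n->used = now;
    return 1;
}
```

Unlike `read_lock()`, the read path performs no store to the shared structure, and `demo_touch()` skips the store when the timestamp is unchanged, so the cache line stays clean in the common case.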
     

06 Oct, 2010

1 commit

  • David

    This is the first step for RCU conversion of neigh code.

    Next patches will convert hash_buckets[] and "struct neighbour" to RCU
    protected objects.

    Thanks

    [PATCH net-next] net neigh: RCU conversion of neigh hash table

    Instead of storing hash_buckets, hash_mask and hash_rnd in "struct
    neigh_table", a new structure is defined :

    struct neigh_hash_table {
            struct neighbour **hash_buckets;
            unsigned int hash_mask;
            __u32 hash_rnd;
            struct rcu_head rcu;
    };

    And "struct neigh_table" has an RCU protected pointer to such a
    neigh_hash_table.

    This means the signature of the (*hash)() function changed: we need to add a
    third parameter with the actual hash_rnd value, since this is no
    longer a neigh_table field.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
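
The signature change can be illustrated with a small user-space sketch in which the seed is passed to the hash callback rather than read from the table. The types are simplified (an integer key and interface index instead of kernel pointers), the mixing function is an arbitrary stand-in rather than the kernel's hash, and all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the new hash table structure. */
struct demo_hash_table {
    unsigned int hash_mask;  /* bucket count minus one */
    uint32_t hash_rnd;       /* per-table random seed */
};

/* Hash callback: the seed arrives as the third parameter.
 * The mixing steps are illustrative only. */
uint32_t demo_hash(uint32_t key, int ifindex, uint32_t hash_rnd)
{
    uint32_t h = key ^ (uint32_t)ifindex ^ hash_rnd;

    h ^= h >> 16;
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return h;
}

/* The caller reads the seed and mask from the same table snapshot. */
unsigned int demo_bucket(const struct demo_hash_table *t,
                         uint32_t key, int ifindex)
{
    assert(((t->hash_mask + 1u) & t->hash_mask) == 0u);  /* power of two */
    return demo_hash(key, ifindex, t->hash_rnd) & t->hash_mask;
}
```

Taking the seed and the mask from one table snapshot keeps them consistent even if the table pointer is swapped for a resized table while a lookup is running.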
     

30 Sep, 2010

1 commit


24 Sep, 2010

1 commit


03 Sep, 2010

1 commit

  • Clean the code up according to Documentation/CodingStyle.

    Don't initialize the variable dont_send in arp_process().

    Remove the temporary variable flags in arp_state_to_flags().

    Signed-off-by: Changli Gao
    Signed-off-by: David S. Miller

    Changli Gao
     

13 Jul, 2010

1 commit


26 Jun, 2010

1 commit


11 Jun, 2010

1 commit


04 Jun, 2010

1 commit

  • Avoid two atomic ops in arp_fwd_proxy()

    Avoid two atomic ops in arp_process()

    These are valid optimizations since arp_rcv() is run under rcu_read_lock()

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

18 May, 2010

1 commit


10 May, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

17 Feb, 2010

1 commit

  • Stop computing the number of neighbour table settings we have by
    counting the number of binary sysctls. This behaviour was silly
    and meant that we could not add another neighbour table setting
    without also adding another binary sysctl.

    Don't pass the binary sysctl path for neighbour table entries
    into neigh_sysctl_register. These parameters are no longer
    used and so are just dead code.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

19 Jan, 2010

1 commit

  • If the per device ARP_ACCEPT option is enabled, currently we only allow
    creating new ARP cache entries for response type gratuitous ARP.

    Allowing gratuitous ARP to create new ARP entries (not only to update
    existing ones) is useful when we want to avoid unnecessary delays for
    the first packet of a stream.

    This patch allows request type gratuitous ARP to create new ARP cache
    entries as well. This is useful when we want to populate the ARP cache
    entries for a large number of hosts on the same LAN.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     

07 Jan, 2010

1 commit

  • This is to be used together with switch technologies, like RFC 3069,
    where the individual ports are not allowed to communicate with
    each other, but they are allowed to talk to the upstream router. As
    described in RFC 3069, it is possible to allow these hosts to
    communicate through the upstream router by proxy_arp'ing.

    This patch basically allows proxy arp replies back to the same
    interface (from which the ARP request/solicitation was received).

    Tunable per device via proc "proxy_arp_pvlan":
    /proc/sys/net/ipv4/conf/*/proxy_arp_pvlan

    This switch technology is known by different vendor names:
    - In RFC 3069 it is called VLAN Aggregation.
    - Cisco and Allied Telesyn call it Private VLAN.
    - Hewlett-Packard call it Source-Port filtering or port-isolation.
    - Ericsson call it MAC-Forced Forwarding (RFC Draft).

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     

12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatibility wrapper around /proc/sys,
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables, are unused and can be
    removed.

    In addition neigh_sysctl_register has been modified to no longer
    take a strategy argument and its callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

02 Sep, 2009

1 commit


31 Jul, 2009

1 commit


01 Jul, 2009

1 commit

  • This reverts commit 73ce7b01b4496a5fbf9caf63033c874be692333f.

    After discovering that we don't listen to gratuitous arps in 2.6.30
    I tracked the failure down to this commit.

    The patch makes absolutely no sense. RFC 2131, RFC 3927 and RFC 5227
    are all in agreement that an arp request with sip == 0 should be used
    for the probe (to prevent learning) and an arp request with sip == tip
    should be used for the gratuitous announcement that people can learn
    from.

    It appears the author of the broken patch got those two cases confused
    and modified the code to drop all gratuitous arp traffic. Ouch!

    Cc: stable@kernel.org
    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
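
The two cases that this commit message distinguishes can be written down as a small classifier. This is an illustrative sketch only: addresses are plain 32-bit integers, and the enum and function names are hypothetical rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

enum demo_arp_kind {
    DEMO_ARP_PROBE,     /* sip == 0: address probe, must not be learned from */
    DEMO_ARP_ANNOUNCE,  /* sip == tip: gratuitous announcement, may be learned */
    DEMO_ARP_ORDINARY   /* any other request */
};

/* Classify an ARP request by its sender (sip) and target (tip) address. */
enum demo_arp_kind demo_classify_request(uint32_t sip, uint32_t tip)
{
    if (sip == 0)
        return DEMO_ARP_PROBE;
    if (sip == tip)
        return DEMO_ARP_ANNOUNCE;
    return DEMO_ARP_ORDINARY;
}

/* Small self-check covering the three cases. */
void demo_arp_selfcheck(void)
{
    assert(demo_classify_request(0u, 0xc0a80001u) == DEMO_ARP_PROBE);
    assert(demo_classify_request(0xc0a80001u, 0xc0a80001u) == DEMO_ARP_ANNOUNCE);
    assert(demo_classify_request(0xc0a80001u, 0xc0a80002u) == DEMO_ARP_ORDINARY);
}
```

Dropping everything that is not `DEMO_ARP_ORDINARY` would discard the announcements as well as the probes, which is the confusion the revert describes.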
     

03 Jun, 2009

2 commits

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of:
            dst_release(skb->dst)
            skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb

    Delete skb->rtable field

    Setting rtable is not allowed, just set dst instead as rtable is an alias.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
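
The accessor pattern from the two commits above can be mocked in user space as follows. The structures are hypothetical stand-ins (a bare reference count and a hidden pointer field), not the kernel definitions; only the shape of the three dst accessors and the rtable alias follows the commit messages.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dst { int refcnt; };
struct demo_rtable { struct demo_dst dst; };  /* dst is the first member */
struct demo_skb { struct demo_dst *_dst; };   /* field hidden behind accessors */

/* Drop one reference; NULL is accepted and ignored. */
void demo_dst_release(struct demo_dst *dst)
{
    if (dst) {
        assert(dst->refcnt > 0);
        dst->refcnt--;
    }
}

struct demo_dst *demo_skb_dst(const struct demo_skb *skb)
{
    return skb->_dst;
}

void demo_skb_dst_set(struct demo_skb *skb, struct demo_dst *dst)
{
    skb->_dst = dst;
}

/* Replaces the open-coded release followed by a NULL assignment. */
void demo_skb_dst_drop(struct demo_skb *skb)
{
    demo_dst_release(skb->_dst);
    skb->_dst = NULL;
}

/* rtable is an alias of dst, so the getter is only a cast. */
struct demo_rtable *demo_skb_rtable(const struct demo_skb *skb)
{
    return (struct demo_rtable *)demo_skb_dst(skb);
}
```

There is deliberately no rtable setter in the sketch: as the second commit says, callers set the dst and read the rtable through the alias.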