15 Jul, 2010

1 commit

  • When configuring DMVPN (GRE + openNHRP) and a GRE remote
    address is configured a kernel Oops is observed. The
    observed Oops is caused by a NULL header_ops pointer
    (neigh->dev->header_ops) in neigh_update_hhs() when

    void (*update)(struct hh_cache*, const struct net_device*, const unsigned char *)
    = neigh->dev->header_ops->cache_update;

    is executed. The dev associated with the NULL header_ops is
    the GRE interface. This patch guards against the
    possibility that header_ops is NULL.

    This Oops was first observed in kernel version 2.6.26.8.

    Signed-off-by: Doug Kehn
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Doug Kehn
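
The guard described above can be sketched in user-space C. This is a minimal illustration, not the actual patch: the structures are stripped-down stand-ins, and `update_cached_header()` and `demo_cache_update()` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct hh_cache { int refreshed; };
struct net_device;

struct header_ops {
    void (*cache_update)(struct hh_cache *hh, const struct net_device *dev,
                         const unsigned char *haddr);
};

struct net_device {
    const struct header_ops *header_ops; /* may be NULL, e.g. on a GRE tunnel */
};

/* Example callback that only records that it ran. */
static void demo_cache_update(struct hh_cache *hh, const struct net_device *dev,
                              const unsigned char *haddr)
{
    (void)dev;
    (void)haddr;
    hh->refreshed = 1;
}

/* Returns 1 if the cached header was refreshed, 0 if there was no callback. */
static int update_cached_header(struct hh_cache *hh, const struct net_device *dev,
                                const unsigned char *haddr)
{
    void (*update)(struct hh_cache *, const struct net_device *,
                   const unsigned char *) = NULL;

    assert(hh != NULL && dev != NULL);

    /* Guard: only dereference header_ops when the device provides one. */
    if (dev->header_ops)
        update = dev->header_ops->cache_update;

    if (!update)
        return 0;

    update(hh, dev, haddr);
    return 1;
}
```

With a device whose `header_ops` is NULL the function returns 0 instead of dereferencing the pointer; with a device that supplies `demo_cache_update` it runs the callback and returns 1.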
     

28 May, 2010

1 commit

  • commit 7fee226ad23 (net: add a noref bit on skb dst) missed one spot
    where an skb is enqueued, with a possibly not refcounted dst entry.

    __neigh_event_send() inserts skb into arp_queue, so we must make sure
    dst entry is refcounted, or dst entry can be freed by garbage collector
    after caller exits from rcu protected section.

    Reported-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
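
A rough user-space model of the rule stated above: a dst that is only borrowed inside an RCU section has to be turned into a counted reference before the skb is parked on a queue. The structures, the `dst_noref` flag and the function names below are simplified assumptions for illustration and do not reproduce the kernel's real layout or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified models; not the kernel's real layouts. */
struct dst_entry { int refcnt; };

struct sk_buff {
    struct dst_entry *dst;
    int dst_noref;          /* 1: dst is borrowed, no reference is held */
    struct sk_buff *next;
};

struct skb_queue { struct sk_buff *head, *tail; int len; };

/* Convert a borrowed dst into a counted one so it can outlive the caller. */
static void dst_force_ref(struct sk_buff *skb)
{
    if (skb->dst && skb->dst_noref) {
        skb->dst->refcnt++;
        skb->dst_noref = 0;
    }
}

/* Park the skb until resolution completes; it may stay queued after the
 * caller has left its protected section. */
static void queue_for_resolution(struct skb_queue *q, struct sk_buff *skb)
{
    assert(q != NULL && skb != NULL);

    dst_force_ref(skb);     /* must happen before the skb is enqueued */

    skb->next = NULL;
    if (q->tail)
        q->tail->next = skb;
    else
        q->head = skb;
    q->tail = skb;
    q->len++;
}
```

In this model, skipping `dst_force_ref()` would leave the queued skb pointing at a dst whose count does not include it, which is the situation the commit message warns about.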
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

10 Mar, 2010

1 commit


17 Feb, 2010

1 commit

  • Stop computing the number of neighbour table settings we have by
    counting the number of binary sysctls. This behaviour was silly
    and meant that we could not add another neighbour table setting
    without also adding another binary sysctl.

    Don't pass the binary sysctl path for neighbour table entries
    into neigh_sysctl_register. These parameters are no longer
    used and so are just dead code.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

23 Jan, 2010

1 commit


08 Dec, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
    mac80211: fix reorder buffer release
    iwmc3200wifi: Enable wimax core through module parameter
    iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
    iwmc3200wifi: Coex table command does not expect a response
    iwmc3200wifi: Update wiwi priority table
    iwlwifi: driver version track kernel version
    iwlwifi: indicate uCode type when fail dump error/event log
    iwl3945: remove duplicated event logging code
    b43: fix two warnings
    ipw2100: fix rebooting hang with driver loaded
    cfg80211: indent regulatory messages with spaces
    iwmc3200wifi: fix NULL pointer dereference in pmkid update
    mac80211: Fix TX status reporting for injected data frames
    ath9k: enable 2GHz band only if the device supports it
    airo: Fix integer overflow warning
    rt2x00: Fix padding bug on L2PAD devices.
    WE: Fix set events not propagated
    b43legacy: avoid PPC fault during resume
    b43: avoid PPC fault during resume
    tcp: fix a timewait refcnt race
    ...

    Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
    CTL_UNNUMBERED removed) in
    kernel/sysctl_check.c
    net/ipv4/sysctl_net_ipv4.c
    net/ipv6/addrconf.c
    net/sctp/sysctl.c

    Linus Torvalds
     

26 Nov, 2009

1 commit

  • Generated with the following semantic patch

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 == n2
    + net_eq(n1, n2)

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 != n2
    + !net_eq(n1, n2)

    applied over {include,net,drivers/net}.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
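
As a sketch of what the rewritten comparisons rely on, the helper below wraps the pointer comparison in one place. The `DEMO_NET_NS` switch is a hypothetical stand-in for a namespace build option, `struct net` is a placeholder, and `count_matching()` is an invented caller. The commit message does not state the motivation; a plausible one is that a single helper can collapse to a constant when namespaces are compiled out.

```c
#include <assert.h>
#include <stddef.h>

struct net { int id; };   /* hypothetical placeholder for the real struct net */

#ifndef DEMO_NET_NS
#define DEMO_NET_NS 1     /* stand-in for a namespace config option */
#endif

#if DEMO_NET_NS
/* Namespaces enabled: two namespaces are equal only if they are the same object. */
static inline int net_eq(const struct net *n1, const struct net *n2)
{
    return n1 == n2;
}
#else
/* Namespaces disabled: there is only one namespace, so always equal. */
static inline int net_eq(const struct net *n1, const struct net *n2)
{
    (void)n1;
    (void)n2;
    return 1;
}
#endif

/* Example caller: count the list entries that live in the target namespace. */
static int count_matching(const struct net *const *list, size_t n,
                          const struct net *target)
{
    int matches = 0;
    size_t i;

    assert(list != NULL);
    for (i = 0; i < n; i++)
        if (net_eq(list[i], target))   /* was: list[i] == target */
            matches++;
    return matches;
}
```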
     

12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatibility wrapper around /proc/sys,
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables, are unused and can be
    removed.

    In addition neigh_sysctl_register has been modified to no longer
    take a strategy argument and its callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

03 Aug, 2009

1 commit

  • Current neigh_periodic_timer() function is fired by timer IRQ, and
    scans one hash bucket each round (very little work in fact).

    As we are supposed to scan whole hash table in 15 seconds, this means
    neigh_periodic_timer() can be fired very often. (depending on the number
    of concurrent hash entries we stored in this table)

    Converting this to a workqueue permits scanning whole table, minimizing
    icache pollution, and firing this work every 15 seconds, independently
    of hash table size.

    This 15 seconds delay is not a hard number, as work is a deferrable one.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
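
To make the trade-off concrete, here is a small arithmetic sketch. It assumes the per-bucket timer spreads one full table pass evenly over the period and that the interval is clamped to at least 1 ms; both are assumptions for illustration, as are the function names and the one-age-per-bucket table model.

```c
#include <assert.h>

/* Interval between timer firings when each firing scans a single bucket and
 * the whole table must be covered once per period. */
static unsigned int per_bucket_interval_ms(unsigned int period_ms,
                                           unsigned int nbuckets)
{
    unsigned int interval;

    assert(nbuckets > 0);
    interval = period_ms / nbuckets;
    return interval ? interval : 1;   /* assumed lower bound of 1 ms */
}

/* Work-item style: one invocation walks every bucket. Returns how many
 * buckets hold an entry older than max_age (hypothetical staleness rule,
 * with one age value per bucket to keep the model small). */
static unsigned int scan_whole_table(const unsigned int *bucket_age,
                                     unsigned int nbuckets,
                                     unsigned int max_age)
{
    unsigned int i, stale = 0;

    for (i = 0; i < nbuckets; i++)
        if (bucket_age[i] > max_age)
            stale++;
    return stale;
}
```

Under these assumptions a 1024-bucket table with a 15000 ms period means a timer firing roughly every 14 ms, whereas the whole-table scan runs once per period regardless of table size.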
     

14 Jul, 2009

1 commit


11 Jun, 2009

1 commit

  • The current code errors out the INCOMPLETE neigh entry skb queue only from
    the timer if maximum probes have been attempted and there has been no reply.
    This also causes the transition to FAILED state.

    However, the neigh entry can be also updated via Netlink to inform that the
    address is unavailable. Currently, neigh_update() just stops the timers and
    leaves the pending skbs unreleased. As a result, the clean-up code in the
    timer callback is never called, which also prevents proper garbage collection.

    This fixes neigh_update() to process the pending skb queue immediately if
    INCOMPLETE -> FAILED state transition occurs due to a Netlink request.

    Signed-off-by: Timo Teras
    Signed-off-by: David S. Miller

    Timo Teras
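
The behaviour described above can be modelled with a tiny state machine. This is a sketch under simplifying assumptions: the entry only tracks counters instead of real skbs, and every name here (`neigh_demo`, `flush_pending()`, `neigh_update_demo()`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_INCOMPLETE, DEMO_REACHABLE, DEMO_FAILED };

/* Hypothetical model of a neighbour entry; counters replace real skbs. */
struct neigh_demo {
    enum demo_state state;
    int queued;           /* packets waiting for address resolution */
    int errors_reported;  /* packets errored out back to their senders */
    int timer_armed;
};

/* Error out everything that was waiting for resolution. */
static void flush_pending(struct neigh_demo *n)
{
    n->errors_reported += n->queued;
    n->queued = 0;
}

/* State change requested from outside the timer (e.g. via Netlink). */
static void neigh_update_demo(struct neigh_demo *n, enum demo_state new_state)
{
    enum demo_state old_state;

    assert(n != NULL);
    old_state = n->state;
    n->state = new_state;

    if (new_state == DEMO_FAILED) {
        n->timer_armed = 0;
        /* The timer will never run its clean-up now, so do it here. */
        if (old_state == DEMO_INCOMPLETE)
            flush_pending(n);
    }
}
```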
     

03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of :
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
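
A minimal user-space sketch of the three accessors, using the signatures quoted above. The private field name `_dst`, the integer reference count and the simplified `dst_release()` are assumptions made for this example; the real structures differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified structures. */
struct dst_entry { int refcnt; };
struct sk_buff { struct dst_entry *_dst; };  /* field name is an assumption */

/* Simplified release: just drop one reference. */
static void dst_release(struct dst_entry *dst)
{
    if (dst)
        dst->refcnt--;
}

/* Read the dst attached to the skb. */
static struct dst_entry *skb_dst(const struct sk_buff *skb)
{
    return skb->_dst;
}

/* Attach a dst to the skb. */
static void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
    skb->_dst = dst;
}

/* Replaces the open-coded pair: dst_release(skb->dst); skb->dst = NULL; */
static void skb_dst_drop(struct sk_buff *skb)
{
    assert(skb != NULL);
    dst_release(skb->_dst);
    skb->_dst = NULL;
}
```

Once every caller goes through the accessors, the storage behind them can change without touching the call sites, which is presumably what makes deleting the public field possible.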
     

04 Mar, 2009

1 commit

  • Currently it is possible to do just about everything with the arp table
    from user space except treat an entry like you are using it. To that end
    implement a flag NTF_USE that, when set in a netlink update request,
    treats the neighbour table entry like the kernel does on the output path.

    This allows user space applications to share the kernel's arp cache.

    Signed-off-by: Eric Biederman
    Signed-off-by: David S. Miller

    Eric Biederman
     

27 Feb, 2009

1 commit


25 Feb, 2009

1 commit

  • This patch changes the return value of nlmsg_notify() as follows:

    If NETLINK_BROADCAST_ERROR is set by any of the listeners and
    an error in the delivery happened, return the broadcast error;
    else if there are no listeners apart from the socket that
    requested a change with the echo flag, return the result of the
    unicast notification. Thus, with this patch, the unicast
    notification is handled in the same way as a broadcast listener
    that has set the NETLINK_BROADCAST_ERROR socket flag.

    This patch is useful in case that the caller of nlmsg_notify()
    wants to know the result of the delivery of a netlink notification
    (including the broadcast delivery) and take any action in case
    that the delivery failed. For example, ctnetlink can drop packets
    if the event delivery failed to provide reliable logging and
    state-synchronization at the cost of dropping packets.

    This patch also modifies the rtnetlink code to ignore the return
    value of rtnl_notify() in all callers. The function rtnl_notify()
    (before this patch) returned the error of the unicast notification
    which makes rtnl_set_sk_err() report errors to all listeners. This
    is not of any help since the origin of the change (the socket that
    requested the echoing) notices the ENOBUFS error if the notification
    fails and should resync itself.

    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     

06 Feb, 2009

1 commit

  • neightbl_dump_info and neigh_dump_table can skip entries if the
    *fill*info functions return an error. This results in an incomplete
    dump (invoked by netlink requests for RTM_GETNEIGHTBL or
    RTM_GETNEIGH)

    nidx and idx should not be incremented if the current entry was not
    placed in the output buffer

    Signed-off-by: Gautam Kachroo
    Signed-off-by: David S. Miller

    Gautam Kachroo
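
The indexing rule can be illustrated with a resumable dump over a plain array. This is a sketch, not the kernel code: `fill_info()` and `dump_table()` are hypothetical names, and a fixed-capacity integer buffer stands in for the netlink message.

```c
#include <assert.h>
#include <stddef.h>

/* Try to place one entry in the output buffer; fails when the buffer is full. */
static int fill_info(int *buf, int cap, int *used, int value)
{
    if (*used >= cap)
        return -1;
    buf[(*used)++] = value;
    return 0;
}

/* Dump entries starting at index start. Returns the index to resume from. */
static int dump_table(const int *entries, int n, int start,
                      int *buf, int cap, int *used)
{
    int idx = start;

    assert(entries != NULL && buf != NULL && used != NULL);
    *used = 0;
    while (idx < n) {
        if (fill_info(buf, cap, used, entries[idx]) < 0)
            break;      /* entry was not emitted: leave idx pointing at it */
        idx++;          /* advance only after a successful fill */
    }
    return idx;
}
```

Calling `dump_table()` repeatedly with the returned index as the next `start` emits every entry exactly once; advancing `idx` before checking the fill result would skip the entry that did not fit.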
     

30 Dec, 2008

1 commit


21 Nov, 2008

1 commit

  • This patch moves neigh_setup and hard_start_xmit into the network device ops
    structure. For bisection, fix all the previously converted drivers as well.
    Bonding driver took the biggest hit on this.

    Added a prefetch of the hard_start_xmit in the fast path to try and reduce
    any impact this would have.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

12 Nov, 2008

2 commits


04 Nov, 2008

1 commit

  • I want to compile out proc_* and sysctl_* handlers totally and
    stub them to NULL depending on config options, however usage of &
    will prevent this, since taking address of NULL pointer will break
    compilation.

    So, drop & in front of every ->proc_handler and every ->strategy
    handler, it was never needed in fact.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
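
A small sketch of why the `&` gets in the way. The `DEMO_HAVE_HANDLER` switch, the table layout and the handler name are all hypothetical; the point is only that a function name used without `&` can be replaced by a `NULL` macro, while `&NULL` does not compile.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_t)(int value);

#ifndef DEMO_HAVE_HANDLER
#define DEMO_HAVE_HANDLER 1   /* stand-in for a config option */
#endif

#if DEMO_HAVE_HANDLER
static int demo_dointvec(int value)
{
    return value * 2;
}
#else
#define demo_dointvec NULL    /* handler compiled out and stubbed */
#endif

struct ctl_entry_demo {
    const char *procname;
    handler_t proc_handler;
};

static const struct ctl_entry_demo demo_table[] = {
    /* Writing &demo_dointvec here would expand to &NULL when the handler
     * is stubbed, and that fails to compile. The bare name works either way. */
    { "retrans_time", demo_dointvec },
};

/* Invoke the handler if present; returns -1 when it was compiled out. */
static int call_handler(const struct ctl_entry_demo *entry, int value, int *out)
{
    assert(entry != NULL && out != NULL);
    if (!entry->proc_handler)
        return -1;
    *out = entry->proc_handler(value);
    return 0;
}
```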
     

29 Oct, 2008

1 commit

  • call_rcu() will unconditionally rewrite RCU head anyway.
    Applies to
    struct neigh_parms
    struct neigh_table
    struct net
    struct cipso_v4_doi
    struct in_ifaddr
    struct in_device
    rt->u.dst

    Signed-off-by: Alexey Dobriyan
    Acked-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

23 Sep, 2008

1 commit


03 Aug, 2008

2 commits

  • When pneigh entries exist, but the user's read buffer isn't sufficient to
    hold them all, one of the pneigh entries will be missing from the results.

    In neigh_get_idx_any, the number of elements which neigh_get_idx
    encountered is not correctly subtracted from the position number before
    the call to pneigh_get_idx. neigh_get_idx reduces the position by 1 for
    each call to neigh_get_next, but it does not reduce it by one for the
    first element (neigh_get_first). The patch alters the neigh_get_idx and
    pneigh_get_idx functions to subtract one from pos, for the first element,
    when pos is non-zero.

    Signed-off-by: Chris Larson
    Signed-off-by: David S. Miller

    Chris Larson
     
  • neigh_seq_next won't be called both with *pos > 0 && v ==
    SEQ_START_TOKEN, so there's no point calling neigh_get_idx when we're
    on the start token, just call neigh_get_first directly.

    Signed-off-by: Chris Larson
    Signed-off-by: David S. Miller

    Chris Larson
     

17 Jul, 2008

1 commit

  • in __neigh_event_send, if we have a neighbour entry which is in
    NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
    to the neighbour's arp_queue, which by default is capped to a length of 3
    skbs. If that queue exceeds its set length, it will drop an skb on
    the queue to enqueue the newly arrived skb. This results in a drop
    for which we have no statistics incremented. This patch adds an
    unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
    lost frames.

    Signed-off-by: Neil Horman
    Signed-off-by: David S. Miller

    Neil Horman
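
A sketch of a capped queue that counts what it throws away. The commit message only says that an skb on the queue is dropped; this example assumes it is the oldest one. The ring buffer, the integer packet IDs and the names are hypothetical, and only the counter name mirrors the statistic mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define ARP_QUEUE_LEN 3   /* default cap mentioned in the text */

/* Hypothetical ring-buffer model; integers stand in for skbs. */
struct arp_queue_demo {
    int items[ARP_QUEUE_LEN];
    int head;                            /* index of the oldest item */
    int count;
    unsigned long unresolved_discards;   /* packets dropped to make room */
};

static void arp_queue_push(struct arp_queue_demo *q, int skb_id)
{
    assert(q != NULL);

    if (q->count == ARP_QUEUE_LEN) {
        /* Queue full: discard the oldest packet (assumption) and count it. */
        q->head = (q->head + 1) % ARP_QUEUE_LEN;
        q->count--;
        q->unresolved_discards++;
    }
    q->items[(q->head + q->count) % ARP_QUEUE_LEN] = skb_id;
    q->count++;
}
```

Pushing five packets into an empty queue leaves the last three queued and the counter at 2, so the otherwise silent drops become visible.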
     

04 Jun, 2008

2 commits

  • Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
    nla_nest_cancel() void functions.

    Return -EMSGSIZE instead of -1 if the provided message buffer is not
    big enough.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • The neighbor table time of last use information is returned in the
    incorrect unit. Kernel to user space ABI's need to use USER_HZ (or
    milliseconds), otherwise the application has to try and discover the
    real system HZ value which is problematic. Linux has standardized on
    keeping USER_HZ consistent (100hz) even when kernel is running
    internally at some other value.

    This change is small, but it breaks the ABI for older versions of
    iproute2 utilities. But these utilities are already broken since they
    are looking at the psched_hz values which are completely different. So
    let's just go ahead and fix both kernel and user space. Older
    utilities will just print wrong values.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
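
The unit conversion at issue can be sketched as follows. `DEMO_USER_HZ` takes the value 100 from the text above; the function name and the plain integer arithmetic are assumptions for illustration and are not the kernel's own conversion routine.

```c
#include <assert.h>

#define DEMO_USER_HZ 100   /* user-visible tick rate, per the text above */

/* Convert a count of internal ticks, taken at hz ticks per second, into
 * user-visible ticks so that user space never needs to know hz. */
static unsigned long ticks_to_user_hz(unsigned long ticks, unsigned int hz)
{
    assert(hz > 0);
    /* Widen before multiplying to avoid overflow on 32-bit longs. */
    return (unsigned long)(((unsigned long long)ticks * DEMO_USER_HZ) / hz);
}
```

For example, 2500 internal ticks at 250 Hz and 10000 internal ticks at 1000 Hz both describe 10 seconds and both come out as 1000 user ticks.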
     

02 May, 2008

1 commit


28 Mar, 2008

3 commits


26 Mar, 2008

5 commits


25 Mar, 2008

1 commit

  • Proxy neighbors do not have any reference counting, so any caller
    of pneigh_lookup (unless it's a netlink triggered add/del routine)
    should _not_ perform any actions on the found proxy entry.

    There's one exception from this rule - the ipv6's ndisc_recv_ns()
    uses found entry to check the flags for NTF_ROUTER.

    This creates a race between the ndisc and pneigh_delete - after
    the pneigh is returned to the caller, the nd_tbl.lock is dropped
    and the deleting procedure may proceed.

    One of the fixes would be to add a reference counting, but this
    problem exists for ndisc only. Besides such a patch would be too
    big for -rc4.

    So I propose to introduce a __pneigh_lookup() which is supposed
    to be called with the lock held and use it in ndisc code to check
    the flags on alive pneigh entry.

    Changes from v2:
    As David noticed, exported the __pneigh_lookup() to ipv6 module.
    The checkpatch generates a warning on it, since the EXPORT_SYMBOL
    does not follow the symbol itself, but in this file all the
    exports come at the end, so I decided not to break this harmony.

    Changes from v1:
    Fixed comments from YOSHIFUJI - indentation of prototype in header
    and the pndisc_check_router() name - and a compilation fix, pointed
    by Daniel - the is_routed was (falsely) considered as uninitialized
    by gcc.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

06 Mar, 2008

1 commit