19 Mar, 2010

1 commit

  • When doing "ifenslave -d bond0 eth0", there is a chance of a NULL
    dereference in netif_receive_skb(), because dev->master can become
    NULL after we have tested it.

    We should use ACCESS_ONCE() to avoid this (or rcu_dereference())

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
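
A minimal userspace sketch of the single-load pattern described above. The macro and struct names (ACCESS_ONCE_SKETCH, net_device_sketch, receiving_ifindex) are hypothetical stand-ins, not the kernel's definitions. The point is only that the pointer is loaded once into a local, and the same local is both tested and dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for ACCESS_ONCE(): the volatile cast forces one load.
 * Relies on the GCC/Clang __typeof__ extension. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, heavily reduced device structure. */
struct net_device_sketch {
    int ifindex;
    struct net_device_sketch *master; /* may be cleared by another context */
};

/* Return the ifindex of the device that should receive the packet. */
int receiving_ifindex(struct net_device_sketch *dev)
{
    struct net_device_sketch *master;

    assert(dev != NULL);

    /* Load dev->master exactly once, then test and use the local copy. */
    master = ACCESS_ONCE_SKETCH(dev->master);
    if (master)
        return master->ifindex;
    return dev->ifindex;
}
```

Testing dev->master and then reading dev->master again would allow the second read to see NULL. Working from the local copy closes that window in this sketch, although it says nothing about the lifetime of the object pointed to.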
     

17 Mar, 2010

1 commit

  • Stanse found that one error path in netpoll_setup dereferences npinfo
    even though it is NULL. Avoid that by adding a new label and jumping
    to it instead.

    Signed-off-by: Jiri Slaby
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Acked-by: chavey@google.com
    Acked-by: Matt Mackall
    Signed-off-by: David S. Miller

    Jiri Slaby
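
The two-label error handling described above can be sketched as follows. This is not the netpoll_setup code. The structs, the fail_alloc parameter, and the label names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical reduced structures. */
struct npinfo_sketch { int rx_flags; };
struct dev_sketch { struct npinfo_sketch *npinfo; int carrier_ok; };

/* fail_alloc simulates an allocation failure so both paths can be tried. */
int setup_sketch(struct dev_sketch *dev, int fail_alloc)
{
    struct npinfo_sketch *npinfo;
    int err;

    assert(dev != NULL);

    npinfo = fail_alloc ? NULL : calloc(1, sizeof(*npinfo));
    if (!npinfo) {
        err = -ENOMEM;
        goto out;       /* npinfo is NULL: skip the cleanup that touches it */
    }

    if (!dev->carrier_ok) {
        err = -ENODEV;
        goto release;   /* npinfo is valid here, so cleanup may use it */
    }

    dev->npinfo = npinfo;
    return 0;

release:
    npinfo->rx_flags = 0;   /* safe: only reached with a non-NULL npinfo */
    free(npinfo);
out:
    return err;
}
```

With a single label, the allocation-failure path would run the npinfo cleanup as well. The extra label lets that path jump past it.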
     

10 Mar, 2010

2 commits


09 Mar, 2010

1 commit


08 Mar, 2010

1 commit


06 Mar, 2010

4 commits

  • On 03/04/2010 09:26 AM, Ben Hutchings wrote:
    > On Thu, 2010-03-04 at 00:51 -0800, Jeff Kirsher wrote:
    >> From: Jeff Garzik
    >>
    >> This patch is an alternative approach for accessing string
    >> counts, vs. the drvinfo indirect approach. This way the drvinfo
    >> space doesn't run out, and we don't break ABI later.
    > [...]
    >> --- a/net/core/ethtool.c
    >> +++ b/net/core/ethtool.c
    >> @@ -214,6 +214,10 @@ static noinline int ethtool_get_drvinfo(struct net_device *dev, void __user *use
    >> info.cmd = ETHTOOL_GDRVINFO;
    >> ops->get_drvinfo(dev,&info);
    >>
    >> + /*
    >> + * this method of obtaining string set info is deprecated;
    >> + * consider using ETHTOOL_GSSET_INFO instead
    >> + */
    >
    > This comment belongs on the interface (ethtool.h) not the
    > implementation.

    Debatable -- the current comment is located at the callsite of
    ops->get_sset_count(), which is where an implementor might think to add
    a new call. Not all the numeric fields in ethtool_drvinfo are obtained
    from ->get_sset_count().

    Hence the "some" in the attached patch to include/linux/ethtool.h,
    addressing your comment.

    > [...]
    >> +static noinline int ethtool_get_sset_info(struct net_device *dev,
    >> + void __user *useraddr)
    >> +{
    > [...]
    >> + /* calculate size of return buffer */
    >> + for (i = 0; i< 64; i++)
    >> + if (sset_mask& (1ULL<< i))
    >> + n_bits++;
    > [...]
    >
    > We have a function for this:
    >
    > n_bits = hweight64(sset_mask);

    Agreed.

    I've attached a follow-up patch, which should enable my/Jeff's kernel
    patch to be applied, followed by this one.

    Signed-off-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Jeff Garzik
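
For reference, the review point about hweight64() can be illustrated in userspace. The sketch below compares the quoted bit-by-bit loop with a portable population count. hweight64_sketch is a hypothetical stand-in written for this example, not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-by-bit count, equivalent to the loop quoted in the review. */
unsigned int count_bits_loop(uint64_t mask)
{
    unsigned int i, n_bits = 0;

    for (i = 0; i < 64; i++)
        if (mask & (1ULL << i))
            n_bits++;
    return n_bits;
}

/* Portable stand-in for hweight64(): clear the lowest set bit each pass. */
unsigned int hweight64_sketch(uint64_t mask)
{
    unsigned int n_bits = 0;

    while (mask) {
        mask &= mask - 1;
        n_bits++;
    }
    return n_bits;
}

/* Self-check helper: both approaches must agree for a given mask. */
void assert_same_count(uint64_t mask)
{
    assert(count_bits_loop(mask) == hweight64_sketch(mask));
}
```

Both functions return the number of set bits in the mask, which is the number of string sets whose counts go into the return buffer.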
     
  • This patch is an alternative approach for accessing string
    counts, vs. the drvinfo indirect approach. This way the drvinfo
    space doesn't run out, and we don't break ABI later.

    Signed-off-by: Jeff Garzik
    Signed-off-by: Peter P Waskiewicz Jr
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Garzik
     
  • sk_add_backlog -> __sk_add_backlog
    sk_add_backlog_limited -> sk_add_backlog

    Signed-off-by: Zhu Yi
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zhu Yi
     
  • We hit a system OOM while running UDP netperf tests on the loopback
    device. In this case, multiple senders stream UDP packets to a single
    receiver via loopback on the local host. Naturally, the receiver cannot
    handle all the packets in time. Surprisingly, though, these packets
    were not discarded when the receiver's sk->sk_rcvbuf limit was reached.
    Instead, they kept being queued to sk->sk_backlog and eventually ate
    up all the memory. We believe this is a security hole that allows an
    unprivileged user to crash the system.

    The root cause is that when the receiver runs __release_sock() (i.e.
    after a userspace recv, kernel udp_recvmsg ->
    skb_free_datagram_locked -> release_sock), it moves skbs from the
    backlog to sk_receive_queue with softirqs enabled. In the case above,
    multiple busy senders turn this into an almost endless loop, and the
    skbs in the backlog end up eating all the system memory.

    The issue is not limited to UDP. Any protocol using the socket backlog
    is potentially affected. The patch adds a limit to the socket backlog
    so that it cannot grow without bound.

    Reported-by: Alex Shi
    Cc: David Miller
    Cc: Arnaldo Carvalho de Melo
    Cc: Alexey Kuznetsov
    Cc: Patrick McHardy
    Cc: Vlad Yasevich
    Cc: Sridhar Samudrala
    Cc: Jon Maloy
    Cc: Allan Stephens
    Cc: Andrew Hendry
    Signed-off-by: Zhu Yi
    Signed-off-by: Eric Dumazet
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Zhu Yi
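
A simplified model of a bounded backlog, as a sketch only. The struct fields, function names, and the exact limit check are assumptions for illustration and may differ from what the patch does. The split into an unchecked helper and a checked wrapper loosely mirrors the __sk_add_backlog / sk_add_backlog naming above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reduced socket: only the backlog accounting is modelled. */
struct sock_sketch {
    size_t backlog_len;   /* bytes currently queued on the backlog */
    size_t backlog_limit; /* cap on queued bytes (assumed field) */
};

/* Unconditional queueing: accounts for the packet without any check. */
void backlog_add_unchecked(struct sock_sketch *sk, size_t truesize)
{
    assert(sk != NULL);
    sk->backlog_len += truesize;
}

/* Checked queueing: refuse the packet once the backlog is at its limit. */
int backlog_add(struct sock_sketch *sk, size_t truesize)
{
    assert(sk != NULL);
    if (sk->backlog_len >= sk->backlog_limit)
        return -ENOBUFS;    /* caller is expected to drop the packet */
    backlog_add_unchecked(sk, truesize);
    return 0;
}

/* Draining the backlog (the receiver catching up) resets the accounting. */
void backlog_drain(struct sock_sketch *sk)
{
    assert(sk != NULL);
    sk->backlog_len = 0;
}
```

In this model, busy senders get -ENOBUFS once the cap is reached, so queued memory stays bounded no matter how long the receiver takes to drain.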
     

01 Mar, 2010

2 commits


28 Feb, 2010

1 commit

  • NETIF_F_NTUPLE flag setting introduced a bug: non-ntuple flags
    like LRO may be successfully set, before ioctl(2) returns failure
    to userspace.

    The set-flags operation should be all-or-none, rather than leaving
    things in an inconsistent state prior to reporting failure to
    userspace.

    Signed-off-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Jeff Garzik
     

27 Feb, 2010

4 commits

  • commit e8469ed959c373c2ff9e6f488aa5a14971aebe1f
    Author: Patrick McHardy
    Date: Tue Feb 23 20:41:30 2010 +0100

    Support specifying the initial device flags when creating a device through
    rtnl_link. Devices allocated by rtnl_create_link() are marked as INITIALIZING
    in order to suppress netlink registration notifications. To complete setup,
    rtnl_configure_link() must be called, which performs the device flag changes
    and invokes the deferred notifiers if everything went well.

    Two examples:

    # add macvlan to eth0
    #
    $ ip link add link eth0 up allmulticast on type macvlan

    [LINK]11: macvlan0@eth0: mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 26:f8:84:02:f9:2a brd ff:ff:ff:ff:ff:ff
    [ROUTE]ff00::/8 dev macvlan0 table local metric 256 mtu 1500 advmss 1440 hoplimit 0
    [ROUTE]fe80::/64 dev macvlan0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 0
    [LINK]11: macvlan0@eth0: mtu 1500
    link/ether 26:f8:84:02:f9:2a
    [ADDR]11: macvlan0 inet6 fe80::24f8:84ff:fe02:f92a/64 scope link
    valid_lft forever preferred_lft forever
    [ROUTE]local fe80::24f8:84ff:fe02:f92a via :: dev lo table local proto none metric 0 mtu 16436 advmss 16376 hoplimit 0
    [ROUTE]default via fe80::215:e9ff:fef0:10f8 dev macvlan0 proto kernel metric 1024 mtu 1500 advmss 1440 hoplimit 0
    [NEIGH]fe80::215:e9ff:fef0:10f8 dev macvlan0 lladdr 00:15:e9:f0:10:f8 router STALE
    [ROUTE]2001:6f8:974::/64 dev macvlan0 proto kernel metric 256 expires 0sec mtu 1500 advmss 1440 hoplimit 0
    [PREFIX]prefix 2001:6f8:974::/64 dev macvlan0 onlink autoconf valid 14400 preferred 131084
    [ADDR]11: macvlan0 inet6 2001:6f8:974:0:24f8:84ff:fe02:f92a/64 scope global dynamic
    valid_lft 86399sec preferred_lft 14399sec

    # add VLAN to eth1, eth1 is down
    #
    $ ip link add link eth1 up type vlan id 1000
    RTNETLINK answers: Network is down

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Split dev_change_flags() into two functions: __dev_change_flags() to
    perform the actual changes and __dev_notify_flags() to invoke netdevice
    notifiers. This will be used by rtnl_link to defer netlink notifications
    until the device has been fully configured.

    This changes ordering of some operations, in particular:

    - netlink notifications are sent after all changes have been performed.
    As a side effect this suppresses one unnecessary netlink message when
    the IFF_UP and other flags are changed simultaneously.

    - The NETDEV_UP/NETDEV_DOWN and NETDEV_CHANGE notifiers are invoked
    after all changes have been performed. Their relative order is unchanged.

    - net_dmaengine_put() is invoked before the NETDEV_DOWN notifier instead
    of afterwards. This should not make any difference since both RX and TX
    are already shut down at this point.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • In order to support specifying device flags during device creation,
    we must be able to roll back device registration in case setting the
    flags fails without sending any notifications related to the device
    to userspace.

    This patch changes rollback_registered_many() and register_netdevice()
    to manually send netlink notifications for devices not handled by
    rtnl_link, and allows notifications for devices handled by rtnl_link
    to be deferred until setup is complete.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Commit 3b8bcfd (net: introduce pre-up netdev notifier) added a new
    notifier which is run before a device is set UP for use by cfg80211.

    That patch did not add the new notifier to the ignore list in
    rtnetlink_event(), so we currently get an unnecessary netlink
    notification before a device is set UP.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

26 Feb, 2010

6 commits


25 Feb, 2010

1 commit

  • Update rcu_dereference() primitives to use new lockdep-based
    checking. The rcu_dereference() in __in6_dev_get() may be
    protected either by rcu_read_lock() or RTNL, per Eric Dumazet.
    The rcu_dereference() in __sk_free() is protected by the fact
    that it is never reached if an update could change it. Check
    for this by using rcu_dereference_check() to verify that the
    struct sock's ->sk_wmem_alloc counter is zero.

    Acked-by: Eric Dumazet
    Acked-by: David S. Miller
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

24 Feb, 2010

1 commit

  • Traffic (tcp) does not start on a vlan interface when gro is enabled.
    Even the tcp handshake does not take place.
    This is because the eth_type_trans call before the netif_receive_skb
    in napi_gro_finish() resets the skb->dev to napi->dev from the previously
    set vlan netdev interface. This causes ip_route_input to drop the
    incoming packet, treating it as a packet coming from a martian source.

    I could repro this on 2.6.32.7 (stable) and 2.6.33-rc7.
    With this fix, the traffic starts and the test runs fine on both vlan
    and non-vlan interfaces.

    CC: Herbert Xu
    CC: Patrick McHardy
    Signed-off-by: Ajit Khaparde
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Ajit Khaparde
     

23 Feb, 2010

1 commit


20 Feb, 2010

1 commit


18 Feb, 2010

4 commits

  • Export sk_attach_filter/sk_detach_filter routines,
    so that tun module can use them.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • Traffic (tcp) does not start on a vlan interface when gro is enabled.
    Even the tcp handshake does not take place.
    This is because the eth_type_trans call before the netif_receive_skb
    in napi_gro_finish() resets the skb->dev to napi->dev from the previously
    set vlan netdev interface. This causes ip_route_input to drop the
    incoming packet, treating it as a packet coming from a martian source.

    I could repro this on 2.6.32.7 (stable) and 2.6.33-rc7.
    With this fix, the traffic starts and the test runs fine on both vlan
    and non-vlan interfaces.

    CC: Herbert Xu
    CC: Patrick McHardy
    Signed-off-by: Ajit Khaparde
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Ajit Khaparde
     
  • The n-tuple list should be flushed if and only if the ETH_RESET_FILTER
    flag is set and the driver is able to reset filtering/flow direction
    hardware without also resetting a component whose flag is not set.
    This test is best left to the driver.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • kasprintf() makes code smaller.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
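
To show why a kasprintf()-style helper shortens callers, here is a userspace analogue. asprintf_sketch is a hypothetical name and uses malloc() in place of a kernel allocator. It sizes the buffer, allocates it, and formats into it in one call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated buffer; the caller frees the result. */
char *asprintf_sketch(const char *fmt, ...)
{
    va_list ap, ap_copy;
    int len;
    char *buf;

    assert(fmt != NULL);

    va_start(ap, fmt);

    /* First pass: measure the formatted length without writing anything. */
    va_copy(ap_copy, ap);
    len = vsnprintf(NULL, 0, fmt, ap_copy);
    va_end(ap_copy);

    /* Second pass: allocate exactly enough and format for real. */
    buf = len < 0 ? NULL : malloc((size_t)len + 1);
    if (buf)
        vsnprintf(buf, (size_t)len + 1, fmt, ap);

    va_end(ap);
    return buf;
}
```

A caller such as asprintf_sketch("%s-%d", name, index) replaces a separate length calculation, allocation, and sprintf() with a single call and a NULL check.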
     

17 Feb, 2010

4 commits


16 Feb, 2010

3 commits


13 Feb, 2010

2 commits