25 Oct, 2011
13 commits
-
If the PHY should disappear (for example, on a USB Ethernet MAC), then
the driver would leak any undelivered time stamp packets. This commit
fixes the issue by calling the appropriate functions to free any packets
left in the transmit and receive queues.

The driver first appeared in v3.0.
Signed-off-by: Richard Cochran
Acked-by: Eric Dumazet
Cc:
Signed-off-by: David S. Miller
-
The previous commit enforces a new rule for handling the cloned packets
for transmit time stamping. These packets must not be freed using any other
function than skb_complete_tx_timestamp. This commit fixes the one and only
driver using this API.

The driver first appeared in v3.0.
Signed-off-by: Richard Cochran
Acked-by: Eric Dumazet
Cc:
Signed-off-by: David S. Miller
-
When hybrid mode is enabled (accept_ra == 2), the kernel also sees RAs
generated locally. This is useful since it allows the kernel to auto-configure
its own interface addresses.

However, if 'accept_ra_defrtr' and/or 'accept_ra_rtr_pref' are set and the
locally generated RAs announce the default route and/or other route information,
the kernel happily inserts bogus routes with its own address as gateway.

With this patch, adding routes from an RA will be skipped when the RA's source
address matches any local address, just as if 'accept_ra_defrtr' and
'accept_ra_rtr_pref' were set to 0.
Signed-off-by: Andreas Hofmeister
Signed-off-by: David S. Miller
-
If a frame can't be transmitted, it is silently discarded.
Add a counter to report these errors to the user.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
When the HW is in multi-channel mode based on the skew/IPL, there are
4 functions per port and so not enough resources to create multiple
RX/TX rings for each function.
Signed-off-by: Suresh Reddy
Signed-off-by: Sathya Perla
Signed-off-by: David S. Miller
-
Multiple TXQ support is partially broken in BE2. It is fully
supported from BE3 onwards and in Lancer.
Signed-off-by: Vasundhara Volam
Signed-off-by: Sathya Perla
Signed-off-by: David S. Miller
-
Currently the code for VF setup/teardown done by a PF (if_create,
mac_add_config, link_status_query etc) is scattered; this patch
refactors this code into be_vf_setup() and be_vf_clear(). The
if_create/if_destroy/mac_addr_query cmds are now called after the MCCQ
is created; so these cmds are now modified to use the MCCQ instead of
MBOX.
Signed-off-by: Sathya Perla
Signed-off-by: David S. Miller
-
When a card is reset due to EEH error recovery or due to a suspend,
rx-mode config (promisc/mc) is not being sent to the FW. be_setup() is
called in these flows and is the best place for such config/re-config
cmds. Hence include rx-mode, vlan and flow-control config in
be_setup().
Signed-off-by: Sathya Perla
Signed-off-by: David S. Miller
-
Dan Siemon would like to add tunnelling support to cls_flow.
This preliminary patch introduces the use of skb_header_pointer() to help
with this task, while avoiding skb head reallocation because of deep packet
inspection.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
In the function ipv4_dst_check, check_peer_pmtu should be called only when the peer is updated.
So, if the peer is not updated in ip_rt_frag_needed, we must not increment __rt_peer_genid.
Signed-off-by: Gao feng
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
It was enabled by default and the messages guarded
by the define are useful.
Signed-off-by: Flavio Leitner
Signed-off-by: David S. Miller
-
Since commit
"7488876... dt/net: Eliminate users of of_platform_{,un}register_driver"
there are two platform drivers named "mdio-gpio" registered.
I renamed the OF variant to "mdio-ofgpio".
Signed-off-by: Dirk Eibach
Signed-off-by: David S. Miller
24 Oct, 2011
10 commits
-
There is a long-standing bug in the Linux TCP stack concerning ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect the TOS field of
the incoming message, and this might break some setups.

Examples of things that were broken:
- Routing using TOS as a selector
- Firewalls
- Traffic classification / shaping

We now remember the inet tos field in the timewait structure and use it in
ACK generation and route lookup.

Notes:
- We still reflect incoming TOS in RST messages.
- We could extend MuraliRaja Muniraju's patch to report the TOS value in
  netlink messages for TIME_WAIT sockets.
- A patch is needed for IPv6.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Renato Westphal noticed that since commit a2835763e130c343ace5320c20d33c281e7097b7
"rtnetlink: handle rtnl_link netlink notifications manually" was merged
we no longer send a netlink message when a networking device is moved
from one network namespace to another.

Fix this by adding the missing manual notification in dev_change_net_namespace.
Since all network devices that are processed by dev_change_net_namespace are
in the initialized state, the complicated tests that guard the manual
rtmsg_ifinfo calls in rollback_registered and register_netdevice are
unnecessary and we can just perform a plain notification.
Cc: stable@kernel.org
Tested-by: Renato Westphal
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
There is a bug in commit 5e2b61f (ipv4: Remove flowi from struct rtable).
It makes xfrm4_fill_dst() modify the wrong data structure.
Signed-off-by: Zheng Yan
Reported-by: Kim Phillips
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
If the device is down during suspend/resume, interrupts are enabled
without a registered interrupt handler, causing a storm of
unhandled interrupts until the IRQ is disabled because "nobody
cared".

Instead, check that the device is up before touching it in the
suspend/resume code.

Fixes https://bugzilla.kernel.org/show_bug.cgi?id=39112
Helped-by: Adrian Chadd
Helped-by: Mohammed Shafi
Signed-off-by: Clemens Buchacher
Signed-off-by: David S. Miller
-
The commit f39925dbde7788cfb96419c0f092b086aa325c0f
(ipv4: Cache learned redirect information in inetpeer.)
removed some ICMP packet validations which are required by
RFC 1122, section 3.2.2.2:
...
A Redirect message SHOULD be silently discarded if the new
gateway address it specifies is not on the same connected
(sub-) net through which the Redirect arrived [INTRO:2,
Appendix A], or if the source of the Redirect is not the
current first-hop gateway for the specified destination (see
Section 3.3.1).
Signed-off-by: Flavio Leitner
Signed-off-by: David S. Miller
-
The pair of functions,
* skb_clone_tx_timestamp()
* skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these functions are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.
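The reference rule described in this message can be illustrated with a toy model. The Python sketch below is not kernel code: `Sock`, `clone_tx_timestamp` and `complete_tx_timestamp` are hypothetical stand-ins that only mirror the reference counting, where every clone handed to the PHY driver pins the socket until the clone is completed.

```python
class Sock:
    """Hypothetical stand-in for a socket with a reference count."""

    def __init__(self):
        self.refcnt = 1          # reference held by the owning application
        self.error_queue = []    # completed time stamp packets land here

    def hold(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1

    @property
    def alive(self):
        return self.refcnt > 0


def clone_tx_timestamp(sock, packet):
    """Model of the first function: clone the packet and pin the socket."""
    sock.hold()
    return {"sk": sock, "data": packet, "tstamp": None}


def complete_tx_timestamp(clone, tstamp):
    """Model of the second function: deliver the clone, then unpin the socket."""
    sock = clone["sk"]
    clone["tstamp"] = tstamp
    sock.error_queue.append(clone)
    sock.put()


sk = Sock()
clone = clone_tx_timestamp(sk, b"ptp-sync")
sk.put()                     # the application drops its reference early
assert sk.alive              # still pinned by the clone held in the PHY driver
complete_tx_timestamp(clone, tstamp=123456789)
assert not sk.alive          # the last reference goes away only on completion
```

Freeing the clone by any other path would skip the final `put()` in this model, which is the leak the expanded comments warn about.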
Reported-by: Johannes Berg
Signed-off-by: Richard Cochran
Cc:
Signed-off-by: Eric Dumazet
Reviewed-by: Johannes Berg
Signed-off-by: David S. Miller
-
Now that tcp_md5_hash_header() has a const tcphdr argument, we can add more
const attributes to callers.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Add support for reporting ring sizes via ethtool -g to the virtio_net
driver.
Signed-off-by: Rick Jones
Acked-by: Rusty Russell
Acked-by: Michael S. Tsirkin
Signed-off-by: David S. Miller
-
tcp_md5_hash_header() writes a temporary zero value into the skb header,
which might confuse other users of this area.

Since tcphdr is small (20 bytes), copy it in a temporary variable and
make the change in the copy.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
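The copy-then-modify approach of the tcp_md5_hash_header() commit above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the kernel implementation: `md5_hash_header` and the two constants are illustrative names, and only the 20-byte header is hashed here, whereas a real TCP MD5 signature covers more than the header.

```python
import hashlib
import struct

TCP_HDR_LEN = 20      # size of a TCP header without options
CHECK_OFFSET = 16     # byte offset of the 16-bit checksum field


def md5_hash_header(header: bytes) -> bytes:
    """Hash the header with a zeroed checksum, leaving the input untouched."""
    tmp = bytearray(header[:TCP_HDR_LEN])           # temporary copy
    struct.pack_into("!H", tmp, CHECK_OFFSET, 0)    # zero the checksum in the copy only
    return hashlib.md5(bytes(tmp)).digest()


# sport, dport, seq, ack, data offset, flags, window, checksum, urgent pointer
hdr = struct.pack("!HHIIBBHHH", 1234, 80, 1, 0, 5 << 4, 0x10, 65535, 0xBEEF, 0)
digest = md5_hash_header(hdr)
assert hdr[CHECK_OFFSET:CHECK_OFFSET + 2] == b"\xbe\xef"   # original is unchanged
```

Because the zero is written into the copy, anything else reading the original buffer at the same time never observes the temporary value.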
22 Oct, 2011
4 commits
-
When I made class_attr_bonding_masters per network namespace and dynamically
allocated I overlooked the need for calling sysfs_attr_init. Oops.

This fixes the following lockdep splat:
[ 5.749651] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[ 5.749655] bonding: MII link monitoring set to 100 ms
[ 5.749676] BUG: key f49a831c not in .data!
[ 5.749677] ------------[ cut here ]------------
[ 5.749752] WARNING: at kernel/lockdep.c:2897 lockdep_init_map+0x1c3/0x460()
[ 5.749809] Hardware name: ProLiant BL460c G1
[ 5.749862] Modules linked in: bonding(+)
[ 5.749978] Pid: 3177, comm: modprobe Not tainted 3.1.0-rc9-02177-gf2d1a4e-dirty #1157
[ 5.750066] Call Trace:
[ 5.750120] [] ? printk+0x18/0x21
[ 5.750176] [] warn_slowpath_common+0x6d/0xa0
[ 5.750231] [] ? lockdep_init_map+0x1c3/0x460
[ 5.750287] [] ? lockdep_init_map+0x1c3/0x460
[ 5.750342] [] warn_slowpath_null+0x1d/0x20
[ 5.750398] [] lockdep_init_map+0x1c3/0x460
[ 5.750453] [] ? _raw_spin_unlock+0x1d/0x20
[ 5.750510] [] ? sysfs_new_dirent+0x68/0x110
[ 5.750565] [] sysfs_add_file_mode+0x8b/0xe0
[ 5.750621] [] sysfs_add_file+0x13/0x20
[ 5.750675] [] sysfs_create_file+0x1c/0x20
[ 5.750737] [] class_create_file+0x19/0x20
[ 5.750794] [] netdev_class_create_file+0xf/0x20
[ 5.750853] [] bond_create_sysfs+0x44/0x90 [bonding]
[ 5.750911] [] ? bond_create_proc_dir+0x1e/0x3e [bonding]
[ 5.750970] [] bond_net_init+0x7e/0x87 [bonding]
[ 5.751026] [] ? 0xf840ffff
[ 5.751080] [] ops_init.clone.4+0xba/0x100
[ 5.751135] [] ? register_pernet_subsys+0x12/0x30
[ 5.751191] [] register_pernet_operations.clone.3+0x43/0x80
[ 5.751249] [] register_pernet_subsys+0x19/0x30
[ 5.751306] [] bonding_init+0x832/0x8a2 [bonding]
[ 5.751363] [] do_one_initcall+0x30/0x160
[ 5.751420] [] ? bond_net_init+0x87/0x87 [bonding]
[ 5.751477] [] sys_init_module+0xef/0x1890
[ 5.751533] [] sysenter_do_call+0x12/0x36
[ 5.751588] ---[ end trace 89f492d83a7f5006 ]---
Signed-off-by: Eric W. Biederman
Reported-by: Eric Dumazet
Tested-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Ari got kernel panics using a tg3 NIC, and bisected to 2669069aacc9 "tg3:
enable transmit time stamping."

This is because tigon3_dma_hwbug_workaround() might alloc a new skb and
free the original. We panic when skb_tx_timestamp() is called on the freed
skb.
Reported-by: Ari Savolainen
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
INET_ECN_encapsulate() is better understood if we can read the official
statement.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Signed-off-by: Maciej Żenczykowski
Signed-off-by: David S. Miller
21 Oct, 2011
13 commits
-
Due to a hardware problem, writes to the VFTA register can
theoretically fail, although the likelihood of this is very low.
This patch adds a shadow vfta in the adapter struct for reading
and adds new write functions for these devices to work around the problem.
Signed-off-by: Carolyn Wyborny
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
-
This patch moves the DMA Coalescing feature initialization code from
igb_reset to a new function and replaces it with a call to the new
function.
Signed-off-by: Carolyn Wyborny
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
-
In 82580 and later devices, the alternate MAC address feature is
completely handled by the option ROM and software does not handle
it anymore. This patch changes the check_alt_mac_addr function to
exit immediately if the device is 82580 or later.
Signed-off-by: Carolyn Wyborny
Signed-off-by: Jeff Kirsher
-
Signed-off-by: Mitch Williams
Tested-by: Sibai Li
Signed-off-by: Jeff Kirsher
-
Update adapter identification strings to properly indicate i350 VF devices
in the VF driver. Change the driver ID string to remove 82576-specific
wording. Update copyright date.
Signed-off-by: Mitch Williams
Tested-by: Sibai Li
Signed-off-by: Jeff Kirsher
-
Adding const qualifiers to pointers can ease code review and help spot some
bugs. It might also allow the compiler to optimize code further.

For example, is it legal to temporarily write a null cksum into tcphdr
in tcp_md5_hash_header()? I am afraid a sniffer could catch the
temporary null value...
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Instead of using the dev->next chain and trying to resync at each call to
dev_seq_start, use the name hash, keeping the bucket and the offset in
seq->private field.

Tests revealed the following results for ifconfig > /dev/null
* 1000 interfaces:
  * 0.114s without patch
  * 0.089s with patch
* 3000 interfaces:
  * 0.489s without patch
  * 0.110s with patch
* 5000 interfaces:
  * 1.363s without patch
  * 0.250s with patch
* 128000 interfaces (other setup):
  * ~100s without patch
  * ~30s with patch
Signed-off-by: Mihai Maruseac
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
On systems that create and delete lots of dynamic devices the
31bit linux ifindex fails to fit in the 16bit macvtap minor,
resulting in unusable macvtap devices. I have systems running
automated tests that hit this condition in just a few days.

Use a linux idr allocator to track which macvtap minor numbers
are available and to track the association between macvtap
minor numbers and macvtap network devices.

Remove the unnecessary check to see if the network
device we have found is indeed a macvtap device. With macvtap
specific data structures it is impossible to find any other
kind of networking device.

Increase the macvtap minor range from 65536 to the full 20 bits
that is supported by linux device numbers. It doesn't solve the
original problem but there is no penalty for a larger minor
device range.
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
Place macvlan_common_newlink at the end of macvtap_newlink because
failing in newlink after registering your network device is not
supported.

Move device_create into a netdevice creation notifier. The network device
notifier is the only hook that is called after the network device has been
registered with the device layer and before register_network_device returns
success.
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
To avoid leaking packets in the receive queue, add a socket destructor
that runs whenever a macvtap socket is destroyed.
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
To see if it is appropriate to enable the macvtap zero copy feature
don't test the lowerdev network device flags. Instead test the
macvtap network device flags which are a direct copy of the lowerdev
flags. This is important because nothing holds a reference to lowerdev
and on a very bad day lowerdev could be a pointer to stale memory.
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
There is a small window in macvtap_open between looking up a
networking device and calling macvtap_set_queue in which
macvtap_del_queues can be called from macvtap_dellink. After
calling macvtap_del_queues it is totally incorrect to
allow macvtap_set_queue to proceed, so prevent success by
reporting that all of the available queues are in use.
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
-
In skb->truesize we must account for the full size of the fragments, not just
the used part of them.

Doing this work is important to avoid unexpected OOM situations.
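A small worked example shows why the difference matters. The numbers below are purely illustrative assumptions (a 4096-byte buffer behind every fragment, a 64 KiB receive limit, no head overhead), and the helper names are made up for this sketch; they are not taken from the driver.

```python
PAGE_SIZE = 4096  # assumed size of the buffer backing each fragment


def truesize_used(frag_lens):
    """Accounting by used bytes only: what the fragments actually carry."""
    return sum(frag_lens)


def truesize_allocated(frag_lens, frag_alloc=PAGE_SIZE):
    """Accounting by allocation: the whole buffer each fragment pins in memory."""
    return len(frag_lens) * frag_alloc


frags = [100, 100, 100]          # three mostly empty fragments
rcvbuf = 64 * 1024               # illustrative receive buffer limit

used = truesize_used(frags)              # 300
allocated = truesize_allocated(frags)    # 12288

# Packets admitted before the limit is reached, under each accounting.
admitted_old = rcvbuf // used            # 218
admitted_new = rcvbuf // allocated       # 5

# Memory really pinned while the used-bytes accounting thinks the limit holds.
pinned_old = admitted_old * allocated    # 2678784 bytes, about 41 times the limit
```

With used-bytes accounting the limit admits 218 such packets, which together pin far more memory than the limit suggests; accounting for the allocated size stops at 5.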
Signed-off-by: Eric Dumazet
CC: Rusty Russell
CC: "Michael S. Tsirkin"
CC: virtualization@lists.linux-foundation.org
CC: Krishna Kumar
Signed-off-by: David S. Miller