22 Jun, 2017
1 commit
-
Add a flag to indicate if a queue is rate-limited. Test the flag in
NAPI poll handler and avoid rescheduling the queue if true, otherwise
we risk locking up the host. The rescheduling will be done in the
timer callback function.

Reported-by: Jean-Louis Dupond
Signed-off-by: Wei Liu
Tested-by: Jean-Louis Dupond
Reviewed-by: Paul Durrant
Signed-off-by: David S. Miller
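The flag test described above can be sketched in user-space C. This is only an illustration of the idea, not the driver's code: `struct queue_sketch`, `poll_sketch()` and `credit_timer_sketch()` are hypothetical names, and the credit accounting is reduced to one unit per packet.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-queue state; not the driver's struct. */
struct queue_sketch {
    bool rate_limited;    /* set when the queue has run out of credit */
    unsigned int credit;  /* packets the queue may still send */
    unsigned int pending; /* packets waiting to be processed */
};

/*
 * Process up to `budget` packets. Returns true if the caller should poll
 * again. A rate-limited queue never asks to be polled again, so the poll
 * loop cannot spin while the queue waits for more credit.
 */
static bool poll_sketch(struct queue_sketch *q, unsigned int budget)
{
    unsigned int done = 0;

    while (done < budget && q->pending > 0) {
        if (q->credit == 0) {
            q->rate_limited = true;
            break;
        }
        q->credit--;
        q->pending--;
        done++;
    }
    return q->pending > 0 && !q->rate_limited;
}

/*
 * Timer callback: refill credit, clear the flag, and report whether the
 * queue needs to be scheduled again.
 */
static bool credit_timer_sketch(struct queue_sketch *q, unsigned int refill)
{
    q->credit = refill;
    q->rate_limited = false;
    return q->pending > 0;
}
```

In this sketch the only path that restarts a throttled queue is the timer callback, which matches the division of labour the commit message describes.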
13 Mar, 2017
1 commit
-
In some cases during XenBus disconnect event handling and subsequent
queue resource release there may be some TX handlers active on
other processors. Use RCU in order to synchronize with them.

Signed-off-by: Igor Druzhinin
Signed-off-by: David S. Miller
30 Jan, 2017
1 commit
-
The default for the maximum number of tx/rx queues of one interface is
the number of cpus of the system today. As each queue pair reserves 512
grant pages this default consumes a ridiculous number of grants for
large guests.

Limit the queue number to 8 as default. This value can be modified
via a module parameter if required.

Signed-off-by: Juergen Gross
Reviewed-by: Boris Ostrovsky
Signed-off-by: Boris Ostrovsky
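As a rough illustration of the arithmetic behind this change, the following user-space C sketch picks a queue count and reports the grant pages it would reserve. The names `pick_max_queues()` and `grant_pages_for()` are hypothetical, treating 0 as "parameter not set" is an assumption of this sketch, and an explicitly requested value is passed through unclamped for simplicity. The figures 8 and 512 come from the commit message.

```c
/* Default cap from the commit message; the macro name is hypothetical. */
#define MAX_QUEUES_DEFAULT_SKETCH 8u

/* Grant pages reserved per queue pair, as stated in the commit message. */
#define GRANTS_PER_QUEUE_SKETCH 512ul

/*
 * Choose the maximum number of queues. In this sketch, requested == 0
 * means the administrator did not set the module parameter.
 */
static unsigned int pick_max_queues(unsigned int requested,
                                    unsigned int online_cpus)
{
    if (requested != 0)
        return requested;
    return online_cpus < MAX_QUEUES_DEFAULT_SKETCH
               ? online_cpus
               : MAX_QUEUES_DEFAULT_SKETCH;
}

/* Grant pages consumed by a given number of queue pairs. */
static unsigned long grant_pages_for(unsigned int queues)
{
    return GRANTS_PER_QUEUE_SKETCH * queues;
}
```

For a 64-CPU guest the old default would reserve 64 * 512 = 32768 grant pages per interface, while the capped default reserves 8 * 512 = 4096.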
07 Oct, 2016
1 commit
-
The netback source module has become very large and somewhat confusing.
This patch simply moves all code related to the backend to frontend (i.e.
guest side rx) data-path into a separate rx source module.

This patch contains no functional change, it is code movement and
minimal changes to avoid patch style-check issues.

Signed-off-by: Paul Durrant
Signed-off-by: David S. Miller
22 Sep, 2016
1 commit
-
Instead of open-coding it, use the threaded irq mechanism in
xen-netback.

Signed-off-by: Juergen Gross
Acked-by: Wei Liu
Signed-off-by: David S. Miller
17 May, 2016
4 commits
-
My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to use the value in a hash extra
info fragment passed from the guest frontend in a transmit-side
(i.e. netback receive side) packet to set the skb hash accordingly.

Signed-off-by: Paul Durrant
Acked-by: Wei Liu
Signed-off-by: David S. Miller
-
My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to pass hash values calculated for
guest receive-side packets (i.e. netback transmit side) to the frontend.

Signed-off-by: Paul Durrant
Acked-by: Wei Liu
Signed-off-by: David S. Miller
-
My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

A previous patch added the necessary boilerplate for mapping the control
ring from the frontend, should it be created. This patch adds
implementations for each of the defined protocol messages.

Signed-off-by: Paul Durrant
Cc: Wei Liu
Acked-by: Wei Liu
Signed-off-by: David S. Miller
-
My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.

Signed-off-by: Paul Durrant
Acked-by: Wei Liu
Signed-off-by: David S. Miller
13 May, 2016
1 commit
-
Patch 562abd39 "xen-netback: support multiple extra info fragments
passed from frontend" contained a mistake which can result in an
incorrect number of responses being generated when handling errors
encountered when processing packets containing extra info fragments.
This patch fixes the problem.

Signed-off-by: Paul Durrant
Reported-by: Jan Beulich
Cc: Wei Liu
Acked-by: Wei Liu
Signed-off-by: David S. Miller
14 Mar, 2016
1 commit
-
The code does not currently support a frontend passing multiple extra info
fragments to the backend in a tx request. The xenvif_get_extras() function
handles multiple extra_info fragments but make_tx_response() assumes there
is only ever a single extra info fragment.

This patch modifies xenvif_get_extras() to pass back a count of extra
info fragments, which is then passed to make_tx_response() (after
possibly being stashed in pending_tx_info for deferred responses).

Signed-off-by: Paul Durrant
Cc: Wei Liu
Acked-by: Wei Liu
Signed-off-by: David S. Miller
16 Jan, 2016
1 commit
-
Using the MTU or GSO size to determine the number of required guest Rx
requests for an skb was subtly broken since these values may change at
runtime.

After 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b (xen-netback: always
fully coalesce guest Rx packets) we always fully pack a packet into
its guest Rx slots. Calculating the number of required slots from the
packet length is then easy.

Signed-off-by: David Vrabel
Signed-off-by: David S. Miller
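The calculation from the packet length can be sketched as a rounded-up division. This is a minimal user-space illustration, assuming 4096-byte slots (the 4KB granularity mentioned elsewhere in this log). The function name `rx_slots_needed()` and the `extra_slots` parameter, which stands for any additional slots such as extra info segments, are hypothetical.

```c
#include <stddef.h>

/* Assumed slot size in bytes; the macro name is hypothetical. */
#define SLOT_BYTES_SKETCH 4096u

/*
 * Slots needed for a fully packed packet: the length divided by the slot
 * size, rounded up, plus any extra slots the caller accounts for
 * separately.
 */
static unsigned int rx_slots_needed(size_t pkt_len, unsigned int extra_slots)
{
    size_t data_slots = (pkt_len + SLOT_BYTES_SKETCH - 1) / SLOT_BYTES_SKETCH;

    return (unsigned int)data_slots + extra_slots;
}
```

Because the result depends only on the packet itself, it does not change if the MTU or GSO size is modified at runtime.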
18 Dec, 2015
2 commits
-
Instead of open-coding memcpy()s and directly accessing Tx and Rx
requests, use the new RING_COPY_REQUEST() that ensures the local copy
is correct.

This is more than is strictly necessary for guest Rx requests since
only the id and gref fields are used and it is harmless if the
frontend modifies these.

This is part of XSA155.
CC: stable@vger.kernel.org
Reviewed-by: Wei Liu
Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk
-
The last request transmitted from the guest gives no indication of the
minimum amount of credit that the guest might need to send a packet
since the last packet might have been a small one.

Instead allow for the worst case 128 KiB packet.
This is part of XSA155.
CC: stable@vger.kernel.org
Reviewed-by: Wei Liu
Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk
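One way to picture the worst-case allowance is a credit refill that caps the accumulated credit at the larger of the per-period credit and 128 KiB. The sketch below is a user-space illustration under that assumption. `refill_credit()` and its parameters are hypothetical names, and only the 128 KiB figure comes from the commit message.

```c
#include <limits.h>

/* Worst-case packet size from the commit message; the name is hypothetical. */
#define WORST_CASE_PKT_BYTES_SKETCH (128ul * 1024ul)

/*
 * Return the new remaining credit after one refill period.
 * `remaining` is the unused credit, `credit_bytes` the per-period grant.
 */
static unsigned long refill_credit(unsigned long remaining,
                                   unsigned long credit_bytes)
{
    /* Allow a burst large enough for the worst-case packet. */
    unsigned long max_burst = credit_bytes > WORST_CASE_PKT_BYTES_SKETCH
                                  ? credit_bytes
                                  : WORST_CASE_PKT_BYTES_SKETCH;
    unsigned long max_credit = remaining + credit_bytes;

    /* The addition wrapped around: clamp to the largest value. */
    if (max_credit < remaining)
        max_credit = ULONG_MAX;

    return max_credit < max_burst ? max_credit : max_burst;
}
```

The point of the fixed bound is that it does not depend on the size of the last request the guest happened to send.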
23 Oct, 2015
2 commits
-
The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow Linux using 64KB page granularity to work as a
network backend on an unmodified Xen.

It's only necessary to adapt the ring size and break skb data into small
chunks of 4KB. The rest of the code relies on the grant table code.

Signed-off-by: Julien Grall
Reviewed-by: Wei Liu
Signed-off-by: David Vrabel
-
The skb doesn't change within the function. Therefore it's only
necessary to check if we need GSO once at the beginning.

Signed-off-by: Julien Grall
Acked-by: Wei Liu
Signed-off-by: David Vrabel
11 Sep, 2015
2 commits
-
Pull xen terminology fixes from David Vrabel:
"Use the correct GFN/BFN terms more consistently"

* tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn
xen/privcmd: Further s/MFN/GFN/ clean-up
hvc/xen: Further s/MFN/GFN clean-up
video/xen-fbfront: Further s/MFN/GFN clean-up
xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn
xen: Use correctly the Xen memory terminologies
arm/xen: implement correctly pfn_to_mfn
xen: Make clear that swiotlb and biomerge are dealing with DMA address
-
Originally that parameter was always reset to num_online_cpus during
module initialisation, which renders it useless.

The fix is to only set max_queues to num_online_cpus when the user has not
provided a value.

Reported-by: Johnny Strom
Signed-off-by: Wei Liu
Reviewed-by: David Vrabel
Acked-by: Ian Campbell
Signed-off-by: David S. Miller
10 Sep, 2015
1 commit
-
Commit f48da8b14d04ca87ffcffe68829afd45f926ec6a (xen-netback: fix
unlimited guest Rx internal queue and carrier flapping) introduced a
regression.

The PV frontend in IPXE only places 4 requests on the guest Rx ring.
Since netback required at least (MAX_SKB_FRAGS + 1) slots, IPXE could
not receive any packets.

a) If GSO is not enabled on the VIF, fewer guest Rx slots are required
for the largest possible packet. Calculate the required slots
based on the maximum GSO size or the MTU.

This calculation of the number of required slots relies on
1650d5455bd2 (xen-netback: always fully coalesce guest Rx packets)
which is present in 4.0-rc1 and later.

b) Reduce the Rx stall detection to checking for at least one
available Rx request. This is fine since we're predominantly
concerned with detecting interfaces which are down and thus have
zero available Rx requests.

Signed-off-by: David Vrabel
Reviewed-by: Wei Liu
Signed-off-by: David S. Miller
09 Sep, 2015
1 commit
-
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant. I suspect this is because the first support for Xen was for
PV. This resulted in some misimplementation of helpers on ARM and
confused developers about the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
However, if we look at the implementation on x86, it's returning a GFN.

For clarity and to avoid new confusion, replace any reference to mfn with
gfn in any helpers used by PV drivers. The x86 code will still keep some
references to pfn_to_mfn, which may be used by all kinds of guests.
No changes have been made in the hypercall fields, even
though they may be invalid, in order to keep them the same as the definition
in the xen repo.

Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name too close to the KVM function gfn_to_page.

Also take the opportunity to simplify simple constructions such
as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean-ups
will come in follow-up patches.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb
Signed-off-by: Julien Grall
Reviewed-by: Stefano Stabellini
Acked-by: Dmitry Torokhov
Acked-by: Wei Liu
Signed-off-by: David Vrabel
03 Sep, 2015
1 commit
-
Xen's PV network protocol includes messages to add/remove ethernet
multicast addresses to/from a filter list in the backend. This allows
the frontend to request the backend only forward multicast packets
which are of interest thus preventing unnecessary noise on the shared
ring.

The canonical netif header in git://xenbits.xen.org/xen.git specifies
the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal
necessary changes have been pulled into include/xen/interface/io/netif.h.

To prevent the frontend from extending the multicast filter list
arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries.
This limit is not specified by the protocol and so may change in future.
If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD
sent by the frontend will be failed with NETIF_RSP_ERROR.

Signed-off-by: Paul Durrant
Cc: Ian Campbell
Cc: Wei Liu
Acked-by: Wei Liu
Signed-off-by: David S. Miller
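The bounded filter list can be pictured with a small user-space sketch. The types and functions here (`struct mcast_filter`, `mcast_add()`, `mcast_match()`) are hypothetical and use a fixed array for brevity. Only the limit of 64 entries and the idea of failing the add request once the list is full come from the commit message. Duplicate addresses are not detected in this sketch.

```c
#include <stdbool.h>
#include <string.h>

#define MCAST_MAX_SKETCH 64 /* limit from the commit message */
#define ETH_ALEN_SKETCH 6   /* bytes in an Ethernet address */

/* Hypothetical fixed-size filter list. */
struct mcast_filter {
    unsigned char addr[MCAST_MAX_SKETCH][ETH_ALEN_SKETCH];
    unsigned int count;
};

/*
 * Add an address. Returns false when the list is full, in which case the
 * caller would answer the request with an error response.
 */
static bool mcast_add(struct mcast_filter *f, const unsigned char *mac)
{
    if (f->count >= MCAST_MAX_SKETCH)
        return false;
    memcpy(f->addr[f->count], mac, ETH_ALEN_SKETCH);
    f->count++;
    return true;
}

/* Return true if the address is in the list and should be forwarded. */
static bool mcast_match(const struct mcast_filter *f, const unsigned char *mac)
{
    for (unsigned int i = 0; i < f->count; i++) {
        if (memcmp(f->addr[i], mac, ETH_ALEN_SKETCH) == 0)
            return true;
    }
    return false;
}
```

Checking the count before storing the entry is what keeps a frontend from growing the list without bound.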
07 Aug, 2015
1 commit
-
Waking the dealloc thread before decrementing inflight_packets is racy
because it means the thread may go to sleep before inflight_packets is
decremented. If kthread_stop() has already been called, the dealloc
thread may wait forever with nothing to wake it. Instead, wake the
thread only after decrementing inflight_packets.

Signed-off-by: Ross Lagerwall
Signed-off-by: David S. Miller
04 Aug, 2015
1 commit
-
Determine if a fraglist is needed in the tx path, and allocate it if
necessary before setting up the copy and map operations.
Otherwise, undoing the copy and map operations is tricky.

This fixes a use-after-free: if allocating the fraglist failed, the copy
and map operations that had been set up were still executed, writing
over the data area of a freed skb.

Signed-off-by: Ross Lagerwall
Signed-off-by: David S. Miller
15 Jul, 2015
1 commit
-
The > should be >=. I also added spaces around the '-' operations so
the code is a little more consistent and matches the condition better.

Fixes: f53c3fe8dad7 ('xen-netback: Introduce TX grant mapping')
Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller
02 Jul, 2015
1 commit
-
Pull xen updates from David Vrabel:
"Xen features and cleanups for 4.2-rc0:

- add "make xenconfig" to assist in generating configs for Xen guests
- preparatory cleanups necessary for supporting 64 KiB pages in ARM
guests
- automatically use hvc0 as the default console in ARM guests"

* tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
block/xen-blkback: s/nr_pages/nr_segs/
block/xen-blkfront: Remove invalid comment
block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS
arm/xen: Drop duplicate define mfn_to_virt
xen/grant-table: Remove unused macro SPP
xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
xen: Include xen/page.h rather than asm/xen/page.h
kconfig: add xenconfig defconfig helper
kconfig: clarify kvmconfig is for kvm
xen/pcifront: Remove usage of struct timeval
xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON()
hvc_xen: avoid uninitialized variable warning
xenbus: avoid uninitialized variable warning
xen/arm: allow console=hvc0 to be omitted for guests
arm,arm64/xen: move Xen initialization earlier
arm/xen: Correctly check if the event channel interrupt is present
22 Jun, 2015
2 commits
-
Prepend 0x to all %x in order to avoid confusion while reading when there
are other decimal values in the log.

Also replace some of the hexadecimal prints with decimal to make the
format uniform with netfront.

Signed-off-by: Julien Grall
Cc: Wei Liu
Cc: Ian Campbell
Cc: netdev@vger.kernel.org
Acked-by: Ian Campbell
Signed-off-by: David S. Miller
-
The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
always fully coalesce guest Rx packets".

Signed-off-by: Julien Grall
Acked-by: Wei Liu
Cc: Ian Campbell
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller
17 Jun, 2015
1 commit
-
Using xen/page.h will be necessary later for using common xen page
helpers.

As xen/page.h already includes asm/xen/page.h, always include xen/page.h.
Signed-off-by: Julien Grall
Reviewed-by: David Vrabel
Cc: Stefano Stabellini
Cc: Ian Campbell
Cc: Wei Liu
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: netdev@vger.kernel.org
Signed-off-by: David Vrabel
02 Jun, 2015
2 commits
-
Conflicts:
drivers/net/phy/amd-xgbe-phy.c
drivers/net/wireless/iwlwifi/Kconfig
include/net/mac80211.h

iwlwifi/Kconfig and mac80211.h were both trivial overlapping
changes.

The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
the bug fix that happened on the 'net' side is already integrated
into the rest of the amd-xgbe driver.

Signed-off-by: David S. Miller
-
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
(txreq.offset&~PAGE_MASK) + txreq.size);
^

PAGE_MASK's type can vary by arch, so a cast is needed.
Signed-off-by: Ian Campbell
----
v2: Cast to unsigned long, since PAGE_MASK can vary by arch.
Acked-by: Wei Liu
Signed-off-by: David S. Miller
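The effect of the cast can be reproduced in user space. In the sketch below, `PAGE_MASK_SKETCH` is a hypothetical int-typed mask chosen so that the expression has type int, as in the warning quoted above, and `format_span()` is a made-up helper. The parameter names only mirror the fields in the warning.

```c
#include <stdio.h>

/* Hypothetical int-typed page mask for a 4096-byte page. */
#define PAGE_MASK_SKETCH (~(4096 - 1))

/*
 * Format (offset within the page) + size with %lu. Without the cast the
 * argument has type int here, which does not match %lu. Casting to
 * unsigned long makes the call correct whatever type the mask has.
 */
static int format_span(char *buf, size_t len,
                       unsigned short offset, unsigned short size)
{
    return snprintf(buf, len, "%lu",
                    (unsigned long)((offset & ~PAGE_MASK_SKETCH) + size));
}
```

Casting the whole expression, rather than changing the format specifier, keeps the call correct on architectures where the mask is already an unsigned long.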
26 May, 2015
1 commit
-
The variable separate_tx_rx_irq is of type bool, so assign true
instead of 1.

Signed-off-by: Shailendra Verma
Signed-off-by: David S. Miller
17 Apr, 2015
1 commit
-
Pull xen features and fixes from David Vrabel:
- use a single source list of hypercalls, generating other tables etc.
at build time.
- add a "Xen PV" APIC driver to support >255 VCPUs in PV guests.
- significant performance improvement to guest save/restore/migration.
- scsiback/front save/restore support.
- infrastructure for multi-page xenbus rings.
- misc fixes.
* tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pci: Try harder to get PXM information for Xen
xenbus_client: Extend interface to support multi-page ring
xen-pciback: also support disabling of bus-mastering and memory-write-invalidate
xen: support suspend/resume in pvscsi frontend
xen: scsiback: add LUN of restored domain
xen-scsiback: define a pr_fmt macro with xen-pvscsi
xen/mce: fix up xen_late_init_mcelog() error handling
xen/privcmd: improve performance of MMAPBATCH_V2
xen: unify foreign GFN map/unmap for auto-xlated physmap guests
x86/xen/apic: WARN with details.
x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
xen/pciback: Don't print scary messages when unsupported by hypervisor.
xen: use generated hypercall symbols in arch/x86/xen/xen-head.S
xen: use generated hypervisor symbols in arch/x86/xen/trace.c
xen: synchronize include/xen/interface/xen.h with xen
xen: build infrastructure for generating hypercall depending symbols
xen: balloon: Use static attribute groups for sysfs entries
xen: pcpu: Use static attribute groups for sysfs entry
15 Apr, 2015
1 commit
-
Originally Xen PV drivers only used a single-page ring to pass along
information. This might limit the throughput between frontend and
backend.

The patch extends the Xenbus driver to support multi-page rings, which in
general should improve throughput if the ring is the bottleneck. Changes to
various frontends / backends to adapt to the new interface are also
included.

Affected Xen drivers:
* blkfront/back
* netfront/back
* pcifront/back
* scsifront/back
* vtpmfront

The interface is documented, as before, in xenbus_client.c.
Signed-off-by: Wei Liu
Signed-off-by: Paul Durrant
Signed-off-by: Bob Liu
Cc: Konrad Wilk
Cc: Boris Ostrovsky
Signed-off-by: David Vrabel
21 Mar, 2015
2 commits
-
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c
net/core/sysctl_net_core.c
net/ipv4/inet_diag.c

The be_main.c conflict resolution was really tricky. The conflict
hunks generated by GIT were very unhelpful, to say the least. It
split functions in half and moved them around, when the real actual
conflict only existed solely inside of one function, that being
be_map_pci_bars().

So instead, to resolve this, I checked out be_main.c from the top
of net-next, then I applied the be_main.c changes from 'net' since
the last time I merged. And this worked beautifully.

The inet_diag.c and sysctl_net_core.c conflicts were simple
overlapping changes, and were easy to resolve.

Signed-off-by: David S. Miller
-
With the current netback, the bandwidth limiter's parameters are only
settable during vif setup time. This patch registers a watch on them, and
thus makes them runtime changeable.

When the watch fires, the timer is reset. The timer's mutex is used for
fencing the change.

Cc: Anthony Liguori
Signed-off-by: Imre Palik
Acked-by: Wei Liu
Signed-off-by: David S. Miller
12 Mar, 2015
1 commit
-
This fixes a performance regression introduced by
7fbb9d8415d4a51cf542e87cf3a717a9f7e6aedc (xen-netback: release pending
index before pushing Tx responses).

Moving the notify outside of the spin locks means it can be delayed a
long time (if the dealloc thread is descheduled or there is an
interrupt or softirq).

Signed-off-by: David Vrabel
Reviewed-by: Zoltan Kiss
Acked-by: Wei Liu
Signed-off-by: David S. Miller
06 Mar, 2015
2 commits
-
When handling a from-guest frag list, xenvif_handle_frag_list()
replaces the frags before calling the destructor to clean up the
original (foreign) frags. Whilst this is safe (the destructor doesn't
actually use the frags), it looks odd.

Reorder the function to be less confusing.

Signed-off-by: David Vrabel
Signed-off-by: David S. Miller
-
Every time a VIF is destroyed up to 256 pages may be leaked if packets
with more than MAX_SKB_FRAGS frags were transmitted from the guest.
Even worse, if another user of ballooned pages allocated one of these
ballooned pages it would not handle the unexpectedly >1 page count
(e.g., gntdev would deadlock when unmapping a grant because the page
count would never reach 1).

When handling a from-guest skb with a frag list, unref the frags
before releasing them so they are freed correctly when the VIF is
destroyed.

Signed-off-by: David Vrabel
Signed-off-by: David S. Miller
25 Feb, 2015
1 commit
-
If the pending indexes are released /after/ pushing the Tx response
then a stale pending index may be used if a new Tx request is
immediately pushed by the frontend. This may cause various WARNINGs or
BUGs if the stale pending index is actually still in use.

Fix this by releasing the pending index before pushing the Tx
response.

The full barrier for the pending ring update is not required since
the Tx response push already has a suitable write barrier.

Signed-off-by: David Vrabel
Reviewed-by: Wei Liu
Signed-off-by: David S. Miller
11 Feb, 2015
1 commit
-
Pull networking updates from David Miller:
1) More iov_iter conversion work from Al Viro.
[ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
wrong, and this pull actually adds an extra commit on top of the
branch I'm pulling to fix that up, so that the pre-merge state is
ok. - Linus ]
2) Various optimizations to the ipv4 forwarding information base trie
lookup implementation. From Alexander Duyck.
3) Remove sock_iocb altogether, from Christoph Hellwig.
4) Allow congestion control algorithm selection via routing metrics.
From Daniel Borkmann.
5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.
6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.
7) Add xmit_more support to r8169, e1000, and e1000e drivers. From
Florian Westphal.
8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.
9) Add BPF packet actions to packet scheduler, from Jiri Pirko.
10) Add support for unique flow IDs to openvswitch, from Joe Stringer.
11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
Kwok.
12) More sanely handle out-of-window dupacks, which can result in
serious ACK storms. From Neal Cardwell.
13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
Patrick McHardy, and Thomas Graf.
14) Support xmit_more in be2net, from Sathya Perla.
15) Group Policy extensions for vxlan, from Thomas Graf.
16) Remove Checksum Offload support for vxlan, from Tom Herbert.
17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From
Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
crypto: fix af_alg_make_sg() conversion to iov_iter
ipv4: Namespecify TCP PMTU mechanism
i40e: Fix for stats init function call in Rx setup
tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
ipv6: Make __ipv6_select_ident static
ipv6: Fix fragment id assignment on LE arches.
bridge: Fix inability to add non-vlan fdb entry
net: Mellanox: Delete unnecessary checks before the function call "vunmap"
cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
net: dsa: Remove redundant phy_attach()
IB/mlx4: Reset flow support for IB kernel ULPs
IB/mlx4: Always use the correct port for mirrored multicast attachments
net/bonding: Fix potential bad memory access during bonding events
tipc: remove tipc_snprintf
tipc: nl compat add noop and remove legacy nl framework
tipc: convert legacy nl stats show to nl compat
tipc: convert legacy nl net id get to nl compat
tipc: convert legacy nl net id set to nl compat
...