linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-24 03:24:55 +08:00

Author	SHA1	Message	Date
Paul Durrant	fedbc8c132	xen-netback: retire guest rx side prefix GSO feature As far as I am aware only very old Windows network frontends make use of this style of passing GSO packets from backend to frontend. These frontends can easily be replaced by the freely available Xen Project Windows PV network frontend, which uses the 'default' mechanism for passing GSO packets, which is also used by all Linux frontends. NOTE: Removal of this feature will not cause breakage in old Windows frontends. They simply will no longer receive GSO packets - the packets instead being fragmented in the backend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-06 20:37:35 -04:00
Juergen Gross	0364a8824c	xen-netback: switch to threaded irq for control ring Instead of open coding it use the threaded irq mechanism in xen-netback. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-22 08:26:24 -04:00
Paul Durrant	c0fcded2e6	xen-netback: only deinitialized hash if it was initialized A domain with a frontend that does not implement a control ring has been seen to cause a crash during domain save. This was apparently because the call to xenvif_deinit_hash() in xenvif_disconnect_ctrl() is made regardless of whether a control ring was connected, and hence xenvif_hash_init() was called. This patch brings the call to xenvif_deinit_hash() in xenvif_disconnect_ctrl() inside the if clause that checks whether the control ring event channel was connected. This is sufficient to ensure it is only called if xenvif_init_hash() was called previously. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 17:41:18 -04:00
Paul Durrant	f07f989338	xen-netback: pass hash value to the frontend My recent patch to include/xen/interface/io/netif.h defines a new extra info type that can be used to pass hash values between backend and guest frontend. This patch adds code to xen-netback to pass hash values calculated for guest receive-side packets (i.e. netback transmit side) to the frontend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
Paul Durrant	40d8abdee8	xen-netback: add control protocol implementation My recent patch to include/xen/interface/io/netif.h defines a new shared ring (in addition to the rx and tx rings) for passing control messages from a VM frontend driver to a backend driver. A previous patch added the necessary boilerplate for mapping the control ring from the frontend, should it be created. This patch adds implementations for each of the defined protocol messages. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
Paul Durrant	4e15ee2cb4	xen-netback: add control ring boilerplate My recent patch to include/xen/interface/io/netif.h defines a new shared ring (in addition to the rx and tx rings) for passing control messages from a VM frontend driver to a backend driver. This patch adds the necessary code to xen-netback to map this new shared ring, should it be created by a frontend, but does not add implementations for any of the defined protocol messages. These are added in a subsequent patch for clarity. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
David Vrabel	9c6f3ffe82	xen-netback: free queues after freeing the net device If a queue still has a NAPI instance added to the net device, freeing the queues early results in a use-after-free. The shouldn't ever happen because we disconnect and tear down all queues before freeing the net device, but doing this makes it obviously safe. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-15 15:13:19 -05:00
David Vrabel	4a65852727	xen-netback: delete NAPI instance when queue fails to initialize When xenvif_connect() fails it may leave a stale NAPI instance added to the device. Make sure we delete it in the error path. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-15 15:13:18 -05:00
Paul Durrant	210c34dcd8	xen-netback: add support for multicast control Xen's PV network protocol includes messages to add/remove ethernet multicast addresses to/from a filter list in the backend. This allows the frontend to request the backend only forward multicast packets which are of interest thus preventing unnecessary noise on the shared ring. The canonical netif header in git://xenbits.xen.org/xen.git specifies the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal necessary changes have been pulled into include/xen/interface/io/netif.h. To prevent the frontend from extending the multicast filter list arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries. This limit is not specified by the protocol and so may change in future. If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD sent by the frontend will be failed with NETIF_RSP_ERROR. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-02 11:45:00 -07:00
Ross Lagerwall	57b229063a	xen/netback: Wake dealloc thread after completing zerocopy work Waking the dealloc thread before decrementing inflight_packets is racy because it means the thread may go to sleep before inflight_packets is decremented. If kthread_stop() has already been called, the dealloc thread may wait forever with nothing to wake it. Instead, wake the thread only after decrementing inflight_packets. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:42:48 -07:00
Palik, Imre	edafc132ba	xen-netback: making the bandwidth limiter runtime settable With the current netback, the bandwidth limiter's parameters are only settable during vif setup time. This patch register a watch on them, and thus makes them runtime changeable. When the watch fires, the timer is reset. The timer's mutex is used for fencing the change. Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:55:15 -04:00
David S. Miller	3cef5c5b0b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/cadence/macb.c Overlapping changes in macb driver, mostly fixes and cleanups in 'net' overlapping with the integration of at91_ether into macb in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-09 23:38:02 -04:00
David Vrabel	d63951d744	xen-netback: return correct ethtool stats Use correct pointer arithmetic to get the pointer to each stat. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 14:58:17 -05:00
Joe Perches	3b6ed26d8b	xen: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:37 -05:00
Linus Torvalds	c5ce28df0e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) More iov_iter conversion work from Al Viro. [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was wrong, and this pull actually adds an extra commit on top of the branch I'm pulling to fix that up, so that the pre-merge state is ok. - Linus ] 2) Various optimizations to the ipv4 forwarding information base trie lookup implementation. From Alexander Duyck. 3) Remove sock_iocb altogether, from CHristoph Hellwig. 4) Allow congestion control algorithm selection via routing metrics. From Daniel Borkmann. 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet. 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet. 7) Add xmit_more support to r8169, e1000, and e1000e drivers. From Florian Westphal. 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross. 9) Add BPF packet actions to packet scheduler, from Jiri Pirko. 10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer. 11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman Kwok. 12) More sanely handle out-of-window dupacks, which can result in serious ACK storms. From Neal Cardwell. 13) Various rhashtable bug fixes and enhancements, from Herbert Xu, Patrick McHardy, and Thomas Graf. 14) Support xmit_more in be2net, from Sathya Perla. 15) Group Policy extensions for vxlan, from Thomas Graf. 16) Remove Checksum Offload support for vxlan, from Tom Herbert. 17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From Vlad Yasevich. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits) crypto: fix af_alg_make_sg() conversion to iov_iter ipv4: Namespecify TCP PMTU mechanism i40e: Fix for stats init function call in Rx setup tcp: don't include Fast Open option in SYN-ACK on pure SYN-data openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set ipv6: Make __ipv6_select_ident static ipv6: Fix fragment id assignment on LE arches. bridge: Fix inability to add non-vlan fdb entry net: Mellanox: Delete unnecessary checks before the function call "vunmap" cxgb4: Add support in cxgb4 to get expansion rom version via ethtool ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version net: dsa: Remove redundant phy_attach() IB/mlx4: Reset flow support for IB kernel ULPs IB/mlx4: Always use the correct port for mirrored multicast attachments net/bonding: Fix potential bad memory access during bonding events tipc: remove tipc_snprintf tipc: nl compat add noop and remove legacy nl framework tipc: convert legacy nl stats show to nl compat tipc: convert legacy nl net id get to nl compat tipc: convert legacy nl net id set to nl compat ...	2015-02-10 20:01:30 -08:00
Linus Torvalds	bdccc4edeb	xen: features and fixes for 3.20-rc0 - Reworked handling for foreign (grant mapped) pages to simplify the code, enable a number of additional use cases and fix a number of long-standing bugs. - Prefer the TSC over the Xen PV clock when dom0 (and the TSC is stable). - Assorted other cleanup and minor bug fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJU2JC+AAoJEFxbo/MsZsTRIvAH/1lgQ0EQlxaZtEFWY8cJBzxY dXaTMfyGQOddGYDCW0r42hhXJHeX7DWXSERSD3aW9DZOn/eYdneHq9gWRD4uPrGn hEFQ26J4jZWR5riGXaja0LqI2gJKLZ6BhHIQciLEbY+jw4ynkNBLNRPFehuwrCsZ WdBwJkyvXC3RErekncRl/aNhxdi4p1P6qeiaW/mo3UcSO/CFSKybOLwT65iePazg XuY9UiTn2+qcRkm/tjx8K9heHK8SBEGNWuoTcWYF1to8mwwUfKIAc4NO2UBDXJI+ rp7Z2lVFdII15JsQ08ATh3t7xDrMWLzCX/y4jCzmF3DBXLbSWdHCQMgI7TWt5pE= =PyJK -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.20-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen features and fixes from David Vrabel: - Reworked handling for foreign (grant mapped) pages to simplify the code, enable a number of additional use cases and fix a number of long-standing bugs. - Prefer the TSC over the Xen PV clock when dom0 (and the TSC is stable). - Assorted other cleanup and minor bug fixes. * tag 'stable/for-linus-3.20-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (25 commits) xen/manage: Fix USB interaction issues when resuming xenbus: Add proper handling of XS_ERROR from Xenbus for transactions. xen/gntdev: provide find_special_page VMA operation xen/gntdev: mark userspace PTEs as special on x86 PV guests xen-blkback: safely unmap grants in case they are still in use xen/gntdev: safely unmap grants in case they are still in use xen/gntdev: convert priv->lock to a mutex xen/grant-table: add a mechanism to safely unmap pages that are in use xen-netback: use foreign page information from the pages themselves xen: mark grant mapped pages as foreign xen/grant-table: add helpers for allocating pages x86/xen: require ballooned pages for grant maps xen: remove scratch frames for ballooned pages and m2p override xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() mm: add 'foreign' alias for the 'pinned' page flag mm: provide a find_special_page vma operation x86/xen: cleanup arch/x86/xen/mmu.c x86/xen: add some __init annotations in arch/x86/xen/mmu.c x86/xen: add some __init and static annotations in arch/x86/xen/setup.c x86/xen: use correct types for addresses in arch/x86/xen/setup.c ...	2015-02-10 13:56:56 -08:00
Lad, Prabhakar	38741d50eb	xen-netback: fix sparse warning this patch fixes following sparse warning: interface.c:83:5: warning: symbol 'xenvif_poll' was not declared. Should it be static? Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 16:04:22 -08:00
David Vrabel	42b5212fee	xen-netback: stop the guest rx thread after a fatal error After commit `e9d8b2c296` (xen-netback: disable rogue vif in kthread context), a fatal (protocol) error would leave the guest Rx thread spinning, wasting CPU time. Commit `ecf08d2dbb` (xen-netback: reintroduce guest Rx stall detection) made this even worse by removing a cond_resched() from this path. Since a fatal error is non-recoverable, just allow the guest Rx thread to exit. This requires taking additional refs to the task so the thread exiting early is handled safely. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Julien Grall <julien.grall@linaro.org> Tested-by: Julien Grall <julien.grall@linaro.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:39:04 -08:00
David Vrabel	ff4b156f16	xen/grant-table: add helpers for allocating pages Add gnttab_alloc_pages() and gnttab_free_pages() to allocate/free pages suitable to for granted maps. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2015-01-28 14:03:12 +00:00
David Vrabel	26c0e10258	xen-netback: support frontends without feature-rx-notify again Commit `bc96f648df` (xen-netback: make feature-rx-notify mandatory) incorrectly assumed that there were no frontends in use that did not support this feature. But the frontend driver in MiniOS does not and since this is used by (qemu) stubdoms, these stopped working. Netback sort of works as-is in this mode except: - If there are no Rx requests and the internal Rx queue fills, only the drain timeout will wake the thread. The default drain timeout of 10 s would give unacceptable pauses. - If an Rx stall was detected and the internal Rx queue is drained, then the Rx thread would never wake. Handle these two cases (when feature-rx-notify is disabled) by: - Reducing the drain timeout to 30 ms. - Disabling Rx stall detection. Reported-by: John <jw@nuclearfallout.net> Tested-by: John <jw@nuclearfallout.net> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 12:49:49 -05:00
David S. Miller	55b42b5ca2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/marvell.c Simple overlapping changes in drivers/net/phy/marvell.c Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-01 14:53:27 -04:00
Zoltan Kiss	8fe78989c3	xen-netback: Disable NAPI after disabling interrupts Otherwise the interrupt handler still calls napi_complete. Although it won't schedule NAPI again as either NAPI_STATE_DISABLE or NAPI_STATE_SCHED is set, it is just unnecessary, and it makes more sense to do this way. Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 15:59:37 -04:00
David Vrabel	ecf08d2dbb	xen-netback: reintroduce guest Rx stall detection If a frontend not receiving packets it is useful to detect this and turn off the carrier so packets are dropped early instead of being queued and drained when they expire. A to-guest queue is stalled if it doesn't have enough free slots for a an extended period of time (default 60 s). If at least one queue is stalled, the carrier is turned off (in the expectation that the other queues will soon stall as well). The carrier is only turned on once all queues are ready. When the frontend connects, all the queues start in the stalled state and only become ready once the frontend queues enough Rx requests. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 14:15:20 -04:00
David Vrabel	f48da8b14d	xen-netback: fix unlimited guest Rx internal queue and carrier flapping Netback needs to discard old to-guest skb's (guest Rx queue drain) and it needs detect guest Rx stalls (to disable the carrier so packets are discarded earlier), but the current implementation is very broken. 1. The check in hard_start_xmit of the slot availability did not consider the number of packets that were already in the guest Rx queue. This could allow the queue to grow without bound. The guest stops consuming packets and the ring was allowed to fill leaving S slot free. Netback queues a packet requiring more than S slots (ensuring that the ring stays with S slots free). Netback queue indefinately packets provided that then require S or fewer slots. 2. The Rx stall detection is not triggered in this case since the (host) Tx queue is not stopped. 3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback will consider this an Rx purge event which may result in it taking the carrier down unnecessarily. It also considers a queue with only 1 slot free as unstalled (even though the next packet might not fit in this). The internal guest Rx queue is limited by a byte length (to 512 Kib, enough for half the ring). The (host) Tx queue is stopped and started based on this limit. This sets an upper bound on the amount of memory used by packets on the internal queue. This allows the estimatation of the number of slots for an skb to be removed (it wasn't a very good estimate anyway). Instead, the guest Rx thread just waits for enough free slots for a maximum sized packet. skbs queued on the internal queue have an 'expires' time (set to the current time plus the drain timeout). The guest Rx thread will detect when the skb at the head of the queue has expired and discard expired skbs. This sets a clear upper bound on the length of time an skb can be queued for. For a guest being destroyed the maximum time needed to wait for all the packets it sent to be dropped is still the drain timeout (10 s) since it will not be sending new packets. Rx stall detection is reintroduced in a later commit. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 14:15:20 -04:00
David Vrabel	bc96f648df	xen-netback: make feature-rx-notify mandatory Frontends that do not provide feature-rx-notify may stall because netback depends on the notification from frontend to wake the guest Rx thread (even if can_queue is false). This could be fixed but feature-rx-notify was introduced in 2006 and I am not aware of any frontends that do not implement this. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 14:15:20 -04:00
Wei Liu	e24f8191cc	xen-netback: move netif_napi_add before binding interrupt Interrupt is enabled when bind_interdomain_evtchn_to_irqhandler returns. If there's interrupt pending interrupt handler is invoked. NAPI needs to be initialised before binding interrupt otherwise the interrupt handler will try to scheduling a NAPI instance that is not initialised yet, resulting in kernel OOPS. This fixes a regression introduced in `ea2c5e13` ("xen-netback: move NAPI add/remove calls"). Ideally function calls to create kthreads should also be moved before binding but I intent to fix this regression with minimal changes and refactor the code with another patch. Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-25 17:31:42 -07:00
Wei Liu	b125285821	xen-netback: remove loop waiting function The original implementation relies on a loop to check if all inflight packets are freed. Now we have proper reference counting, there's no need to use loop anymore. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-13 20:07:44 -07:00
Wei Liu	a64bd93452	xen-netback: don't stop dealloc kthread too early Reference count the number of packets in host stack, so that we don't stop the deallocation thread too early. If not, we can end up with xenvif_free permanently waiting for deallocation thread to unmap grefs. Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-13 20:07:44 -07:00
Wei Liu	ea2c5e1342	xen-netback: move NAPI add/remove calls Originally netif_napi_add was in xenvif_init_queue and netif_napi_del was in xenvif_deinit_queue, while kthreads were handled in xenvif_connect and xenvif_disconnect. Move netif_napi_add and netif_napi_del to xenvif_connect and xenvif_disconnect so that they reside together with kthread operations. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-13 20:07:44 -07:00
Zoltan Kiss	2561cc15e3	xen-netback: Don't deschedule NAPI when carrier off In the patch called "xen-netback: Turn off the carrier if the guest is not able to receive" NAPI was descheduled when the carrier was set off. That's not what most of the drivers do, and we don't have any specific reason to do so as well, so revert that change. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-11 14:07:47 -07:00
Zoltan Kiss	f34a4cf9c9	xen-netback: Turn off the carrier if the guest is not able to receive Currently when the guest is not able to receive more packets, qdisc layer starts a timer, and when it goes off, qdisc is started again to deliver a packet again. This is a very slow way to drain the queues, consumes unnecessary resources and slows down other guests shutdown. This patch change the behaviour by turning the carrier off when that timer fires, so all the packets are freed up which were stucked waiting for that vif. Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to signal the thread that either the timeout happened or an RX interrupt arrived, so the thread can check what it should do. It also disables NAPI, so the guest can't transmit, but leaves the interrupts on, so it can resurrect. Only the queues which brought down the interface can enable it again, the bit QUEUE_STATUS_RX_STALLED makes sure of that. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-05 16:04:46 -07:00
Zoltan Kiss	3d1af1df97	xen-netback: Using a new state bit instead of carrier This patch introduces a new state bit VIF_STATUS_CONNECTED to track whether the vif is in a connected state. Using carrier will not work with the next patch in this series, which aims to turn the carrier temporarily off if the guest doesn't seem to be able to receive packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org v2: - rename the bitshift type to "enum state_bit_shift" here, not in the next patch Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-05 16:04:46 -07:00
Tom Gundersen	c835a67733	net: set name_assign_type in alloc_netdev() Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) \| -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) \| -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-15 16:12:48 -07:00
Zoltan Kiss	f51de24356	xen-netback: Adding debugfs "io_ring_qX" files This patch adds debugfs capabilities to netback. There used to be a similar patch floating around for classic kernel, but it used procfs. It is based on a very similar blkback patch. It creates xen-netback/[vifname]/io_ring_q[queueno] files, reading them output various ring variables etc. Writing "kick" into it imitates an interrupt happened, it can be useful to check whether the ring is just stalled due to a missed interrupt. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-08 20:48:36 -07:00
Wei Liu	f7b50c4e7c	xen-netback: bookkeep number of active queues in our own module The original code uses netdev->real_num_tx_queues to bookkeep number of queues and invokes netif_set_real_num_tx_queues to set the number of queues. However, netif_set_real_num_tx_queues doesn't allow real_num_tx_queues to be smaller than 1, which means setting the number to 0 will not work and real_num_tx_queues is untouched. This is bogus when xenvif_free is invoked before any number of queues is allocated. That function needs to iterate through all queues to free resources. Using the wrong number of queues results in NULL pointer dereference. So we bookkeep the number of queues in xen-netback to solve this problem. This fixes a regression introduced by multiqueue patchset in 3.16-rc1. There's another bug in original code that the real number of RX queues is never set. In current Xen multiqueue design, the number of TX queues and RX queues are in fact the same. We need to set the numbers of TX and RX queues to the same value. Also remove xenvif_select_queue and leave queue selection to core driver, as suggested by David Miller. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 15:59:47 -07:00
Arnd Bergmann	e7b599d7c1	net: xen-netback: include linux/vmalloc.h again commit `e9ce7cb6b1` ("xen-netback: Factor queue-specific data into queue struct") added a use of vzalloc/vfree to interface.c, but removed the #include <linux/vmalloc.h> statement at the same time, which causes this build error: drivers/net/xen-netback/interface.c: In function 'xenvif_free': drivers/net/xen-netback/interface.c:754:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration] vfree(vif->queues); ^ cc1: some warnings being treated as errors Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:19:28 -07:00
Andrew J. Bennieston	8d3d53b3e4	xen-netback: Add support for multiple queues Builds on the refactoring of the previous patch to implement multiple queues between xen-netfront and xen-netback. Writes the maximum supported number of queues into XenStore, and reads the values written by the frontend to determine how many queues to use. Ring references and event channels are read from XenStore on a per-queue basis and rings are connected accordingly. Also adds code to handle the cleanup of any already initialised queues if the initialisation of a subsequent queue fails. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 14:48:16 -07:00
Wei Liu	e9ce7cb6b1	xen-netback: Factor queue-specific data into queue struct In preparation for multi-queue support in xen-netback, move the queue-specific data from struct xenvif into struct xenvif_queue, and update the rest of the code to use this. Also adds loops over queues where appropriate, even though only one is configured at this point, and uses alloc_netdev_mq() and the corresponding multi-queue netif wake/start/stop functions in preparation for multiple active queues. Finally, implements a trivial queue selection function suitable for ndo_select_queue, which simply returns 0 for a single queue and uses skb_get_hash() to compute the queue index otherwise. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 14:48:16 -07:00
Andrew J. Bennieston	a55d9766ce	xen-netback: Move grant_copy_op array back into struct xenvif. This array was allocated separately in commit `ac3d5ac2` ("xen-netback: fix guest-receive-side array sizes") due to it being very large, and a struct xenvif is allocated as the netdev_priv part of a struct net_device, i.e. via kmalloc() but falling back to vmalloc() if the initial alloc. fails. In preparation for the multi-queue patches, where this array becomes part of struct xenvif_queue and is always allocated through vzalloc(), move this back into the struct xenvif. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 14:48:16 -07:00
David S. Miller	54e5c4def0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 00:32:30 -04:00
David Vrabel	0d08fceb2e	xen-netback: fix race between napi_complete() and interrupt handler When the NAPI budget was not all used, xenvif_poll() would call napi_complete() /after/ enabling the interrupt. This resulted in a race between the napi_complete() and the napi_schedule() in the interrupt handler. The use of local_irq_save/restore() avoided by race iff the handler is running on the same CPU but not if it was running on a different CPU. Fix this properly by calling napi_complete() before reenabling interrupts (in the xenvif_napi_schedule_or_enable_irq() call). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 16:27:23 -04:00
Wilfried Klaebe	7ad24ea4bf	net: get rid of SET_ETHTOOL_OPS net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops ops; struct net_device dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 17:43:20 -04:00
Wei Liu	e9d8b2c296	xen-netback: disable rogue vif in kthread context When netback discovers frontend is sending malformed packet it will disables the interface which serves that frontend. However disabling a network interface involving taking a mutex which cannot be done in softirq context, so we need to defer this process to kthread context. This patch does the following: 1. introduce a flag to indicate the interface is disabled. 2. check that flag in TX path, don't do any work if it's true. 3. check that flag in RX path, turn off that interface if it's true. The reason to disable it in RX path is because RX uses kthread. After this change the behavior of netback is still consistent -- it won't do any TX work for a rogue frontend, and the interface will be eventually turned off. Also change a "continue" to "break" after xenvif_fatal_tx_err, as it doesn't make sense to continue processing packets if frontend is rogue. This is a fix for XSA-90. Reported-by: Török Edwin <edwin@etorok.net> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-01 16:25:51 -04:00
Zoltan Kiss	0e59a4a553	xen-netback: Non-functional follow-up patch for grant mapping series Ian made some late comments about the grant mapping series, I incorporated the non-functional outcomes into this patch: - typo fixes in a comment of xenvif_free(), and add another one there as well - typo fix for comment of rx_drain_timeout_msecs - remove stale comment before calling xenvif_grant_handle_reset() Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 16:33:42 -04:00
Zoltan Kiss	869b9b19b3	xen-netback: Stop using xenvif_tx_pending_slots_available Since the early days TX stops if there isn't enough free pending slots to consume a maximum sized (slot-wise) packet. Probably the reason for that is to avoid the case when we don't have enough free pending slot in the ring to finish the packet. But if we make sure that the pending ring has the same size as the shared ring, that shouldn't really happen. The frontend can only post packets which fit the to the free space of the shared ring. If it doesn't, the frontend has to stop, as it can only increase the req_prod when the whole packet fits onto the ring. This patch avoid using this checking, makes sure the 2 ring has the same size, and remove a checking from the callback. As now we don't stop the NAPI instance on this condition, we don't have to wake it up if we free pending slots up. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-26 16:33:42 -04:00
Zoltan Kiss	397dfd9f93	Revert "xen-netback: Aggregate TX unmap operations" This reverts commit `e9275f5e2d`. This commit is the last in the netback grant mapping series, and it tries to do more aggressive aggreagtion of unmap operations. However practical use showed almost no positive effect, whilst with certain frontends it causes significant performance regression. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-25 18:58:07 -04:00
David S. Miller	85dcce7a73	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-14 22:31:55 -04:00
Wei Liu	836fbaf459	xen-netback: use skb_is_gso in xenvif_start_xmit In `5bd076708` ("Xen-netback: Fix issue caused by using gso_type wrongly") we use skb_is_gso to determine if we need an extra slot to accommodate the SKB. There's similar error in interface.c. Change that to use skb_is_gso as well. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Annie Li <annie.li@oracle.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Paul Durrant <paul.durrant@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:36:32 -04:00
Zoltan Kiss	e9275f5e2d	xen-netback: Aggregate TX unmap operations Unmapping causes TLB flushing, therefore we should make it in the largest possible batches. However we shouldn't starve the guest for too long. So if the guest has space for at least two big packets and we don't have at least a quarter ring to unmap, delay it for at most 1 milisec. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:57:21 -05:00
Zoltan Kiss	093507885a	xen-netback: Timeout packets in RX path A malicious or buggy guest can leave its queue filled indefinitely, in which case qdisc start to queue packets for that VIF. If those packets came from an another guest, it can block its slots and prevent shutdown. To avoid that, we make sure the queue is drained in every 10 seconds. The QDisc queue in worst case takes 3 round to flush usually. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:57:15 -05:00

1 2

82 Commits