Commit Graph

297 Commits

Author SHA1 Message Date
Michael S. Tsirkin
de2b48d581 virtio_pci_common.h: drop VIRTIO_PCI_NO_LEGACY
Legacy drivers use virtio_pci_common.h too, we should not
define VIRTIO_PCI_NO_LEGACY there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-11 20:04:39 +02:00
Michael S. Tsirkin
30683a8cce virtio: set VIRTIO_CONFIG_S_FEATURES_OK on restore
virtio 1.0 devices require that drivers set VIRTIO_CONFIG_S_FEATURES_OK
after finalizing features.
virtio core missed doing this on restore, fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-11 20:04:38 +02:00
Michael S. Tsirkin
5f4c976089 virtio_pci: rename virtio_pci -> virtio_pci_common
kbuild does not seem to like it when we name source
files same as the module.
Let's rename virtio_pci -> virtio_pci_common,
and get rid of #include-ing c files.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:05 +02:00
Michael S. Tsirkin
a90fdce9dc virtio_pci: update file descriptions and copyright
There's been a lot of changes since 2007.
List main authors, add Red Hat copyright.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:05 +02:00
Michael S. Tsirkin
38eb4a29a7 virtio_pci: split out legacy device support
Move everything dealing with legacy devices out to virtio_pci_legacy.c.
Expose common code APIs in virtio_pci.h

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:04 +02:00
Michael S. Tsirkin
6f8f23d63d virtio_pci: setup config vector indirectly
config vector setup is version specific, do it indirectly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:04 +02:00
Michael S. Tsirkin
b09f00bbfe virtio_pci: setup vqs indirectly
VQ setup is mostly version-specific, add another level of indirection to
split the version-independent code out.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:03 +02:00
Michael S. Tsirkin
5386cef200 virtio_pci: delete vqs indirectly
VQ deletion is mostly version-specific, add another level of indirection
to split the version-independent code out.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:02 +02:00
Michael S. Tsirkin
f30eaf4a09 virtio_pci: use priv for vq notification
slightly reduce the amount of pointer chasing this needs to do.
More importantly, this will easily generalize to virtio 1.0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:02 +02:00
Michael S. Tsirkin
3ec7a77bb3 virtio_pci: free up vq->priv
We don't need to go from vq to vq info on
data path, so using direct vq->priv pointer for that
seems like a waste.

Let's build an array of vq infos, then we can use vq->index
for that lookup.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:01 +02:00
Michael S. Tsirkin
f913dd4536 virtio_pci: fix coding style for structs
should be

struct foo {
}

not

struct foo
{
}

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:01 +02:00
Michael S. Tsirkin
af535722f8 virtio_pci: add isr field
Use isr field instead of direct access to ioaddr.
This way generalizes easily to virtio 1.0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 21:42:00 +02:00
Michael S. Tsirkin
d71a6fc6b9 virtio: drop legacy_only driver flag
legacy_only flag is now unused, drop it from core.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 21:42:00 +02:00
Michael S. Tsirkin
63d9f218a3 virtio_balloon: drop legacy_only driver flag
we have blacklisted balloon in core, no need
for a driver flag.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 21:41:59 +02:00
Michael S. Tsirkin
5c609a5ef0 virtio: allow finalize_features to fail
This will make it easy for transports to validate features and return
failure.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 16:32:32 +02:00
Michael S. Tsirkin
b6098c3042 virtio: add API to detect legacy devices
transports need to be able to detect legacy-only
devices (ATM balloon only) to use legacy path
to drive them.

Add a core API to do just that.
The implementation just blacklists balloon:
not too pretty, but let's not over-engineer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:06:33 +02:00
Michael S. Tsirkin
747ae34a6e virtio: make VIRTIO_F_VERSION_1 a transport bit
Activate VIRTIO_F_VERSION_1 automatically unless legacy_only
is set.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:06:32 +02:00
Michael S. Tsirkin
df1b57fe59 virtio_balloon: add legacy_only flag
We have no plans to support virtio 1.0 in balloon driver.  Add an
explicit flag to mark it legacy only.

This will be used by follow-up patches.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:06:32 +02:00
Michael S. Tsirkin
b3bb62d119 virtio: add legacy feature table support
virtio-blk has some legacy feature bits that modern drivers
must not negotiate, but are needed for old legacy hosts
(that e.g. don't support virtio-scsi).
Allow a separate legacy feature table for such cases.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:26 +02:00
Michael S. Tsirkin
c102659d69 virtio: simplify feature bit handling
Now that we use u64 for bits, we can simply & them together.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:25 +02:00
Michael S. Tsirkin
cb3f6d9da4 virtio: set FEATURES_OK
set FEATURES_OK as per virtio 1.0 spec

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:25 +02:00
Cornelia Huck
8906265215 virtio: allow transports to get avail/used addresses
For virtio-1, we can theoretically have a more complex virtqueue
layout with avail and used buffers not on a contiguous memory area
with the descriptor table. For now, it's fine for a transport driver
to stay with the old layout: It needs, however, a way to access
the locations of the avail/used rings so it can register them with
the host.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:25 +02:00
Michael S. Tsirkin
00e6f3d9d9 virtio_ring: switch to new memory access APIs
Use virtioXX_to_cpu and friends for access to
all multibyte structures in memory.

Note: this is intentionally mechanical.
A follow-up patch will split long lines etc.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:25 +02:00
Michael S. Tsirkin
93d389f820 virtio: assert 32 bit features in transports
At this point, no transports set any of the high 32 feature bits.
Since transports generally can't (yet) cope with such bits, add BUG_ON
checks to make sure they are not set by mistake.

Based on rproc patch by Rusty.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:24 +02:00
Michael S. Tsirkin
d025477368 virtio: add support for 64 bit features.
Change u32 to u64, and use BIT_ULL and 1ULL everywhere.

Note: transports are unchanged, and only set low 32 bit.
This guarantees that no transport sets e.g. VERSION_1
by mistake without proper support.

Based on patch by Rusty.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:24 +02:00
Michael S. Tsirkin
e16e12be34 virtio: use u32, not bitmap for features
It seemed like a good idea to use bitmap for features
in struct virtio_device, but it's actually a pain,
and seems to become even more painful when we get more
than 32 feature bits.  Just change it to a u32 for now.

Based on patch by Rusty.

Suggested-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:23 +02:00
Raushaniya Maksudova
5a10b7dbf9 virtio_balloon: free some memory from balloon on OOM
Excessive virtio_balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. Various mechanisms are
responsible for correct virtio_balloon memory management. Nevertheless
it is often the case that these control tools does not have enough time
to react on fast changing memory load. As a result OS runs out of memory
and invokes OOM-killer. The balancing of memory by use of the virtio
balloon should not cause the termination of processes while there are
pages in the balloon. Now there is no way for virtio balloon driver to
free some memory at the last moment before some process will be get
killed by OOM-killer.

This does not provide a security breach as balloon itself is running
inside guest OS and is working in the cooperation with the host. Thus
some improvements from guest side should be considered as normal.

To solve the problem, introduce a virtio_balloon callback which is
expected to be called from the oom notifier call chain in out_of_memory()
function. If virtio balloon could release some memory, it will make
the system to return and retry the allocation that forced the out of
memory killer to run.

Allocate virtio  feature bit for this: it is not set by default,
the the guest will not deflate virtio balloon on OOM without explicit
permission from host.

Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-11-11 17:09:58 +10:30
Raushaniya Maksudova
1fd9c67203 virtio_balloon: return the amount of freed memory from leak_balloon()
This value would be useful in the next patch to provide the amount of
the freed memory for OOM killer.

Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-11-11 17:09:57 +10:30
Wolfram Sang
257d6e5a3b virtio: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:21:55 +02:00
Linus Torvalds
0e6e58f941 One cc: stable commit, the rest are a series of minor cleanups which have
been sitting in MST's tree during my vacation.  I changed a function name
 and made one trivial change, then they spent two days in linux-next.
 
 Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUQFBQAAoJENkgDmzRrbjxJRIP/1yCQRElQewxURSmJelyqCdU
 0mHYB0R9Mf3tfre1xnofqs2lWeSMc/4ptKHsVR6pupoztSwnz7HsLHfEFvFJh4mj
 KsaqYElxkNxTcfyHwLjyJS0/J6tG1tYypXGiimTBS0bvFHL3XZdimVgJ6WvX+gO7
 YSaDEX8/EqCERafslS5+gKJlz3drDOnCZCe9y4BDSmsvl2k7bkpSxIn8vsR6jIC0
 c5JpUy6QVF+3XA/J932M7yRs+xpqxNoUWiyY3ar9o3CtQAaQB0ZAetSxY6hTfvVc
 GlNFzCifdsaQwsl2SVsE2h6tWaRhtMtcGWQuhHThIPyIf8XxhYyBRY2FLo70LMz1
 eqtwy6F/Bg/nzUsdee4PZBMeoKHlAEL12RpsEKgfUoLzj16Aqa8ll+Agbglbkw8G
 f3d2FwzKAlpY5NwHETC1wYy52PJ3efqksRWuhokmYpxNSbHJS/lsiJOE7272/4Qr
 MtXuvRmo22tf34XFd5y7zqWjgZ58eeFOqQWi/K+6ZgpqVOvikjrXXKEuiVdjO0ZD
 kTVR/sQKiR+79rzENk80XBhWaMveECNXF1TiZ/3MmURkmEOBRQMxRQ20BX3exvna
 AJ/WVA5DcfXZc1yyqknE1NLGrvSBMJENH13x2QPwrqNWAryOOKuF1VKKIwWlDw5j
 vtx5nXiJa8YYdxI2TJCN
 =JK6x
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "One cc: stable commit, the rest are a series of minor cleanups which
  have been sitting in MST's tree during my vacation.  I changed a
  function name and made one trivial change, then they spent two days in
  linux-next"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
  virtio-rng: refactor probe error handling
  virtio_scsi: drop scan callback
  virtio_balloon: enable VQs early on restore
  virtio_scsi: fix race on device removal
  virito_scsi: use freezable WQ for events
  virtio_net: enable VQs early on restore
  virtio_console: enable VQs early on restore
  virtio_scsi: enable VQs early on restore
  virtio_blk: enable VQs early on restore
  virtio_scsi: move kick event out from virtscsi_init
  virtio_net: fix use after free on allocation failure
  9p/trans_virtio: enable VQs early
  virtio_console: enable VQs early
  virtio_blk: enable VQs early
  virtio_net: enable VQs early
  virtio: add API to enable VQs early
  virtio_net: minor cleanup
  virtio-net: drop config_mutex
  virtio_net: drop config_enable
  virtio-blk: drop config_mutex
  ...
2014-10-18 10:25:09 -07:00
Michael S. Tsirkin
486d2e632c virtio_balloon: enable VQs early on restore
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after resume returns, virtio balloon
violated this rule by adding bufs, which causes the VQ to be used
directly within restore.

To fix, call virtio_device_ready before using VQ.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:13 +10:30
Michael S. Tsirkin
22b7050a02 virtio: defer config changed notifications
Defer config changed notifications that arrive during
probe/scan/freeze/restore.

This will allow drivers to set DRIVER_OK earlier, without worrying about
racing with config change interrupts.

This change will also benefit old hypervisors (before 2009)
that send interrupts without checking DRIVER_OK: previously,
the callback could race with driver-specific initialization.

This will also help simplify drivers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cosmetic changes)
2014-10-15 10:24:56 +10:30
Michael S. Tsirkin
c6716bae52 virtio-pci: move freeze/restore to virtio core
This is in preparation to extending config changed event handling
in core.
Wrapping these in an API also seems to make for a cleaner code.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:55 +10:30
Michael S. Tsirkin
016c98c6fe virtio: unify config_changed handling
Replace duplicated code in all transports with a single wrapper in
virtio.c.

The only functional change is in virtio_mmio.c: if a buggy device sends
us an interrupt before driver is set, we previously returned IRQ_NONE,
now we return IRQ_HANDLED.

As this must not happen in practice, this does not look like a big deal.

See also commit 3fff0179e3
	virtio-pci: do not oops on config change if driver not loaded.
for the original motivation behind the driver check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:54 +10:30
Michael S. Tsirkin
6fbc198cf6 virtio_pci: fix virtio spec compliance on restore
On restore, virtio pci does the following:
+ set features
+ init vqs etc - device can be used at this point!
+ set ACKNOWLEDGE,DRIVER and DRIVER_OK status bits

This is in violation of the virtio spec, which
requires the following order:
- ACKNOWLEDGE
- DRIVER
- init vqs
- DRIVER_OK

This behaviour will break with hypervisors that assume spec compliant
behaviour.  It seems like a good idea to have this patch applied to
stable branches to reduce the support butden for the hypervisors.

Cc: stable@vger.kernel.org
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:53 +10:30
Konstantin Khlebnikov
09316c09dd mm/balloon_compaction: add vmstat counters and kpageflags bit
Always mark pages with PageBalloon even if balloon compaction is disabled
and expose this mark in /proc/kpageflags as KPF_BALLOON.

Also this patch adds three counters into /proc/vmstat: "balloon_inflate",
"balloon_deflate" and "balloon_migrate".  They accumulate balloon
activity.  Current size of balloon is (balloon_inflate - balloon_deflate)
pages.

All generic balloon code now gathered under option CONFIG_MEMORY_BALLOON.
It should be selected by ballooning driver which wants use this feature.
Currently virtio-balloon is the only user.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Konstantin Khlebnikov
9d1ba80564 mm/balloon_compaction: remove balloon mapping and flag AS_BALLOON_MAP
Now ballooned pages are detected using PageBalloon().  Fake mapping is no
longer required.  This patch links ballooned pages to balloon device using
field page->private instead of page->mapping.  Also this patch embeds
balloon_dev_info directly into struct virtio_balloon.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Konstantin Khlebnikov
d6d86c0a7f mm/balloon_compaction: redesign ballooned pages management
Sasha Levin reported KASAN splash inside isolate_migratepages_range().
Problem is in the function __is_movable_balloon_page() which tests
AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
against anonymous pages.  As result it tried to check address space flags
inside struct anon_vma.

Further investigation shows more problems in current implementation:

* Special branch in __unmap_and_move() never works:
  balloon_page_movable() checks page flags and page_count.  In
  __unmap_and_move() page is locked, reference counter is elevated, thus
  balloon_page_movable() always fails.  As a result execution goes to the
  normal migration path.  virtballoon_migratepage() returns
  MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
  move_to_new_page() thinks this is an error code and assigns
  newpage->mapping to NULL.  Newly migrated page lose connectivity with
  balloon an all ability for further migration.

* lru_lock erroneously required in isolate_migratepages_range() for
  isolation ballooned page.  This function releases lru_lock periodically,
  this makes migration mostly impossible for some pages.

* balloon_page_dequeue have a tight race with balloon_page_isolate:
  balloon_page_isolate could be executed in parallel with dequeue between
  picking page from list and locking page_lock.  Race is rare because they
  use trylock_page() for locking.

This patch fixes all of them.

Instead of fake mapping with special flag this patch uses special state of
page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
directly in struct page makes everything safer and easier.

PagePrivate is used to mark pages present in page list (i.e.  not
isolated, like PageLRU for normal pages).  It replaces special rules for
reference counter and makes balloon migration similar to migration of
normal pages.  This flag is protected by page_lock together with link to
the balloon device.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: <stable@vger.kernel.org>	[3.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Rusty Russell
b25bd2515e virtio_ring: unify direct/indirect code paths.
virtqueue_add() populates the virtqueue descriptor table from the sgs
given.  If it uses an indirect descriptor table, then it puts a single
descriptor in the descriptor table pointing to the kmalloc'ed indirect
table where the sg is populated.

Previously vring_add_indirect() did the allocation and the simple
linear layout.  We replace that with alloc_indirect() which allocates
the indirect table then chains it like the normal descriptor table so
we can reuse the core logic.

This slows down pktgen by less than 1/2 a percent (which uses direct
descriptors), as well as vring_bench, but it's far neater.

vring_bench before:
	1061485790-1104800648(1.08254e+09+/-6.6e+06)ns
vring_bench after:
	1125610268-1183528965(1.14172e+09+/-8e+06)ns

pktgen before:
   787781-796334(793165+/-2.4e+03)pps 365-369(367.5+/-1.2)Mb/sec (365530384-369498976(3.68028e+08+/-1.1e+06)bps) errors: 0

pktgen after:
   779988-790404(786391+/-2.5e+03)pps 361-366(364.35+/-1.3)Mb/sec (361914432-366747456(3.64885e+08+/-1.2e+06)bps) errors: 0

Now, if we make force indirect descriptors by turning off any_header_sg
in virtio_net.c:

pktgen before:
  713773-721062(718374+/-2.1e+03)pps 331-334(332.95+/-0.92)Mb/sec (331190672-334572768(3.33325e+08+/-9.6e+05)bps) errors: 0
pktgen after:
  710542-719195(714898+/-2.4e+03)pps 329-333(331.15+/-1.1)Mb/sec (329691488-333706480(3.31713e+08+/-1.1e+06)bps) errors: 0

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-13 12:52:35 -04:00
Rusty Russell
eeebf9b1fc virtio_ring: assume sgs are always well-formed.
We used to have several callers which just used arrays.  They're
gone, so we can use sg_next() everywhere, simplifying the code.

On my laptop, this slowed down vring_bench by 15%:

vring_bench before:
	936153354-967745359(9.44739e+08+/-6.1e+06)ns
vring_bench after:
	1061485790-1104800648(1.08254e+09+/-6.6e+06)ns

However, a more realistic test using pktgen on a AMD FX(tm)-8320 saw
a few percent improvement:

pktgen before:
  767390-792966(785159+/-6.5e+03)pps 356-367(363.75+/-2.9)Mb/sec (356068960-367936224(3.64314e+08+/-3e+06)bps) errors: 0

pktgen after:
   787781-796334(793165+/-2.4e+03)pps 365-369(367.5+/-1.2)Mb/sec (365530384-369498976(3.68028e+08+/-1.1e+06)bps) errors: 0

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-13 12:50:46 -04:00
Benoit Taine
cef340e6aa virtio: Replace DEFINE_PCI_DEVICE_TABLE macro use
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to meet
kernel coding style guidelines. This issue was reported by checkpatch.

Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-07-27 21:07:15 +09:30
Rusty Russell
e2dcdfe95c virtio: virtio_break_device() to mark all virtqueues broken.
Good for post-apocalyptic scenarios, like S/390 hotplug.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-04-28 11:34:13 +09:30
Rusty Russell
70670444c2 virtio: fail adding buffer on broken queues.
Heinz points out that adding buffers to a broken virtqueue (which
should "never happen") still works.  Failing allows drivers to detect
and complain about broken devices.

Now drivers are robust, we can add this extra check.

Reported-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 11:27:57 +10:30
Rusty Russell
4951cc9083 virtio_balloon: don't crash if virtqueue is broken.
A bad implementation of virtio might cause us to mark the virtqueue
broken: we'll dev_err() in that case, and the device is useless, but
let's not BUG().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 11:27:56 +10:30
Rusty Russell
1f74ef0f2d virtio_balloon: don't softlockup on huge balloon changes.
When adding or removing 100G from a balloon:

    BUG: soft lockup - CPU#0 stuck for 22s! [vballoon:367]

We have a wait_event_interruptible(), but the condition is always true
(more ballooning to do) so we don't ever sleep.  We also have a
wait_event() for the host to ack, but that is also always true as QEMU
is synchronous for balloon operations.

Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 11:27:55 +10:30
Alexander Gordeev
5e37f67063 virtio: Use pci_enable_msix_exact() instead of pci_enable_msix()
As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range()  or pci_enable_msi_exact()
and pci_enable_msix_range() or pci_enable_msix_exact()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 11:27:54 +10:30
Joel Stanley
6abb2dd928 tools/virtio: fix missing kmemleak_ignore symbol
In commit bb478d8b16 virtio_ring: plug kmemleak false positive,
kmemleak_ignore was introduced. This broke compilation of virtio_test:

  cc -g -O2 -Wall -I. -I ../../usr/include/ -Wno-pointer-sign
    -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD
    -U_FORTIFY_SOURCE   -c -o virtio_ring.o ../../drivers/virtio/virtio_ring.c
  ../../drivers/virtio/virtio_ring.c: In function ‘vring_add_indirect’:
  ../../drivers/virtio/virtio_ring.c:177:2: warning: implicit declaration
  of function ‘kmemleak_ignore’ [-Wimplicit-function-declaration]
    kmemleak_ignore(desc);
    ^
  cc   virtio_test.o virtio_ring.o   -o virtio_test
  virtio_ring.o: In function `vring_add_indirect':
  tools/virtio/../../drivers/virtio/virtio_ring.c:177:
  undefined reference to `kmemleak_ignore'

Add a dummy header for tools/virtio, and add #incldue <linux/kmemleak.h>
to drivers/virtio/virtio_ring.c so it is picked up by the userspace
tools.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 11:23:25 +10:30
Linus Torvalds
93b05cba8e A few simple fixes. Quiet cycle.
Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJS4H+AAAoJENkgDmzRrbjxhXgP/0S4aPYPm6aw+nMKzrbZ6umd
 7kFPjZyh8zDuOOwLOIFt2gepI/H0d56ZWi4EmNBO5Md0OChNjHjDF4sC/PA+vTsQ
 DUoP6zv2yGlmbnaIri4Xo7DXOM/40G4+VOtO60KsTnwjIj8/bqA/VGBFoodqKVw4
 lBgkFs32IHynZuwxj14UeuzcPBR8tmHQDvH+ROqa5/7lzMHf24MQLh7NjNCJyoz9
 d56GW5tFNpWYgZO27v/QuAI2wHDPDpSOoZnPCh1yfr8+kk+W2HfVFYVD0dOiu2VR
 FMQ8OMGI+o9XMKvvwCxp2WkXExZRv0KbUOi8+LhphygZgLQw7FEYdulRLpPcach4
 T4vzpTHXBVhkrBNkKJIYS556xuubwnQM57ZcaoiDr16SHnyji/tALRIrhnyfvwuo
 r44HxVkq6RjMlDGZs/5JdmjcW+FWMg5/eVZBkyr+Y+/5xTo8E+WYz6JhMEc+hrrq
 TL8HDUtpEwFx1HORwkPUBNKoj14sqKwst6wVgN4iyTv534Cq8lqDPanZncBRb2ho
 p0NxTC96hqOz/LRAy5mS3qn530DOUuvEeV0KGIgbUFn4nuWeGO4kswUY7GlU8dda
 4ww3Uxw5SF9uHYBZWtXavBLIq6PhcTdJ6Ckz83kgRwev4gLo9Rj83yW/hwjP9pxu
 FwRo38ZtfY9jyL4w0cv1
 =mh3b
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio update from Rusty Russell:
 "A few simple fixes.  Quiet cycle"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  drivers: virtio: Mark function virtballoon_migratepage() as static in virtio_balloon.c
  virtio-scsi: Fix hotcpu_notifier use-after-free with virtscsi_freeze
  virtio: pci: remove unnecessary pci_set_drvdata()
2014-01-22 22:24:35 -08:00
Rashika Kheria
05c54de8c8 drivers: virtio: Mark function virtballoon_migratepage() as static in virtio_balloon.c
Mark the function virtballoon_migratepage() as static in
virtio_balloon.c because it is not used outside this file.

This eliminates the following warning in virtio_balloon.c:
drivers/virtio/virtio_balloon.c:372:5: warning: no previous prototype for ‘virtballoon_migratepage’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-01-16 10:22:28 +10:30
Luiz Capitulino
3459f11a8b virtio_balloon: update_balloon_size(): update correct field
According to the virtio spec, the device configuration field
that should be updated after an inflation or deflation
operation is the 'actual' field, not the 'num_pages' one.

Commit 855e0c5288 swapped them
in update_balloon_size(). Fix it.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: 855e0c5288
2013-12-05 13:12:39 +10:30
Jingoo Han
7d2dddda5c virtio: pci: remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-12-04 14:20:26 +10:30
Linus Torvalds
b746f9c794 Nothing really exciting: some groundwork for changing virtio endian, and
some robustness fixes for broken virtio devices, plus minor tweaks.
 
 [vs last pull request: added the virtio-scsi broken vq escape patch, which
 I somehow lost.]
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSgDsJAAoJENkgDmzRrbjxEE4P/jXqZHS/HdlxW9k0BjKKlEIF
 PdtCoP3UhWTdskXvy2pD8m6nYn214MEJYUIa4HFlIEZsdxhuexzQHY19Ynkjagyv
 57sRsUUm5fYQLIL7IUh2DUD1VU38hUFinno/y333szzvCj9qITDA/QABsiWxK8NO
 dq+Lmeixgrhc5yN9iryW+gZV+hekJIZ4LsU5ejSaJucKblzXUH8qIbmSthG7RTYJ
 tr4J7xTTXbhxY4CoC5Dpx2hvsFkvzaAIvI4Nr1mDjfq5cR8BaYvnC89U1IbhdAey
 p1AbZE58JLrY+Z8K8LBRGV2KjO8qSZ6R47hbZ9nAnodJYB7sZLyj6jUe1q+/htuC
 Dh9Xm9O4eW2xNaFk20dYeIF4UU5/HzdsbvG/IlH8x4sm8/K706ocYyAOHlzYUg2T
 k7gltrgDzDokMgb2R44gwnr4oaJ2q8Gne6JXswlPEv2eRs6vNnA5Xhc0rEHGkU6C
 gYn1vNFN6yx0vf2syG/Ce5pZtMxGpefKQkHzzWdq8FKr1B9s54dDuf2hls7J8A9t
 OQT1gE33yURSelf4Kh4k9zWXaWk/Ohv9l2R1cqpALnJ4/+q0fP5t7HdK500S7aax
 DxLeFeqvsBw7nlWgsGxQmt+fjITQFHhcDiwst0ehnt6RbDEW7XPIguz0K/gyhxYG
 +UNbl/5Gr64jnUX3YCzm
 =vY2L
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "Nothing really exciting: some groundwork for changing virtio endian,
  and some robustness fixes for broken virtio devices, plus minor
  tweaks"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_scsi: verify if queue is broken after virtqueue_get_buf()
  x86, asmlinkage, lguest: Pass in globals into assembler statement
  virtio: mmio: fix signature checking for BE guests
  virtio_ring: adapt to notify() returning bool
  virtio_net: verify if queue is broken after virtqueue_get_buf()
  virtio_console: verify if queue is broken after virtqueue_get_buf()
  virtio_blk: verify if queue is broken after virtqueue_get_buf()
  virtio_ring: add new function virtqueue_is_broken()
  virtio_test: verify if virtqueue_kick() succeeded
  virtio_net: verify if virtqueue_kick() succeeded
  virtio_ring: let virtqueue_{kick()/notify()} return a bool
  virtio_ring: change host notification API
  virtio_config: remove virtio_config_val
  virtio: use size-based config accessors.
  virtio_config: introduce size-based accessors.
  virtio_ring: plug kmemleak false positive.
  virtio: pm: use CONFIG_PM_SLEEP instead of CONFIG_PM
2013-11-15 13:28:47 +09:00
Marc Zyngier
4ae8537072 virtio: mmio: fix signature checking for BE guests
As virtio-mmio config registers are specified to be little-endian,
using readl() to read the magic value and then memcmp() to check it
fails on BE (as readl() has an implicit swab).

Fix it by encoding the magic value as an integer instead of a string.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-11-07 12:13:04 +10:30
Heinz Graalfs
2342d6a651 virtio_ring: adapt to notify() returning bool
Correct if statement to check for bool returned by notify()
(introduced in 5b1bf7cb67).

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-11-05 21:21:08 +10:30
Heinz Graalfs
b3b32c9413 virtio_ring: add new function virtqueue_is_broken()
Add new function virtqueue_is_broken(). Callers of virtqueue_get_buf()
should check for a broken queue.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-29 11:28:17 +10:30
Heinz Graalfs
5b1bf7cb67 virtio_ring: let virtqueue_{kick()/notify()} return a bool
virtqueue_{kick()/notify()} should exploit the new host notification API.
If the notify call returned with a negative value the host kick failed
(e.g. a kick triggered after a device was hot-unplugged). In this case
the virtqueue is set to 'broken' and false is returned, otherwise true.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-29 11:28:12 +10:30
Heinz Graalfs
46f9c2b925 virtio_ring: change host notification API
Currently a host kick error is silently ignored and not reflected in
the virtqueue of a particular virtio device.

Changing the notify API for guest->host notification seems to be one
prerequisite in order to be able to handle such errors in the context
where the kick is triggered.

This patch changes the notify API. The notify function must return a
bool return value. It returns false if the host notification failed.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-29 11:28:11 +10:30
Greg Kroah-Hartman
3736dab6e5 virtio: convert bus code to use dev_groups
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead.  This converts the virtio bus code to use the
correct field.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <virtualization@lists.linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-16 18:40:57 -07:00
Rusty Russell
855e0c5288 virtio: use size-based config accessors.
This lets the transport do endian conversion if necessary, and insulates
the drivers from the difference.

Most drivers can use the simple helpers virtio_cread() and virtio_cwrite().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-17 10:55:37 +10:30
Rusty Russell
bb478d8b16 virtio_ring: plug kmemleak false positive.
unreferenced object 0xffff88003d467e20 (size 32):
  comm "softirq", pid 0, jiffies 4295197765 (age 6.364s)
  hex dump (first 32 bytes):
    28 19 bf 3d 00 00 00 00 0c 00 00 00 01 00 01 00  (..=............
    02 dc 51 3c 00 00 00 00 56 00 00 00 00 00 00 00  ..Q<....V.......
  backtrace:
    [<ffffffff8152db19>] kmemleak_alloc+0x59/0xc0
    [<ffffffff81102e93>] __kmalloc+0xf3/0x180
    [<ffffffff812db5d6>] vring_add_indirect+0x36/0x280
    [<ffffffff812dc59f>] virtqueue_add_outbuf+0xbf/0x4e0
    [<ffffffff813a8b30>] start_xmit+0x1a0/0x3b0
    [<ffffffff81445861>] dev_hard_start_xmit+0x2d1/0x4d0
    [<ffffffff81460052>] sch_direct_xmit+0xf2/0x1c0
    [<ffffffff81445c28>] dev_queue_xmit+0x1c8/0x460
    [<ffffffff814e3187>] ip6_finish_output2+0x1d7/0x470
    [<ffffffff814e34b0>] ip6_finish_output+0x90/0xb0
    [<ffffffff814e3507>] ip6_output+0x37/0xb0
    [<ffffffff815021eb>] igmp6_send+0x2db/0x470
    [<ffffffff81502645>] igmp6_timer_handler+0x95/0xa0
    [<ffffffff8104b57c>] call_timer_fn+0x2c/0x90
    [<ffffffff8104b7ba>] run_timer_softirq+0x1da/0x1f0
    [<ffffffff81045721>] __do_softirq+0xd1/0x1b0

Address gets embedded in a descriptor via virt_to_phys().  See detach_buf,
which frees it:

	if (vq->vring.desc[i].flags & VRING_DESC_F_INDIRECT)
		kfree(phys_to_virt(vq->vring.desc[i].addr));

Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Fix-suggested-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Typing-done-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-17 10:55:35 +10:30
Aaron Lu
8910700039 virtio: pm: use CONFIG_PM_SLEEP instead of CONFIG_PM
The freeze and restore functions defined in virtio drivers are used
for suspend and hibernate, so CONFIG_PM_SLEEP is more appropriate than
CONFIG_PM. This patch replace all CONFIG_PM with CONFIG_PM_SLEEP for
virtio drivers that implement freeze and restore callbacks.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-23 15:45:58 +09:30
Aaron Lu
9e266ece21 virtio_pci: pm: Use CONFIG_PM_SLEEP instead of CONFIG_PM
The virtio_pci_freeze/restore are defined under CONFIG_PM but is used
by SET_SYSTEM_SLEEP_PM_OPS macro, which is defined under
CONFIG_PM_SLEEP. So if CONFIG_PM_SLEEP is not cofigured but
CONFIG_PM_RUNTIME is, the following warning message appeared:

drivers/virtio/virtio_pci.c:770:12: warning: ‘virtio_pci_freeze’ defined but not used [-Wunused-function]
 static int virtio_pci_freeze(struct device *dev)
            ^
drivers/virtio/virtio_pci.c:790:12: warning: ‘virtio_pci_restore’ defined but not used [-Wunused-function]
 static int virtio_pci_restore(struct device *dev)
            ^
Fix it by changing CONFIG_PM to CONFIG_PM_SLEEP.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-09 10:02:53 +09:30
Linus Torvalds
5f12972171 No real surprises.
Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR3NcfAAoJENkgDmzRrbjxtgAQAMAChsD867ADZPVaT1jcS6Me
 5OoavOnpw84AopY4cLOFxL13lDgHeDZxTbsS56WP/100WEDTFQqr9uWpuFxt/NU7
 ukXwMrN/5RBmTXtDpdApM+jnBdgRLxu6gJaEjmVZSW10XwPDar+zGJV+NUVX42uy
 8rhjZcFTPIhLz96VBeXgtnmlc8f33AwdFb1JtA6c5slMXmJ0pGKpme3Gri0Hqu0X
 sowVCchjuJXuhg3siyyjcyrDtWnH6hPNCbvAFuL4wwcGyMjb3TS7l922GiscD9pq
 KZcpkYywCosJEnaR4XZQ3ewDGnqaslgb556Tcigm2TXC9LfAR/k4Q1ZYiPGrxcmU
 zgiN/Nvr1PbMDcKr6Lr+utQGh82jeKTcrKz/CPlCJtusg+4hPCwuugDMhNqKK4/2
 3O+0c5t32K38TUdDhVRu2wXlvlyLoQc55yIbhw70O+G1Td7KMHrjXuxwyWbP+tGU
 X/D2DKQu5bcNcBv5sA04PdyRM2buYFvwSZUjxwdwWssmdUqU1xdybDK3pgWf92fF
 llsri86xZ/hOOE+A+jn/oUpXol2PAVOCoh80P7O9VeuDgPL2Fjl4UWf8HWqVvn/u
 A3kpCrIygogc4I7xQkdxBkR6Aa2Uh323rpXKm7E4Tlg0Ii6srqBWq74fz2Ou3TlS
 tOPNG/Npv6yiWo2ejXp8
 =XNkj
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "No real surprises"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MAINTAINERS: add tools/virtio/ under virtio
  tools/virtio: move module license stub to module.h
  virtio: include asm/barrier explicitly
  virtio: VIRTIO_F_ANY_LAYOUT feature
  lguest: fix example launcher compilation for broken glibc headers.
  virtio-net: fix the race between channels setting and refill
  tools/lguest: real barriers.
  tools/lguest: fix missing rmb().
  virtio_balloon: leak_balloon(): only tell host if we got pages deflated
  virtio-pci: fix leaks of msix_affinity_masks
  Fix comment typo "CONFIG_PAE"
2013-07-10 14:50:58 -07:00
Linus Torvalds
496322bc91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "This is a re-do of the net-next pull request for the current merge
  window.  The only difference from the one I made the other day is that
  this has Eliezer's interface renames and the timeout handling changes
  made based upon your feedback, as well as a few bug fixes that have
  trickeled in.

  Highlights:

   1) Low latency device polling, eliminating the cost of interrupt
      handling and context switches.  Allows direct polling of a network
      device from socket operations, such as recvmsg() and poll().

      Currently ixgbe, mlx4, and bnx2x support this feature.

      Full high level description, performance numbers, and design in
      commit 0a4db187a9 ("Merge branch 'll_poll'")

      From Eliezer Tamir.

   2) With the routing cache removed, ip_check_mc_rcu() gets exercised
      more than ever before in the case where we have lots of multicast
      addresses.  Use a hash table instead of a simple linked list, from
      Eric Dumazet.

   3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
      Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
      Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

   4) Support reporting the TUN device persist flag to userspace, from
      Pavel Emelyanov.

   5) Allow controlling network device VF link state using netlink, from
      Rony Efraim.

   6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

   7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
      Daniel Borkmann and Eric Dumazet.

   8) Allow controlling of TCP quickack behavior on a per-route basis,
      from Cong Wang.

   9) Several bug fixes and improvements to vxlan from Stephen
      Hemminger, Pravin B Shelar, and Mike Rapoport.  In particular,
      support receiving on multiple UDP ports.

  10) Major cleanups, particular in the area of debugging and cookie
      lifetime handline, to the SCTP protocol code.  From Daniel
      Borkmann.

  11) Allow packets to cross network namespaces when traversing tunnel
      devices.  From Nicolas Dichtel.

  12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
      manner akin to how we monitor real network traffic via ptype_all.
      From Daniel Borkmann.

  13) Several bug fixes and improvements for the new alx device driver,
      from Johannes Berg.

  14) Fix scalability issues in the netem packet scheduler's time queue,
      by using an rbtree.  From Eric Dumazet.

  15) Several bug fixes in TCP loss recovery handling, from Yuchung
      Cheng.

  16) Add support for GSO segmentation of MPLS packets, from Simon
      Horman.

  17) Make network notifiers have a real data type for the opaque
      pointer that's passed into them.  Use this to properly handle
      network device flag changes in arp_netdev_event().  From Jiri
      Pirko and Timo Teräs.

  18) Convert several drivers over to module_pci_driver(), from Peter
      Huewe.

  19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
      O(1) calculation instead.  From Eric Dumazet.

  20) Support setting of explicit tunnel peer addresses in ipv6, just
      like ipv4.  From Nicolas Dichtel.

  21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

  22) Prevent a single high rate flow from overruning an individual cpu
      during RX packet processing via selective flow shedding.  From
      Willem de Bruijn.

  23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
      Dumazet.

  24) Don't just drop GSO packets which are above the TBF scheduler's
      burst limit, chop them up so they are in-bounds instead.  Also
      from Eric Dumazet.

  25) VLAN offloads are missed when configured on top of a bridge, fix
      from Vlad Yasevich.

  26) Support IPV6 in ping sockets.  From Lorenzo Colitti.

  27) Receive flow steering targets should be updated at poll() time
      too, from David Majnemer.

  28) Fix several corner case regressions in PMTU/redirect handling due
      to the routing cache removal, from Timo Teräs.

  29) We have to be mindful of ipv4 mapped ipv6 sockets in
      upd_v6_push_pending_frames().  From Hannes Frederic Sowa.

  30) Fix L2TP sequence number handling bugs, from James Chapman."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
  drivers/net: caif: fix wrong rtnl_is_locked() usage
  drivers/net: enic: release rtnl_lock on error-path
  vhost-net: fix use-after-free in vhost_net_flush
  net: mv643xx_eth: do not use port number as platform device id
  net: sctp: confirm route during forward progress
  virtio_net: fix race in RX VQ processing
  virtio: support unlocked queue poll
  net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
  Documentation: Fix references to defunct linux-net@vger.kernel.org
  net/fs: change busy poll time accounting
  net: rename low latency sockets functions to busy poll
  bridge: fix some kernel warning in multicast timer
  sfc: Fix memory leak when discarding scattered packets
  sit: fix tunnel update via netlink
  dt:net:stmmac: Add dt specific phy reset callback support.
  dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
  dt:net:stmmac: Allocate platform data only if its NULL.
  net:stmmac: fix memleak in the open method
  ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
  net: ipv6: fix wrong ping_v6_sendmsg return value
  ...
2013-07-09 18:24:39 -07:00
Michael S. Tsirkin
cc229884d3 virtio: support unlocked queue poll
This adds a way to check ring empty state after enable_cb outside any
locks. Will be used by virtio_net.

Note: there's room for more optimization: caller is likely to have a
memory barrier already, which means we might be able to get rid of a
barrier here.  Deferring this optimization until we do some
benchmarking.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-09 12:45:37 -07:00
Linus Torvalds
7f0ef0267e Merge branch 'akpm' (updates from Andrew Morton)
Merge first patch-bomb from Andrew Morton:
 - various misc bits
 - I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
   distracted.  There has been quite a bit of activity.
 - About half the MM queue
 - Some backlight bits
 - Various lib/ updates
 - checkpatch updates
 - zillions more little rtc patches
 - ptrace
 - signals
 - exec
 - procfs
 - rapidio
 - nbd
 - aoe
 - pps
 - memstick
 - tools/testing/selftests updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits)
  tools/testing/selftests: don't assume the x bit is set on scripts
  selftests: add .gitignore for kcmp
  selftests: fix clean target in kcmp Makefile
  selftests: add .gitignore for vm
  selftests: add hugetlbfstest
  self-test: fix make clean
  selftests: exit 1 on failure
  kernel/resource.c: remove the unneeded assignment in function __find_resource
  aio: fix wrong comment in aio_complete()
  drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
  drivers/memstick/host/r592.c: convert to module_pci_driver
  drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
  pps-gpio: add device-tree binding and support
  drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
  drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
  drivers/parport/share.c: use kzalloc
  Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
  aoe: update internal version number to v83
  aoe: update copyright date
  aoe: perform I/O completions in parallel
  ...
2013-07-03 17:12:13 -07:00
Jiang Liu
3dcc0571cd mm: correctly update zone->managed_pages
Enhance adjust_managed_page_count() to adjust totalhigh_pages for
highmem pages.  And change code which directly adjusts totalram_pages to
use adjust_managed_page_count() because it adjusts totalram_pages,
totalhigh_pages and zone->managed_pages altogether in a safe way.

Remove inc_totalhigh_pages() and dec_totalhigh_pages() from xen/balloon
driver bacause adjust_managed_page_count() has already adjusted
totalhigh_pages.

This patch also fixes two bugs:

1) enhances virtio_balloon driver to adjust totalhigh_pages when
   reserve/unreserve pages.
2) enhance memory_hotplug.c to adjust totalhigh_pages when hot-removing
   memory.

We still need to deal with modifications of totalram_pages in file
arch/powerpc/platforms/pseries/cmm.c, but need help from PPC experts.

[akpm@linux-foundation.org: remove ifdef, per Wanpeng Li, virtio_balloon.c cleanup, per Sergei]
[akpm@linux-foundation.org: export adjust_managed_page_count() to modules, for drivers/virtio/virtio_balloon.c]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:33 -07:00
Luiz Capitulino
8c6bab4f38 virtio_balloon: leak_balloon(): only tell host if we got pages deflated
balloon_page_dequeue() can return NULL.  If it does for the first page
being freed then leak_balloon() will create a scatter list with len=0.
Which in turn seems to generate an invalid virtio request.

I didn't get this in practice, I found it by code review.  On the other
hand, such an invalid virtio request will cause errors in QEMU and
fill_balloon() also performs the same check implemented by this commit.

This bug was introduced in e2250429.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # 3.9
2013-07-02 15:42:03 +09:30
Andrew Vagin
f11335db5e virtio-pci: fix leaks of msix_affinity_masks
vp_dev->msix_vectors should be initialized before allocating
msix_affinity_masks, otherwise vp_free_vectors will not free these
objects.

unreferenced object 0xffff88010f969d88 (size 512):
  comm "systemd-udevd", pid 158, jiffies 4294673645 (age 80.545s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816e455e>] kmemleak_alloc+0x5e/0xc0
    [<ffffffff811aa7f1>] kmem_cache_alloc_node_trace+0x141/0x2c0
    [<ffffffff8133ba23>] alloc_cpumask_var_node+0x23/0x80
    [<ffffffff8133ba8e>] alloc_cpumask_var+0xe/0x10
    [<ffffffff813fdb3d>] vp_try_to_find_vqs+0x25d/0x810
    [<ffffffff813fe171>] vp_find_vqs+0x81/0xb0
    [<ffffffffa00d2a05>] init_vqs+0x85/0x120 [virtio_balloon]
    [<ffffffffa00d2c29>] virtballoon_probe+0xf9/0x1a0 [virtio_balloon]
    [<ffffffff813fb61e>] virtio_dev_probe+0xde/0x140
    [<ffffffff814452b8>] driver_probe_device+0x98/0x3a0
    [<ffffffff8144566b>] __driver_attach+0xab/0xb0
    [<ffffffff814432f4>] bus_for_each_dev+0x94/0xb0
    [<ffffffff81444f4e>] driver_attach+0x1e/0x20
    [<ffffffff81444910>] bus_add_driver+0x200/0x280
    [<ffffffff81445c14>] driver_register+0x74/0x160
    [<ffffffff813fb7d0>] register_virtio_driver+0x20/0x40

v2: change msix_vectors uncoditionaly in vp_free_vectors

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:42:02 +09:30
Rusty Russell
b3087e48ce virtio: remove virtqueue_add_buf().
All users changed to virtqueue_add_sg() or virtqueue_add_outbuf/inbuf.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-05-20 12:16:01 +09:30
Rusty Russell
92549abc6a virtio_balloon: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-03-20 15:45:06 +10:30
Rusty Russell
282edb3649 virtio_ring: virtqueue_add_outbuf / virtqueue_add_inbuf.
These are specialized versions of virtqueue_add_buf(), which cover
over 80% of cases and are far clearer.

In particular, the scatterlists passed to these functions don't have
to be clean (ie. we ignore end markers).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-03-20 15:44:52 +10:30
Rusty Russell
13816c768d virtio_ring: virtqueue_add_sgs, to add multiple sgs.
virtio_scsi can really use this, to avoid the current hack of copying
the whole sg array.  Some other things get slightly neater, too.

This causes a slowdown in virtqueue_add_buf(), which is implemented as
a wrapper.  This is addressed in the next patches.

for i in `seq 50`; do /usr/bin/time -f 'Wall time:%e' ./vringh_test --indirect --eventidx --parallel --fast-vringh; done 2>&1 | stats --trim-outliers:

Before:
	Using CPUS 0 and 3
	Guest: notified 0, pinged 39009-39063(39062)
	Host: notified 39009-39063(39062), pinged 0
	Wall time:1.700000-1.950000(1.723542)

After:
	Using CPUS 0 and 3
	Guest: notified 0, pinged 39062-39063(39063)
	Host: notified 39062-39063(39063), pinged 0
	Wall time:1.760000-2.220000(1.789167)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
2013-03-20 15:43:29 +10:30
Rusty Russell
a9a0fef779 virtio_ring: expose virtio barriers for use in vringh.
The host side of ring needs this logic too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-03-20 14:00:41 +10:30
Linus Torvalds
3c834b6f41 All trivial, thanks to the stuff which didn't quite make it time.
Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRLEkHAAoJENkgDmzRrbjx6K0P/3o9/iW5hkOPYpu+KV2nr0wG
 6RG0uu0DCOb/tckigYwnn5PkS7UEcJu6kDypnEgXfhcNhYiBjoGIAUEQ2jBYyzQm
 IYc4oZhxDdqigJL/FHi2zL50mkWacTdBK83udxim3eRkgW9ysBRoBFwqMruVyhZv
 474KGS4PNT8pHDOCAlrRS1I9oW2pYTuUidN+SnfVJ8gFSkH0tuHEGoJeGrtDa010
 XkiWqiTJq6pDHTM5f4Wwp/WNQ21UNDBlvRahg0nqTZXicQsvujV0Tdb+EnAfXwa+
 bssqvO4X2WqlraVK1TJteufhcdhUgt/I1+45p9eLZvOXizk7EBKxFWynE7C878GV
 dhpz8i4mc+u6aJlGoHzwvKah0zhwDtqWdbDS+LNQjRmjMK9v45ttoisU+iR1GjLt
 JpDVg73K315aWJ9RLsYu7mvZ8JY+qRFkwXqX3lZd+GdohY0DSmGMxMqJG93sLBYi
 42vyHMaBd1JY0rDVqpfzlmjnSxX+k05uB0GYB3fO0CXbPxmfXtRKz7/5wpSz0Up+
 nTCs6Xa7t5jtG9qpC4cKxyEDEwB9M6SSxi+lhrIBkeDqlFwXAv1Fean4jqqhtFwr
 HO6i1mYgy0xCVY7EmequVyWlH6/B5GVB2xTMQuAz3f8pKm+XUr6C/twROYkgoXbV
 AOVyRWtnbISDVjtVFn9c
 =KcDx
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "All trivial, thanks to the stuff which didn't quite make it time"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_console: Initialize guest_connected=true for rproc_serial
  virtio: use module_virtio_driver.
  virtio: Add module driver macro for virtio drivers.
  virtio_console: Use virtio device index to generate port name
  virtio: make pci_device_id const
  virtio: make config_ops const
  virtio-mmio: fix wrong comment about register offset
  virtio_console: Let unconnected rproc device receive data.
2013-02-26 14:49:12 -08:00
Rusty Russell
b2a17029c2 virtio: use module_virtio_driver.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-02-13 17:00:37 +10:30
Stephen Hemminger
35cdc9eb65 virtio: make pci_device_id const
Use DEVICE_TABLE macro which also makes table const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-02-11 15:32:18 +10:30
Stephen Hemminger
9350393239 virtio: make config_ops const
It is just a table of function pointers, make it const for cleanliness and security
reasons.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-02-11 15:32:17 +10:30
Ryota Ozaki
0d34cc2d6d virtio-mmio: fix wrong comment about register offset
The register offset of InterruptACK is 0x064, not 0x060.
Fix it.

Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
CC: Jiri Kosina <trivial@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-02-04 10:37:01 +10:30
Kees Cook
d72c5a8c8c drivers/virtio: remove depends on CONFIG_EXPERIMENTAL
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: Rusty Russell <rusty@rustcorp.com.au>
CC: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-11 11:39:04 -08:00
Greg Kroah-Hartman
8590dbc79a Drivers: virtio: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, and __devexit
from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:01 -08:00
Linus Torvalds
b7dfde956d Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.
 
 There's a slightly non-trivial merge in virtio-net, as we cleaned up the
 virtio add_buf interface while DaveM accepted the mq virtio-net patches.
 
 You can see my solution in my pending-rebases branch, if that helps, but I
 know you love merging:
 
 https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQz/vKAAoJENkgDmzRrbjx+eYQAK/egj9T8Nnth6mkzdbCFSO7
 Bciga2hDiudGCiGojTRGPRSc0VP9LgfvPbY2pxX+R9CfEqR+a8q/rRQhCS79ZwPB
 /mJy3HNiCx418HZxgwNtk6vPe0PjJm6SsjbXeB9hB+PQLCbdwA0BjpG6xjF/jitP
 noPqhhXreeQgYVxAKoFPvff/Byu2GlNnDdVMQxWRmo8hTKlTCzl0T/7BHRxthhJj
 iOrXTFzrT/osPT0zyqlngT03T4wlBvL2Bfw8d/kuRPEZ71dpIctWeH2KzdwXVCrz
 hFQGxAz4OWvW3xrNwj7c6O3SWj4VemUMjQqeA/PtRiOEI5gM0Y/Bit47dWL4wM/O
 OWUKFHzq4DFs8MmwXBgDDXl5xOjOBH9Ik4FZayn3Y7COT/B8CjFdOC2MdDGmZ9yd
 NInumg7FqP+u12g+9Vq8S/b0cfoQm4qFe8VHiPJu+jRmCZglyvLjk7oq/QwW8Gaq
 Pkzit1Ey0DWo2KvZ4D/nuXJCuhmzN/AJ10M48lLYZhtOIVg9gsa0xjhfgq4FnvSK
 xFCf3rcWnlGIXcOYh/hKU25WaCLzBuqMuSK35A72IujrQOL7OJTk4Oqote3Z3H9B
 08XJmyW6SOZdfw17X4Im1jbyuLek///xQJ9Jw/tya7j9lBt8zjJ+FmLPs4mLGEOm
 WJv9uZPs+QbIMNky2Lcb
 =myDR
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio update from Rusty Russell:
 "Some nice cleanups, and even a patch my wife did as a "live" demo for
  Latinoware 2012.

  There's a slightly non-trivial merge in virtio-net, as we cleaned up
  the virtio add_buf interface while DaveM accepted the mq virtio-net
  patches."

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
  virtio_console: Add support for remoteproc serial
  virtio_console: Merge struct buffer_token into struct port_buffer
  virtio: add drv_to_virtio to make code clearly
  virtio: use dev_to_virtio wrapper in virtio
  virtio-mmio: Fix irq parsing in command line parameter
  virtio_console: Free buffers from out-queue upon close
  virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
  virtio_console: Use kmalloc instead of kzalloc
  virtio_console: Free buffer if splice fails
  virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
  virtio: console: don't rely on virtqueue_add_buf() returning capacity.
  virtio_net: don't rely on virtqueue_add_buf() returning capacity.
  virtio-net: remove unused skb_vnet_hdr->num_sg field
  virtio-net: correct capacity math on ring full
  virtio: move queue_index and num_free fields into core struct virtqueue.
  ...
2012-12-20 08:37:05 -08:00
Wanlong Gao
9a2bdcc85d virtio: add drv_to_virtio to make code clearly
Add drv_to_virtio wrapper to get virtio_driver from device_driver.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-18 15:20:43 +10:30
Wanlong Gao
9bffdca8c6 virtio: use dev_to_virtio wrapper in virtio
Use dev_to_virtio wrapper in virtio to make code clearly.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-18 15:20:42 +10:30
Pawel Moll
40f9938c4c virtio-mmio: Fix irq parsing in command line parameter
When the resource_size_t is 64-bit long, the sscanf() on
the virtio device command line paramter string may return
wrong value because its format was defined as "%u". Fixed
by using an intermediate local value of a known length.

Also added cleaned up the resource creation and added extra
comments to make the parameters parsing easier to follow.

Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-18 15:20:41 +10:30
Joe Perches
800ba5eabf virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
dev_<level> calls take less code than dev_printk(KERN_<LEVEL>
and reducing object size is good.
Convert if (printk_ratelimit()) dev_printk to dev_<level>_ratelimited.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-18 15:20:40 +10:30
Rusty Russell
98e8c6bc66 virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
Now noone relies on this behavior, we simplify virtqueue_add_buf() so it
return 0 or -errno.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2012-12-18 15:20:34 +10:30
Rusty Russell
06ca287dba virtio: move queue_index and num_free fields into core struct virtqueue.
They're generic concepts, so hoist them.  This also avoids accessor
functions (though kept around for merge with DaveM's net tree).

This goes even further than Jason Wang's 17bb6d4088 patch
("virtio-ring: move queue_index to vring_virtqueue") which moved the
queue_index from the specific transport.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-18 15:20:31 +10:30
Wei Yongjun
1ce6853aa0 virtio-pci: use module_pci_driver to simplify the code
Use the module_pci_driver() macro to make the code simpler
by eliminating module_init and module_exit calls.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-18 15:20:30 +10:30
Rafael Aquini
e22504296d virtio_balloon: introduce migration primitives to balloon pages
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a guest,
thus imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.

Besides making balloon pages movable at allocation time and introducing
the necessary primitives to perform balloon page migration/compaction,
this patch also introduces the following locking scheme, in order to
enhance the syncronization methods for accessing elements of struct
virtio_balloon, thus providing protection against concurrent access
introduced by parallel memory migration threads.

 - balloon_lock (mutex) : synchronizes the access demand to elements of
                          struct virtio_balloon and its queue operations;

[yongjun_wei@trendmicro.com.cn: fix missing unlock on error in fill_balloon()]
[akpm@linux-foundation.org: avoid having multiple return points in fill_balloon()]
[akpm@linux-foundation.org: fix printk warning]Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
Cornelia Huck
237242bddc virtio: Don't access index after unregister.
Virtio wants to release used indices after the corresponding
virtio device has been unregistered. However, virtio does not
hold an extra reference, giving up its last reference with
device_unregister(), making accessing dev->index afterwards
invalid.

I actually saw problems when testing my (not-yet-merged)
virtio-ccw code:

- device_add virtio-net,id=xxx
-> creates device virtio<n> with n>0

- device_del xxx
-> deletes virtio<n>, but calls ida_simple_remove with an
   index of 0

- device_add virtio-net,id=xxx
-> tries to add virtio0, which is still in use...

So let's save the index we want to release before calling
device_unregister().

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Cc: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-11-09 14:54:24 +10:30
Will Deacon
b92b1b89a3 virtio: force vring descriptors to be allocated from lowmem
Virtio devices may attempt to add descriptors to a virtqueue from atomic
context using GFP_ATOMIC allocation. This is problematic because such
allocations can fall outside of the lowmem mapping, causing virt_to_phys
to report bogus physical addresses which are subsequently passed to
userspace via the buffers for the virtual device.

This patch masks out __GFP_HIGH and __GFP_HIGHMEM from the requested
flags when allocating descriptors for a virtqueue. If an atomic
allocation is requested and later fails, we will return -ENOSPC which
will be handled by the driver.

Cc: stable@kernel.org
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-22 18:19:49 +10:30
Brian Foley
d78b519f6b virtio_mmio: Don't attempt to create empty virtqueues
If a virtio device reports a QueueNumMax of 0, vring_new_virtqueue()
doesn't check this, and thanks to an unsigned (i < num - 1) loop
guard, scribbles over memory when initialising the free list.

Avoid by not trying to create zero-descriptor queues, as there's no
way to do any I/O with one.

Signed-off-by: Brian Foley <brian.foley@arm.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:16 +09:30
Brian Foley
3850d29fc4 virtio_mmio: fix off by one error allocating queue
vm_setup_vq fails to allow VirtQueues needing only 2 pages of
storage, as it should. Found with a kernel using 64kB pages, but
can be provoked if a virtio device reports QueueNumMax where the
descriptor table and available ring fit in one page, and the used
ring on the second (<= 227 descriptors with 4kB pages and <= 3640
with 64kB pages.)

Signed-off-by: Brian Foley <brian.foley@arm.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:16 +09:30
Peter Senna Tschudin
74a74b376c drivers/virtio/virtio_pci.c: fix error return code
Convert a nonnegative error return code to a negative one, as returned
elsewhere in the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}
// </smpl>

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:16 +09:30
Michael S. Tsirkin
5543a6ac31 virtio: don't crash when device is buggy
Because of a sanity check in virtio_dev_remove, a buggy device can crash
kernel.  And in case of rproc it's userspace so it's not a good idea.
We are unloading a driver so how bad can it be?
Be less aggressive in handling this error: if it's a driver bug,
warning once should be enough.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:16 +09:30
Rusty Russell
eccbb05a64 virtio: remove CONFIG_VIRTIO_RING
Everyone who selects VIRTIO is also made to select VIRTIO_RING; just make
them synonymous, since we removed the indirection layer some time ago.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:15 +09:30
Rusty Russell
387daf1716 virtio: add help to CONFIG_VIRTIO option.
Trying to enable a virtio driver (eg CONFIG_VIRTIO_BLK) is painful
because it depends on CONFIG_VIRTIO.  CONFIG_VIRTIO doesn't tell you
how to turn it on (it's selected from anything which provides a virtio
bus).

This patch at least adds some documentation, visible in menuconfig, as
a hint.

Reported-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:15 +09:30
Michael S. Tsirkin
6457f126c8 virtio: support reserved vqs
virtio network device multiqueue support reserves
vq 3 for future use (useful both for future extensions and to make it
pretty - this way receive vqs have even and transmit - odd numbers).
Make it possible to skip initialization for
specific vq numbers by specifying NULL for name.
Document this usage as well as (existing) NULL callback.

Drivers using this not coded up yet, so I simply tested
with virtio-pci and verified that this patch does
not break existing drivers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:15 +09:30
Jason Wang
75a0a52be3 virtio: introduce an API to set affinity for a virtqueue
Sometimes, virtio device need to configure irq affinity hint to maximize the
performance. Instead of just exposing the irq of a virtqueue, this patch
introduce an API to set the affinity for a virtqueue.

The api is best-effort, the affinity hint may not be set as expected due to
platform support, irq sharing or irq type. Currently, only pci method were
implemented and we set the affinity according to:

- if device uses INTX, we just ignore the request
- if device has per vq vector, we force the affinity hint
- if the virtqueues share MSI, make the affinity OR over all affinities
  requested

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 15:05:15 +09:30