Commit Graph

30542 Commits

Author SHA1 Message Date
Peter Maydell
97374ce538 Merge remote-tracking branch 'sstabellini/xen-170114' into staging
* sstabellini/xen-170114:
  xen_pt: Fix passthrough of device with ROM.
  xen_pt: Fix debug output.
  xenfb: map framebuffer read-only and handle unmap errors

Message-id: alpine.DEB.2.02.1401171537140.21510@kaball.uk.xensource.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-31 00:13:02 +00:00
Peter Maydell
8e02b35926 Net patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS5nEPAAoJEJykq7OBq3PI2XkH/1+f2mTJydxsg71571fOV+05
 IQl5dp14FZe+b3R4gRIEkU78JzsvhSnBdM8BQp4PwUhRIxz37tv3WfiAUVxOVMME
 U21/e+Oq0dFwBdCsx8G7ZRNP0xfWz8dvujo1cVZPWfFHAyBecgkcDQM1shHB4kGt
 PRrN3UfS6JOLQ6yBbXWSMMA7sWxZ7dlME7OPvUs3rql/mk20+xzpFJiyXVyMTEGA
 uu2l0iSpdwytisYTe2Pn8efwvuG1pIvNYmWIFcBjQIXfRTr0X9UGmGfgXTHlWERL
 91BR7wcKvQveMRYYbRdYvHLm/wcCc3o+MhS1My5TZItxZPLFIaZZyvf/Z4gvj9U=
 =xJOf
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'stefanha/tags/net-pull-request' into staging

Net patches

# gpg: Signature made Mon 27 Jan 2014 14:45:35 GMT using RSA key ID 81AB73C8
# gpg: Can't check signature: public key not found

* stefanha/tags/net-pull-request:
  tap-linux: Get features once and use it many times
  Fix lan9118 buffer length handling
  Fix lan9118 TX "CMD A" handling
  net: Use g_strdup_printf instead of snprintf.

Message-id: 1390834129-19625-1-git-send-email-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30 22:25:39 +00:00
Peter Maydell
dc08f85188 Merge remote-tracking branch 'rth/tcg-movbe' into staging
* rth/tcg-movbe:
  tcg/i386: cleanup useless #ifdef
  tcg/i386: use movbe instruction in qemu_ldst routines
  tcg/i386: add support for three-byte opcodes
  tcg/i386: remove hardcoded P_REXW value
  disas/i386.c: disassemble movbe instruction

Message-id: 1390692772-15282-1-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30 19:02:16 +00:00
Peter Maydell
0706f7c85b trivial-patches for 2014-01-16
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iJwEAAECAAYFAlLYFuEACgkQUlPFrXTwyDjuGAQAqa23FDOF4VLnNNAkL37Aje3V
 CyroMi1Dj3WcKcbD8mEZOgZV5B/W5h6+v11JJdTkCTdOuiXYM3O+y4l3jxGhxcS2
 yxzCOMP5NrM7D1WogeFogbjabTq1Lm83FLXCgPCWi2rDIhPLQ6RPaMNNPyDdHHih
 aSuZo8EfnJOUcephR0I=
 =HvPH
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'mjt/tags/trivial-patches-2014-01-16' into staging

trivial-patches for 2014-01-16

# gpg: Signature made Thu 16 Jan 2014 17:29:05 GMT using RSA key ID 74F0C838
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: E190 8639 3B10 B51B AC2C  8B73 5253 C5AD 74F0 C838

Message-id: 1389893719-16336-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30 13:56:00 +00:00
Alexander Graf
18d13fa293 TCG: Fix I64-on-32bit-host temporaries
We have cache pools of temporaries that we can reuse later when they've
already been allocated before.

These cache pools differenciate between the target TCG variable type they
contain. So we have one pool for I32 and one pool for I64 variables.

On a 32bit system, we can't work with 64bit registers though. So instead we
spawn two I32 temporaries for every I64 temporary we create. All caching
works the same way as on a real 64-bit system though: We create a cache entry
in the 64bit array for the first i32 index.

However, when we free such a temporary we free it to the pool of its type
(which is always i32 on 32bit systems) rather than its base_type (which is
i64 or i32 depending on the variable). This means we put a temporary that
is of base_type == i64 into the i32 preallocated temporary pool.

Eventually, this results in failures like this on 32bit hosts:

  qemu-system-ppc64: tcg/tcg.c:515: tcg_temp_new_internal: Assertion `ts->base_type == type' failed.

This patch makes the free routine use the base_type instead for the free case,
so it's consistent with the temporary allocation. It fixes the above failure
for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1390146811-59936-1-git-send-email-agraf@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30 13:25:28 +00:00
Kusanagi Kouichi
1f149e721f tap-linux: Get features once and use it many times
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-27 15:44:06 +01:00
Roy Franz
c444dfabfc Fix lan9118 buffer length handling
The 9118 ethernet controller supports transmission of multi-buffer packets
with arbitrary byte alignment of the start and end bytes.  All writes to
the packet fifo are 32 bits, so the controller discards bytes at the beginning
and end of each buffer based on the 'Data start offset' and 'Buffer size'
of the TX command 'A' format.

This patch uses the provided buffer length to limit the bytes transmitted.
Previously all the bytes of the last 32-bit word written to the TX fifo
were added to the internal transmit buffer structure resulting in more bytes
being transmitted than were submitted to the hardware in the command.  This
resulted in extra bytes being inserted into the middle of multi-buffer
packets when the non-final buffers had non-32bit aligned ending addresses.

Signed-off-by: Roy Franz <roy.franz@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-27 15:44:06 +01:00
Roy Franz
2ad657e3f3 Fix lan9118 TX "CMD A" handling
The 9118 ethernet controller supports transmission of multi-buffer packets
with arbitrary byte alignment of the start and end bytes.  All writes to
the packet fifo are 32 bits, so the controller discards bytes at the beginning
and end of each buffer based on the 'Data start offset' and 'Buffer size'
of the TX command 'A' format.

This patch changes the buffer size and offset internal state variables to be
updated on every "TX command A" write.  Previously they were only updated for
the first segment, which resulted incorrect behavior for packets with more
than one segment. Each segment of the packet has its own CMD A command, with
its own buffer size and start offset.

Also update extraction of fields from the CMD A word to use extract32().

Signed-off-by: Roy Franz <roy.franz@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-27 15:44:06 +01:00
Hani Benhabiles
4bf2c138dd net: Use g_strdup_printf instead of snprintf.
assign_name() in net/net.c is using snprintf + g_strdup to get the same
result as g_strdup_printf.

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-27 15:44:06 +01:00
Aurelien Jarno
2d23d5edb5 tcg/i386: cleanup useless #ifdef
TCG_TARGET_HAS_movcond_i32 is always defined to 1 in tcg-target.h, so
remove the corresponding #ifdef #endif sequence, left from a previous
refactoring.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 15:21:33 -08:00
Aurelien Jarno
085bb5bb64 tcg/i386: use movbe instruction in qemu_ldst routines
The movbe instruction has been added on some Intel Atom CPUs and on
recent Intel Haswell CPUs. It allows to load/store a value and at the
same time bswap it.

This patch detects the avaibility of this instruction and when available
use it in the qemu load/store routines in replacement of load/store +
bswap. Note that for 16-bit unsigned loads, movbe + movzw is basically the
same as movzw + bswap, so the patch doesn't touch this case.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[RTH: Reduced the number of conditionals using "movop".]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 15:19:19 -08:00
Aurelien Jarno
2a1137753f tcg/i386: add support for three-byte opcodes
Add support for three-byte opcodes, starting with the 0x0f 0x38 prefix.
Use P_EXT38 as the new constant, and shift all other constants so that
P_EXT and P_EXT38 have neighbouring values.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[RTH: Changed the name from P_EXT2 to P_EXT38.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 14:12:45 -08:00
Aurelien Jarno
c9d78213b8 tcg/i386: remove hardcoded P_REXW value
P_REXW is defined has a constant at the beginning of i386/tcg-target.c,
but the corresponding bit is later used in a harcoded way, which defeat
the purpose of a constant.

Fix that by using a conditional expression operator instead of a shift.
On x86 this actually makes the code slightly smaller as GCC does in
practice (opc >> 8) & 8 instead of (opc & 0x800) >> 8 so the constants
are smaller to load.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 14:12:38 -08:00
Aurelien Jarno
ba00599cc3 disas/i386.c: disassemble movbe instruction
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 14:12:17 -08:00
Anthony Liguori
0169c51155 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
  kvm: always update the MPX model specific register
  KVM: fix addr type for KVM_IOEVENTFD
  KVM: Retry KVM_CREATE_VM on EINTR
  mempath prefault: fix off-by-one error
  kvm: x86: Separately write feature control MSR on reset
  roms: Flush icache when writing roms to guest memory
  target-i386: clear guest TSC on reset
  target-i386: do not special case TSC writeback
  target-i386: Intel MPX

Conflicts:
	exec.c

aliguori: fix trivial merge conflict in exec.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:52:44 -08:00
Anthony Liguori
1c51e68b18 Merge remote-tracking branch 'otubo/seccomp' into staging
* otubo/seccomp:
  seccomp: add some basic shared memory syscalls to the whitelist
  seccomp: add mkdir() and fchmod() to the whitelist

Message-id: 1390231004-18392-1-git-send-email-otubo@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:52:16 -08:00
Anthony Liguori
7d64b2c2e2 Initial patch for QEMU GTK support on Windows
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJS3XsWAAoJEC+DHmz61iBpbSoH/05bF5kfogHUuqXdF/pGPB8H
 LmnaodaC0dnigKnEpwQrpiq5ZwkzRek4J35txrRKZcD+M5awjEnZEUJ9fP1sDTMi
 rxgewWIbUnISVtvw9pxz4CJumH3nRaVz7wmtcMd+4LNEuRhgwszfedDAQRLUKCmB
 xViGQeuauqUm0m12qx4F7rE7Dq5ru8uOl8Zf0u9QPpIreLmN8f/VZzE5NzApw0YV
 EZKpMsOPWs6mBUO978yjOH/wjncBoJp/dE2wkZm/LzCQtiNGMxTBlU7qvrxaID/O
 dVHtQGCRarStM3lkJhq3to3EXS9PtuLJ0zeWwAgncdGG+tw0becNfBEA4XLkudY=
 =CmlU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'sweil/tags/for_anthony' into staging

Initial patch for QEMU GTK support on Windows

# gpg: Signature made Mon 20 Jan 2014 11:37:58 AM PST using RSA key ID FAD62069
# gpg: Can't check signature: public key not found

* sweil/tags/for_anthony:
  gtk: Support keyboard translation for hosts running Windows

Message-id: 1390246909-18757-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:52:09 -08:00
Anthony Liguori
14ac4febb2 hda-codec: disable streams on reset
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3kkoAAoJEEy22O7T6HE4WJ4QAJYlYsrVThAi9bnpFB7qLve+
 uP8MWwK+Wxc+O2CrN38BOoIyb273xk+U6Vj1FEfh9C1rN2dcZ4fBsUZfqqGEOYpM
 LNgFWzTUwIXDHnxVVc8Dyp+DyPbRd4CE+ii/ukY7Xq1QiHJmcdxweMPQnIZQl9UZ
 4wTWWubK1u/We+J87udQPfE3ZT2+5ndcbKM2FcCW3j8yBLvZav6/DNL6alr5eJ2L
 WGB8mOyCNu5hjaOWP1mya8cRHO6Z5IbFXqKNGU2m78C9WBO3LaZY7IL+s6mPvTzq
 0SaOk/XfK4XCwpgbvfFQ8lnJ5tdx4dagjGpwZOHPE2+Sy1l/XZztqr+caju/maqF
 PnRrosGUvYnVvPxPJKpN02mbohBizGUwL0nbal5Knx+2quQ+5sHnTJb3a3nonfuC
 /pfqvOD4thb3L7994RKu2bKfTq6TK0/n64JanmXADmEKebb3bb5AijFTModcMd8z
 uW4NejmpqsNYKXWUQ3WUktZWCEaNE7CbxTuxgr7DTSY0b2gl82cLiDS0ZAiJ4zGd
 ddwsvkEadIXHWbsdMMM3kCQQvlO/gVNYB7xozUPqOJdghlTSQAtkMi8ReR+Hxqq+
 PVS7dM276sPjRdqqaJSdk3JiE10NzUSKyBiJZtCZ5dqqIuHPlaHMgx77MzBwlVNK
 PkkqSNr9UpmwHu8WVGhC
 =yVv9
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kraxel/tags/pull-audio-2' into staging

hda-codec: disable streams on reset

# gpg: Signature made Tue 21 Jan 2014 02:17:12 AM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found

* kraxel/tags/pull-audio-2:
  hda-codec: disable streams on reset

Message-id: 1390299589-5082-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:51:39 -08:00
Anthony Liguori
f4b27793a8 usb core+hid: add support for microsoft os descriptors
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3kopAAoJEEy22O7T6HE4PjkP/0NZqOSU3e9GmgeBfEpvgFfP
 4axBVEycXJtGc9nNfahmEIyUkk1+DMa8oMHnVSoVzelrdm2VNHs6qk9yC9EPa9YG
 HTXZuLvHbuIVL/NXY5b+loh4fOjbOPWEthdUd+RpNfH545Azoi2T/cYgtNz8BTyf
 eO3vCZy69NJjcsuk0zlRhEEAhnhKb+jkYgzARecL6JrAC7uuDrIhdz9peT1Du8k8
 lUnCLHFXrDaZUprkLrgvB+TTHJw9IgRq1g0gV20Gu+hDzBOJsG3cKXAl4X2b3lzO
 Ih+jcORu4ubURvOJSqTpcnObsQWKJHDR3sPeMAVD2ZzCd82ZOeH/+cEhfVBp+dZX
 qV2XbFU6yEiQBbzIuZEbsE4XWQVEAsfa8MaP/SdodJ4WueW21Rq8KDlnWzalPfDc
 T0hofDyLLcpkr2O3AM7OfBfLa37B4joV5FnuinvOo5r6byMMNgAYp7gbDVeIzEXJ
 Cs8N6KF0m8wX0KNyFms49VMTQnKm0MbuTYFbDBkW5jnxKHZ1ZLXdIP1HvYCvWp3w
 or0hxGze6Mrv8Aya9HUu7K0BEEbz8OLhVISkpOi35PGii2PcSm3lYHHhsohRYOcW
 EHilDyCn3Q20RkwcHi6YMy43T3eSJJnYwN4VWEb4lnDT5sjh/ehthxdtjCQHt8+f
 O1cBVvdDyh/BZt6SSuy8
 =sFbE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kraxel/tags/pull-usb-2' into staging

usb core+hid: add support for microsoft os descriptors

# gpg: Signature made Tue 21 Jan 2014 02:21:29 AM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found

* kraxel/tags/pull-usb-2:
  usb-hid: add microsoft os descriptor support
  usb: add support for microsoft os descriptors

Message-id: 1390299772-5368-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:51:23 -08:00
Anthony Liguori
e9f526ab7b Merge remote-tracking branch 'bonzini/scsi-next' into staging
* bonzini/scsi-next:
  scsi: Support TEST UNIT READY in the dummy LUN0
  block: add .bdrv_reopen_prepare() stub for iscsi
  virtio-scsi: Prevent assertion on missed events
  virtio-scsi: Cleanup of I/Os that never started
  scsi: Assign cancel_io vector for scsi_disk_emulate_ops

Conflicts:
	block/iscsi.c

aliguori: resolve trivial merge conflict in block/iscsi.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:50:14 -08:00
Anthony Liguori
0d688cf7d8 Block patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJS4peVAAoJEH8JsnLIjy/WIfcP/0fezyrowrBSMNpqpbOI9bpy
 mjbo7tI1ZdvR2GlAsxZ8ollMNYtC611MHbnq48ywhRHJPMBPN9CYPJqwGtyzpA00
 f3gw2Lct9YMPRu0wiRclFaSSdPXVohqdL16AFRcsz8FVTD8MQkgAMu0q4+dHhs6p
 G2BvIyo3fjuS7jLJuXUmyyB4cOHI9/gw+B/ANF1BFWl0llcb5jLVxy44NGLwqovH
 Nxe4GbnMb6QYUhwibzGwMnImm8nrLHbocv0IMyX4zqBLap7BfQUJ9zs4kPUnd1rB
 /sB5dBKj5r7l8cuOyCZRkGo4gc1FyddjWlU5UpcqBuk92PXavsRM74C6H42CS3y1
 4pcEu0fAYBdfEaDAINUdbfT1eyMVjiNiFZT1NAgcz5JkTnuG3dThaDAKQEx7FCuq
 iP2n1nlpwSs9mzeifnZH6INyyEjoJzvBh3cIdBi3yEIJ4OVSLyrxi2djTEnRvAeh
 YXyMuJlscjelJpwsMCsWnkKp4vAuiXrefdQbKqZ9HwPD5oJmWi4PNvRndBiVj+Na
 CJfZROwAej3sAusUHbCB5pt5v+Fdb64bmTO5ph42ChnKIG9T2Piz2IQJPPTeVe9h
 cdtkd3tIV5xB6iXjy2xb4Y6y6Xqs1qEs91dyjQE1GszeBcairutwUmI0oYwHY5eJ
 tVyL8ptpNvfj9vKnDvE+
 =wLul
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging

Block patches

# gpg: Signature made Fri 24 Jan 2014 08:40:53 AM PST using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found

* kwolf/tags/for-anthony: (93 commits)
  block: Switch bdrv_io_limits_intercept() to byte granularity
  qemu-iotests: Test pwritev RMW logic
  qemu-io: New command 'sleep'
  blkdebug: Make required alignment configurable
  iscsi: Set bs->request_alignment
  block: Make bdrv_pwrite() a bdrv_prwv_co() wrapper
  block: Make bdrv_pread() a bdrv_prwv_co() wrapper
  block: Change coroutine wrapper to byte granularity
  block: Assert serialisation assumptions in pwritev
  block: Align requests in bdrv_co_do_pwritev()
  block: Allow wait_serialising_requests() at any point
  block: Make overlap range for serialisation dynamic
  block: Generalise and optimise COR serialisation
  block: Make zero-after-EOF work with larger alignment
  block: Allow waiting for overlapping requests between begin/end
  block: Switch BdrvTrackedRequest to byte granularity
  block: Introduce bdrv_co_do_pwritev()
  block: write: Handle COR dependency after I/O throttling
  block: Introduce bdrv_aligned_pwritev()
  block: Introduce bdrv_co_do_preadv()
  ...

Message-id: 1390584136-24703-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:43:30 -08:00
Kevin Wolf
d5103588aa block: Switch bdrv_io_limits_intercept() to byte granularity
Request sizes used to be rounded down to the next sector boundary,
allowing to bypass the I/O limit. Now all requests are accounted for
with their exact byte size.

Reported-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:28 +01:00
Kevin Wolf
9e1cb96d9a qemu-iotests: Test pwritev RMW logic
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:25 +01:00
Kevin Wolf
cd33d02a10 qemu-io: New command 'sleep'
There is no easy way to check that a request correctly waits for a
different request. With a sleep command we can at least approximate it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
b35ee7fb23 blkdebug: Make required alignment configurable
The new 'align' option of blkdebug can be used in order to emulate
backends with a required 4k alignment on hosts which only really require
512 byte alignment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 17:40:03 +01:00
Paolo Bonzini
2c9880c45e iscsi: Set bs->request_alignment
The iSCSI backend already gets the block size from the READ CAPACITY
command it sends.  Save it so that the generic block layer gets it
too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
8407d5d7e2 block: Make bdrv_pwrite() a bdrv_prwv_co() wrapper
Instead of implementing the alignment adjustment here, use the now
existing functionality of bdrv_co_do_pwritev().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
a3ef657185 block: Make bdrv_pread() a bdrv_prwv_co() wrapper
Instead of implementing the alignment adjustment here, use the now
existing functionality of bdrv_co_do_preadv().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
775aa8b6e0 block: Change coroutine wrapper to byte granularity
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
28de2dcd88 block: Assert serialisation assumptions in pwritev
If a request calls wait_serialising_requests() and actually has to wait
in this function (i.e. a coroutine yield), other requests can run and
previously read data (like the head or tail buffer) could become
outdated. In this case, we would have to restart from the beginning to
read in the updated data.

However, we're lucky and don't actually need to do that: A request can
only wait in the first call of wait_serialising_requests() because we
mark it as serialising before that call, so any later requests would
wait. So as we don't wait in practice, we don't have to reload the data.

This is an important assumption that may not be broken or data
corruption will happen. Document it with some assertions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
3b8242e0ea block: Align requests in bdrv_co_do_pwritev()
This patch changes bdrv_co_do_pwritev() to actually be what its name
promises. If requests aren't properly aligned, it performs a RMW.

Requests touching the same block are serialised against the RMW request.
Further optimisation of this is possible by differentiating types of
requests (concurrent reads should actually be okay here).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
6460440f34 block: Allow wait_serialising_requests() at any point
We can only have a single wait_serialising_requests() call per request
because otherwise we can run into deadlocks where requests are waiting
for each other. The same is true when wait_serialising_requests() is not
at the very beginning of a request, so that other requests can be issued
between the start of the tracking and wait_serialising_requests().

Fix this by changing wait_serialising_requests() to ignore requests that
are already (directly or indirectly) waiting for the calling request.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
7327145f63 block: Make overlap range for serialisation dynamic
Copy on Read wants to serialise with all requests touching the same
cluster, so wait_serialising_requests() rounded to cluster boundaries.
Other users like alignment RMW will have different requirements, though
(requests touching the same sector), so make it dynamic.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
2dbafdc012 block: Generalise and optimise COR serialisation
Change the API so that specific requests can be marked serialising. Only
these requests are checked for overlaps then.

This means that during a Copy on Read operation, not all requests
overlapping other requests are serialised any more, but only those that
actually overlap with the specific COR request.

Also remove COR from function and variable names because this
functionality can be useful in other contexts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
ec746e10cb block: Make zero-after-EOF work with larger alignment
Odd file sizes could make bdrv_aligned_preadv() shorten the request in
non-aligned ways. Fix it by rounding to the required alignment instead
of 512 bytes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
65afd211c7 block: Allow waiting for overlapping requests between begin/end
Previously, it was not possible to use wait_for_overlapping_requests()
between tracked_request_begin()/end() because it would wait for itself.

Ignore the current request in the overlap check and run more of the
bdrv_co_do_preadv/pwritev code with a BdrvTrackedRequest present.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
793ed47a7a block: Switch BdrvTrackedRequest to byte granularity
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
6601553e27 block: Introduce bdrv_co_do_pwritev()
This is going to become the bdrv_co_do_preadv() equivalent for writes.
In this patch, however, just a function taking byte offsets is created,
it doesn't align anything yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
244eadef5c block: write: Handle COR dependency after I/O throttling
First waiting for all COR requests to complete and calling the
throttling function afterwards means that the request could be delayed
and we still need to wait for the COR request even if it was issued only
after the throttled write request.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
b404f72036 block: Introduce bdrv_aligned_pwritev()
This separates the part of bdrv_co_do_writev() that needs to happen
before the request is modified to match the backend alignment, and a
part that needs to be executed afterwards and passes the request to the
BlockDriver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
1b0288ae7f block: Introduce bdrv_co_do_preadv()
Similar to bdrv_pread(), which aligns byte-aligned request to 512 byte
sectors, bdrv_co_do_preadv() takes a byte-aligned request and aligns it
to the alignment specified in bs->request_alignment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Kevin Wolf
d0c7f642f5 block: Introduce bdrv_aligned_preadv()
This separates the part of bdrv_co_do_readv() that needs to happen
before the request is modified to match the backend alignment, and a
part that needs to be executed afterwards and passes the request to the
BlockDriver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:02 +01:00
Paolo Bonzini
c25f53b06e raw: Probe required direct I/O alignment
Add a bs->request_alignment field that contains the required
offset/length alignment for I/O requests and fill it in the raw block
drivers. Use ioctls if possible, else see what alignment it takes for
O_DIRECT to succeed.

While at it, also expose the memory alignment requirements, which may be
(and in practice are) different from the disk alignment requirements.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:02 +01:00
Paolo Bonzini
1b7fd72955 block: rename buffer_alignment to guest_block_size
The alignment field is now set to the value that is promised to the
guest, rather than required by the host.  The next patches will make
QEMU aware of the host-provided values, so make this clear.

The alignment is also not about memory buffers, but about the sectors on
the disk, change the documentation of the field.

At this point, the field is set by the device emulation, but completely
ignored by the block layer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Kevin Wolf
339064d506 block: Don't use guest sector size for qemu_blockalign()
bs->buffer_alignment is set by the device emulation and contains the
logical block size of the guest device. This isn't something that the
block layer should know, and even less something to use for determining
the right alignment of buffers to be used for the host.

The new BlockLimits field opt_mem_alignment tells the qemu block layer
the optimal alignment to be used so that no bounce buffer must be used
in the driver.

This patch may change the buffer alignment from 4k to 512 for all
callers that used qemu_blockalign() with the top-level image format
BlockDriverState. The value was never propagated to other levels in the
tree, so in particular raw-posix never required anything else than 512.

While on disks with 4k sectors direct I/O requires a 4k alignment,
memory may still be okay when aligned to 512 byte boundaries. This is
what must have happened in practice, because otherwise this would
already have failed earlier. Therefore I don't expect regressions even
with this intermediate state. Later, raw-posix can implement the hook
and expose a different memory alignment requirement.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:01 +01:00
Kevin Wolf
1ff735bdc4 block: Detect unaligned length in bdrv_qiov_is_aligned()
For an O_DIRECT request to succeed, it's not only necessary that all
base addresses in the qiov are aligned, but also that each length in it
is aligned.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:01 +01:00
Kevin Wolf
e5354657a6 qemu_memalign: Allow small alignments
The functions used by qemu_memalign() require an alignment that is at
least sizeof(void*). Adjust it if it is too small.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Kevin Wolf
355ef4ac95 block: Update BlockLimits when they might have changed
When reopening with different flags, or when backing files disappear
from the chain, the limits may change. Make sure they get updated in
these cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Kevin Wolf
466ad822de block: Inherit opt_transfer_length
When there is a format driver between the backend, it's not guaranteed
that exposing the opt_transfer_length for the format driver results in
the optimal requests (because of fragmentation etc.), but it can't make
things worse, so let's just do it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Kevin Wolf
d34682cd4a block: Move initialisation of BlockLimits to bdrv_refresh_limits()
This function separates filling the BlockLimits from bdrv_open(), which
allows it to call it from other operations which may change the limits
(e.g. modifications to the backing file chain or bdrv_reopen)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00