The descriptor table (also known as "queue") consists of descriptors. (The
corresponding type in the code is VRING_DESC.)
An individual descriptor describes a contiguous buffer, to be transferred
uni-directionally between host and guest.
Several descriptors in the descriptor table can be linked into a
descriptor chain, specifying a bi-directional scatter-gather transfer
between host and guest. Such a descriptor chain is also known as "virtio
request".
(The descriptor table can host several descriptor chains (in-flight virtio
requests) in parallel, but the OVMF driver supports at most one chain, at
any point in time.)
The first descriptor in any descriptor chain is called "head descriptor".
In order to submit a number of parallel requests (= a set of independent
descriptor chains) from the guest to the host, the guest must put *only*
the head descriptor of each separate chain onto the Available Ring.
VirtioLib currently places the head of its one descriptor chain onto the
Available Ring repeatedly, once for each single (head *or* dependent)
descriptor in said descriptor chain. If the descriptor chain comprises N
descriptors, this error amounts to submitting the same entire chain N
times in parallel.
    Available Ring       Descriptor table
      Ptr to head ----> Desc#0 (head of chain)
      Ptr to head --/   Desc#1 (next in same chain)
      ...          /    ...
      Ptr to head /     Desc#(N-1) (last in same chain)
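
The submission rule (only the head of each chain goes onto the Available
Ring) can be sketched in plain C as follows. This is a simplified model
and not VirtioLib's code: MODEL_RING, ModelAppendDesc(), ModelPublishHead()
and ModelSubmitRead() are hypothetical names, QUEUE_SIZE is arbitrary, and
memory barriers and host notification are left out. The two flag values
are the ones defined by the virtio specification.

```c
#include <stddef.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT  1u  /* another descriptor follows in the chain */
#define VRING_DESC_F_WRITE 2u  /* the host writes into this buffer */
#define QUEUE_SIZE         8u  /* hypothetical queue size for this sketch */

typedef struct {
  uint64_t Addr;   /* guest-physical address of the buffer */
  uint32_t Len;    /* buffer length in bytes */
  uint16_t Flags;  /* VRING_DESC_F_* bits */
  uint16_t Next;   /* index of the next descriptor, if F_NEXT is set */
} MODEL_DESC;

typedef struct {
  MODEL_DESC Desc[QUEUE_SIZE];       /* descriptor table */
  uint16_t   AvailRing[QUEUE_SIZE];  /* Available Ring: head indices only */
  uint16_t   AvailIdx;               /* number of heads published so far */
  uint16_t   NextDesc;               /* next free descriptor slot */
} MODEL_RING;

/* Append one descriptor to the chain being built. No ring entry is made. */
static void ModelAppendDesc(MODEL_RING *Ring, uint64_t Addr, uint32_t Len,
                            uint16_t Flags)
{
  MODEL_DESC *D = &Ring->Desc[Ring->NextDesc % QUEUE_SIZE];

  D->Addr  = Addr;
  D->Len   = Len;
  D->Flags = Flags;
  /* Next is ignored by the host when F_NEXT is clear. */
  D->Next  = (uint16_t)((Ring->NextDesc + 1u) % QUEUE_SIZE);
  Ring->NextDesc++;
}

/* Publish a chain: exactly one Available Ring entry, pointing at its head. */
static void ModelPublishHead(MODEL_RING *Ring, uint16_t Head)
{
  Ring->AvailRing[Ring->AvailIdx % QUEUE_SIZE] = Head;
  Ring->AvailIdx++;
}

/* Build a three-descriptor read request and publish it once. */
static uint16_t ModelSubmitRead(MODEL_RING *Ring, uint64_t HeaderAddr,
                                uint64_t PayloadAddr, uint32_t PayloadLen,
                                uint64_t StatusAddr)
{
  uint16_t Head = Ring->NextDesc;

  ModelAppendDesc(Ring, HeaderAddr, 16, VRING_DESC_F_NEXT);
  ModelAppendDesc(Ring, PayloadAddr, PayloadLen,
                  VRING_DESC_F_NEXT | VRING_DESC_F_WRITE);
  ModelAppendDesc(Ring, StatusAddr, 1, VRING_DESC_F_WRITE);

  ModelPublishHead(Ring, Head);  /* once per chain, not once per descriptor */
  return Ring->AvailIdx;
}
```

In this model, calling ModelPublishHead() from inside ModelAppendDesc()
would reproduce the error described above: three ring entries for one
chain.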
Anatomy of a single virtio-blk READ request (a descriptor chain with three
descriptors):
virtio-blk request header, prepared by guest:
VirtioAppendDesc PhysAddr=3FBC6050 Size=16 Flags=1 Head=1232 Next=1232
payload to be filled in by host:
VirtioAppendDesc PhysAddr=3B934C00 Size=32768 Flags=3 Head=1232 Next=1233
host status, to be filled in by host:
VirtioAppendDesc PhysAddr=3FBC604F Size=1 Flags=2 Head=1232 Next=1234
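
The Flags values in the three log lines above are bit masks. A minimal
sketch of how they decode, assuming the VRING_DESC_F_NEXT (1) and
VRING_DESC_F_WRITE (2) bits from the virtio specification; the helper
names are hypothetical and exist only for this illustration:

```c
#include <stdint.h>
#include <stdio.h>

#define VRING_DESC_F_NEXT  1u  /* chain continues with another descriptor */
#define VRING_DESC_F_WRITE 2u  /* buffer is written by the host */

/* Nonzero if another descriptor follows this one in the chain. */
static int FlagsHasNext(uint16_t Flags)
{
  return (Flags & VRING_DESC_F_NEXT) != 0;
}

/* Nonzero if the host fills the buffer (guest only reads it afterwards). */
static int FlagsHostWrites(uint16_t Flags)
{
  return (Flags & VRING_DESC_F_WRITE) != 0;
}

/* Print one line per descriptor role, for comparison with the log. */
static void PrintDescFlags(const char *Role, uint16_t Flags)
{
  printf("%-8s Flags=%u next=%d host-writes=%d\n", Role, (unsigned)Flags,
         FlagsHasNext(Flags), FlagsHostWrites(Flags));
}
```

Read this way, the header (Flags=1) is guest-prepared and followed by
another descriptor, the payload (Flags=3) is host-written and followed by
another descriptor, and the status byte (Flags=2) is host-written and ends
the chain.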
Processing on the host side -- the descriptor chain is processed three
times in parallel (its head is available to virtqueue_pop() thrice); the
same chain is submitted/collected separately to/from AIO three times:
virtio_queue_notify vdev VDEV vq VQ#0
virtqueue_pop vq VQ#0 elem EL#0 in_num 2 out_num 1
bdrv_aio_readv bs BDRV sector_num 585792 nb_sectors 64 opaque REQ#0
virtqueue_pop vq VQ#0 elem EL#1 in_num 2 out_num 1
bdrv_aio_readv bs BDRV sector_num 585792 nb_sectors 64 opaque REQ#1
virtqueue_pop vq VQ#0 elem EL#2 in_num 2 out_num 1
bdrv_aio_readv bs BDRV sector_num 585792 nb_sectors 64 opaque REQ#2
virtio_blk_rw_complete req REQ#0 ret 0
virtio_blk_req_complete req REQ#0 status 0
virtio_blk_rw_complete req REQ#1 ret 0
virtio_blk_req_complete req REQ#1 status 0
virtio_blk_rw_complete req REQ#2 ret 0
virtio_blk_req_complete req REQ#2 status 0
On my ThinkPad T510 laptop with RHEL-6 as host, this probably leads to
simultaneous DMA transfers targeting the same RAM area. Even though every
transfer reads the same, never-rewritten LBA, the data in the destination
buffer is corrupted: the CRC32 calculated over the buffer varies.
SynchronousRequest Lba=585792 BufSiz=32768 ReqIsWrite=0 Crc32=BF68A44D
The problem is invisible on my HP Z400 workstation.
Fix the request submission by:
- always building the single descriptor chain supported by VirtioLib at
  the beginning of the descriptor table,
- ensuring the head descriptor of this chain is put on the Available Ring
only once,
- requesting the virtio spec's language to be cleaned up
<http://lists.linuxfoundation.org/pipermail/virtualization/2013-April/024032.html>.
    Available Ring       Descriptor table
      Ptr to head ----> Desc#0 (head of chain)
                        Desc#1 (next in same chain)
                        ...
                        Desc#(N-1) (last in same chain)
VirtioAppendDesc PhysAddr=3FBC6040 Size=16 Flags=1 Head=0 Next=0
VirtioAppendDesc PhysAddr=3B934C00 Size=32768 Flags=3 Head=0 Next=1
VirtioAppendDesc PhysAddr=3FBC603F Size=1 Flags=2 Head=0 Next=2
virtio_queue_notify vdev VDEV vq VQ#0
virtqueue_pop vq VQ#0 elem EL#0 in_num 2 out_num 1
bdrv_aio_readv bs BDRV sector_num 585792 nb_sectors 64 opaque REQ#0
virtio_blk_rw_complete req REQ#0 ret 0
virtio_blk_req_complete req REQ#0 status 0
SynchronousRequest Lba=585792 BufSiz=32768 ReqIsWrite=0 Crc32=1EEB2B07
(The Crc32 was double-checked with edk2's and Linux's guest IDE driver.)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@14356 6f19259b-4bc3-4df7-8a09-765794883524
We're supposed to zero everything in the kernel bootparams that we don't
explicitly initialise, other than the setup_header from 0x1f1 onwards
for a precisely defined length, which is copied from the bzImage.
We're *not* supposed to just pass the garbage that we happened to find
in the bzImage file surrounding the setup_header.
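
A minimal sketch of the intended preparation, in plain C. It assumes the
rule from the Linux x86 boot protocol that the setup_header ends at
0x202 plus the byte stored at offset 0x201, and a 4096-byte boot
parameter page. PrepareBootParams() and the macro names are hypothetical
and do not mirror the actual OVMF loader code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOT_PARAMS_SIZE   4096u   /* assumed size of the boot parameter page */
#define SETUP_HEADER_START 0x1f1u  /* setup_header offset, same in both blobs */

/*
 * Zero the whole boot parameter page, then copy only the setup_header
 * from the bzImage. Returns 0 on success, -1 if the image is too short.
 */
static int PrepareBootParams(uint8_t *BootParams, const uint8_t *BzImage,
                             size_t BzImageSize)
{
  size_t End;

  if (BzImageSize < 0x202u) {
    return -1;  /* cannot even read the length byte at 0x201 */
  }

  /* End of setup_header: 0x202 + byte at 0x201 (assumption, see above). */
  End = 0x202u + BzImage[0x201];
  if (End > BzImageSize || End > BOOT_PARAMS_SIZE) {
    return -1;
  }

  /* Everything not explicitly initialised must be zero. */
  memset(BootParams, 0, BOOT_PARAMS_SIZE);

  /* Copy the header only, not the bytes surrounding it in the file. */
  memcpy(BootParams + SETUP_HEADER_START, BzImage + SETUP_HEADER_START,
         End - SETUP_HEADER_START);
  return 0;
}
```

The caller would then fill in the fields it initialises explicitly on top
of the zeroed page.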
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@14052 6f19259b-4bc3-4df7-8a09-765794883524
AppendDesc() should have a prefix implying its containing library,
VirtioLib. Update its sole client VirtioBlkDxe.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@13843 6f19259b-4bc3-4df7-8a09-765794883524
Introduce a new library called VirtioLib, for now only collecting the
following reusable functions with as few changes as possible:
- VirtioWrite()
- VirtioRead()
- VirtioRingInit()
- VirtioRingUninit()
- AppendDesc()
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@13842 6f19259b-4bc3-4df7-8a09-765794883524
Tested with the "bootorder" fw_cfg file. Example contents (leading space
added and line terminators transcribed for readability):
/pci@i0cf8/ide@1,1/drive@0/disk@0<LF>
/pci@i0cf8/ide@1,1/drive@1/disk@0<LF>
/pci@i0cf8/ethernet@3/ethernet-phy@0<NUL>
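
The layout shown above (entries separated by LF, the whole blob terminated
by NUL) can be split with a few lines of C. This is only an illustrative
sketch of the format, not the parser used in the firmware;
SplitBootOrder() is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a "bootorder" blob in place: every LF is replaced by NUL and the
 * start of each non-empty entry is stored in Entries[]. Scanning stops at
 * the first NUL found in the input. An entry that is not followed by LF
 * or NUL within Size bytes is ignored. Returns the number of entries
 * stored, at most MaxEntries.
 */
static size_t SplitBootOrder(char *Buf, size_t Size, char **Entries,
                             size_t MaxEntries)
{
  size_t Count = 0;
  size_t Start = 0;
  size_t i;

  for (i = 0; i < Size; i++) {
    if (Buf[i] == '\n' || Buf[i] == '\0') {
      int IsEnd = (Buf[i] == '\0');

      Buf[i] = '\0';  /* terminate the current entry */
      if (i > Start && Count < MaxEntries) {
        Entries[Count++] = &Buf[Start];
      }
      Start = i + 1;
      if (IsEnd) {
        break;        /* the original NUL ends the blob */
      }
    }
  }
  return Count;
}
```

Applied to the example contents, this yields three entries, in boot order.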
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@13549 6f19259b-4bc3-4df7-8a09-765794883524
This library provides an interface for converting the system
variables into a binary blob and for restoring the system variables
from that blob.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@11284 6f19259b-4bc3-4df7-8a09-765794883524
Note:
* This only works before ExitBootServices.
* For OVMF, variables are only preserved on the disk if there
  is a hard disk connected which has a writeable FAT file system.
The Ovmf/Library/EmuVariableFvbLib library will look for the
gUefiOvmfPkgTokenSpaceGuid.PcdEmuVariableEvent PCD to be set to
a non-zero value. If set, it is treated as an event handle, and
each write to the EmuVariableFvb will cause the event to be
signaled.
In this change, the OVMF platform BDS library sets up this event,
and sets the PCD so that after each write to the EMU Variable FVB,
the non-volatile variables will be saved out to the file system.
The end result is that NV variables that are written prior to the
ExitBootServices call should be preserved by storing them on the
disk.
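
The mechanism can be modelled in plain C as follows. This is not UEFI
code: the event is reduced to a callback plus context, the PCD to a
global integer, and every Model* / mModel* name is hypothetical. It only
illustrates the "non-zero PCD is an event handle, signal it after each
write" idea described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for an event: a notification function and its context. */
typedef void (*MODEL_NOTIFY)(void *Context);

typedef struct {
  MODEL_NOTIFY Notify;
  void         *Context;
} MODEL_EVENT;

/* Stand-in for PcdEmuVariableEvent: 0 means "no event registered". */
static uint64_t mModelPcdEmuVariableEvent;

/* Stand-in for the emulated variable store. */
static uint8_t mModelFvb[64];

/* Write to the store, then signal the event if the "PCD" is non-zero. */
static int ModelFvbWrite(size_t Offset, const void *Data, size_t Size)
{
  if (Offset > sizeof mModelFvb || Size > sizeof mModelFvb - Offset) {
    return -1;  /* write would fall outside the store */
  }
  memcpy(mModelFvb + Offset, Data, Size);

  if (mModelPcdEmuVariableEvent != 0) {
    MODEL_EVENT *Event = (MODEL_EVENT *)(uintptr_t)mModelPcdEmuVariableEvent;

    Event->Notify(Event->Context);
  }
  return 0;
}

/* Stand-in for the BDS-side handler that would save NV variables to disk. */
static void ModelSaveToDisk(void *Context)
{
  (*(unsigned *)Context)++;  /* here: only count the save requests */
}
```

The platform side would create a MODEL_EVENT whose handler is
ModelSaveToDisk() and store its address in mModelPcdEmuVariableEvent;
from then on each successful write triggers one save.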
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@9318 6f19259b-4bc3-4df7-8a09-765794883524
This library provides an interface where variables can be saved and restored
using a file in a file system accessible to the firmware. It is expected
that a platform BDS library will use this library. The platform BDS
implementation can decide which devices to connect and which of them to
try to use for saving and restoring NV variables.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@9272 6f19259b-4bc3-4df7-8a09-765794883524