In commit 49edde1523 ("OvmfPkg/PlatformPei: set 32-bit UC area at
PciBase / PciExBarBase (pc/q35)", 2019-06-03), I forgot to update the
comment. Do it now.
Fixes: 49edde1523
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
qemu uses the etc/e820 fw_cfg file not only for memory, but
also for reservations. Handle reservations by adding resource
descriptor hobs for them.
A typical qemu configuration has a small reservation between
lapic and flash:
# sudo cat /proc/iomem
[ ... ]
fee00000-fee00fff : Local APIC
feffc000-feffffff : Reserved <= HERE
ffc00000-ffffffff : Reserved
[ ... ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=4172
TDVF once accepts memory only by BSP. To improve the boot performance
this patch introduce the multi-core accpet memory. Multi-core means
BSP and APs work together to accept memory.
TDVF leverages mailbox to wake up APs. It is not enabled in MpInitLib
(Which requires SIPI). So multi-core accept memory cannot leverages
MpInitLib to coordinate BSP and APs to work together.
So TDVF split the accept memory into 2 phases.
- AcceptMemoryForAPsStack:
BSP accepts a small piece of memory which is then used by APs to setup
stack. We assign a 16KB stack for each AP. So a td-guest with 256 vCPU
requires 255*16KB = 4080KB.
- AcceptMemory:
After above small piece of memory is accepted, BSP commands APs to
accept memory by sending AcceptPages command in td-mailbox. Together
with the command and accpet-function, the APsStack address is send
as well. APs then set the stack and jump to accept-function to accept
memory.
AcceptMemoryForAPsStack accepts as small memory as possible and then jump
to AcceptMemory. It fully takes advantage of BSP/APs to work together.
After accept memory is done, the memory region for APsStack is not used
anymore. It can be used as other private memory. Because accept-memory
is in the very beginning of boot process and it will not impact other
phases.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
In the commit 4f173db8b4 "OvmfPkg/PlatformInitLib: Add functions for
EmuVariableNvStore", it introduced a PlatformValidateNvVarStore() function
for checking the integrity of NvVarStore.
In some cases when the VariableHeader->StartId is VARIABLE_DATA, the
VariableHeader->State is not just one of the four primary states:
VAR_IN_DELETED_TRANSITION, VAR_DELETED, VAR_HEADER_VALID_ONLY, VAR_ADDED.
The state may combined two or three states, e.g.
0x3C = (VAR_IN_DELETED_TRANSITION & VAR_ADDED) & VAR_DELETED
or
0x3D = VAR_ADDED & VAR_DELETED
When the variable store has those variables, system booting/rebooting will
hangs in a ASSERT:
NvVarStore Variable header State was invalid.
ASSERT
/mnt/working/source_code-git/edk2/OvmfPkg/Library/PlatformInitLib/Platform.c(819):
((BOOLEAN)(0==1))
Adding more log to UpdateVariable() and PlatformValidateNvVarStore(), we
saw some variables which have 0x3C or 0x3D state in store.
e.g.
UpdateVariable(), VariableName=BootOrder
L1871, State=0000003F <-- VAR_ADDED
State &= VAR_DELETED=0000003D
FlushHobVariableToFlash(), VariableName=BootOrder
...
UpdateVariable(), VariableName=InitialAttemptOrder
L1977, State=0000003F
State &= VAR_IN_DELETED_TRANSITION=0000003E
L2376, State=0000003E
State &= VAR_DELETED=0000003C
FlushHobVariableToFlash(), VariableName=InitialAttemptOrder
...
UpdateVariable(), VariableName=ConIn
L1977, State=0000003F
State &= VAR_IN_DELETED_TRANSITION=0000003E
L2376, State=0000003E
State &= VAR_DELETED=0000003C
FlushHobVariableToFlash(), VariableName=ConIn
...
So, only allowing the four primary states is not enough. This patch changes
the falid states list (Follow Jiewen Yao's suggestion):
1. VAR_HEADER_VALID_ONLY (0x7F)
- Header added (*)
2. VAR_ADDED (0x3F)
- Header + data added
3. VAR_ADDED & VAR_IN_DELETED_TRANSITION (0x3E)
- marked as deleted, but still valid, before new data is added. (*)
4. VAR_ADDED & VAR_IN_DELETED_TRANSITION & VAR_DELETED (0x3C)
- deleted, after new data is added.
5. VAR_ADDED & VAR_DELETED (0x3D)
- deleted directly, without new data.
(*) means to support surprise shutdown.
And removed (VAR_IN_DELETED_TRANSITION) and (VAR_DELETED) because they are
invalid states.
v2:
Follow Jiewen Yao's suggestion to add the following valid states:
VAR_ADDED & VAR_DELETED (0x3D)
VAR_ADDED & VAR_IN_DELETED_TRANSITION (0x3E)
VAR_ADDED & VAR_IN_DELETED_TRANSITION & VAR_DELETED (0x3C)
and removed the following invalid states:
VAR_IN_DELETED_TRANSITION
VAR_DELETED
Signed-off-by: Chun-Yi Lee <jlee@suse.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
This is required for passing the ACPI tables from the VMM up to the
guest OS. They are transferred through this GUID extension.
Signed-off-by: Jiaqi Gao <jiaqi.gao@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Reviewed-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Rely on the CcProbe() function to identify when running on TDX. This
allows the firmware to follow a different codepath for Cloud Hypervisor,
which means it doesn't rely on PVH to find out about memory below 4GiB.
instead it falls back onto the CMOS to retrieve that information.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Reviewed-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
There should be a check that the FV HeaderLength cannot be an odd
number. Otherwise in the following CalculateSum16 there would be an
ASSERT.
In ValidateFvHeader@QemuFlashFvbServicesRuntimeDxe/FwBlockServices.c
there a is similar check to the FwVolHeader->HeaderLength.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3937
There are below major changes in PlatformInitLib/PlatformPei
1. ProcessHobList
The unaccepted memory is accepted if it is under 4G address.
Please be noted: in current stage, we only accept the memory under 4G.
We will re-visit here in the future when on-demand accept memory is
required.
2. TransferTdxHobList
Transfer the unaccepted memory hob to EFI_RESOURCE_SYSTEM_MEMORY hob
if it is accepted.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3937
BZ3937_EFI_RESOURCE_MEMORY_UNACCEPTED is defined in MdeModulePkg. The
files which use the definition are updated as well.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
In case we have a reliable PhysMemAddressWidth use that to dynamically
size the 64bit address window. Allocate 1/8 of the physical address
space and place the window at the upper end of the address space.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Try detect physical address space, when successful use it.
Otherwise go continue using the current guesswork code path.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Add some qemu specific quirks to PlatformAddressWidthFromCpuid()
to figure whenever the PhysBits value returned by CPUID is
something real we can work with or not.
See the source code comment for details on the logic.
Also apply some limits to the address space we are going to use:
* Place a hard cap at 47 PhysBits (128 TB) to avoid using addresses
which require 5-level paging support.
* Cap at 40 PhysBits (1 TB) in case the CPU has no support for
gigabyte pages, to avoid excessive amounts of pages being
used for page tables.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Traditional q35 memory layout is 2.75 GB of low memory, leaving room
for the pcie mmconfig at 0xb0000000 and the 32-bit pci mmio window at
0xc0000000. Because of that OVMF tags the memory range above
0xb0000000 as uncachable via mtrr.
A while ago qemu started to gigabyte-align memory by default (to make
huge pages more effective) and q35 uses only 2G of low memory in that
case. Which effectively makes the 32-bit pci mmio window start at
0x80000000.
This patch updates the mtrr setup code accordingly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
There are 3 functions added for EmuVariableNvStore:
- PlatformReserveEmuVariableNvStore
- PlatformInitEmuVariableNvStore
- PlatformValidateNvVarStore
PlatformReserveEmuVariableNvStore allocate storage for NV variables early
on so it will be at a consistent address.
PlatformInitEmuVariableNvStore copies the content in
PcdOvmfFlashNvStorageVariableBase to the storage allocated by
PlatformReserveEmuVariableNvStore. This is used in the case that OVMF is
launched with -bios parameter. Because in that situation UEFI variables
will be partially emulated, and non-volatile variables may lose their
contents after a reboot. This makes the secure boot feature not working.
PlatformValidateNvVarStore is renamed from TdxValidateCfv and it is used
to validate the integrity of FlashNvVarStore
(PcdOvmfFlashNvStorageVariableBase). It should be called before
PlatformInitEmuVariableNvStore is called to copy over the content.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Read the "hardware-info" item from fw-cfg to extract specifications
of PCI host bridges and analyze the 64-bit apertures of them to
find out the highest 64-bit MMIO address required which determines
the address space required by the guest, and, consequently, the
FirstNonAddress used to calculate size of physical addresses.
Using the static PeiHardwareInfoLib, read the fw-cfg file of
hardware information to extract, one by one, all the host
bridges. Find the last 64-bit MMIO address of each host bridge,
using the HardwareInfoPciHostBridgeLib API, and compare it to an
accumulate value to discover the highest address used, which
corresponds to the highest value that must be included in the
guest's physical address space.
Given that platforms with multiple host bridges may provide the PCI
apertures' addresses, the memory detection logic must take into
account that, if the host provided the MMIO windows that can and must
be used, the guest needs to take those values. Therefore, if the
MMIO windows are found in the host-provided fw-cfg file, skip all the
logic calculating the physical address size and just use the value
provided. Since each PCI host bridge corresponds to an element in
the information provided by the host, each of these must be analyzed
looking for the highest address used.
Cc: Alexander Graf <graf@amazon.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Nicolas Ojeda Leon <ncoleon@amazon.com>
Since Cloud Hypervisor doesn't emulate an A20 gate register on I/O port
0x92, it's better to avoid accessing it when the platform is identified
as Cloud Hypervisor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Acked-by: Jiewen Yao <jiewen.yao@intel.com>
There are few places in the codebase assuming QemuFwCfg will be present
and supported, which can cause some issues when trying to rely on the
QemuFwCfgLibNull implementation of QemuFwCfgLib.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Acked-by: Jiewen Yao <jiewen.yao@intel.com>
microvm places the 64bit mmio space at the end of the physical address
space. So mPhysMemAddressWidth must be correct, otherwise the pci host
bridge setup throws an error because it thinks the 64bit mmio window is
not addressable.
On microvm we can simply use standard cpuid to figure the address width
because the host-phys-bits option (-cpu ${name},host-phys-bits=on) is
forced to be enabled. Side note: For 'pc' and 'q35' this is not the
case for backward compatibility reasons.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3429
There are below changes in PlatformInitLib for Tdx guest:
1. Publish ram regions
In Tdx guest, the system memory is passed in TdHob by host VMM. So
the major task of PlatformTdxPublishRamRegions is to walk thru the
TdHob list and transfer the ResourceDescriptorHob and MemoryAllocationHob
to the hobs in DXE phase.
2. Build MemoryAllocationHob for Tdx Mailbox and Ovmf work area.
3. Update of PlatformAddressWidthInitialization. The physical
address width that Tdx guest supports is either 48 or 52.
4. Update of PlatformMemMapInitialization.
0xA0000 - 0xFFFFF is VGA bios region. Platform initialization marks the
region as MMIO region. Dxe code maps MMIO region as IO region.
As TDX guest, MMIO region is maps as shared. However VGA BIOS doesn't need
to be shared. Guest TDX Linux maps VGA BIOS as private and accesses for
BIOS and stuck on repeating EPT violation. VGA BIOS (more generally ROM
region) should be private. Skip marking VGA BIOA region [0xa000, 0xfffff]
as MMIO in HOB.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3429
When host VMM create the Td guest, the system memory informations are
stored in TdHob, which is a memory region described in Tdx metadata.
The system memory region in TdHob should be accepted before it can be
accessed. So the newly added function (ProcessTdxHobList) is to process
the TdHobList to accept the memory. Because TdHobList is provided by
host VMM which is not trusted, so its content should be checked before
it is consumed by TDVF.
Because ProcessTdxHobList is to be called in SEC phase, so
PlatformInitLib.inf is updated to support SEC.
Note: In this patch it is BSP which accepts the pages. So there maybe
boot performance issue. There are some mitigations to this issue, such
as lazy accept, 2M accept page size, etc. We will re-visit here in the
future.
EFI_RESOURCE_MEMORY_UNACCEPTED is a new ResourceType in
EFI_HOB_RESOURCE_DESCRIPTOR. It is defined for the unaccepted memory
passed from Host VMM. This is proposed in microsoft/mu_basecore#66
files#diff-b20a11152d1ce9249c691be5690b4baf52069efadf2e2546cdd2eb663d80c9
e4R237 according to UEFI-Code-First. The proposal was approved in 2021
in UEFI Mantis, and will be added to the new PI.next specification.
Per the MdePkg reviewer's comments, before this new ResourceType is
added in the PI spec, it should not be in MdePkg. So it is now
defined as an internal implementation and will be moved to
MdePkg/Include/Pi/PiHob.h after it is added in PI spec.
See https://edk2.groups.io/g/devel/message/87641
PcdTdxAcceptPageSize is added for page accepting. Currently TDX supports
4K and 2M accept page size. The default value is 2M.
Tdx guest is only supported in X64. So for IA32 ProcessTdxHobList
just returns EFI_UNSUPPORTED.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3863
There are 3 variants of PlatformPei in OvmfPkg:
- OvmfPkg/PlatformPei
- OvmfPkg/XenPlatformPei
- OvmfPkg/Bhyve/PlatformPei/PlatformPei.inf
These PlatformPeis can share many common codes, such as
Cmos / Hob / Memory / Platform related functions. This commit
(and its following several patches) are to create a PlatformInitLib
which wraps the common code called in above PlatformPeis.
In this initial version of PlatformInitLib, below Cmos related functions
are introduced:
- PlatformCmosRead8
- PlatformCmosWrite8
- PlatformDebugDumpCmos
They correspond to the functions in OvmfPkg/PlatformPei:
- CmosRead8
- CmosWrite8
- DebugDumpCmos
Considering this PlatformInitLib will be used in SEC phase, global
variables and dynamic PCDs are avoided. We use PlatformInfoHob
to exchange information between functions.
EFI_HOB_PLATFORM_INFO is the data struct which contains the platform
information, such as HostBridgeDevId, BootMode, S3Supported,
SmmSmramRequire, etc.
After PlatformInitLib is created, OvmfPkg/PlatformPei is refactored
with this library.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>