Add support for GuestPhysBits (cpuid 0x80000008, eax, bits 23:16).
GuestPhysBits is a field which can be set by the hypervisor to inform
the guest about the /usable/ physical address space bits. This can be
smaller than the PhysBits of the CPU, for example because of nested
paging limitations.
OVMF will read GuestPhysBits, log the value, in case it is set use it
as upper limit.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch refines the SmmAccess implementation:
1. SmramMap will be retrieved from the
gEfiSmmSmramMemoryGuid instead of original from
the TSEG Memory Base register.
2. Remove the gEfiAcpiVariableGuid creation, thus
the DESCRIPTOR_INDEX definition can be also cleaned.
3. The gEfiAcpiVariableGuid HOB is moved to the
OvmfPkg/Library/PlatformInitLib/PlatformInitLib.inf.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Ray Ni <ray.ni@intel.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jiewen Yao <Jiewen.yao@intel.com>
In the PiSmmCpuDxeSmm driver, SMRAM allocation for SMI
handlers and processor Save State areas was traditionally
performed using the Smst->AllocatePages() function during
the DXE phase. The introduction of SmmRelocationLib
changes this process by moving the allocation to the PEI
phase, where Smst->AllocatePages() is not accessible.
Instead, the allocation is now handled by partitioning
the SMRAM based on the information provided by a GUID HOB
(identified by gEfiSmmSMramMemoryGuid).
This patch is to ensure that OVMF produces the
gEfiSmmSMramMemoryGuid HOB, allowing SmmRelocationLib to
reserve the necessary memory for SMBASE relocation.
More info for the change:
1. The EFI_SMM_SMRAM_MEMORY_GUID HOB, as defined in the PI
specification, vol.3, section 5, which is used to describe
the SMRAM memory regions supported by the platform. This HOB
should be produced during the memory detection phase to
align with the PI spec.
2. In addition to the memory reserved for ACPI S3 resume,
an increasing number of features require reserving SMRAM
for specific purposes, such as SmmRelocation. Other
advanced features in Intel platforms also necessitate
this. The implementation of these features varies and is
entirely dependent on the platform. This is why an
increasing number of platforms are adopting the
EFI_SMM_SMRAM_MEMORY_GUID HOB for SMRAM description.
3. It is crucial that the SMRAM information remains
consistent when retrieved from the platform, whether
through the SMM ACCESS PPI/Protocol or the
EFI_SMM_SMRAM_MEMORY_GUID HOB. Inconsistencies can lead
to unexpected issues, most commonly memory region conflicts.
4. The SMM ACCESS PPI/Protocol can be naturally
implemented for general use. The common approach is to
utilize the EFI_SMM_SMRAM_MEMORY_GUID HOB. For reference,
see the existing implementation in the EDK2 repository at
edk2/UefiPayloadPkg/SmmAccessDxe/SmmAccessDxe.inf and
edk2-platforms/Silicon/Intel/IntelSiliconPkg/Feature/
SmmAccess/Library/PeiSmmAccessLib/PeiSmmAccessLib.inf.
Next patch will refine the OVMF SMM Access to consume
the EFI_SMM_SMRAM_MEMORY_GUID HOB.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Ray Ni <ray.ni@intel.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jiewen Yao <Jiewen.yao@intel.com>
Adjust physical address space logic for la57 mode (5-level paging).
With a larger logical address space we can identity-map a larger
physical address space.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Message-Id: <20240222105407.75735-4-kraxel@redhat.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Liming Gao <gaoliming@byosoft.com.cn>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Min Xu <min.m.xu@intel.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Oliver Steffen <osteffen@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
[lersek@redhat.com: turn the "Cc:" message headers from Gerd's on-list
posting into "Cc:" tags in the commit message, in order to pacify
"PatchCheck.py"]
In addition to initializing the PhysMemAddressWidth and
FirstNonAddress fields in PlatformInfoHob, the
PlatformAddressWidthInitialization function is responsible
for initializing the PcdPciMmio64Base and PcdPciMmio64Size
fields.
Currently, for CloudHv guests, the PcdPciMmio64Base is
placed immediately after either the 4G boundary or the
last RAM region, whichever is greater. We do not change
this behavior.
Previously, when booting CloudHv guests with greater than
1TiB of high memory, the PlatformAddressWidthInitialization
function incorrect calculates the amount of RAM using the
overflowed 24-bit CMOS register.
Now, we update the PlatformAddressWidthInitialization
behavior on CloudHv to scan the E820 entries to detect
the amount of RAM. This allows CloudHv guests to boot with
greater than 1TiB of RAM
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The PlatformScanE820 utility function is not currently compatible
with CloudHv since it relies on the prescence of the "etc/e820"
QemuFwCfg file. Update the PlatformScanE820 to iterate through the
PVH e820 entries when running on a CloudHv guest.
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Older linux kernels have problems with phys-bits larger than 46,
ubuntu 18.04 (kernel 4.15) has been reported to be affected.
Reduce phys-bits limit from 47 to 46.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If PcdUse1GPageTable is not enabled restrict the physical address space
used to 1TB, to limit the amount of memory needed for identity mapping
page tables.
The same already happens in case the processor has no support for
gigabyte pages.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
__FUNCTION__ is a pre-standard extension that gcc and Visual C++ among
others support, while __func__ was standardized in C99.
Since it's more standard, replace __FUNCTION__ with __func__ throughout
OvmfPkg.
Signed-off-by: Rebecca Cran <rebecca@bsdio.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Replace the static struct initialization with a call to ZeroMem to avoid
generating a call to memset in certain build configurations.
Signed-off-by: Rebecca Cran <rebecca@bsdio.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Oliver Smith-Denny <osd@smith-denny.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
With the new mmconfig location at 0xe0000000 above the 32-bit PCI MMIO
window we don't have to special-case the mmconfig xbar any more. We'll
just add a mtrr uncachable entry starting at MMIO window base and ending
at 4GB.
Update comments to match reality.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
BuildResourceDescriptorHob() expects the third parameter be the Length,
not the End address.
Fixes: 328076cfdf ("OvmfPkg/PlatformInitLib: Add PlatformAddHobCB")
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
First handle the cases which do not need know the value of
PlatformInfoHob->LowMemory (microvm and cloudhv). Then call
PlatformGetSystemMemorySizeBelow4gb() to get LowMemory. Finally handle
the cases (q35 and pc) which need to look at LowMemory,
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Add PlatformReservationConflictCB() callback function for use with
PlatformScanE820(). It checks whenever the 64bit PCI MMIO window
overlaps with a reservation from qemu. If so move down the MMIO window
to resolve the conflict.
Write any actions done (moving mmio window) to the firmware log with
INFO loglevel.
This happens on (virtual) AMD machines with 1TB address space,
because the AMD IOMMU uses an address window just below 1TB.
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4251
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Add PlatformAddHobCB() callback function for use with
PlatformScanE820(). It adds HOBs for high memory and reservations (low
memory is handled elsewhere because there are some special cases to
consider). This replaces calls to PlatformScanOrAdd64BitE820Ram() with
AddHighHobs = TRUE.
Write any actions done (adding HOBs, skip unknown types) to the firmware
log with INFO loglevel.
Also remove PlatformScanOrAdd64BitE820Ram() which is not used any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Add PlatformGetLowMemoryCB() callback function for use with
PlatformScanE820(). It stores the low memory size in
PlatformInfoHob->LowMemory. This replaces calls to
PlatformScanOrAdd64BitE820Ram() with non-NULL LowMemory.
Write any actions done (setting LowMemory) to the firmware log
with INFO loglevel.
Also change PlatformGetSystemMemorySizeBelow4gb() to likewise set
PlatformInfoHob->LowMemory instead of returning the value. Update
all Callers to the new convention.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
First step replacing the PlatformScanOrAdd64BitE820Ram() function.
Add a PlatformScanE820() function which loops over the e280 entries
from FwCfg and calls a callback for each of them.
Add a GetFirstNonAddressCB() function which will store the first free
address (right after the last RAM block) in
PlatformInfoHob->FirstNonAddress. This replaces calls to
PlatformScanOrAdd64BitE820Ram() with non-NULL MaxAddress.
Write any actions done (setting FirstNonAddress) to the firmware log
with INFO loglevel.
Also drop local FirstNonAddress variables and use
PlatformInfoHob->FirstNonAddress instead everywhere.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
In commit 49edde1523 ("OvmfPkg/PlatformPei: set 32-bit UC area at
PciBase / PciExBarBase (pc/q35)", 2019-06-03), I forgot to update the
comment. Do it now.
Fixes: 49edde1523
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
qemu uses the etc/e820 fw_cfg file not only for memory, but
also for reservations. Handle reservations by adding resource
descriptor hobs for them.
A typical qemu configuration has a small reservation between
lapic and flash:
# sudo cat /proc/iomem
[ ... ]
fee00000-fee00fff : Local APIC
feffc000-feffffff : Reserved <= HERE
ffc00000-ffffffff : Reserved
[ ... ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Rely on the CcProbe() function to identify when running on TDX. This
allows the firmware to follow a different codepath for Cloud Hypervisor,
which means it doesn't rely on PVH to find out about memory below 4GiB.
instead it falls back onto the CMOS to retrieve that information.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Reviewed-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3937
There are below major changes in PlatformInitLib/PlatformPei
1. ProcessHobList
The unaccepted memory is accepted if it is under 4G address.
Please be noted: in current stage, we only accept the memory under 4G.
We will re-visit here in the future when on-demand accept memory is
required.
2. TransferTdxHobList
Transfer the unaccepted memory hob to EFI_RESOURCE_SYSTEM_MEMORY hob
if it is accepted.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
In case we have a reliable PhysMemAddressWidth use that to dynamically
size the 64bit address window. Allocate 1/8 of the physical address
space and place the window at the upper end of the address space.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Try detect physical address space, when successful use it.
Otherwise go continue using the current guesswork code path.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Add some qemu specific quirks to PlatformAddressWidthFromCpuid()
to figure whenever the PhysBits value returned by CPUID is
something real we can work with or not.
See the source code comment for details on the logic.
Also apply some limits to the address space we are going to use:
* Place a hard cap at 47 PhysBits (128 TB) to avoid using addresses
which require 5-level paging support.
* Cap at 40 PhysBits (1 TB) in case the CPU has no support for
gigabyte pages, to avoid excessive amounts of pages being
used for page tables.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Traditional q35 memory layout is 2.75 GB of low memory, leaving room
for the pcie mmconfig at 0xb0000000 and the 32-bit pci mmio window at
0xc0000000. Because of that OVMF tags the memory range above
0xb0000000 as uncachable via mtrr.
A while ago qemu started to gigabyte-align memory by default (to make
huge pages more effective) and q35 uses only 2G of low memory in that
case. Which effectively makes the 32-bit pci mmio window start at
0x80000000.
This patch updates the mtrr setup code accordingly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Read the "hardware-info" item from fw-cfg to extract specifications
of PCI host bridges and analyze the 64-bit apertures of them to
find out the highest 64-bit MMIO address required which determines
the address space required by the guest, and, consequently, the
FirstNonAddress used to calculate size of physical addresses.
Using the static PeiHardwareInfoLib, read the fw-cfg file of
hardware information to extract, one by one, all the host
bridges. Find the last 64-bit MMIO address of each host bridge,
using the HardwareInfoPciHostBridgeLib API, and compare it to an
accumulate value to discover the highest address used, which
corresponds to the highest value that must be included in the
guest's physical address space.
Given that platforms with multiple host bridges may provide the PCI
apertures' addresses, the memory detection logic must take into
account that, if the host provided the MMIO windows that can and must
be used, the guest needs to take those values. Therefore, if the
MMIO windows are found in the host-provided fw-cfg file, skip all the
logic calculating the physical address size and just use the value
provided. Since each PCI host bridge corresponds to an element in
the information provided by the host, each of these must be analyzed
looking for the highest address used.
Cc: Alexander Graf <graf@amazon.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Nicolas Ojeda Leon <ncoleon@amazon.com>
microvm places the 64bit mmio space at the end of the physical address
space. So mPhysMemAddressWidth must be correct, otherwise the pci host
bridge setup throws an error because it thinks the 64bit mmio window is
not addressable.
On microvm we can simply use standard cpuid to figure the address width
because the host-phys-bits option (-cpu ${name},host-phys-bits=on) is
forced to be enabled. Side note: For 'pc' and 'q35' this is not the
case for backward compatibility reasons.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3429
There are below changes in PlatformInitLib for Tdx guest:
1. Publish ram regions
In Tdx guest, the system memory is passed in TdHob by host VMM. So
the major task of PlatformTdxPublishRamRegions is to walk thru the
TdHob list and transfer the ResourceDescriptorHob and MemoryAllocationHob
to the hobs in DXE phase.
2. Build MemoryAllocationHob for Tdx Mailbox and Ovmf work area.
3. Update of PlatformAddressWidthInitialization. The physical
address width that Tdx guest supports is either 48 or 52.
4. Update of PlatformMemMapInitialization.
0xA0000 - 0xFFFFF is VGA bios region. Platform initialization marks the
region as MMIO region. Dxe code maps MMIO region as IO region.
As TDX guest, MMIO region is maps as shared. However VGA BIOS doesn't need
to be shared. Guest TDX Linux maps VGA BIOS as private and accesses for
BIOS and stuck on repeating EPT violation. VGA BIOS (more generally ROM
region) should be private. Skip marking VGA BIOA region [0xa000, 0xfffff]
as MMIO in HOB.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>