BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
sequence is intercepted by the hypervisor, which sets the AP's registers
to the values requested by the sequence. At that point, the hypervisor can
start the AP, which will then begin execution at the appropriate location.
Under SEV-ES, AP booting presents some challenges since the hypervisor is
not allowed to alter the AP's register state. In this situation, we have
to distinguish between the AP's first boot and AP's subsequent boots.
First boot:
Once the AP's register state has been defined (which is before the guest
is first booted) it cannot be altered. Should the hypervisor attempt to
alter the register state, the change would be detected by the hardware
and the VMRUN instruction would fail. Given this, the first boot for the
AP is required to begin execution with this initial register state, which
is typically the reset vector. This prevents the BSP from directing the
AP startup location through the INIT-SIPI-SIPI sequence.
To work around this, the firmware will provide a build time reserved area
that can be used as the initial IP value. The hypervisor can extract this
location value by checking for the SEV-ES reset block GUID that must be
located 48-bytes from the end of the firmware. The format of the SEV-ES
reset block area is:
0x00 - 0x01 - SEV-ES Reset IP
0x02 - 0x03 - SEV-ES Reset CS Segment Base[31:16]
0x04 - 0x05 - Size of the SEV-ES reset block
0x06 - 0x15 - SEV-ES Reset Block GUID
(00f771de-1a7e-4fcb-890e-68c77e2fb44e)
The total size is 22 bytes. Any expansion to this block must be done
by adding new values before existing values.
The hypervisor will use the IP and CS values obtained from the SEV-ES
reset block to set as the AP's initial values. The CS Segment Base
represents the upper 16 bits of the CS segment base and must be left
shifted by 16 bits to form the complete CS segment base value.
Before booting the AP for the first time, the BSP must initialize the
SEV-ES reset area. This consists of programming a FAR JMP instruction
to the contents of a memory location that is also located in the SEV-ES
reset area. The BSP must program the IP and CS values for the FAR JMP
based on values drived from the INIT-SIPI-SIPI sequence.
Subsequent boots:
Again, the hypervisor cannot alter the AP register state, so a method is
required to take the AP out of halt state and redirect it to the desired
IP location. If it is determined that the AP is running in an SEV-ES
guest, then instead of calling CpuSleep(), a VMGEXIT is issued with the
AP Reset Hold exit code (0x80000004). The hypervisor will put the AP in
a halt state, waiting for an INIT-SIPI-SIPI sequence. Once the sequence
is recognized, the hypervisor will resume the AP. At this point the AP
must transition from the current 64-bit long mode down to 16-bit real
mode and begin executing at the derived location from the INIT-SIPI-SIPI
sequence.
Another change is around the area of obtaining the (x2)APIC ID during AP
startup. During AP startup, the AP can't take a #VC exception before the
AP has established a stack. However, the AP stack is set by using the
(x2)APIC ID, which is obtained through CPUID instructions. A CPUID
instruction will cause a #VC, so a different method must be used. The
GHCB protocol supports a method to obtain CPUID information from the
hypervisor through the GHCB MSR. This method does not require a stack,
so it is used to obtain the necessary CPUID information to determine the
(x2)APIC ID.
The new 16-bit protected mode GDT entry is used in order to transition
from 64-bit long mode down to 16-bit real mode.
A new assembler routine is created that takes the AP from 64-bit long mode
to 16-bit real mode. This is located under 1MB in memory and transitions
from 64-bit long mode to 32-bit compatibility mode to 16-bit protected
mode and finally 16-bit real mode.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1521
We scan the SMM code with ROPgadget.
http://shell-storm.org/project/ROPgadget/https://github.com/JonathanSalwan/ROPgadget/tree/master
This tool reports the gadget in SMM driver.
This patch enabled CET ShadowStack for X86 SMM.
If CET is supported, SMM will enable CET ShadowStack.
SMM CET will save the OS CET context at SmmEntry and
restore OS CET context at SmmExit.
Test:
1) test Intel internal platform (x64 only, CET enabled/disabled)
Boot test:
CET supported or not supported CPU
on CET supported platform
CET enabled/disabled
PcdCpuSmmCetEnable enabled/disabled
Single core/Multiple core
PcdCpuSmmStackGuard enabled/disabled
PcdCpuSmmProfileEnable enabled/disabled
PcdCpuSmmStaticPageTable enabled/disabled
CET exception test:
#CF generated with PcdCpuSmmStackGuard enabled/disabled.
Other exception test:
#PF for normal stack overflow
#PF for NX protection
#PF for RO protection
CET env test:
Launch SMM in CET enabled/disabled environment (DXE) - no impact to DXE
The test case can be found at
https://github.com/jyao1/SecurityEx/tree/master/ControlFlowPkg
2) test ovmf (both IA32 and X64 SMM, CET disabled only)
test OvmfIa32/Ovmf3264, with -D SMM_REQUIRE.
qemu-system-x86_64.exe -machine q35,smm=on -smp 4
-serial file:serial.log
-drive if=pflash,format=raw,unit=0,file=OVMF_CODE.fd,readonly=on
-drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd
QEMU emulator version 3.1.0 (v3.1.0-11736-g7a30e7adb0-dirty)
3) not tested
IA32 CET enabled platform
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Yao Jiewen <jiewen.yao@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Index is initialized to MAX_UINT16 as default failure value, which
is what the ASSERT is supposed to test for. The ASSERT condition
however can never return FALSE for INT16 != int, as due to
Integer Promotion[1], Index is converted to int, which can never
result in -1.
Furthermore, Index is used as a for loop index variable inbetween its
initialization and the ASSERT, so the value is unconditionally
overwritten too.
Fix the ASSERT check to compare Index to its upper boundary, which it
will be equal to if the loop was not broken out of on success.
[1] ISO/IEC 9899:2011, 6.5.9.4
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Marvin Haeuser <Marvin.Haeuser@outlook.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
1. Do not use tab characters
2. No trailing white space in one line
3. All files must end with CRLF
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Liming Gao <liming.gao@intel.com>
When StackGuard is enabled on IA32, the #double fault exception
is reported instead of #page fault.
This issue does not exist on X64, or IA32 without StackGuard.
The fix at e4435f710c was incomplete.
It is because AllocateCodePages() is used to allocate buffer for
GDT and TSS, the code pages will be set to RO in SetMemMapAttributes().
But IA32 Stack Guard need use task switch to switch stack that need
write GDT and TSS, so AllocateCodePages() could not be used.
This patch uses AllocatePages() instead of AllocateCodePages() to
allocate buffer for GDT and TSS if StackGuard is enabled on IA32.
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Jian J Wang <jian.j.wang@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Star Zeng <star.zeng@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
This patch fixes https://bugzilla.tianocore.org/show_bug.cgi?id=246
Previously, when SMM exception happens after EndOfDxe,
with StackGuard enabled on IA32, the #double fault exception
is reported instead of #page fault.
Root cause is below:
Current EDKII SMM page protection will lock GDT.
If IA32 stack guard is enabled, the page fault handler will do task switch.
This task switch need write busy flag in GDT, and write TSS.
However, the GDT and TSS is locked at that time, so the
double fault happens.
We decide to not lock GDT for IA32 StackGuard enabled.
This issue does not exist on X64, or IA32 without StackGuard.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Update TransferApToSafeState() use UINTN params to reduce the
number of type casts required in these calls. Also change
the NumberToFinish parameter from UINT32* to UINTN
NumberToFinishAddress to resolve issues with conversion from
a volatile pointer to a non-volatile pointer. The assembly
code that receives the NumberToFinishAddress value must treat
that memory location as a volatile to track the number of APs.
Cc: Liming Gao <liming.gao@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Andrew Fish <afish@apple.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Michael Kinney <michael.d.kinney@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
PiSmmCpuDxeSmm consumes SmmAttributesTable and setup page table:
1) Code region is marked as read-only and Data region is non-executable,
if the PE image is 4K aligned.
2) Important data structure is set to RO, such as GDT/IDT.
3) SmmSaveState is set to non-executable,
and SmmEntrypoint is set to read-only.
4) If static page is supported, page table is read-only.
We use page table to protect other components, and itself.
If we use dynamic paging, we can still provide *partial* protection.
And hope page table is not modified by other components.
The XD enabling code is moved to SmiEntry to let NX take effect.
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Star Zeng <star.zeng@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jiewen Yao <jiewen.yao@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
We will put APs into hlt-loop in safe code. But we decrease mNumberToFinish
before APs enter into the safe code. Paolo pointed out this gap.
This patch is to move mNumberToFinish decreasing to the safe code. It could
make sure BSP could wait for all APs are running in safe code.
https://bugzilla.tianocore.org/show_bug.cgi?id=216
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
On S3 path, we may transfer to long mode (if DXE is long mode) to restore CPU
contexts with CR3 = SmmS3Cr3 (in SMM). AP will execute hlt-loop after CPU
contexts restoration. Once one NMI or SMI happens, APs may exit from hlt state
and execute the instruction after HLT instruction. If APs are running on long
mode, page table is required to fetch the instruction. However, CR3 pointer to
page table in SMM. APs will crash.
This fix is to disable long mode on APs and transfer to 32bit protected mode to
execute hlt-loop. Then CR3 and page table will no longer be required.
https://bugzilla.tianocore.org/show_bug.cgi?id=216
Reported-by: Laszlo Ersek <lersek@redhat.com>
Analyzed-by: Paolo Bonzini <pbonzini@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
On S3 path, we will wake up APs to restore CPU context in PiSmmCpuDxeSmm
driver. However, we place AP in hlt-loop under 1MB space borrowed after CPU
restoring CPU contexts.
In case, one NMI or SMI happens, APs may exit from hlt state and execute the
instruction after HLT instruction. But the code under 1MB is no longer safe at
that time.
This fix is to allocate one ACPI NVS range to place the AP hlt-loop code. When
CPU finished restoration CPU contexts, AP will execute in this ACPI NVS range.
https://bugzilla.tianocore.org/show_bug.cgi?id=216
v2:
1. Make stack alignment per Laszlo's comment.
2. Trim whitespace at end of end.
3. Update year mark in file header.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Analyzed-by: Paolo Bonzini <pbonzini@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Move Gdt initialization from InitializeMpServiceData() to CPU Arch specific function.
We create SmmFuncsArch.c for hold CPU specific function, so that
EFI_IMAGE_MACHINE_TYPE_SUPPORTED(EFI_IMAGE_MACHINE_X64) can be removed.
For IA32 version, we always allocate new page for GDT entry, for easy maintenance.
For X64 version, we fixed TssBase in GDT entry to make sure TSS data is correct.
Remove TSS fixup for GDT in ASM file.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: "Yao, Jiewen" <jiewen.yao@intel.com>
Reviewed-by: "Fan, Jeff" <jeff.fan@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18937 6f19259b-4bc3-4df7-8a09-765794883524