Use 16bit and 32bit assembly code to replace hard code db.
In V2: add 0x67 prefixes to far jumps
Without the a32 modifier under FLAT32_JUMP, and the a16 modifier under
LONG_JUMP, nasm doesn't generate the 0x67 prefixes, and the far jumps
don't work. (For the former, KVM returns an emulation failure. For the
latter, KVM performs a triple fault (guest reboot).) By forcing the 0x67
prefixes we end up with the same machine code as the one open-coded in
"MpFuncs.asm".
This bug breaks S3 resume in the Ia32X64 + SMM_REQUIRE build of OVMF.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/MpFuncs.asm to X64/MpFuncs.nasm
And, manually update it to pass build.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
Use 16bit assembly code to replace hard code db.
In V2:
Add 0x67 prefix to far jump
When we enter protected mode, with the far jump still in big real mode,
the JMP instruction not only needs the 0x66 prefix (for 32-bit operand
size), but also the 0x67 prefix (for 32-bit address size). Use the a32
nasm modifier to enforce this.
This bug breaks S3 resume in the Ia32 + SMM_REQUIRE build of OVMF.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/MpFuncs.asm to Ia32/MpFuncs.nasm.
And, manually update it to pass build.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/AsmFuncs.asm to X64/AsmFuncs.nasm.
And, manually add o16 prefix to specify 16bit operation.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/AsmFuncs.asm to Ia32/AsmFuncs.nasm.
And, manually add o16 prefix to specify 16bit operation.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/ExceptionHandlerAsm.asm to X64/ExceptionHandlerAsm.nasm.
Then, manually update nasm to pass build.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/ExceptionHandlerAsm.asm to Ia32/ExceptionHandlerAsm.nasm.
Then, manually update nasm to pass build.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/InitializeFpu.asm to X64/InitializeFpu.nasm.
And, manually add .rdata section.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/InitializeFpu.asm to Ia32/InitializeFpu.nasm.
And, manually add .rdata section.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/CpuAsm.asm to Ia32/CpuAsm.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
This patch adds the NORETURN attribute to the function that transfers
to the PEI phase, along with an UNREACHABLE() call at the end to
avoid false warnings.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Marvin Haeuser <Marvin.Haeuser@outlook.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Currently, if the memory length to be programmed is less than the remaining size
of one Fixed-MTRR supported, RETURN_UNSUPPORTED returned. This is not correct.
This is one regression at 07e8892090 when we
updated ProgramFixedMtrr() to remove the loop of calculating Fixed-MTRR Mask.
This fix will calculate Right offset in Fixed-MTRR beside left offset. It
supports small length (less than remaining size supported by Fixed-MTRR) to be
programmed.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
This Feature PCD causes PiSmmCpuDxe to catch SMM stack overflow at
runtime, logging a clear error message, and entering a CPU dead loop.
Compared to the chaotic and catastrophic consequences of the stack leaking
into, and corrupting, the SMM page table, a stack guard that is enabled by
default is vastly superior.
We should not require sane platforms to explicitly opt in to this
safeguard; instead, we should require platforms that prefer to live
dangerously to opt out of it.
Stack overflow in SMM might even give rise to security vulnerabilities.
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Ref: http://thread.gmane.org/gmane.comp.bios.edk2.devel/12864
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1341733
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This module could be linked by CpuMpPei driver to handle reserved vector list
and provide spin lock for BSP/APs to prevent dump message corrupted.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Move some global variables location from PeiDxeSmmCpuException.c to
DxeCpuException.c and SmmCpuException.c. And remove some un-used global
vairables.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Rename DxeSmmCpuException.c to PeiDxeSmmCpuException.c that will be used by
PeiCpuExceptionHandlerLib.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Update MSRs semaphores to the ones in allocated aligned semaphores
buffer. If MSRs semaphores is not enough, allocate one page more.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Update each CPU semaphores to the ones in allocated aligned
semaphores buffer.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Allocate each CPU semaphores in allocated aligned semaphores buffer.
And add it into semaphores structure.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Update all global semaphores to the ones in allocated aligned
semaphores buffer.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Move MP sync data initialization in front of the place that initialize
page table, because the page fault spin lock is allocated in
InitializeMpSyncData() while it is initialized in SmmInitPageTable().
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Get semaphores alignment/size requirement and allocate aligned
buffer for all global spin lock and semaphores.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
SMRR range size and alignment should follow the rules like MTRR:
a. The minimum range size is 4 KBytes and the base address of the
range must be on at least a 4-KByte boundary.
b. For ranges greater than 4 KBytes, each range must be of length
2^n and its base address must be aligned on a 2^n boundary, where
n is a value equal to or greater than 12. The base-address
alignment value cannot be less than its length.
Thus, it could meet "Address_Within_Range AND PhysMask = PhysBase
AND PhysMask".
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Michael Kinney <michael.d.kinney@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Use the MSR MSR_IA32_MISC_ENABLE definition defined in UefiCpuPkg/Include and
remove the local definition.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
BTS used DS save area by IA32_DS_AREA MSR to get invoker IP instead of the
Last Branch Record Stack. So, removed the unnecessary BTS MSRs.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
SmmProfile feature depends on BTS feature to get the invoker IP (in SMM) from
last branch record. If this feature is not supported, SmmProfile cannot get the
invoker IP (in SMM). Per IA-32 Architectures Software Developer's Manual, BTS
feature is detected by IA32_MISC_ENABLE. If BIT11 of IA32_MISC_ENABLE is set,
BTS is not supported. But current implementation check BIT11 opposite. Also, BTS
feature does not depends on PEBS feature if supported or not.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Shifflett, Joseph <joseph.shifflett@hpe.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Reported-by: Shifflett, Joseph <joseph.shifflett@hpe.com>
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Shifflett, Joseph <joseph.shifflett@hpe.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Michael Kinney <michael.d.kinney@intel.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Introduce the 32bit mask seeds to calculate Fixed-MTRR or&and mask values. It
could avoid the loop operation and 64bit shift operations.
Cc: Feng Tian <feng.tian@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
Add input fixed-MTRR MSR index to be start MSR index to avoid finding fixed-MTRR
MSR index from 0 at each time.
Cc: Feng Tian <feng.tian@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
* Short description:
The CpuIoServiceRead() and CpuIoServiceWrite() functions transfer data
between memory and IO ports with individual Io(Read|Write)(8|16|32)
function calls, each in an appropriately set up loop.
On the Ia32 and X64 platforms however, FIFO reads and writes can be
optimized, by coding them in assembly, and delegating the loop to the
CPU, with the REP prefix.
On KVM virtualization hosts, this difference has a huge performance
impact: if the loop is open-coded, then the virtual machine traps to the
hypervisor on every single UINT8 / UINT16 / UINT32 transfer, whereas
with the REP prefix, KVM can transfer up to a page of data per VM trap.
This is especially noticeable with IDE PIO transfers, where all the data
are squeezed through IO ports.
* Long description:
The RootBridgeIoIoRW() function in
PcAtChipsetPkg/PciHostBridgeDxe/PciRootBridgeIo.c
used to have the exact same IO port acces optimization, dating back
verbatim to commit 1fd376d9792:
PcAtChipsetPkg/PciHostBridgeDxe: Improve KVM FIFO I/O read/write
performance
OvmfPkg cloned the "PcAtChipsetPkg/PciHostBridgeDxe" driver (for
unrelated reasons), and inherited the optimization from PcAtChipsetPkg.
The "PcAtChipsetPkg/PciHostBridgeDxe" driver was ultimately removed in
commit 111d79db47:
PcAtChipsetPkg/PciHostBridge: Remove PciHostBridge driver
and OvmfPkg too was rebased to the new core Pci Host Bridge Driver, in
commit 4014885ffd:
OvmfPkg: switch to MdeModulePkg/Bus/Pci/PciHostBridgeDxe
This caused the optimization to go lost. Namely, the
RootBridgeIoIoRead() and RootBridgeIoIoWrite() functions in the new core
Pci Host Bridge Driver delegate IO port accesses to
EFI_CPU_IO2_PROTOCOL. And, in OvmfPkg (and likely most other Ia32 / X64
edk2 platforms), this protocol is provided by "UefiCpuPkg/CpuIo2Dxe",
which lacks the optimization.
Therefore, this patch ports the C source code logic from commit
1fd376d979 (see above) to "UefiCpuPkg/CpuIo2Dxe", plus it ports the
NASM-converted assembly helper functions from OvmfPkg commits
6026bf4600 and ace1d0517b65:
OvmfPkg PciHostBridgeDxe: Convert Ia32/IoFifo.asm to NASM
OvmfPkg PciHostBridgeDxe: Convert X64/IoFifo.asm to NASM
In order to support the MSFT and INTEL toolchains as well, the *.asm
files are ported from OvmfPkg as well, immediately from before the above
conversion (that is, at 6026bf460037^).
* Notes about the port:
- The write and read branches from commit 1fd376d979 are split to the
separate functions CpuIoServiceWrite() and CpuIoServiceRead().
- The EfiPciWidthUintXX constants are replaced with EfiCpuIoWidthUintXX.
- The cast expression "(UINTN) Address" is replaced with
"(UINTN)Address" (i.e., no space), because that's how the receiving
functions spell it as well.
- The labels in the switch statements are unindented by one level, to
match the edk2 coding style (and the rest of UefiCpuPkg) better.
* The first signoff belongs to Jordan, because he authored all of
1fd376d979, 6026bf4600 and ace1d0517b.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Ref: https://www.redhat.com/archives/vfio-users/2016-April/msg00029.html
Reported-by: Mark <kram321@gmail.com>
Ref: http://thread.gmane.org/gmane.comp.bios.edk2.devel/10424/focus=10432
Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Mark <kram321@gmail.com>
Tested-by: Mark <kram321@gmail.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
If ApLoopMode is set to ApInMwaitLoop, AP will be placed into C-State by mwait
instruction. BSP will wakeup AP by write start-up signal in monitor address.
However, AP maybe waken by SMI/NMI/MCE and other condition. On this case, AP
will check if BSP wants to wakeup itself really. If not, AP will continue to
execute mwait to C-State.
One potential issue: BSP may not recognize AP was wakeup from C-State by other
event and BSP still writes start-up signal to wakeup AP. But AP does not aware
it and still execute mwait instruction to C-State. So, AP cannot be wakeup on
this case.
This fix is let AP to clear start-up signal when it really is wakeup to execute
AP function. And BSP will write start-up signal till AP clears it.
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Feng Tian <feng.tian@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>