As reported by Vishal, CompareGuid() is a hotspot, and switching from
BaseMemoryLibStm in ArmPkg/ to BaseMemoryLibOptDxe causes a noticeable
performance regression due to the fact that BaseMemoryLibOptDxe uses
unaligned accessors explicitly to implement CompareGuid() and the related
functions.
Since BaseMemoryLibOptDxe on ARM and AARCH64 can only be used in contexts
where unaligned accesses are allowed, reimplement these functions for ARM
and AARCH64 specifically, using wide accessors that can tolerate any
misalignment.
Reported-by: "Oliyil Kunnil, Vishal" <vishalo@qti.qualcomm.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Fix two bugs:
- Erroneous shift of 2 in a bytes to bits conversion.
- Use reverse subtract rather than negate for value that is subsequently
used as operand #2 in a shift operation.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
The new accelerated ARM and AARCH64 implementations take advantage of
features that are only available when the MMU and Dcache are on. So
restrict the use of this library to the DXE phase or later.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
This adds AARCH64 support to BaseMemoryLibOptDxe, based on the cortex-strings
library. All string routines are accelerated except ScanMem16, ScanMem32,
ScanMem64 and IsZeroBuffer, which can wait for another day. (Very few
occurrences exist in the codebase)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
This adds ARM support to BaseMemoryLibOptDxe, partially based on the
cortex-strings library (ScanMem) and the existing CopyMem() implementation
from BaseMemoryLibStm in ArmPkg.
All string routines are accelerated except ScanMem16, ScanMem32,
ScanMem64 and IsZeroBuffer, which can wait for another day. (Very few
occurrences exist in the codebase)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Since the default BaseMemoryLib should be callable from any context,
including ones where unaligned accesses are not allowed, it implements
InternalCopyMem() and InternalSetMem() using byte accesses only.
However, especially in a context where the MMU is off, such narrow
accesses may be disproportionately costly, and so if the size and
alignment of the access allow it, use 32-bit or even 64-bit loads and
stores (the latter may be beneficial even on a 32-bit architectures like
ARM, which has load pair/store pair instructions)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
When switching to the DXE phase stack, set the frame pointer to zero so
that code walking the stack frame will not try to access stack frames
belonging to the old stack.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Add the implementation of API IsZeroBuffer() via assembly in
BaseMemoryLibSse2.
The assembly codes use SSE2 XMM registers and related instructions.
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hao Wu <hao.a.wu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Add the implementation of API IsZeroBuffer() via assembly for the
following library instances:
BaseMemoryLibMmx
BaseMemoryLibOptDxe
BaseMemoryLibOptPei
BaseMemoryLibRepStr
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hao Wu <hao.a.wu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Add the implementation of API IsZeroBuffer() via C language for the
following library instances:
BaseMemoryLib
PeiMemoryLib
UefiMemoryLib
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hao Wu <hao.a.wu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
The RVCT compiler in --gnu mode appears to simply strip of the __builtin
prefix when it encounters calls to __builtin_xxx() functions, and so
the __builtin_unreachable() we emit for GCC results in linker errors
regarding undefined references against 'unreachable()'.
So define UNREACHABLE() to a NOP instead.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
In latest UEFI2.6 spec, the type of the fourth parameter in function
GetImageInfo() is "EFI_IMAGE_OUTPUT", but in the header file, it is
"EFI_IMAGE_INPUT". Now correct it to follow the spec.
Cc: Liming Gao <liming.gao@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Cecil Sheng <cecil.sheng@hpe.com>
Cc: Abner Chang <abner.chang@hpe.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Dandan Bi <dandan.bi@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
The original implementation only looks for very last backslash
and removes the string after that.
But when the path is like "FS0:File.txt" which doesn't contain
backslash, the function cannot work well.
The patch enhances the code to look for very last backslash or
colon to support the path which doesn't contain backslash.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ruiyu Ni <ruiyu.ni@intel.com>
Reviewed-by: Jaben Carsey <jaben.carsey@intel.com>
Reviewed-by: Tapan Shah <tapandshah@hpe.com>
Add the following definition in the [BuildOptions] section in package DSC
files to disable APIs that are deprecated:
[BuildOptions]
*_*_*_CC_FLAGS = -D DISABLE_NEW_DEPRECATED_INTERFACES
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hao Wu <hao.a.wu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
This adds support for GCC 5.x in LTO mode for IA32, X64, ARM and
AARCH64. Due to the fact that the GCC project switched to a new
numbering scheme where the first digit is now incremented for every
major release, the new toolchain is simply called 'GCC5', and is
intended to support all GCC v5.x releases.
Since IA32 and X64 enable compiler optimizations (-Os) for both DEBUG
and RELEASE builds, LTO support is equally enabled for both targets.
On ARM and AARCH64, DEBUG builds are not optimized, and so the LTO
optimizations are only enabled for RELEASE.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
When using GCC to build for X64, we switched to the position independent
small code model, which is much more efficient in terms of code generation
and runtime relocation footprint, and produces binaries that can execute
correctly from any offset.
However, the PIC routines are by default geared towards hosted binaries
containing symbol references that may resolve to definitions in other
dynamic objects, and for this reason, most symbol references are indirected
via a GOT entry (which also results in a .reloc fixup entry) unless we
annotate them.
For this reason, we introduced the 'protected' visibility annotation for
all symbol definitions and references, by setting the GCC visibility
pragma. However, as it turns out, this is not sufficient for all versions
of GCC, and in some cases (GCC 5.x using the GCC49 toolchain tag), may
still result in GOT based relocations.
So switch to 'hidden' visibility instead, which is slightly stronger, and
fixes this issue for the versions of GCC that exhibit the problem.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
When building position independent (PIC) ELF objects, the GCC compiler
assumes that each symbol with external linkage may potentially end up
being exported from a shared library, which means that each of those
symbols may be subject to symbol preemption, i.e., the executable
linking to the shared library at runtime may override symbols exported
by the shared library, and every internal reference held by the shared
library itself *must* be made to point to the overridden version instead.
For this reason, PIC code symbol references always go via the Global
Offset Table (GOT), even if the code in question references symbols that
are defined in the same compilation unit. The GOT refers to each symbol
by absolute address, and so each entry is subject to runtime relocation.
Since not every symbol with external linkage is ultimately exported from
a shared library, the GCC compiler allows control over symbol visibility
using attributes, command line arguments and pragmas, where 'protected'
means that the symbol is only referenced by the shared library itself.
Due to the poor hygiene in EDK2 regarding the use of the 'static'
modifier, many symbols that are local to their compilation unit end up
being referenced indirectly via the GOT when building PIC code.
In UEFI, there are no shared libraries and so there is no need to deal
with symbol preemption, and we can mark every symbol reference protected.
The only method that applies to all symbol definitions as well as
declarations is the #pragma. So set the visibility 'protected' pragma when
building PIC code for X64 using GCC. Note that this affects code generated
with the -fpie compiler switch as well as the -fpic compiler switch.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-By: Liming Gao <liming.gao@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
This is never set anymore, so unsetting it or testing whether it is unset
no longer makes any sense.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-By: Liming Gao <liming.gao@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Both GCC and LLVM 3.8 64bits support new variable argument (VA)
intrinsics for Microsoft ABI, enable these new VA intrinsics for
GNUC family 64bits code build. These VA intrinsics are only
permitted use in 64bits code, so not use them in 32bits code build.
The original 32bits GNU VA intrinsics has the same calling convention
as MS, so we don't need change them.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Steven Shi <steven.shi@intel.com>
[ardb: update CPP logic so that the change only applies to X64]
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-By: Liming Gao <liming.gao@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
GCC v4.4 does not implement __builtin_unreachable(), so avoid using
it when building with this version or earlier.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-By: Liming Gao <liming.gao@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
BaseLib Ia32 InternalSwitchStack.S has no matched InternalSwitchStack.nasm.
Use ObjDump to verify the output object files be same.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Some processor may return small cache line size, we should return 32 bytes at
least for spin lock alignment.
Cc: Liming Gao <liming.gao@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
The "Pci22.h" header file defines the macro EFI_PCI_CAPABILITY_ID_HOTPLUG
with value 0x06. According to all of:
- later parts of the same header file,
- Appendix H ("Capability IDs") of the PCI Local Bus Specification
Revision 2.3,
- and Chapter 2 ("Capability IDs") of the PCI Code and ID Assignment
Specification Revision 0.9,
0x06 means "CompactPCI Hot Swap". It does not mean "PCI Hot-Plug": that
capability is described by ID 0x0C:
0Ch PCI Hot-Plug -- This Capability ID indicates that the associated
device conforms to the Standard Hot-Plug Controller model.
Therefore EFI_PCI_CAPABILITY_ID_HOTPLUG is arguably a misnomer. PciBusDxe
(mis-)uses EFI_PCI_CAPABILITY_ID_HOTPLUG in the IsSHPC() helper function
to identify PCI Hot-Plug capability.
In order to preserve compatibility with existent code, leave
EFI_PCI_CAPABILITY_ID_HOTPLUG alone, and introduce
EFI_PCI_CAPABILITY_ID_SHPC with the right ID value.
Cc: "Johnson, Brian J." <bjohnson@sgi.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Andrew Fish <afish@apple.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Star Zeng <star.zeng@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ruiyu Ni <Ruiyu.ni@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/DisablePaging32.asm to Ia32/DisablePaging32.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/InterlockedIncrement.asm to X64/InterlockedIncrement.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/InterlockedDecrement.asm to X64/InterlockedDecrement.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/InterlockedCompareExchange16.asm to X64/InterlockedCompareExchange16.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/InterlockedCompareExchange32.asm to X64/InterlockedCompareExchange32.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/InterlockedCompareExchange64.asm to X64/InterlockedCompareExchange64.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/InterlockedIncrement.asm to Ia32/InterlockedIncrement.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/InterlockedDecrement.asm to Ia32/InterlockedDecrement.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/InterlockedCompareExchange16.asm to Ia32/InterlockedCompareExchange16.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/InterlockedCompareExchange32.asm to Ia32/InterlockedCompareExchange32.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/InterlockedCompareExchange64.asm to Ia32/InterlockedCompareExchange64.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/CpuSleep.asm to X64/CpuSleep.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/CpuFlushTlb.asm to X64/CpuFlushTlb.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/CpuFlushTlb.asm to Ia32/CpuFlushTlb.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
Ia32/CpuSleep.asm to Ia32/CpuSleep.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
The BaseTools/Scripts/ConvertMasmToNasm.py script was used to convert
X64/CopyMem.asm to X64/CopyMem.nasm
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>