While building with the following command line:
build -b DEBUG -a AARCH64 -t VS2017 -p MdeModulePkg\MdeModulePkg.dsc
A missing cast triggers the following warning, then triggering an error:
ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c(652):
warning C4152: nonstandard extension, function/data pointer
conversion in expression
This patch first casts the function pointer to (UINTN), then to (VOID *),
followowing the C99 standard s6.3.2.3 "Pointer", paragraphs 5 and 6.
This suppresses the warning.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
TT_ATTR_INDX_INVALID is #define'd but never used so drop it. Note
that this leaves a CPP macro of the same name in CpuDxe, but there,
it is actually being used, and although the name suggests that this
value is somehow defined by the architecture, this is really not the
case and it only has meaning within the scope of CpuDxe's implementation.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Only a single call to GetRootTranslationTableInfo() remains, which
only provides the root table level. So let's create a new static
helper function that returns just this value, and use it instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
LookupAddresstoRootTable() uses a loop to go over its MaxAddress
argument, essentially to do a log2() and determine how many bits are
needed to represent it. Since the argument is the result of a shift-left
expression, there is some room for improvement here, and we can simply
use the bit count directly to calculate the value of T0SZ. At the same
time, we can omit calling GetRootTranslationTableInfo() to determine the
number of root table entries, and add a new helper that applies the
trivial calculation directly.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
The routine PageAttributeToGcdAttribute() is exported by ArmMmuLib
but only ever used in the implementation of CpuDxe. So let's move
the function there and make it STATIC.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Currently, depending on the size of the region being (re)mapped, the
page table manipulation code may replace a table entry with a block entry,
even if the existing table entry uses different mapping attributes to
describe different parts of the region it covers. This is undesirable, and
instead, we should avoid doing so unless we are disregarding the original
attributes anyway. And if we make such a replacement, we should free all
the page tables that have become orphaned in the process.
So let's implement this, by taking the table entry path through the code
for block sized regions if a table entry already exists, and the clear
mask is set (which means we are preserving attributes from the existing
mapping). And when we do replace a table entry with a block entry, free
all the pages that are no longer referenced.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Reviewed-by: Ashish Singhal <ashishsingha@nvidia.com>
Tested-by: Ashish Singhal <ashishsingha@nvidia.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Given how the meaning of the attribute bits for page table entry types
is slightly awkward, and changes between levels, add some helpers to
abstract from this.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Reviewed-by: Ashish Singhal <ashishsingha@nvidia.com>
Tested-by: Ashish Singhal <ashishsingha@nvidia.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
FreePageTablesRecursive () traverses the page table tree depth first
to free all pages that it finds, without taking into account the
level at which it is operating.
Since TT_TYPE_TABLE_ENTRY aliases TT_TYPE_BLOCK_ENTRY_LEVEL3, we cannot
distinguish table entries from block entries unless we take the level
into account, and so we may be dereferencing garbage if we happen to
try and free a hierarchy of page tables that has level 3 pages in it.
Let's fix this by passing the level into FreePageTablesRecursive (),
and limit the recursion to levels < 3.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Reviewed-by: Ashish Singhal <ashishsingha@nvidia.com>
Tested-by: Ashish Singhal <ashishsingha@nvidia.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Some cosmetic fixups to the AArch64 MMU code:
- reflow overly long lines unless it hurts legibility
- add/remove whitespace according to the [de facto] coding style
- use camel case for goto labels
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-Id: <20200307091008.14918-3-ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
This is the AARCH64 counterpart of commit 1f3b1eb308, to remove
a pointless check against the memory type of the allocations that the
page tables happened to land in. On ArmV8, we use writeback cacheable
exclusively for all memory.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-Id: <20200307091008.14918-2-ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
As it turns out, ARMv8 also permits accesses made with the MMU and
caches off to hit in the caches, so to ensure that any modifications
we make before enabling the MMU are visible afterwards as well, we
should invalidate page tables right after allocation like we do now on
ARM, if the MMU is still disabled at that point.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Message-Id: <20200307083849.8940-3-ard.biesheuvel@linaro.org>
Replace the slightly overcomplicated page table management code with
a simplified, recursive implementation that should be far easier to
reason about.
Note that, as a side effect, this extends the per-entry cache invalidation
that we do on page table entries to block and page entries, whereas the
previous change inadvertently only affected the creation of table entries.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-Id: <20200307083849.8940-2-ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
In the AARCH64 version of ArmMmuLib, we are currently relying on
set/way invalidation to ensure that the caches are in a consistent
state with respect to main memory once we turn the MMU on. Even if
set/way operations were the appropriate method to achieve this, doing
an invalidate-all first and then populating the page table entries
creates a window where page table entries could be loaded speculatively
into the caches before we modify them, and shadow the new values that
we write there.
So let's get rid of the blanket clean/invalidate operations, and
instead, update ArmUpdateTranslationTableEntry () to invalidate each
page table entry *after* it is written if the MMU is still disabled
at this point.
On ARMv8, it is guaranteed that memory accesses done by the page table
walker are cache coherent, and so we can ignore the case where the
MMU is on.
Since the MMU and D-cache are already off when we reach this point, we
can drop the MMU and D-cache disables as well. Maintenance of the I-cache
is unnecessary, since we are not modifying any code, and the installed
mapping is guaranteed to be 1:1. This means we can also leave it enabled
while the page table population code is running.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif@nuviainc.com>
Currently, we always invalidate the TLBs entirely after making
any modification to the page tables. Now that we have introduced
strict memory permissions in quite a number of places, such
modifications occur much more often, and it is better for performance
to flush only those TLB entries that are actually affected by
the changes.
At the same time, relax some system wide data synchronization barriers
to non-shared. When running in UEFI, we don't share virtual address
translations with other masters, unless we are running under virt, but
in that case, the host will upgrade them as appropriate (by setting
an override at EL2)
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Take care not to dereference BlockEntry if it may be pointing past
the end of the page table we are manipulating. It is only a read,
and thus harmless, but HeapGuard triggers on it so let's fix it.
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
When creating the page tables for the 1:1 mapping, ensure that we don't
attempt to map more than what is architecturally permitted when running
with 4 KB pages, which is 48 bits of VA. This will be reflected in the
value of MAX_ALLOC_ADDRESS once we override it for AArch64, so use that
macro instead of MAX_ADDRESS.
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
In preparation of dropping PcdPrePiCpuMemorySize entirely, base the
maximum size of the identity map on the capabilities of the CPU.
Since that may exceed what is architecturally permitted when using
4 KB pages, take MAX_ADDRESS into account as well.
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Flash region needs to be set as cacheable (write back) to increase
performance, if PEI is still XIP on flash or DXE FV is decompressed
from flash FV. However some ARM platforms do not support to set flash
as inner shareable since flash is not normal DDR memory and it will
not respond to cache snoop request, which will causes system hang
after MMU is enabled.
So we need a new ARM memory region attribute WRITE_BACK_NONSHAREABLE
for flash region on these platforms specifically. This attribute will
set the region as write back but not inner shared.
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Peicong Li <lipeicong@huawei.com>
Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
We no longer make use of the ArmMmuLib 'feature' to create aliased
memory ranges with mismatched attributes, and in fact, it was only
wired up in the ARM version to begin with.
So remove the VirtualMask argument from ArmSetMemoryAttributes()'s
prototype, and remove the dead code that referred to it.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
... where it belongs, since AARCH64 already keeps it there, and
non DXE users of ArmMmuLib (such as DxeIpl, for the non-executable
stack) may need its functionality as well.
While at it, rename SetMemoryAttributes to ArmSetMemoryAttributes,
and make any functions that are not exported STATIC. Also, replace
an explicit gBS->AllocatePages() call [which is DXE specific] with
MemoryAllocationLib::AllocatePages().
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
The routines ArmConfigureMmu(), SetMemoryAttributes() [*] and the
various set/clear read-only/no-exec routines are declared as returning
EFI_STATUS in the respective header files, so align the definitions with
that.
* SetMemoryAttributes() is declared in the wrong header (and defined in
ArmMmuLib for AARCH64 and in CpuDxe for ARM)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Enable the hardware stack alignment check, as mandated by the UEFI spec.
This ensures that the stack pointer is 16 byte aligned at each instance
where it is used as the base address in a load/store operation.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Since the new DXE page protection for PE/COFF images may invoke
EFI_CPU_ARCH_PROTOCOL.SetMemoryAttributes() with only permission
attributes set, add support for this in the AARCH64 MMU code.
Move the EFI_MEMORY_CACHETYPE_MASK macro to a shared location between
CpuDxe and ArmMmuLib so we don't have to introduce yet another
definition.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Current Arm CpuDxe driver uses EFI_MEMORY_WP for write protection,
according to UEFI spec, we should use EFI_MEMORY_RO for write protection.
The EFI_MEMORY_WP is the cache attribute instead of memory attribute.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jiewen Yao <jiewen.yao@intel.com>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
This reverts commit d32702d2c2.
Using a pool allocation for the root translation table seemed like
a good idea at the time, but as it turns out, such allocations are
handled in a way that makes them unsuitable for this purpose: they
are backed by HOBs that don't remain in the same place during the
various PI phase changes, which means the address programmed into
the TTBR register is no longer valid, and may refer to memory that
is reported as available to the OS.
So switch back to using a page based allocation.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Translation table walks are always cache coherent on ARMv8-A, so cache
maintenance on page tables is never needed. Since there is a risk of
loss of coherency when using mismatched attributes, and given that memory
is mapped cacheable except for extraordinary cases (such as non-coherent
DMA), restrict the page table walker to performing cacheable accesses to
the translation tables.
For DEBUG builds, retain some of the logic so that we can double check
that the memory holding the root translation table is indeed located in
memory that is mapped cacheable.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
As reported by Eugene, the practice of sizing the address space in the
virtual memory system based on the maximum address in the table passed
to ArmConfigureMmu() is problematic, since it fails to take into account
the fact that the GCD memory space may be extended at a later time, both
for memory and for MMIO. So instead, choose the VA size identical to the
GCD memory map size, which is based on PcdPrePiCpuMemorySize on ARM
systems.
Reported-by: Eugene Cohen <eugene@hp.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Eugene Cohen <eugene@hp.com>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Currently, we allocate a full page for the root translation table, even
if the configured translation only requires two entries (16 bytes) for
the root level, which happens to be the case for a 40 bit VA. Likewise,
for a 36-bit VA space, the root table only needs 16 entries of 8 bytes
each, adding up to 128 bytes.
So switch to a pool allocation for the root table if we can, but take into
account that the architecture requires it to be naturally aligned to its
size, i.e., a 64 byte table requires 64 byte alignment, whereas pool
allocations in general are only guaranteed to be aligned to 8 bytes.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
In commit 7d189f99d8 ("ArmPkg/Mmu: Fix bug of aligning new allocated
page table"), we fixed a flaw in the logic regarding alignment of newly
allocated translation table pages. However, we all failed to spot that
aligning page based allocations to page size is rather pointless to
begin with, so simply allocate a single page each time we add new pages
to the translation tables.
Also, drop the unnecessary cast.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
The relations between T0SZ, the number of translation levels and the
size/alignment of the root table can be expressed in simple arithmetic
expressions, so get rid of the lookup table.
Note that this disregards the fact that the maximum value of T0SZ is
39 not 42 (as one would expect for the smallest VA size using 2 levels)
but since this corresponds to a VA size of 32 MB and 4 MB, respectively,
neither of which are sufficient to run UEFI, we can safely ignore the
distinction.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
Annotate functions with ASM_FUNC() so that they are emitted into
separate sections.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
The function ArmReplaceLiveTranslationEntry() has been moved to
ArmMmuLib, so remove the old implementation from ArmLib.
Note that the new implementation was not exported from the object file,
and so references to it were satisfied by the old version residing in
ArmLib. Since we are removing that one, we need to export the new one
at the same time to prevent the linker from bailing with undefined
reference errors.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
This introduces a special version of ArmMmuLib for PEIMs that takes care
only to perform cache maintenance on the live entry replacement routine
if the module is not executing in place. Not only is such cache maintenance
unnecessary in that case, it may be actively harmful on some systems that
fail to tolerate cache maintenance operations on NOR flash regions.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
This base library encapsulates the MMU manipulation routines that have been
factored out of ArmLib. The functionality covers initial creation of the 1:1
mapping in the page tables, and remapping regions to change permissions or
cacheability attributes.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>