All AutoGen.obj files consist of global GUID definitions, fixed
and patchable PCDs and other data that is essentially read-only at
runtime but has not been declared as such for various reasons.
By moving these contents to .text we achieve two things:
- global GUIDs and other data items which must be constant for correct
program operation can no longer be modified, for instance, when
running a DXE_RUNTIME_MODULE binary under the OS with the Properties
Table feature for memory protection enabled;
- the .data section becomes smaller, and may be dropped completely for
many XIP modules, which reduces wasted FV space if the PE/COFF section
alignment is large.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18137 6f19259b-4bc3-4df7-8a09-765794883524
Now that GenFw honors the ELF section alignment when placing the
PE/COFF sections in the output, the start of the PE/COFF version of
.data will be aligned to the alignment of .text if its alignment is
higher than the default. So duplicate this behavior in the ELF output,
this will make the memory layout of the PE/COFF binary match the
layout of the ELF version more closely.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18136 6f19259b-4bc3-4df7-8a09-765794883524
This unifies all GCC linker scripts into a single parametrised GCC
linker script that can be used for all GCC versions and architectures.
The two parameters that can be set on the linker command line are:
- PECOFF_HEADER_SIZE, this is a build time property of GenFw, but
its value is different between 32-bit and 64-bit;
- common-page-size, this can be set using -z on the ld command line,
and controls the value of the COMMONPAGESIZE constant when used in
a linker script. This value is used for the minimum section alignment.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18135 6f19259b-4bc3-4df7-8a09-765794883524
Instead of hardcoding the values for the PE/COFF header size and the
section alignment, set them on the linker command line. This factors
out these values from the various linker scripts, which will allow us
to unify them in a subsequent patch.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18134 6f19259b-4bc3-4df7-8a09-765794883524
Move the .got contents to the PE/COFF .text section. This should be
a no-op, since we typically don't generate position independent code
(i.e., using -fPIC). But since the GOT contains variable addresses that
are updated at relocation time only, its contents are best kept in .text
to prevent them from being overwritten inadvertently.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18133 6f19259b-4bc3-4df7-8a09-765794883524
There is no need to pad out the end of a section of the start of
the following section is aligned to the same value. So drop the
redundant ALIGN() statements.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18132 6f19259b-4bc3-4df7-8a09-765794883524
The .rodata ELF section contains constant non-executable data that
should never be modified by the program itself. Since the risk of
inadvertent modification is typically higher than the risk of
inadvertent execution, it makes sense to put this data in the
R-X .text section rather than in the RW- .data section.
So move it there.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18131 6f19259b-4bc3-4df7-8a09-765794883524
The NOP padding in the GCC linker scripts ensures that all empty
regions in the ELF binary are filled with x86 NOP instructions.
There is no upside to doing this: if the CPU ends up executing these
instructions, we have little hope of resuming normal execution of the
program anyway. And having NOP slides in memory only makes it easier
for attackers to launch exploits. So remove them.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18130 6f19259b-4bc3-4df7-8a09-765794883524
The keyword with value TRUE OR FALSE is used to
indicate whether the FV UI name is included in
FV EXT header as a entry or not.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18090 6f19259b-4bc3-4df7-8a09-765794883524
To prevent double padding of XIP modules leading to excessive
waste of FV space, try to adjust existing padding rather than
adding more.
Instead of adding a pad file to the FV to line up an FFS file that
itself may contain padding to line up the payload, try to find a
dedicated padding section inside the FFS, and reduce its size to
place all subsequent aligned FFS section at their respective minimum
alignments.
When using 4 KB section alignment (which is required on AARCH64 in
some cases), this will save 4 KB for each XIP module.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18080 6f19259b-4bc3-4df7-8a09-765794883524
Instead of using an anonymous section of type EFI_SECTION_RAW to pad
out the first aligned FFS section to its required alignment, use a
section with a dedicated GUID if the size of the padding permits it.
This allows for more flexibility when placing such FFS images in a
firmware volume, because we will now be able to remove padding rather
than add more, by shrinking the size of this section instead of
padding out the start of the FFS image to file alignment.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18079 6f19259b-4bc3-4df7-8a09-765794883524
The secondary header (not the DOS header) of a PE/COFF binary
does not reside at a fixed offset. Instead, its offset into the
file is recorded in the DOS header.
This gives us the flexibility to move it, along with the section
headers, to right before the first section if there is considerable
space before it, i.e., when the PE/COFF file alignment is substantially
larger than the size of the header.
Since the PE/COFF to TE conversion replaces everything before the
section headers with a simple TE header, this change removes all
the header padding from such images, leading to smaller files.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18078 6f19259b-4bc3-4df7-8a09-765794883524
In order to reduce the memory footprint of PE/COFF images when
using large values for the PE/COFF section alignment, move the
contents of the .debug section to data, and point the debug data
directory entry to it. This allows us to drop the .debug section
entirely, as well as any associated rounding. Since our .debug
section only contains the filename of the ELF input image, the
penalty of keeping this data in a non-discardable section is
negligible.
Note that the PE/COFF spec v6.3 explicitly mentions that this is
allowed.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18077 6f19259b-4bc3-4df7-8a09-765794883524
When a quoted string is used as initialization data in a DEC file PCD
entry, the PCD data type in that entry must be VOID*. The created
AutoGen.c defines the PCD data as UINT8[] or UINT16[], depending on
the string type. The created AutoGen.h, however, declares the PCD data
as VOID*. For a standard compile/link, this works because AutoGen.c
doesn't include AutoGen.h. But when GCC LTO is used, the link time
code generation detects the mismatch and the build fails. This
change makes the AutoGen.h PCD data declaration match the AutoGen.c
definition.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Scott Duplichan <scott@notabs.org>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18058 6f19259b-4bc3-4df7-8a09-765794883524
ReadMemoryFileLine () appends a NULL character to the string
it returns, but it failed to account for it in the allocation.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18047 6f19259b-4bc3-4df7-8a09-765794883524
The alignment in rule section is shared by modules to generate FFS,
it should not be modified by certain module.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18016 6f19259b-4bc3-4df7-8a09-765794883524
Relocations of type EFI_IMAGE_REL_BASED_DIR64 are handled in exactly
the same way on all 64-bit machine types (IPF, X64 and AARCH64).
So move the handling of this type to the generic part of the relocation
routine PeCoffLoaderRelocateImage ().
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17942 6f19259b-4bc3-4df7-8a09-765794883524
Some toolchains, at least Fedora GCC, generate inline unwind tables in
object files. These confuses GenFw to no end, leading to build failures:
GenFw: ERROR 3000: Invalid WriteSections64(): ...
unsupported ELF EM_AARCH64 relocation 0x105.
GenFw: ERROR 3000: Invalid WriteSections64(): ...
unsupported ELF EM_AARCH64 relocation 0x0.
I am aware of no current use of these tables, so explicitly disable
their generation for aarch64.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Reviewed-by: Olivier Martin <olivier.martin@arm.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17905 6f19259b-4bc3-4df7-8a09-765794883524
Add a BOM check for UNI file and fix some help message error
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17876 6f19259b-4bc3-4df7-8a09-765794883524
On FreeBSD, uuid.h is in /usr/include, not /usr/include/uuid.
Fix some errors when building using clang caused by self-assignment: the
preferred way to 'use' a variable is '(void)x;', not 'x = x;'.
Where the system provides $(CC) etc. by default, don't override it to be gcc.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Bruce Cran <bruce@cran.org.uk>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17866 6f19259b-4bc3-4df7-8a09-765794883524
Instead of relying on the builtin linker script of GNU ld, which
may vary based on binutils version (which is not tightly coupled to
the GCC version) and linker command line options, introduce a linker
script for AArch64 to be used by all GCC/binutils versions.
The script is laid out such that two ELF sections .text and .data are
created that map onto the PE/COFF with the same names. By aligning
.data to the minimum alignment of .text, and by not adding any
additional padding -which is what LD's builtin linker script does- the
relative offset between .text and .data is retained after the PE/COFF
conversion. This should prevent problems with debuggers and other
tooling that are ELF based.
Also provided is an overlay linker script that increases the alignment
of .text and .data to 64 KB. This is intended for DXE_RUNTIME_DRIVER
modules, to make them compatible with the newly introduced
Properties Table feature.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Olivier Martin <Olivier.Martin@arm.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17824 6f19259b-4bc3-4df7-8a09-765794883524
Instead of relying on the builtin linker script of GNU ld, which
may vary based on binutils version (which is not tightly coupled to
the GCC version) and linker command line options, introduce a linker
script for AArch64 to be used by all GCC/binutils versions.
The script is laid out such that two ELF sections .text and .data are
created that map onto the PE/COFF with the same names. By aligning
.data to the minimum alignment of .text, and by not adding any
additional padding -which is what LD's builtin linker script does- the
relative offset between .text and .data is retained after the PE/COFF
conversion. This should prevent problems with debuggers and other
tooling that are ELF based.
Also provided is an overlay linker script that increases the alignment
of .text and .data to 64 KB. This is intended for DXE_RUNTIME_DRIVER
modules, to make them compatible with the newly introduced
Properties Table feature.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17802 6f19259b-4bc3-4df7-8a09-765794883524
Fix a bug to only checking the copyright listed in config.ini file.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17801 6f19259b-4bc3-4df7-8a09-765794883524
Fix a bug to get correct member variable by ignoring 'OPTIONAL' modifier
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17800 6f19259b-4bc3-4df7-8a09-765794883524
The BuildOptions in an INF should also follow override rule: If '==' is used, all previous options are overridden.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17796 6f19259b-4bc3-4df7-8a09-765794883524
Get maximum section alignment from each ELF section, and this alignment is used to create PE header.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17727 6f19259b-4bc3-4df7-8a09-765794883524
The version of IASL compiler in the tools_def.template file no longer exists on the acpica.org site.
Update download link and remove the specific version info from the tools_def.template file.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17725 6f19259b-4bc3-4df7-8a09-765794883524
a) Fix a bug of displaying wrong format of a GUID
b) Fix a bug of setting wrong exception keyword
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17710 6f19259b-4bc3-4df7-8a09-765794883524
Fix a bug to not break when parsing a macro and not find its value
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17709 6f19259b-4bc3-4df7-8a09-765794883524
Add a checkpoint to check whether a header file in 'include' directory is defined in DEC file
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17708 6f19259b-4bc3-4df7-8a09-765794883524
Add a checkpoint to check invalid format of @ValidRange, @ValidList and @Expression for a PCD
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17707 6f19259b-4bc3-4df7-8a09-765794883524
Add a checkpoint to check that the UNI file which is associated by INF or DEC file need define the prompt and help information.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17706 6f19259b-4bc3-4df7-8a09-765794883524
Add a ‘SkipFileList’ in config.ini to exclude the files not be scanned.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: YangX Li <yangx.li@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17705 6f19259b-4bc3-4df7-8a09-765794883524
We test a simple case of UTF-8 with and without the UTF-8 BOM.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17699 6f19259b-4bc3-4df7-8a09-765794883524
Surrogate pair characters can be encoded in UTF-8 files, but they are
not valid UCS-2 characters.
For example, this python interpreter code:
>>> import codecs
>>> codecs.encode(u'\ud801', 'utf-8')
'\xed\xa0\x81'
But, the range of 0xd800 - 0xdfff should be rejected as unicode code
points because they are reserved for the surrogate pair usage in
UTF-16 files.
We test that this case is rejected for UTF-8 with and without the
UTF-8 BOM.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17698 6f19259b-4bc3-4df7-8a09-765794883524
Since UTF-8 .uni unicode files might contain strings with unicode code
points larger than 16-bits, and UEFI only supports UCS-2 characters,
we need to make sure that BaseTools rejects these characters in UTF-8
.uni source files.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17697 6f19259b-4bc3-4df7-8a09-765794883524
This allows .uni input files to be encoded with UTF-8. Today, we only
support UTF-16 encoding.
The strings are still converted to UCS-2 data for use in EDK II
modules. (This is the only unicode character format supported by UEFI
and EDK II.)
Although UTF-8 would allow any UCS-4 character to be present in the
source file, we restrict the entire file to the UCS-2 range.
(Including comments.) This allows the files to be converted to UTF-16
if needed.
v2:
* Drop .utf8 extension. Use .uni file for UTF-8 data (mdkinney)
* Merge in 'BaseTools/UniClassObject: Verify string data is 16-bit'
commit
v3:
* Restrict the entire file's characters (including comments) to the
UCS-2 range in addition to string data. (mdkinney)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17696 6f19259b-4bc3-4df7-8a09-765794883524
Supplementary Plane characters can exist in UTF-16 files,
but they are not valid UCS-2 characters.
For example, this python interpreter code:
>>> import codecs
>>> codecs.encode(u'\U00010300', 'utf-16')
'\xff\xfe\x00\xd8\x00\xdf'
Therefore the UCS-4 0x00010300 character is encoded as two
16-bit numbers (0xd800 0xdf00) in a little endian UTF-16
file.
For more information, see:
http://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF
This test checks to make sure that BaseTools will reject these
characters in UTF-16 files.
The range of 0xd800 - 0xdfff should also be rejected as unicode code
points because they are reserved for the surrogate pair usage in
UTF-16 files.
This test was fixed by the previous commit:
"BaseTools/UniClassObject: Verify valid UCS-2 chars in UTF-16 .uni files"
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17695 6f19259b-4bc3-4df7-8a09-765794883524
Supplementary Plane characters can exist in UTF-16 files,
but they are not valid UCS-2 characters.
For example, refer to this python interpreter code:
>>> import codecs
>>> codecs.encode(u'\U00010300', 'utf-16')
'\xff\xfe\x00\xd8\x00\xdf'
Therefore the UCS-4 0x00010300 character is encoded as two
16-bit numbers (0xd800 0xdf00) in a little endian UTF-16
file.
For more information, see:
http://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF
This means that our current BaseTools code could be allowing
unsupported UTF-16 characters be used. To fix this, we decode the file
using python's utf-16 decode support. Then we verify that each
character's code point is 0xffff or less.
v3:
* Based on Mike Kinney's feedback, we now read the whole file and
verify up-front that it contains valid UCS-2 characters. Thanks
also to Laszlo Ersek for pointing out the Supplementary Plane
characters.
v4:
* Reject code points in 0xd800-0xdfff range since they are reserved
for UTF-16 surrogate pairs. (lersek)
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>
git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17694 6f19259b-4bc3-4df7-8a09-765794883524