audk/BaseTools/Source/Python/AutoGen
Jordan Justen d80e451b18 BaseTools/UniClassObject: Verify valid UCS-2 chars in UTF-16 .uni files
Supplementary Plane characters can exist in UTF-16 files,
but they are not valid UCS-2 characters.

For example, refer to this python interpreter code:
>>> import codecs
>>> codecs.encode(u'\U00010300', 'utf-16')
'\xff\xfe\x00\xd8\x00\xdf'

Therefore the UCS-4 0x00010300 character is encoded as two
16-bit numbers (0xd800 0xdf00) in a little endian UTF-16
file.
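
The surrogate pair above can be reproduced by hand. The following is a minimal sketch of the standard UTF-16 surrogate arithmetic; the helper name `utf16_surrogates` is hypothetical and is not part of BaseTools.

```python
import struct


def utf16_surrogates(code_point):
    """Split a Supplementary Plane code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a Supplementary Plane code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = utf16_surrogates(0x10300)
# Pack both 16-bit units as little endian, without a byte order mark.
encoded = struct.pack('<HH', high, low)
```

For 0x00010300 the offset is 0x300, so the high surrogate is 0xd800 and the low surrogate is 0xdc00 + 0x300 = 0xdf00, which matches the bytes that follow the byte order mark in the interpreter output.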

For more information, see:
http://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF

This means that our current BaseTools code could be allowing
unsupported UTF-16 characters to be used. To fix this, we decode the file
using python's utf-16 decode support. Then we verify that each
character's code point is 0xffff or less.

v3:
 * Based on Mike Kinney's feedback, we now read the whole file and
   verify up-front that it contains valid UCS-2 characters. Thanks
   also to Laszlo Ersek for pointing out the Supplementary Plane
   characters.

v4:
 * Reject code points in 0xd800-0xdfff range since they are reserved
   for UTF-16 surrogate pairs. (lersek)
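
The check described above (decode as UTF-16, then reject code points above 0xffff or in the 0xd800-0xdfff range) can be sketched roughly as follows. This is an illustration only, not the code in UniClassObject.py; the function names `find_invalid_ucs2` and `verify_uni_bytes` are hypothetical, and the sketch assumes Python 3, where a decoded string is iterated one code point at a time.

```python
def find_invalid_ucs2(text):
    """Return (index, code_point) of the first non-UCS-2 character, or None."""
    for index, char in enumerate(text):
        code_point = ord(char)
        # Supplementary Plane characters do not fit in 16 bits, and the
        # 0xd800-0xdfff range is reserved for UTF-16 surrogate pairs.
        if code_point > 0xFFFF or 0xD800 <= code_point <= 0xDFFF:
            return index, code_point
    return None


def verify_uni_bytes(data):
    """Decode UTF-16 bytes and fail up-front if any character is not UCS-2."""
    text = data.decode('utf-16')
    bad = find_invalid_ucs2(text)
    if bad is not None:
        index, code_point = bad
        raise ValueError(
            "character %d (U+%04X) is not a valid UCS-2 character"
            % (index, code_point))
    return text
```

Reading and checking the whole file before any parsing starts means an invalid character is reported once, up-front, rather than part-way through processing.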

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17694 6f19259b-4bc3-4df7-8a09-765794883524
2015-06-23 23:34:19 +00:00
AutoGen.py BaseTools: The token values cannot be numeric same with different PCDs. 2015-06-23 07:02:03 +00:00
BuildEngine.py BaseTools: Implement BUILDRULEORDER for tools_def 2015-05-26 10:32:07 +00:00
GenC.py BaseTools: Added extern declaration for protocols/PPI/GUID in AutoGen.h 2015-06-08 08:08:58 +00:00
GenDepex.py There is a limitation on WINDOWS OS for the length of entire file path can’t be larger than 255. There is an OS API provided by Microsoft to add “\\?\” before the path header to support the long file path. Enable this feature on basetools. 2014-08-15 03:06:48 +00:00
GenMake.py BaseTools: Fixed a bug to generate correct path of PACKAGE_RELATIVE_PATH 2015-06-16 04:23:00 +00:00
GenPcdDb.py BaseTools/Build: The PCD value in uninitialized data range should be natural aligned. 2015-05-12 00:58:20 +00:00
InfSectionParser.py License header updated to match correct format. 2014-08-28 13:53:34 +00:00
StrGather.py License header updated to match correct format. 2014-08-28 13:53:34 +00:00
UniClassObject.py BaseTools/UniClassObject: Verify valid UCS-2 chars in UTF-16 .uni files 2015-06-23 23:34:19 +00:00
ValidCheckingInfoObject.py BaseTools/Build: Add SDL support 2015-04-10 06:59:47 +00:00
__init__.py Sync EDKII BaseTools to BaseTools project r1971 2010-05-18 05:04:32 +00:00