Commit Graph

18 Commits

Author SHA1 Message Date
Jordan Justen be264422c9 BaseTools/UniClassObject: Support UTF-8 string data in .uni files
This allows .uni input files to be encoded with UTF-8. Today, we only
support UTF-16 encoding.

The strings are still converted to UCS-2 data for use in EDK II
modules. (This is the only unicode character format supported by UEFI
and EDK II.)

Although UTF-8 would allow any UCS-4 character to be present in the
source file, we restrict the entire file to the UCS-2 range.
(Including comments.) This allows the files to be converted to UTF-16
if needed.

v2:
 * Drop .utf8 extension. Use .uni file for UTF-8 data (mdkinney)
 * Merge in 'BaseTools/UniClassObject: Verify string data is 16-bit'
   commit

v3:
 * Restrict the entire file's characters (including comments) to the
   UCS-2 range in addition to string data. (mdkinney)

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17696 6f19259b-4bc3-4df7-8a09-765794883524
2015-06-23 23:34:28 +00:00
Jordan Justen d80e451b18 BaseTools/UniClassObject: Verify valid UCS-2 chars in UTF-16 .uni files
Supplementary Plane characters can exist in UTF-16 files,
but they are not valid UCS-2 characters.

For example, refer to this python interpreter code:
>>> import codecs
>>> codecs.encode(u'\U00010300', 'utf-16')
'\xff\xfe\x00\xd8\x00\xdf'

Therefore the UCS-4 0x00010300 character is encoded as two
16-bit numbers (0xd800 0xdf00) in a little endian UTF-16
file.

For more information, see:
http://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF

This means that our current BaseTools code could be allowing
unsupported UTF-16 characters be used. To fix this, we decode the file
using python's utf-16 decode support. Then we verify that each
character's code point is 0xffff or less.

v3:
 * Based on Mike Kinney's feedback, we now read the whole file and
   verify up-front that it contains valid UCS-2 characters. Thanks
   also to Laszlo Ersek for pointing out the Supplementary Plane
   characters.

v4:
 * Reject code points in 0xd800-0xdfff range since they are reserved
   for UTF-16 surrogate pairs. (lersek)

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17694 6f19259b-4bc3-4df7-8a09-765794883524
2015-06-23 23:34:19 +00:00
Yingke Liu 45265a8614 BaseTools: Fix a bug that UNI file can't have comment after #include "file.uni"
The 'include' regular expression cannot match spaces before or after this statement.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@17680 6f19259b-4bc3-4df7-8a09-765794883524
2015-06-23 06:52:12 +00:00
Yingke Liu 8546dfeace Fix a regression bug to uni parser.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@16469 6f19259b-4bc3-4df7-8a09-765794883524
2014-12-03 08:30:56 +00:00
Cecil Sheng 71f02911b1 Corrected slash and quote handling in the strings of UNI files.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Cecil Sheng <cecil.sheng@hp.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@16456 6f19259b-4bc3-4df7-8a09-765794883524
2014-12-01 01:05:05 +00:00
Yingke Liu 97fa0ee9b1 License header updated to match correct format.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Yingke Liu <yingke.d.liu@intel.com>
Reviewed-by: Liming Gao <liming.gao@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@15971 6f19259b-4bc3-4df7-8a09-765794883524
2014-08-28 13:53:34 +00:00
Hess Chen 1be2ed90a2 There is a limitation on WINDOWS OS for the length of entire file path can’t be larger than 255. There is an OS API provided by Microsoft to add “\\?\” before the path header to support the long file path. Enable this feature on basetools.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Hess Chen <hesheng.chen@intel.com>
Reviewed-by: Yingke Liu <yingke.d.liu@intel.com>

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@15809 6f19259b-4bc3-4df7-8a09-765794883524
2014-08-15 03:06:48 +00:00
Liming Gao 4afd3d0422 Sync BaseTool trunk (version r2599) into EDKII BaseTools.
Signed-off-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Heshen Chen <chen.heshen@intel.com>


git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@14591 6f19259b-4bc3-4df7-8a09-765794883524
2013-08-23 02:18:16 +00:00
lgao4 2bcc713e74 Sync BaseTool trunk (version r2423) into EDKII BaseTools. The change mainly includes:
1. Fix !include issues
  2. Fix Trim to skip the postfix 'U' for hexadecimal and decimal numbers
  3. Fix building error C2733 when building C++ code.
  4. Add GCC46 tool chain definition
  5. Add new RVCT and RVCTLINUX tool chains

Signed-off-by: lgao4


git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@12782 6f19259b-4bc3-4df7-8a09-765794883524
2011-11-25 06:21:03 +00:00
lgao4 79b74a03e0 Sync BaseTools Branch (version r2362) to EDKII main trunk.
Signed-off-by: lgao4
Reviewed-by: jsu1
Reviewed-by: ydliu

git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@12525 6f19259b-4bc3-4df7-8a09-765794883524
2011-10-11 02:49:48 +00:00
lgao4 da92f27632 Sync BaseTools Branch (version r2149) to EDKII main trunk.
BaseTool Branch:
  https://edk2-buildtools.svn.sourceforge.net/svnroot/edk2-buildtools/branches/Releases/BaseTools_r2100

  



git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@11640 6f19259b-4bc3-4df7-8a09-765794883524
2011-05-11 10:26:49 +00:00
lgao4 08dd311f5d Sync EDKII BaseTools to BaseTools project r2065.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@10915 6f19259b-4bc3-4df7-8a09-765794883524
2010-10-11 06:26:52 +00:00
lgao4 756ad8f8e9 Sync EDKII BaseTools to BaseTools project r2006.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@10764 6f19259b-4bc3-4df7-8a09-765794883524
2010-08-03 03:29:17 +00:00
lgao4 40d841f6a8 Sync EDKII BaseTools to BaseTools project r1971
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@10502 6f19259b-4bc3-4df7-8a09-765794883524
2010-05-18 05:04:32 +00:00
lgao4 f3decdc362 Sync EDKII BaseTools to BaseTools project r1937.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@10287 6f19259b-4bc3-4df7-8a09-765794883524
2010-03-19 06:55:07 +00:00
lgao4 52302d4dee Sync EDKII BaseTools to BaseTools project r1903.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@10123 6f19259b-4bc3-4df7-8a09-765794883524
2010-02-28 23:39:39 +00:00
lgao4 b303ea726e Sync tool code to BuildTools project r1739.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@9397 6f19259b-4bc3-4df7-8a09-765794883524
2009-11-09 11:47:35 +00:00
lgao4 30fdf1140b Check In tool source code based on Build tool project revision r1655.
git-svn-id: https://edk2.svn.sourceforge.net/svnroot/edk2/trunk/edk2@8964 6f19259b-4bc3-4df7-8a09-765794883524
2009-07-17 09:10:31 +00:00