mirror of
https://gcc.gnu.org/git/gcc.git
synced 2024-12-22 18:45:10 +08:00
73dd5c6c88
Updates cpp_wcwidth() to Unicode 15, following the procedure in
contrib/unicode/README mechanically without incident.

contrib/ChangeLog:

	* unicode/DerivedCoreProperties.txt: Update to Unicode 15.
	* unicode/DerivedNormalizationProps.txt: Likewise.
	* unicode/EastAsianWidth.txt: Likewise.
	* unicode/PropList.txt: Likewise.
	* unicode/README: Likewise.
	* unicode/UnicodeData.txt: Likewise.

libcpp/ChangeLog:

	* generated_cpp_wcwidth.h: Regenerated for Unicode 15.
56 lines
2.5 KiB
Plaintext
This directory contains a mechanism for GCC to have its own internal
implementation of wcwidth functionality (cpp_wcwidth () in libcpp/charset.c),
as well as a mechanism to update the information about codepoints permitted in
identifiers, which is encoded in libcpp/ucnid.h.

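As a rough illustration of what "wcwidth functionality" means, the sketch
below estimates the number of terminal columns a character occupies using
Python's standard unicodedata module.  It is not GCC's implementation: the
function name approx_wcwidth is hypothetical, the rules are deliberately
simplified (for instance, non-printable characters are not treated
specially), and the results depend on the Unicode version bundled with the
Python interpreter.

```python
import unicodedata


def approx_wcwidth(ch: str) -> int:
    """Rough column width of one character; an illustration, not GCC's table."""
    # Nonspacing and enclosing combining marks occupy no column of their own.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # East Asian Wide (W) and Fullwidth (F) characters take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else is assumed to take one column.
    return 1


if __name__ == "__main__":
    for ch in ("A", "\u4e2d", "\u0301"):
        print(f"U+{ord(ch):04X} -> {approx_wcwidth(ch)}")
```
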
The idea is to produce the necessary lookup tables
(../../libcpp/{ucnid.h,generated_cpp_wcwidth.h}) in a reproducible way, starting
from the following files that are distributed by the Unicode Consortium:

  ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
  ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
  ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt
  ftp://ftp.unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt
  ftp://ftp.unicode.org/Public/UNIDATA/DerivedCoreProperties.txt

These files have been added to source control in this directory;
please see unicode-license.txt for the relevant copyright information.

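For readers unfamiliar with these data files, the sketch below parses lines
in the "codepoint-or-range ; value # comment" layout used by property files
such as EastAsianWidth.txt.  It is only an illustration: the helper name
parse_property_line and the sample lines are assumptions made for this
example, it is not the parser used by the scripts in this directory, and
UnicodeData.txt uses a different, multi-field layout that it does not handle.

```python
def parse_property_line(line):
    """Return (first, last, value) for a data line, or None for comments/blanks."""
    # Everything after '#' is a comment.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    codepoints, value = fields[0], fields[1]
    # A field is either a single codepoint or an inclusive range "A..B".
    if ".." in codepoints:
        first, last = codepoints.split("..")
    else:
        first = last = codepoints
    return int(first, 16), int(last, 16), value


# Illustrative sample in the style of EastAsianWidth.txt.
SAMPLE = """\
# Comment lines start with '#'.
0041..005A;Na   # Lu  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
3000;F          # Zs       IDEOGRAPHIC SPACE
"""

if __name__ == "__main__":
    for row in map(parse_property_line, SAMPLE.splitlines()):
        if row is not None:
            print(row)
```
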
In order to keep in sync with glibc's wcwidth as much as possible, it is
desirable for the logic that processes the Unicode data to be the same as
glibc's.  To that end, the from_glibc/ subdirectory of this directory holds
the glibc Python code that implements that logic.  This code was copied
verbatim from glibc, and it can be updated at any time from the glibc
source code repository.  The files copied from that repository are:

  localedata/unicode-gen/unicode_utils.py
  localedata/unicode-gen/utf8_gen.py

And the most recent versions added to GCC are from glibc git commit:
  4c721f24fc190d1dc935eb0bab283de7cf13182e

The script gen_wcwidth.py found here contains the GCC-specific code to
map glibc's output to the lookup tables we require.  This script should not need
to change, unless there are structural changes to the Unicode data files or to
the glibc code.  Similarly, makeucnid.cc in ../../libcpp contains the logic to
produce ucnid.h.

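To give a feel for what a width lookup table can look like, the sketch below
collapses per-codepoint widths into runs and looks a codepoint up by binary
search.  This is an assumed, simplified layout chosen for illustration; the
names build_range_table and lookup are hypothetical, and the actual contents
of generated_cpp_wcwidth.h are defined by gen_wcwidth.py, not by this code.

```python
from bisect import bisect_left


def build_range_table(widths):
    """Collapse per-codepoint widths (a list indexed by codepoint) into runs."""
    range_ends, run_widths = [], []
    for cp, width in enumerate(widths):
        if run_widths and run_widths[-1] == width:
            range_ends[-1] = cp          # extend the current run to cp
        else:
            range_ends.append(cp)        # start a new run ending (so far) at cp
            run_widths.append(width)
    return range_ends, run_widths


def lookup(range_ends, run_widths, cp):
    """Find the width of cp via the first run whose last codepoint is >= cp."""
    i = bisect_left(range_ends, cp)
    if i == len(range_ends):
        raise ValueError("codepoint beyond the end of the table")
    return run_widths[i]


if __name__ == "__main__":
    # Toy data: codepoints 0-2 have width 1, 3-4 width 0, 5-7 width 2, 8 width 1.
    ends, runs = build_range_table([1, 1, 1, 0, 0, 2, 2, 2, 1])
    print(ends, runs)
    print(lookup(ends, runs, 6))
```

Storing only the end of each run keeps the table small when long stretches
of codepoints share a width, at the cost of a logarithmic search per lookup.
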
The procedure to update GCC's Unicode support is the following:

1. Update the five Unicode data files from the above URLs.

2. Update the two glibc files in from_glibc/ from glibc's git.  Update
   the commit number above in this README.

3. Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
   (where X.Y is the version of the Unicode standard corresponding to the
   Unicode data files being used, most recently, 14.0.0).

4. Compile makeucnid, e.g. with:
      gcc -O2 ../../libcpp/makeucnid.cc -o ../../libcpp/makeucnid

5. Generate ucnid.h as follows:
      ../../libcpp/makeucnid ../../libcpp/ucnid.tab UnicodeData.txt \
	DerivedNormalizationProps.txt DerivedCoreProperties.txt \
	> ../../libcpp/ucnid.h

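Steps 3 to 5 of the procedure can be written down as data, which makes them
easy to review before anything is run.  The sketch below only builds the
command lines; it executes nothing.  The function name plan_update and its
parameters are hypothetical, while the commands and paths are the ones listed
in the steps, relative to this directory.

```python
def plan_update(unicode_version, libcpp="../../libcpp"):
    """Return (argv, stdout_path) pairs for steps 3 to 5; nothing is executed."""
    makeucnid = f"{libcpp}/makeucnid"
    return [
        # Step 3: regenerate the wcwidth table; stdout goes to the header.
        (["./gen_wcwidth.py", unicode_version],
         f"{libcpp}/generated_cpp_wcwidth.h"),
        # Step 4: compile makeucnid; no output redirection.
        (["gcc", "-O2", f"{libcpp}/makeucnid.cc", "-o", makeucnid], None),
        # Step 5: generate ucnid.h; stdout goes to the header.
        ([makeucnid, f"{libcpp}/ucnid.tab", "UnicodeData.txt",
          "DerivedNormalizationProps.txt", "DerivedCoreProperties.txt"],
         f"{libcpp}/ucnid.h"),
    ]


if __name__ == "__main__":
    for argv, stdout_path in plan_update("14.0.0"):
        redirect = f" > {stdout_path}" if stdout_path else ""
        print(" ".join(argv) + redirect)
```

Each pair could then be passed to subprocess.run with check=True, opening
stdout_path for writing when it is not None, so that a failing step stops
the update instead of leaving a partially written header unnoticed.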