Commit Graph

26 Commits

Author SHA1 Message Date
Jakub Jelinek
d0e8f58b81 contrib, libcpp, libstdc++: Update to Unicode 16.0
It is autumn again and there is a new Unicode version 16.0.

The following patch updates our Unicode stuff in contrib, libcpp and
libstdc++ from that Unicode version.

2024-10-08  Jakub Jelinek  <jakub@redhat.com>

contrib/
	* unicode/README: Update glibc git commit hash, replace
	Unicode 15 or 15.1 versions with 16.
	* unicode/gen_libstdcxx_unicode_data.py: Use 160000 instead of
	150100 in _GLIBCXX_GET_UNICODE_DATA test.
	* unicode/from_glibc/utf8_gen.py: Updated from glibc
	064c708c78cc2a6b5802dce73108fc0c1c6bfc80 commit.
	* unicode/DerivedCoreProperties.txt: Updated from Unicode 16.0.
	* unicode/emoji-data.txt: Likewise.
	* unicode/PropList.txt: Likewise.
	* unicode/GraphemeBreakProperty.txt: Likewise.
	* unicode/DerivedNormalizationProps.txt: Likewise.
	* unicode/NameAliases.txt: Likewise.
	* unicode/UnicodeData.txt: Likewise.
	* unicode/EastAsianWidth.txt: Likewise.
gcc/testsuite/
	* c-c++-common/cpp/named-universal-char-escape-1.c: Add tests
	for some Unicode 16.0 characters, both normal and generated.
libcpp/
	* makeucnid.cc (write_copyright): Update Unicode Copyright years.
	* makeuname2c.cc (generated_ranges): Adjust Unicode version from 15.1
	to 16.0.  Add EGYPTIAN HIEROGLYPH- generated range, adjust indexes in
	following entries.
	(write_copyright): Update Unicode Copyright years.
	* generated_cpp_wcwidth.h: Regenerated.
	* ucnid.h: Regenerated.
	* uname2c.h: Regenerated.
libstdc++-v3/
	* include/bits/unicode.h (std::__unicode::__v15_1_0): Rename inline
	namespace to ...
	(std::__unicode::__v16_0_0): ... this.
	(_GLIBCXX_GET_UNICODE_DATA): Change from 150100 to 160000.
	* include/bits/unicode-data.h: Regenerated.
	* testsuite/ext/unicode/properties.cc: Check for _Gcb_SpacingMark
	on U+11F03 rather than U+1D16D as the latter lost SpacingMark property
	in Unicode 16.0.
2024-10-08 10:01:47 +02:00
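
Note on the version constants in the entry above: the ChangeLog replaces 150100 with 160000 and renames the inline namespace from __v15_1_0 to __v16_0_0. A minimal sketch of how those values appear to be derived from a Unicode version follows. The encoding (major * 10000 + minor * 100 + patch) is inferred from those two values only, and both helper names are hypothetical rather than taken from the GCC sources.

```python
def unicode_version_code(major, minor, patch=0):
    """Pack a Unicode version into one integer.

    Assumption: the layout is inferred from the ChangeLog values
    150100 (15.1.0) and 160000 (16.0.0); it is not quoted from GCC.
    """
    return major * 10000 + minor * 100 + patch


def inline_namespace(major, minor, patch=0):
    """Build a namespace name in the style of __v15_1_0 / __v16_0_0."""
    return f"__v{major}_{minor}_{patch}"


if __name__ == "__main__":
    print(unicode_version_code(15, 1), inline_namespace(15, 1))
    print(unicode_version_code(16, 0), inline_namespace(16, 0))
```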
Jakub Jelinek
a945c346f5 Update copyright years. 2024-01-03 12:19:35 +01:00
Jakub Jelinek
d64b7c82da libcpp, contrib: Update to Unicode 15.1
The following patch updates to Unicode 15.1 from the 15.0 we had last
year.  In plaintext it is just a pseudo-patch: the parts that are too big
(files either downloaded with wget or regenerated) are replaced with ...,
and the full patch is attached compressed.  Apparently Unicode forgot to
add a new range to Table 4-8, which we are using, but from the other files
it is clear what should have been added; I've filed a bug report against
Unicode.

2023-11-14  Jakub Jelinek  <jakub@redhat.com>

contrib/
	* unicode/README: Adjust glibc git commit hash, number of Unicode
	data files to be updated and latest Unicode version.
	* unicode/from_glibc/utf8_gen.py: Update from glibc.
	* unicode/UnicodeData.txt: Update from Unicode 15.1.
	* unicode/EastAsianWidth.txt: Likewise.
	* unicode/DerivedNormalizationProps.txt: Likewise.
	* unicode/NameAliases.txt: Likewise.
	* unicode/DerivedCoreProperties.txt: Likewise.
	* unicode/PropList.txt: Likewise.
libcpp/
	* makeucnid.cc (write_copyright): Update copyright year.
	* makeuname2c.cc (write_copyright): Likewise.
	(struct generated): Update latest Unicode version.
	(generated_ranges): Add 2ebf0-2ee5d CJK UNIFIED IDEOGRAPH
	range, which was omitted from Table 4-8 but is clearly
	expected to be there given the 15.1 additions.
	* ucnid.h: Regenerated.
	* uname2c.h: Regenerated.
	* generated_cpp_wcwidth.h: Regenerated.
2023-11-14 18:32:37 +01:00
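
Note on generated ranges in the entry above: characters such as CJK UNIFIED IDEOGRAPH-2EBF0 take their names from a fixed prefix plus the code point in uppercase hex, so a name lookup can treat them as a range rather than store every name. The sketch below illustrates that idea with the one range named in the ChangeLog (2ebf0-2ee5d). The table layout and function names are hypothetical and do not mirror makeuname2c.cc.

```python
# One illustrative entry: (name prefix, first code point, last code point).
# The range comes from the ChangeLog above; the tuple layout is an assumption.
GENERATED_RANGES = [
    ("CJK UNIFIED IDEOGRAPH-", 0x2EBF0, 0x2EE5D),
]

HEX_DIGITS = set("0123456789ABCDEF")


def generated_name_to_code_point(name):
    """Return the code point for a generated name, or None if no range matches."""
    for prefix, first, last in GENERATED_RANGES:
        if not name.startswith(prefix):
            continue
        digits = name[len(prefix):]
        # Accept only 4 to 6 uppercase hex digits after the prefix.
        if not (4 <= len(digits) <= 6) or not set(digits) <= HEX_DIGITS:
            continue
        code_point = int(digits, 16)
        if first <= code_point <= last:
            return code_point
    return None


def code_point_to_generated_name(code_point):
    """Return the generated name for a code point, or None if outside all ranges."""
    for prefix, first, last in GENERATED_RANGES:
        if first <= code_point <= last:
            return f"{prefix}{code_point:04X}"
    return None
```

With a table like this, adding a range that the published table omitted is a one-line change, which is the kind of manual adjustment the commit describes.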
Jakub Jelinek
99bae6ee66 libcpp: Update Unicode copyright years
I've noticed I forgot to update copyright years when updating from
Unicode 15.0.0 (and makeucnid.cc had it hopelessly obsolete).

2023-03-16  Jakub Jelinek  <jakub@redhat.com>

	* makeucnid.cc (write_copyright): Update Unicode copyright years
	up to 2022.
	* makeuname2c.cc (write_copyright): Likewise.
	* ucnid.h: Regenerated.
	* uname2c.h: Regenerated.
2023-03-16 10:19:04 +01:00
Jakub Jelinek
83ffe9cde7 Update copyright years. 2023-01-16 11:52:17 +01:00
Jakub Jelinek
2662d537b0 libcpp: Update to Unicode 15
The following pseudo-patch regenerates the libcpp tables with Unicode 15.0.0
which added 4489 new characters.

As mentioned previously, this isn't just a matter of running the
two libcpp/make*.cc programs on the new Unicode files, but one needs
to manually update a table inside of makeuname2c.cc according to
a table in Unicode text (which is partially reflected in the text
files, but e.g. in Unicode 14.0.0 not 100% accurately, in 15.0.0
actually accurately).
I've also added some randomly chosen subset of those 4489 new
characters to a testcase.

2022-11-04  Jakub Jelinek  <jakub@redhat.com>

gcc/testsuite/
	* c-c++-common/cpp/named-universal-char-escape-1.c: Add tests for some
	characters newly added in Unicode 15.0.0.
libcpp/
	* makeuname2c.cc (struct generated): Update from Unicode 15.0.0
	table 4-8.
	* ucnid.h: Regenerated for Unicode 15.0.0.
	* uname2c.h: Likewise.
2022-11-04 18:18:42 +01:00
Lewis Hyatt
4fda776a2f libcpp: Update ucnid.h to Unicode 14
This patch updates ucnid.h from Unicode 13 to Unicode 14.  Additionally, the
procedure detailed in contrib/unicode/README, which updates
generated_wcwidth.h, has been expanded with instructions for updating this
file as well, so that both may be done at the same time conveniently.  Two
additional Unicode data files which are needed to create ucnid.h are also
added to source control in contrib/unicode.

contrib/ChangeLog:

	* unicode/README: Added instructions for updating ucnid.h.
	* unicode/DerivedCoreProperties.txt: New file added to source
	control from Unicode 14.0 release.
	* unicode/DerivedNormalizationProps.txt: Likewise.

libcpp/ChangeLog:

	* ucnid.h: Regenerated for Unicode 14.0.
2022-06-28 17:33:37 -04:00
Jakub Jelinek
7adcbafe45 Update copyright years. 2022-01-03 10:42:10 +01:00
Jakub Jelinek
c4d6dcacfc libcpp: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
The following patch implements the
P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
paper.  We already allow UTF-8 characters in the source, so that part
is already implemented, so IMHO all we need to do is pedwarn instead of
just warn for the (default) -Wnormalized=nfc (or for -Wnormalized={id,nfkc})
if the character is not in NFC and to use the Unicode XID_Start and
XID_Continue derived core properties to find out what characters are allowed
(the standard actually adds U+005F to XID_Start, but we are handling the
ASCII compatible characters differently already and they aren't allowed
in UCNs in identifiers).  Instead of hardcoding the large tables
in ucnid.tab, this patch makes makeucnid.c read them from the Unicode
tables (13.0.0 version at this point).

For non-pedantic mode, we accept as 2nd+ char in identifiers a union
of valid characters in all supported modes, but for the 1st char it
was actually pedantically requiring that it is not any of the characters
that may not appear in the currently chosen standard as the first character.
This patch changes it such that also what is allowed at the start of an
identifier is a union of characters valid at the start of an identifier
in any of the pedantic modes.

2021-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
libcpp/
	* include/cpplib.h (struct cpp_options): Add cxx23_identifiers.
	* charset.c (CXX23, NXX23): New enumerators.
	(CID, NFC, NKC, CTX): Renumber.
	(ucn_valid_in_identifier): Implement P1949R7 - use CXX23 and
	NXX23 flags for cxx23_identifiers.  For start character in
	non-pedantic mode, allow characters that are allowed as start
	characters in any of the supported language modes, rather than
	disallowing characters allowed only as non-start characters in
	current mode but for characters from other language modes allowing
	them even if they are never allowed at start.
	* init.c (struct lang_flags): Add cxx23_identifiers.
	(lang_defaults): Add cxx23_identifiers column.
	(cpp_set_lang): Initialize CPP_OPTION (pfile, cxx23_identifiers).
	* lex.c (warn_about_normalization): If cxx23_identifiers, use
	cpp_pedwarning_with_line instead of cpp_warning_with_line for
	"is not in NFC" diagnostics.
	* makeucnid.c: Adjust usage comment.
	(CXX23, NXX23): New enumerators.
	(all_languages): Add CXX23.
	(not_NFC, not_NFKC, maybe_not_NFC): Renumber.
	(read_derivedcore): New function.
	(write_table): Print also CXX23 and NXX23 columns.
	(main): Require 5 arguments instead of 4, call read_derivedcore.
	* ucnid.h: Regenerated using Unicode 13.0.0 files.
gcc/testsuite/
	* g++.dg/cpp23/normalize1.C: New test.
	* g++.dg/cpp23/normalize2.C: New test.
	* g++.dg/cpp23/normalize3.C: New test.
	* g++.dg/cpp23/normalize4.C: New test.
	* g++.dg/cpp23/normalize5.C: New test.
	* g++.dg/cpp23/normalize6.C: New test.
	* g++.dg/cpp23/normalize7.C: New test.
	* g++.dg/cpp23/ucnid-1-utf8.C: New test.
	* g++.dg/cpp23/ucnid-2-utf8.C: New test.
	* gcc.dg/cpp/ucnid-4.c: Don't expect
	"not valid at the start of an identifier" errors.
	* gcc.dg/cpp/ucnid-4-utf8.c: Likewise.
	* gcc.dg/cpp/ucnid-5-utf8.c: New test.
2021-09-01 22:33:06 +02:00
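
Note on the two checks described in the entry above: P1949R7 identifiers must start with an XID_Start character, continue with XID_Continue characters, and be in NFC. The sketch below shows both checks using only Python's standard library. It is an approximation under stated assumptions: Python's `str.isidentifier` also accepts a leading underscore and uses the Unicode database bundled with the interpreter, not the tables in ucnid.h, and `classify_identifier` is a hypothetical helper name.

```python
import unicodedata


def classify_identifier(name):
    """Report whether name passes the XID check and whether it is in NFC.

    Assumption: str.isidentifier() stands in for the XID_Start /
    XID_Continue test; it is not GCC's implementation.
    """
    return {
        "valid_xid": name.isidentifier(),
        "in_nfc": unicodedata.is_normalized("NFC", name),
    }


if __name__ == "__main__":
    samples = ["caf\u00e9", "cafe\u0301", "\u0301x"]
    for sample in samples:
        print(ascii(sample), classify_identifier(sample))
```

The second sample spells the same word with a combining acute accent: it is made of valid identifier characters but is not in NFC, which is the case the commit turns from a warning into a pedwarn. The third sample starts with a combining mark, which is XID_Continue but not XID_Start.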
Jakub Jelinek
4739344d36 libcpp: Regenerate ucnid.h using Unicode 13.0.0 files [PR100977]
The following patch (incremental to the makeucnid.c fix) regenerates
ucnid.h with https://www.unicode.org/Public/13.0.0/ucd/ files.

2021-08-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* ucnid.h: Regenerated using Unicode 13.0.0 files.
2021-08-05 17:35:20 +02:00
Jakub Jelinek
4805b92a32 libcpp: Fix makeucnid bug with combining values [PR100977]
I've noticed in ucnid.h two adjacent lines that had all flags and combine
values identical and as such were supposed to be merged.

This is due to a bug in makeucnid.c, which records the last_flag,
last_combine and really_safe values of what has just been printed, but
because of a typo mishandles last_combine: it always compares against
combining_value[0], which is 0.

This has two effects on the table.  One is that the table is often
unnecessarily large, because for non-zero .combine every character gets
its own record instead of adjacent characters with the same flags and
combine being merged.
The other is that sometimes the last char that has combine set doesn't
actually have it in the tables, because the code prints entries only
upon seeing the next character, and if that character has a
combining_value of 0 and its flags are otherwise the same as previously
printed, it will not print anything.

The following patch fixes that.  To make clear exactly what it affects,
I've regenerated ucnid.h with the same Unicode files that were used the
last time it was regenerated.

2021-08-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* makeucnid.c (write_table): Fix computation of last_combine.
	* ucnid.h: Regenerated using Unicode 6.3.0 files.
2021-08-05 17:34:16 +02:00
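
Note on the merging bug in the entry above: the sketch below is a simplified model of the behaviour the commit message describes, not the code from makeucnid.c. It emits one record per run of adjacent code points that share flags and combining value, and the `buggy` switch reproduces the described typo by always remembering `combine[0]` instead of the value just seen. The function name, the record layout, and the omission of really_safe are all assumptions made for illustration.

```python
def write_runs(flags, combine, buggy=False):
    """Return (last_code_point, flags, combine) records, one per run.

    A record is emitted only when the next code point differs from the
    remembered values, mirroring the "print upon seeing the next
    character" behaviour described in the commit message.
    """
    runs = []
    n = len(flags)
    last_flag, last_combine = flags[0], combine[0]
    for i in range(1, n + 1):
        if i == n or flags[i] != last_flag or combine[i] != last_combine:
            # Close the run that ends at the previous code point.
            runs.append((i - 1, flags[i - 1], combine[i - 1]))
            if i < n:
                last_flag = flags[i]
                # The modelled typo: index 0 instead of index i.
                last_combine = combine[0] if buggy else combine[i]
    return runs


if __name__ == "__main__":
    flags = [1] * 7
    combine = [0, 0, 230, 230, 230, 0, 0]
    print("fixed:", write_runs(flags, combine))
    print("buggy:", write_runs(flags, combine, buggy=True))
```

In the buggy output, code points 2 and 3 each get their own record (the larger table), and code point 4 is absorbed into the final record with a combining value of 0 (the lost combine value), matching the two effects listed in the commit message.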
Jakub Jelinek
99dee82307 Update copyright years. 2021-01-04 10:26:59 +01:00
Jakub Jelinek
8d9254fc8a Update copyright years.
From-SVN: r279813
2020-01-01 12:51:42 +01:00
Jakub Jelinek
a554497024 Update copyright years.
From-SVN: r267494
2019-01-01 13:31:55 +01:00
Jakub Jelinek
85ec4feb11 Update copyright years.
From-SVN: r256169
2018-01-03 11:03:58 +01:00
Jakub Jelinek
cbe34bb5ed Update copyright years.
From-SVN: r243994
2017-01-01 13:07:43 +01:00
Jakub Jelinek
818ab71a41 Update copyright years.
From-SVN: r232055
2016-01-04 15:30:50 +01:00
Jakub Jelinek
5624e564d2 Update copyright years.
From-SVN: r219188
2015-01-05 13:33:28 +01:00
Richard Sandiford
35c3d610e3 Update copyright years in libcpp/
From-SVN: r206293
2014-01-02 22:24:45 +00:00
Joseph Myers
d3f4ff8b51 ucnid-2011-1.c: New test.
gcc/testsuite:
	* c-c++-common/cpp/ucnid-2011-1.c: New test.

libcpp:
	* ucnid.tab: Add C11 and C11NOSTART data.
	* makeucnid.c (digit): Rename enum value to N99.
	(C11, N11, all_languages): New enum values.
	(NUM_CODE_POINTS, MAX_CODE_POINT): New macros.
	(flags, decomp, combining_value): Use NUM_CODE_POINTS as array
	size.
	(decomp): Use unsigned int as element type.
	(all_decomp): New array.
	(read_ucnid): Handle C11 and C11NOSTART.  Use MAX_CODE_POINT.
	(read_table): Use MAX_CODE_POINT.  Store all decompositions in
	all_decomp.
	(read_derived): Use MAX_CODE_POINT.
	(write_table): Use NUM_CODE_POINTS.  Print N99, C11 and N11
	flags.  Print whole array variable declaration rather than just
	array contents.
	(char_id_valid, write_context_switch): New functions.
	(main): Call write_context_switch.
	* ucnid.h: Regenerate.
	* include/cpplib.h (struct cpp_options): Add c11_identifiers.
	* init.c (struct lang_flags): Add c11_identifiers.
	(cpp_set_lang): Set c11_identifiers option from selected language.
	* internal.h (struct normalize_state): Document "previous" as
	previous starter character.
	(NORMALIZE_STATE_UPDATE_IDNUM): Take character as argument.
	* charset.c (DIG): Rename enum value to N99.
	(C11, N11): New enum values.
	(struct ucnrange): Give name to struct.  Use short for flags and
	unsigned int for end of range.  Include ucnid.h for whole variable
	declaration.
	(ucn_valid_in_identifier): Allow for characters up to 0x10FFFF.
	Allow for C11 in determining valid characters and valid start
	characters.  Use check_nfc for non-Hangul context-dependent
	checks.  Only store starter characters in nst->previous.
	(_cpp_valid_ucn): Pass new argument to
	NORMALIZE_STATE_UPDATE_IDNUM.
	* lex.c (lex_identifier): Pass new argument to
	NORMALIZE_STATE_UPDATE_IDNUM.  Call NORMALIZE_STATE_UPDATE_IDNUM
	after initial non-UCN part of identifier.
	(lex_number): Pass new argument to NORMALIZE_STATE_UPDATE_IDNUM.

From-SVN: r204886
2013-11-16 00:05:08 +00:00
Joseph Myers
54848ff84b ucnid-9.c: New test.
gcc/testsuite:
	* gcc.dg/cpp/ucnid-9.c: New test.

libcpp:
	* ucnid.tab: Mark C99 digits as [C99DIG].
	* makeucnid.c (read_ucnid): Handle [C99DIG].
	(read_table): Don't check for digit characters.
	* ucnid.h: Regenerate.

From-SVN: r204835
2013-11-15 02:15:26 +00:00
Richard Sandiford
500f3ed906 Update copyright years in libcpp.
From-SVN: r195162
2013-01-14 18:13:59 +00:00
Jakub Jelinek
748086b7b2 Licensing changes to GPLv3 resp. GPLv3 with GCC Runtime Exception.
From-SVN: r145841
2009-04-09 17:00:19 +02:00
Kelley Cook
200031d1d5 all files: Update FSF address in copyright headers.
2005-06-29  Kelley Cook  <kcook@gcc.gnu.org>

	* all files: Update FSF address in copyright headers.
	* makeucnid.c (write_copyright): Update outputted FSF address.

From-SVN: r101413
2005-06-29 02:34:39 +00:00
Geoffrey Keating
50668cf626 Index: gcc/ChangeLog
2005-03-14  Geoffrey Keating  <geoffk@apple.com>

	* doc/cppopts.texi (-fexec-charset): Add concept index entry.
	(-fwide-exec-charset): Likewise.
	(-finput-charset): Likewise.
	* doc/invoke.texi (Warning Options): Document -Wnormalized=.
	* c-opts.c (c_common_handle_option): Handle -Wnormalized=.
	* c.opt (Wnormalized): New.

Index: libcpp/ChangeLog
2005-03-14  Geoffrey Keating  <geoffk@apple.com>

	* init.c (cpp_create_reader): Default warn_normalize to normalized_C.
	* charset.c: Update for new format of ucnid.h.
	(ucn_valid_in_identifier): Update for new format of ucnid.h.
	Add NST parameter, and update it; update callers.
	(cpp_valid_ucn): Add NST parameter, update callers.  Replace abort
	with cpp_error.
	(convert_ucn): Pass normalize_state to cpp_valid_ucn.
	* internal.h (struct normalize_state): New.
	(INITIAL_NORMALIZE_STATE): New.
	(NORMALIZE_STATE_RESULT): New.
	(NORMALIZE_STATE_UPDATE_IDNUM): New.
	(_cpp_valid_ucn): New.
	* lex.c (warn_about_normalization): New.
	(forms_identifier_p): Add normalize_state parameter, update callers.
	(lex_identifier): Add normalize_state parameter, update callers.  Keep
	the state current.
	(lex_number): Likewise.
	(_cpp_lex_direct): Pass normalize_state to subroutines.  Check
	it with warn_about_normalization.
	* makeucnid.c: New.
	* ucnid.h: Replace.
	* ucnid.pl: Remove.
	* ucnid.tab: Make appropriate for input to makeucnid.c.  Remove
	comments about obsolete version of C++.
	* include/cpplib.h (enum cpp_normalize_level): New.
	(struct cpp_options): Add warn_normalize field.

Index: gcc/testsuite/ChangeLog
2005-03-14  Geoffrey Keating  <geoffk@apple.com>

	* gcc.dg/cpp/normalize-1.c: New.
	* gcc.dg/cpp/normalize-2.c: New.
	* gcc.dg/cpp/normalize-3.c: New.
	* gcc.dg/cpp/normalize-4.c: New.
	* gcc.dg/cpp/ucnid-4.c: New.
	* gcc.dg/cpp/ucnid-5.c: New.
	* g++.dg/cpp/normalize-1.C: New.
	* g++.dg/cpp/ucnid-1.C: New.

From-SVN: r96459
2005-03-15 00:36:33 +00:00
Paolo Bonzini
4f4e53dd85 Makefile.def (host_modules): add libcpp.
ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	* Makefile.def (host_modules): add libcpp.
	* Makefile.tpl: Add dependencies on and for libcpp.
	* Makefile.in: Regenerate.
	* configure.in: Add libcpp host module.
	* configure: Regenerate.

config/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	* acx.m4 (ACX_HEADER_STDBOOL, ACX_HEADER_STRING):
	From gcc.

gcc/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	Move libcpp to the toplevel.
	* Makefile.in: Remove references to libcpp files,
	use CPPLIBS instead of libcpp.a.  Define SYMTAB_H
	and change hashtable.h to that.
	* aclocal.m4 (gcc_AC_HEADER_STDBOOL,
	gcc_AC_HEADER_STRING, gcc_AC_C__BOOL): Remove.
	* configure.ac (gcc_AC_C__BOOL, HAVE_UCHAR): Remove tests.
	* configure: Regenerate.
	* config.in: Regenerate.
	* c-ppoutput.c: Include ../libcpp/internal.h instead of cpphash.h.
	* cppcharset.c: Removed.
	* cpperror.c: Removed.
	* cppexp.c: Removed.
	* cppfiles.c: Removed.
	* cpphash.c: Removed.
	* cpphash.h: Removed.
	* cppinit.c: Removed.
	* cpplex.c: Removed.
	* cpplib.c: Removed.
	* cpplib.h: Removed.
	* cppmacro.c: Removed.
	* cpppch.c: Removed.
	* cpptrad.c: Removed.
	* cppucnid.h: Removed.
	* cppucnid.pl: Removed.
	* cppucnid.tab: Removed.
	* hashtable.c: Removed.
	* hashtable.h: Removed.
	* line-map.c: Removed.
	* line-map.h: Removed.
	* mkdeps.c: Removed.
	* mkdeps.h: Removed.
	* stringpool.h: Include symtab.h instead of hashtable.h.
	* tree.h: Include symtab.h instead of hashtable.h.
	* system.h (O_NONBLOCK, O_NOCTTY): Do not define.

gcc/cp/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	* Make-lang.in: No need to specify $(LIBCPP).

gcc/java/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	* Make-lang.in: Link in $(LIBCPP) instead of mkdeps.o.

libcpp/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	Moved libcpp from the gcc subdirectory to the toplevel.
	* Makefile.am: New file.
	* Makefile.in: Regenerate.
	* configure.ac: New file.
	* configure: Regenerate.
	* config.in: Regenerate.
	* charset.c: Moved from gcc/cppcharset.c.  Add note about
	brokenness of input charset detection.  Adjust for change
	in name of cppucnid.h.
	* errors.c: Moved from gcc/cpperror.c.  Do not include intl.h.
	* expr.c: Moved from gcc/cppexp.c.
	* files.c: Moved from gcc/cppfiles.c.  Do not include intl.h.
	Remove #define of O_BINARY, it is in system.h.
	* identifiers.c: Moved from gcc/cpphash.c.
	* internal.h: Moved from gcc/cpphash.h.  Change header
	guard name.  All other files adjusted to match name change.
	* init.c: Moved from gcc/cppinit.c.
	(init_library) [ENABLE_NLS]: Call bindtextdomain.
	* lex.c: Moved from gcc/cpplex.c.
	* directives.c: Moved from gcc/cpplib.c.
	* macro.c: Moved from gcc/cppmacro.c.
	* pch.c: Moved from gcc/cpppch.c.  Do not include intl.h.
	* traditional.c: Moved from gcc/cpptrad.c.
	* ucnid.h: Moved from gcc/cppucnid.h.  Change header
	guard name.
	* ucnid.pl: Moved from gcc/cppucnid.pl.
	* ucnid.tab: Moved from gcc/cppucnid.tab.  Change header
	guard name.
	* symtab.c: Moved from gcc/hashtable.c.
	* line-map.c: Moved from gcc.  Do not include intl.h.
	* mkdeps.c: Moved from gcc.
	* system.h: New file.

libcpp/include/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	* cpplib.h: Moved from gcc.  Change header guard name.
	* line-map.h: Moved from gcc.  Change header guard name.
	* mkdeps.h: Moved from gcc.  Change header guard name.
	* symtab.h: Moved from gcc/hashtable.h.  Change header
	guard name.

libcpp/po/ChangeLog:

2004-05-23  Paolo Bonzini  <bonzini@gnu.org>

	* be.po: Extracted from gcc/po/be.po.
	* ca.po: Extracted from gcc/po/ca.po.
	* da.po: Extracted from gcc/po/da.po.
	* de.po: Extracted from gcc/po/de.po.
	* el.po: Extracted from gcc/po/el.po.
	* es.po: Extracted from gcc/po/es.po.
	* fr.po: Extracted from gcc/po/fr.po.
	* ja.po: Extracted from gcc/po/ja.po.
	* nl.po: Extracted from gcc/po/nl.po.
	* sv.po: Extracted from gcc/po/sv.po.
	* tr.po: Extracted from gcc/po/tr.po.

From-SVN: r82199
2004-05-24 10:50:45 +00:00