mirrors/gcc

mirror of https://gcc.gnu.org/git/gcc.git synced 2024-12-19 17:15:02 +08:00

Author	SHA1	Message	Date
Jakub Jelinek	63b25b8012	contrib: Update instructions regarding Unicode updates I've noticed we have instructions on how to update from newer Unicode standard, but it didn't mention uname2c.h regeneration. The following patch mentions that, also mentions that the Copyright years of Unicode should be updated and adds a copy of NameAliases.txt which is used for uname2c.h generation. 2023-03-16 Jakub Jelinek <jakub@redhat.com> * unicode/README: Update to mention also makeuname2c. * unicode/NameAliases.txt: New file.	2023-03-16 10:28:25 +01:00
Lewis Hyatt	73dd5c6c88	libcpp: Update cpp_wcwidth() to Unicode 15 Updates cpp_wcwidth() to Unicode 15, following the procedure in contrib/unicode/README mechanically without incident. contrib/ChangeLog: * unicode/DerivedCoreProperties.txt: Update to Unicode 15. * unicode/DerivedNormalizationProps.txt: Likewise. * unicode/EastAsianWidth.txt: Likewise. * unicode/PropList.txt: Likewise. * unicode/README: Likewise. * unicode/UnicodeData.txt: Likewise. libcpp/ChangeLog: * generated_cpp_wcwidth.h: Regenerated for Unicode 15.	2023-03-13 07:40:50 -04:00
Jakub Jelinek	83ffe9cde7	Update copyright years.	2023-01-16 11:52:17 +01:00
Lewis Hyatt	4fda776a2f	libcpp: Update ucnid.h to Unicode 14 This patch updates ucnid.h from Unicode 13 to Unicode 14. Additionally, the procedure detailed in contrib/unicode/README, which updates generated_wcwidth.h, has been expanded with instructions for updating this file as well, so that both may be done at the same time conveniently. Two additional Unicode data files which are needed to create ucnid.h are also added to source control in contrib/unicode. contrib/ChangeLog: * unicode/README: Added instructions for updating ucnid.h. * unicode/DerivedCoreProperties.txt: New file added to source control from Unicode 14.0 release. * unicode/DerivedNormalizationProps.txt: Likewise. libcpp/ChangeLog: * ucnid.h: Regenerated for Unicode 14.0.	2022-06-28 17:33:37 -04:00
Lewis Hyatt	57988cbe73	libcpp: Update cpp_wcwidth() to Unicode 14.0.0 The procedure detailed in contrib/unicode/README was followed with nothing notable coming up. The glibc scripts did not require any update, so the only change was retrieving new versions of the Unicode data files and rerunning gen_wcwidth.py. contrib/ChangeLog: * unicode/EastAsianWidth.txt: Update to Unicode 14.0.0. * unicode/PropList.txt: Likewise. * unicode/README: Likewise. * unicode/UnicodeData.txt: Likewise. libcpp/ChangeLog: * generated_cpp_wcwidth.h: Generated from updated Unicode data files.	2022-06-26 14:13:26 -04:00
David Malcolm	b050653c4c	contrib: add unicode/utf8-dump.py This script may be useful when debugging issues relating to Unicode encoding (e.g. when investigating source files with bidirectional control characters). It dumps a UTF-8 file as a list of numbered lines (mimicking GCC's diagnostic output format), interleaved with lines per character showing the Unicode codepoints, the UTF-8 encoding bytes, the name of the character, and, where printable, the characters themselves. The lines are printed in logical order, which may help the reader to grok the relationship between visual and logical ordering in bi-di files. For example: $ cat test.c int གྷ; const char *אבג = "ALEF-BET-GIMEL"; $ ./contrib/unicode/utf8-dump.py test.c 1 | int གྷ; | U+0069 0x69 LATIN SMALL LETTER I i | U+006E 0x6e LATIN SMALL LETTER N n | U+0074 0x74 LATIN SMALL LETTER T t | U+0020 0x20 SPACE (separator) | U+0F43 0xe0 0xbd 0x83 TIBETAN LETTER GHA གྷ | U+003B 0x3b SEMICOLON ; | U+000A 0x0a LINE FEED (LF) (control character) 2 | const char *אבג = "ALEF-BET-GIMEL"; | U+0063 0x63 LATIN SMALL LETTER C c | U+006F 0x6f LATIN SMALL LETTER O o | U+006E 0x6e LATIN SMALL LETTER N n | U+0073 0x73 LATIN SMALL LETTER S s | U+0074 0x74 LATIN SMALL LETTER T t | U+0020 0x20 SPACE (separator) | U+0063 0x63 LATIN SMALL LETTER C c | U+0068 0x68 LATIN SMALL LETTER H h | U+0061 0x61 LATIN SMALL LETTER A a | U+0072 0x72 LATIN SMALL LETTER R r | U+0020 0x20 SPACE (separator) | U+002A 0x2a ASTERISK * | U+05D0 0xd7 0x90 HEBREW LETTER ALEF א | U+05D1 0xd7 0x91 HEBREW LETTER BET ב | U+05D2 0xd7 0x92 HEBREW LETTER GIMEL ג | U+0020 0x20 SPACE (separator) | U+003D 0x3d EQUALS SIGN = | U+0020 0x20 SPACE (separator) | U+0022 0x22 QUOTATION MARK " | U+0041 0x41 LATIN CAPITAL LETTER A A | U+004C 0x4c LATIN CAPITAL LETTER L L | U+0045 0x45 LATIN CAPITAL LETTER E E | U+0046 0x46 LATIN CAPITAL LETTER F F | U+002D 0x2d HYPHEN-MINUS - | U+0042 0x42 LATIN CAPITAL LETTER B B | U+0045 0x45 LATIN CAPITAL LETTER E E | U+0054 0x54 LATIN CAPITAL LETTER T T | U+002D 0x2d HYPHEN-MINUS - | U+0047 0x47 LATIN CAPITAL LETTER G G | U+0049 0x49 LATIN CAPITAL LETTER I I | U+004D 0x4d LATIN CAPITAL LETTER M M | U+0045 0x45 LATIN CAPITAL LETTER E E | U+004C 0x4c LATIN CAPITAL LETTER L L | U+0022 0x22 QUOTATION MARK " | U+003B 0x3b SEMICOLON ; | U+000A 0x0a LINE FEED (LF) (control character) Tested with Python 3.8 contrib/ChangeLog: * unicode/utf8-dump.py: New file. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-01 11:52:28 -04:00
Lewis Hyatt	497c9f8d4d	libcpp: Update cpp_wcwidth() to Unicode 13.0.0 generated_cpp_wcwidth.h was regenerated using Unicode 13.0.0 data files. No material changes to the parsing scripts (either GCC- or glibc-sourced) were necessary; glibc's utf8_gen.py was tweaked slightly by glibc and matched here. contrib/ChangeLog: * unicode/EastAsianWidth.txt: Update to Unicode 13.0.0. * unicode/PropList.txt: Likewise. * unicode/README: Likewise. * unicode/UnicodeData.txt: Likewise. * unicode/from_glibc/unicode_utils.py: Update to latest glibc version. * unicode/from_glibc/utf8_gen.py: Likewise. libcpp/ChangeLog: * generated_cpp_wcwidth.h: Regenerated from Unicode 13.0.0 data.	2020-11-07 09:36:43 -05:00
Lewis Hyatt	ee9256409f	Byte vs column awareness for diagnostic-show-locus.c (PR 49973) contrib/ChangeLog 2019-12-09 Lewis Hyatt <lhyatt@gmail.com> PR preprocessor/49973 * unicode/from_glibc/unicode_utils.py: Support script from glibc (commit 464cd3) to extract character widths from Unicode data files. * unicode/from_glibc/utf8_gen.py: Likewise. * unicode/UnicodeData.txt: Unicode v. 12.1.0 data file. * unicode/EastAsianWidth.txt: Likewise. * unicode/PropList.txt: Likewise. * unicode/gen_wcwidth.py: New utility to generate libcpp/generated_cpp_wcwidth.h with help from the glibc support scripts and the Unicode data files. * unicode/unicode-license.txt: Added. * unicode/README: New explanatory file. libcpp/ChangeLog 2019-12-09 Lewis Hyatt <lhyatt@gmail.com> PR preprocessor/49973 * generated_cpp_wcwidth.h: New file generated by ../contrib/unicode/gen_wcwidth.py, supports new cpp_wcwidth function. * charset.c (compute_next_display_width): New function to help implement display columns. (cpp_byte_column_to_display_column): Likewise. (cpp_display_column_to_byte_column): Likewise. (cpp_wcwidth): Likewise. * include/cpplib.h (cpp_byte_column_to_display_column): Declare. (cpp_display_column_to_byte_column): Declare. (cpp_wcwidth): Declare. (cpp_display_width): New function. gcc/ChangeLog 2019-12-09 Lewis Hyatt <lhyatt@gmail.com> PR preprocessor/49973 * input.c (location_compute_display_column): New function to help with multibyte awareness in diagnostics. (test_cpp_utf8): New self-test. (input_c_tests): Call the new test. * input.h (location_compute_display_column): Declare. * diagnostic-show-locus.c: Pervasive changes to add multibyte awareness to all classes and functions. (enum column_unit): New enum. (class exploc_with_display_col): New class. (class layout_point): Convert m_column member to array m_columns[2]. (layout_range::contains_point): Add col_unit argument. (test_layout_range_for_single_point): Pass new argument. (test_layout_range_for_single_line): Likewise. 
(test_layout_range_for_multiple_lines): Likewise. (line_bounds::convert_to_display_cols): New function. (layout::get_state_at_point): Add col_unit argument. (make_range): Use empty filename rather than dummy filename. (get_line_width_without_trailing_whitespace): Rename to... (get_line_bytes_without_trailing_whitespace): ...this. (test_get_line_width_without_trailing_whitespace): Rename to... (test_get_line_bytes_without_trailing_whitespace): ...this. (class layout): m_exploc changed to exploc_with_display_col from plain expanded_location. (layout::get_linenum_width): New accessor member function. (layout::get_x_offset_display): Likewise. (layout::calculate_linenum_width): New subroutine for the constructor. (layout::calculate_x_offset_display): Likewise. (layout::layout): Use the new subroutines. Add multibyte awareness. (layout::print_source_line): Add multibyte awareness. (layout::print_line): Likewise. (layout::print_annotation_line): Likewise. (line_label::line_label): Likewise. (layout::print_any_labels): Likewise. (layout::annotation_line_showed_range_p): Likewise. (get_printed_columns): Likewise. (class line_label): Rename m_length to m_display_width. (get_affected_columns): Rename to... (get_affected_range): ...this; add col_unit argument and multibyte awareness. (class correction): Add m_affected_bytes and m_display_cols members. Rename m_len to m_byte_length for clarity. Add multibyte awareness throughout. (correction::insertion_p): Add multibyte awareness. (correction::compute_display_cols): New function. (correction::ensure_terminated): Use new member name m_byte_length. (line_corrections::add_hint): Add multibyte awareness. (layout::print_trailing_fixits): Likewise. (layout::get_x_bound_for_row): Likewise. (test_one_liner_simple_caret_utf8): New self-test analogous to the one with _utf8 suffix removed, testing multibyte awareness. (test_one_liner_caret_and_range_utf8): Likewise. (test_one_liner_multiple_carets_and_ranges_utf8): Likewise. 
(test_one_liner_fixit_insert_before_utf8): Likewise. (test_one_liner_fixit_insert_after_utf8): Likewise. (test_one_liner_fixit_remove_utf8): Likewise. (test_one_liner_fixit_replace_utf8): Likewise. (test_one_liner_fixit_replace_non_equal_range_utf8): Likewise. (test_one_liner_fixit_replace_equal_secondary_range_utf8): Likewise. (test_one_liner_fixit_validation_adhoc_locations_utf8): Likewise. (test_one_liner_many_fixits_1_utf8): Likewise. (test_one_liner_many_fixits_2_utf8): Likewise. (test_one_liner_labels_utf8): Likewise. (test_diagnostic_show_locus_one_liner_utf8): Likewise. (test_overlapped_fixit_printing_utf8): Likewise. (test_overlapped_fixit_printing): Adapt for changes to get_affected_columns, get_printed_columns and class corrections. (test_overlapped_fixit_printing_2): Likewise. (test_linenum_sep): New constant. (test_left_margin): Likewise. (test_offset_impl): Helper function for new test. (test_layout_x_offset_display_utf8): New test. (diagnostic_show_locus_c_tests): Call new tests. gcc/testsuite/ChangeLog: 2019-12-09 Lewis Hyatt <lhyatt@gmail.com> PR preprocessor/49973 * gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (test_show_locus): Tweak so that expected output is the same as before the diagnostic-show-locus.c changes. * gcc.dg/cpp/pr66415-1.c: Likewise. From-SVN: r279137	2019-12-09 20:03:47 +00:00
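
The per-character dump described in the utf8-dump.py commit (b050653c4c) can be approximated in a few lines of Python. This is only a sketch of the idea, not the script from contrib/unicode: the function name `dump_line`, the row layout, and the "(unnamed)" fallback are assumptions made for illustration, and the real script labels control characters and separators more carefully.

```python
import unicodedata
from typing import List


def dump_line(lineno: int, text: str) -> List[str]:
    """Return the numbered source line, then one row per code point.

    Hypothetical helper: each row shows the code point, its UTF-8
    bytes, its Unicode name, and the character itself when printable.
    """
    rows = ["{:4} | {}".format(lineno, text.rstrip("\n"))]
    for ch in text:
        # UTF-8 encoding of this single code point, byte by byte.
        encoded = " ".join("0x{:02x}".format(b) for b in ch.encode("utf-8"))
        # Control characters have no name in unicodedata; use a placeholder.
        name = unicodedata.name(ch, "(unnamed)")
        shown = ch if ch.isprintable() and not ch.isspace() else ""
        row = "     | U+{:04X} {} {} {}".format(ord(ch), encoded, name, shown)
        rows.append(row.rstrip())
    return rows


def dump_file(path: str) -> None:
    """Print the dump for every line of a UTF-8 file, in logical order."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            print("\n".join(dump_line(lineno, line)))
```

Calling `dump_file("test.c")` walks the file in logical (storage) order, which is the property the commit message highlights for bidirectional text.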

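The byte-versus-column distinction introduced in ee9256409f (PR 49973) can be illustrated with a small Python sketch. It is not GCC's implementation: libcpp's cpp_wcwidth() uses the table in generated_cpp_wcwidth.h, whereas this sketch assumes Python's `unicodedata` module is a good enough stand-in, and the names `char_width` and `byte_column_to_display_column` are chosen here only to echo the commit message.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Approximate display width of one code point: 0, 1, or 2 columns.

    Assumption: combining marks and format characters take no column,
    East Asian Wide/Fullwidth characters take two, everything else one.
    """
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def byte_column_to_display_column(line: bytes, byte_col: int) -> int:
    """Display width of the first byte_col bytes of a UTF-8 encoded line."""
    # A cut in the middle of a multibyte sequence decodes to U+FFFD.
    prefix = line[:byte_col].decode("utf-8", errors="replace")
    return sum(char_width(ch) for ch in prefix)


# Three Hebrew letters occupy 6 bytes but only 3 display columns.
line = "char *\u05d0\u05d1\u05d2 = 1;".encode("utf-8")
print(byte_column_to_display_column(line, 12))
```

A caret placed by byte offset would land three columns too far right on this line, which is the kind of misalignment the commit addresses in diagnostic-show-locus.c.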
8 Commits