Makefile.in (LIBCPP_DEPS): New macro.
* Makefile.in (LIBCPP_DEPS): New macro.
(cpplib.o, cpphash.o, cpperror.o, cppexp.o, cppfiles.o): Use
it to declare deps.
* cpperror.c: Include cpphash.h.
* cppexp.c: Include cpphash.h. Remove MULTIBYTE_CHARS
dingleberry.
(lex): Don't use CPP_WARN_UNDEF.
(_cpp_parse_expr): Return an int, the truth value.
* cppfiles.c: Include cpphash.h.
(_cpp_merge_include_chains): Move to cppinit.c and make static.
* cppinit.c (include_defaults_array): Disentangle.
(cpp_cleanup): Don't free the if stack here.
(cpp_finish): Pop off all buffers, not just one.
* cpplib.c (eval_if_expr): Return int.
(do_xifdef): Rename do_ifdef.
(handle_directive): Don't use CPP_PREPROCESSED.
(cpp_get_token): Don't use CPP_C89.
* fix-header.c: Don't use CPP_OPTIONS.
* cpplib.h: Move U_CHAR, enum node_type, struct
file_name_list, struct ihash, is_idchar, is_idstart,
is_numchar, is_numstart, is_hspace, is_space, CPP_BUF_PEEK,
CPP_BUF_GET, CPP_FORWARD, CPP_PUTS, CPP_PUTS_Q, CPP_PUTC,
CPP_PUTC_Q, CPP_NUL_TERMINATE, CPP_NUL_TERMINATE_Q,
CPP_BUMP_BUFFER_LINE, CPP_BUMP_LINE, CPP_PREV_BUFFER,
CPP_PRINT_DEPS, CPP_TRADITIONAL, CPP_PEDANTIC, and prototypes
of _cpp_simplify_pathname, _cpp_find_include_file,
_cpp_read_include_file, and _cpp_parse_expr to cpphash.h.
Move struct if_stack to cpplib.c. Move struct cpp_pending to
cppinit.c.
Change all uses of U_CHAR to be unsigned char instead.
Delete CPP_WARN_UNDEF, CPP_C89, and CPP_PREPROCESSED.
From-SVN: r32435
2000-03-09 07:35:19 +08:00
/* Part of CPP library.
   Copyright (C) 1997-2024 Free Software Foundation, Inc.

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */

/* This header defines all the internal data structures and functions
   that need to be visible across files. It should not be used outside
   cpplib. */

#ifndef LIBCPP_INTERNAL_H
#define LIBCPP_INTERNAL_H

cpplib.h: Provide HASHNODE typedef and forward decl of struct hashnode only.
* cpplib.h: Provide HASHNODE typedef and forward decl of
struct hashnode only. Kill cpp_hashnode typedef. MACRODEF,
DEFINITION, struct hashnode, struct macrodef, struct
definition, scan_decls prototype, default defn of
INCLUDE_LEN_FUDGE moved elsewhere.
* cpphash.h: MACRODEF, DEFINITION, struct macrodef, struct
definition, and struct hashnode moved here. Remove the unused
'predefined' field from struct definition. Replace the 'args'
union with its sole member. All users updated (cpphash.c).
Delete HASHSTEP and MAKE_POS macros, and hashf prototype. Add
multiple include guard.
* cpphash.c (hashf): Make static; use better algorithm; drop
HASHSIZE parameter; return an unsigned int.
(cpp_lookup): Drop HASH parameter. PFILE parameter is
used. Calculate HASHSIZE modulus here.
(cpp_install): Drop HASH parameter. Calculate HASHSIZE modulus
here.
(create_definition): Drop PREDEFINITION parameter.
* cpplib.c (do_define): Don't calculate a hash value here.
Don't pass (keyword == NULL) to create_definition.
* scan.h: Prototype scan_decls here.
* cppfiles.c: Move INCLUDE_LEN_FUDGE default defn here.
* cppexp.c, cppfiles.c, cppinit.c, cpplib.c, fix-header.c: All
callers of cpp_lookup and cpp_install updated.
From-SVN: r31881
2000-02-10 10:23:08 +08:00
#include "symtab.h"
#include "cpplib.h"
#include "rich-location.h"

Makefile.in (OBJS, [...]): Update.
* Makefile.in (OBJS, LIBCPP_OBJS, LIBCPP_DEPS,
cpplib.o, cpphash.o, fix-header): Update.
(hashtable.o): New target.
* c-common.h: Include cpplib.h. Define C_RID_CODE and
struct c_common_identifier here.
* c-lang.c (c_init_options): Update. Call set_identifier_size.
* c-lex.c (c_lex): Update.
* c-pragma.h: Update.
* c-tree.h (struct lang_identifier): Contain c_common_identifier.
Delete rid_code.
(C_RID_CODE): Delete.
* cpphash.c: Rewrite to use hashtable.c.
* cpphash.h: Update include guards.
(struct cpp_reader): Remove hashtab.
hash_ob and buffer_ob are no longer pointers. Add hash_table
and our_hashtable.
(HASHSTEP, _cpp_init_hashtable, _cpp_lookup_with_hash): Delete.
(_cpp_cleanup_hashtable): Rename _cpp_destroy_hashtable.
(_cpp_cleanup_stacks): Rename _cpp_init_directives.
* cppinit.c (cpp_create_reader): Update.
* cpplex.c (cpp_ideq, parse_identifier, cpp_output_token): Update.
(cpp_interpret_charconst): Eliminate warning.
* cpplib.c (do_pragma, do_endif, push_conditional,
cpp_push_buffer, cpp_pop_buffer): Update.
(_cpp_init_stacks): Rename cpp_init_directives.
(_cpp_cleanup_stacks): Remove.
* cpplib.h: Update include guards. Include tree-core.h and c-rid.h.
(cpp_hashnode, cpp_token, NODE_LEN, NODE_NAME,
cpp_forall_identifiers, cpp_create_reader): Update.
(C_RID_CODE, cpp_make_node): New.
(c_common_identifier): New identifier node for C front ends.
* cppmain.c (main): Update.
* fix-header.c (read_scan_file): Update.
* flags.h (id_clash_len): Make unsigned.
* ggc.h (ggc_mark_nonnull_tree): New.
* hashtable.c: New.
* hashtable.h: New.
* stringpool.c: Update comments and copyright. Update to use
hashtable.c.
* toplev.c (approx_sqrt): Move to hashtable.c.
(id_clash_len): Make unsigned.
* toplev.h (ident_hash): New.
* tree.c (gcc_obstack_init): Move to hashtable.c.
* tree.h: Include hashtable.h.
(IDENTIFIER_POINTER, IDENTIFIER_LENGTH): Update.
(GCC_IDENT_TO_HT_IDENT, HT_IDENT_TO_GCC_IDENT): New.
(struct tree_identifier): Update.
(make_identifier): New.
cp:
* cp-tree.h (struct lang_identifier, C_RID_YYCODE): Update.
(C_RID_CODE): Remove.
* lex.c (cxx_init_options): Call set_identifier_size. Update.
(init_parse): Don't do it here.
objc:
* objc-act.c (objc_init_options): Call set_identifier_size. Update.
From-SVN: r42334
2001-05-20 14:26:45 +08:00
#if HAVE_ICONV
cpplib.h (CPP_AT_NAME, [...]): New token types.
* cpplib.h (CPP_AT_NAME, CPP_OBJC_STRING): New token types.
(struct cpp_options): Add narrow_charset, wide_charset,
bytes_big_endian fields. Remove EBCDIC field.
(cpp_init_iconv, cpp_interpret_string): New external interfaces.
* cpphash.h: Include <iconv.h> if we have it, otherwise
provide a dummy definition of iconv_t.
(struct cpp_reader): Add narrow_cset_desc and wide_cset_desc fields.
(_cpp_valid_ucn): Update prototype.
(_cpp_destroy_iconv): New prototype.
* doc/cpp.texi: Document character set handling.
* doc/cppopts.texi: Document -fexec-charset= and -fexec-wide-charset=.
* doc/extend.texi: Delete entire section on multiline strings.
Rewrite section on __FUNCTION__ etc now that these are
variables in C.
* cppucnid.tab, cppucnid.pl: New files.
* cppucnid.h: New generated file.
* cppcharset.c: Include cppucnid.h. Lots of commentary added.
(iconv_open, iconv, iconv_close): Provide dummy definitions
if !HAVE_ICONV.
(SOURCE_CHARSET, struct strbuf, init_iconv_desc, cpp_init_iconv,
_cpp_destroy_iconv, convert_cset, width_to_mask, convert_ucn,
emit_numeric_escape, convert_hex, convert_oct, convert_escape,
cpp_interpret_string, narrow_str_to_charconst,
wide_str_to_charconst): New.
(ucn_valid_in_identifier): Use a binary search through the
ucnranges table defined in cppucnid.h, not a long chain of if
statements.
(_cpp_valid_ucn): Add a limit pointer. Downgrade "universal
character names are only valid in C++ and C99" to a warning.
Issue the "meaning of \[uU] is different in traditional C"
warning here. Take care not to let iconv see an invalid UCS
value if we get a malformed UCN. Issue an error if we don't
have iconv.
(cpp_interpret_charconst): Moved here from cpplex.c. Use
cpp_interpret_string to do the heavy lifting.
* cppinit.c (cpp_create_reader): Initialize bytes_big_endian,
narrow_charset, wide_charset fields of options structure.
(cpp_destroy): Call _cpp_destroy_iconv.
* cpplex.c (forms_identifier_p): Adjust call to _cpp_valid_ucn.
(maybe_read_ucn, hex_digit_value, cpp_parse_escape): Delete.
(cpp_interpret_charconst): Moved to cppcharset.c.
* cpplib.c (dequote_string): Delete.
(interpret_string_notranslate): New.
(do_line, do_linemarker): Use interpret_string_notranslate.
* Makefile.in (cppcharset.o): Depend on cppucnid.h.
* c-common.c (fname_string, combine_strings): Delete.
* c-common.h (fname_string, combine_strings): Delete prototypes.
* c-lex.c (ignore_escape_flag): Delete.
(cb_ident): Use cpp_interpret_string, not lex_string.
(get_nonpadding_token): New function.
(c_lex): Handle Objective-C @-prefixed identifiers and strings here.
Adjust calls to lex_string. Don't write *value twice.
(lex_string): Now handles string constant concatenation.
Most of the work handed off to cpp_interpret_string.
Call fix_string_type here.
* c-parse.in (STRING_FUNC_NAME, VAR_FUNC_NAME): Replace with
FUNC_NAME, throughout.
(OBJC_STRING): New token type.
(primary:STRING): No need to call fix_string_type here.
(primary:objc_string): Make that OBJC_STRING.
(objc_string nonterminal): Delete.
(yylexname): Delete code to handle fake string constants.
(yylexstring): Delete entirely.
(_yylex): Handle CPP_AT_NAME and CPP_OBJC_STRING. No need
to handle CPP_ATSIGN.
* c.opt (-fexec-charset=, -fwide-exec-charset=): New options.
* c-opts.c (missing_arg, c_common_handle_option): Handle
OPT_fexec_charset_ and OPT_fwide_exec_charset_.
(c_common_init): Set cpp_opts->bytes_big_endian, not
cpp_opts->EBCDIC. Call cpp_init_iconv.
(print_help): Document -fexec-charset= and -fexec-wide-charset=.
(TARGET_EBCDIC): Delete default definition.
* objc/objc-act.c (build_objc_string_object): No need to
handle string constant concatenation.
cp:
* parser.c (cp_lexer_read_token): No need to handle string
constant concatenation.
testsuite:
* gcc.c-torture/execute/wchar_t-1.x: New file; XFAIL wchar_t-1.c
everywhere.
* gcc.dg/concat.c: Concatenation of string constants with
__FUNCTION__ / __PRETTY_FUNCTION__ is now a hard error.
* gcc.dg/wtr-strcat-1.c: Loosen dg-warning regexp.
* gcc.dg/cpp/escape-2.c: Use wide character constants where
necessary to avoid multi-character character constant warning.
* gcc.dg/cpp/escape.c: Likewise.
* gcc.dg/cpp/ucs.c: Likewise.
Remove backslashes from dg-bogus comments, as they confuse Tcl.
Fix a typo.
libstdc++-v3:
* testsuite/22_locale/collate/compare/wchar_t/2.cc
* testsuite/22_locale/collate/compare/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/compare/wchar_t/wrapped_locale.cc
* testsuite/22_locale/collate/hash/wchar_t/2.cc
* testsuite/22_locale/collate/hash/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/hash/wchar_t/wrapped_locale.cc
* testsuite/22_locale/collate/transform/wchar_t/2.cc
* testsuite/22_locale/collate/transform/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/transform/wchar_t/wrapped_locale.cc:
XFAIL on all targets.
From-SVN: r68952
2003-07-05 08:24:00 +08:00
#include <iconv.h>
#else
#define HAVE_ICONV 0
typedef int iconv_t; /* dummy */
#endif

#ifdef __cplusplus
extern "C" {
#endif

struct directive; /* Deliberately incomplete. */
struct pending_option;
struct op;
struct _cpp_strbuf;

cppcharset.c (one_utf8_to_cppchar, [...]): New functions.
* cppcharset.c (one_utf8_to_cppchar, one_cppchar_to_utf8,
one_utf8_to_utf32, one_utf32_to_utf8, one_utf8_to_utf16,
one_utf16_to_utf8, conversion_loop, convert_utf8_utf16,
convert_utf8_utf32, convert_utf16_utf8, convert_utf32_utf8,
convert_no_conversion, convert_using_iconv): New functions.
(APPLY_CONVERSION): New macro.
(struct conversion, conversion_tab): New data structure.
(init_iconv_desc): Check conversion_tab for a custom conversion
primitive before trying to use iconv.
(convert_cset): Deleted.
(cpp_init_iconv): Use UTF- terminology, not UCS-.
(_cpp_destroy_iconv): Update to match.
(_cpp_valid_ucn): We don't need iconv to implement UCNs.
(convert_ucn): Use one_cppchar_to_utf8 and APPLY_CONVERSION.
(convert_escape, cpp_interpret_string): Use APPLY_CONVERSION.
(_cpp_interpret_string_notranslate): New function, moved here
from cpplib.c.
* cpphash.h (convert_f, struct cset_converter): New types.
(struct cpp_reader): narrow_cset_desc and wide_cset_desc
are now struct cset_converter, not bare iconv_t.
Update prototypes.
* cpplib.c (interpret_string_notranslate): Moved to cppcharset.c;
all callers changed.
From-SVN: r69204
2003-07-11 07:16:31 +08:00
typedef bool (*convert_f) (iconv_t, const unsigned char *, size_t,
                           struct _cpp_strbuf *);
struct cset_converter
{
  convert_f func;
  iconv_t cd;
cpp-id-data.h (UC): Was U, conflicts with U...
libcpp/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* include/cpp-id-data.h (UC): Was U, conflicts with U... literal.
* include/cpplib.h (CHAR16, CHAR32, STRING16, STRING32): New tokens.
(struct cpp_options): Added uliterals.
(cpp_interpret_string): Update prototype.
(cpp_interpret_string_notranslate): Idem.
* charset.c (init_iconv_desc): New width member in cset_converter.
(cpp_init_iconv): Add support for char{16,32}_cset_desc.
(convert_ucn): Idem.
(emit_numeric_escape): Idem.
(convert_hex): Idem.
(convert_oct): Idem.
(convert_escape): Idem.
(converter_for_type): New function.
(cpp_interpret_string): Use converter_for_type, support u and U prefix.
(cpp_interpret_string_notranslate): Match changed prototype.
(wide_str_to_charconst): Use converter_for_type.
(cpp_interpret_charconst): Add support for CPP_CHAR{16,32}.
* directives.c (linemarker_dir): Macro U changed to UC.
(parse_include): Idem.
(register_pragma_1): Idem.
(restore_registered_pragmas): Idem.
(get__Pragma_string): Support CPP_STRING{16,32}.
* expr.c (eval_token): Support CPP_CHAR{16,32}.
* init.c (struct lang_flags): Added uliterals.
(lang_defaults): Idem.
* internal.h (struct cset_converter) <width>: New field.
(struct cpp_reader) <char16_cset_desc>: Idem.
(struct cpp_reader) <char32_cset_desc>: Idem.
* lex.c (digraph_spellings): Macro U changed to UC.
(OP, TK): Idem.
(lex_string): Add support for u'...', U'...', u... and U....
(_cpp_lex_direct): Idem.
* macro.c (_cpp_builtin_macro_text): Macro U changed to UC.
(stringify_arg): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
gcc/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* c-common.c (CHAR16_TYPE, CHAR32_TYPE): New macros.
(fname_as_string): Match updated cpp_interpret_string prototype.
(fix_string_type): Support char16_t* and char32_t*.
(c_common_nodes_and_builtins): Add char16_t and char32_t (and
derivative) nodes. Register as builtin if C++0x.
(c_parse_error): Support CPP_CHAR{16,32}.
* c-common.h (RID_CHAR16, RID_CHAR32): New elements.
(enum c_tree_index) <CTI_CHAR16_TYPE, CTI_SIGNED_CHAR16_TYPE,
CTI_UNSIGNED_CHAR16_TYPE, CTI_CHAR32_TYPE, CTI_SIGNED_CHAR32_TYPE,
CTI_UNSIGNED_CHAR32_TYPE, CTI_CHAR16_ARRAY_TYPE,
CTI_CHAR32_ARRAY_TYPE>: New elements.
(char16_type_node, signed_char16_type_node, unsigned_char16_type_node,
char32_type_node, signed_char32_type_node, char16_array_type_node,
char32_array_type_node): New defines.
* c-lex.c (cb_ident): Match updated cpp_interpret_string prototype.
(c_lex_with_flags): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
(lex_string): Support CPP_STRING{16,32}, match updated
cpp_interpret_string and cpp_interpret_string_notranslate prototypes.
(lex_charconst): Support CPP_CHAR{16,32}.
* c-parser.c (c_parser_postfix_expression): Support CPP_CHAR{16,32}
and CPP_STRING{16,32}.
gcc/cp/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* cvt.c (type_promotes_to): Support char16_t and char32_t.
* decl.c (grokdeclarator): Disallow signed/unsigned/short/long on
char16_t and char32_t.
* lex.c (reswords): Add char16_t and char32_t (for c++0x).
* mangle.c (write_builtin_type): Mangle char16_t/char32_t as vendor
extended builtin type u8char32_t.
* parser.c (cp_lexer_next_token_is_decl_specifier_keyword): Support
RID_CHAR{16,32}.
(cp_lexer_print_token): Support CPP_STRING{16,32}.
(cp_parser_is_string_literal): Idem.
(cp_parser_string_literal): Idem.
(cp_parser_primary_expression): Support CPP_CHAR{16,32} and
CPP_STRING{16,32}.
(cp_parser_simple_type_specifier): Support RID_CHAR{16,32}.
* tree.c (char_type_p): Support char16_t and char32_t as char types.
* typeck.c (string_conv_p): Support char16_t and char32_t.
gcc/testsuite/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
Tests for char16_t and char32_t support.
* g++.dg/ext/utf-cvt.C: New
* g++.dg/ext/utf-cxx0x.C: New
* g++.dg/ext/utf-cxx98.C: New
* g++.dg/ext/utf-dflt.C: New
* g++.dg/ext/utf-gnuxx0x.C: New
* g++.dg/ext/utf-gnuxx98.C: New
* g++.dg/ext/utf-mangle.C: New
* g++.dg/ext/utf-typedef-cxx0x.C: New
* g++.dg/ext/utf-typedef-
* g++.dg/ext/utf-typespec.C: New
* g++.dg/ext/utf16-1.C: New
* g++.dg/ext/utf16-2.C: New
* g++.dg/ext/utf16-3.C: New
* g++.dg/ext/utf16-4.C: New
* g++.dg/ext/utf32-1.C: New
* g++.dg/ext/utf32-2.C: New
* g++.dg/ext/utf32-3.C: New
* g++.dg/ext/utf32-4.C: New
* gcc.dg/utf-cvt.c: New
* gcc.dg/utf-dflt.c: New
* gcc.dg/utf16-1.c: New
* gcc.dg/utf16-2.c: New
* gcc.dg/utf16-3.c: New
* gcc.dg/utf16-4.c: New
* gcc.dg/utf32-1.c: New
* gcc.dg/utf32-2.c: New
* gcc.dg/utf32-3.c: New
* gcc.dg/utf32-4.c: New
libiberty/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* testsuite/demangle-expected: Added tests for char16_t and char32_t.
From-SVN: r134438
2008-04-18 21:58:08 +08:00
  int width;
  const char* from;
  const char* to;
};
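
The converter descriptor above pairs a conversion function with the state it needs, so call sites can dispatch through the function pointer without knowing which conversion is in use. The following is only a minimal sketch of that pattern. The names strbuf_sketch, convert_fn, converter_sketch, copy_bytes and apply_conversion are hypothetical and are not part of libcpp; the real struct _cpp_strbuf is not shown in this header, so a fixed-size stand-in is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef int iconv_t; /* stand-in, mirroring the dummy used when iconv is absent */

/* Hypothetical output buffer; an assumption, not the real _cpp_strbuf.  */
struct strbuf_sketch
{
  unsigned char text[64];
  size_t len;
};

/* Same shape as convert_f, but targeting the stand-in buffer.  */
typedef bool (*convert_fn) (iconv_t, const unsigned char *, size_t,
                            struct strbuf_sketch *);

/* Hypothetical descriptor: a function pointer plus its conversion state.  */
struct converter_sketch
{
  convert_fn func;
  iconv_t cd;
  int width;
};

/* Identity conversion: append the input unchanged; fail if it will not fit.  */
static bool
copy_bytes (iconv_t cd, const unsigned char *from, size_t flen,
            struct strbuf_sketch *to)
{
  (void) cd; /* unused by the identity conversion */
  if (flen > sizeof to->text - to->len)
    return false;
  memcpy (to->text + to->len, from, flen);
  to->len += flen;
  return true;
}

/* Dispatch through the descriptor, passing along its stored state.  */
static bool
apply_conversion (const struct converter_sketch *c,
                  const unsigned char *from, size_t flen,
                  struct strbuf_sketch *to)
{
  assert (c->func != NULL);
  return c->func (c->cd, from, flen, to);
}
```

Keeping the function pointer and its state in one descriptor lets a caller swap in a different conversion by changing only the initializer.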
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
#define BITS_PER_CPPCHAR_T (CHAR_BIT * sizeof (cppchar_t))
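
A bit-width constant like this is typically used to guard shifts, since shifting by the full width of a type is undefined in C. The sketch below illustrates that use under stated assumptions: cppchar_sketch_t, BITS_PER_SKETCH_CHAR and mask_for_width are hypothetical names, and the character type is assumed to be unsigned int purely for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: a stand-in for cppchar_t, chosen only for this sketch.  */
typedef unsigned int cppchar_sketch_t;

#define BITS_PER_SKETCH_CHAR (CHAR_BIT * sizeof (cppchar_sketch_t))

/* Return a mask with the low WIDTH bits set.  Widths at or beyond the
   type's size yield an all-ones mask instead of an undefined shift.  */
static cppchar_sketch_t
mask_for_width (size_t width)
{
  if (width >= BITS_PER_SKETCH_CHAR)
    return ~(cppchar_sketch_t) 0;
  return ((cppchar_sketch_t) 1 << width) - 1;
}
```

For example, mask_for_width (8) yields 0xFF, which is the mask one would apply to truncate a value to an 8-bit character.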
New macro expander.
2000-10-28 Neil Booth <neilb@earthling.net>
New macro expander.
* cpplib.c (struct answer): New.
(struct if_stack): Use cpp_lexer_pos rather than line and col.
Rename cmacro mi_cmacro.
(struct directive, KANDR, STDC89, EXTENSION, COND, IF_COND, INCL,
IN_I): New directive and flags.
(skip_rest_of_line, check_eol, run_directive, glue_header_name,
parse_answer, parse_assertion, find_answer): New functions.
(parse_ifdef, detect_if_not_defined, validate_else): Remove.
(lex_macro_node): New function to replace parse_ifdef and
get_define_node.
(_cpp_handle_directive): New function, combines _cpp_check_directive
and _cpp_check_linemarker.
(do_define, do_undef, parse_include, do_include, do_import,
do_include_next, read_line_number, do_line, do_ident, do_pragma,
do_pragma_once, do_pragma_poison, do_pragma_dependency):
Update for new token getting interface.
(do_ifdef, do_ifndef, do_if, do_else, do_endif, push_conditional)
: Update for new multiple-include optimisation technique.
(do_elif): Don't forget to invalidate controlling macros.
(unwind_if_stack, cpp_defined, cpp_push_buffer, cpp_pop_buffer): Update.
(parse_assertion, parse_answer, find_answer, _cpp_test_assertion):
Functions to handle assertions with the new token interface.
(do_assert, do_unassert): Use them.
(cpp_define, _cpp_define_builtin, cpp_undef, cpp_assert, cpp_unassert):
Use run_directive.
(_cpp_init_stacks): Register directive names. Don't register special
nodes.
* cpperror.c (print_containing_files, _cpp_begin_message): Update to
new position recording regime.
(cpp_ice, cpp_fatal, cpp_error, cpp_error_with_line, cpp_warning,
cpp_warning_with_line, cpp_pedwarn, cpp_pedwarn_with_line,
cpp_pedwarn_with_file_and_line): Update for _cpp_begin_message changes.
(cpp_type2name): Move to cpplex.c.
* cppexp.c (parse_charconst): spec_nodes is no longer a pointer.
(parse_defined): Update to handle new multiple include optimisation
method. Remove poisoned identifier warning.
(parse_assertion, TYPE_NAME): Delete.
(lex): Update for multiple include optimisation, removal of
CPP_DEFINED, to use _cpp_test_assertion for assertions and
cpp_token_as_text.
(_cpp_parse_expr): Update for MI optimisation, and to use op_as_text.
(op_as_text): New function, to wrap cpp_token_as_text.
* cppfiles.c (stack_include_file, _cpp_pop_file_buffer):
Update for MI optimisation.
(_cpp_execute_include): Take a token rather than 3 arguments. Fix
segfault on diagnostic.
(_cpp_compare_file_date): Take a token rather than 3 args.
(cpp_read_file): Work correctly for zero-length files.
* cpphash.c (_cpp_init_macros, _cpp_cleanup_macros): Rename
_cpp_init_hashtable and _cpp_cleanup_hashtable.
(cpp_lookup): Place identifiers at front of identifier pool
for _cpp_lookup_with_hash.
(_cpp_lookup_with_hash): Require identifiers to be at the front of
the identifier pool. Commit the memory if not already in the
hash table.
* cppinit.c (cpp_reader_init): Move cpp_init_completed test to top.
Initialise various members of cpp_reader, memory pools, and the
special nodes.
(cpp_printer_init): Delete.
(cpp_cleanup): Update.
(struct builtin, builtin_array, initialize_builtins): Update for new
hashnode definition and builtin handling.
(cpp_start_read, cpp_finish): Don't take or initialise a
printer. Update.
* cpplib.h (cpp_printer, cpp_toklist, CPP_DEFINED, BOL,
PASTED, VAR_ARGS, BEG_OF_FILE, IN_DIRECTIVE, KNOWN_DIRECTIVE,
T_VOID, T_SPECLINE, T_DATE, T_FILE, T_BASE_FILE, T_INCLUDE_LEVEL,
T_TIME, T_STDC, T_OPERATOR, T_POISON, T_MACRO, T_ASSERTION): Delete.
(struct cpp_pool, struct cpp_macro, struct cpp_lexer_pos,
struct cpp_lookahead, CPP_DHASH, enum mi_state, enum mi_ind,
NO_EXPAND, VARARGS_FIRST, struct cpp_token_with_pos,
struct toklist, struct cpp_context, struct specnodes,
TOKEN_LOOKAHEAD, TOKEN_BUFFSIZE, NODE_OPERATOR, NODE_POISONED,
NODE_BUILTIN, NODE_DIAGNOSTIC, NT_VOID, NT_MACRO, NT_ASSERTION,
enum builtin_type, cpp_can_paste): New.
(struct cpp_token): Delete line and col members.
(struct cpp_buffer): New member output_lineno.
(struct lexer_state): Delete indented, in_lex_line, seen_dot.
Add va_args_ok, poisoned_ok, prevent_expansion, parsing_args.
(struct cpp_reader): New members lexer_pos, macro_pos, directive_pos,
ident_pool, temp_string_pool, macro_pool, argument_pool, string_pool,
base_context, context, directive, mi_state, mi_if_not_defined,
mi_lexed, mi_cmacro, mi_ind_cmacro, la_read, la_write, la_unused,
mlstring_pos, macro_buffer, macro_buffer_len.
Delete members mls_line, mls_column, token_list, potential_control_macro,
temp_tokens, temp_cap, temp_alloced, temp_used, first_directive_token,
context_cap, cur_context, no_expand_level, paste_level, contexts, args,
save_parameter_spellings, need_newline, .
Change type of date, time and spec_nodes members.
Change prototypes for include and ident callbacks.
(struct cpp_hashnode): Change type of name. Remove union members
expansion and code. Add members macro, operator and builtin.
(cpp_token_len, cpp_token_as_text, cpp_spell_token, cpp_start_read,
cpp_finish, cpp_avoid_paste, cpp_get_token, cpp_get_line,
cpp_get_output_line, cpp_macro_definition, cpp_start_lookahead,
cpp_stop_lookahead): New prototypes.
(cpp_printer_init, cpp_dump_definition): Delete prototypes.
(U_CHAR, U, ustrcmp, ustrncmp, ustrlen, uxstrdup, ustrchr, ufputs):
Move from cpphash.h.
* cpphash.h (U_CHAR, U, ustrcmp, ustrncmp, ustrlen, uxstrdup, ustrchr,
ufputs): Move to cpplib.h.
(enum spell_type, struct token_spelling, _cpp_token_spellings, TOKEN_SPELL,
TOKEN_NAME, struct answer, FREE_ANSWER, KANDR, STDC89, EXTENSION,
COND, EXPAND, INCL, COMMENTS, IN_I, struct directive, directive_handler,
struct spec_nodes, _cpp_digraph_spellings, _cpp_free_temp_tokens,
_cpp_init_input_buffer, _cpp_grow_token_buffer, _cpp_init_toklist,
_cpp_clear_toklist, _cpp_expand_token_space, _cpp_expand_name_space,
_cpp_equiv_tokens, _cpp_equiv_toklists, _cpp_process_directive,
_cpp_run_directive, _cpp_get_line, _cpp_get_raw_token, _cpp_glue_header_name,
_cpp_can_paste, _cpp_check_directive, _cpp_check_linemarker,
_cpp_parse_assertion, _cpp_find_answer): Delete.
(VALID_SIGN, ALIGN, POOL_FRONT, POOL_LIMIT, POOL_BASE, POOL_SIZE,
POOL_USED, POOL_COMMIT, struct cpp_chunk, _cpp_lex_token, _cpp_init_pool,
_cpp_free_pool, _cpp_pool_reserve, _cpp_pool_alloc, _cpp_next_chunk,
_cpp_lock_pool, _cpp_unlock_pool, _cpp_test_assertion,
_cpp_handle_directive, DSC): New.
(struct include_file): New member defined.
(DO_NOT_REREAD, _cpp_begin_message, _cpp_execute_include,
_cpp_compare_file_date): Update.
(_cpp_pop_context, _cpp_get_token, _cpp_free_lookaheads, _cpp_push_token): New.
(_cpp_init_macros, _cpp_cleanup_macros): Rename to _cpp_init_hashtable,
_cpp_cleanup_hashtable.
* Makefile.in: Remove cppoutput.c.
* cppoutput.c: Delete.
* fix-header.c (read_scan_file): Update for new cpp_get_token
prototype.
(recognized_function): New argument LINE.
* scan-decls.c (skip_to_closing_brace, scan_decls): Update for
new cpp_get_token prototype.
* scan.h (recognized_function): Update prototype.
* po/POTFILES.in: Remove cppoutput.c.
From-SVN: r37098
2000-10-29 01:59:06 +08:00
|
|
|
/* Test if a sign is valid within a preprocessing number. */
|
|
|
|
#define VALID_SIGN(c, prevc) \
|
|
|
|
(((c) == '+' || (c) == '-') && \
|
|
|
|
((prevc) == 'e' || (prevc) == 'E' \
|
2000-11-27 01:31:13 +08:00
|
|
|
|| (((prevc) == 'p' || (prevc) == 'P') \
|
|
|
|
&& CPP_OPTION (pfile, extended_numbers))))
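A minimal sketch of how a test like VALID_SIGN can be used while scanning a preprocessing number. The names sketch_options, sketch_valid_sign and sketch_pp_number_len are hypothetical and exist only for this illustration; the real macro reads the option through CPP_OPTION (pfile, extended_numbers), and the scanner below is a simplification, not libcpp's lexer.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CPP_OPTION (pfile, extended_numbers). */
struct sketch_options {
    bool extended_numbers;
};

/* Same condition as VALID_SIGN: a '+' or '-' belongs to the number only
   directly after an exponent marker ('e'/'E', or 'p'/'P' when extended
   numbers are enabled). */
static bool sketch_valid_sign(const struct sketch_options *opts,
                              char c, char prevc)
{
    if (c != '+' && c != '-')
        return false;
    if (prevc == 'e' || prevc == 'E')
        return true;
    return (prevc == 'p' || prevc == 'P') && opts->extended_numbers;
}

/* Simplified scanner (an assumption of this sketch): return the length
   of the preprocessing-number prefix of S, or 0 if S does not start one. */
static size_t sketch_pp_number_len(const struct sketch_options *opts,
                                   const char *s)
{
    size_t n = 0;

    /* A pp-number starts with a digit, or with '.' followed by a digit. */
    if (!isdigit((unsigned char) s[0])
        && !(s[0] == '.' && isdigit((unsigned char) s[1])))
        return 0;

    while (s[n] != '\0') {
        unsigned char c = (unsigned char) s[n];
        if (isalnum(c) || c == '_' || c == '.'
            || (n > 0 && sketch_valid_sign(opts, s[n], s[n - 1])))
            n++;
        else
            break;
    }
    return n;
}
```

With extended numbers disabled, scanning "1e+5+2" stops after "1e+5" (length 4), because the second '+' does not follow an exponent marker; "0x1p-3" is consumed whole only when extended numbers are enabled.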
|
New macro expander.
2000-10-28 Neil Booth <neilb@earthling.net>
New macro expander.
* cpplib.c (struct answer): New.
(struct if_stack): Use cpp_lexer_pos rather than line and col.
Rename cmacro mi_cmacro.
(struct directive, KANDR, STDC89, EXTENSION, COND, IF_COND, INCL,
IN_I): New directive and flags.
(skip_rest_of_line, check_eol, run_directive, glue_header_name,
parse_answer, parse_assertion, find_answer): New functions.
(parse_ifdef, detect_if_not_defined, validate_else): Remove.
(lex_macro_node): New function to replace parse_ifdef and
get_define_node.
(_cpp_handle_directive): New function, combines _cpp_check_directive
and _cpp_check_linemarker.
(do_define, do_undef, parse_include, do_include, do_import,
do_include_next, read_line_number, do_line, do_ident, do_pragma,
do_pragma_once, do_pragma_poison, do_pragma_dependency):
Update for new token getting interface.
(do_ifdef, do_ifndef, do_if, do_else, do_endif, push_conditional):
Update for new multiple-include optimisation technique.
(do_elif): Don't forget to invalidate controlling macros.
(unwind_if_stack, cpp_defined, cpp_push_buffer, cpp_pop_buffer): Update.
(parse_assertion, parse_answer, find_answer, _cpp_test_assertion):
Functions to handle assertions with the new token interface.
(do_assert, do_unassert): Use them.
(cpp_define, _cpp_define_builtin, cpp_undef, cpp_assert, cpp_unassert):
Use run_directive.
(_cpp_init_stacks): Register directive names. Don't register special
nodes.
* cpperror.c (print_containing_files, _cpp_begin_message): Update to
new position recording regime.
(cpp_ice, cpp_fatal, cpp_error, cpp_error_with_line, cpp_warning,
cpp_warning_with_line, cpp_pedwarn, cpp_pedwarn_with_line,
cpp_pedwarn_with_file_and_line): Update for _cpp_begin_message changes.
(cpp_type2name): Move to cpplex.c.
* cppexp.c (parse_charconst): spec_nodes is no longer a pointer.
(parse_defined): Update to handle new multiple include optimisation
method. Remove poisoned identifier warning.
(parse_assertion, TYPE_NAME): Delete.
(lex): Update for multiple include optimisation, removal of
CPP_DEFINED, to use _cpp_test_assertion for assertions and
cpp_token_as_text.
(_cpp_parse_expr): Update for MI optimisation, and to use op_as_text.
(op_as_text): New function, to wrap cpp_token_as_text.
* cppfiles.c (stack_include_file, _cpp_pop_file_buffer):
Update for MI optimisation.
(_cpp_execute_include): Take a token rather than 3 arguments. Fix
segfault on diagnostic.
(_cpp_compare_file_date): Take a token rather than 3 args.
(cpp_read_file): Work correctly for zero-length files.
* cpphash.c (_cpp_init_macros, _cpp_cleanup_macros): Rename
_cpp_init_hashtable and _cpp_cleanup_hashtable.
(cpp_lookup): Place identifiers at front of identifier pool
for _cpp_lookup_with_hash.
(_cpp_lookup_with_hash): Require identifiers to be at the front of
the identifier pool. Commit the memory if not already in the
hash table.
* cppinit.c (cpp_reader_init): Move cpp_init_completed test to top.
Initialise various members of cpp_reader, memory pools, and the
special nodes.
(cpp_printer_init): Delete.
(cpp_cleanup): Update.
(struct builtin, builtin_array, initialize_builtins): Update for new
hashnode definition and builtin handling.
(cpp_start_read, cpp_finish): Don't take or initialise a
printer. Update.
* cpplib.h (cpp_printer, cpp_toklist, CPP_DEFINED, BOL,
PASTED, VAR_ARGS, BEG_OF_FILE, IN_DIRECTIVE, KNOWN_DIRECTIVE,
T_VOID, T_SPECLINE, T_DATE, T_FILE, T_BASE_FILE, T_INCLUDE_LEVEL,
T_TIME, T_STDC, T_OPERATOR, T_POISON, T_MACRO, T_ASSERTION): Delete.
(struct cpp_pool, struct cpp_macro, struct cpp_lexer_pos,
struct cpp_lookahead, CPP_DHASH, enum mi_state, enum mi_ind,
NO_EXPAND, VARARGS_FIRST, struct cpp_token_with_pos,
struct toklist, struct cpp_context, struct specnodes,
TOKEN_LOOKAHEAD, TOKEN_BUFFSIZE, NODE_OPERATOR, NODE_POISONED,
NODE_BUILTIN, NODE_DIAGNOSTIC, NT_VOID, NT_MACRO, NT_ASSERTION,
enum builtin_type, cpp_can_paste): New.
(struct cpp_token): Delete line and col members.
(struct cpp_buffer): New member output_lineno.
(struct lexer_state): Delete indented, in_lex_line, seen_dot.
Add va_args_ok, poisoned_ok, prevent_expansion, parsing_args.
(struct cpp_reader): New members lexer_pos, macro_pos, directive_pos,
ident_pool, temp_string_pool, macro_pool, argument_pool, string_pool,
base_context, context, directive, mi_state, mi_if_not_defined,
mi_lexed, mi_cmacro, mi_ind_cmacro, la_read, la_write, la_unused,
mlstring_pos, macro_buffer, macro_buffer_len.
Delete members mls_line, mls_column, token_list, potential_control_macro,
temp_tokens, temp_cap, temp_alloced, temp_used, first_directive_token,
context_cap, cur_context, no_expand_level, paste_level, contexts, args,
save_parameter_spellings, need_newline.
Change type of date, time and spec_nodes members.
Change prototypes for include and ident callbacks.
(struct cpp_hashnode): Change type of name. Remove union members
expansion and code. Add members macro, operator and builtin.
(cpp_token_len, cpp_token_as_text, cpp_spell_token, cpp_start_read,
cpp_finish, cpp_avoid_paste, cpp_get_token, cpp_get_line,
cpp_get_output_line, cpp_macro_definition, cpp_start_lookahead,
cpp_stop_lookahead): New prototypes.
(cpp_printer_init, cpp_dump_definition): Delete prototypes.
(U_CHAR, U, ustrcmp, ustrncmp, ustrlen, uxstrdup, ustrchr, ufputs):
Move from cpphash.h.
* cpphash.h (U_CHAR, U, ustrcmp, ustrncmp, ustrlen, uxstrdup, ustrchr,
ufputs): Move to cpplib.h.
(enum spell_type, struct token_spelling, _cpp_token_spellings, TOKEN_SPELL,
TOKEN_NAME, struct answer, FREE_ANSWER, KANDR, STDC89, EXTENSION,
COND, EXPAND, INCL, COMMENTS, IN_I, struct directive, directive_handler,
struct spec_nodes, _cpp_digraph_spellings, _cpp_free_temp_tokens,
_cpp_init_input_buffer, _cpp_grow_token_buffer, _cpp_init_toklist,
_cpp_clear_toklist, _cpp_expand_token_space, _cpp_expand_name_space,
_cpp_equiv_tokens, _cpp_equiv_toklists, _cpp_process_directive,
_cpp_run_directive, _cpp_get_line, _cpp_get_raw_token, _cpp_glue_header_name,
_cpp_can_paste, _cpp_check_directive, _cpp_check_linemarker,
_cpp_parse_assertion, _cpp_find_answer): Delete.
(VALID_SIGN, ALIGN, POOL_FRONT, POOL_LIMIT, POOL_BASE, POOL_SIZE,
POOL_USED, POOL_COMMIT, struct cpp_chunk, _cpp_lex_token, _cpp_init_pool,
_cpp_free_pool, _cpp_pool_reserve, _cpp_pool_alloc, _cpp_next_chunk,
_cpp_lock_pool, _cpp_unlock_pool, _cpp_test_assertion,
_cpp_handle_directive, DSC): New.
(struct include_file): New member defined.
(DO_NOT_REREAD, _cpp_begin_message, _cpp_execute_include,
_cpp_compare_file_date): Update.
(_cpp_pop_context, _cpp_get_token, _cpp_free_lookaheads, _cpp_push_token): New.
(_cpp_init_macros, _cpp_cleanup_macros): Rename to _cpp_init_hashtable,
_cpp_cleanup_hashtable.
* Makefile.in: Remove cppoutput.c.
* cppoutput.c: Delete.
* fix-header.c (read_scan_file): Update for new cpp_get_token
prototype.
(recognized_function): New argument LINE.
* scan-decls.c (skip_to_closing_brace, scan_decls): Update for
new cpp_get_token prototype.
* scan.h (recognized_function): Update prototype.
* po/POTFILES.in: Remove cppoutput.c.
From-SVN: r37098
2000-10-29 01:59:06 +08:00
|
|
|
|
2013-10-31 22:01:23 +08:00
|
|
|
#define DIGIT_SEP(c) ((c) == '\'' && CPP_OPTION (pfile, digit_separators))
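A minimal sketch of the DIGIT_SEP test in isolation. The helper names are hypothetical, the boolean parameter stands in for CPP_OPTION (pfile, digit_separators), and the rule that a separator must be followed by a digit is a simplifying assumption of this sketch rather than a statement about libcpp's lexer.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Same condition as DIGIT_SEP, with the option passed explicitly. */
static bool sketch_digit_sep(bool digit_separators, char c)
{
    return c == '\'' && digit_separators;
}

/* Length of the leading run of decimal digits in S, allowing an
   apostrophe between digits when separators are enabled. */
static size_t sketch_scan_digits(bool digit_separators, const char *s)
{
    size_t n = 0;

    for (;;) {
        if (isdigit((unsigned char) s[n]))
            n++;
        else if (n > 0
                 && sketch_digit_sep(digit_separators, s[n])
                 && isdigit((unsigned char) s[n + 1]))
            n++;            /* separator sits between two digits */
        else
            break;
    }
    return n;
}
```

With separators enabled, "1'000'000" is scanned as a single run of 9 characters; with them disabled the scan stops after the first digit.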
|
|
|
|
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
#define CPP_OPTION(PFILE, OPTION) ((PFILE)->opts.OPTION)
|
|
|
|
#define CPP_BUFFER(PFILE) ((PFILE)->buffer)
|
cppfiles.c (ENABLE_VALGRIND_CHECKING, [...]): Remove.
* cppfiles.c (ENABLE_VALGRIND_CHECKING, VALGRIND_DISCARD,
MMAP_THRESHOLD, TEST_THRESHOLD, SHOULD_MMAP): Remove.
(struct include_file): Remove refcnt, mapped members.
(open_file, stack_include_file, _cpp_pop_file_buffer): Disable caching.
(read_include_file): Don't use mmap, terminate buffers in '\r'.
(purge_cache): Don't use munmap.
* cpphash.h (CPP_BUF_COLUMN): Update.
(lexer_state): Remove lexing_comment.
(struct _cpp_line_note): New.
(struct cpp_buffer): New members cur_note, notes_used, notes_cap,
next_line and need_line. Remove col_adjust and saved_flags.
(_cpp_process_line_notes, _cpp_clean_line, _cpp_get_fresh_line,
_cpp_skip_block_comment, scan_out_logical_line): New.
(_cpp_init_mbchar): Remove.
* cppinit.c (init_library): Remove call to _cpp_init_mbchar.
(cpp_read_main_file): Set line to 1 earlier.
(post_options): -traditional-cpp doesn't want trigraphs.
* cpplex.c (MULTIBYTE_CHARS): Remove code predicated on this.
(add_line_note, _cpp_clean_line, _cpp_process_line_notes,
_cpp_get_fresh_line): New.
(handle_newline, skip_escaped_newlines, trigraph_p,
continue_after_nul, _cpp_init_mbchar): Remove.
(get_effective_char): Update.
(_cpp_skip_block_comment): Rename from skip_block_comment, simplify.
(skip_line_comment): Simplify.
(skip_whitespace, parse_identifier, parse_slow, parse_number,
parse_string): Update.
(cpp_lex_direct): Use clean lines and process line notes. Update.
(cpp_interpret_charconst): No MULTIBYTE_CHARS.
* cpplib.c (prepare_directive_trad): Call scan_out_logical_line
directly.
(_cpp_handle_directive): Don't set saved_flags.
(run_directive, destringize_and_run, cpp_define, cpp_define_builtin,
cpp_undef, handle_assertion, cpp_push_buffer): Update.
(_cpp_pop_buffer): Free notes.
* cppmacro.c (builtin_macro, paste_tokens): \n terminate buffer.
* cpppch.c (cpp_read_state): \n terminate buffer.
* cpptrad.c (skip_escaped_newlines, handle_newline): Remove.
(copy_comment): Use _cpp_skip_block_comment.
(skip_whitespace, lex_identifier, _cpp_read_logical_line_trad):
Simplify.
(_cpp_overlay_buffer, _cpp_remove_overlay, push_replacement_text,
save_replacement_text): Update.
(scan_out_logical_line): Update to use clean lines and process
line notes.
* fix-header.c (read_scan_file): Update.
testsuite:
* gcc.dg/cpp/_Pragma4.c: Remove stray space.
* gcc.dg/cpp/trad/escaped-eof.c: Correct line number.
From-SVN: r65808
2003-04-19 08:22:51 +08:00
|
|
|
#define CPP_BUF_COLUMN(BUF, CUR) ((CUR) - (BUF)->line_base)
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
2001-01-13 22:32:59 +08:00
|
|
|
#define CPP_BUF_COL(BUF) CPP_BUF_COLUMN(BUF, (BUF)->cur)
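A minimal sketch of the pointer arithmetic behind CPP_BUF_COLUMN and CPP_BUF_COL: the column is the distance from the start of the current line to the read pointer. The struct and the SKETCH_ names are hypothetical reductions of cpp_buffer to the two fields the macros touch.

```c
#include <stddef.h>

/* Hypothetical cut-down buffer: only the fields the macros use. */
struct sketch_buffer {
    const unsigned char *line_base; /* first character of the current line */
    const unsigned char *cur;       /* current read position */
};

#define SKETCH_BUF_COLUMN(BUF, CUR) ((CUR) - (BUF)->line_base)
#define SKETCH_BUF_COL(BUF) SKETCH_BUF_COLUMN(BUF, (BUF)->cur)

/* Compute the zero-based column of TEXT[OFFSET] by walking the text,
   moving line_base past every newline seen before OFFSET. */
static ptrdiff_t sketch_column_of(const char *text, size_t offset)
{
    struct sketch_buffer buf;
    const unsigned char *p = (const unsigned char *) text;
    size_t i;

    buf.line_base = p;
    for (i = 0; i < offset; i++)
        if (p[i] == '\n')
            buf.line_base = p + i + 1;
    buf.cur = p + offset;
    return SKETCH_BUF_COL(&buf);
}
```

For the text "ab\ncdef", offset 5 points at 'e' on the second line, and the sketch reports column 2.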
|
|
|
|
|
Represent column numbers using line-map's source_location.
The "next available source_location" is now managed internally by
line-maps.c rather than by clients.
* line-map.h (struct line_map): New field column_bits.
<from_line>: Rename field to start_location.
(struct line_maps): New fields highest_location and max_column_hint.
(linemap_check_files_exited): New declaration.
(linemap_line_start): New declaration.
(linemap_add): Remove from_line parameter; use highest_location field.
(SOURCE_LINE, LAST_SOURCE_LINE): Modify to use column_bits.
(SOURCE_COLUMN, LAST_SOURCE_LINE_LOCATION): New macros.
(CURRENT_LINE_MAP): Remove macro.
(linemap_position_for_column): New inline function.
* line-map.c (linemap_init): Clear new fields.
(linemap_check_files_exited): New function, extracted from ...
(linemap_free): Use linemap_check_files_exited.
(linemap_add): Remove from_line parameter. Various updates.
(linemap_line_start): New function.
(linemap_lookup): Update for new field names.
* cpphash.h (struct cpp_reader) <map>: Field removed. Because
linemap_position_for_column may unpredictably change the current map,
it is cleaner and simpler for us to not cache it in cpp_reader.
(struct cpp_buffer): New sysp field.
Changed warned_cplusplus_comments and from_stage3 to bitfields.
* cppinit.c (cpp_read_main_file): pfile->map no longer exists.
* cpplib.c (do_line, do_linemarker, _cpp_do_file_change): Get
current map using linemap_lookup.
(do_linemarker): Also set buffer's sysp field.
(destringize_and_run): No longer need to decrement current line.
* cppfiles.c (_cpp_stack_file): Set sysp from and in buffer.
(search_path_head, open_file_failed): Use buffer's sysp.
(cpp_make_system_header): Get current map using linemap_lookup.
Also set buffer's sysp flag.
* cppmacro.c (_cpp_builtin_macro_text): Likewise use linemap_lookup.
* cpphash.h (CPP_INCREMENT_LINE): New macro.
(struct cpp_buffer): Moved fields saved_cur, saved_rlimit to ...
(struct cpp_reader): ... and adding saved_line_base field.
* cpptrad.c (_cpp_overlay_buffer, _cpp_remove_overlay):
Update accordingly. Don't adjust line.
(_cpp_scan_out_logical_line): Use CPP_INCREMENT_LINE.
* cpphash.c (CPP_IN_SYSTEM_HEADER): Replaced macro by ...
(cpp_in_system_header): ... new inline function, using buffer's sysp.
* cpperror.c (_cpp_begin_message): Update to use cpp_in_system_header.
* cpplex.c (_cpp_lex_direct): Likewise.
* cppmacro.c (_cpp_builtin_macro_text): Likewise.
* cppmacro.c (_cpp_create_definition): Use buffer's sysp field.
* cpplib.h (struct cpp_token): Rename line field to src_loc.
Remove col field as it is now subsumed by src_loc.
* cpperror.c: Update various field, parameter, and macro names.
(print_location): If col==0, try SOURCE_COLUMN of line.
(cpp_error): Use cur_token's src_loc field, rather than line+col.
* cpplib.c (do_diagnostic): Token's src_loc field replaces line+col.
* cpplex.c (_cpp_process_line_notes, _cpp_lex_direct,
_cpp_skip_block_comment): Use CPP_INCREMENT_LINE.
(_cpp_temp_token): Replace cpp_token's line+col fields by src_loc.
(_cpp_get_fresh_line): Don't need to adjust line for missing newline.
(_cpp_lex_direct): Use linemap_position_for_column.
* c-ppoutput.c (maybe_print_line, print_line): Don't take map
parameter. Instead get it from the line_table global. Adjust callers.
(print): Remove map field. Replace line field with src_line.
(init_pp_output, account_for_newlines, maybe_print_line): Adjust.
(cb_line_change): Use SOURCE_COLUMN. Minor optimizations.
(pp_file_change): Use MAIN_FILE_P since we cannot check print.map.
Use LAST_SOURCE_LINE_LOCATION to "catch up" after #include.
* cpptrad.c (copy_comment): Rename variable.
* c-lex.c (map): Remove static variable, for same reason we removed
cpp_reader's map field.
(cb_line_change, cb_def_pragma, cb_define, cb_undef): Hence we need
to call linemap_lookup.
(cb_line_change): Token's line field replaced by src_loc.
(fe_file_change): Use MAIN_FILE_P and LAST_SOURCE_LINE macros.
Don't save new_map.
* cpphash.h, cpperror.c, cpplib.h: Some renames of fileline to
source_location.
From-SVN: r77663
2004-02-11 23:29:30 +08:00
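A minimal sketch of the encoding this change describes, in which one integer carries both line and column and a per-map column_bits field says how many low bits hold the column. The struct and function names are hypothetical; the arithmetic is an assumption modelled on what the SOURCE_LINE and SOURCE_COLUMN entries above imply, not a copy of line-map.h.

```c
typedef unsigned int sketch_location;

/* Hypothetical cut-down line map. */
struct sketch_line_map {
    sketch_location start_location; /* first location covered by the map */
    unsigned int to_line;           /* source line of start_location */
    unsigned int column_bits;       /* low bits reserved for the column */
};

/* Pack LINE and COLUMN into one location relative to MAP. */
static sketch_location sketch_encode(const struct sketch_line_map *map,
                                     unsigned int line, unsigned int column)
{
    return map->start_location
           + ((line - map->to_line) << map->column_bits)
           + column;
}

/* Recover the line: drop the column bits, then rebase. */
static unsigned int sketch_source_line(const struct sketch_line_map *map,
                                       sketch_location loc)
{
    return ((loc - map->start_location) >> map->column_bits) + map->to_line;
}

/* Recover the column: keep only the low column_bits bits. */
static unsigned int sketch_source_column(const struct sketch_line_map *map,
                                         sketch_location loc)
{
    return (loc - map->start_location) & ((1u << map->column_bits) - 1);
}
```

With start_location 100, to_line 1 and column_bits 7, line 3 column 5 encodes to 361 and decodes back to the same pair. A column that does not fit in column_bits would corrupt the line, which is presumably why the real code needs a column hint when starting a line.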
|
|
|
#define CPP_INCREMENT_LINE(PFILE, COLS_HINT) do { \
|
2019-07-10 02:32:49 +08:00
|
|
|
const class line_maps *line_table = PFILE->line_table; \
|
Replace line_map union with C++ class hierarchy
gcc/ChangeLog:
* diagnostic.c (diagnostic_report_current_module): Strengthen
local "new_map" from const line_map * to
const line_map_ordinary *.
* genmatch.c (error_cb): Likewise for local "map".
(output_line_directive): Likewise for local "map".
* input.c (expand_location_1): Likewise for local "map".
Pass NULL rather than &map to
linemap_unwind_to_first_non_reserved_loc, since the value is never
read from there, and the value written back not read from here.
(is_location_from_builtin_token): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
(dump_location_info): Strengthen locals "map" from
line_map *, one to const line_map_ordinary *, the other
to const line_map_macro *.
* tree-diagnostic.c (loc_map_pair): Strengthen field "map" from
const line_map * to const line_map_macro *.
(maybe_unwind_expanded_macro_loc): Add a call to
linemap_check_macro when writing to the "map" field of the
loc_map_pair.
Introduce local const line_map_ordinary * "ord_map", using it in
place of "map" in the part of the function where we know we have
an ordinary map. Strengthen local "m" from const line_map * to
const line_map_ordinary *.
gcc/ada/ChangeLog:
* gcc-interface/trans.c (Sloc_to_locus1): Strengthen local "map"
from line_map * to line_map_ordinary *.
gcc/c-family/ChangeLog:
* c-common.h (fe_file_change): Strengthen param from
const line_map * to const line_map_ordinary *.
(pp_file_change): Likewise.
* c-lex.c (fe_file_change): Likewise.
(cb_define): Use linemap_check_ordinary when invoking
SOURCE_LINE.
(cb_undef): Likewise.
* c-opts.c (c_finish_options): Use linemap_check_ordinary when
invoking cb_file_change.
(c_finish_options): Likewise.
(push_command_line_include): Likewise.
(cb_file_change): Strengthen param "new_map" from
const line_map * to const line_map_ordinary *.
* c-ppoutput.c (cb_define): Likewise for local "map".
(pp_file_change): Likewise for param "map" and local "from".
gcc/fortran/ChangeLog:
* cpp.c (maybe_print_line): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
(cb_file_change): Likewise for param "map" and local "from".
(cb_line_change): Likewise for local "map".
libcpp/ChangeLog:
* directives.c (do_line): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
(do_linemarker): Likewise.
(_cpp_do_file_change): Assert that we're not dealing with
a macro map. Introduce local "ord_map" via a call to
linemap_check_ordinary, guarded within the check for
non-NULL. Use it for typesafety.
* files.c (cpp_make_system_header): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
* include/cpplib.h (struct cpp_callbacks): Likewise for second
parameter of "file_change" callback.
* include/line-map.h (struct line_map): Convert from a struct
containing a union to a base class.
(struct line_map_ordinary): Convert to a subclass of line_map.
(struct line_map_macro): Likewise.
(linemap_check_ordinary): Strengthen return type from line_map *
to line_map_ordinary *, and add a const-variant.
(linemap_check_macro): New pair of functions.
(ORDINARY_MAP_STARTING_LINE_NUMBER): Strengthen param from
const line_map * to const line_map_ordinary *, eliminating call
to linemap_check_ordinary. Likewise for the non-const variant.
(ORDINARY_MAP_INCLUDER_FILE_INDEX): Likewise.
(ORDINARY_MAP_IN_SYSTEM_HEADER_P): Likewise.
(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Likewise.
(ORDINARY_MAP_FILE_NAME): Likewise.
(MACRO_MAP_MACRO): Strengthen param from const line_map * to
const line_map_macro *. Likewise for the non-const variant.
(MACRO_MAP_NUM_MACRO_TOKENS): Likewise.
(MACRO_MAP_LOCATIONS): Likewise.
(MACRO_MAP_EXPANSION_POINT_LOCATION): Likewise.
(struct maps_info): Replace with...
(struct maps_info_ordinary):...this and...
(struct maps_info_macro): ...this.
(struct line_maps): Convert fields "info_ordinary" and
"info_macro" to the above new structs.
(LINEMAPS_MAP_INFO): Delete both functions.
(LINEMAPS_MAPS): Likewise.
(LINEMAPS_ALLOCATED): Rewrite both variants to avoid using
LINEMAPS_MAP_INFO.
(LINEMAPS_USED): Likewise.
(LINEMAPS_CACHE): Likewise.
(LINEMAPS_MAP_AT): Likewise.
(LINEMAPS_ORDINARY_MAPS): Strengthen return type from line_map *
to line_map_ordinary *.
(LINEMAPS_ORDINARY_MAP_AT): Likewise.
(LINEMAPS_LAST_ORDINARY_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_ORDINARY_MAP): Likewise.
(LINEMAPS_MACRO_MAPS): Strengthen return type from line_map * to
line_map_macro *.
(LINEMAPS_MACRO_MAP_AT): Likewise.
(LINEMAPS_LAST_MACRO_MAP): Likewise.
(LINEMAPS_LAST_ALLOCATED_MACRO_MAP): Likewise.
(linemap_map_get_macro_name): Strengthen param from
const line_map * to const line_map_macro *.
(SOURCE_LINE): Strengthen first param from const line_map * to
const line_map_ordinary *, removing call to
linemap_check_ordinary.
(SOURCE_COLUMN): Likewise.
(LAST_SOURCE_LINE_LOCATION): Likewise.
(LAST_SOURCE_LINE): Strengthen first param from const line_map *
to const line_map_ordinary *.
(LAST_SOURCE_COLUMN): Likewise.
(INCLUDED_FROM): Strengthen return type from line_map * to
line_map_ordinary *, and second param from const line_map *
to const line_map_ordinary *, removing call to
linemap_check_ordinary.
(MAIN_FILE_P): Strengthen param from const line_map * to
const line_map_ordinary *, removing call to
linemap_check_ordinary.
(linemap_position_for_line_and_column): Strengthen param from
const line_map * to const line_map_ordinary *.
(LINEMAP_FILE): Strengthen param from const line_map * to
const line_map_ordinary *, removing call to
linemap_check_ordinary.
(LINEMAP_LINE): Likewise.
(LINEMAP_SYSP): Likewise.
(linemap_resolve_location): Strengthen final param from
const line_map ** to const line_map_ordinary **.
* internal.h (CPP_INCREMENT_LINE): Likewise for local "map".
(linemap_enter_macro): Strengthen return type from
const line_map * to const line_map_macro *.
(linemap_add_macro_token): Likewise for first param.
* line-map.c (linemap_check_files_exited): Strengthen local "map"
from const line_map * to const line_map_ordinary *.
(new_linemap): Introduce local "map_size" and use it when
calculating how large the buffer should be. Rewrite based
on change of info_macro and info_ordinary into distinct types.
(linemap_add): Strengthen locals "map" and "from" from line_map *
to line_map_ordinary *.
(linemap_enter_macro): Strengthen return type from
const line_map * to const line_map_macro *, and local "map" from
line_map * to line_map_macro *.
(linemap_add_macro_token): Strengthen param "map" from
const line_map * to const line_map_macro *.
(linemap_line_start): Strengthen local "map" from line_map * to
line_map_ordinary *.
(linemap_position_for_column): Likewise.
(linemap_position_for_line_and_column): Strengthen first param
from const line_map * to const line_map_ordinary *.
(linemap_position_for_loc_and_offset): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
(linemap_ordinary_map_lookup): Likewise for return type and locals
"cached" and "result".
(linemap_macro_map_lookup): Strengthen return type and locals
"cached" and "result" from const line_map * to
const line_map_macro *.
(linemap_macro_map_loc_to_exp_point): Likewise for param "map".
(linemap_macro_map_loc_to_def_point): Likewise.
(linemap_macro_map_loc_unwind_toward_spelling): Likewise.
(linemap_get_expansion_line): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
(linemap_get_expansion_filename): Likewise.
(linemap_map_get_macro_name): Strengthen param from
const line_map * to const line_map_macro *.
(linemap_location_in_system_header_p): Add call to
linemap_check_ordinary in region guarded by
!linemap_macro_expansion_map_p. Introduce local "macro_map" via
linemap_check_macro in other region, using it in place of "map"
for typesafety.
(first_map_in_common_1): Add calls to linemap_check_macro.
(trace_include): Strengthen param "map" from const line_map * to
const line_map_ordinary *.
(linemap_macro_loc_to_spelling_point): Strengthen final param from
const line_map ** to const line_map_ordinary **. Replace a
C-style cast with a const_cast, and add calls to
linemap_check_macro and linemap_check_ordinary.
(linemap_macro_loc_to_def_point): Likewise.
(linemap_macro_loc_to_exp_point): Likewise.
(linemap_resolve_location): Strengthen final param from
const line_map ** to const line_map_ordinary **.
(linemap_unwind_toward_expansion): Introduce local "macro_map" via
a checked cast and use it in place of *map.
(linemap_unwind_to_first_non_reserved_loc): Strengthen local
"map1" from const line_map * to const line_map_ordinary *.
(linemap_expand_location): Introduce local "ord_map" via a checked
cast and use it in place of map.
(linemap_dump): Make local "map" const. Strengthen local
"includer_map" from line_map * to const line_map_ordinary *.
Introduce locals "ord_map" and "macro_map" via checked casts and
use them in place of "map" for typesafety.
(linemap_dump_location): Strengthen local "map" from
const line_map * to const line_map_ordinary *.
(linemap_get_file_highest_location): Update for elimination of
union.
(linemap_get_statistics): Strengthen local "cur_map" from
line_map * to const line_map_macro *. Update uses of sizeof to
use the appropriate line_map subclasses.
* macro.c (_cpp_warn_if_unused_macro): Add call to
linemap_check_ordinary.
(builtin_macro): Strengthen local "map" from const line_map * to
const line_map_macro *.
(enter_macro_context): Likewise.
(replace_args): Likewise.
(tokens_buff_put_token_to): Likewise for param "map".
(tokens_buff_add_token): Likewise.
From-SVN: r223365
2015-05-19 21:18:01 +08:00
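A minimal sketch of the checked-cast idea behind linemap_check_ordinary. The commit itself uses a C++ class hierarchy; to stay in C, this sketch models the base/derived split by embedding the base struct as the first member. All sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_kind { SKETCH_ORDINARY, SKETCH_MACRO };

/* Base part shared by both kinds of map. */
struct sketch_line_map {
    enum sketch_kind kind;
    unsigned int start_location;
};

/* "Subclasses": the base must be the first member so that a pointer to
   the base can be converted back to the containing struct. */
struct sketch_line_map_ordinary {
    struct sketch_line_map base;
    const char *to_file;
    unsigned int to_line;
};

struct sketch_line_map_macro {
    struct sketch_line_map base;
    unsigned int n_tokens;
};

/* Checked downcast: assert the kind, then return the stronger type so
   callers can no longer read macro-only fields by accident. */
static const struct sketch_line_map_ordinary *
sketch_check_ordinary(const struct sketch_line_map *map)
{
    assert(map != NULL && map->kind == SKETCH_ORDINARY);
    return (const struct sketch_line_map_ordinary *) map;
}

static const struct sketch_line_map_macro *
sketch_check_macro(const struct sketch_line_map *map)
{
    assert(map != NULL && map->kind == SKETCH_MACRO);
    return (const struct sketch_line_map_macro *) map;
}
```

Once a function takes the stronger pointer type, the check happens once at the boundary instead of inside every accessor, which matches the "removing call to linemap_check_ordinary" entries above.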
|
|
|
const struct line_map_ordinary *map = \
|
Linemap infrastructure for virtual locations
This is the first instalment of a set whose goal is to track locations
of tokens across macro expansions. Tom Tromey did the original work
and attached the patch to PR preprocessor/7263. This opus is a
derivative of that original work.
This patch modifies the linemap module of libcpp to add virtual
locations support.
A virtual location is a mapped location that can resolve to several
different physical locations. It can always resolve to the spelling
location of a token. For tokens resulting from macro expansion it can
resolve to:
- either the location of the expansion point of the macro.
- or the location of the token in the definition of the
macro
- or, if the token is an argument of a function-like macro,
the location of the use of the matching macro parameter in
the definition of the macro
The patch creates a new type of line map called a macro map. For every
single macro expansion, there is a macro map that generates a virtual
location for every single resulting token of the expansion.
The good old type of line map we all know is now called an ordinary
map. That one still encodes spelling locations as it always has.
As a result, linemap_lookup has been extended to return a macro map when
given a virtual location resulting from a macro expansion. The layout
of structs line_map has changed to support this new type of map. So
did the layout of struct line_maps. Accessor macros have been
introduced to avoid messing with the implementation details of these
data structures directly. This has already helped, as we have been testing
different ways of arranging these data structures. Having to constantly
adjust client code that is too tied with the internals of line_map and
line_maps would have been even more painful.
Of course, many new public functions have been added to the linemap
module to handle the resolution of virtual locations.
This patch introduces the infrastructure but no part of the compiler
uses virtual locations yet.
However the client code of the linemap data structures has been
adjusted as per the changes. E.g., it is no longer reliable for
client code to manipulate struct line_map directly if it just wants to
deal with spelling locations, because struct line_map can now
represent a macro map as well. In that case, it's better to use the
convenient API to resolve the initial (possibly virtual) location to a
spelling location (or to an ordinary map) and use that.
This is the reason why the patch adjusts the Java, Ada and Fortran
front ends.
Also, note that virtual locations are not supposed to be ordered for
relations '<' and '>' anymore. To test if a virtual location appears
"before" another one, one has to use a new operator exposed by the
line map interface. The patch updates the only spot (in the
diagnostics module) I have found that was making the assumption that
locations were ordered for these relations. This is the only change
that introduces a use of the new line map API in this patch, so I am
adding a regression test for it only.
From-SVN: r180081
2011-10-17 17:58:56 +08:00
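A minimal sketch of the macro map described above: one map per macro expansion, handing out one virtual location per resulting token, each of which can be resolved either to the expansion point or to the token's location in the macro definition. The layout and names are hypothetical and far simpler than libcpp's; in particular, how virtual locations are numbered is an assumption of this sketch.

```c
typedef unsigned int sketch_loc;

/* Hypothetical macro map covering one expansion. */
struct sketch_macro_map {
    sketch_loc start_location;       /* virtual location of the first token */
    unsigned int n_tokens;           /* tokens produced by the expansion */
    sketch_loc expansion_point;      /* where the macro was invoked */
    const sketch_loc *def_locations; /* per-token location in the definition */
};

/* Does VLOC belong to this expansion? */
static int sketch_in_map(const struct sketch_macro_map *map, sketch_loc vloc)
{
    return vloc >= map->start_location
           && vloc - map->start_location < map->n_tokens;
}

/* Resolve to the expansion point; 0 means "not from this map". */
static sketch_loc sketch_expansion_point(const struct sketch_macro_map *map,
                                         sketch_loc vloc)
{
    return sketch_in_map(map, vloc) ? map->expansion_point : 0;
}

/* Resolve to the token's location inside the macro definition. */
static sketch_loc sketch_definition_point(const struct sketch_macro_map *map,
                                          sketch_loc vloc)
{
    if (!sketch_in_map(map, vloc))
        return 0;
    return map->def_locations[vloc - map->start_location];
}
```

Every token of one expansion shares the same expansion point but has its own definition location, so a diagnostic can report both where the macro was used and which part of its body is involved.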
|
|
|
LINEMAPS_LAST_ORDINARY_MAP (line_table); \
|
2008-07-21 17:33:38 +08:00
|
|
|
linenum_type line = SOURCE_LINE (map, line_table->highest_line); \
|
2004-04-23 10:22:27 +08:00
|
|
|
linemap_line_start (PFILE->line_table, line + 1, COLS_HINT); \
|
Represent column numbers using line-map's source_location.
The "next available source_location" is now managed internally by
line-maps.c rather than by clients.
* line-map.h (struct line_map): New field column_bits.
<from_line>: Rename field to start_location.
(struct line_maps): New fields highest_location and max_column_hint.
(linemap_check_files_exited): New declaration.
(linemap_line_start): New declaration.
(linemap_add): Remove from_line parameter; use highest_location field.
(SOURCE_LINE, LAST_SOURCE_LINE): Modify to use column_bits.
(SOURCE_COLUMN, LAST_SOURCE_LINE_LOCATION): New macros.
(CURRENT_LINE_MAP): Remove macro.
(linemap_position_for_column): New inline function.
* line-map.c (linemap_init): Clear new fields.
(linemap_check_files_exited): New function, extracted from ...
(linemap_free): Use linemap_check_files_exited.
(linemap_add): Remove from_line parameter. Various updates.
(linemap_line_start): New function.
(linemap_lookup): Update for new field names.
* cpphash.h (struct cpp_reader) <map>: Field removed. Because
linemap_position_for_column may unpredictably change the current map,
it is cleaner and simpler for us to not cache it in cpp_reader.
(struct cpp_buffer): New sysp field.
Changed warned_cplusplus_comments and from_stage3 to bitfields.
* cppinit.c (cpp_read_main_file): pfile->map no longer exists.
* cpplib.c (do_line, do_linemarker, _cpp_do_file_change): Get
current map using linemap_lookup.
(do_linemarker): Also set buffer's sysp field.
(destringize_and_run): No longer need to decrement current line.
* cppfiles.c (_cpp_stack_file): Set sysp from and in buffer.
(search_path_head, open_file_failed): Use buffer's sysp.
(cpp_make_system_header): Get current map using linemap_lookup.
Also set buffer's sysp flag.
* cppmacro.c (_cpp_builtin_macro_text): Likewise use linemap_lookup.
* cpphash.h (CPP_INCREMENT_LINE): New macro.
(struct cpp_buffer): Moved fields saved_cur, saved_rlimit to ...
(struct cpp_reader): ... and adding saved_line_base field.
* cpptrad.c (_cpp_overlay_buffer, _cpp_remove_overlay):
Update accordingly. Don't adjust line.
(_cpp_scan_out_logical_line): Use CPP_INCREMENT_LINE.
* cpphash.c (CPP_IN_SYSTEM_HEADER): Replaced macro by ...
(cpp_in_system_header): ... new inline function, using buffer's sysp.
* cpperror.c (_cpp_begin_message): Update to use cpp_in_system_header.
* cpplex.c (_cpp_lex_direct): Likewise.
* cppmacro.c (_cpp_builtin_macro_text): Likewise.
* cppmacro.c (_cpp_create_definition): Use buffer's sysp field.
* cpplib.h (struct cpp_token): Rename line field to src_loc.
Remove col field as it is now subsumed by src_loc.
* cpperror.c: Update various field, parameter, and macro names.
(print_location): If col==0, try SOURCE_COLUMN of line.
(cpp_error): Use cur_token's src_loc field, rather than line+col.
* cpplib.c (do_diagnostic): Token's src_loc field replaces line+col.
* cpplex.c (_cpp_process_line_notes, _cpp_lex_direct,
_cpp_skip_block_comment): Use CPP_INCREMENT_LINE.
(_cpp_temp_token): Replace cpp_token's line+col fields by src_loc.
(_cpp_get_fresh_line): Don't need to adjust line for missing newline.
(_cpp_lex_direct): Use linemap_position_for_column.
* c-ppoutput.c (maybe_print_line, print_line): Don't take map
parameter. Instead get it from the line_table global. Adjust callers.
(print): Remove map field. Rename line field to src_line.
(init_pp_output, account_for_newlines, maybe_print_line): Adjust.
(cb_line_change): Use SOURCE_COLUMN. Minor optimizations.
(pp_file_change): Use MAIN_FILE_P since we cannot check print.map.
Use LAST_SOURCE_LINE_LOCATION to "catch up" after #include.
* cpptrad.c (copy_comment): Rename variable.
* c-lex.c (map): Remove static variable, for same reason we removed
cpp_reader's map field.
(cb_line_change, cb_def_pragma, cb_define, cb_undef): Hence we need
to call linemap_lookup.
(cb_line_change): Token's line field replaced by src_loc.
(fe_file_change): Use MAINFILE_P and LAST_SOURCE_LINE macros.
Don't save new_map.
* cpphash.h, cpperror.c, cpplib.h: Some renames of fileline to
source_location.
From-SVN: r77663
2004-02-11 23:29:30 +08:00
|
|
|
} while (0)
|
|
|
|
|
cpptrad.c (struct block, [...]): New.
* cpptrad.c (struct block, BLOCK_HEADER_LEN, BLOCK_LEN,
scan_parameters, save_replacement_text, replacement_length): New.
(scan_out_logical_line): Take a macro and save parameters if
non-NULL.
(_cpp_logical_line_trad): Update.
(_cpp_create_trad_definition): Update to handle function-like
macros.
* cpplex.c (new_buff): Update.
(struct dummy, DEFAULT_ALIGNMENT, CPP_ALIGN): Move...
* cpphash.h: ...here.
(CPP_ALIGN2, _cpp_save_parameter): New.
* cppmacro.c (save_parameter): Rename, export.
(parse_params): Update.
From-SVN: r54331
2002-06-07 14:26:32 +08:00
|
|
|
/* Host alignment handling. */
|
|
|
|
struct dummy
|
|
|
|
{
|
|
|
|
char c;
|
|
|
|
union
|
|
|
|
{
|
|
|
|
double d;
|
|
|
|
int *p;
|
|
|
|
} u;
|
|
|
|
};
|
|
|
|
|
|
|
|
#define DEFAULT_ALIGNMENT offsetof (struct dummy, u)
|
|
|
|
#define CPP_ALIGN2(size, align) (((size) + ((align) - 1)) & ~((align) - 1))
|
|
|
|
#define CPP_ALIGN(size) CPP_ALIGN2 (size, DEFAULT_ALIGNMENT)
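The alignment macros above can be exercised in isolation. The following is a minimal sketch, not part of libcpp: it copies `struct dummy` and the three macros from the listing, and adds a helper named `round_up` that is purely hypothetical. It assumes the alignment is a power of two, which the mask-based rounding in `CPP_ALIGN2` requires.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the listing: a char followed by a union forces the
   compiler to pad up to the strictest alignment among the members.  */
struct dummy
{
  char c;
  union
  {
    double d;
    int *p;
  } u;
};

#define DEFAULT_ALIGNMENT offsetof (struct dummy, u)
#define CPP_ALIGN2(size, align) (((size) + ((align) - 1)) & ~((align) - 1))
#define CPP_ALIGN(size) CPP_ALIGN2 (size, DEFAULT_ALIGNMENT)

/* Hypothetical helper (not in libcpp): round SIZE up to a multiple of
   ALIGN.  The mask trick is only valid when ALIGN is a power of two,
   so that precondition is checked here.  */
static size_t
round_up (size_t size, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return CPP_ALIGN2 (size, align);
}
```

Under these assumptions, `round_up (13, 8)` yields 16 and an already aligned size is returned unchanged. The value of `DEFAULT_ALIGNMENT` itself depends on the host ABI.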
|
|
|
|
|
2018-08-16 21:51:38 +08:00
|
|
|
#define _cpp_mark_macro_used(NODE) \
|
|
|
|
(cpp_user_macro_p (NODE) ? (NODE)->value.macro->used = 1 : 0)
|
2002-07-24 06:57:49 +08:00
|
|
|
|
2001-10-02 20:57:24 +08:00
|
|
|
/* A generic memory buffer, and operations on it. */
|
cpphash.h (struct _cpp_buff, [...]): New.
* cpphash.h (struct _cpp_buff, _cpp_get_buff, _cpp_release_buff,
_cpp_extend_buff, _cpp_free_buff): New.
(struct cpp_reader): New member free_buffs.
* cppinit.c (cpp_destroy): Free buffers.
* cpplex.c (new_buff, _cpp_release_buff, _cpp_get_buff,
_cpp_extend_buff, _cpp_free_buff): New.
* cpplib.h (struct cpp_options): Remove unused member.
* cppmacro.c (collect_args): New. Combines the old parse_arg
and parse_args. Use _cpp_buff for memory allocation.
(funlike_invocation_p, replace_args): Update.
From-SVN: r45827
2001-09-27 01:52:50 +08:00
|
|
|
typedef struct _cpp_buff _cpp_buff;
|
|
|
|
struct _cpp_buff
|
|
|
|
{
|
|
|
|
struct _cpp_buff *next;
|
2001-09-28 17:40:22 +08:00
|
|
|
unsigned char *base, *cur, *limit;
|
|
|
|
};
|
|
|
|
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern _cpp_buff *_cpp_get_buff (cpp_reader *, size_t);
|
|
|
|
extern void _cpp_release_buff (cpp_reader *, _cpp_buff *);
|
|
|
|
extern void _cpp_extend_buff (cpp_reader *, _cpp_buff **, size_t);
|
|
|
|
extern _cpp_buff *_cpp_append_extend_buff (cpp_reader *, _cpp_buff *, size_t);
|
|
|
|
extern void _cpp_free_buff (_cpp_buff *);
|
|
|
|
extern unsigned char *_cpp_aligned_alloc (cpp_reader *, size_t);
|
|
|
|
extern unsigned char *_cpp_unaligned_alloc (cpp_reader *, size_t);
|
2001-10-02 20:57:24 +08:00
|
|
|
|
cpphash.h (POOL_ALIGN, [...]): Remove.
* cpphash.h (POOL_ALIGN, POOL_FRONT, POOL_LIMIT, POOL_BASE,
POOL_SIZE, POOL_ROOM, POOL_COMMIT, struct cpp_chunk,
struct cpp_pool, _cpp_init_pool, _cpp_free_pool, _cpp_pool_reserve,
_cpp_pool_alloc, _cpp_next_chunk): Remove.
(_cpp_extend_buff, BUFF_ROOM): Update.
(_cpp_append_extend_buff): New.
(struct cpp_reader): Remove macro_pool, add a_buff.
* cppinit.c (cpp_create_reader): Initialize a_buff, instead of
macro_pool.
(cpp_destroy): Free a_buff instead of macro_pool.
* cpplex.c (new_chunk, chunk_suitable, _cpp_next_chunk,
new_chunk, _cpp_init_pool, _cpp_free_pool, _cpp_pool_reserve,
_cpp_pool_alloc): Remove.
(parse_number, parse_string): Update use of _cpp_extend_buff.
(_cpp_extend_buff): Update.
(_cpp_append_extend_buff, cpp_aligned_alloc): New.
* cpplib.c (glue_header_name, parse_answer):
Update use of _cpp_extend_buff.
(cpp_register_pragma, cpp_register_pragma_space): Use
_cpp_aligned_alloc.
(do_assert, do_unassert): Check for EOL, update.
* cppmacro.c (stringify_arg, collect_args): Update to use
_cpp_extend_buff and _cpp_append_extend_buff.
(save_parameter, parse_params, alloc_expansion_token,
_cpp_create_definition): Rework memory management.
* gcc.dg/cpp/redef2.c: Add test.
From-SVN: r45899
2001-09-30 18:03:11 +08:00
|
|
|
#define BUFF_ROOM(BUFF) (size_t) ((BUFF)->limit - (BUFF)->cur)
|
2001-09-28 17:40:22 +08:00
|
|
|
#define BUFF_FRONT(BUFF) ((BUFF)->cur)
|
|
|
|
#define BUFF_LIMIT(BUFF) ((BUFF)->limit)
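The `base`, `cur`, and `limit` pointers together with the `BUFF_*` macros describe a bump-style buffer. The sketch below illustrates that idea only. The type `demo_buff` and the functions `demo_new_buff`, `demo_reserve`, and `demo_free_buff` are hypothetical stand-ins; the real `_cpp_get_buff` family takes a `cpp_reader` and is implemented elsewhere in libcpp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with the same field layout as struct _cpp_buff.  */
typedef struct demo_buff demo_buff;
struct demo_buff
{
  struct demo_buff *next;
  unsigned char *base, *cur, *limit;
};

/* Copied from the listing.  */
#define BUFF_ROOM(BUFF) (size_t) ((BUFF)->limit - (BUFF)->cur)
#define BUFF_FRONT(BUFF) ((BUFF)->cur)
#define BUFF_LIMIT(BUFF) ((BUFF)->limit)

/* Hypothetical: allocate a buffer with LEN bytes of room.  */
static demo_buff *
demo_new_buff (size_t len)
{
  demo_buff *b = malloc (sizeof *b);
  if (b == NULL)
    return NULL;
  b->base = malloc (len ? len : 1);
  if (b->base == NULL)
    {
      free (b);
      return NULL;
    }
  b->next = NULL;
  b->cur = b->base;
  b->limit = b->base + len;
  return b;
}

/* Hypothetical: hand out LEN bytes from the front of the buffer by
   advancing cur, or return NULL when there is not enough room.  */
static unsigned char *
demo_reserve (demo_buff *b, size_t len)
{
  unsigned char *result;

  if (BUFF_ROOM (b) < len)
    return NULL;
  result = BUFF_FRONT (b);
  BUFF_FRONT (b) += len;
  return result;
}

/* Hypothetical: release the buffer and its storage.  */
static void
demo_free_buff (demo_buff *b)
{
  free (b->base);
  free (b);
}
```

In this sketch an allocation is just a pointer increment, and `BUFF_ROOM` shrinks by exactly the number of bytes reserved.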
|
|
|
|
|
2001-03-15 15:57:13 +08:00
|
|
|
/* #include types. */
|
2019-08-29 02:43:37 +08:00
|
|
|
enum include_type
|
|
|
|
{
|
|
|
|
/* Directive-based including mechanisms. */
|
|
|
|
IT_INCLUDE, /* #include */
|
|
|
|
IT_INCLUDE_NEXT, /* #include_next */
|
|
|
|
IT_IMPORT, /* #import */
|
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (like several gigabytes large, that is compile time
and memory still infeasible).
There are 2 reasons for this. One is that I think the way it is implemented
now in the patch is how we should handle the smaller #embed sizes,
dunno with which boundary, whether 32 bytes or 64 or something like that,
certainly handling the single byte cases which is something that can appear
anywhere in the source where constant integer literal can appear is
desirable and I think for a few bytes it isn't worth it to come up with
something smarter and users would like to e.g. see it in -E readably as
well (perhaps the slow vs. fast boundary should be determined by command
line option). And the other one is to be able to more easily find
regressions in behavior caused by the optimizations, so we have something
to get back in git to compare against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots has been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am I sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
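The difference between the two readings can be reproduced without #embed by writing out the expansions by hand. This is only a sketch: the byte values 118, 111, 105, 100 are the ones quoted above, `foo` is given a hypothetical body that records its arguments, and the `(void)` casts are added solely to quiet unused-value warnings.

```c
#include <assert.h>

/* Hypothetical record of what a four-argument call received.  */
struct args4
{
  int a, b, c, d;
};

static struct args4
foo (int a, int b, int c, int d)
{
  struct args4 r = { a, b, c, d };
  return r;
}

/* Expansion as a plain comma-separated sequence: the commas separate
   arguments, so foo receives four values.  */
static struct args4
expanded_as_sequence (void)
{
  return foo (172 + 118, 111, 105, 100 + 2);
}

/* Expansion treated as one parenthesized expression: the commas are
   comma operators and only the last byte contributes to the sum.  */
static int
expanded_as_one_expression (void)
{
  return 172 + ((void) 118, (void) 111, (void) 105, 100) + 2;
}
```

With the sequence reading the call receives 290, 111, 105 and 102; with the parenthesized reading there is a single value, 274.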
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
|
|
|
IT_EMBED, /* #embed */
|
2019-08-29 02:43:37 +08:00
|
|
|
|
|
|
|
/* Non-directive including mechanisms. */
|
|
|
|
IT_CMDLINE, /* -include */
|
|
|
|
IT_DEFAULT, /* forced header */
|
2020-07-08 02:28:59 +08:00
|
|
|
IT_MAIN, /* main, start on line 1 */
|
2020-10-09 03:11:37 +08:00
|
|
|
IT_PRE_MAIN, /* main, but there will be a preamble before line
|
|
|
|
1 */
|
2019-08-29 22:06:32 +08:00
|
|
|
|
|
|
|
IT_DIRECTIVE_HWM = IT_IMPORT + 1, /* Directives below this. */
|
2019-10-02 18:22:05 +08:00
|
|
|
IT_HEADER_HWM = IT_DEFAULT + 1 /* Header files below this. */
|
2019-08-29 02:43:37 +08:00
|
|
|
};
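The two high-water marks at the end of the enum allow range tests instead of long chains of comparisons. The sketch below copies the enumerators as listed and adds two predicates, `is_directive_include` and `is_header_include`, which are hypothetical names and not libcpp functions; how libcpp itself uses the marks is not shown in this excerpt. Note that, with the enumerators in this order, `IT_EMBED` has the same value as `IT_DIRECTIVE_HWM`, so a strict less-than test does not count it.

```c
#include <assert.h>
#include <stdbool.h>

/* Enumerators copied from the listing, in the same order.  */
enum include_type
{
  /* Directive-based including mechanisms.  */
  IT_INCLUDE,       /* #include */
  IT_INCLUDE_NEXT,  /* #include_next */
  IT_IMPORT,        /* #import */
  IT_EMBED,         /* #embed */

  /* Non-directive including mechanisms.  */
  IT_CMDLINE,       /* -include */
  IT_DEFAULT,       /* forced header */
  IT_MAIN,          /* main, start on line 1 */
  IT_PRE_MAIN,      /* main, but there will be a preamble before line 1 */

  IT_DIRECTIVE_HWM = IT_IMPORT + 1,  /* Directives below this.  */
  IT_HEADER_HWM = IT_DEFAULT + 1     /* Header files below this.  */
};

/* Hypothetical predicate: is T strictly below the directive mark?  */
static bool
is_directive_include (enum include_type t)
{
  return t < IT_DIRECTIVE_HWM;
}

/* Hypothetical predicate: is T strictly below the header mark?  */
static bool
is_header_include (enum include_type t)
{
  return t < IT_HEADER_HWM;
}
```

Under this reading, `IT_INCLUDE` through `IT_IMPORT` satisfy the first predicate, everything up to `IT_DEFAULT` satisfies the second, and `IT_MAIN` and `IT_PRE_MAIN` satisfy neither.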
|
2001-03-15 15:57:13 +08:00
|
|
|
|
c-lex.c (cb_def_pragma): Update.
* c-lex.c (cb_def_pragma): Update.
(c_lex): Update, and skip padding.
* cppexp.c (lex, parse_defined): Update, remove unused variable.
* cpphash.h (struct toklist): Delete.
(union utoken): New.
(struct cpp_context): Update.
(struct cpp_reader): New members eof, avoid_paste.
(_cpp_temp_token): New.
* cppinit.c (cpp_create_reader): Update.
* cpplex.c (_cpp_temp_token): New.
(_cpp_lex_direct): Add PREV_WHITE when parsing args.
(cpp_output_token): Don't print leading whitespace.
(cpp_output_line): Update.
* cpplib.c (glue_header_name, parse_include, get__Pragma_string,
do_include_common, do_line, do_ident, do_pragma,
do_pragma_dependency, _cpp_do__Pragma, parse_answer,
parse_assertion): Update.
(get_token_no_padding): New.
* cpplib.h (CPP_PADDING): New.
(AVOID_LPASTE): Delete.
(struct cpp_token): New union member source.
(cpp_get_token): Update.
* cppmacro.c (macro_arg): Convert to use pointers to const tokens.
(builtin_macro, paste_all_tokens, paste_tokens, funlike_invocation_p,
replace_args, quote_string, stringify_arg, parse_arg, next_context,
enter_macro_context, expand_arg, _cpp_pop_context, cpp_scan_nooutput,
_cpp_backup_tokens, _cpp_create_definition): Update.
(push_arg_context): Delete.
(padding_token, push_token_context, push_ptoken_context): New.
(make_string_token, make_number_token): Update, rename.
(cpp_get_token): Update to handle tokens as pointers to const,
and insert padding appropriately.
* cppmain.c (struct printer): New member prev.
(check_multiline_token): Constify.
(do_preprocessing, cb_line_change): Update.
(scan_translation_unit): Update to handle spacing.
* scan-decls.c (get_a_token): New.
(skip_to_closing_brace, scan_decls): Update.
* fix-header.c (read_scan_file): Update.
* doc/cpp.texi: Update.
* gcc.dg/cpp/macro10.c: New test.
* gcc.dg/cpp/strify3.c: New test.
* gcc.dg/cpp/spacing1.c: Add tests.
* gcc.dg/cpp/19990703-1.c: Remove bogus test.
* gcc.dg/cpp/20000625-2.c: Fudge to pass.
From-SVN: r45793
2001-09-25 06:53:12 +08:00
|
|
|
union utoken
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
{
|
|
|
|
const cpp_token *token;
|
|
|
|
const cpp_token **ptoken;
|
|
|
|
};
|
|
|
|
|
2002-01-04 05:43:09 +08:00
|
|
|
/* A "run" of tokens; part of a chain of runs. */
|
2001-09-11 15:00:12 +08:00
|
|
|
typedef struct tokenrun tokenrun;
|
|
|
|
struct tokenrun
|
|
|
|
{
|
c-parse.in (_yylex): Use _cpp_backup_tokens.
* c-parse.in (_yylex): Use _cpp_backup_tokens.
* cpphash.h (struct tokenrun): Add prev.
(struct lexer_state): Remove bol.
(struct cpp_reader): Remove old lookahead stuff, add lookaheads.
(_cpp_free_lookaheads, _cpp_release_lookahead, _cpp_push_token)
: Remove.
* cppinit.c (cpp_create_reader): Don't set bol.
(cpp_destroy): Don't free lookaheads.
* cpplex.c (lex_directive): Remove.
(next_tokenrun): Update.
(_cpp_lex_token): Clean up logic.
(lex_token): Update to return a pointer to lexed token, since it
can move to the start of the buffer. Simplify newline handling.
* cpplib.c (SEEN_EOL): Update.
(skip_rest_of_line): Remove lookahead stuff.
(end_directive): Line numbers are already incremented. Revert
to start of lexed token buffer if we can.
(_cpp_handle_directive, do_pragma, do_pragma_dependency,
parse_answer): Use _cpp_backup_tokens.
(run_directive, cpp_pop_buffer): Don't set bol, set saved_flags
instead. Don't check for EOL.
(do_include_common, do_line, do_pragma_system_header): Use
skip_rest_of_line.
* cpplib.h (BOL, _cpp_backup_tokens): New.
* cppmacro.c (save_lookahead_token, take_lookahead_token,
alloc_lookahead, free_lookahead, _cpp_free_lookaheads,
cpp_start_lookahead, cpp_stop_lookahead, _cpp_push_token): Remove.
(builtin_macro): Don't use cpp_get_line.
(cpp_get_line): Short term kludge.
(parse_arg): Handle directives in arguments here. Back up when
appropriate. Store EOF at end of argument list.
(funlike_invocation_p): Use _cpp_backup_tokens.
(push_arg_context): Account for EOF at end of list.
(cpp_get_token): Remove lookahead stuff. Update.
* gcc.dg/cpp/directiv.c: Update.
* gcc.dg/cpp/undef1.c: Update.
From-SVN: r45582
2001-09-14 04:05:17 +08:00
|
|
|
tokenrun *next, *prev;
|
2001-09-11 15:00:12 +08:00
|
|
|
cpp_token *base, *limit;
|
|
|
|
};
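The run structure above can be illustrated with a minimal, self-contained sketch. It assumes a stand-in token type (`demo_token`) and hypothetical helper names (`demo_init_tokenrun`, `demo_next_tokenrun`); none of these are libcpp's own functions. It only shows how `next`/`prev` chain runs together while `base`/`limit` bound each run's token array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for cpp_token; the real type is declared elsewhere.  */
typedef struct demo_token { int type; } demo_token;

/* Mirrors the shape of struct tokenrun shown above.  */
typedef struct demo_tokenrun demo_tokenrun;
struct demo_tokenrun
{
  demo_tokenrun *next, *prev;
  demo_token *base, *limit;
};

/* Hypothetical helper: give RUN storage for COUNT tokens.
   LIMIT points one past the last usable token.  */
static void
demo_init_tokenrun (demo_tokenrun *run, size_t count)
{
  run->base = (demo_token *) calloc (count, sizeof (demo_token));
  assert (run->base != NULL);
  run->limit = run->base + count;
  run->next = NULL;
  run->prev = NULL;
}

/* Hypothetical helper: return the run following RUN, allocating and
   linking a fresh one of COUNT tokens the first time it is needed.  */
static demo_tokenrun *
demo_next_tokenrun (demo_tokenrun *run, size_t count)
{
  if (run->next == NULL)
    {
      run->next = (demo_tokenrun *) malloc (sizeof (demo_tokenrun));
      assert (run->next != NULL);
      demo_init_tokenrun (run->next, count);
      run->next->prev = run;
    }
  return run->next;
}
```

Because a run is only allocated when `next` is NULL, asking for the following run a second time returns the same run, so token storage can be reused rather than reallocated.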
|
|
|
|
|
cpphash.h (FIRST, [...]): New.
* cpphash.h (FIRST, LAST, CUR, RLIMIT): New.
(struct cpp_context): Add traditional fields.
* cppmacro.c (paste_all_tokens, push_ptoken_context,
push_token_context, cpp_get_token, _cpp_backup_tokens): Update.
* cpptrad.c (skip_comment, lex_identifier,
_cpp_read_logical_line_trad, scan_out_logical_line): Update.
From-SVN: r54242
2002-06-04 21:07:06 +08:00
|
|
|
/* Accessor macros for struct cpp_context. */
|
2002-09-04 05:55:40 +08:00
|
|
|
#define FIRST(c) ((c)->u.iso.first)
|
|
|
|
#define LAST(c) ((c)->u.iso.last)
|
|
|
|
#define CUR(c) ((c)->u.trad.cur)
|
|
|
|
#define RLIMIT(c) ((c)->u.trad.rlimit)
|
2002-06-04 21:07:06 +08:00
|
|
|
|
2011-10-17 17:59:12 +08:00
|
|
|
/* This describes some additional data that is added to the macro
|
|
|
|
token context of type cpp_context, when -ftrack-macro-expansion is
|
|
|
|
on. */
|
|
|
|
typedef struct
|
|
|
|
{
|
|
|
|
/* The node of the macro we are referring to. */
|
|
|
|
cpp_hashnode *macro_node;
|
|
|
|
/* This buffer contains an array of virtual locations. The virtual
|
|
|
|
location at index 0 is the virtual location of the token at index
|
|
|
|
0 in the current instance of cpp_context; similarly for all the
|
|
|
|
other virtual locations. */
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t *virt_locs;
|
2011-10-17 17:59:12 +08:00
|
|
|
/* This is a pointer to the current virtual location. This is used
|
|
|
|
to iterate over the virtual locations while we iterate over the
|
|
|
|
tokens they belong to. */
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t *cur_virt_loc;
|
2011-10-17 17:59:12 +08:00
|
|
|
} macro_context;
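As a rough illustration of the comments above, the sketch below walks a token array and its parallel array of virtual locations in lock step. The types `demo_token` and `demo_location` and the helper `demo_consume` are assumptions made for this example; the real code uses `cpp_token` and `location_t`, and its iteration logic is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for cpp_token and location_t.  */
typedef struct demo_token { int type; } demo_token;
typedef unsigned int demo_location;

/* Mirrors the two location fields of macro_context shown above.  */
typedef struct
{
  demo_location *virt_locs;     /* virt_locs[i] belongs to token i.  */
  demo_location *cur_virt_loc;  /* Cursor into virt_locs.  */
} demo_macro_context;

/* Hypothetical helper: consume the token at *CUR and return its
   virtual location, advancing both cursors together so that they
   never drift apart.  */
static demo_location
demo_consume (const demo_token **cur, demo_macro_context *mc)
{
  (*cur)++;
  return *mc->cur_virt_loc++;
}
```

Keeping the location cursor inside the context means the index of the current token never has to be recomputed to find its location.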
|
|
|
|
|
|
|
|
/* The kind of tokens carried by a cpp_context. */
|
|
|
|
enum context_tokens_kind {
|
|
|
|
/* This is the value of cpp_context::tokens_kind if u.iso.first
|
|
|
|
contains an instance of cpp_token **. */
|
|
|
|
TOKENS_KIND_INDIRECT,
|
|
|
|
/* This is the value of cpp_context::tokens_kind if u.iso.first
|
|
|
|
contains an instance of cpp_token *. */
|
|
|
|
TOKENS_KIND_DIRECT,
|
|
|
|
/* This is the value of cpp_context::tokens_kind when the token
|
|
|
|
context contains tokens resulting from macro expansion. In that
|
|
|
|
case struct cpp_context::macro points to an instance of struct
|
|
|
|
macro_context. This is used only when the
|
|
|
|
-ftrack-macro-expansion flag is on. */
|
|
|
|
TOKENS_KIND_EXTENDED
|
|
|
|
};
|
|
|
|
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
typedef struct cpp_context cpp_context;
|
|
|
|
struct cpp_context
|
|
|
|
{
|
|
|
|
/* Doubly-linked list. */
|
|
|
|
cpp_context *next, *prev;
|
|
|
|
|
2002-06-04 21:07:06 +08:00
|
|
|
union
|
|
|
|
{
|
|
|
|
/* For ISO macro expansion. Contexts other than the base context
|
|
|
|
are contiguous tokens. e.g. macro expansions, expanded
|
|
|
|
argument tokens. */
|
|
|
|
struct
|
|
|
|
{
|
|
|
|
union utoken first;
|
|
|
|
union utoken last;
|
|
|
|
} iso;
|
|
|
|
|
|
|
|
/* For traditional macro expansion. */
|
|
|
|
struct
|
|
|
|
{
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *cur;
|
|
|
|
const unsigned char *rlimit;
|
2002-06-04 21:07:06 +08:00
|
|
|
} trad;
|
|
|
|
} u;
|
2001-01-13 22:32:59 +08:00
|
|
|
|
2001-09-27 05:44:35 +08:00
|
|
|
/* If non-NULL, a buffer used for storage related to this context.
|
2001-09-27 20:59:38 +08:00
|
|
|
When the context is popped, the buffer is released. */
|
2001-09-27 05:44:35 +08:00
|
|
|
_cpp_buff *buff;
|
|
|
|
|
2011-10-17 17:59:12 +08:00
|
|
|
/* If tokens_kind is TOKENS_KIND_EXTENDED, then (as we thus are in a
|
|
|
|
macro context) this is a pointer to an instance of macro_context.
|
|
|
|
Otherwise if tokens_kind is *not* TOKENS_KIND_EXTENDED, then, if
|
|
|
|
we are in a macro context, this is a pointer to an instance of
|
|
|
|
cpp_hashnode, representing the name of the macro this context is
|
|
|
|
for. If we are not in a macro context, then this is just NULL.
|
|
|
|
Note that when tokens_kind is TOKENS_KIND_EXTENDED, the memory
|
|
|
|
used by the instance of macro_context pointed to by this member
|
|
|
|
is de-allocated upon de-allocation of the instance of struct
|
|
|
|
cpp_context. */
|
|
|
|
union
|
|
|
|
{
|
|
|
|
macro_context *mc;
|
|
|
|
cpp_hashnode *macro;
|
|
|
|
} c;
|
c-lex.c (cb_def_pragma): Update.
* c-lex.c (cb_def_pragma): Update.
(c_lex): Update, and skip padding.
* cppexp.c (lex, parse_defined): Update, remove unused variable.
* cpphash.h (struct toklist): Delete.
(union utoken): New.
(struct cpp_context): Update.
(struct cpp_reader): New members eof, avoid_paste.
(_cpp_temp_token): New.
* cppinit.c (cpp_create_reader): Update.
* cpplex.c (_cpp_temp_token): New.
(_cpp_lex_direct): Add PREV_WHITE when parsing args.
(cpp_output_token): Don't print leading whitespace.
(cpp_output_line): Update.
* cpplib.c (glue_header_name, parse_include, get__Pragma_string,
do_include_common, do_line, do_ident, do_pragma,
do_pragma_dependency, _cpp_do__Pragma, parse_answer,
parse_assertion): Update.
(get_token_no_padding): New.
* cpplib.h (CPP_PADDING): New.
(AVOID_LPASTE): Delete.
(struct cpp_token): New union member source.
(cpp_get_token): Update.
* cppmacro.c (macro_arg): Convert to use pointers to const tokens.
(builtin_macro, paste_all_tokens, paste_tokens, funlike_invocation_p,
replace_args, quote_string, stringify_arg, parse_arg, next_context,
enter_macro_context, expand_arg, _cpp_pop_context, cpp_scan_nooutput,
_cpp_backup_tokens, _cpp_create_definition): Update.
(push_arg_context): Delete.
(padding_token, push_token_context, push_ptoken_context): New.
(make_string_token, make_number_token): Update, rename.
(cpp_get_token): Update to handle tokens as pointers to const,
and insert padding appropriately.
* cppmain.c (struct printer): New member prev.
(check_multiline_token): Constify.
(do_preprocessing, cb_line_change): Update.
(scan_translation_unit): Update to handle spacing.
* scan-decls.c (get_a_token): New.
(skip_to_closing_brace, scan_decls): Update.
* fix-header.c (read_scan_file): Update.
* doc/cpp.texi: Update.
* gcc.dg/cpp/macro10.c: New test.
* gcc.dg/cpp/strify3.c: New test.
* gcc.dg/cpp/spacing1.c: Add tests.
* gcc.dg/cpp/19990703-1.c: Remove bogus test.
* gcc.dg/cpp/20000625-2.c: Fudge to pass.
From-SVN: r45793
2001-09-25 06:53:12 +08:00
|
|
|
|
2011-10-17 17:59:12 +08:00
|
|
|
/* This determines the type of tokens held by this context. */
|
|
|
|
enum context_tokens_kind tokens_kind;
|
2001-01-13 22:32:59 +08:00
|
|
|
};
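The role of tokens_kind as a discriminator for the first/last union members can be sketched as follows. This is a simplified model, not the library's code: the `demo_` names are hypothetical, the layout of the token union (a direct pointer or a pointer to pointers) is an assumption since `union utoken` is not defined in this excerpt, and the extended kind is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for cpp_token.  */
typedef struct demo_token { int type; } demo_token;

/* Assumed layout: either a direct token pointer or a pointer to an
   array of token pointers.  */
union demo_utoken
{
  const demo_token *token;
  const demo_token **ptoken;
};

enum demo_tokens_kind { DEMO_KIND_INDIRECT, DEMO_KIND_DIRECT };

struct demo_context
{
  union demo_utoken first, last;
  enum demo_tokens_kind tokens_kind;
};

/* Hypothetical helper: return the next token of C, or NULL once
   first has caught up with last.  Which union member is valid
   depends on tokens_kind.  */
static const demo_token *
demo_next_token (struct demo_context *c)
{
  if (c->tokens_kind == DEMO_KIND_DIRECT)
    {
      if (c->first.token == c->last.token)
        return NULL;
      return c->first.token++;
    }
  if (c->first.ptoken == c->last.ptoken)
    return NULL;
  return *c->first.ptoken++;
}
```

Reading the wrong union member for a given kind would yield a pointer of the wrong level of indirection, which is why the kind is stored alongside the tokens.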
|
|
|
|
|
|
|
|
struct lexer_state
|
|
|
|
{
|
2019-09-05 19:23:48 +08:00
|
|
|
/* 1 if we're handling a directive. 2 if it's an include-like
|
|
|
|
directive. */
|
2001-01-13 22:32:59 +08:00
|
|
|
unsigned char in_directive;
|
|
|
|
|
2003-02-22 02:06:30 +08:00
|
|
|
/* Nonzero if in a directive that will handle padding tokens itself.
|
|
|
|
#include needs this to avoid problems with computed include and
|
|
|
|
spacing between tokens. */
|
|
|
|
unsigned char directive_wants_padding;
|
|
|
|
|
2001-07-26 14:02:47 +08:00
|
|
|
/* True if we are skipping a failed conditional group. */
|
|
|
|
unsigned char skipping;
|
|
|
|
|
2001-01-13 22:32:59 +08:00
|
|
|
/* Nonzero if in a directive that takes angle-bracketed headers. */
|
|
|
|
unsigned char angled_headers;
|
|
|
|
|
2002-06-18 14:27:40 +08:00
|
|
|
/* Nonzero if in a #if or #elif directive. */
|
|
|
|
unsigned char in_expression;
|
|
|
|
|
2001-01-13 22:32:59 +08:00
|
|
|
/* Nonzero to save comments. Turned off if discard_comments, and in
|
|
|
|
all directives apart from #define. */
|
|
|
|
unsigned char save_comments;
|
|
|
|
|
2017-11-14 04:17:42 +08:00
|
|
|
/* Nonzero if lexing __VA_ARGS__ and __VA_OPT__ are valid. */
|
2001-01-13 22:32:59 +08:00
|
|
|
unsigned char va_args_ok;
|
|
|
|
|
|
|
|
/* Nonzero if lexing poisoned identifiers is valid. */
|
|
|
|
unsigned char poisoned_ok;
|
|
|
|
|
|
|
|
/* Nonzero to prevent macro expansion. */
|
2002-05-23 06:02:16 +08:00
|
|
|
unsigned char prevent_expansion;
|
2001-01-13 22:32:59 +08:00
|
|
|
|
|
|
|
/* Nonzero when parsing arguments to a function-like macro. */
|
|
|
|
unsigned char parsing_args;
|
2002-04-29 03:42:54 +08:00
|
|
|
|
2004-06-06 04:58:06 +08:00
|
|
|
/* Nonzero if prevent_expansion is true only because output is
|
|
|
|
being discarded. */
|
|
|
|
unsigned char discarding_output;
|
|
|
|
|
2002-04-29 03:42:54 +08:00
|
|
|
/* Nonzero to skip evaluating part of an expression. */
|
|
|
|
unsigned int skip_eval;
|
2006-01-05 00:33:38 +08:00
|
|
|
|
2020-05-20 04:20:32 +08:00
|
|
|
/* Nonzero when tokenizing a deferred pragma. */
|
2006-01-05 00:33:38 +08:00
|
|
|
unsigned char in_deferred_pragma;
|
|
|
|
|
2020-11-19 02:24:12 +08:00
|
|
|
/* Count to token that is a header-name. */
|
|
|
|
unsigned char directive_file_token;
|
|
|
|
|
2006-01-05 00:33:38 +08:00
|
|
|
/* Nonzero if the deferred pragma being handled allows macro expansion. */
|
|
|
|
unsigned char pragma_allow_expansion;
|
2021-11-23 05:29:20 +08:00
|
|
|
|
|
|
|
/* Nonzero if _Pragma should not be interpreted. */
|
|
|
|
unsigned char ignore__Pragma;
|
2001-01-13 22:32:59 +08:00
|
|
|
};
|
|
|
|
|
|
|
|
/* Special nodes - identifiers with predefined significance. */
|
|
|
|
struct spec_nodes
|
|
|
|
{
|
|
|
|
cpp_hashnode *n_defined; /* defined operator */
|
2001-02-08 07:13:46 +08:00
|
|
|
cpp_hashnode *n_true; /* C++ keyword true */
|
|
|
|
cpp_hashnode *n_false; /* C++ keyword false */
|
2001-01-13 22:32:59 +08:00
|
|
|
cpp_hashnode *n__VA_ARGS__; /* C99 vararg macros */
|
2017-11-14 04:17:42 +08:00
|
|
|
cpp_hashnode *n__VA_OPT__; /* C++ vararg macros */
|
2020-11-19 02:24:12 +08:00
|
|
|
|
|
|
|
enum {M_EXPORT, M_MODULE, M_IMPORT, M__IMPORT, M_HWM};
|
|
|
|
|
|
|
|
/* C++20 modules, only set when module_directives is in effect.
|
|
|
|
incoming variants [0], outgoing ones [1] */
|
|
|
|
cpp_hashnode *n_modules[M_HWM][2];
|
2001-01-13 22:32:59 +08:00
|
|
|
};
|
|
|
|
|
cppfiles.c (ENABLE_VALGRIND_CHECKING, [...]): Remove.
* cppfiles.c (ENABLE_VALGRIND_CHECKING, VALGRIND_DISCARD,
MMAP_THRESHOLD, TEST_THRESHOLD, SHOULD_MMAP): Remove.
(struct include_file): Remove fefcnt, mapped members.
(open_file, stack_include_file, _cpp_pop_file_buffer): Disable caching.
(read_include_file): Don't use mmap, terminate buffers in '\r'.
(purge_cache): Don't use munmap.
* cpphash.h (CPP_BUF_COLUMN): Update.
(lexer_state): Remove lexing_comment.
(struct _cpp_line_note): New.
(struct cpp_buffer): New members cur_note, notes_used, notes_cap,
next_line and need_line. Remove col_adjust and saved_flags.
(_cpp_process_line_notes, _cpp_clean_line, _cpp_get_fresh_line,
_cpp_skip_block_comment, scan_out_logical_line): New.
(_cpp_init_mbchar): Remove.
* cppinit.c (init_library): Remove call to _cpp_init_mbchar.
(cpp_read_main_file): Set line to 1 earlier.
(post_options): -traditional-cpp doesn't want trigraphs.
* cpplex.c (MULTIBYTE_CHARS): Remove code predicated on this.
(add_line_note, _cpp_clean_line, _cpp_process_line_notes,
_cpp_get_fresh_line): New.
(handle_newline, skip_escaped_newlines, trigraph_p,
continue_after_nul, _cpp_init_mbchar): Remove.
(get_effective_char): Update.
(_cpp_skip_block_comment): Rename from skip_block_comment, simplify.
(skip_line_comment): Simplify.
(skip_whitespace, parse_identifier, parse_slow, parse_number,
parse_string): Update.
(cpp_lex_direct): Use clean lines and process line notes. Update.
(cpp_interpret_charconst): No MULTIBYTE_CHARS.
* cpplib.c (prepare_directive_trad): Call scan_out_logical_line
directly.
(_cpp_handle_directive): Don't set saved_flags.
(run_directive, destringize_and_run, cpp_define, cpp_define_builtin,
cpp_undef, handle_assertion, cpp_push_buffer): Update.
(_cpp_pop_buffer): Free notes.
* cppmacro.c (builtin_macro, paste_tokens): \n terminate buffer.
* cpppch.c (cpp_read_state): \n terminate buffer.
* cpptrad.c (skip_escaped_newlines, handle_newline): Remove.
(copy_comment): Use _cpp_skip_block_comment.
(skip_whitespace, lex_identifier, _cpp_read_logical_line_trad):
Simplify.
(_cpp_overlay_buffer, _cpp_remove_overlay, push_replacement_text,
save_replacement_text): Update.
(scan_out_logical_line): Update to use clean lines and process
line notes.
* fix-header.c (read_scan_file): Update.
testsuite:
* gcc.dg/cpp/_Pragma4.c: Remove stray space.
* gcc.dg/cpp/trad/escaped-eof.c: Correct line number.
From-SVN: r65808
2003-04-19 08:22:51 +08:00
|
|
|
typedef struct _cpp_line_note _cpp_line_note;
|
|
|
|
struct _cpp_line_note
|
|
|
|
{
|
|
|
|
/* Location in the clean line the note refers to. */
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *pos;
|
2003-04-19 08:22:51 +08:00
|
|
|
|
2003-04-21 03:02:53 +08:00
|
|
|
/* Type of note. The 9 'from' trigraph characters represent those
|
|
|
|
trigraphs, '\\' an escaped newline, ' ' an escaped newline with
|
libcpp: Add -Wleading-whitespace= warning
The following patch on top of the r15-4346 patch adds
-Wleading-whitespace= warning option.
This warning doesn't care how much one actually indents which line
in the source (that is something that can't be easily done in the
preprocessor without doing syntactic analysis), but just simple checks
on what kind of whitespace is used in the indentation.
I think it is still useful to get warnings about such issues early,
while git diagnoses some of it in patches (e.g. the tab after space
case), getting the warnings earlier might help avoiding such issues
sooner.
There are projects which ban use of tabs and require just spaces,
others which require indentation just with horizontal tabs, and finally
projects which want indentation with tabs for multiples of tabstop size
followed by spaces (fewer than tabstop size), like GCC.
For all 3 kinds the warning diagnoses indentation with '\v' or '\f'
characters (unless line contains just whitespace), and for the last one
also cases where a space in the indentation is followed by horizontal
tab or where there are N or more consecutive spaces in the indentation
(for -ftabstop=N).
BTW, for additional testing I've enabled the warnings (without -Werror
for them) in stage3. There are many warnings (both trailing and leading
whitespace), some of them something that can be easily fixed in the headers
or source files, but others with whitespace issues in generated sources,
so if we enable the warnings, either we'd need to adjust the generators
or disable the warnings in (some of the) generated files.
2024-10-23 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_leading_whitespace and cpp_tabstop members.
(enum cpp_warning_reason): Add CPP_W_LEADING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document new
line note kinds.
* init.cc (cpp_create_reader): Set cpp_tabstop to 8.
* lex.cc (find_leading_whitespace_issues): New function.
(_cpp_clean_line): Use it.
(_cpp_process_line_notes): Handle 'L', 'S' and 'T' line notes.
(lex_raw_string): Clear type on 'L', 'S' and 'T' line notes
inside of raw string literals.
gcc/
* doc/invoke.texi (Wleading-whitespace=): Document.
gcc/c-family/
* c.opt (Wleading-whitespace=): New option.
* c-opts.cc (c_common_post_options): Set cpp_opts->cpp_tabstop
to global_dc->m_tabstop.
gcc/testsuite/
* c-c++-common/cpp/Wleading-whitespace-1.c: New test.
* c-c++-common/cpp/Wleading-whitespace-2.c: New test.
* c-c++-common/cpp/Wleading-whitespace-3.c: New test.
* c-c++-common/cpp/Wleading-whitespace-4.c: New test.
2024-10-23 15:58:06 +08:00
|
|
|
intervening space, 'W' trailing whitespace, 'L', 'S' and 'T' for
|
|
|
|
leading whitespace issues, 0 represents a note that
|
libcpp: Add -Wtrailing-blanks warning
Trailing blanks is something even git diff diagnoses; while it is a coding
style issue, if it is so common that git diff diagnoses it, I think it could
be useful to various projects to check that at compile time.
Dunno if it should be included in -Wextra, currently it isn't, and due to
tons of trailing whitespace in our sources, haven't enabled it for when
building gcc itself either.
Note, git diff also diagnoses indentation with tab following space, wonder
if we couldn't have trivial warning options where one would simply ask for
checking of indentation with no tabs, just spaces vs. indentation with
tabs followed by spaces (but never tab width or more spaces in the
indentation). I think that would be easy to do also on the libcpp side.
Checking how much something should be exactly indented requires syntax
analysis (at least some limited one) and can consider columns of first token
on line, but what the exact indentation blanks were is something only libcpp
knows.
On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> Generally I like diagnosing this early. For the above I'd say -Wtrailing-whitespace=
> with a set of things to diagnose (and a sane default - just spaces and tabs - for
> -Wtrailing-whitespace) would be nice. As for naming possibly follow the
> is{space,blank,cntrl} character classifications? If those are a good
> fit, that is.
The patch currently allows blank (' ' '\t') and space (' ' '\t' '\f' '\v'),
cntrl not yet added, not anything non-ASCII, but in theory could
be added later (though, non-ASCII would be just for inside of comments,
say non-breaking space etc. in the source is otherwise an error).
2024-10-15 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_trailing_whitespace member.
(enum cpp_warning_reason): Add CPP_W_TRAILING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document 'W' line note.
* lex.cc (_cpp_clean_line): Add 'W' line note for trailing whitespace
except for trailing whitespace after backslash. Formatting fix.
(_cpp_process_line_notes): Emit -Wtrailing-whitespace diagnostics.
Formatting fixes.
(lex_raw_string): Clear type on 'W' notes.
gcc/
* doc/invoke.texi (Wtrailing-whitespace): Document.
gcc/c-family/
* c.opt (Wtrailing-whitespace=): New option.
(Wtrailing-whitespace): New alias.
* c.opt.urls: Regenerate.
gcc/testsuite/
* c-c++-common/cpp/Wtrailing-whitespace-1.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-2.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-3.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-4.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-5.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-6.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-7.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-8.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-9.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-10.c: New test.
2024-10-15 13:53:56 +08:00
|
|
|
has already been handled, and anything else is invalid. */
|
2003-04-21 03:02:53 +08:00
|
|
|
unsigned int type;
|
2003-04-19 08:22:51 +08:00
|
|
|
};
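The encoding of the type field described in the comment can be summarized with a small classifier. This is only a sketch: the enum and the function `demo_classify_note` are hypothetical names introduced here, and the list of nine trigraph 'from' characters is taken from the ISO C trigraph set rather than from this file.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical categories for the values a note's type may take.  */
enum demo_note_kind
{
  DEMO_NOTE_HANDLED,               /* 0: already processed.  */
  DEMO_NOTE_TRIGRAPH,              /* One of the 9 'from' characters.  */
  DEMO_NOTE_ESCAPED_NEWLINE,       /* '\\'  */
  DEMO_NOTE_ESCAPED_NEWLINE_SPACE, /* ' '  */
  DEMO_NOTE_TRAILING_WHITESPACE,   /* 'W'  */
  DEMO_NOTE_LEADING_WHITESPACE,    /* 'L', 'S' or 'T'  */
  DEMO_NOTE_INVALID                /* Anything else.  */
};

/* Map a raw note type to its category.  */
static enum demo_note_kind
demo_classify_note (unsigned int type)
{
  switch (type)
    {
    case 0:
      return DEMO_NOTE_HANDLED;
    case '\\':
      return DEMO_NOTE_ESCAPED_NEWLINE;
    case ' ':
      return DEMO_NOTE_ESCAPED_NEWLINE_SPACE;
    case 'W':
      return DEMO_NOTE_TRAILING_WHITESPACE;
    case 'L':
    case 'S':
    case 'T':
      return DEMO_NOTE_LEADING_WHITESPACE;
    default:
      break;
    }

  /* Third characters of the ISO C trigraphs ??= ??( ??/ ??) ??'
     ??< ??! ??> ??-.  Zero was handled above, so strchr cannot
     match the string terminator here.  */
  if (type < 128 && strchr ("=(/)'<!>-", (int) type) != NULL)
    return DEMO_NOTE_TRIGRAPH;

  return DEMO_NOTE_INVALID;
}
```

Storing the category in a single character keeps each note small, and setting type to 0 marks a note as consumed without removing it from the array.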
|
|
|
|
|
2024-08-24 22:37:13 +08:00
|
|
|
/* Tail padding required by search_line_fast alternatives. */
|
|
|
|
#ifdef HAVE_SSSE3
|
|
|
|
#define CPP_BUFFER_PADDING 64
|
|
|
|
#else
|
|
|
|
#define CPP_BUFFER_PADDING 16
|
|
|
|
#endif
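One way such tail padding might be applied is sketched below: the buffer is allocated with extra bytes past the data so that a scanner reading in wide chunks never touches unowned memory. The name `demo_alloc_padded`, the fixed 16-byte constant, and the choice to zero the padding are all assumptions for illustration; how libcpp itself allocates and fills the padding is not shown in this excerpt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed padding for this sketch; the header above selects 64 or
   16 depending on HAVE_SSSE3.  */
#define DEMO_BUFFER_PADDING 16

/* Hypothetical helper: copy LEN bytes from SRC into a new buffer
   that has DEMO_BUFFER_PADDING readable bytes after the data.
   Returns NULL on allocation failure; the caller frees the result.  */
static unsigned char *
demo_alloc_padded (const unsigned char *src, size_t len)
{
  unsigned char *buf = (unsigned char *) malloc (len + DEMO_BUFFER_PADDING);
  if (buf == NULL)
    return NULL;
  if (len != 0)
    memcpy (buf, src, len);
  /* Zero the tail so chunked reads past LEN see defined bytes.  */
  memset (buf + len, 0, DEMO_BUFFER_PADDING);
  return buf;
}
```

A larger padding value trades a few bytes per buffer for the freedom to use wider loads in the line search.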
|
|
|
|
|
2002-01-04 05:43:09 +08:00
|
|
|
/* Represents the contents of a file cpplib has read in. */
|
c-lex.c (cb_enter_file, [...]): Combine into the new function cb_change_file.
* c-lex.c (cb_enter_file, cb_leave_file, cb_rename_file):
Combine into the new function cb_change_file.
(init_c_lex): Update.
* cppfiles.c (stack_include_file): Use _cpp_do_file_change.
(cpp_syshdr_flags): Delete.
* cpphash.h (_cpp_do_file_change): New prototype.
Move struct cpp_buffer here from...
* cpplib.h (struct cpp_buffer): ... here.
(enum cpp_fc_reason, struct cpp_file_loc,
struct_cpp_file_change, change_file): New.
(enter_file, leave_file, rename_file, cpp_syshdr_flags): Delete.
* cpplib.c (do_line): Update for new cb_change_file callback.
(_cpp_do_file_change): New function.
(_cpp_pop_buffer): Update to use it.
* cppmain.c (move_printer): Delete.
(main): Set up single callback cb_change_file.
(cb_enter_file, cb_leave_file, cb_rename_file): Delete.
(cb_change_file): New.
* fix-header.c (cur_file, cb_change_file): New.
(recognized_function, read_scan_file): Update.
* scan-decls.c (scan_decls): Update.
* scan.h (recognized_function): Update prototype.
From-SVN: r37784
2000-11-27 16:00:04 +08:00
|
|
|
struct cpp_buffer
|
|
|
|
{
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *cur; /* Current location. */
|
|
|
|
const unsigned char *line_base; /* Start of current physical line. */
|
|
|
|
const unsigned char *next_line; /* Start of to-be-cleaned logical line. */
|
2004-01-17 06:37:49 +08:00
|
|
|
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *buf; /* Entire character buffer. */
|
|
|
|
const unsigned char *rlimit; /* Writable byte at end of file. */
|
2013-03-07 00:18:40 +08:00
|
|
|
const unsigned char *to_free; /* Pointer that should be freed when
|
|
|
|
popping the buffer. */
|
cppfiles.c (ENABLE_VALGRIND_CHECKING, [...]): Remove.
* cppfiles.c (ENABLE_VALGRIND_CHECKING, VALGRIND_DISCARD,
MMAP_THRESHOLD, TEST_THRESHOLD, SHOULD_MMAP): Remove.
(struct include_file): Remove fefcnt, mapped members.
(open_file, stack_include_file, _cpp_pop_file_buffer): Disable caching.
(read_include_file): Don't use mmap, terminate buffers in '\r'.
(purge_cache): Don't use munmap.
* cpphash.h (CPP_BUF_COLUMN): Update.
(lexer_state): Remove lexing_comment.
(struct _cpp_line_note): New.
(struct cpp_buffer): New members cur_note, notes_used, notes_cap,
next_line and need_line. Remove col_adjust and saved_flags.
(_cpp_process_line_notes, _cpp_clean_line, _cpp_get_fresh_line,
_cpp_skip_block_comment, scan_out_logical_line): New.
(_cpp_init_mbchar): Remove.
* cppinit.c (init_library): Remove call to _cpp_init_mbchar.
(cpp_read_main_file): Set line to 1 earlier.
(post_options): -traditional-cpp doesn't want trigraphs.
* cpplex.c (MULTIBYTE_CHARS): Remove code predicated on this.
(add_line_note, _cpp_clean_line, _cpp_process_line_notes,
_cpp_get_fresh_line): New.
(handle_newline, skip_escaped_newlines, trigraph_p,
continue_after_nul, _cpp_init_mbchar): Remove.
(get_effective_char): Update.
(_cpp_skip_block_comment): Rename from skip_block_comment, simplify.
(skip_line_comment): Simplify.
(skip_whitespace, parse_identifier, parse_slow, parse_number,
parse_string): Update.
(cpp_lex_direct): Use clean lines and process line notes. Update.
(cpp_interpret_charconst): No MULTIBYTE_CHARS.
* cpplib.c (prepare_directive_trad): Call scan_out_logical_line
directly.
(_cpp_handle_directive): Don't set saved_flags.
(run_directive, destringize_and_run, cpp_define, cpp_define_builtin,
cpp_undef, handle_assertion, cpp_push_buffer): Update.
(_cpp_pop_buffer): Free notes.
* cppmacro.c (builtin_macro, paste_tokens): \n terminate buffer.
* cpppch.c (cpp_read_state): \n terminate buffer.
* cpptrad.c (skip_escaped_newlines, handle_newline): Remove.
(copy_comment): Use _cpp_skip_block_comment.
(skip_whitespace, lex_identifier, _cpp_read_logical_line_trad):
Simplify.
(_cpp_overlay_buffer, _cpp_remove_overlay, push_replacement_text,
save_replacement_text): Update.
(scan_out_logical_line): Update to use clean lines and process
line notes.
* fix-header.c (read_scan_file): Update.
testsuite:
* gcc.dg/cpp/_Pragma4.c: Remove stray space.
* gcc.dg/cpp/trad/escaped-eof.c: Correct line number.
From-SVN: r65808
2003-04-19 08:22:51 +08:00

  _cpp_line_note *notes;           /* Array of notes. */
  unsigned int cur_note;           /* Next note to process. */
  unsigned int notes_used;         /* Number of notes. */
  unsigned int notes_cap;          /* Size of allocated array. */

  struct cpp_buffer *prev;

Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* cppfiles.c: Completely rewritten.
* c-incpath.c (free_path, remove_duplicates, heads, tails, add_path):
struct cpp_path is now struct cpp_dir.
(remove_duplicates): Don't simplify path names.
* c-opts.c (c_common_parse_file): cpp_read_next_file renamed
cpp_stack_file.
* cpphash.h: Include hashtab.h.
(_cpp_file): Declare.
(struct cpp_buffer): struct include_file is now struct _cpp_file,
and struct cpp_path is now struct cpp_dir. Rename members.
(struct cpp_reader): Similarly. New members once_only_files,
file_hash, file_hash_entries, quote_ignores_source_dir,
no_search_path, saw_pragma_once. Remove all_include_files and
max_include_len. Make some members bool.
(_cpp_mark_only_only): Renamed from _cpp_never_reread.
(_cpp_stack_file): Renamed from _cpp_read_file.
(_cpp_stack_include): Renamed from _cpp_execute_include.
(_cpp_init_files): Renamed from _cpp_init_includes.
(_cpp_cleanup_files): Renamed from _cpp_cleanup_includes.
* cppinit.c (cpp_create_reader): Initialize no_search_path. Update.
(cpp_read_next_file): Rename and move to cppfiles.c.
(cpp_read_main_file): Update.
* cpplib.c (run_directive): Update for renamed members.
(do_include_common, _cpp_pop_buffer): Update.
(do_import): Undeprecate #import.
(do_pragma_once): Undeprecate. Use _cpp_mark_file_once_only.
* cpplib.h: Remove file_name_map_list.
(cpp_options): Remove map_list.
(cpp_dir): Rename from cpp_path. New datatype for name_map.
(cpp_set_include_chains, cpp_stack_file, cpp_included): Update.
testsuite:
* gcc.dg/cpp/include2.c: Only expect one message.
From-SVN: r69942
2003-07-30 06:26:13 +08:00
  /* Pointer into the file table; non-NULL if this is a file buffer.
     Used for include_next and to record control macros. */
  struct _cpp_file *file;

  /* Saved value of __TIMESTAMP__ macro - date and time of last modification
     of the associated file. */
  const unsigned char *timestamp;

  /* Value of if_stack at start of this file.
     Used to prohibit unmatched #endif (etc) in an include file. */
  struct if_stack *if_stack;

  /* True if we need to get the next clean line. */
  bool need_line : 1;

  /* True if we have already warned about C++ comments in this file.
     The warning happens only for C89 extended mode with -pedantic on,
     or for -Wtraditional, and only once per file (otherwise it would
     be far too noisy). */
  bool warned_cplusplus_comments : 1;

  /* True if we don't process trigraphs and escaped newlines. True
     for preprocessed input, command line directives, and _Pragma
     buffers. */
  bool from_stage3 : 1;

  /* At EOF, a buffer is automatically popped. If RETURN_AT_EOF is
     true, a CPP_EOF token is then returned. Otherwise, the next
     token from the enclosing buffer is returned. */
  bool return_at_eof : 1;

Represent column numbers using line-map's source_location.
The "next available source_location" is now managed internally by
line-maps.c rather than by clients.
* line-map.h (struct line_map): New field column_bits.
<from_line>: Rename field to start_location.
(struct line_maps): New fields highest_location and max_column_hint.
(linemap_check_files_exited): New declaration.
(linemap_line_start): New declaration.
(linemap_add): Remove from_line parameter; use highest_location field.
(SOURCE_LINE, LAST_SOURCE_LINE): Modify to use column_bits.
(SOURCE_COLUMN, LAST_SOURCE_LINE_LOCATION): New macros.
(CURRENT_LINE_MAP): Remove macro.
(linemap_position_for_column): New inline function.
* line-map.c (linemap_init): Clear new fields.
(linemap_check_files_exited): New function, extracted from ...
(linemap_free): Use linemap_check_files_exited.
(linemap_add): Remove from_line parameter. Various updates.
(linemap_line_start): New function.
(linemap_lookeup): Update for new field names.
* cpphash.h (struct cpp_reader) <map>: Field removed. Because
linemap_position_for_column may unpredictably change the current map,
it is cleaner and simpler for us to not cache it in cpp_reader.
(struct cpp_buffer): New sysp field.
Changed warned_cplusplus_comments and from_stage3 to bitfields.
* cppinit.c (cpp_read_min_file): pfile->map no longer exists.
* cpplib.c (do_line, do_linemarker, _cpp_do_file_change): Get
current map using linemap_lookup.
(do_linemarker): Also set buffer's sysp field.
(destringize_and_run): No longer need to decrement current line.
* cppfiles.c (_cpp_stack_file): Set sysp from and in buffer.
(search_path_head, open_file_failed): Use buffer's sysp.
(cpp_make_system_header): Get current map using linemap_lookup.
Also set buffer's sysp flag.
* cppmacro.c (_cpp_builtin_macro_text): Likewise use linemap_lookup.
* cpphash.h (CPP_INCREMENT_LINE): New macro.
(struct cpp_buffer): Moved fields saved_cur, saved_rlimit to ...
(struct cpp_reader): ... and adding saved_line_base field.
* cpptrad.c (_cpp_overlay_buffer, _cpp_remove_overlay):
Update accordingly. Don't adjust line.
(_cpp_scan_out_logical_line): Use CPP_INCREMENT_LINE.
* cpphash.c (CPP_IN_SYSTEM_HEADER): Replaced macro by ...
(cpp_in_system_header): ... new inline function, using buffer's sysp.
* cpperror.c (_cpp_begin_message): Update to use cpp_in_system_header.
* cpplex.c (_cpp_lex_direct): Likewise.
* cppmacro.c (_cpp_builtin_macro_text): Likewise.
* cppmacro.c (_cpp_create_definition): Use buffer's sysp field.
* cpplib.h (struct cpp_token): Rename line field to src_loc.
Remove col field as it is now subsumed by src_loc.
* cpperror.c: Update various field, parameter, and macro names.
(print_location): If col==0, try SOURCE_COLUMN of line.
(cpp_error): Use cur_token's src_loc field, rather than line+col.
* cpplib.c (do_diagnostic): Token's src_loc fields replaces line+col.
* cpplex.c (_cpp_process_line_notes, _cpp_lex_direct,
_cpp_skip_block_comment): Use CPP_INCREMENT_LINE.
(_cpp_temp_token): Replace cpp_token's line+col fields by src_loc.
(_cpp_get_fresh_line): Don't need to adjust line for missing newline.
(_cpp_lex_direct): Use linemap_position_for_column.
* c-ppoutput.c (maybe_print_line, print_line): Don't take map
parameter. Instead get it from the line_table global. Adjust callers.
(print): Remove map field. Replace line field to src_line.
(init_pp_output, account_for_newlines, maybe_print_line): Adjust.
(cb_line_change): Use SOURCE_COLUMN. Minor optimizations.
(pp_file_change): Use MAIN_FILE_P since we cannot checked print.map.
Use LAST_SOURCE_LINE_LOCATION to "catch up" after #include.
* cpptrad.c (copy_comment): Rename variable.
* c-lex.c (map): Remove static variable, for same reason we removed
cpp_reader's map field.
(cb_line_change, cb_def_pragma, cb_define, cb_undef): Hence we need
to call linemap_lookup.
(cb_line_change): Token's line field replaced by src_loc.
(fe_file_change): Use MAINFILE_P and LAST_SOURCE_LINE macros.
Don't save new_map.
* cpphash.h, cpperror.c, cpplib.h: Some renames of fileline to
source_location.
From-SVN: r77663
2004-02-11 23:29:30 +08:00
  /* One for a system header, two for a C system header file that therefore
     needs to be extern "C" protected in C++, and zero otherwise. */
  unsigned char sysp;

  /* The directory of this buffer's file. Its NAME member is not
     allocated, so we don't need to worry about freeing it. */
  struct cpp_dir dir;

  /* Descriptor for converting from the input character set to the
     source character set. */
  struct cset_converter input_cset_desc;
};
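
The interplay of `prev` and `return_at_eof` described in the comments above can be sketched as a small buffer stack. This is a simplified illustration under stated assumptions, not libcpp's code: `toy_buffer`, `toy_reader`, `toy_push_buffer`, `toy_pop_buffer` and `toy_next_char` are hypothetical names, bytes stand in for tokens, and -1 stands in for a CPP_EOF token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down analogue of struct cpp_buffer. */
struct toy_buffer
{
  const unsigned char *cur;     /* Current location. */
  const unsigned char *rlimit;  /* One past the last byte. */
  struct toy_buffer *prev;      /* Enclosing buffer, or NULL. */
  bool return_at_eof : 1;       /* Report end-of-input when exhausted. */
};

/* Hypothetical reader holding only the top of the buffer stack. */
struct toy_reader
{
  struct toy_buffer *buffer;
};

/* Push a new buffer over TEXT on top of the stack. */
static struct toy_buffer *
toy_push_buffer (struct toy_reader *r, const unsigned char *text,
                 size_t len, bool return_at_eof)
{
  struct toy_buffer *b = calloc (1, sizeof *b);
  if (b == NULL)
    return NULL;
  b->cur = text;
  b->rlimit = text + len;
  b->prev = r->buffer;
  b->return_at_eof = return_at_eof;
  r->buffer = b;
  return b;
}

/* Pop the top buffer and make its predecessor current. */
static void
toy_pop_buffer (struct toy_reader *r)
{
  struct toy_buffer *b = r->buffer;
  assert (b != NULL);
  r->buffer = b->prev;
  free (b);
}

/* Return the next byte, or -1 as an end-of-input marker.  An exhausted
   buffer is always popped; whether the caller then sees -1 or the next
   byte of the enclosing buffer depends on return_at_eof. */
static int
toy_next_char (struct toy_reader *r)
{
  while (r->buffer != NULL)
    {
      struct toy_buffer *b = r->buffer;
      if (b->cur < b->rlimit)
        return *b->cur++;
      bool stop = b->return_at_eof;
      toy_pop_buffer (r);
      if (stop)
        return -1;
    }
  return -1;
}
```

Pushing an inner buffer with `return_at_eof` false makes its contents appear spliced into the enclosing buffer, roughly as included text is spliced into the including file's stream.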

/* The list of macros saved by the push_macro pragma. */
struct def_pragma_macro {
  /* Chain element to previous saved macro. */
  struct def_pragma_macro *next;
  /* Name of the macro. */
  char *name;
  /* The stored macro content. */
  unsigned char *definition;

  /* Definition line number. */
  location_t line;
  /* Nonzero if the macro was defined in a system header. */
  unsigned int syshdr : 1;
  /* Nonzero if it has been expanded or had its existence tested. */
  unsigned int used : 1;

  /* Nonzero if the saved macro was undefined when it was pushed. */
  unsigned int is_undef : 1;
  /* Nonzero if it was a builtin macro. */
  unsigned int is_builtin : 1;
};
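
For context, the user-visible feature this list backs is the `push_macro` / `pop_macro` pragma pair. The snippet below is a small usage sketch; the macro `LIMIT` and both helper functions are made up for illustration.

```c
#include <assert.h>

#define LIMIT 10

#pragma push_macro("LIMIT")   /* Save the current definition of LIMIT. */
#undef LIMIT
#define LIMIT 99

/* Compiled while the temporary definition is active. */
static int
limit_while_overridden (void)
{
  return LIMIT;
}

#pragma pop_macro("LIMIT")    /* Restore the saved definition. */

/* Compiled after the original definition has been restored. */
static int
limit_after_pop (void)
{
  return LIMIT;
}
```

Each push presumably records one entry of the kind declared above, with `is_undef` covering the case where the name had no definition at the time of the push.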
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
/* A cpp_reader encapsulates the "state" of a pre-processor run.
   Applying cpp_get_token repeatedly yields a stream of pre-processor
   tokens. Usually, there is only one cpp_reader object active. */
struct cpp_reader
{
  /* Top of buffer stack. */
  cpp_buffer *buffer;

  /* Overlaid buffer (can be different after processing #include). */
  cpp_buffer *overlaid_buffer;

  /* Lexer state. */
  struct lexer_state state;

re PR preprocessor/3081 (Preprocessor merges 2 first lines when -imacros is being used)
PR preprocessor/3081
* c-lex.c (map): New.
(cb_file_change): Update map and use it.
(cb_def_pragma, cb_define, cb_undef): Use map and line.
(c_lex): Update to use map.
* cpperror.c (print_location): Move to using logical line numbers.
* cppfiles.c (stack_include_file): Update for new _cpp_do_file_change.
(cpp_make_system_header): Similarly.
(_cpp_execute_include): Stop line numbering hacks. Store the
line we will return to.
* cpphash.h (CPP_BUF_LINE): Remove.
(struct cpp_buffer): Remove lineno and pseudo_newlines.
Add map and return_to_line.
(_cpp_do_file_change): Update.
* cppinit.c (cpp_start_read): Update line kludge.
* cpplex.c (handle_newline): Don't update lineno and pseudo_newlines.
(trigraph_ok): Use logical line numbers for diagnostics.
(skip_block_comment): Likewise.
(skip_whitespace): Likewise.
(skip_line_comment): Use pfile->line instead.
(_cpp_lex_token): Update to use logical line numbering exclusively.
Handle BOL locally. Accept new lines in directives, but keep
pfile->line decremented. Diagnostics use logical lines. Update
directive handling.
* cpplib.c (SEEN_EOL): New.
(skip_rest_of_line, check_eol): Use it.
(end_directive): Increase line number when accepting the newline
at the end of a directive.
(run_directive): Simplify.
(do_line): Bad LC_LEAVEs become LC_RENAMEs. Update.
(_cpp_do_file_change): Update to take buffer line number as an
argument, and store the current map in the cpp_reader. Remove
line number kludges.
(_cpp_do__Pragma): Restore output position after a _Pragma.
(cpp_push_buffer): Don't set output line or lineno.
(_cpp_pop_buffer): Transfer more info from a faked buffer.
Remove line kludge. Set output_line.
* cppmacro.c (builtin_macro): Update handling of __LINE__.
(parse_arg): Use logical lines.
(save_lookahead_token): Save EOFs too now.
* cppmain.c (struct printer): Fix comments.
(printer_init): Simplify, let caller do errors.
(scan_translation_unit, check_multiline_token, dump_macro): Update.
(maybe_print_line): Simplify.
(print_line): Don't print a linemarker if -P.
(cb_define, cb_undef, cb_def_pragma, cb_ident, cb_include): Update.
(cb_file_change): Simplify.
* line-map.h (LAST_SOURCE_LINE): Fix.
(CURRENT_LINE_MAP): New.
* gcc.dg/cpp/19951025-1.c: Revert.
* gcc.dg/cpp/directiv.c: We no longer process directives that
interrupt macro arguments.
From-SVN: r44650
2001-08-06 01:31:25 +08:00
  /* Source line tracking. */
  class line_maps *line_table;

  /* The line of the '#' of the current directive. */
  location_t directive_line;

cpphash.h (struct _cpp_buff, [...]): New.
* cpphash.h (struct _cpp_buff, _cpp_get_buff, _cpp_release_buff,
_cpp_extend_buff, _cpp_free_buff): New.
(struct cpp_reader): New member free_buffs.
* cppinit.c (cpp_destroy): Free buffers.
* cpplex.c (new_buff, _cpp_release_buff, _cpp_get_buff,
_cpp_extend_buff, _cpp_free_buff): New.
* cpplib.h (struct cpp_options): Remove unused member.
* cppmacro.c (collect_args): New. Combines the old parse_arg
and parse_args. Use _cpp_buff for memory allocation.
(funlike_invocation_p, replace_args): Update.
From-SVN: r45827
2001-09-27 01:52:50 +08:00
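As a rough illustration of the free buffer chain that this change introduces, the sketch below recycles released buffers through a singly linked list. It is not the cpplib code: the buffdemo_* names, the field layout, and the first-fit reuse policy are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _cpp_buff; the fields are assumptions.  */
struct buffdemo_buff {
  struct buffdemo_buff *next;   /* Link in the free chain.  */
  unsigned char *base;          /* Start of the storage.  */
  unsigned char *cur;           /* First unused byte.  */
  unsigned char *limit;         /* One past the end of the storage.  */
};

/* Hypothetical stand-in for the free_buffs member of struct cpp_reader.  */
struct buffdemo_reader {
  struct buffdemo_buff *free_buffs;
};

/* Return a buffer with room for at least MIN_SIZE bytes.  The first
   buffer on the free chain that is large enough is reused; otherwise a
   new one is allocated.  Returns NULL if allocation fails.  */
struct buffdemo_buff *buffdemo_get(struct buffdemo_reader *r, size_t min_size)
{
  struct buffdemo_buff **p;
  struct buffdemo_buff *b;

  assert(r != NULL);
  if (min_size == 0)
    min_size = 1;

  for (p = &r->free_buffs; *p != NULL; p = &(*p)->next)
    if ((size_t) ((*p)->limit - (*p)->base) >= min_size)
      {
        b = *p;
        *p = b->next;        /* Unlink from the free chain.  */
        b->next = NULL;
        b->cur = b->base;    /* Hand it out empty.  */
        return b;
      }

  b = malloc(sizeof *b);
  if (b == NULL)
    return NULL;
  b->base = malloc(min_size);
  if (b->base == NULL)
    {
      free(b);
      return NULL;
    }
  b->next = NULL;
  b->cur = b->base;
  b->limit = b->base + min_size;
  return b;
}

/* Put B back on the free chain so that a later request can reuse it.  */
void buffdemo_release(struct buffdemo_reader *r, struct buffdemo_buff *b)
{
  assert(r != NULL && b != NULL);
  b->next = r->free_buffs;
  r->free_buffs = b;
}

/* Free every buffer on the chain, as a destroy routine would.  */
void buffdemo_free_all(struct buffdemo_reader *r)
{
  assert(r != NULL);
  while (r->free_buffs != NULL)
    {
      struct buffdemo_buff *next = r->free_buffs->next;
      free(r->free_buffs->base);
      free(r->free_buffs);
      r->free_buffs = next;
    }
}
```

Keeping released buffers on a chain trades a little memory for fewer calls to malloc when buffers are requested and released repeatedly.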
|
|
|
/* Memory buffers. */
|
cpphash.h (POOL_ALIGN, [...]): Remove.
* cpphash.h (POOL_ALIGN, POOL_FRONT, POOL_LIMIT, POOL_BASE,
POOL_SIZE, POOL_ROOM, POOL_COMMIT, struct cpp_chunk,
struct cpp_pool, _cpp_init_pool, _cpp_free_pool, _cpp_pool_reserve,
_cpp_pool_alloc, _cpp_next_chunk): Remove.
(_cpp_extend_buff, BUFF_ROOM): Update.
(_cpp_append_extend_buff): New.
(struct cpp_reader): Remove macro_pool, add a_buff.
* cppinit.c (cpp_create_reader): Initialize a_buff, instead of
macro_pool.
(cpp_destroy): Free a_buff instead of macro_pool.
* cpplex.c (new_chunk, chunk_suitable, _cpp_next_chunk,
new_chunk, _cpp_init_pool, _cpp_free_pool, _cpp_pool_reserve,
_cpp_pool_alloc): Remove.
(parse_number, parse_string): Update use of _cpp_extend_buff.
(_cpp_extend_buff): Update.
(_cpp_append_extend_buff, cpp_aligned_alloc): New.
* cpplib.c (glue_header_name, parse_answer):
Update use of _cpp_extend_buff.
(cpp_register_pragma, cpp_register_pragma_space): Use
_cpp_aligned_alloc.
(do_assert, do_unassert): Check for EOL, update.
* cppmacro.c (stringify_arg, collect_args): Update to use
_cpp_extend_buff and _cpp_append_extend_buff.
(save_parameter, parse_params, alloc_expansion_token,
_cpp_create_definition): Rework memory management.
* gcc.dg/cpp/redef2.c: Add test.
From-SVN: r45899
2001-09-30 18:03:11 +08:00
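The idea of aligned storage carved out of a buffer can be sketched as a bump allocator that rounds its offset up before each allocation. This is only an illustration under stated assumptions: the aligndemo_* names and the arena layout are hypothetical and do not reproduce _cpp_aligned_alloc or the a_buff handling.

```c
#include <assert.h>
#include <stddef.h>

/* Round N up to a multiple of ALIGN, which must be a power of two.  */
size_t aligndemo_round_up(size_t n, size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* Hypothetical arena: a block of storage and the number of bytes used.
   BASE itself is assumed to be suitably aligned by the caller.  */
struct aligndemo_arena {
  unsigned char *base;
  size_t used;
  size_t size;
};

/* Allocate LEN bytes at an offset that is a multiple of ALIGN.
   Returns NULL when the arena has no room left.  */
void *aligndemo_alloc(struct aligndemo_arena *a, size_t len, size_t align)
{
  size_t start;

  assert(a != NULL);
  start = aligndemo_round_up(a->used, align);
  if (start > a->size || len > a->size - start)
    return NULL;
  a->used = start + len;
  return a->base + start;
}
```

An alignment of 1 gives the behavior of unaligned storage, so the same routine can model both kinds of buffer in this sketch.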
|
|
|
_cpp_buff *a_buff; /* Aligned permanent storage. */
|
2001-09-28 17:40:22 +08:00
|
|
|
_cpp_buff *u_buff; /* Unaligned permanent storage. */
|
|
|
|
_cpp_buff *free_buffs; /* Free buffer chain. */
|
|
|
|
|
|
|
|
/* Context stack. */
|
|
|
|
struct cpp_context base_context;
|
|
|
|
struct cpp_context *context;
|
|
|
|
|
|
|
|
/* If in_directive, the directive if known. */
|
|
|
|
const struct directive *directive;
|
|
|
|
|
2004-09-10 03:16:56 +08:00
|
|
|
/* Token generated while handling a directive, if any. */
|
|
|
|
cpp_token directive_result;
|
|
|
|
|
2007-09-07 00:24:05 +08:00
|
|
|
/* When expanding a macro at top-level, this is the location of the
|
|
|
|
macro invocation. */
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t invocation_location;
|
2007-09-07 00:24:05 +08:00
|
|
|
|
PR preprocessor/64803 - __LINE__ inside macro is not constant
Consider the example code mentioned in this PR:
$ cat -n test.c
1 #define C(a, b) a ## b
2 #define L(x) C(L, x)
3 #define M(a) goto L(__LINE__); __LINE__; L(__LINE__):
4 M(a /* --> this is the line of the expansion point of M. */
5 ); /* --> this is the line of the end of the invocation of M. */
$
"cc1 -quiet -E test.c" yields:
goto L5; 5; L4:
;
Notice how we have a 'L4' there, where it should be L5. That is the issue.
My understanding is that during the *second* expansion of __LINE__
(the one between the two L(__LINE__)), builtin_macro() is called by
enter_macro_context() with the location of the expansion point of M
(which is at line 4). Then _cpp_builtin_macro_text() expands __LINE__
into the line number of the location of the last token that has been
lexed, which is the location of the closing parenthesis of the
invocation of M, at line 5. So that invocation of __LINE__ is
expanded into 5.
Now let's see why the last invocation of __LINE__ is expanded into 4.
In builtin_macro(), we have this code at some point:
/* Set pfile->cur_token as required by _cpp_lex_direct. */
pfile->cur_token = _cpp_temp_token (pfile);
cpp_token *token = _cpp_lex_direct (pfile);
/* We should point to the expansion point of the builtin macro. */
token->src_loc = loc;
The first two statements insert a new token into the stream of lexed
tokens, and pfile->cur_token[-1] is the "new" last token that has been
lexed. But the location of pfile->cur_token[-1] is the same location
as the location of the "previous" pfile->cur_token[-1], by courtesy of
_cpp_temp_token(). So normally, in subsequent invocations of
builtin_macro(), the location of pfile->cur_token[-1] should always be
the location of the closing parenthesis of the invocation of M at line
5. Except that that code in master now has the statement
"token->src_loc = loc;" on the next line. That statement actually
sets the location of pfile->cur_token[-1] to 'loc'. Which is the
location of the expansion point of M, which is on line 4.
So in the subsequent call to builtin_macro() (for the last expansion
of __LINE__ in L(__LINE__)), for _cpp_builtin_macro_text(),
pfile->cur_token[-1].src_loc is going to have a line number of 4.
I think the core issue here is that the location that is passed to
builtin_macro() from enter_macro_context() is not correct when we are
in the presence of a top-most function-like macro invocation; in that
case, that location should be the location of the closing parenthesis
of the macro invocation. Otherwise, if we are in the presence of a
top-most object-like macro invocation then the location passed down
to builtin_macro should be the location of the expansion point of the
macro.
That way, in the particular case of the input code above, the location
received by builtin_macro() will always have line number 5.
Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk.
libcpp/ChangeLog:
* internal.h (cpp_reader::top_most_macro_node): New data member.
* macro.c (enter_macro_context): Pass the location of the end of
the top-most invocation of the function-like macro, or the
location of the expansion point of the top-most object-like macro.
(cpp_get_token_1): Store the top-most macro node in the new
pfile->top_most_macro_node data member.
(_cpp_pop_context): Clear the new cpp_reader::top_most_macro_node
data member.
gcc/testsuite/ChangeLog:
* gcc.dg/cpp/builtin-macro-1.c: New test case.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
From-SVN: r220367
2015-02-03 17:26:46 +08:00
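The property this fix is after, that every __LINE__ produced by a single macro invocation yields the same number even when the invocation spans several lines, can be checked with a small program. The sketch below is not the gcc.dg/cpp/builtin-macro-1.c test; the LINEDEMO_PAIR macro and the linedemo_* names are made up for the example, and it only compares the two values because which line is reported may differ between preprocessors.

```c
#include <assert.h>

/* Both uses of __LINE__ come from one invocation of the macro.  The
   argument is ignored; it only lets the invocation span two lines.  */
#define LINEDEMO_PAIR(ignored) { __LINE__, __LINE__ }

static const int linedemo_lines[2] = LINEDEMO_PAIR(a  /* expansion point */
);                                                    /* closing parenthesis */

/* Return the line number recorded by the first (INDEX 0) or second
   (INDEX 1) __LINE__ in the invocation above.  */
int linedemo_line(int index)
{
  assert(index == 0 || index == 1);
  return linedemo_lines[index];
}
```

With a preprocessor that behaves as described above, linedemo_line(0) and linedemo_line(1) are equal.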
|
|
|
/* This is the node representing the macro being expanded at
|
|
|
|
top-level. The value of this data member is valid iff
|
2019-08-29 02:43:37 +08:00
|
|
|
cpp_in_macro_expansion_p() returns TRUE. */
|
|
|
|
cpp_hashnode *top_most_macro_node;
|
|
|
|
|
PR preprocessor/53229 - Fix diagnostics location when pasting tokens
As stated in the audit trail of this problem report, consider this
test case:
$ cat test.c
1 struct x {
2 int i;
3 };
4 struct x x;
5
6 #define TEST(X) x.##X
7
8 void foo (void)
9 {
10 TEST(i) = 0;
11 }
$
$ cc1 -quiet test.c
test.c: In function 'foo':
test.c:10:1: error: pasting "." and "i" does not give a valid preprocessing token
TEST(i) = 0;
^
$
So, when pasting tokens, the error diagnostic uses the global and
imprecise input_location variable, leading to an imprecise output.
To properly fix this, I think libcpp should keep the token of the
pasting operator '##', instead of representing it with a flag on the LHS
operand's token. That way, it could use its location. Doing that
would be quite intrusive though. So this patch just uses the location
of the LHS of the pasting operator, for now. It's IMHO better than
the current situation.
The patch makes paste_tokens take a location parameter that is used in
the diagnostics. This change can still be useful later when we can
use the location of the pasting operator, because paste_tokens will
just be passed the new, more precise location.
Incidentally, it appeared that when getting tokens from within
preprocessor directives (like what is done in gcc.dg/cpp/paste12.c),
with -ftrack-macro-expansion disabled, the location of the expansion
point of macros was being lost because
cpp_reader::set_invocation_location wasn't being properly set. It's
because when cpp_get_token_1 calls enter_macro_context, there is a
little period of time between the beginning of that latter function and
when the macro is really pushed (and thus when the macro is really
expanded) where we wrongly consider that we are not expanding the
macro because macro_of_context is still NULL. In that period of time,
in the occurrences of indirect recursive calls to cpp_get_token_1,
this latter function wrongly sets cpp_reader::invocation_location
because cpp_reader::set_invocation_location is not being properly set.
To avoid that confusion the patch does away with
cpp_reader::set_invocation_location and introduces a new flag
cpp_reader::about_to_expand_macro_p that is set in the small time
interval exposed earlier. A new in_macro_expansion_p is introduced as
well, so that cpp_get_token_1 can now accurately detect when we are in
the process of expanding a macro, and thus correctly collect the
location of the expansion point.
People seem to like screenshots.
Thus, after the patch, we now have:
$ cc1 -quiet test.c
test.c: In function 'foo':
test.c:6:18: error: pasting "." and "i" does not give a valid preprocessing token
#define TEST(X) x.##X
^
test.c:10:3: note: in expansion of macro 'TEST'
TEST(i) = 0;
^
$
Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk.
libcpp/
PR preprocessor/53229
* internal.h (cpp_reader::set_invocation_location): Remove.
(cpp_reader::about_to_expand_macro_p): New member flag.
* directives.c (do_pragma): Remove Kludge as
pfile->set_invocation_location is no more.
* macro.c (cpp_get_token_1): Do away with the use of
cpp_reader::set_invocation_location. Just collect the macro
expansion point when we are about to expand the top-most macro.
Do not override cpp_reader::about_to_expand_macro_p.
This fixes gcc.dg/cpp/paste12.c by making get_token_no_padding
properly handle locations of expansion points.
(cpp_get_token_with_location): Adjust, as
cpp_reader::set_invocation_location is no more.
(paste_tokens): Take a virtual location parameter for
the LHS of the pasting operator. Use it in diagnostics. Update
comments.
(paste_all_tokens): Tighten the assert. Propagate the location of
the expansion point when no virtual locations are available.
Pass the virtual location to paste_tokens.
(in_macro_expansion_p): New static function.
(enter_macro_context): Set the cpp_reader::about_to_expand_macro_p
flag until we really start expanding the macro.
gcc/testsuite/
PR preprocessor/53229
* gcc.dg/cpp/paste6.c: Force to run without
-ftrack-macro-expansion.
* gcc.dg/cpp/paste8.c: Likewise.
* gcc.dg/cpp/paste8-2.c: New test, like paste8.c but run with
-ftrack-macro-expansion.
* gcc.dg/cpp/paste12.c: Force to run without
-ftrack-macro-expansion.
* gcc.dg/cpp/paste12-2.c: New test, like paste12.c but run with
-ftrack-macro-expansion.
* gcc.dg/cpp/paste13.c: Likewise.
* gcc.dg/cpp/paste14.c: Likewise.
* gcc.dg/cpp/paste14-2.c: New test, like paste14.c but run with
-ftrack-macro-expansion.
* gcc.dg/cpp/paste18.c: New test.
From-SVN: r187945
2012-05-29 17:36:29 +08:00
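The interplay between the new flag and the context stack can be modelled in a few lines. The sketch below is a simplified model, not the libcpp implementation: the expdemo_* types and functions are hypothetical, and the predicate simply treats "about to expand" and "actually expanding" as the two ways of being inside a macro expansion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical macro descriptor.  */
struct expdemo_macro {
  const char *name;
};

/* Hypothetical context: MACRO is NULL for the base context.  */
struct expdemo_context {
  struct expdemo_context *prev;
  const struct expdemo_macro *macro;
};

/* Hypothetical reader holding the context stack and the flag.  */
struct expdemo_reader {
  struct expdemo_context base_context;
  struct expdemo_context *context;
  bool about_to_expand_macro_p;
};

void expdemo_init(struct expdemo_reader *r)
{
  r->base_context.prev = NULL;
  r->base_context.macro = NULL;
  r->context = &r->base_context;
  r->about_to_expand_macro_p = false;
}

/* True if a macro is about to be expanded or is being expanded.  */
bool expdemo_in_macro_expansion_p(const struct expdemo_reader *r)
{
  return r->about_to_expand_macro_p || r->context->macro != NULL;
}

/* Start entering a macro: arguments may still be collected, so no
   context has been pushed yet, but the flag already records intent.  */
void expdemo_begin_enter(struct expdemo_reader *r)
{
  r->about_to_expand_macro_p = true;
}

/* Really start the expansion: push CTX for macro M and clear the flag,
   since the context stack now carries the information.  */
void expdemo_push(struct expdemo_reader *r, struct expdemo_context *ctx,
                  const struct expdemo_macro *m)
{
  assert(m != NULL);
  ctx->prev = r->context;
  ctx->macro = m;
  r->context = ctx;
  r->about_to_expand_macro_p = false;
}

/* Finish the expansion by popping the innermost context.  */
void expdemo_pop(struct expdemo_reader *r)
{
  assert(r->context != &r->base_context);
  r->context = r->context->prev;
}
```

In this model the window between expdemo_begin_enter and expdemo_push corresponds to the interval in which looking at the context stack alone would wrongly report that no macro is being expanded.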
|
|
|
/* Nonzero if we are about to expand a macro. Note that if we are
|
|
|
|
really expanding a macro, the function macro_of_context returns
|
|
|
|
the macro being expanded and this flag is set to false. Client
|
2019-08-29 02:43:37 +08:00
|
|
|
code should use the function cpp_in_macro_expansion_p to know if we
|
|
|
|
are either about to expand a macro, or are actually expanding
|
|
|
|
one. */
|
|
|
|
bool about_to_expand_macro_p;
|
2007-09-07 00:24:05 +08:00
|
|
|
|
Makefile.in (C_AND_OBJC_OBJS, [...]): Update.
* Makefile.in (C_AND_OBJC_OBJS, c-incpath.o, c-lex.o, LIBCPP_OBJS,
cppinit.o, cppdefault.o, fix-header): Update.
* c-incpath.c: New file.
* c-incpath.h: New file.
* c-lex.c: Include c-incpath.h.
(init_c_lex): Register path simplifier.
* c-opts.c: Include cppdefault.h and c-incpath.h.
(TARGET_SYSTEM_ROOT, verbose, iprefix, sysroot, std_inc,
std_cxx_inc, quote_chain_split, add_prefixed_path): New.
(COMMAND_LINE_OPTIONS): Add more options from cpplib.
(missing_arg, c_common_decode_option): Handle them.
(c_common_post_options): Register include chains.
(print_help): Update.
* cppdefault.h (struct default_include): Update.
Move some macros to ...
* cppdefault.c: ... here.
(cpp_include_defaults): Add extra field add_sysroot.
* cppfiles.c (include_file, search_from, find_or_create_entry,
cpp_included, find_include_file, remap_filename): Update for
renaming of search_path to cpp_path, and of the chain headers.
(remove_component_p, _cpp_simplify_pathname): Move to c-incpath.c.
* cpphash.h (struct search_path): Move to cpplib.h.
(struct cpp_buffer, struct cpp_reader): Update.
(_cpp_simplify_pathname): Remove.
* cppinit.c: Don't include prefix.h and cppdefault.h.
(INO_T_EQ, INO_T_COPY, path_include, append_include_chain,
remove_dup_dir, remove_dup_nonsys_dirs, remove_dup_dirs,
init_standard_includes, BRACKET, SYSTEM, AFTER, no_dir,
no_pth, cpp_handle_options): Remove.
(struct pending_option): Remove chain members.
(cpp_destroy, cpp_read_main_file, COMMAND_LINE_OPTIONS,
cpp_handle_option): Update.
* cpplib.h (struct cpp_path, cpp_set_include_chains): New.
(struct cpp_options): Remove quote_include, bracket_include,
include_prefix, include_prefix_len, verbose, ignore_srcdir,
no_standard_includes, no_standard_cplusplus_includes.
(struct cpp_callbacks): Add simplify_path.
(cpp_handle_options): Remove.
* fix-header.c: Include c-incpath.h.
(read_scan_file): Update to use c-incpath functionality.
* doc/passes.texi: Update.
cp:
* Make-lang.in (CXX_C_OBJS): Update.
From-SVN: r63612
2003-03-01 22:31:21 +08:00
|
|
|
/* Search paths for include files. */
|
Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* cppfiles.c: Completely rewritten.
* c-incpath.c (free_path, remove_duplicates, heads, tails, add_path):
struct cpp_path is now struct cpp_dir.
(remove_duplicates): Don't simplify path names.
* c-opts.c (c_common_parse_file): cpp_read_next_file renamed
cpp_stack_file.
* cpphash.h: Include hashtab.h.
(_cpp_file): Declare.
(struct cpp_buffer): struct include_file is now struct _cpp_file,
and struct cpp_path is now struct cpp_dir. Rename members.
(struct cpp_reader): Similarly. New members once_only_files,
file_hash, file_hash_entries, quote_ignores_source_dir,
no_search_path, saw_pragma_once. Remove all_include_files and
max_include_len. Make some members bool.
(_cpp_mark_file_once_only): Renamed from _cpp_never_reread.
(_cpp_stack_file): Renamed from _cpp_read_file.
(_cpp_stack_include): Renamed from _cpp_execute_include.
(_cpp_init_files): Renamed from _cpp_init_includes.
(_cpp_cleanup_files): Renamed from _cpp_cleanup_includes.
* cppinit.c (cpp_create_reader): Initialize no_search_path. Update.
(cpp_read_next_file): Rename and move to cppfiles.c.
(cpp_read_main_file): Update.
* cpplib.c (run_directive): Update for renamed members.
(do_include_common, _cpp_pop_buffer): Update.
(do_import): Undeprecate #import.
(do_pragma_once): Undeprecate. Use _cpp_mark_file_once_only.
* cpplib.h: Remove file_name_map_list.
(cpp_options): Remove map_list.
(cpp_dir): Rename from cpp_path. New datatype for name_map.
(cpp_set_include_chains, cpp_stack_file, cpp_included): Update.
testsuite:
* gcc.dg/cpp/include2.c: Only expect one message.
From-SVN: r69942
2003-07-30 06:26:13 +08:00
|
|
|
struct cpp_dir *quote_include; /* "" */
|
|
|
|
struct cpp_dir *bracket_include; /* <> */
|
|
|
|
struct cpp_dir no_search_path; /* No path. */
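One way to picture the two chain heads is as entry points into a single linked list of directories, with the "" chain running on into the <> chain. The sketch below rests on that assumption and is not the cppfiles.c lookup: the pathdemo_* names are hypothetical, and a small table stands in for the file system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct cpp_dir: one directory in a chain.  */
struct pathdemo_dir {
  struct pathdemo_dir *next;
  const char *name;
};

/* Callback that reports whether FILE exists in directory DIR.  */
typedef int (*pathdemo_exists_fn)(const char *dir, const char *file);

/* Walk the chain from START and return the first directory in which
   FILE exists, or NULL if no directory has it.  */
const struct pathdemo_dir *pathdemo_search(const struct pathdemo_dir *start,
                                           const char *file,
                                           pathdemo_exists_fn exists)
{
  const struct pathdemo_dir *d;

  assert(file != NULL && exists != NULL);
  for (d = start; d != NULL; d = d->next)
    if (exists(d->name, file))
      return d;
  return NULL;
}

/* Fake file system for the example: local.h lives in "src" and
   stdio.h lives in "/usr/include".  */
int pathdemo_fake_exists(const char *dir, const char *file)
{
  if (strcmp(dir, "src") == 0)
    return strcmp(file, "local.h") == 0;
  if (strcmp(dir, "/usr/include") == 0)
    return strcmp(file, "stdio.h") == 0;
  return 0;
}
```

Starting a search at the "" head visits the quote directories first and then the bracket directories, while starting at the <> head skips the quote directories entirely.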
|
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (e.g. several gigabytes large; that is still
infeasible in both compile time and memory).
There are 2 reasons for this. One is that I think the way it is
implemented now in the patch is how we should handle the smaller #embed
sizes. I don't know where the boundary should be, whether 32 bytes or 64
or something like that, but handling the single byte case is certainly
desirable, because it can appear anywhere in the source where a constant
integer literal can appear. I also think that for a few bytes it isn't
worth coming up with something smarter, and users would like to e.g. see
it readably in -E output as well (perhaps the slow vs. fast boundary
should be determined by a command line option). The other reason is to
be able to more easily find regressions in behavior caused by the
optimizations, so that we have something to go back to in git to compare
against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which was even committed to LLVM trunk for a few hours but reverted
afterwards. LLVM now has the support committed, and I admit I haven't
rechecked whether the behavior on the spots mentioned below has been
fixed in it already or not.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom, nor am I sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, for __has_embed the patch currently just
always returns 2 on non-regular files, as both the thephd.dev branch
and the clang pull request do.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which is how clang -save-temps or GCC treats it, and so results
in just one argument passed rather than 4.
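The difference between the two readings can be reproduced in plain C without #embed, by writing both token sequences out by hand. The sketch below is only an illustration: the embeddemo_* names are hypothetical, the four values are the bytes of "void", and the (void) casts are added purely to keep compilers from warning about discarded operands of the comma operator.

```c
#include <assert.h>
#include <stddef.h>

/* Store four separate arguments so that a caller can inspect them.  */
void embeddemo_collect(int out[4], int a, int b, int c, int d)
{
  assert(out != NULL);
  out[0] = a;
  out[1] = b;
  out[2] = c;
  out[3] = d;
}

/* The parenthesized reading: the commas become comma operators, so the
   whole thing is a single expression whose value uses only the last
   number, i.e. 172 + 100 + 2.  */
int embeddemo_parenthesized(void)
{
  return 172 + ((void) 118, (void) 111, (void) 105, 100) + 2;
}
```

Calling embeddemo_collect (out, 172 + 118, 111, 105, 100 + 2) passes four arguments, with the prefix applied to the first and the suffix to the last, whereas embeddemo_parenthesized () collapses everything into one value.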
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether, after diagnosing it, we could simply pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says. My limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check whether it
satisfies the grammar. If not, the whole directive is macro expanded; if
yes, only the limit parameter argument is macro expanded, and the
prefix/suffix/if_empty arguments are maybe macro expanded when actually
used (and not at all if unused). And I think __has_embed macro expansion
has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
|
|
|
struct cpp_dir *embed_include; /* #embed <> */
|
Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* cppfiles.c: Completely rewritten.
* c-incpath.c (free_path, remove_duplicates, heads, tails, add_path):
struct cpp_path is now struct cpp_dir.
(remove_duplicates): Don't simplify path names.
* c-opts.c (c_common_parse_file): cpp_read_next_file renamed
cpp_stack_file.
* cpphash.h: Include hashtab.h.
(_cpp_file): Declare.
(struct cpp_buffer): struct include_file is now struct _cpp_file,
and struct cpp_path is now struct cpp_dir. Rename members.
(struct cpp_reader): Similarly. New members once_only_files,
file_hash, file_hash_entries, quote_ignores_source_dir,
no_search_path, saw_pragma_once. Remove all_include_files and
max_include_len. Make some members bool.
(_cpp_mark_only_only): Renamed from _cpp_never_reread.
(_cpp_stack_file): Renamed from _cpp_read_file.
(_cpp_stack_include): Renamed from _cpp_execute_include.
(_cpp_init_files): Renamed from _cpp_init_includes.
(_cpp_cleanup_files): Renamed from _cpp_cleanup_includes.
* cppinit.c (cpp_create_reader): Initialize no_search_path. Update.
(cpp_read_next_file): Rename and move to cppfiles.c.
(cpp_read_main_file): Update.
* cpplib.c (run_directive): Update for renamed members.
(do_include_common, _cpp_pop_buffer): Update.
(do_import): Undeprecate #import.
(do_pragma_once): Undeprecate. Use _cpp_mark_file_once_only.
* cpplib.h: Remove file_name_map_list.
(cpp_options): Remove map_list.
(cpp_dir): Rename from cpp_path. New datatype for name_map.
(cpp_set_include_chains, cpp_stack_file, cpp_included): Update.
testsuite:
* gcc.dg/cpp/include2.c: Only expect one message.
From-SVN: r69942
2003-07-30 06:26:13 +08:00
|
|
|
|
2003-08-03 00:29:46 +08:00
|
|
|
/* Chain of all hashed _cpp_file instances. */
|
|
|
|
struct _cpp_file *all_files;
|
|
|
|
|
2003-10-02 15:23:27 +08:00
|
|
|
struct _cpp_file *main_file;
|
|
|
|
|
|
|
|
/* File and directory hash table. */
|
2003-08-01 22:04:02 +08:00
|
|
|
struct htab *file_hash;
|
2004-07-17 01:07:01 +08:00
|
|
|
struct htab *dir_hash;
|
2007-12-07 02:56:26 +08:00
|
|
|
struct file_hash_entry_pool *file_hash_entries;
|
|
|
|
|
2007-05-22 07:43:53 +08:00
|
|
|
/* Negative path lookup hash table. */
|
|
|
|
struct htab *nonexistent_file_hash;
|
|
|
|
struct obstack nonexistent_file_ob;
|
|
|
|
|
|
|
|
/* Nonzero means don't look for #include "foo" in the source-file
|
|
|
|
directory. */
|
|
|
|
bool quote_ignores_source_dir;
|
|
|
|
|
2003-09-14 22:49:08 +08:00
|
|
|
/* Nonzero if any file has contained #pragma once or #import has
|
2003-08-03 00:29:46 +08:00
|
|
|
been used. */
|
|
|
|
bool seen_once_only;
|
Makefile.in (C_AND_OBJC_OBJS, [...]): Update.
* Makefile.in (C_AND_OBJC_OBJS, c-incpath.o, c-lex.o, LIBCPP_OBJS,
cppinit.o, cppdefault.o, fix-header): Update.
* c-incpath.c: New file.
* c-incpath.h: New file.
* c-lex.c: Include c-incpath.h.
(init_c_lex): Register path simplifier.
* c-opts.c: Include cppdefault.h and c-incpath.h.
(TARGET_SYSTEM_ROOT, verbose, iprefix, sysroot, std_inc,
std_cxx_inc, quote_chain_split, add_prefixed_path): New.
(COMMAND_LINE_OPTIONS): Add more options from cpplib.
(missing_arg, c_common_decode_option): Handle them.
(c_common_post_options): Register include chains.
(print_help): Update.
* cppdefault.h (struct default include): Update.
Move some macros to ...
* cppdefault.c: ... here.
(cpp_include_defaults): Add extra field add_sysroot.
* cppfiles.c (include_file, search_from, find_or_create_entry,
cpp_included, find_include_file, remap_filename): Update for
renaming of search_path to cpp_path, and of the chain headers.
(remove_component_p, _cpp_simplify_pathname): Move to c-incpath.c.
* cpphash.h (struct search_path): Move to cpplib.h.
(struct cpp_buffer, struct cpp_reader): Update.
(_cpp_simplify_pathname): Remove.
* cppinit.c: Don't include prefix.h and cppdefault.h.
(INO_T_EQ, INO_T_COPY, path_include, append_include_chain,
remove_dup_dir, remove_dup_nonsys_dirs, remove_dup_dirs,
init_standard_includes, BRACKET, SYSTEM, AFTER, no_dir,
no_pth, cpp_handle_options): Remove.
(struct pending_option): Remove chain members.
(cpp_destroy, cpp_read_main_file, COMMAND_LINE_OPTIONS,
cpp_handle_option): Update.
* cpplib.h (struct cpp_path, cpp_set_include_chains): New.
(struct cpp_options): Remove quote_include, bracket_include,
include_prefix, include_prefix_len, verbose, ignore_srcdir,
no_standard_includes, no_standard_cplusplus_includes.
(struct cpp_callbacks): Add simplify_path.
(cpp_handle_options): Remove.
* fix-header.c: Include c-incpath.h.
(read_scan_file): Update to use c-incpath functionality.
* doc/passes.texi: Update.
cp:
* Make-lang.in (CXX_C_OBJS): Update.
From-SVN: r63612
2003-03-01 22:31:21 +08:00
|
|
|
|
2024-10-02 16:53:35 +08:00
|
|
|
/* Multiple include optimization and -Wheader-guard warning. */
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
const cpp_hashnode *mi_cmacro;
|
|
|
|
const cpp_hashnode *mi_ind_cmacro;
|
2024-10-02 16:53:35 +08:00
|
|
|
const cpp_hashnode *mi_def_cmacro;
|
|
|
|
location_t mi_loc, mi_def_loc;
|
2001-07-30 01:27:57 +08:00
|
|
|
bool mi_valid;
|
|
|
|
|
2001-09-11 15:00:12 +08:00
|
|
|
/* Lexing. */
|
|
|
|
cpp_token *cur_token;
|
|
|
|
tokenrun base_run, *cur_run;
|
c-parse.in (_yylex): Use _cpp_backup_tokens.
* c-parse.in (_yylex): Use _cpp_backup_tokens.
* cpphash.h (struct tokenrun): Add prev.
(struct lexer_state): Remove bol.
(struct cpp_reader): Remove old lookahead stuff, add lookaheads.
(_cpp_free_lookaheads, _cpp_release_lookahead, _cpp_push_token)
: Remove.
* cppinit.c (cpp_create_reader): Don't set bol.
(cpp_destroy): Don't free lookaheads.
* cpplex.c (lex_directive): Remove.
(next_tokenrun): Update.
(_cpp_lex_token): Clean up logic.
(lex_token): Update to return a pointer to lexed token, since it
can move to the start of the buffer. Simplify newline handling.
* cpplib.c (SEEN_EOL): Update.
(skip_rest_of_line): Remove lookahead stuff.
(end_directive): Line numbers are already incremented. Revert
to start of lexed token buffer if we can.
(_cpp_handle_directive, do_pragma, do_pragma_dependency,
parse_answer): Use _cpp_backup_tokens.
(run_directive, cpp_pop_buffer): Don't set bol, set saved_flags
instead. Don't check for EOL.
(do_include_common, do_line, do_pragma_system_header): Use
skip_rest_of_line.
* cpplib.h (BOL, _cpp_backup_tokens): New.
* cppmacro.c (save_lookahead_token, take_lookahead_token,
alloc_lookahead, free_lookahead, _cpp_free_lookaheads,
cpp_start_lookahead, cpp_stop_lookahead, _cpp_push_token): Remove.
(builtin_macro): Don't use cpp_get_line.
(cpp_get_line): Short term kludge.
(parse_arg): Handle directives in arguments here. Back up when
appropriate. Store EOF at end of argument list.
(funlike_invocation_p): Use _cpp_backup_tokens.
(push_arg_context): Account for EOF at end of list.
(cpp_get_token): Remove lookahead stuff. Update.
* gcc.dg/cpp/directiv.c: Update.
* gcc.dg/cpp/undef1.c: Update.
From-SVN: r45582
2001-09-14 04:05:17 +08:00
|
|
|
unsigned int lookaheads;
|
2001-09-11 15:00:12 +08:00
|
|
|
|
2002-09-22 10:03:17 +08:00
|
|
|
/* Nonzero prevents the lexer from re-using the token runs. */
|
2001-09-11 15:00:12 +08:00
|
|
|
unsigned int keep_tokens;
|
|
|
|
|
|
|
|
/* Buffer to hold macro definition string. */
|
|
|
|
unsigned char *macro_buffer;
|
|
|
|
unsigned int macro_buffer_len;
|
|
|
|
|
cppcharset.c (one_utf8_to_cppchar, [...]): New functions.
* cppcharset.c (one_utf8_to_cppchar, one_cppchar_to_utf8,
one_utf8_to_utf32, one_utf32_to_utf8, one_utf8_to_utf16,
one_utf16_to_utf8, conversion_loop, convert_utf8_utf16,
convert_utf8_utf32, convert_utf16_utf8, convert_utf32_utf8,
convert_no_conversion, convert_using_iconv): New functions.
(APPLY_CONVERSION): New macro.
(struct conversion, conversion_tab): New data structure.
(init_iconv_desc): Check conversion_tab for a custom conversion
primitive before trying to use iconv.
(convert_cset): Deleted.
(cpp_init_iconv): Use UTF- terminology, not UCS-.
(_cpp_destroy_iconv): Update to match.
(_cpp_valid_ucn): We don't need iconv to implement UCNs.
(convert_ucn): Use one_cppchar_to_utf8 and APPLY_CONVERSION.
(convert_escape, cpp_interpret_string): Use APPLY_CONVERSION.
(_cpp_interpret_string_notranslate): New function, moved here
from cpplib.c.
* cpphash.h (convert_f, struct cset_converter): New types.
(struct cpp_reader): narrow_cset_desc and wide_cset_desc
are now struct cset_converter, not bare iconv_t.
Update prototypes.
* cpplib.c (interpret_string_notranslate): Moved to cppcharset.c;
all callers changed.
From-SVN: r69204
2003-07-11 07:16:31 +08:00
|
|
|
/* Descriptor for converting from the source character set to the
|
|
|
|
execution character set. */
|
|
|
|
struct cset_converter narrow_cset_desc;
|
cpplib.h (CPP_AT_NAME, [...]): New token types.
* cpplib.h (CPP_AT_NAME, CPP_OBJC_STRING): New token types.
(struct cpp_options): Add narrow_charset, wide_charset,
bytes_big_endian fields. Remove EBCDIC field.
(cpp_init_iconv, cpp_interpret_string): New external interfaces.
* cpphash.h: Include <iconv.h> if we have it, otherwise
provide a dummy definition of iconv_t.
(struct cpp_reader): Add narrow_cset_desc and wide_cset_desc fields.
(_cpp_valid_ucn): Update prototype.
(_cpp_destroy_iconv): New prototype.
* doc/cpp.texi: Document character set handling.
* doc/cppopts.texi: Document -fexec-charset= and -fwide-exec-charset=.
* doc/extend.texi: Delete entire section on multiline strings.
Rewrite section on __FUNCTION__ etc now that these are
variables in C.
* cppucnid.tab, cppucnid.pl: New files.
* cppucnid.h: New generated file.
* cppcharset.c: Include cppucnid.h. Lots of commentary added.
(iconv_open, iconv, iconv_close): Provide dummy definitions
if !HAVE_ICONV.
(SOURCE_CHARSET, struct strbuf, init_iconv_desc, cpp_init_iconv,
_cpp_destroy_iconv, convert_cset, width_to_mask, convert_ucn,
emit_numeric_escape, convert_hex, convert_oct, convert_escape,
cpp_interpret_string, narrow_str_to_charconst,
wide_str_to_charconst): New.
(ucn_valid_in_identifier): Use a binary search through the
ucnranges table defined in cppucnid.h, not a long chain of if
statements.
(_cpp_valid_ucn): Add a limit pointer. Downgrade "universal
character names are only valid in C++ and C99" to a warning.
Issue the "meaning of \[uU] is different in traditional C"
warning here. Take care not to let iconv see an invalid UCS
value if we get a malformed UCN. Issue an error if we don't
have iconv.
(cpp_interpret_charconst): Moved here from cpplex.c. Use
cpp_interpret_string to do the heavy lifting.
* cppinit.c (cpp_create_reader): Initialize bytes_big_endian,
narrow_charset, wide_charset fields of options structure.
(cpp_destroy): Call _cpp_destroy_iconv.
* cpplex.c (forms_identifier_p): Adjust call to _cpp_valid_ucn.
(maybe_read_ucn, hex_digit_value, cpp_parse_escape): Delete.
(cpp_interpret_charconst): Moved to cppcharset.c.
* cpplib.c (dequote_string): Delete.
(interpret_string_notranslate): New.
(do_line, do_linemarker): Use interpret_string_notranslate.
* Makefile.in (cppcharset.o): Depend on cppucnid.h.
* c-common.c (fname_string, combine_strings): Delete.
* c-common.h (fname_string, combine_strings): Delete prototypes.
* c-lex.c (ignore_escape_flag): Delete.
(cb_ident): Use cpp_interpret_string, not lex_string.
(get_nonpadding_token): New function.
(c_lex): Handle Objective-C @-prefixed identifiers and strings here.
Adjust calls to lex_string. Don't write *value twice.
(lex_string): Now handles string constant concatenation.
Most of the work handed off to cpp_interpret_string.
Call fix_string_type here.
* c-parse.in (STRING_FUNC_NAME, VAR_FUNC_NAME): Replace with
FUNC_NAME, throughout.
(OBJC_STRING): New token type.
(primary:STRING): No need to call fix_string_type here.
(primary:objc_string): Make that OBJC_STRING.
(objc_string nonterminal): Delete.
(yylexname): Delete code to handle fake string constants.
(yylexstring): Delete entirely.
(_yylex): Handle CPP_AT_NAME and CPP_OBJC_STRING. No need
to handle CPP_ATSIGN.
* c.opt (-fexec-charset=, -fwide-exec-charset=): New options.
* c-opts.c (missing_arg, c_common_handle_option): Handle
OPT_fexec_charset_ and OPT_fwide_exec_charset_.
(c_common_init): Set cpp_opts->bytes_big_endian, not
cpp_opts->EBCDIC. Call cpp_init_iconv.
(print_help): Document -fexec-charset= and -fwide-exec-charset=.
(TARGET_EBCDIC): Delete default definition.
* objc/objc-act.c (build_objc_string_object): No need to
handle string constant concatenation.
cp:
* parser.c (cp_lexer_read_token): No need to handle string
constant concatenation.
testsuite:
* gcc.c-torture/execute/wchar_t-1.x: New file; XFAIL wchar_t-1.c
everywhere.
* gcc.dg/concat.c: Concatenation of string constants with
__FUNCTION__ / __PRETTY_FUNCTION__ is now a hard error.
* gcc.dg/wtr-strcat-1.c: Loosen dg-warning regexp.
* gcc.dg/cpp/escape-2.c: Use wide character constants where
necessary to avoid multi-character character constant warning.
* gcc.dg/cpp/escape.c: Likewise.
* gcc.dg/cpp/ucs.c: Likewise.
Remove backslashes from dg-bogus comments, as they confuse Tcl.
Fix a typo.
libstdc++-v3:
* testsuite/22_locale/collate/compare/wchar_t/2.cc
* testsuite/22_locale/collate/compare/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/compare/wchar_t/wrapped_locale.cc
* testsuite/22_locale/collate/hash/wchar_t/2.cc
* testsuite/22_locale/collate/hash/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/hash/wchar_t/wrapped_locale.cc
* testsuite/22_locale/collate/transform/wchar_t/2.cc
* testsuite/22_locale/collate/transform/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/transform/wchar_t/wrapped_locale.cc:
XFAIL on all targets.
From-SVN: r68952
2003-07-05 08:24:00 +08:00
|
|
|
|
2009-10-20 05:41:15 +08:00
|
|
|
/* Descriptor for converting from the source character set to the
|
|
|
|
UTF-8 execution character set. */
|
|
|
|
struct cset_converter utf8_cset_desc;
|
|
|
|
|
cpp-id-data.h (UC): Was U, conflicts with U...
libcpp/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* include/cpp-id-data.h (UC): Was U, conflicts with U... literal.
* include/cpplib.h (CHAR16, CHAR32, STRING16, STRING32): New tokens.
(struct cpp_options): Added uliterals.
(cpp_interpret_string): Update prototype.
(cpp_interpret_string_notranslate): Idem.
* charset.c (init_iconv_desc): New width member in cset_converter.
(cpp_init_iconv): Add support for char{16,32}_cset_desc.
(convert_ucn): Idem.
(emit_numeric_escape): Idem.
(convert_hex): Idem.
(convert_oct): Idem.
(convert_escape): Idem.
(converter_for_type): New function.
(cpp_interpret_string): Use converter_for_type, support u and U prefix.
(cpp_interpret_string_notranslate): Match changed prototype.
(wide_str_to_charconst): Use converter_for_type.
(cpp_interpret_charconst): Add support for CPP_CHAR{16,32}.
* directives.c (linemarker_dir): Macro U changed to UC.
(parse_include): Idem.
(register_pragma_1): Idem.
(restore_registered_pragmas): Idem.
(get__Pragma_string): Support CPP_STRING{16,32}.
* expr.c (eval_token): Support CPP_CHAR{16,32}.
* init.c (struct lang_flags): Added uliterals.
(lang_defaults): Idem.
* internal.h (struct cset_converter) <width>: New field.
(struct cpp_reader) <char16_cset_desc>: Idem.
(struct cpp_reader) <char32_cset_desc>: Idem.
* lex.c (digraph_spellings): Macro U changed to UC.
(OP, TK): Idem.
(lex_string): Add support for u'...', U'...', u... and U....
(_cpp_lex_direct): Idem.
* macro.c (_cpp_builtin_macro_text): Macro U changed to UC.
(stringify_arg): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
gcc/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* c-common.c (CHAR16_TYPE, CHAR32_TYPE): New macros.
(fname_as_string): Match updated cpp_interpret_string prototype.
(fix_string_type): Support char16_t* and char32_t*.
(c_common_nodes_and_builtins): Add char16_t and char32_t (and
derivative) nodes. Register as builtin if C++0x.
(c_parse_error): Support CPP_CHAR{16,32}.
* c-common.h (RID_CHAR16, RID_CHAR32): New elements.
(enum c_tree_index) <CTI_CHAR16_TYPE, CTI_SIGNED_CHAR16_TYPE,
CTI_UNSIGNED_CHAR16_TYPE, CTI_CHAR32_TYPE, CTI_SIGNED_CHAR32_TYPE,
CTI_UNSIGNED_CHAR32_TYPE, CTI_CHAR16_ARRAY_TYPE,
CTI_CHAR32_ARRAY_TYPE>: New elements.
(char16_type_node, signed_char16_type_node, unsigned_char16_type_node,
char32_type_node, signed_char32_type_node, char16_array_type_node,
char32_array_type_node): New defines.
* c-lex.c (cb_ident): Match updated cpp_interpret_string prototype.
(c_lex_with_flags): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
(lex_string): Support CPP_STRING{16,32}, match updated
cpp_interpret_string and cpp_interpret_string_notranslate prototypes.
(lex_charconst): Support CPP_CHAR{16,32}.
* c-parser.c (c_parser_postfix_expression): Support CPP_CHAR{16,32}
and CPP_STRING{16,32}.
gcc/cp/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* cvt.c (type_promotes_to): Support char16_t and char32_t.
* decl.c (grokdeclarator): Disallow signed/unsigned/short/long on
char16_t and char32_t.
* lex.c (reswords): Add char16_t and char32_t (for c++0x).
* mangle.c (write_builtin_type): Mangle char16_t/char32_t as vendor
extended builtin type u8char32_t.
* parser.c (cp_lexer_next_token_is_decl_specifier_keyword): Support
RID_CHAR{16,32}.
(cp_lexer_print_token): Support CPP_STRING{16,32}.
(cp_parser_is_string_literal): Idem.
(cp_parser_string_literal): Idem.
(cp_parser_primary_expression): Support CPP_CHAR{16,32} and
CPP_STRING{16,32}.
(cp_parser_simple_type_specifier): Support RID_CHAR{16,32}.
* tree.c (char_type_p): Support char16_t and char32_t as char types.
* typeck.c (string_conv_p): Support char16_t and char32_t.
gcc/testsuite/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
Tests for char16_t and char32_t support.
* g++.dg/ext/utf-cvt.C: New
* g++.dg/ext/utf-cxx0x.C: New
* g++.dg/ext/utf-cxx98.C: New
* g++.dg/ext/utf-dflt.C: New
* g++.dg/ext/utf-gnuxx0x.C: New
* g++.dg/ext/utf-gnuxx98.C: New
* g++.dg/ext/utf-mangle.C: New
* g++.dg/ext/utf-typedef-cxx0x.C: New
* g++.dg/ext/utf-typedef-
* g++.dg/ext/utf-typespec.C: New
* g++.dg/ext/utf16-1.C: New
* g++.dg/ext/utf16-2.C: New
* g++.dg/ext/utf16-3.C: New
* g++.dg/ext/utf16-4.C: New
* g++.dg/ext/utf32-1.C: New
* g++.dg/ext/utf32-2.C: New
* g++.dg/ext/utf32-3.C: New
* g++.dg/ext/utf32-4.C: New
* gcc.dg/utf-cvt.c: New
* gcc.dg/utf-dflt.c: New
* gcc.dg/utf16-1.c: New
* gcc.dg/utf16-2.c: New
* gcc.dg/utf16-3.c: New
* gcc.dg/utf16-4.c: New
* gcc.dg/utf32-1.c: New
* gcc.dg/utf32-2.c: New
* gcc.dg/utf32-3.c: New
* gcc.dg/utf32-4.c: New
libiberty/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* testsuite/demangle-expected: Added tests for char16_t and char32_t.
From-SVN: r134438
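As a hedged illustration of the literal forms this commit teaches the lexer (the `u` and `U` prefixes on character and string literals), the sketch below uses the spellings later standardized in C++11, which postdate this 2008 patch. The helper names `utf16_units` and `utf32_units` are invented for the example and are not part of GCC or libcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// The prefix selects the character type of the literal.
static_assert (std::is_same<decltype (u'a'), char16_t>::value,
               "u'...' has type char16_t");
static_assert (std::is_same<decltype (U'a'), char32_t>::value,
               "U'...' has type char32_t");

// Hypothetical helper: number of 16-bit code units in a u"..." string.
inline std::size_t utf16_units (const char16_t *s)
{
  return std::char_traits<char16_t>::length (s);
}

// Hypothetical helper: number of 32-bit code units in a U"..." string.
inline std::size_t utf32_units (const char32_t *s)
{
  return std::char_traits<char32_t>::length (s);
}
```

A code point outside the Basic Multilingual Plane, such as U+1F600, takes two code units (a surrogate pair) in a `u"..."` string and one in a `U"..."` string, which is why the converter needs to know the target width.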
2008-04-18 21:58:08 +08:00
|
|
|
/* Descriptor for converting from the source character set to the
   UTF-16 execution character set. */
struct cset_converter char16_cset_desc;
|
|
|
|
|
|
|
|
/* Descriptor for converting from the source character set to the
   UTF-32 execution character set. */
struct cset_converter char32_cset_desc;
|
|
|
|
|
cppcharset.c (one_utf8_to_cppchar, [...]): New functions.
* cppcharset.c (one_utf8_to_cppchar, one_cppchar_to_utf8,
one_utf8_to_utf32, one_utf32_to_utf8, one_utf8_to_utf16,
one_utf16_to_utf8, conversion_loop, convert_utf8_utf16,
convert_utf8_utf32, convert_utf16_utf8, convert_utf32_utf8,
convert_no_conversion, convert_using_iconv): New functions.
(APPLY_CONVERSION): New macro.
(struct conversion, conversion_tab): New data structure.
(init_iconv_desc): Check conversion_tab for a custom conversion
primitive before trying to use iconv.
(convert_cset): Deleted.
(cpp_init_iconv): Use UTF- terminology, not UCS-.
(_cpp_destroy_iconv): Update to match.
(_cpp_valid_ucn): We don't need iconv to implement UCNs.
(convert_ucn): Use one_cppchar_to_utf8 and APPLY_CONVERSION.
(convert_escape, cpp_interpret_string): Use APPLY_CONVERSION.
(_cpp_interpret_string_notranslate): New function, moved here
from cpplib.c.
* cpphash.h (convert_f, struct cset_converter): New types.
(struct cpp_reader): narrow_cset_desc and wide_cset_desc
are now struct cset_converter, not bare iconv_t.
Update prototypes.
* cpplib.c (interpret_string_notranslate): Moved to cppcharset.c;
all callers changed.
From-SVN: r69204
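The kind of primitive this commit adds (converting a single code point to UTF-8 without going through iconv) can be sketched as below. This is a minimal sketch, not libcpp's `one_cppchar_to_utf8`: the name `encode_utf8`, its signature, and the use of `std::string` as the output buffer are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: append the UTF-8 encoding of code point C to OUT.
// Returns false (appending nothing) for surrogates and for values above
// U+10FFFF, which have no valid UTF-8 encoding.
inline bool encode_utf8 (std::uint32_t c, std::string &out)
{
  if (c >= 0xD800 && c <= 0xDFFF)
    return false;

  if (c < 0x80)
    out += static_cast<char> (c);
  else if (c < 0x800)
    {
      out += static_cast<char> (0xC0 | (c >> 6));
      out += static_cast<char> (0x80 | (c & 0x3F));
    }
  else if (c < 0x10000)
    {
      out += static_cast<char> (0xE0 | (c >> 12));
      out += static_cast<char> (0x80 | ((c >> 6) & 0x3F));
      out += static_cast<char> (0x80 | (c & 0x3F));
    }
  else if (c <= 0x10FFFF)
    {
      out += static_cast<char> (0xF0 | (c >> 18));
      out += static_cast<char> (0x80 | ((c >> 12) & 0x3F));
      out += static_cast<char> (0x80 | ((c >> 6) & 0x3F));
      out += static_cast<char> (0x80 | (c & 0x3F));
    }
  else
    return false;

  return true;
}
```

Because the encoding is pure bit manipulation, a built-in primitive like this needs no iconv support, which fits the ChangeLog's note that UCNs no longer require iconv.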
2003-07-11 07:16:31 +08:00
|
|
|
/* Descriptor for converting from the source character set to the
   wide execution character set. */
struct cset_converter wide_cset_desc;
|
cpplib.h (CPP_AT_NAME, [...]): New token types.
* cpplib.h (CPP_AT_NAME, CPP_OBJC_STRING): New token types.
(struct cpp_options): Add narrow_charset, wide_charset,
bytes_big_endian fields. Remove EBCDIC field.
(cpp_init_iconv, cpp_interpret_string): New external interfaces.
* cpphash.h: Include <iconv.h> if we have it, otherwise
provide a dummy definition of iconv_t.
(struct cpp_reader): Add narrow_cset_desc and wide_cset_desc fields.
(_cpp_valid_ucn): Update prototype.
(_cpp_destroy_iconv): New prototype.
* doc/cpp.texi: Document character set handling.
* doc/cppopts.texi: Document -fexec-charset= and -fwide-exec-charset=.
* doc/extend.texi: Delete entire section on multiline strings.
Rewrite section on __FUNCTION__ etc now that these are
variables in C.
* cppucnid.tab, cppucnid.pl: New files.
* cppucnid.h: New generated file.
* cppcharset.c: Include cppucnid.h. Lots of commentary added.
(iconv_open, iconv, iconv_close): Provide dummy definitions
if !HAVE_ICONV.
(SOURCE_CHARSET, struct strbuf, init_iconv_desc, cpp_init_iconv,
_cpp_destroy_iconv, convert_cset, width_to_mask, convert_ucn,
emit_numeric_escape, convert_hex, convert_oct, convert_escape,
cpp_interpret_string, narrow_str_to_charconst,
wide_str_to_charconst): New.
(ucn_valid_in_identifier): Use a binary search through the
ucnranges table defined in cppucnid.h, not a long chain of if
statements.
(_cpp_valid_ucn): Add a limit pointer. Downgrade "universal
character names are only valid in C++ and C99" to a warning.
Issue the "meaning of \[uU] is different in traditional C"
warning here. Take care not to let iconv see an invalid UCS
value if we get a malformed UCN. Issue an error if we don't
have iconv.
(cpp_interpret_charconst): Moved here from cpplex.c. Use
cpp_interpret_string to do the heavy lifting.
* cppinit.c (cpp_create_reader): Initialize bytes_big_endian,
narrow_charset, wide_charset fields of options structure.
(cpp_destroy): Call _cpp_destroy_iconv.
* cpplex.c (forms_identifier_p): Adjust call to _cpp_valid_ucn.
(maybe_read_ucn, hex_digit_value, cpp_parse_escape): Delete.
(cpp_interpret_charconst): Moved to cppcharset.c.
* cpplib.c (dequote_string): Delete.
(interpret_string_notranslate): New.
(do_line, do_linemarker): Use interpret_string_notranslate.
* Makefile.in (cppcharset.o): Depend on cppucnid.h.
* c-common.c (fname_string, combine_strings): Delete.
* c-common.h (fname_string, combine_strings): Delete prototypes.
* c-lex.c (ignore_escape_flag): Delete.
(cb_ident): Use cpp_interpret_string, not lex_string.
(get_nonpadding_token): New function.
(c_lex): Handle Objective-C @-prefixed identifiers and strings here.
Adjust calls to lex_string. Don't write *value twice.
(lex_string): Now handles string constant concatenation.
Most of the work handed off to cpp_interpret_string.
Call fix_string_type here.
* c-parse.in (STRING_FUNC_NAME, VAR_FUNC_NAME): Replace with
FUNC_NAME, throughout.
(OBJC_STRING): New token type.
(primary:STRING): No need to call fix_string_type here.
(primary:objc_string): Make that OBJC_STRING.
(objc_string nonterminal): Delete.
(yylexname): Delete code to handle fake string constants.
(yylexstring): Delete entirely.
(_yylex): Handle CPP_AT_NAME and CPP_OBJC_STRING. No need
to handle CPP_ATSIGN.
* c.opt (-fexec-charset=, -fwide-exec-charset=): New options.
* c-opts.c (missing_arg, c_common_handle_option): Handle
OPT_fexec_charset_ and OPT_fwide_exec_charset_.
(c_common_init): Set cpp_opts->bytes_big_endian, not
cpp_opts->EBCDIC. Call cpp_init_iconv.
(print_help): Document -fexec-charset= and -fwide-exec-charset=.
(TARGET_EBCDIC): Delete default definition.
* objc/objc-act.c (build_objc_string_object): No need to
handle string constant concatenation.
cp:
* parser.c (cp_lexer_read_token): No need to handle string
constant concatenation.
testsuite:
* gcc.c-torture/execute/wchar_t-1.x: New file; XFAIL wchar_t-1.c
everywhere.
* gcc.dg/concat.c: Concatenation of string constants with
__FUNCTION__ / __PRETTY_FUNCTION__ is now a hard error.
* gcc.dg/wtr-strcat-1.c: Loosen dg-warning regexp.
* gcc.dg/cpp/escape-2.c: Use wide character constants where
necessary to avoid multi-character character constant warning.
* gcc.dg/cpp/escape.c: Likewise.
* gcc.dg/cpp/ucs.c: Likewise.
Remove backslashes from dg-bogus comments, as they confuse Tcl.
Fix a typo.
libstdc++-v3:
* testsuite/22_locale/collate/compare/wchar_t/2.cc
* testsuite/22_locale/collate/compare/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/compare/wchar_t/wrapped_locale.cc
* testsuite/22_locale/collate/hash/wchar_t/2.cc
* testsuite/22_locale/collate/hash/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/hash/wchar_t/wrapped_locale.cc
* testsuite/22_locale/collate/transform/wchar_t/2.cc
* testsuite/22_locale/collate/transform/wchar_t/wrapped_env.cc
* testsuite/22_locale/collate/transform/wchar_t/wrapped_locale.cc:
XFAIL on all targets.
From-SVN: r68952
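The `ucn_valid_in_identifier` entry above replaces a long chain of `if` statements with a binary search over a table of ranges. A minimal sketch of that technique follows. The table here is a small illustrative subset chosen for the example, not the contents of cppucnid.h, and the names `ucn_range`, `demo_ranges` and `in_ranges` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One inclusive range of code points.
struct ucn_range
{
  std::uint32_t lo, hi;
};

// Illustrative table only; it must be sorted and non-overlapping for
// the binary search to be correct.
static const ucn_range demo_ranges[] = {
  { 0x00C0, 0x00D6 },
  { 0x00D8, 0x00F6 },
  { 0x00F8, 0x00FF },
  { 0x0391, 0x03A1 },
  { 0x4E00, 0x9FA5 },
};

// Return true if C falls inside any range of demo_ranges.
inline bool in_ranges (std::uint32_t c)
{
  std::size_t lo = 0;
  std::size_t hi = sizeof demo_ranges / sizeof demo_ranges[0];
  while (lo < hi)
    {
      std::size_t mid = lo + (hi - lo) / 2;
      if (c < demo_ranges[mid].lo)
        hi = mid;        // C lies below this range
      else if (c > demo_ranges[mid].hi)
        lo = mid + 1;    // C lies above this range
      else
        return true;     // C lies inside this range
    }
  return false;
}
```

The lookup cost grows with the logarithm of the number of ranges, and the table can be generated from a data file rather than maintained by hand as code.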
2003-07-05 08:24:00 +08:00
|
|
|
|
2002-06-19 13:40:08 +08:00
|
|
|
/* Date and time text. Calculated together if either is requested. */
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *date;
|
|
|
|
const unsigned char *time;
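The comment above says the date and time text is calculated together if either is requested. A minimal sketch of that lazy pattern, assuming a hypothetical class `date_time_text` that is not libcpp's implementation, and using UTC via `std::gmtime` only so that the result is reproducible:

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Hypothetical illustration: both strings are produced on the first
// request for either one, then cached.
class date_time_text
{
  std::time_t stamp_;
  bool computed_ = false;
  std::string date_, time_;

  void compute ()
  {
    if (computed_)
      return;
    char buf[32];
    std::tm *tb = std::gmtime (&stamp_);
    // "Mmm dd yyyy" with a space-padded day, as __DATE__ is formatted.
    std::strftime (buf, sizeof buf, "%b %e %Y", tb);
    date_ = buf;
    std::strftime (buf, sizeof buf, "%H:%M:%S", tb);
    time_ = buf;
    computed_ = true;
  }

public:
  explicit date_time_text (std::time_t stamp) : stamp_ (stamp) {}

  bool computed () const { return computed_; }
  const std::string &date () { compute (); return date_; }
  const std::string &time () { compute (); return time_; }
};
```

Computing both from one time stamp keeps the two strings consistent with each other even if they are requested at different moments.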
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
|
2020-11-07 00:53:31 +08:00
|
|
|
/* Time stamp, set idempotently lazily. */
time_t time_stamp;
int time_stamp_kind; /* Or errno. */
|
2016-04-28 17:12:05 +08:00
|
|
|
|
2020-10-19 22:57:50 +08:00
|
|
|
/* A token forcing paste avoidance, and one demarking macro arguments. */
|
c-lex.c (cb_def_pragma): Update.
* c-lex.c (cb_def_pragma): Update.
(c_lex): Update, and skip padding.
* cppexp.c (lex, parse_defined): Update, remove unused variable.
* cpphash.h (struct toklist): Delete.
(union utoken): New.
(struct cpp_context): Update.
(struct cpp_reader): New members eof, avoid_paste.
(_cpp_temp_token): New.
* cppinit.c (cpp_create_reader): Update.
* cpplex.c (_cpp_temp_token): New.
(_cpp_lex_direct): Add PREV_WHITE when parsing args.
(cpp_output_token): Don't print leading whitespace.
(cpp_output_line): Update.
* cpplib.c (glue_header_name, parse_include, get__Pragma_string,
do_include_common, do_line, do_ident, do_pragma,
do_pragma_dependency, _cpp_do__Pragma, parse_answer,
parse_assertion): Update.
(get_token_no_padding): New.
* cpplib.h (CPP_PADDING): New.
(AVOID_LPASTE): Delete.
(struct cpp_token): New union member source.
(cpp_get_token): Update.
* cppmacro.c (macro_arg): Convert to use pointers to const tokens.
(builtin_macro, paste_all_tokens, paste_tokens, funlike_invocation_p,
replace_args, quote_string, stringify_arg, parse_arg, next_context,
enter_macro_context, expand_arg, _cpp_pop_context, cpp_scan_nooutput,
_cpp_backup_tokens, _cpp_create_definition): Update.
(push_arg_context): Delete.
(padding_token, push_token_context, push_ptoken_context): New.
(make_string_token, make_number_token): Update, rename.
(cpp_get_token): Update to handle tokens as pointers to const,
and insert padding appropriately.
* cppmain.c (struct printer): New member prev.
(check_multiline_token): Constify.
(do_preprocessing, cb_line_change): Update.
(scan_translation_unit): Update to handle spacing.
* scan-decls.c (get_a_token): New.
(skip_to_closing_brace, scan_decls): Update.
* fix-header.c (read_scan_file): Update.
* doc/cpp.texi: Update.
* gcc.dg/cpp/macro10.c: New test.
* gcc.dg/cpp/strify3.c: New test.
* gcc.dg/cpp/spacing1.c: Add tests.
* gcc.dg/cpp/19990703-1.c: Remove bogus test.
* gcc.dg/cpp/20000625-2.c: Fudge to pass.
From-SVN: r45793
2001-09-25 06:53:12 +08:00
|
|
|
cpp_token avoid_paste;
|
2020-10-19 22:57:50 +08:00
|
|
|
cpp_token endarg;
|
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* Opaque handle to the dependencies of mkdeps.cc. */
|
2019-07-10 02:32:49 +08:00
|
|
|
class mkdeps *deps;
|
|
|
|
|
|
|
|
/* Obstack holding all macro hash nodes. This never shrinks.
|
2022-01-14 23:57:02 +08:00
|
|
|
See identifiers.cc */
|
Makefile.in (OBJS, [...]): Update.
* Makefile.in (OBJS, LIBCPP_OBJS, LIBCPP_DEPS,
cpplib.o, cpphash.o, fix-header): Update.
(hashtable.o): New target.
* c-common.h: Include cpplib.h. Define C_RID_CODE and
struct c_common_identifier here.
* c-lang.c (c_init_options): Update. Call set_identifier_size.
* c-lex.c (c_lex): Update.
* c-pragma.h: Update.
* c-tree.h (struct lang_identifier): Contain c_common_identifier.
Delete rid_code.
(C_RID_CODE): Delete.
* cpphash.c: Rewrite to use hashtable.c.
* cpphash.h: Update include guards.
(struct cpp_reader): Remove hashtab.
hash_ob and buffer_ob are no longer pointers. Add hash_table
and our_hashtable.
(HASHSTEP, _cpp_init_hashtable, _cpp_lookup_with_hash): Delete.
(_cpp_cleanup_hashtable): Rename _cpp_destroy_hashtable.
(_cpp_cleanup_stacks): Rename _cpp_init_directives.
* cppinit.c (cpp_create_reader): Update.
* cpplex.c (cpp_ideq, parse_identifier, cpp_output_token): Update.
(cpp_interpret_charconst): Eliminate warning.
* cpplib.c (do_pragma, do_endif, push_conditional,
cpp_push_buffer, cpp_pop_buffer): Update.
(_cpp_init_stacks): Rename cpp_init_directives.
(_cpp_cleanup_stacks): Remove.
* cpplib.h: Update include guards. Include tree-core.h and c-rid.h.
(cpp_hashnode, cpp_token, NODE_LEN, NODE_NAME,
cpp_forall_identifiers, cpp_create_reader): Update.
(C_RID_CODE, cpp_make_node): New.
(c_common_identifier): New identifier node for C front ends.
* cppmain.c (main): Update.
* fix-header.c (read_scan_file): Update.
* flags.h (id_clash_len): Make unsigned.
* ggc.h (ggc_mark_nonnull_tree): New.
* hashtable.c: New.
* hashtable.h: New.
* stringpool.c: Update comments and copyright. Update to use
hashtable.c.
* toplev.c (approx_sqrt): Move to hashtable.c.
(id_clash_len): Make unsigned.
* toplev.h (ident_hash): New.
* tree.c (gcc_obstack_init): Move to hashtable.c.
* tree.h: Include hashtable.h.
(IDENTIFIER_POINTER, IDENTIFIER_LENGTH): Update.
(GCC_IDENT_TO_HT_IDENT, HT_IDENT_TO_GCC_IDENT): New.
(struct tree_identifier): Update.
(make_identifier): New.
cp:
* cp-tree.h (struct lang_identifier, C_RID_YYCODE): Update.
(C_RID_CODE): Remove.
* lex.c (cxx_init_options): Call set_identifier_size. Update.
(init_parse): Don't do it here.
objc:
* objc-act.c (objc_init_options): Call set_identifier_size. Update.
From-SVN: r42334
2001-05-20 14:26:45 +08:00
|
|
|
struct obstack hash_ob;
|
|
|
|
|
|
|
|
/* Obstack holding buffer and conditional structures. This is a
|
2022-01-14 23:57:02 +08:00
|
|
|
real stack. See directives.cc. */
|
|
|
|
struct obstack buffer_ob;
|
|
|
|
|
|
|
|
/* Pragma table - dynamic, because a library user can add to the
   list of recognized pragmas. */
struct pragma_entry *pragmas;
|
|
|
|
|
2003-06-26 05:01:10 +08:00
|
|
|
/* Call backs to cpplib client. */
|
|
|
|
struct cpp_callbacks cb;
|
|
|
|
|
2002-05-23 06:02:16 +08:00
|
|
|
/* Identifier hash table. */
|
|
|
|
struct ht *hash_table;
|
|
|
|
|
libcpp: Improve the diagnostic for poisoned identifiers [PR36887]
The PR requests an enhancement to the diagnostic issued for the use of a
poisoned identifier. Currently, we show the location of the usage, but not
the location which requested the poisoning, which would be helpful for the
user if the decision to poison an identifier was made externally, such as
in a library header.
In order to output this information, we need to remember a location_t for
each identifier that has been poisoned, and that data needs to be preserved
as well in a PCH. One option would be to add a field to struct cpp_hashnode,
but there is no convenient place to add it without increasing the size of
the struct for all identifiers. Given this facility will be needed rarely,
it seemed better to add a second hash map, which is handled PCH-wise the
same as the current one in gcc/stringpool.cc. This hash map associates a new
struct cpp_hashnode_extra with each identifier that needs one. Currently
that struct only contains the new location_t, but it could be extended in
the future if there is other ancillary data that may be convenient to put
there for other purposes.
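The side-table idea described above can be sketched as follows. This is an illustration only: it swaps in `std::unordered_map` for libcpp's own hash table, uses a plain `unsigned` in place of `location_t`, and the names `node_extra`, `poison_table`, `poison` and `lookup` are hypothetical. Keeping the first recorded location when an identifier is poisoned twice is also an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Rarely needed per-identifier data, kept out of the main node type so
// that ordinary identifiers do not grow.
struct node_extra
{
  unsigned poisoned_loc;   // stands in for location_t
};

class poison_table
{
  std::unordered_map<std::string, node_extra> extra_;

public:
  // Record where NAME was poisoned.  emplace leaves an existing entry
  // untouched, so the first location wins in this sketch.
  void poison (const std::string &name, unsigned loc)
  {
    extra_.emplace (name, node_extra{ loc });
  }

  // Return the extra data for NAME, or nullptr if it has none.
  const node_extra *lookup (const std::string &name) const
  {
    auto it = extra_.find (name);
    return it == extra_.end () ? nullptr : &it->second;
  }
};
```

Only identifiers that were actually poisoned pay for the extra storage. The diagnostic code can then look up the location when it reports a use and attach it as a note.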
libcpp/ChangeLog:
PR preprocessor/36887
* directives.cc (do_pragma_poison): Store in the extra hash map the
location from which an identifier has been poisoned.
* lex.cc (identifier_diagnostics_on_lex): When issuing a diagnostic
for the use of a poisoned identifier, also add a note indicating the
location from which it was poisoned.
* identifiers.cc (alloc_node): Convert to template function.
(_cpp_init_hashtable): Handle the new extra hash map.
(_cpp_destroy_hashtable): Likewise.
* include/cpplib.h (struct cpp_hashnode_extra): New struct.
(cpp_create_reader): Update prototype to...
* init.cc (cpp_create_reader): ...accept an argument for the extra
hash table and pass it to _cpp_init_hashtable.
* include/symtab.h (ht_lookup): New overload for convenience.
* internal.h (struct cpp_reader): Add EXTRA_HASH_TABLE member.
(_cpp_init_hashtable): Adjust prototype.
gcc/c-family/ChangeLog:
PR preprocessor/36887
* c-opts.cc (c_common_init_options): Pass new extra hash map
argument to cpp_create_reader().
gcc/ChangeLog:
PR preprocessor/36887
* toplev.h (ident_hash_extra): Declare...
* stringpool.cc (ident_hash_extra): ...this new global variable.
(init_stringpool): Handle ident_hash_extra as well as ident_hash.
(ggc_mark_stringpool): Likewise.
(ggc_purge_stringpool): Likewise.
(struct string_pool_data_extra): New struct.
(spd2): New GC root variable.
(gt_pch_save_stringpool): Use spd2 to handle ident_hash_extra,
analogous to how spd is used to handle ident_hash.
(gt_pch_restore_stringpool): Likewise.
gcc/testsuite/ChangeLog:
PR preprocessor/36887
* c-c++-common/cpp/diagnostic-poison.c: New test.
* g++.dg/pch/pr36887.C: New test.
* g++.dg/pch/pr36887.Hs: New test.
2023-09-08 05:02:47 +08:00
|
|
|
/* Identifier ancillary data hash table. */
struct ht *extra_hash_table;
|
|
|
|
|
2002-04-29 03:42:54 +08:00
|
|
|
/* Expression parser stack. */
struct op *op_stack, *op_limit;
|
|
|
|
|
|
|
|
/* User visible options. */
struct cpp_options opts;
|
|
|
|
|
|
|
|
/* Special nodes - identifiers with predefined significance to the
   preprocessor. */
struct spec_nodes spec_nodes;
|
|
|
|
|
|
|
|
/* Whether cpplib owns the hashtable. */
|
libcpp: Improve the diagnostic for poisoned identifiers [PR36887]
The PR requests an enhancement to the diagnostic issued for the use of a
poisoned identifier. Currently, we show the location of the usage, but not
the location which requested the poisoning, which would be helpful for the
user if the decision to poison an identifier was made externally, such as
in a library header.
In order to output this information, we need to remember a location_t for
each identifier that has been poisoned, and that data needs to be preserved
as well in a PCH. One option would be to add a field to struct cpp_hashnode,
but there is no convenient place to add it without increasing the size of
the struct for all identifiers. Given this facility will be needed rarely,
it seemed better to add a second hash map, which is handled PCH-wise the
same as the current one in gcc/stringpool.cc. This hash map associates a new
struct cpp_hashnode_extra with each identifier that needs one. Currently
that struct only contains the new location_t, but it could be extended in
the future if there is other ancillary data that may be convenient to put
there for other purposes.
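The side-table idea described above can be sketched as follows. This is a
minimal illustration, not the libcpp code: ident_tables, node_extra and
loc_t are hypothetical stand-ins for the real hash tables, struct
cpp_hashnode_extra and location_t, and a std::unordered_map is swapped in
for libcpp's own hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for location_t; 0 means "no location" here.
using loc_t = unsigned int;

// Hypothetical stand-in for struct cpp_hashnode_extra.
struct node_extra
{
  loc_t poisoned_loc = 0;  // Where the identifier was poisoned.
};

class ident_tables
{
public:
  // Record the poisoning location in the side table only, so that
  // ordinary identifiers do not grow by one field each.
  void poison (const std::string &name, loc_t where)
  {
    assert (where != 0);
    extra_[name].poisoned_loc = where;
  }

  // Return the poisoning location, or 0 if NAME was never poisoned.
  loc_t poisoned_at (const std::string &name) const
  {
    auto it = extra_.find (name);
    return it == extra_.end () ? 0 : it->second.poisoned_loc;
  }

  // Number of identifiers that actually carry extra data.
  std::size_t extra_count () const { return extra_.size (); }

private:
  std::unordered_map<std::string, node_extra> extra_;
};
```

Only identifiers that were poisoned get an entry, which is the trade-off
argued for above: a rare lookup in a second map instead of a larger node
for every identifier.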
libcpp/ChangeLog:
PR preprocessor/36887
* directives.cc (do_pragma_poison): Store in the extra hash map the
location from which an identifier has been poisoned.
* lex.cc (identifier_diagnostics_on_lex): When issuing a diagnostic
for the use of a poisoned identifier, also add a note indicating the
location from which it was poisoned.
* identifiers.cc (alloc_node): Convert to template function.
(_cpp_init_hashtable): Handle the new extra hash map.
(_cpp_destroy_hashtable): Likewise.
* include/cpplib.h (struct cpp_hashnode_extra): New struct.
(cpp_create_reader): Update prototype to...
* init.cc (cpp_create_reader): ...accept an argument for the extra
hash table and pass it to _cpp_init_hashtable.
* include/symtab.h (ht_lookup): New overload for convenience.
* internal.h (struct cpp_reader): Add EXTRA_HASH_TABLE member.
(_cpp_init_hashtable): Adjust prototype.
gcc/c-family/ChangeLog:
PR preprocessor/36887
* c-opts.cc (c_common_init_options): Pass new extra hash map
argument to cpp_create_reader().
gcc/ChangeLog:
PR preprocessor/36887
* toplev.h (ident_hash_extra): Declare...
* stringpool.cc (ident_hash_extra): ...this new global variable.
(init_stringpool): Handle ident_hash_extra as well as ident_hash.
(ggc_mark_stringpool): Likewise.
(ggc_purge_stringpool): Likewise.
(struct string_pool_data_extra): New struct.
(spd2): New GC root variable.
(gt_pch_save_stringpool): Use spd2 to handle ident_hash_extra,
analogous to how spd is used to handle ident_hash.
(gt_pch_restore_stringpool): Likewise.
gcc/testsuite/ChangeLog:
PR preprocessor/36887
* c-c++-common/cpp/diagnostic-poison.c: New test.
* g++.dg/pch/pr36887.C: New test.
* g++.dg/pch/pr36887.Hs: New test.
2023-09-08 05:02:47 +08:00
|
|
|
bool our_hashtable, our_extra_hashtable;
|
2002-05-18 04:16:48 +08:00
|
|
|
|
Makefile.in: Update cppmain.o.
* Makefile.in: Update cppmain.o.
* cpphash.h (struct cpp_reader): Move some members to a
nested structure.
(trad_line): Rename saved_line.
(_cpp_read_logical_line_trad): Update.
(_cpp_remove_overlay): New.
* cppinit.c (cpp_create_reader): No need to set saved_line.
(cpp_destroy): Update.
(cpp_read_main_file): Only overlay if compiling.
* cpplex.c (continue_after_nul): Return false if in directive.
* cpplib.c (EXPAND): New.
(directive_table, SEEN_EOL): Update.
(end_directive): Remove overlay if traditional; don't skip
line in traditional #define.
(prepare_directive_trad): New.
(_cpp_handle_directive, run_directive): Update for traditional
directives.
(lex_macro_node): Simplify, don't use lex_identifier_trad.
* cpplib.h (struct options): Add preprocess_only.
* cppmain.c: Don't include intl.h.
(cpp_preprocess_file): Set options->preprocess_only.
(scan_translation_unit_trad): Fix, and print line numbers.
* cpptrad.c (check_output_buffer, lex_identifier, scan_parameters,
maybe_start_funlike, scan_out_logical_line, replace_args_and_push,
save_replacement_text, _cpp_create_trad_definition): Update for
variable renaming.
(_cpp_overlay_buffer): Save line number.
(_cpp_remove_overlay): Rename from restore_buff, restore line.
(_cpp_read_logical_line_trad): Don't handle overlays here.
(scan_out_logical_line): Process directives.
From-SVN: r54485
2002-06-11 13:36:17 +08:00
|
|
|
/* Traditional preprocessing output buffer (a logical line). */
|
|
|
|
struct
|
|
|
|
{
|
2004-11-28 05:59:38 +08:00
|
|
|
unsigned char *base;
|
|
|
|
unsigned char *limit;
|
|
|
|
unsigned char *cur;
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t first_line;
|
2002-06-11 13:36:17 +08:00
|
|
|
} out;
|
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* Used for buffer overlays by traditional.cc. */
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *saved_cur, *saved_rlimit, *saved_line_base;
|
Represent column numbers using line-map's source_location.
The "next available source_location" is now managed internally by
line-maps.c rather than by clients.
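As a rough illustration of how a line and a column can share one integer
location, here is a simplified model. The field names start_location,
to_line and column_bits follow the ChangeLog entry; simple_line_map and
the helper functions are hypothetical and only sketch the arithmetic.

```cpp
#include <cassert>

// Simplified model of one line map (an assumption for illustration).
struct simple_line_map
{
  unsigned int start_location;  // Location of column 0 on line to_line.
  unsigned int to_line;         // First source line covered by the map.
  unsigned int column_bits;     // Low bits reserved for the column.
};

// Pack (line, column) into a single location value.
inline unsigned int
position_for (const simple_line_map &map, unsigned int line,
              unsigned int column)
{
  assert (line >= map.to_line);
  assert (column < (1u << map.column_bits));
  return map.start_location
         + ((line - map.to_line) << map.column_bits) + column;
}

// Recover the line number from a location.
inline unsigned int
source_line (const simple_line_map &map, unsigned int loc)
{
  return ((loc - map.start_location) >> map.column_bits) + map.to_line;
}

// Recover the column number from a location.
inline unsigned int
source_column (const simple_line_map &map, unsigned int loc)
{
  return (loc - map.start_location) & ((1u << map.column_bits) - 1);
}
```

With such an encoding a token needs only one location field, which is why
separate line and column fields become redundant.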
* line-map.h (struct line_map): New field column_bits.
<from_line>: Rename field to start_location.
(struct line_maps): New fields highest_location and max_column_hint.
(linemap_check_files_exited): New declaration.
(linemap_line_start): New declaration.
(linemap_add): Remove from_line parameter; use highest_location field.
(SOURCE_LINE, LAST_SOURCE_LINE): Modify to use column_bits.
(SOURCE_COLUMN, LAST_SOURCE_LINE_LOCATION): New macros.
(CURRENT_LINE_MAP): Remove macro.
(linemap_position_for_column): New inline function.
* line-map.c (linemap_init): Clear new fields.
(linemap_check_files_exited): New function, extracted from ...
(linemap_free): Use linemap_check_files_exited.
(linemap_add): Remove from_line parameter. Various updates.
(linemap_line_start): New function.
(linemap_lookup): Update for new field names.
* cpphash.h (struct cpp_reader) <map>: Field removed. Because
linemap_position_for_column may unpredictably change the current map,
it is cleaner and simpler for us to not cache it in cpp_reader.
(struct cpp_buffer): New sysp field.
Changed warned_cplusplus_comments and from_stage3 to bitfields.
* cppinit.c (cpp_read_main_file): pfile->map no longer exists.
* cpplib.c (do_line, do_linemarker, _cpp_do_file_change): Get
current map using linemap_lookup.
(do_linemarker): Also set buffer's sysp field.
(destringize_and_run): No longer need to decrement current line.
* cppfiles.c (_cpp_stack_file): Set sysp from and in buffer.
(search_path_head, open_file_failed): Use buffer's sysp.
(cpp_make_system_header): Get current map using linemap_lookup.
Also set buffer's sysp flag.
* cppmacro.c (_cpp_builtin_macro_text): Likewise use linemap_lookup.
* cpphash.h (CPP_INCREMENT_LINE): New macro.
(struct cpp_buffer): Moved fields saved_cur, saved_rlimit to ...
(struct cpp_reader): ... and adding saved_line_base field.
* cpptrad.c (_cpp_overlay_buffer, _cpp_remove_overlay):
Update accordingly. Don't adjust line.
(_cpp_scan_out_logical_line): Use CPP_INCREMENT_LINE.
* cpphash.c (CPP_IN_SYSTEM_HEADER): Replaced macro by ...
(cpp_in_system_header): ... new inline function, using buffer's sysp.
* cpperror.c (_cpp_begin_message): Update to use cpp_in_system_header.
* cpplex.c (_cpp_lex_direct): Likewise.
* cppmacro.c (_cpp_builtin_macro_text): Likewise.
* cppmacro.c (_cpp_create_definition): Use buffer's sysp field.
* cpplib.h (struct cpp_token): Rename line field to src_loc.
Remove col field as it is now subsumed by src_loc.
* cpperror.c: Update various field, parameter, and macro names.
(print_location): If col==0, try SOURCE_COLUMN of line.
(cpp_error): Use cur_token's src_loc field, rather than line+col.
* cpplib.c (do_diagnostic): Token's src_loc field replaces line+col.
* cpplex.c (_cpp_process_line_notes, _cpp_lex_direct,
_cpp_skip_block_comment): Use CPP_INCREMENT_LINE.
(_cpp_temp_token): Replace cpp_token's line+col fields by src_loc.
(_cpp_get_fresh_line): Don't need to adjust line for missing newline.
(_cpp_lex_direct): Use linemap_position_for_column.
* c-ppoutput.c (maybe_print_line, print_line): Don't take map
parameter. Instead get it from the line_table global. Adjust callers.
(print): Remove map field. Replace line field with src_line.
(init_pp_output, account_for_newlines, maybe_print_line): Adjust.
(cb_line_change): Use SOURCE_COLUMN. Minor optimizations.
(pp_file_change): Use MAIN_FILE_P since we cannot check print.map.
Use LAST_SOURCE_LINE_LOCATION to "catch up" after #include.
* cpptrad.c (copy_comment): Rename variable.
* c-lex.c (map): Remove static variable, for same reason we removed
cpp_reader's map field.
(cb_line_change, cb_def_pragma, cb_define, cb_undef): Hence we need
to call linemap_lookup.
(cb_line_change): Token's line field replaced by src_loc.
(fe_file_change): Use MAIN_FILE_P and LAST_SOURCE_LINE macros.
Don't save new_map.
* cpphash.h, cpperror.c, cpplib.h: Some renames of fileline to
source_location.
From-SVN: r77663
2004-02-11 23:29:30 +08:00
|
|
|
|
2003-01-10 10:22:34 +08:00
|
|
|
/* A saved list of the defined macros, for dependency checking
|
|
|
|
of precompiled headers. */
|
|
|
|
struct cpp_savedstate *savedstate;
|
2007-05-25 04:55:36 +08:00
|
|
|
|
|
|
|
/* Next value of __COUNTER__ macro. */
|
|
|
|
unsigned int counter;
|
2008-10-05 20:35:36 +08:00
|
|
|
|
|
|
|
/* Table of comments, when state.save_comments is true. */
|
|
|
|
cpp_comment_table comments;
|
2009-11-12 02:37:19 +08:00
|
|
|
|
|
|
|
/* List of saved macros by push_macro. */
|
|
|
|
struct def_pragma_macro *pushed_macros;
|
2011-08-23 04:41:07 +08:00
|
|
|
|
2018-10-31 23:26:28 +08:00
|
|
|
/* If non-zero, the lexer will use this location for the next token
|
2011-08-23 04:41:07 +08:00
|
|
|
instead of getting a location from the linemap. */
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t forced_token_location;
|
2020-11-19 23:00:51 +08:00
|
|
|
|
|
|
|
/* Location identifying the main source file -- intended to be line
|
|
|
|
zero of said file. */
|
|
|
|
location_t main_loc;
|
libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."
More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/
This is not a compiler bug. However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.
The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. a LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.
This patch handles both UCNs and UTF-8 characters. UCNs designating
bidi characters in identifiers are accepted since r204886. Then r217144
enabled -fextended-identifiers by default. Extended characters in C/C++
identifiers have been accepted since r275979. However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.
We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers. Expectedly, UCNs are ignored
in comments and raw string literals. The bidirectional control characters
can nest so this patch handles that as well.
I have not included nor tested this at all with Fortran (which also has
string literals and line comments).
Dave M. posted patches improving diagnostics involving Unicode characters.
This patch does not make use of this new infrastructure yet.
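The difference between the =unpaired and =any levels can be sketched with
a small scanner over UTF-8 text. This is a simplified model under stated
assumptions, not the lex.c implementation: it ignores UCNs, treats its
input as one context (a single comment, literal or identifier), and the
names bidi_level, bidi_kind, classify and should_warn are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum bidi_level { bidi_none, bidi_unpaired, bidi_any };
enum bidi_kind { not_bidi, open_embed, close_embed,
                 open_isolate, close_isolate };

// Classify the 3-byte UTF-8 sequence starting at s[i], if any.
static bidi_kind
classify (const std::string &s, std::size_t i)
{
  assert (i < s.size ());
  if (i + 2 >= s.size () || (unsigned char) s[i] != 0xE2)
    return not_bidi;
  unsigned char b1 = s[i + 1], b2 = s[i + 2];
  if (b1 == 0x80)
    {
      if (b2 == 0xAA || b2 == 0xAB || b2 == 0xAD || b2 == 0xAE)
        return open_embed;      // LRE, RLE, LRO, RLO
      if (b2 == 0xAC)
        return close_embed;     // PDF
    }
  else if (b1 == 0x81)
    {
      if (b2 >= 0xA6 && b2 <= 0xA8)
        return open_isolate;    // LRI, RLI, FSI
      if (b2 == 0xA9)
        return close_isolate;   // PDI
    }
  return not_bidi;
}

// Return true if TEXT should be warned about at the given level.
bool
should_warn (const std::string &text, bidi_level level)
{
  if (level == bidi_none)
    return false;
  std::vector<bidi_kind> open;  // Stack of unterminated openers.
  for (std::size_t i = 0; i < text.size (); ++i)
    {
      bidi_kind k = classify (text, i);
      if (k == not_bidi)
        continue;
      if (level == bidi_any)
        return true;            // Any control character is enough.
      if (k == open_embed || k == open_isolate)
        open.push_back (k);
      else if (k == close_embed)
        {
          // PDF only closes an embedding or override on top.
          if (!open.empty () && open.back () == open_embed)
            open.pop_back ();
        }
      else
        {
          // PDI closes the innermost isolate and anything nested in it;
          // it is ignored when no isolate is open.
          std::size_t j = open.size ();
          while (j > 0 && open[j - 1] != open_isolate)
            --j;
          if (j > 0)
            open.resize (j - 1);
        }
      i += 2;                   // Skip the rest of the 3-byte sequence.
    }
  return !open.empty ();        // Something was left unterminated.
}
```

The stack is what handles nesting: an opener left on it when the context
ends is what the =unpaired level reports.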
PR preprocessor/103026
gcc/c-family/ChangeLog:
* c.opt (Wbidi-chars, Wbidi-chars=): New option.
gcc/ChangeLog:
* doc/invoke.texi: Document -Wbidi-chars.
libcpp/ChangeLog:
* include/cpplib.h (enum cpp_bidirectional_level): New.
(struct cpp_options): Add cpp_warn_bidirectional.
(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
* internal.h (struct cpp_reader): Add warn_bidi_p member
function.
* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
* lex.c (bidi): New namespace.
(get_bidi_utf8): New function.
(get_bidi_ucn): Likewise.
(maybe_warn_bidi_on_close): Likewise.
(maybe_warn_bidi_on_char): Likewise.
(_cpp_skip_block_comment): Implement warning about bidirectional
control characters.
(skip_line_comment): Likewise.
(forms_identifier_p): Likewise.
(lex_identifier): Likewise.
(lex_string): Likewise.
(lex_raw_string): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/Wbidi-chars-1.c: New test.
* c-c++-common/Wbidi-chars-2.c: New test.
* c-c++-common/Wbidi-chars-3.c: New test.
* c-c++-common/Wbidi-chars-4.c: New test.
* c-c++-common/Wbidi-chars-5.c: New test.
* c-c++-common/Wbidi-chars-6.c: New test.
* c-c++-common/Wbidi-chars-7.c: New test.
* c-c++-common/Wbidi-chars-8.c: New test.
* c-c++-common/Wbidi-chars-9.c: New test.
* c-c++-common/Wbidi-chars-10.c: New test.
* c-c++-common/Wbidi-chars-11.c: New test.
* c-c++-common/Wbidi-chars-12.c: New test.
* c-c++-common/Wbidi-chars-13.c: New test.
* c-c++-common/Wbidi-chars-14.c: New test.
* c-c++-common/Wbidi-chars-15.c: New test.
* c-c++-common/Wbidi-chars-16.c: New test.
* c-c++-common/Wbidi-chars-17.c: New test.
2021-10-07 02:33:59 +08:00
|
|
|
|
2024-03-23 00:55:27 +08:00
|
|
|
/* If non-zero, override diagnostic locations (other than DK_NOTE
|
|
|
|
diagnostics) to this one. */
|
|
|
|
location_t diagnostic_override_loc;
|
|
|
|
|
2021-10-07 02:33:59 +08:00
|
|
|
/* Returns true iff we should warn about UTF-8 bidirectional control
|
|
|
|
characters. */
|
|
|
|
bool warn_bidi_p () const
|
|
|
|
{
|
2022-01-20 08:05:22 +08:00
|
|
|
return (CPP_OPTION (this, cpp_warn_bidirectional)
|
|
|
|
& (bidirectional_unpaired|bidirectional_any));
|
2021-10-07 02:33:59 +08:00
|
|
|
}
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
};
|
|
|
|
|
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (several gigabytes large, say; that is still
infeasible in terms of compile time and memory).
There are 2 reasons for this. One is that I think the way it is
implemented now in the patch is how we should handle the smaller #embed
sizes; I don't know where the boundary should be, whether 32 bytes or 64
or something like that. Handling the single byte cases well is certainly
desirable, since a single byte #embed can appear anywhere in the source
where a constant integer literal can appear, and I think for a few bytes
it isn't worth coming up with something smarter. Users would also like to
e.g. see it readably in -E output (perhaps the slow vs. fast boundary
should be determined by a command line option). The other reason is to be
able to more easily find regressions in behavior caused by the
optimizations, so we have something to go back to in git to compare
against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which was even committed to LLVM trunk for a few hours but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the spots mentioned below has been
fixed in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
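The "mere sequence of numbers" behavior discussed above can be modelled
with a small helper. This is only a sketch of the conceptual expansion:
expand_embed and its parameters are hypothetical, it works on an
in-memory string rather than a file, and it is not how libcpp builds its
tokens.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: render what "#embed" conceptually expands to
// for the bytes in DATA, honoring limit, prefix, suffix and if_empty.
std::string
expand_embed (const std::string &data, std::size_t limit,
              const std::string &prefix, const std::string &suffix,
              const std::string &if_empty)
{
  std::size_t n = data.size () < limit ? data.size () : limit;
  if (n == 0)
    return if_empty;            // prefix and suffix are not used.
  std::string out = prefix;
  for (std::size_t i = 0; i < n; ++i)
    {
      if (i != 0)
        out += ",";
      // Each byte becomes a plain integer constant, with no cast and
      // no surrounding parentheses.
      out += std::to_string (
        static_cast<unsigned int> (static_cast<unsigned char> (data[i])));
    }
  return out + suffix;
}
```

For the first four bytes of a file starting with "void", a prefix of
"172 + " and a suffix of " + 2", this yields 172 + 118,111,105,100 + 2,
that is four function arguments rather than one parenthesized comma
expression.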
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
/* Lists of tokens for #embed/__has_embed prefix/suffix/if_empty
   parameters.  */
struct cpp_embed_params_tokens
{
  cpp_token *cur_token;
  tokenrun base_run, *cur_run;
  size_t count;
};

/* #embed and __has_embed parameters.  */
struct cpp_embed_params
{
  location_t loc;
  bool has_embed;
2024-09-12 17:34:06 +08:00
  cpp_num_part limit, offset;
libcpp, v2: Add support for gnu::base64 #embed parameter
This patch adds another #embed extension, gnu::base64.
As mentioned in the documentation, this extension is primarily
intended for use by the preprocessor, so that for larger embeds (say 32+ or
64+ bytes long) it doesn't have to emit tens of thousands or
millions of comma separated integer literals which would be very expensive
to parse again, but can emit
#embed "." __gnu__::__base64__( \
"Tm9uIGVyYW0gbsOpc2NpdXMsIEJydXRlLCBjdW0sIHF1w6Ygc3VtbWlzIGluZ8OpbmlpcyBleHF1" \
"aXNpdMOhcXVlIGRvY3Ryw61uYSBwaGlsw7Nzb3BoaSBHcsOmY28gc2VybcOzbmUgdHJhY3RhdsOt" \
"c3NlbnQsIGVhIExhdMOtbmlzIGzDrXR0ZXJpcyBtYW5kYXLDqW11cywgZm9yZSB1dCBoaWMgbm9z" \
"dGVyIGxhYm9yIGluIHbDoXJpYXMgcmVwcmVoZW5zacOzbmVzIGluY8O6cnJlcmV0LiBuYW0gcXVp" \
"YsO6c2RhbSwgZXQgaWlzIHF1aWRlbSBub24gw6FkbW9kdW0gaW5kw7NjdGlzLCB0b3R1bSBob2Mg" \
"ZMOtc3BsaWNldCBwaGlsb3NvcGjDoXJpLiBxdWlkYW0gYXV0ZW0gbm9uIHRhbSBpZCByZXByZWjD" \
"qW5kdW50LCBzaSByZW3DrXNzaXVzIGFnw6F0dXIsIHNlZCB0YW50dW0gc3TDumRpdW0gdGFtcXVl" \
"IG11bHRhbSDDs3BlcmFtIHBvbsOpbmRhbSBpbiBlbyBub24gYXJiaXRyw6FudHVyLiBlcnVudCDD" \
"qXRpYW0sIGV0IGlpIHF1aWRlbSBlcnVkw610aSBHcsOmY2lzIGzDrXR0ZXJpcywgY29udGVtbsOp" \
"bnRlcyBMYXTDrW5hcywgcXVpIHNlIGRpY2FudCBpbiBHcsOmY2lzIGxlZ8OpbmRpcyDDs3BlcmFt" \
"IG1hbGxlIGNvbnPDum1lcmUuIHBvc3Ryw6ltbyDDoWxpcXVvcyBmdXTDunJvcyBzw7pzcGljb3Is" \
"IHF1aSBtZSBhZCDDoWxpYXMgbMOtdHRlcmFzIHZvY2VudCwgZ2VudXMgaG9jIHNjcmliw6luZGks" \
"IGV0c2kgc2l0IGVsw6lnYW5zLCBwZXJzw7Nuw6YgdGFtZW4gZXQgZGlnbml0w6F0aXMgZXNzZSBu" \
"ZWdlbnQu")
with the meaning don't actually load some file, instead base64 decode
(RFC4648 with A-Za-z0-9+/ chars and = padding, no newlines in between)
the string and use that as data. This is chosen because it should be
-pedantic-errors clean, fairly cheap to decode and then in optimizing
compiler could be handled as similar binary blob to normal #embed,
while the data isn't left somewhere on the disk, so distcc/ccache etc.
can move the preprocessed source without issues.
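As a rough illustration of the decoding step described here, the following is a minimal sketch of an RFC 4648 decoder that accepts only A-Za-z0-9+/ with trailing = padding and rejects anything else, including newlines. It is not libcpp's finish_base64_embed; the names b64_value and b64_decode are hypothetical, and an ASCII-compatible character set is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if it is not in the
   RFC 4648 alphabet.  Assumes an ASCII-compatible execution character set.  */
static int
b64_value (unsigned char c)
{
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

/* Hypothetical helper, not libcpp's finish_base64_embed.  Decode the
   NUL-terminated string IN into OUT, which must have room for
   strlen (in) / 4 * 3 bytes.  Return the number of bytes written,
   or -1 if IN is malformed.  */
static long
b64_decode (const char *in, unsigned char *out)
{
  assert (in != NULL && out != NULL);
  size_t len = strlen (in);
  size_t n = 0;
  if (len % 4 != 0)
    return -1;
  for (size_t i = 0; i < len; i += 4)
    {
      unsigned long group = 0;
      int pad = 0;
      for (int j = 0; j < 4; j++)
        {
          unsigned char c = (unsigned char) in[i + j];
          int v = 0;
          if (c == '=')
            {
              /* Padding may only end the final group.  */
              if (i + 4 != len || j < 2)
                return -1;
              pad++;
            }
          else if (pad != 0 || (v = b64_value (c)) < 0)
            return -1;
          group = (group << 6) | (unsigned long) v;
        }
      /* Each 4-character group carries 24 bits, minus 8 per pad char.  */
      out[n++] = (unsigned char) ((group >> 16) & 0xff);
      if (pad < 2)
        out[n++] = (unsigned char) ((group >> 8) & 0xff);
      if (pad < 1)
        out[n++] = (unsigned char) (group & 0xff);
    }
  return (long) n;
}
```

Decoding the first four characters of the example string, "Tm9u", with this sketch yields the three bytes "Non". The sketch does not check that unused padding bits are zero, which a stricter decoder might want to diagnose.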
It makes no sense to support limit and gnu::offset parameters together
with it IMHO, why would somebody bother providing the full data and then
throw some of it away?  prefix/suffix/if_empty are normally supported though,
but not intended to be used by the preprocessor.
This patch adds just the extension side, not the actual emitting of this
during -E or -E -fdirectives-only for now, that will be included in the
upcoming patch.
Compared to the earlier posted version of this extension, this patch
allows the string concatenation in the parameter argument (but still
doesn't allow escapes in the string, why would anyone use them when
only A-Za-z0-9+/= are valid). The patch also adds support for parsing
this even in -fpreprocessed compilation.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
libcpp/
* internal.h (struct cpp_embed_params): Add base64 member.
(_cpp_free_embed_params_tokens): Declare.
* directives.cc (DIRECTIVE_TABLE): Add IN_I flag to T_EMBED.
(save_token_for_embed, _cpp_free_embed_params_tokens): New functions.
(EMBED_PARAMS): Add gnu::base64 entry.
(_cpp_parse_embed_params): Parse gnu::base64 parameter. If
-fpreprocessed without -fdirectives-only, require #embed to have
gnu::base64 parameter. Diagnose conflict between gnu::base64 and
limit or gnu::offset parameters.
(do_embed): Use _cpp_free_embed_params_tokens.
* files.cc (finish_embed, base64_dec_fn): New functions.
(base64_dec): New array.
(B64D0, B64D1, B64D2, B64D3): Define.
(finish_base64_embed): New function.
(_cpp_stack_embed): Use finish_embed. Handle params->base64
using finish_base64_embed.
* macro.cc (builtin_has_embed): Call _cpp_free_embed_params_tokens.
gcc/
* doc/cpp.texi (Binary Resource Inclusion): Document gnu::base64
parameter.
gcc/testsuite/
* c-c++-common/cpp/embed-17.c: New test.
* c-c++-common/cpp/embed-18.c: New test.
* c-c++-common/cpp/embed-19.c: New test.
* c-c++-common/cpp/embed-27.c: New test.
* gcc.dg/cpp/embed-6.c: New test.
* gcc.dg/cpp/embed-7.c: New test.
2024-09-13 00:17:05 +08:00
  cpp_embed_params_tokens prefix, suffix, if_empty, base64;
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (like several gigabytes large; that is still infeasible
in compile time and memory).
There are 2 reasons for this. One is that I think the way it is implemented
now in the patch is how we should handle the smaller #embed sizes
(dunno with which boundary, whether 32 bytes or 64 or something like that).
Certainly handling the single byte cases, which can appear anywhere in the
source where a constant integer literal can appear, is desirable, and I
think for a few bytes it isn't worth coming up with something smarter;
users would also like to e.g. see it readably in -E output (perhaps the
slow vs. fast boundary should be determined by a command line option).
The other one is to be able to more easily find regressions in behavior
caused by the optimizations, so we have something to go back to in git
to compare against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots have been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
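The decision described in this paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in _cpp_stack_embed: the enum resource_kind classification and the function name has_embed_sketch are hypothetical, and the three result values are the ones C23 assigns to __STDC_EMBED_NOT_FOUND__, __STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.

```c
#include <assert.h>

/* Values C23 assigns to the three predefined macros.  */
enum { EMBED_NOT_FOUND = 0, EMBED_FOUND = 1, EMBED_EMPTY = 2 };

/* Hypothetical classification of what stat () reported for the resource.
   RES_STREAM stands for character devices, named pipes and the like.  */
enum resource_kind { RES_MISSING, RES_REGULAR, RES_STREAM };

static int
has_embed_sketch (enum resource_kind kind, unsigned long long st_size,
                  unsigned long long limit)
{
  assert (kind == RES_MISSING || kind == RES_REGULAR || kind == RES_STREAM);
  if (kind == RES_MISSING)
    return EMBED_NOT_FOUND;
  /* Nothing is read from non-regular files, so the answer is fixed at 2.  */
  if (kind == RES_STREAM)
    return EMBED_EMPTY;
  /* Regular files: decide from the size and the limit alone.  */
  return (st_size == 0 || limit == 0) ? EMBED_EMPTY : EMBED_FOUND;
}
```

A regular 10-byte file with limit (1) gives 1 in this sketch, the same file with limit (0) gives 2, and a character device gives 2 regardless of what a read would return.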
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}})) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
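To make the notion of a balanced token sequence concrete, here is a minimal character-level sketch of a bracket matcher. It is not skip_balanced_token_seq from the patch: the name is_balanced is hypothetical, it walks characters rather than preprocessing tokens (so brackets inside string literals would be miscounted), and the 256-level nesting limit is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if every (, [ and { in S is closed by the matching bracket in
   the right order, else 0.  */
static int
is_balanced (const char *s)
{
  char stack[256];   /* arbitrary nesting limit for the sketch */
  size_t depth = 0;
  assert (s != NULL);
  for (; *s != '\0'; s++)
    {
      char want = 0;
      switch (*s)
        {
        case '(': case '[': case '{':
          if (depth == sizeof stack)
            return 0;
          stack[depth++] = *s;
          continue;
        case ')': want = '('; break;
        case ']': want = '['; break;
        case '}': want = '{'; break;
        default:
          continue;   /* any other character is irrelevant here */
        }
      /* A closer must match the most recently opened bracket.  */
      if (depth == 0 || stack[--depth] != want)
        return 0;
    }
  return depth == 0;
}
```

The if_empty argument from the test passes this check, while a mis-nested sequence such as "([)]" does not.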
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
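The difference between the two readings can be reproduced without #embed support by letting an ordinary macro stand in for the expanded directive. The sketch below assumes the four bytes from the example; EMBEDDED_BYTES, token_sequence_reading and parenthesized_reading are hypothetical names, and foo takes an extra output array so the received arguments can be inspected.

```c
#include <assert.h>

/* Hypothetical stand-in for the expansion of `#embed __FILE__ limit (4)':
   a bare comma-separated token sequence, not a parenthesized expression.  */
#define EMBEDDED_BYTES 118, 111, 105, 100

static void
foo (int out[4], int a, int b, int c, int d)
{
  assert (out != 0);
  out[0] = a; out[1] = b; out[2] = c; out[3] = d;
}

/* Token-sequence reading: prefix and suffix attach to the first and last
   byte, and the call receives four arguments.  */
static void
token_sequence_reading (int out[4])
{
  foo (out, 172 + EMBEDDED_BYTES + 2);
}

/* Parenthesized reading: the commas become comma operators, so only the
   last byte survives.  Compilers may warn about the discarded operands.  */
static int
parenthesized_reading (void)
{
  return 172 + (EMBEDDED_BYTES) + 2;
}
```

With the token-sequence reading foo receives 290, 111, 105 and 102, whereas the parenthesized reading collapses to the single value 274.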
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
2024-09-12 17:15:38 +08:00
};

safe-ctype.h: New file.
include:
* safe-ctype.h: New file.
libiberty:
* safe-ctype.c: New file.
* Makefile.in (CFILES): Add safe-ctype.c.
(REQUIRED_OFILES): Add safe-ctype.o.
* argv.c: Define ISBLANK and use it, not isspace.
* basename.c, cplus-dem.c, fnmatch.c, pexecute.c, strtod.c,
strtol.c, strtoul.c: Include safe-ctype.h, not ctype.h. Use
uppercase ctype macros. Don't test ISUPPER(c)/ISLOWER(c)
before calling TOLOWER(c)/TOUPPER(c).
gcc:
* Makefile.in (HOST_RTL): Add safe-ctype.o.
(safe-ctype.o): New rule.
* system.h: Include safe-ctype.h, not ctype.h. No need to
wrap ctype macros.
* cpphash.h: Zap IStable and related macros. Define is_* in
terms of safe-ctype.h macros.
* cppinit.c: Delete the IStable and all related code.
* tradcpp.c: Delete is_idchar, is_idstart, is_hor_space, and
is_space arrays. Delete initialize_char_syntax. Change all
references to the above arrays to use macros instead.
* tradcpp.h: Define is_idchar, is_idstart, is_space, and
is_nvspace in terms of safe_ctype.h's macros.
* tradcif.y: is_idchar, is_idstart are macros not arrays.
* config/i370/i370.c, config/winnt/dirent.c,
config/winnt/fixinc-nt.c, config/winnt/ld.c:
Use uppercase ctype macros. If we included ctype.h,
include safe-ctype.h instead.
* fixinc/fixfixes.c: Use uppercase ctype macros. Don't test
ISLOWER(c) before calling TOUPPER(c).
* fixinc/fixincl.c (extract_quoted_files): Simplify out some gunk.
* fixinc/gnu-regex.c: Include safe-ctype.h, not ctype.h. No need to
wrap ctype macros. Don't test ISUPPER(x) before calling TOLOWER(x).
gcc/ch:
* lex.c: Don't bother checking whether ISUPPER(c) before
calling TOLOWER(c). Don't bother checking whether isascii(c)
before testing ISSPACE(c); ISSPACE(c) includes '\n'.
gcc/f:
* Make-lang.in: Link f/fini with safe-ctype.o.
* bad.c: Don't test ISUPPER(c) || ISLOWER(c) before calling TOUPPER(c).
* com.c: Use TOUPPER, not ffesrc_toupper.
* fini.c: Don't test ISALPHA(c) before calling TOUPPER(c)/TOLOWER(c).
* intrin.c: Don't test IN_CTYPE_DOMAIN(c).
* src.c: Delete ffesrc_toupper_ and ffesrc_tolower_ and their
initializing code; use TOUPPER and TOLOWER instead of
ffesrc_toupper and ffesrc_tolower.
* src.h: Don't declare ffesrc_toupper_ or ffesrc_tolower_.
Don't define ffesrc_toupper or ffesrc_tolower.
gcc/java:
* jvgenmain.c: Use ISPRINT not isascii.
From-SVN: r38124
2000-12-08 11:00:26 +08:00
/* Character classes.  Based on the more primitive macros in safe-ctype.h.
   If the definition of `numchar' looks odd to you, please look up the
cpphash.h: ISvspace, is_vspace, is_nvspace: New.
* cpphash.h: ISvspace, is_vspace, is_nvspace: New.
IShspace, ISspace: Update.
* cppinit.c: ISTABLE: Update.
V: New.
* cpplex.c (IS_HSPACE, S_NEWLINE): Remove.
(IS_DIRECTIVE): Rename KNOWN_DIRECTIVE.
(skip_block_comment, skip_line_comment, parse_string,
lex_line): Use is_vspace rather than IS_NEWLINE.
(skip_whitespace, lex_line): Clean up to use is_nvspace.
(lex_line): Use KNOWN_DIRECTIVE. Any kind of directive
gets a BOL flag.
(lex_next): Unconditionally stop if within a directive.
Treat directives within macro invocations as directives
(after parse_args emits error), not as the argument.
* testsuite/gcc.dg/cpp/directiv.c: New tests.
* testsuite/gcc.dg/cpp/undef1.c: Update.
From-SVN: r34933
2000-07-09 17:19:44 +08:00
   definition of a pp-number in the C standard [section 6.4.8 of C99].

   In the unlikely event that characters other than \r and \n enter
2022-01-14 23:57:02 +08:00
   the set is_vspace, the macro handle_newline() in lex.cc must be
|
|
|
updated. */
|
cpplib.h: Merge struct cpp_options into struct cpp_reader.
* cpplib.h: Merge struct cpp_options into struct cpp_reader.
Reorder struct cpp_options and struct cpp_reader for better
packing. Replace CPP_OPTIONS macro with CPP_OPTION which
takes two args. Change all 'char' flags to 'unsigned char'.
Move show_column flag into struct cpp_options. Don't
prototype cpp_options_init.
* cpphash.h, cpperror.c, cppexp.c, cppfiles.c, cpphash.c,
cppinit.c, cpplex.c, cpplib.c:
Replace CPP_OPTIONS (pfile)->whatever with
CPP_OPTION (pfile, whatever), and likewise for
opts = CPP_OPTIONS (pfile); ... opts->whatever;
* cppinit.c (merge_include_chains): Take a cpp_reader *.
Extract CPP_OPTION (pfile, pending) and work with that
directly.
(cpp_options_init): Delete.
(cpp_reader_init): Turn on on-by-default options here.
Allocate the pending structure here.
(cl_options, enum opt_code): Define these from the same table,
kept in a large macro. Add -fshow-column and -fno-show-column
options.
* cpperror.c (v_message): If show_column is off, don't print
the column number.
* cppmain.c: Update for new interface.
* fix-header.c: Likewise.
From-SVN: r32850
2000-04-01 07:16:11 +08:00
|
|
|
#define _dollar_ok(x) ((x) == '$' && CPP_OPTION (pfile, dollars_in_ident))
|
|
|
|
|
safe-ctype.h: New file.
include:
* safe-ctype.h: New file.
libiberty:
* safe-ctype.c: New file.
* Makefile.in (CFILES): Add safe-ctype.c.
(REQUIRED_OFILES): Add safe-ctype.o.
* argv.c: Define ISBLANK and use it, not isspace.
* basename.c, cplus-dem.c, fnmatch.c, pexecute.c, strtod.c,
strtol.c, strtoul.c: Include safe-ctype.h, not ctype.h. Use
uppercase ctype macros. Don't test ISUPPER(c)/ISLOWER(c)
before calling TOLOWER(c)/TOUPPER(c).
gcc:
* Makefile.in (HOST_RTL): Add safe-ctype.o.
(safe-ctype.o): New rule.
* system.h: Include safe-ctype.h, not ctype.h. No need to
wrap ctype macros.
* cpphash.h: Zap IStable and related macros. Define is_* in
terms of safe-ctype.h macros.
* cppinit.c: Delete the IStable and all related code.
* tradcpp.c: Delete is_idchar, is_idstart, is_hor_space, and
is_space arrays. Delete initialize_char_syntax. Change all
references to the above arrays to use macros instead.
* tradcpp.h: Define is_idchar, is_idstart, is_space, and
is_nvspace in terms of safe_ctype.h's macros.
* tradcif.y: is_idchar, is_idstart are macros not arrays.
* config/i370/i370.c, config/winnt/dirent.c,
config/winnt/fixinc-nt.c, config/winnt/ld.c:
Use uppercase ctype macros. If we included ctype.h,
include safe-ctype.h instead.
* fixinc/fixfixes.c: Use uppercase ctype macros. Don't test
ISLOWER(c) before calling TOUPPER(c).
* fixinc/fixincl.c (extract_quoted_files): Simplify out some gunk.
* fixinc/gnu-regex.c: Include safe-ctype.h, not ctype.h. No need to
wrap ctype macros. Don't test ISUPPER(x) before calling TOLOWER(x).
gcc/ch:
* lex.c: Don't bother checking whether ISUPPER(c) before
calling TOLOWER(c). Don't bother checking whether isascii(c)
before testing ISSPACE(c); ISSPACE(c) includes '\n'.
gcc/f:
* Make-lang.in: Link f/fini with safe-ctype.o.
* bad.c: Don't test ISUPPER(c) || ISLOWER(c) before calling TOUPPER(c).
* com.c: Use TOUPPER, not ffesrc_toupper.
* fini.c: Don't test ISALPHA(c) before calling TOUPPER(c)/TOLOWER(c).
* intrin.c: Don't test IN_CTYPE_DOMAIN(c).
* src.c: Delete ffesrc_toupper_ and ffesrc_tolower_ and their
initializing code; use TOUPPER and TOLOWER instead of
ffesrc_toupper and ffesrc_tolower.
* src.h: Don't declare ffesrc_toupper_ or ffesrc_tolower_.
Don't define ffesrc_toupper or ffesrc_tolower.
gcc/java:
* jvgenmain.c: Use ISPRINT not isascii.
From-SVN: r38124
2000-12-08 11:00:26 +08:00
|
|
|
#define is_idchar(x) (ISIDNUM(x) || _dollar_ok(x))
|
|
|
|
#define is_numchar(x) ISIDNUM(x)
|
|
|
|
#define is_idstart(x) (ISIDST(x) || _dollar_ok(x))
|
|
|
|
#define is_numstart(x) ISDIGIT(x)
|
|
|
|
#define is_hspace(x) ISBLANK(x)
|
|
|
|
#define is_vspace(x) IS_VSPACE(x)
|
|
|
|
#define is_nvspace(x) IS_NVSPACE(x)
|
|
|
|
#define is_space(x) IS_SPACE_OR_NUL(x)
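The note above about `numchar' points at the pp-number grammar in C99 6.4.8, which is deliberately looser than the grammar of valid numeric constants. A minimal sketch of that grammar follows. The function name pp_number_length_sketch and the helper is_numchar_sketch are hypothetical, and the helper only approximates ISIDNUM with <ctype.h> (it ignores '$' and extended characters).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for is_numchar: letters, digits and '_'.  */
int is_numchar_sketch (int c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Return the length of the pp-number at the start of S, or 0 if S does
   not start with one.  Follows C99 6.4.8: a pp-number starts with a
   digit, or with '.' followed by a digit, and then absorbs digits,
   identifier characters, '.', and a sign that directly follows one of
   e, E, p or P.  */
size_t pp_number_length_sketch (const char *s)
{
  size_t i;

  assert (s != NULL);
  if (isdigit ((unsigned char) s[0]))
    i = 1;
  else if (s[0] == '.' && isdigit ((unsigned char) s[1]))
    i = 2;
  else
    return 0;

  for (;;)
    {
      unsigned char c = (unsigned char) s[i];
      char prev = s[i - 1];

      if ((c == '+' || c == '-')
          && (prev == 'e' || prev == 'E' || prev == 'p' || prev == 'P'))
        i++;                    /* exponent sign */
      else if (is_numchar_sketch (c) || c == '.')
        i++;                    /* digit, identifier character or '.' */
      else
        break;
    }
  return i;
}
```

Under this sketch, "12abc_" is a single pp-number even though it is not a valid constant, and "0xe+1" is absorbed whole because the '+' follows an 'e'. That is why a plain identifier-character test is enough for the continuation characters.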
|
|
|
|
|
2019-01-26 18:08:00 +08:00
|
|
|
#define SEEN_EOL() (pfile->cur_token[-1].type == CPP_EOF)
|
|
|
|
|
|
|
|
/* This table is constant if it can be initialized at compile time,
|
|
|
|
which is the case if cpp was compiled with GCC >=2.7, or another
|
|
|
|
compiler that supports C99. */
|
2000-08-19 01:35:58 +08:00
|
|
|
#if HAVE_DESIGNATED_INITIALIZERS
|
|
|
|
extern const unsigned char _cpp_trigraph_map[UCHAR_MAX + 1];
|
2024-10-08 03:25:22 +08:00
|
|
|
#elif __cpp_constexpr >= 201304L
|
|
|
|
extern const struct _cpp_trigraph_map_s {
|
|
|
|
unsigned char map[UCHAR_MAX + 1];
|
|
|
|
constexpr _cpp_trigraph_map_s ();
|
|
|
|
} _cpp_trigraph_map_d;
|
|
|
|
#define _cpp_trigraph_map _cpp_trigraph_map_d.map
|
|
|
|
#else
|
2000-08-19 01:35:58 +08:00
|
|
|
extern unsigned char _cpp_trigraph_map[UCHAR_MAX + 1];
|
|
|
|
#endif
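For illustration, the compile-time-constant variant of such a table can be sketched with C99 designated initializers. This is not the libcpp definition: the names trigraph_map_sketch and trigraph_replacement_sketch are hypothetical, and the sketch assumes the table is indexed by the third character of a trigraph and holds the replacement character, with 0 meaning "not a trigraph".

```c
#include <assert.h>
#include <limits.h>

/* Indexed by the character that follows the two question marks.
   Entries not listed are zero-initialized, meaning "no trigraph".  */
const unsigned char trigraph_map_sketch[UCHAR_MAX + 1] = {
  ['='] = '#',  [')'] = ']',  ['!'] = '|',
  ['('] = '[',  ['\''] = '^', ['>'] = '}',
  ['/'] = '\\', ['<'] = '{',  ['-'] = '~'
};

/* Return the replacement for a trigraph whose third character is
   THIRD, or 0 if the sequence is not a trigraph.  */
unsigned char trigraph_replacement_sketch (int third)
{
  assert (third >= 0 && third <= UCHAR_MAX);
  return trigraph_map_sketch[third];
}
```

Because every entry is known at compile time, the array can be declared const and placed in read-only storage. Without designated initializers it would have to be filled in at startup, which appears to be why the fallback declaration above is not const.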
|
|
|
|
|
2018-08-17 20:04:13 +08:00
|
|
|
#if !defined (HAVE_UCHAR) && !defined (IN_GCC)
|
|
|
|
typedef unsigned char uchar;
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#define UC (const uchar *) /* Intended use: UC"string" */
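A short usage sketch for the UC cast, assuming the typedef above is in effect. The helpers ustrlen_sketch and directive_name_sketch are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char uchar;
#define UC (const uchar *)  /* Intended use: UC"string" */

/* Length of an unsigned-char string; casts back for strlen.  */
size_t ustrlen_sketch (const uchar *s)
{
  assert (s != NULL);
  return strlen ((const char *) s);
}

/* A string literal has type char[], so UC converts it to the
   const uchar * that unsigned-char interfaces expect.  */
const uchar *directive_name_sketch (void)
{
  return UC"include";
}
```

Writing the cast as a prefix macro keeps call sites compact, for example ustrlen_sketch (UC"define"), and avoids repeating the full cast expression at every literal.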
|
|
|
|
|
2020-11-19 20:43:13 +08:00
|
|
|
/* Accessors. */
|
|
|
|
|
2020-11-19 20:43:13 +08:00
|
|
|
inline int
|
|
|
|
_cpp_in_system_header (cpp_reader *pfile)
|
Represent column numbers using line-map's source_location.
The "next available source_location" is now managed internally by
line-maps.c rather than by clients.
* line-map.h (struct line_map): New field column_bits.
<from_line>: Rename field to start_location.
(struct line_maps): New fields highest_location and max_column_hint.
(linemap_check_files_exited): New declaration.
(linemap_line_start): New declaration.
(linemap_add): Remove from_line parameter; use highest_location field.
(SOURCE_LINE, LAST_SOURCE_LINE): Modify to use column_bits.
(SOURCE_COLUMN, LAST_SOURCE_LINE_LOCATION): New macros.
(CURRENT_LINE_MAP): Remove macro.
(linemap_position_for_column): New inline function.
* line-map.c (linemap_init): Clear new fields.
(linemap_check_files_exited): New function, extracted from ...
(linemap_free): Use linemap_check_files_exited.
(linemap_add): Remove from_line parameter. Various updates.
(linemap_line_start): New function.
(linemap_lookup): Update for new field names.
* cpphash.h (struct cpp_reader) <map>: Field removed. Because
linemap_position_for_column may unpredictably change the current map,
it is cleaner and simpler for us to not cache it in cpp_reader.
(struct cpp_buffer): New sysp field.
Changed warned_cplusplus_comments and from_stage3 to bitfields.
* cppinit.c (cpp_read_min_file): pfile->map no longer exists.
* cpplib.c (do_line, do_linemarker, _cpp_do_file_change): Get
current map using linemap_lookup.
(do_linemarker): Also set buffer's sysp field.
(destringize_and_run): No longer need to decrement current line.
* cppfiles.c (_cpp_stack_file): Set sysp from and in buffer.
(search_path_head, open_file_failed): Use buffer's sysp.
(cpp_make_system_header): Get current map using linemap_lookup.
Also set buffer's sysp flag.
* cppmacro.c (_cpp_builtin_macro_text): Likewise use linemap_lookup.
* cpphash.h (CPP_INCREMENT_LINE): New macro.
(struct cpp_buffer): Moved fields saved_cur, saved_rlimit to ...
(struct cpp_reader): ... and adding saved_line_base field.
* cpptrad.c (_cpp_overlay_buffer, _cpp_remove_overlay):
Update accordingly. Don't adjust line.
(_cpp_scan_out_logical_line): Use CPP_INCREMENT_LINE.
* cpphash.c (CPP_IN_SYSTEM_HEADER): Replaced macro by ...
(cpp_in_system_header): ... new inline function, using buffer's sysp.
* cpperror.c (_cpp_begin_message): Update to use cpp_in_system_header.
* cpplex.c (_cpp_lex_direct): Likewise.
* cppmacro.c (_cpp_builtin_macro_text): Likewise.
* cppmacro.c (_cpp_create_definition): Use buffer's sysp field.
* cpplib.h (struct cpp_token): Rename line field to src_loc.
Remove col field as it is now subsumed by src_loc.
* cpperror.c: Update various field, parameter, and macro names.
(print_location): If col==0, try SOURCE_COLUMN of line.
(cpp_error): Use cur_token's src_loc field, rather than line+col.
* cpplib.c (do_diagnostic): Token's src_loc fields replaces line+col.
* cpplex.c (_cpp_process_line_notes, _cpp_lex_direct,
_cpp_skip_block_comment): Use CPP_INCREMENT_LINE.
(_cpp_temp_token): Replace cpp_token's line+col fields by src_loc.
(_cpp_get_fresh_line): Don't need to adjust line for missing newline.
(_cpp_lex_direct): Use linemap_position_for_column.
* c-ppoutput.c (maybe_print_line, print_line): Don't take map
parameter. Instead get it from the line_table global. Adjust callers.
(print): Remove map field. Replace line field to src_line.
(init_pp_output, account_for_newlines, maybe_print_line): Adjust.
(cb_line_change): Use SOURCE_COLUMN. Minor optimizations.
(pp_file_change): Use MAIN_FILE_P since we cannot check print.map.
Use LAST_SOURCE_LINE_LOCATION to "catch up" after #include.
* cpptrad.c (copy_comment): Rename variable.
* c-lex.c (map): Remove static variable, for same reason we removed
cpp_reader's map field.
(cb_line_change, cb_def_pragma, cb_define, cb_undef): Hence we need
to call linemap_lookup.
(cb_line_change): Token's line field replaced by src_loc.
(fe_file_change): Use MAINFILE_P and LAST_SOURCE_LINE macros.
Don't save new_map.
* cpphash.h, cpperror.c, cpplib.h: Some renames of fileline to
source_location.
From-SVN: r77663
2004-02-11 23:29:30 +08:00
|
|
|
{
|
|
|
|
return pfile->buffer ? pfile->buffer->sysp : 0;
|
|
|
|
}
|
optc-gen.awk: Generate global_options initializer instead of individual variables.
gcc:
* optc-gen.awk: Generate global_options initializer instead of
individual variables. Add x_ prefix to names of structure
members.
* opth-gen.awk: Generate gcc_options structure. Add x_ prefix to
names of structure members.
* doc/tm.texi.in (HARD_FRAME_POINTER_IS_FRAME_POINTER,
HARD_FRAME_POINTER_IS_ARG_POINTER): Document.
* doc/tm.texi: Regenerate.
* alias.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER
* builtins.c: Use HARD_FRAME_POINTER_IS_ARG_POINTER.
* c-parser.c (disable_extension_diagnostics,
restore_extension_diagnostics): Update names of cpp_options
members.
* combine.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER
* common.opt (fcompare-debug-second): Don't use Var.
* config/alpha/alpha.h (target_flags): Remove.
* config/arm/arm.h (HARD_FRAME_POINTER_IS_FRAME_POINTER,
HARD_FRAME_POINTER_IS_ARG_POINTER): Define.
* config/bfin/bfin.h (target_flags): Remove.
* config/cris/cris.h (target_flags): Remove.
* config/i386/i386-c.c (ix86_pragma_target_parse): Update names of
cl_target_option members.
* config/i386/i386.c (ix86_force_align_arg_pointer): Remove.
(ix86_function_specific_print, ix86_valid_target_attribute_tree,
ix86_can_inline_p): Update names of cl_target_option members.
* config/i386/i386.h (ix86_isa_flags): Remove.
* config/lm32/lm32.h (target_flags): Remove.
* config/mcore/mcore.h (mcore_stack_increment): Remove.
* config/mcore/mcore.md (addsi3): Remove extern declaration of
flag_omit_frame_pointer.
* config/mep/mep.h (target_flags): Remove.
* config/mips/mips.h (HARD_FRAME_POINTER_IS_FRAME_POINTER,
HARD_FRAME_POINTER_IS_ARG_POINTER): Define.
* config/mmix/mmix.h (target_flags): Remove.
* config/rs6000/rs6000.h (rs6000_xilinx_fpu, flag_pic,
flag_expensive_optimizations): Remove.
* config/s390/s390.h (flag_pic): Remove.
* config/score/score-conv.h (target_flags): Remove.
* config/sh/sh.h (sh_fixed_range_str): Remove.
* config/spu/spu.h (target_flags, spu_fixed_range_string): Remove.
* dbxout.c: Use HARD_FRAME_POINTER_IS_ARG_POINTER
* df-scan.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER.
* diagnostic.c (diagnostic_initialize): Update names of
diagnostic_context members.
* diagnostic.h (diagnostic_context): Rename inhibit_warnings and
warn_system_headers.
(diagnostic_report_warnings_p): Update for new names.
* dwarf2out.c: Use HARD_FRAME_POINTER_IS_ARG_POINTER
* emit-rtl.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER and
HARD_FRAME_POINTER_IS_ARG_POINTER.
* flags.h (flag_compare_debug): Declare.
* ira.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER
* opts.c (flag_compare_debug): Define.
(common_handle_option): Update names of diagnostic_context
members. Handle -fcompare-debug-second.
(fast_math_flags_struct_set_p): Update names of cl_optimization
members.
* reginfo.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER.
* regrename.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER.
* reload.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER.
* reload1.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER.
* resource.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER.
* rtl.h (HARD_FRAME_POINTER_IS_FRAME_POINTER,
HARD_FRAME_POINTER_IS_ARG_POINTER): Define and use.
* sel-sched.c: Use HARD_FRAME_POINTER_IS_FRAME_POINTER
* stmt.c: Use HARD_FRAME_POINTER_IS_ARG_POINTER.
gcc/c-family:
* c-common.c (c_cpp_error): Update names of diagnostic_context
members.
* c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Update names of
cl_optimization members.
* c-opts.c (warning_as_error_callback, c_common_handle_option,
sanitize_cpp_opts, finish_options): Update names of cpp_options
members.
gcc/fortran:
* cpp.c (cpp_define_builtins): Update names of gfc_option_t
members.
(gfc_cpp_post_options): Update names of cpp_options members.
(cb_cpp_error): Update names of diagnostic_context members.
* f95-lang.c (gfc_init_builtin_functions): Update names of
gfc_option_t members.
* gfortran.h (gfc_option_t): Rename warn_conversion and
flag_openmp.
* intrinsic.c (gfc_convert_type_warn): Update names of
gfc_option_t members.
* options.c (gfc_init_options, gfc_post_options, set_Wall,
gfc_handle_option): Update names of gfc_option_t members.
* parse.c (next_free, next_fixed): Update names of gfc_option_t
members.
* scanner.c (pedantic): Remove extern declaration.
(skip_free_comments, skip_fixed_comments, include_line): Update
names of gfc_option_t members.
* trans-decl.c (gfc_generate_function_code): Update names of
gfc_option_t members.
gcc/java:
* java-tree.h (flag_filelist_file, flag_assert, flag_jni,
flag_force_classes_archive_check, flag_redundant, flag_newer,
flag_use_divide_subroutine, flag_use_atomic_builtins,
flag_use_boehm_gc, flag_hash_synchronization,
flag_check_references, flag_optimize_sci, flag_indirect_classes,
flag_indirect_dispatch, flag_store_check,
flag_reduced_reflection): Remove.
* jcf-dump.c (flag_newer): Remove.
* jcf.h (quiet_flag): Remove.
* parse.h (quiet_flag): Remove.
libcpp:
* include/cpplib.h (cpp_options): Rename warn_deprecated,
warn_traditional, warn_long_long and pedantic.
* directives.c (directive_diagnostics, _cpp_handle_directive):
Update names of cpp_options members.
* expr.c (cpp_classify_number, eval_token): Update names of
cpp_options members.
* init.c (cpp_create_reader, post_options): Update names of
cpp_options members.
* internal.h (CPP_PEDANTIC, CPP_WTRADITIONAL): Update names of
cpp_options members.
* macro.c (parse_params): Update names of cpp_options members.
From-SVN: r164723
2010-09-29 22:49:14 +08:00
|
|
|
#define CPP_PEDANTIC(PF) CPP_OPTION (PF, cpp_pedantic)
|
|
|
|
#define CPP_WTRADITIONAL(PF) CPP_OPTION (PF, cpp_warn_traditional)
|
|
|
|
|
2020-11-19 20:43:13 +08:00
|
|
|
/* Return true if we're in the main file (unless it's considered to be
|
|
|
|
an include file in its own right). */
|
|
|
|
inline int
|
|
|
|
_cpp_in_main_source_file (cpp_reader *pfile)
|
2007-01-04 23:32:26 +08:00
|
|
|
{
|
2020-11-19 23:00:51 +08:00
|
|
|
return (!CPP_OPTION (pfile, main_search)
|
|
|
|
&& pfile->buffer->file == pfile->main_file);
|
2007-01-04 23:32:26 +08:00
|
|
|
}
|
|
|
|
|
[PR 80005] Fix __has_include
__has_include is funky in that it is macro-like from the POV of #ifdef and
friends, but lexes its parenthesized argument #include-like. We were
failing the second part of that, because we used a forwarding macro to an
internal name, and hence always lexed the argument in macro-parameter
context. We compounded that by not setting the right flag when lexing, so
it didn't even know. Mostly users got lucky.
This reimplements the handling.
1) Remove the forwarding, but declare object-like macros that
expand to themselves. This satisfies the #ifdef requirement.
2) Correctly set angled_brackets when lexing the parameter. This tells
the lexer (a) <...> is a header name and (b) "..." is too (not a string).
3) Remove the in__has_include lexer state, just tell find_file that that's
what's happening, so it doesn't emit an error.
We lose the (undocumented) ability to #undef __has_include. That may well
have been an accident of implementation. There are no tests for it.
We gain __has_include behaviour for all users of the preprocessors -- not
just the C-family ones that defined a forwarding macro.
libcpp/
PR preprocessor/80005
* include/cpplib.h (BT_HAS_ATTRIBUTE): Fix comment.
* internal.h (struct lexer_state): Delete in__has_include field.
(struct spec_nodes): Rename n__has_include{,_next}__ fields.
(_cpp_defined_macro_p): New.
(_cpp_find_file): Add has_include parm.
* directives.c (lex_macro_node): Combine defined,
__has_include{,_next} checking.
(do_ifdef, do_ifndef): Use _cpp_defined_macro_p.
(_cpp_init_directives): Refactor.
* expr.c (parse_defined): Use _cpp_defined_macro_p.
(eval_token): Adjust parse_has_include calls.
(parse_has_include): Add OP parameter. Reimplement.
* files.c (_cpp_find_file): Add HAS_INCLUDE parm. Use it to
inhibit error message.
(_cpp_stack_include): Adjust _cpp_find_file call.
(_cpp_fake_include, _cpp_compare_file_date): Likewise.
(open_file_failed): Remove in__has_include check.
(_cpp_has_header): Adjust _cpp_find_file call.
* identifiers.c (_cpp_init_hashtable): Don't init
__has_include{,_next} here ...
* init.c (cpp_init_builtins): ... init them here. Define as
macros.
(cpp_read_main_file): Adjust _cpp_find_file call.
* pch.c (cpp_read_state): Adjust __has_include{,_next} access.
* traditional.c (_cpp_scan_out_logical_line): Likewise.
gcc/c-family/
PR preprocessor/80005
* c-cppbuiltins.c (c_cpp_builtins): Don't define __has_include{,_next}.
gcc/testsuite/
PR preprocessor/80005
* g++.dg/cpp1y/feat-cxx14.C: Adjust.
* g++.dg/cpp1z/feat-cxx17.C: Adjust.
* g++.dg/cpp2a/feat-cxx2a.C: Adjust.
* g++.dg/cpp/pr80005.C: New.
2020-01-20 21:39:59 +08:00
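The user-visible behaviour described in this commit message can be sketched as follows: __has_include answers #ifdef like a macro, while its parenthesized argument is lexed like the operand of #include. The HAVE_* macro names, the header name no-such-header-for-this-sketch.h, and the function has_include_probe_sketch are hypothetical and chosen only for this example.

```c
#include <assert.h>

/* Point 1: the #ifdef test succeeds where __has_include is supported.  */
#ifdef __has_include
#  define HAVE_HAS_INCLUDE 1
/* Point 2(a): <...> is lexed as a header name.  */
#  if __has_include(<stdio.h>)
#    define HAVE_STDIO_H 1
#  else
#    define HAVE_STDIO_H 0
#  endif
/* Point 2(b): "..." is a header name too, not a string literal.
   Point 3: a missing header yields 0 rather than an error.  */
#  if __has_include("no-such-header-for-this-sketch.h")
#    define HAVE_MISSING_H 1
#  else
#    define HAVE_MISSING_H 0
#  endif
#else
#  define HAVE_HAS_INCLUDE 0
#  define HAVE_STDIO_H 0
#  define HAVE_MISSING_H 0
#endif

/* Return nonzero if the probe found <stdio.h>.  */
int has_include_probe_sketch (void)
{
  assert (HAVE_MISSING_H == 0);
  return HAVE_STDIO_H;
}
```

Guarding the #if with #ifdef keeps the sketch portable to preprocessors that lack __has_include, since the nested conditionals then sit in a skipped group and are never evaluated.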
|
|
|
/* True if NODE is a macro for the purposes of ifdef, defined etc. */
|
2024-10-02 16:53:35 +08:00
|
|
|
inline bool
|
|
|
|
_cpp_defined_macro_p (const cpp_hashnode *node)
|
[PR 80005] Fix __has_include
__has_include is funky in that it is macro-like from the POV of #ifdef and
friends, but lexes its parenthesized argument #include-like. We were
failing the second part of that, because we used a forwarding macro to an
internal name, and hence always lexed the argument in macro-parameter
context. We compounded that by not setting the right flag when lexing, so
it didn't even know. Mostly users got lucky.
This reimplements the handling.
1) Remove the forwarding, but declare object-like macros that
expand to themselves. This satisfies the #ifdef requirement.
2) Correctly set angled_brackets when lexing the parameter. This tells
the lexer (a) <...> is a header name and (b) "..." is too (not a string).
3) Remove the in__has_include lexer state, just tell find_file that that's
what's happening, so it doesn't emit an error.
We lose the (undocumented) ability to #undef __has_include. That may well
have been an accident of implementation. There are no tests for it.
We gain __has_include behaviour for all users of the preprocessors -- not
just the C-family ones that defined a forwarding macro.
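The two properties described above (macro-like for #ifdef, header-name lexing of the argument) can be illustrated with a minimal sketch. The function name has_include_demo is made up for this example, and the only assumption about the environment is that <stdio.h> can be found on the include path.

```c
/* Returns 1 when __has_include is visible to #ifdef and reports that
   <stdio.h> exists, 0 when it is visible but the header is not found,
   and -1 when the preprocessor does not provide __has_include at all.
   has_include_demo is a hypothetical name used only for this sketch.  */
int
has_include_demo (void)
{
#ifdef __has_include            /* Macro-like from the POV of #ifdef.  */
# if __has_include (<stdio.h>)  /* Argument is lexed as a header name.  */
  return 1;
# else
  return 0;
# endif
#else
  return -1;
#endif
}
```

Nesting the #if inside the #ifdef keeps the sketch portable: a preprocessor without __has_include skips the inner group without trying to parse it.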
libcpp/
PR preprocessor/80005
* include/cpplib.h (BT_HAS_ATTRIBUTE): Fix comment.
* internal.h (struct lexer_state): Delete in__has_include field.
(struct spec_nodes): Rename n__has_include{,_next}__ fields.
(_cpp_defined_macro_p): New.
(_cpp_find_file): Add has_include parm.
* directives.c (lex_macro_node): Combine defined,
__has_include{,_next} checking.
(do_ifdef, do_ifndef): Use _cpp_defined_macro_p.
(_cpp_init_directives): Refactor.
* expr.c (parse_defined): Use _cpp_defined_macro_p.
(eval_token): Adjust parse_has_include calls.
(parse_has_include): Add OP parameter. Reimplement.
* files.c (_cpp_find_file): Add HAS_INCLUDE parm. Use it to
inhibit error message.
(_cpp_stack_include): Adjust _cpp_find_file call.
(_cpp_fake_include, _cpp_compare_file_date): Likewise.
(open_file_failed): Remove in__has_include check.
(_cpp_has_header): Adjust _cpp_find_file call.
* identifiers.c (_cpp_init_hashtable): Don't init
__has_include{,_next} here ...
* init.c (cpp_init_builtins): ... init them here. Define as
macros.
(cpp_read_main_file): Adjust _cpp_find_file call.
* pch.c (cpp_read_state): Adjust __has_include{,_next} access.
* traditional.c (_cpp_scan_out_logical_line): Likewise.
gcc/c-family/
PR preprocessor/80005
* c-cppbuiltins.c (c_cpp_builtins): Don't define __has_include{,_next}.
gcc/testsuite/
PR preprocessor/80005
* g++.dg/cpp1y/feat-cxx14.C: Adjust.
* g++.dg/cpp1z/feat-cxx17.C: Adjust.
* g++.dg/cpp2a/feat-cxx2a.C: Adjust.
* g++.dg/cpp/pr80005.C: New.
2020-01-20 21:39:59 +08:00
|
|
|
{
  /* Do not treat conditional macros as being defined. This is due to
     the powerpc port using conditional macros for 'vector', 'bool',
     and 'pixel' to act as conditional keywords. This messes up tests
     like #ifndef bool. */
  return cpp_macro_p (node) && !(node->flags & NODE_CONDITIONAL);
}
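The effect of the NODE_CONDITIONAL test can be tried out with a stand-alone model. Everything below (the struct, the flag value, the toy_ names) is invented for illustration and does not match libcpp's real definitions; only the shape of the predicate is taken from the function above.

```c
/* Stand-alone model; the flag value and struct are invented for
   illustration and do not match libcpp's definitions.  */
#define TOY_NODE_CONDITIONAL 0x1u

struct toy_hashnode
{
  int is_macro;        /* Models the result of cpp_macro_p (node).  */
  unsigned flags;
};

/* Same shape as the predicate above: a macro counts as defined only
   if it is not a conditional macro.  */
int
toy_defined_macro_p (const struct toy_hashnode *node)
{
  return node->is_macro && !(node->flags & TOY_NODE_CONDITIONAL);
}
```

With this model a plain macro tests as defined, while a conditional macro such as the powerpc 'bool' does not, so a test like #ifndef bool keeps behaving as if bool were an ordinary identifier.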
|
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In macro.cc */
|
preprocessor: Add deferred macros
Deferred macros are needed for C++ modules. Header units may export
macro definitions and undefinitions. These are resolved lazily at the
point of (potential) use. (The language specifies that, it's not just
a useful optimization.) Thus, identifier nodes grow a 'deferred'
field, which fortunately doesn't expand the structure on 64-bit
systems as there was padding there. This is non-zero on NT_MACRO
nodes, if the macro is deferred. When such an identifier is lexed, it
is resolved via a callback that I added recently. That will either
provide the macro definition, or discover that there was an overriding
undef. Either way the identifier is no longer a deferred macro.
Notice it is now possible for NT_MACRO nodes to have a NULL macro
expansion.
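The lazy resolution described above can be sketched as a toy model. None of the types or functions below are libcpp's; the toy_ names, the resolver callback signature, and the field layout are assumptions made only to show the control flow: a deferred macro node is resolved at the point of use, and afterwards it is either a macro with a definition or no longer a macro.

```c
#include <stddef.h>

/* Toy model only: none of these types or functions are libcpp's.  */
struct toy_macro
{
  const char *expansion;
};

struct toy_node
{
  const char *name;
  int is_macro;              /* Stands in for an NT_MACRO node.  */
  unsigned deferred;         /* Non-zero while the macro is deferred.  */
  struct toy_macro *macro;   /* May be NULL while deferred.  */
};

/* Hypothetical resolver callback: returns the definition, or NULL if an
   overriding undef was found.  */
typedef struct toy_macro *(*toy_resolver) (struct toy_node *);

/* Resolve NODE at the point of use.  Returns non-zero if NODE is a
   defined macro afterwards.  */
int
toy_resolve (struct toy_node *node, toy_resolver resolve)
{
  if (node->is_macro && node->deferred)
    {
      node->macro = resolve (node);
      node->deferred = 0;        /* Either way, no longer deferred.  */
      if (node->macro == NULL)
        node->is_macro = 0;      /* The undef won.  */
    }
  return node->is_macro;
}

static struct toy_macro toy_def = { "42" };

/* Two sample resolvers: one finds a definition, one finds an undef.  */
struct toy_macro *
toy_found (struct toy_node *node)
{
  (void) node;
  return &toy_def;
}

struct toy_macro *
toy_undefined (struct toy_node *node)
{
  (void) node;
  return NULL;
}
```

Callers that only need definedness (the #ifdef and defined cases) can use the return value and ignore the expansion.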
libcpp/
* include/cpplib.h (struct cpp_hashnode): Add deferred field.
(cpp_set_deferred_macro): Define.
(cpp_get_deferred_macro): Declare.
(cpp_macro_definition): Reformat, add overload.
(cpp_macro_definition_location): Deal with deferred macro.
(cpp_alloc_token_string, cpp_compare_macro): Declare.
* internal.h (_cpp_notify_macro_use): Return bool.
(_cpp_maybe_notify_macro_use): Likewise.
* directives.c (do_undef): Check macro is not undef before
warning.
(do_ifdef, do_ifndef): Deal with deferred macro.
* expr.c (parse_defined): Likewise.
* lex.c (cpp_allocate_token_string): Break out of ...
(create_literal): ... here. Call it.
(cpp_maybe_module_directive): Deal with deferred macro.
* macro.c (cpp_get_token_1): Deal with deferred macro.
(warn_of_redefinition): Deal with deferred macro.
(compare_macros): Rename to ...
(cpp_compare_macro): ... here. Make extern.
(cpp_get_deferred_macro): New.
(_cpp_notify_macro_use): Deal with deferred macro, return bool
indicating definedness.
(cpp_macro_definition): Deal with deferred macro.
2020-11-25 00:23:55 +08:00
|
|
|
extern bool _cpp_notify_macro_use (cpp_reader *pfile, cpp_hashnode *node,
                                   location_t);
|
|
|
|
inline bool _cpp_maybe_notify_macro_use (cpp_reader *pfile, cpp_hashnode *node,
|
2020-11-03 00:29:58 +08:00
|
|
|
location_t loc)
|
2018-08-16 21:51:38 +08:00
|
|
|
{
|
|
|
|
if (!(node->flags & NODE_USED))
|
2020-11-25 00:23:55 +08:00
|
|
|
    return _cpp_notify_macro_use (pfile, node, loc);
  return true;
|
2018-08-16 21:51:38 +08:00
|
|
|
}
|
2018-08-18 00:07:19 +08:00
|
|
|
extern cpp_macro *_cpp_new_macro (cpp_reader *, cpp_macro_kind, void *);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_free_definition (cpp_hashnode *);
|
2022-08-03 22:46:23 +08:00
|
|
|
extern bool _cpp_create_definition (cpp_reader *, cpp_hashnode *, location_t);
|
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_pop_context (cpp_reader *);
|
|
|
|
extern void _cpp_push_text_context (cpp_reader *, cpp_hashnode *,
|
2004-11-28 05:59:38 +08:00
|
|
|
const unsigned char *, size_t);
|
2018-08-17 03:18:42 +08:00
|
|
|
extern bool _cpp_save_parameter (cpp_reader *, unsigned, cpp_hashnode *,
|
Preserve original spellings of extended identifiers.
This patch makes cpplib track the original spellings of extended
identifiers, as well as the canonical UTF-8 version, in order to
follow standard semantics properly without needing a convoluted and
undocumented canonicalization in translation phase 1 (see bug 9449
comments 39-46 regarding such a canonicalization).
The spelling is tracked in cpp_identifier and cpp_macro_arg without
making cpp_token any larger. The original spelling is used for checks
of duplicate macro definitions, stringizing (see the C++ tests added;
this case is only an issue for C++ not C because C makes it
implementation-defined whether a \ is inserted before the \ of a UCN
in a string or character constant when stringizing, while C++ does
not), pasting (relevant when the result is then stringized for C++)
and when macro definitions are output as text (e.g. for -d options).
Once a macro has been defined, only the original spelling of the
argument names needs keeping in the argument list. While it is being
defined, however, both spellings are needed: the original one for
subsequent saving for checks of duplicate macro definitions, and the
canonical one which is the node marked specially to generate macro
argument tokens rather than normal identifier tokens. The buffer that
is used to save the original values of the identifier tokens is
changed so that it stores both those original values and a pointer to
the canonical hash nodes, so that those canonical nodes can be found
when their values need restoring after the macro definition has been
parsed.
I believe this covers the known standards issues in extended
identifiers support (the remaining unimplemented C99 areas in GCC all
being floating-point-related), except for C++ translation of extended
characters to UCNs in phase 1 (which I have no plans to work on).
There are however probably issues left with handling of extended
identifiers in other places, as listed in
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00337.html> (those
issues are generally the sort of thing that could be addressed as bugs
outside development stage 1). (The bulk of the potential issues Zack
was concerned about in 2003-5, that resulted in extended identifiers
being disabled in the absence of -fextended-identifiers, were
effectively eliminated by the audit and fixes I did in 2009, however;
that todo list reflects what was left over after that audit.)
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.
libcpp:
* include/cpp-id-data.h (struct cpp_macro): Update comment
regarding parameters.
* include/cpplib.h (struct cpp_macro_arg, struct cpp_identifier):
Add spelling fields.
(struct cpp_token): Update comment on macro_arg.
* internal.h (_cpp_save_parameter): Add extra argument.
(_cpp_spell_ident_ucns): New declaration.
* lex.c (lex_identifier): Add SPELLING argument. Set *SPELLING to
original spelling of identifier.
(_cpp_lex_direct): Update calls to lex_identifier.
(_cpp_spell_ident_ucns): New function, factored out of
cpp_spell_token.
(cpp_spell_token): Adjust FORSTRING argument semantics to return
original spelling of identifiers. Use _cpp_spell_ident_ucns in
!FORSTRING case.
(_cpp_equiv_tokens): Check spellings of identifiers and macro
arguments are identical.
* macro.c (macro_arg_saved_data): New structure.
(paste_tokens): Use original spellings of identifiers from
cpp_spell_token.
(_cpp_save_parameter): Add argument SPELLING. Save both canonical
node and its value.
(parse_params): Update calls to _cpp_save_parameter.
(lex_expansion_token): Save spelling of macro argument tokens.
(_cpp_create_definition): Extract canonical node from saved data.
(cpp_macro_definition): Use UCNs in spelling of macro name. Use
original spellings of macro argument tokens and identifiers.
* traditional.c (scan_parameters): Update call to
_cpp_save_parameter.
gcc:
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to corner
cases of extended identifiers.
gcc/testsuite:
* g++.dg/cpp/ucnid-2.C, g++.dg/cpp/ucnid-3.C,
gcc.dg/cpp/ucnid-11.c, gcc.dg/cpp/ucnid-12.c,
gcc.dg/cpp/ucnid-13.c, gcc.dg/cpp/ucnid-14.c,
gcc.dg/cpp/ucnid-15.c: New tests.
From-SVN: r217202
2014-11-07 05:08:52 +08:00
|
|
|
cpp_hashnode *);
|
2018-08-17 03:18:42 +08:00
|
|
|
extern void _cpp_unsave_parameters (cpp_reader *, unsigned);
|
2003-06-17 14:17:44 +08:00
|
|
|
extern bool _cpp_arguments_ok (cpp_reader *, cpp_macro *, const cpp_hashnode *,
                               unsigned int);
|
2004-11-28 05:59:38 +08:00
|
|
|
extern const unsigned char *_cpp_builtin_macro_text (cpp_reader *,
|
2016-04-07 02:35:16 +08:00
|
|
|
cpp_hashnode *,
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t = 0);
|
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (several gigabytes large, say; those remain infeasible
in both compile time and memory).
There are 2 reasons for this. One is that I think the way it is implemented
now in the patch is how we should handle the smaller #embed sizes (dunno
where the boundary should be, whether 32 bytes or 64 or something like
that). Certainly handling the single byte case, which can appear anywhere
in the source where an integer constant can appear, is desirable, and I
think for a few bytes it isn't worth coming up with something smarter;
users would also like to see it readably in -E output (perhaps the slow
vs. fast boundary should be determined by a command line option). The
other reason is to be able to more easily find regressions in behavior
caused by the optimizations, so we have something to go back to in git to
compare against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots has been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
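A portable way to probe __has_embed, in the spirit of the check quoted above, might look like the following sketch. The function name has_embed_demo is made up for this example, and the outcome depends on the compiler and on how the file is fed to it, so no particular return value is claimed.

```c
/* Returns 1 if __has_embed reports this source file as found and
   non-empty, 0 if __has_embed exists but reports anything else, and -1
   if the preprocessor does not provide __has_embed.  has_embed_demo is
   a hypothetical name used only for this sketch.  */
int
has_embed_demo (void)
{
#ifdef __has_embed
# if __has_embed (__FILE__ limit (1)) == __STDC_EMBED_FOUND__
  return 1;
# else
  return 0;
# endif
#else
  return -1;
#endif
}
```

As with __has_include, the #ifdef guard means a preprocessor without the feature never has to parse the __has_embed expression.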
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
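The plain-number reading of the two cases above can be checked without #embed support by spelling the expansion out by hand. In this sketch the EMBED_W and EMBED_FOO macros are assumptions standing in for what the preprocessor would emit; the byte values are the ones quoted above (the first three bytes of "const unsigned char w[]..." and the four bytes of "void").

```c
/* Stand-ins for the directive expansions.  Both macros are assumptions
   made so the sketch builds without #embed support: prefix, then the
   bytes as a plain comma-separated list, then suffix.  */
#define EMBED_W   [0] = 42, [15] = 99, 111, 110
#define EMBED_FOO 172 + 118, 111, 105, 100 + 2

/* Plain-number expansion: [15] starts a run, so the following values
   land in w[16] and w[17] instead of being discarded.  */
const unsigned char w[] = { EMBED_W };

static int seen[4];

void
foo (int a, int b, int c, int d)
{
  seen[0] = a;
  seen[1] = b;
  seen[2] = c;
  seen[3] = d;
}

/* foo receives four arguments: 172 + 118, 111, 105 and 100 + 2.  */
void
bar (void)
{
  foo (EMBED_FOO);
}
```

Under the parenthesized reading, the same text would collapse to a single comma expression in each case, which is why only one array element or one argument would survive.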
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
|
|
|
extern const cpp_token *_cpp_get_token_no_padding (cpp_reader *);
|
2006-01-05 00:33:38 +08:00
|
|
|
extern int _cpp_warn_if_unused_macro (cpp_reader *, cpp_hashnode *, void *);
|
|
|
|
extern void _cpp_push_token_context (cpp_reader *, cpp_hashnode *,
                                     const cpp_token *, unsigned int);
|
2008-07-14 13:09:48 +08:00
|
|
|
extern void _cpp_backup_tokens_direct (cpp_reader *, unsigned int);
|
2006-01-05 00:33:38 +08:00
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In identifiers.cc */
|
libcpp: Improve the diagnostic for poisoned identifiers [PR36887]
The PR requests an enhancement to the diagnostic issued for the use of a
poisoned identifier. Currently, we show the location of the usage, but not
the location which requested the poisoning, which would be helpful for the
user if the decision to poison an identifier was made externally, such as
in a library header.
In order to output this information, we need to remember a location_t for
each identifier that has been poisoned, and that data needs to be preserved
as well in a PCH. One option would be to add a field to struct cpp_hashnode,
but there is no convenient place to add it without increasing the size of
the struct for all identifiers. Given this facility will be needed rarely,
it seemed better to add a second hash map, which is handled PCH-wise the
same as the current one in gcc/stringpool.cc. This hash map associates a new
struct cpp_hashnode_extra with each identifier that needs one. Currently
that struct only contains the new location_t, but it could be extended in
the future if there is other ancillary data that may be convenient to put
there for other purposes.
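The design choice above (keep the per-identifier struct small and put rarely needed data in a side table) can be sketched as follows. This is a toy model, not the patch's code: the toy_ names are invented, a small linear table stands in for the real second hash map, and location 0 is assumed to mean "never poisoned".

```c
#include <stddef.h>

typedef unsigned int toy_location;

/* The common per-identifier struct stays small.  */
struct toy_ident
{
  const char *name;
};

/* Rarely needed data lives in a side table keyed by the identifier.  */
struct toy_ident_extra
{
  const struct toy_ident *id;
  toy_location poisoned_at;
};

#define TOY_MAX_EXTRA 16
static struct toy_ident_extra toy_extra[TOY_MAX_EXTRA];
static size_t toy_extra_count;

/* Record where ID was poisoned.  Returns 0 on success, -1 if the toy
   table is full.  */
int
toy_record_poison (const struct toy_ident *id, toy_location loc)
{
  for (size_t i = 0; i < toy_extra_count; i++)
    if (toy_extra[i].id == id)
      {
        toy_extra[i].poisoned_at = loc;
        return 0;
      }
  if (toy_extra_count == TOY_MAX_EXTRA)
    return -1;
  toy_extra[toy_extra_count].id = id;
  toy_extra[toy_extra_count].poisoned_at = loc;
  toy_extra_count++;
  return 0;
}

/* Return the location that poisoned ID, or 0 if it was never poisoned.  */
toy_location
toy_poison_location (const struct toy_ident *id)
{
  for (size_t i = 0; i < toy_extra_count; i++)
    if (toy_extra[i].id == id)
      return toy_extra[i].poisoned_at;
  return 0;
}
```

Identifiers that are never poisoned cost nothing extra in this scheme, which is the point of keeping the location out of the main node.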
libcpp/ChangeLog:
PR preprocessor/36887
* directives.cc (do_pragma_poison): Store in the extra hash map the
location from which an identifier has been poisoned.
* lex.cc (identifier_diagnostics_on_lex): When issuing a diagnostic
for the use of a poisoned identifier, also add a note indicating the
location from which it was poisoned.
* identifiers.cc (alloc_node): Convert to template function.
(_cpp_init_hashtable): Handle the new extra hash map.
(_cpp_destroy_hashtable): Likewise.
* include/cpplib.h (struct cpp_hashnode_extra): New struct.
(cpp_create_reader): Update prototype to...
* init.cc (cpp_create_reader): ...accept an argument for the extra
hash table and pass it to _cpp_init_hashtable.
* include/symtab.h (ht_lookup): New overload for convenience.
* internal.h (struct cpp_reader): Add EXTRA_HASH_TABLE member.
(_cpp_init_hashtable): Adjust prototype.
gcc/c-family/ChangeLog:
PR preprocessor/36887
* c-opts.cc (c_common_init_options): Pass new extra hash map
argument to cpp_create_reader().
gcc/ChangeLog:
PR preprocessor/36887
* toplev.h (ident_hash_extra): Declare...
* stringpool.cc (ident_hash_extra): ...this new global variable.
(init_stringpool): Handle ident_hash_extra as well as ident_hash.
(ggc_mark_stringpool): Likewise.
(ggc_purge_stringpool): Likewise.
(struct string_pool_data_extra): New struct.
(spd2): New GC root variable.
(gt_pch_save_stringpool): Use spd2 to handle ident_hash_extra,
analogous to how spd is used to handle ident_hash.
(gt_pch_restore_stringpool): Likewise.
gcc/testsuite/ChangeLog:
PR preprocessor/36887
* c-c++-common/cpp/diagnostic-poison.c: New test.
* g++.dg/pch/pr36887.C: New test.
* g++.dg/pch/pr36887.Hs: New test.
2023-09-08 05:02:47 +08:00
|
|
|
extern void
|
|
|
|
_cpp_init_hashtable (cpp_reader *, cpp_hash_table *, cpp_hash_table *);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_destroy_hashtable (cpp_reader *);
|
cpplib.h: Provide HASHNODE typedef and forward decl of struct hashnode only.
* cpplib.h: Provide HASHNODE typedef and forward decl of
struct hashnode only. Kill cpp_hashnode typedef. MACRODEF,
DEFINITION, struct hashnode, struct macrodef, struct
definition, scan_decls prototype, default defn of
INCLUDE_LEN_FUDGE moved elsewhere.
* cpphash.h: MACRODEF, DEFINITION, struct macrodef, struct
definition, and struct hashnode moved here. Remove the unused
'predefined' field from struct definition. Replace the 'args'
union with its sole member. All users updated (cpphash.c).
Delete HASHSTEP and MAKE_POS macros, and hashf prototype. Add
multiple include guard.
* cpphash.c (hashf): Make static; use better algorithm; drop
HASHSIZE parameter; return an unsigned int.
(cpp_lookup): Drop HASH parameter. PFILE parameter is
used. Calculate HASHSIZE modulus here.
(cpp_install): Drop HASH parameter. Calculate HASHSIZE modulus
here.
(create_definition): Drop PREDEFINITION parameter.
* cpplib.c (do_define): Don't calculate a hash value here.
Don't pass (keyword == NULL) to create_definition.
* scan.h: Prototype scan_decls here.
* cppfiles.c: Move INCLUDE_LEN_FUDGE default defn here.
* cppexp.c, cppfiles.c, cppinit.c, cpplib.c, fix-header.c: All
callers of cpp_lookup and cpp_install updated.
From-SVN: r31881
2000-02-10 10:23:08 +08:00
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In files.cc */
|
2020-05-20 21:21:10 +08:00
|
|
|
enum _cpp_find_file_kind
|
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (like several gigabytes large; that is still infeasible
in compile time and memory).
There are 2 reasons for this. One is that I think the way it is implemented
now in the patch is how we should handle the smaller #embed sizes,
dunno with which boundary, whether 32 bytes or 64 or something like that,
certainly handling the single byte cases which is something that can appear
anywhere in the source where constant integer literal can appear is
desirable and I think for a few bytes it isn't worth it to come up with
something smarter and users would like to e.g. see it in -E readably as
well (perhaps the slow vs. fast boundary should be determined by command
line option). And the other one is to be able to more easily find
regressions in behavior caused by the optimizations, so we have something
to get back in git to compare against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots has been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
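The difference between the two readings can be reproduced without #embed. The sketch below is an illustration only: the macro BYTES is a hypothetical stand-in for the first four bytes of the expansion (99,111,110,115), and the function names are made up. Spliced in as a plain list, the bytes after [15] fill elements 16 onward; wrapped in parentheses, they form one comma expression whose value is its last operand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the start of the #embed expansion.  */
#define BYTES 99, 111, 110, 115

/* Plain list: [15] = 99, then 111, 110, 115 land in elements 16..18,
   so the array has 19 elements.  */
size_t
plain_size (void)
{
  const unsigned char w[] = { [0] = 42, [15] = BYTES };
  return sizeof w;
}

unsigned char
plain_at15 (void)
{
  const unsigned char w[] = { [0] = 42, [15] = BYTES };
  return w[15];
}

/* Parenthesized: a single comma expression, so only element 15 is
   initialized (with the last operand) and the array has 16 elements.
   Compilers may warn that the left operands have no effect.  */
size_t
paren_size (void)
{
  const unsigned char w[] = { [0] = 42, [15] = (BYTES) };
  return sizeof w;
}

unsigned char
paren_at15 (void)
{
  const unsigned char w[] = { [0] = 42, [15] = (BYTES) };
  return w[15];
}
```

With the full 32-byte expansion quoted above, the parenthesized form leaves 98 in element 15, which matches the value clang is reported to compile.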
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
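The argument-count effect can be shown the same way. In this sketch the macro EMBED_BYTES is a hypothetical stand-in for the four-byte expansion 118, 111, 105, 100, and capture, spliced and parenthesized are made-up helper names; it is an illustration of the two readings, not of either compiler's internals.

```c
#include <assert.h>

/* Hypothetical stand-in for "#embed __FILE__ limit (4)" on a file
   that starts with "void".  */
#define EMBED_BYTES 118, 111, 105, 100

struct args { int a, b, c, d; };

/* Return the four arguments so a caller can inspect them.  */
struct args
capture (int a, int b, int c, int d)
{
  struct args r = { a, b, c, d };
  return r;
}

/* Plain list: prefix and suffix attach to the first and last byte,
   giving four separate arguments.  */
struct args
spliced (void)
{
  return capture (172 + EMBED_BYTES + 2);
}

/* Parenthesized: the bytes collapse into one comma expression whose
   value is its last operand, giving a single value.  Compilers may
   warn that the left operands have no effect.  */
int
parenthesized (void)
{
  return 172 + (EMBED_BYTES) + 2;
}
```

Here spliced () passes 290, 111, 105 and 102, while parenthesized () yields the single value 274, so a call to a four-parameter function would not even compile in the second reading.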
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
|
|
|
{ _cpp_FFK_NORMAL, _cpp_FFK_FAKE, _cpp_FFK_PRE_INCLUDE, _cpp_FFK_HAS_INCLUDE,
|
|
|
|
_cpp_FFK_EMBED, _cpp_FFK_HAS_EMBED };
|
2005-10-22 01:54:20 +08:00
|
|
|
extern _cpp_file *_cpp_find_file (cpp_reader *, const char *, cpp_dir *,
|
2020-05-20 21:21:10 +08:00
|
|
|
int angle, _cpp_find_file_kind, location_t);
|
2003-10-02 15:23:27 +08:00
|
|
|
extern bool _cpp_find_failed (_cpp_file *);
|
2003-08-03 00:29:46 +08:00
|
|
|
extern void _cpp_mark_file_once_only (cpp_reader *, struct _cpp_file *);
|
2021-02-19 04:46:25 +08:00
|
|
|
extern const char *_cpp_find_header_unit (cpp_reader *, const char *file,
|
|
|
|
bool angle_p, location_t);
|
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
extern int _cpp_stack_embed (cpp_reader *, const char *, bool,
                             cpp_embed_params *);
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
extern void _cpp_fake_include (cpp_reader *, const char *);
2019-08-29 22:06:32 +08:00
extern bool _cpp_stack_file (cpp_reader *, _cpp_file*, include_type, location_t);
Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* Makefile.in (LIBCPP_DEPS): Add HASHTAB_H.
* cppfiles.c: Completely rewritten.
* c-incpath.c (free_path, remove_duplicates, heads, tails, add_path):
struct cpp_path is now struct cpp_dir.
(remove_duplicates): Don't simplify path names.
* c-opts.c (c_common_parse_file): cpp_read_next_file renamed
cpp_stack_file.
* cpphash.h: Include hashtab.h.
(_cpp_file): Declare.
(struct cpp_buffer): struct include_file is now struct _cpp_file,
and struct cpp_path is now struct cpp_dir. Rename members.
(struct cpp_reader): Similarly. New members once_only_files,
file_hash, file_hash_entries, quote_ignores_source_dir,
no_search_path, saw_pragma_once. Remove all_include_files and
max_include_len. Make some members bool.
(_cpp_mark_only_only): Renamed from _cpp_never_reread.
(_cpp_stack_file): Renamed from _cpp_read_file.
(_cpp_stack_include): Renamed from _cpp_execute_include.
(_cpp_init_files): Renamed from _cpp_init_includes.
(_cpp_cleanup_files): Renamed from _cpp_cleanup_includes.
* cppinit.c (cpp_create_reader): Initialize no_search_path. Update.
(cpp_read_next_file): Rename and move to cppfiles.c.
(cpp_read_main_file): Update.
* cpplib.c (run_directive): Update for renamed members.
(do_include_common, _cpp_pop_buffer): Update.
(do_import): Undeprecate #import.
(do_pragma_once): Undeprecate. Use _cpp_mark_file_once_only.
* cpplib.h: Remove file_name_map_list.
(cpp_options): Remove map_list.
(cpp_dir): Rename from cpp_path. New datatype for name_map.
(cpp_set_include_chains, cpp_stack_file, cpp_included): Update.
testsuite:
* gcc.dg/cpp/include2.c: Only expect one message.
From-SVN: r69942
2003-07-30 06:26:13 +08:00
extern bool _cpp_stack_include (cpp_reader *, const char *, int,
2018-11-14 04:05:03 +08:00
                                enum include_type, location_t);
2003-06-17 14:17:44 +08:00
extern int _cpp_compare_file_date (cpp_reader *, const char *, int);
extern void _cpp_report_missing_guards (cpp_reader *);
2003-07-30 06:26:13 +08:00
extern void _cpp_init_files (cpp_reader *);
extern void _cpp_cleanup_files (cpp_reader *);
2013-03-07 00:18:40 +08:00
extern void _cpp_pop_file_buffer (cpp_reader *, struct _cpp_file *,
                                  const unsigned char *);
2004-01-17 08:37:47 +08:00
extern bool _cpp_save_file_entries (cpp_reader *pfile, FILE *f);
extern bool _cpp_read_file_entries (cpp_reader *, FILE *);
2012-01-09 16:48:43 +08:00
extern const char *_cpp_get_file_name (_cpp_file *);
2006-02-18 17:25:31 +08:00
extern struct stat *_cpp_get_file_stat (_cpp_file *);
2014-10-01 19:49:23 +08:00
extern bool _cpp_has_header (cpp_reader *, const char *, int,
                             enum include_type);
Makefile.in (LIBCPP_DEPS): New macro.
* Makefile.in (LIBCPP_DEPS): New macro.
(cpplib.o, cpphash.o, cpperror.o, cppexp.o, cppfiles.o): Use
it to declare deps.
* cpperror.c: Include cpphash.h.
* cppexp.c: Include cpphash.h. Remove MULTIBYTE_CHARS
dingleberry.
(lex): Don't use CPP_WARN_UNDEF.
(_cpp_parse_expr): Return an int, the truth value.
* cppfiles.c: Include cpphash.h.
(_cpp_merge_include_chains): Move to cppinit.c and make static.
* cppinit.c (include_defaults_array): Disentangle.
(cpp_cleanup): Don't free the if stack here.
(cpp_finish): Pop off all buffers, not just one.
* cpplib.c (eval_if_expr): Return int.
(do_xifdef): Rename do_ifdef.
(handle_directive): Don't use CPP_PREPROCESSED.
(cpp_get_token): Don't use CPP_C89.
* fix-header.c: Don't use CPP_OPTIONS.
* cpplib.h: Move U_CHAR, enum node_type, struct
file_name_list, struct ihash, is_idchar, is_idstart,
is_numchar, is_numstart, is_hspace, is_space, CPP_BUF_PEEK,
CPP_BUF_GET, CPP_FORWARD, CPP_PUTS, CPP_PUTS_Q, CPP_PUTC,
CPP_PUTC_Q, CPP_NUL_TERMINATE, CPP_NUL_TERMINATE_Q,
CPP_BUMP_BUFFER_LINE, CPP_BUMP_LINE, CPP_PREV_BUFFER,
CPP_PRINT_DEPS, CPP_TRADITIONAL, CPP_PEDANTIC, and prototypes
of _cpp_simplify_pathname, _cpp_find_include_file,
_cpp_read_include_file, and _cpp_parse_expr to cpphash.h.
Move struct if_stack to cpplib.c. Move struct cpp_pending to
cppinit.c.
Change all uses of U_CHAR to be unsigned char instead.
Delete CPP_WARN_UNDEF, CPP_C89, and CPP_PREPROCESSED.
From-SVN: r32435
2000-03-09 07:35:19 +08:00
2022-01-14 23:57:02 +08:00
/* In expr.cc */
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (like several gigabytes large, that is compile time
and memory still infeasible).
There are 2 reasons for this. One is that I think the way it is implemented
in this patch is how we should handle the smaller #embed sizes; I don't know
where the boundary should be, whether 32 bytes or 64 or something like that.
Handling the single-byte case is certainly desirable, since it can appear
anywhere in the source where a constant integer literal can appear. For a
few bytes I don't think it is worth coming up with something smarter, and
users would like to see it readably in -E output as well (perhaps the slow
vs. fast boundary should be determined by a command line option). The other
reason is to be able to more easily find regressions in behavior caused by
the optimizations, so we have something in git to go back to and compare
against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots have been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
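To make the difference between the two expansions concrete, here is a minimal
C11 sketch that needs no #embed support: it writes both forms by hand and asks
_Generic for the type of each element. TYPE_NAME and the two helper functions
are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: names the type of an expression via C11 _Generic.  */
#define TYPE_NAME(x) _Generic ((x),                 \
                               unsigned char: "unsigned char", \
                               int: "int",          \
                               default: "other")

/* Element as in the plain-number expansion described above: 123  */
const char *
plain_element_type (void)
{
  return TYPE_NAME (123);
}

/* Element as in the cast expansion described above: (unsigned char)123  */
const char *
cast_element_type (void)
{
  return TYPE_NAME ((unsigned char) 123);
}
```

The element values are the same in both forms; only the type of each element
expression differs, which matters in contexts that inspect the type, such as
_Generic or sizeof.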
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, for __has_embed the patch currently just
always returns 2 on non-regular files, as do the thephd.dev branch
and the clang pull request.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
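The regular-file rule described above (decide from st.st_size and limit
without reading the file) can be sketched as follows. This is only an
illustration under stated assumptions, not the libcpp code: the enum and the
function regular_file_status are hypothetical, and the enumerator values
mirror the C23 values of __STDC_EMBED_NOT_FOUND__, __STDC_EMBED_FOUND__ and
__STDC_EMBED_EMPTY__ (0, 1 and 2).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three __has_embed results.  */
enum embed_status
{
  EMBED_NOT_FOUND = 0,
  EMBED_FOUND = 1,
  EMBED_EMPTY = 2
};

/* Classify a regular file from its size and the limit parameter alone,
   without reading any data.  EXISTS says whether the lookup succeeded.  */
enum embed_status
regular_file_status (bool exists, unsigned long long st_size,
                     unsigned long long limit)
{
  if (!exists)
    return EMBED_NOT_FOUND;
  /* A zero-sized file, or any file cut down by limit (0), is empty.  */
  if (st_size == 0 || limit == 0)
    return EMBED_EMPTY;
  return EMBED_FOUND;
}
```

For character devices and pipes st_size carries no such information, which is
why the text above treats them separately.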
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but the thephd.dev branch accepts it and I don't see why
it shouldn't: (({{[0[0{0{0(0(0)1)1}1}]]}})) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
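As a rough illustration of what "balanced" means here, the following sketch
checks bracket nesting on a plain character string. It is an assumption-laden
simplification: libcpp works on preprocessing tokens, not characters, and
is_balanced and MAX_DEPTH are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 64

/* Return true if every (, [ and { in S is closed by the matching
   bracket in the right order.  Other characters are ignored.  */
bool
is_balanced (const char *s)
{
  char expected[MAX_DEPTH];	/* Stack of closers we still need.  */
  size_t depth = 0;

  for (; *s; s++)
    {
      char c = *s;
      if (c == '(' || c == '[' || c == '{')
        {
          if (depth == MAX_DEPTH)
            return false;	/* Nesting too deep for this sketch.  */
          expected[depth++] = c == '(' ? ')' : c == '[' ? ']' : '}';
        }
      else if (c == ')' || c == ']' || c == '}')
        {
          if (depth == 0 || expected[--depth] != c)
            return false;
        }
    }
  return depth == 0;
}
```

Applied to the if_empty argument from the test above, the check succeeds;
adding one more closing parenthesis makes it fail.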
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
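The effect of the extra parentheses can be reproduced without #embed by
writing both token sequences by hand. The sketch below uses array
initializers instead of function arguments so the number of elements can be
counted with sizeof; COUNT and the three functions are hypothetical names
used only for this illustration. Compilers may warn that the left operands
of the comma operator have no effect, which is exactly the -Wunused-value
symptom mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define COUNT(a) (sizeof (a) / sizeof ((a)[0]))

/* Plain token sequence: prefix and suffix attach to the first and
   last element, giving four separate initializers.  */
size_t
flat_count (void)
{
  int v[] = { 172 + 118, 111, 105, 100 + 2 };
  return COUNT (v);
}

/* Parenthesized sequence: the commas become comma operators, so the
   whole thing is a single expression.  */
size_t
grouped_count (void)
{
  int v[] = { 172 + (118, 111, 105, 100) + 2 };
  return COUNT (v);
}

/* Value of that single expression: only the last operand, 100,
   survives the comma operators.  */
int
grouped_value (void)
{
  int v[] = { 172 + (118, 111, 105, 100) + 2 };
  return v[0];
}
```

The same reasoning explains the earlier designated-initializer case, where
the parenthesized form collapses to its last element.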
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
extern cpp_num_part _cpp_parse_expr (cpp_reader *, const char *,
                                     const cpp_token *);
2003-06-17 14:17:44 +08:00
extern struct op *_cpp_expand_op_stack (cpp_reader *);
2000-03-09 07:35:19 +08:00
2022-01-14 23:57:02 +08:00
/* In lex.cc */
2003-06-17 14:17:44 +08:00
extern void _cpp_process_line_notes (cpp_reader *, int);
extern void _cpp_clean_line (cpp_reader *);
extern bool _cpp_get_fresh_line (cpp_reader *);
extern bool _cpp_skip_block_comment (cpp_reader *);
extern cpp_token *_cpp_temp_token (cpp_reader *);
extern const cpp_token *_cpp_lex_token (cpp_reader *);
extern cpp_token *_cpp_lex_direct (cpp_reader *);
Preserve original spellings of extended identifiers.
This patch makes cpplib track the original spellings of extended
identifiers, as well as the canonical UTF-8 version, in order to
follow standard semantics properly without needing a convoluted and
undocumented canonicalization in translation phase 1 (see bug 9449
comments 39-46 regarding such a canonicalization).
The spelling is tracked in cpp_identifier and cpp_macro_arg without
making cpp_token any larger. The original spelling is used for checks
of duplicate macro definitions, stringizing (see the C++ tests added;
this case is only an issue for C++ not C because C makes it
implementation-defined whether a \ is inserted before the \ of a UCN
in a string or character constant when stringizing, while C++ does
not), pasting (relevant when the result is then stringized for C++)
and when macro definitions are output as text (e.g. for -d options).
Once a macro has been defined, only the original spelling of the
argument names needs keeping in the argument list. While it is being
defined, however, both spellings are needed: the original one for
subsequent saving for checks of duplicate macro definitions, and the
canonical one which is the node marked specially to generate macro
argument tokens rather than normal identifier tokens. The buffer that
is used to save the original values of the identifier tokens is
changed so that it stores both those original values and a pointer to
the canonical hash nodes, so that those canonical nodes can be found
when their values need restoring after the macro definition has been
parsed.
I believe this covers the known standards issues in extended
identifiers support (the remaining unimplemented C99 areas in GCC all
being floating-point-related), except for C++ translation of extended
characters to UCNs in phase 1 (which I have no plans to work on).
There are however probably issues left with handling of extended
identifiers in other places, as listed in
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00337.html> (those
issues are generally the sort of thing that could be addressed as bugs
outside development stage 1). (The bulk of the potential issues Zack
was concerned about in 2003-5, that resulted in extended identifiers
being disabled in the absence of -fextended-identifiers, were
effectively eliminated by the audit and fixes I did in 2009, however;
that todo list reflects what was left over after that audit.)
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.
libcpp:
* include/cpp-id-data.h (struct cpp_macro): Update comment
regarding parameters.
* include/cpplib.h (struct cpp_macro_arg, struct cpp_identifier):
Add spelling fields.
(struct cpp_token): Update comment on macro_arg.
* internal.h (_cpp_save_parameter): Add extra argument.
(_cpp_spell_ident_ucns): New declaration.
* lex.c (lex_identifier): Add SPELLING argument. Set *SPELLING to
original spelling of identifier.
(_cpp_lex_direct): Update calls to lex_identifier.
(_cpp_spell_ident_ucns): New function, factored out of
cpp_spell_token.
(cpp_spell_token): Adjust FORSTRING argument semantics to return
original spelling of identifiers. Use _cpp_spell_ident_ucns in
!FORSTRING case.
(_cpp_equiv_tokens): Check spellings of identifiers and macro
arguments are identical.
* macro.c (macro_arg_saved_data): New structure.
(paste_tokens): Use original spellings of identifiers from
cpp_spell_token.
(_cpp_save_parameter): Add argument SPELLING. Save both canonical
node and its value.
(parse_params): Update calls to _cpp_save_parameter.
(lex_expansion_token): Save spelling of macro argument tokens.
(_cpp_create_definition): Extract canonical node from saved data.
(cpp_macro_definition): Use UCNs in spelling of macro name. Use
original spellings of macro argument tokens and identifiers.
* traditional.c (scan_parameters): Update call to
_cpp_save_parameter.
gcc:
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to corner
cases of extended identifiers.
gcc/testsuite:
* g++.dg/cpp/ucnid-2.C, g++.dg/cpp/ucnid-3.C,
gcc.dg/cpp/ucnid-11.c, gcc.dg/cpp/ucnid-12.c,
gcc.dg/cpp/ucnid-13.c, gcc.dg/cpp/ucnid-14.c,
gcc.dg/cpp/ucnid-15.c: New tests.
From-SVN: r217202
2014-11-07 05:08:52 +08:00
|
|
|
extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
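As a rough illustration of the three conversions that ChangeLog entry names (prototypes without PARAMS, plain calls through function pointers, and adjacent string literal concatenation), here is a minimal sketch. The functions add, apply and banner are made up for this example and do not appear in libcpp; the "old" forms are shown only in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Old style: static int add PARAMS ((int, int));
   ISO C style: an ordinary prototype-form definition.  */
static int
add (int a, int b)
{
  return a + b;
}

/* Old style called through the pointer as (*fn) (a, b);
   ISO C lets the pointer be called directly.  */
static int
apply (int (*fn) (int, int), int a, int b)
{
  assert (fn != NULL);
  return fn (a, b);
}

/* Adjacent string literals are concatenated by the compiler, so a
   long message can be split across lines without backslash-newline.  */
static const char *
banner (void)
{
  return "libcpp: "
         "converted to ISO C";
}
```

Both forms compile to the same thing; the conversion is purely about source readability.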
|
|
|
extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *);
|
|
|
|
extern void _cpp_init_tokenrun (tokenrun *, unsigned int);
|
2011-10-23 01:49:18 +08:00
|
|
|
extern int _cpp_remaining_tokens_num_in_context (cpp_context *);
|
2011-12-08 06:05:59 +08:00
|
|
|
extern void _cpp_init_lexer (void);
|
2018-08-18 00:07:19 +08:00
|
|
|
static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have,
                                       size_t extra)
{
  if (BUFF_ROOM (pfile->a_buff) < (have + extra))
    _cpp_extend_buff (pfile, &pfile->a_buff, extra);
  return BUFF_FRONT (pfile->a_buff);
}
|
|
|
|
extern void *_cpp_commit_buff (cpp_reader *pfile, size_t size);
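The reserve-then-write pattern behind _cpp_reserve_room can be sketched outside libcpp roughly as follows. Everything here is hypothetical: struct demo_buff is not the layout of libcpp's buffers, the growth policy is invented, and demo_commit is only a simple counterpart for the sketch rather than a model of _cpp_commit_buff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a libcpp buffer; not the real layout.  */
struct demo_buff
{
  unsigned char *base;  /* Start of the allocation.  */
  size_t used;          /* Bytes already committed.  */
  size_t capacity;      /* Total bytes allocated.  */
};

/* Ensure HAVE pending bytes plus EXTRA new ones fit after the
   committed region; return a pointer to the first uncommitted byte.  */
static void *
demo_reserve_room (struct demo_buff *b, size_t have, size_t extra)
{
  if (b->capacity - b->used < have + extra)
    {
      /* Assumed growth policy: double what is needed.  realloc keeps
         the existing contents, including pending bytes.  */
      size_t new_cap = (b->used + have + extra) * 2;
      unsigned char *p = realloc (b->base, new_cap);
      assert (p != NULL);
      b->base = p;
      b->capacity = new_cap;
    }
  return b->base + b->used;
}

/* Mark SIZE bytes written at the front as committed.  */
static void
demo_commit (struct demo_buff *b, size_t size)
{
  assert (b->capacity - b->used >= size);
  b->used += size;
}

/* Typical use: reserve, write, commit.  */
static void
demo_append (struct demo_buff *b, const void *data, size_t len)
{
  void *front = demo_reserve_room (b, 0, len);
  memcpy (front, data, len);
  demo_commit (b, len);
}
```

The point of returning the front pointer on every call is that growing the buffer may move it, so callers should not cache a pointer across a reserve.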
|
Makefile.in (LIBCPP_OBJS): Add cpplex.o.
* Makefile.in (LIBCPP_OBJS): Add cpplex.o.
(cpplex.o): New target.
* po/POTFILES.in: Add cpplex.c.
* cpplex.c (_cpp_grow_token_buffer, null_cleanup,
cpp_push_buffer, cpp_pop_buffer, cpp_scan_buffer,
cpp_expand_to_buffer, cpp_buf_line_and_col, cpp_file_buffer,
skip_block_comment, skip_line_comment, skip_comment,
copy_comment, _cpp_skip_hspace, _cpp_skip_rest_of_line,
_cpp_parse_name, skip_string, parse_string,
_cpp_parse_assertion, cpp_get_token, cpp_get_non_space_token,
_cpp_get_directive_token, find_position,
_cpp_read_and_prescan, _cpp_init_input_buffer): Move here.
(maybe_macroexpand, _cpp_lex_token): New functions.
* cpplib.c (SKIP_WHITE_SPACE, eval_if_expr, parse_set_mark,
parse_goto_mark): Delete.
(_cpp_handle_eof): New function.
(_cpp_handle_directive): Rename from handle_directive.
(_cpp_output_line_command): Rename from output_line_command.
(do_if, do_elif): Call _cpp_parse_expr directly.
* cppfiles.c (_cpp_read_include_file): Don't call
init_input_buffer here.
* cpphash.c (quote_string): Move here, rename _cpp_quote_string.
* cppexp.c (_cpp_parse_expr): Diddle parsing_if_directive
here; pop the token_buffer and skip the rest of the line here.
* cppinit.c (cpp_start_read): Call _cpp_init_input_buffer
here.
* cpphash.h (CPP_RESERVE, CPP_IS_MACRO_BUFFER, ACTIVE_MARK_P):
Define here.
(CPP_SET_BUF_MARK, CPP_GOTO_BUF_MARK, CPP_SET_MARK,
CPP_GOTO_MARK): New macros.
(_cpp_quote_string, _cpp_parse_name, _cpp_skip_rest_of_line,
_cpp_skip_hspace, _cpp_parse_assertion, _cpp_lex_token,
_cpp_read_and_prescan, _cpp_init_input_buffer,
_cpp_grow_token_buffer, _cpp_get_directive_token,
_cpp_handle_directive, _cpp_handle_eof,
_cpp_output_line_command): Prototype them here.
* cpplib.h (enum cpp_token): Add CPP_MACRO.
(CPP_RESERVE, get_directive_token, cpp_grow_buffer,
quote_string, output_line_command): Remove.
From-SVN: r32513
2000-03-14 06:01:08 +08:00
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In init.cc. */
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_maybe_push_include_file (cpp_reader *);
|
2009-04-22 23:32:18 +08:00
|
|
|
extern const char *cpp_named_operator2name (enum cpp_ttype type);
|
2019-11-01 01:38:44 +08:00
|
|
|
extern void _cpp_restore_special_builtin (cpp_reader *pfile,
                                          struct def_pragma_macro *);
|
2001-08-21 14:20:18 +08:00
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In directives.cc */
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern int _cpp_test_assertion (cpp_reader *, unsigned int *);
|
2019-08-29 02:43:37 +08:00
|
|
|
extern int _cpp_handle_directive (cpp_reader *, bool);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_define_builtin (cpp_reader *, const char *);
|
|
|
|
extern char ** _cpp_save_pragma_names (cpp_reader *);
|
|
|
|
extern void _cpp_restore_pragma_names (cpp_reader *, char **);
|
2018-11-14 04:05:03 +08:00
|
|
|
extern int _cpp_do__Pragma (cpp_reader *, location_t);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_init_directives (cpp_reader *);
|
|
|
|
extern void _cpp_init_internal_pragmas (cpp_reader *);
|
libcpp, v2: Add support for gnu::base64 #embed parameter
This patch adds another #embed extension, gnu::base64.
As mentioned in the documentation, this extension is primarily
intended for use by the preprocessor, so that for the larger embeds
(say 32+ or 64+ bytes long) it doesn't have to emit tens of thousands or
millions of comma separated string literals which would be very expensive
to parse again, but can emit
#embed "." __gnu__::__base64__( \
"Tm9uIGVyYW0gbsOpc2NpdXMsIEJydXRlLCBjdW0sIHF1w6Ygc3VtbWlzIGluZ8OpbmlpcyBleHF1" \
"aXNpdMOhcXVlIGRvY3Ryw61uYSBwaGlsw7Nzb3BoaSBHcsOmY28gc2VybcOzbmUgdHJhY3RhdsOt" \
"c3NlbnQsIGVhIExhdMOtbmlzIGzDrXR0ZXJpcyBtYW5kYXLDqW11cywgZm9yZSB1dCBoaWMgbm9z" \
"dGVyIGxhYm9yIGluIHbDoXJpYXMgcmVwcmVoZW5zacOzbmVzIGluY8O6cnJlcmV0LiBuYW0gcXVp" \
"YsO6c2RhbSwgZXQgaWlzIHF1aWRlbSBub24gw6FkbW9kdW0gaW5kw7NjdGlzLCB0b3R1bSBob2Mg" \
"ZMOtc3BsaWNldCBwaGlsb3NvcGjDoXJpLiBxdWlkYW0gYXV0ZW0gbm9uIHRhbSBpZCByZXByZWjD" \
"qW5kdW50LCBzaSByZW3DrXNzaXVzIGFnw6F0dXIsIHNlZCB0YW50dW0gc3TDumRpdW0gdGFtcXVl" \
"IG11bHRhbSDDs3BlcmFtIHBvbsOpbmRhbSBpbiBlbyBub24gYXJiaXRyw6FudHVyLiBlcnVudCDD" \
"qXRpYW0sIGV0IGlpIHF1aWRlbSBlcnVkw610aSBHcsOmY2lzIGzDrXR0ZXJpcywgY29udGVtbsOp" \
"bnRlcyBMYXTDrW5hcywgcXVpIHNlIGRpY2FudCBpbiBHcsOmY2lzIGxlZ8OpbmRpcyDDs3BlcmFt" \
"IG1hbGxlIGNvbnPDum1lcmUuIHBvc3Ryw6ltbyDDoWxpcXVvcyBmdXTDunJvcyBzw7pzcGljb3Is" \
"IHF1aSBtZSBhZCDDoWxpYXMgbMOtdHRlcmFzIHZvY2VudCwgZ2VudXMgaG9jIHNjcmliw6luZGks" \
"IGV0c2kgc2l0IGVsw6lnYW5zLCBwZXJzw7Nuw6YgdGFtZW4gZXQgZGlnbml0w6F0aXMgZXNzZSBu" \
"ZWdlbnQu")
with the meaning: don't actually load any file; instead base64 decode
(RFC4648 with A-Za-z0-9+/ chars and = padding, no newlines in between)
the string and use that as the data. This form is chosen because it should
be -pedantic-errors clean and fairly cheap to decode, an optimizing
compiler can then handle it as a binary blob much like a normal #embed,
and the data isn't left somewhere on the disk, so distcc/ccache etc.
can move the preprocessed source without issues.
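The decoding described above (RFC 4648 alphabet, = padding, no embedded newlines) can be sketched as below. This is not the code from files.cc: the names b64_value and b64_decode are invented for the illustration, the character-range tests assume an ASCII execution character set, and non-canonical trailing bits are not diagnosed.

```c
#include <assert.h>
#include <stddef.h>

/* Map one base64 character to its 6-bit value, or -1 if invalid.  */
static int
b64_value (unsigned char c)
{
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

/* Decode LEN base64 characters from IN into OUT, which must have room
   for LEN / 4 * 3 bytes.  Return the number of bytes written, or -1
   if the input is malformed.  */
static long
b64_decode (const char *in, size_t len, unsigned char *out)
{
  size_t i, n = 0;

  assert (in != NULL && out != NULL);
  if (len % 4 != 0)
    return -1;
  for (i = 0; i < len; i += 4)
    {
      unsigned long v = 0;
      int pad = 0, j;

      for (j = 0; j < 4; j++)
        {
          unsigned char c = (unsigned char) in[i + j];
          int d;

          if (c == '=')
            {
              /* Padding may only fill the last one or two positions
                 of the final group.  */
              if (i + 4 != len || j < 2)
                return -1;
              pad++;
              d = 0;
            }
          else
            {
              d = b64_value (c);
              if (d < 0 || pad > 0)  /* Bad char, or data after '='.  */
                return -1;
            }
          v = (v << 6) | (unsigned long) d;
        }
      out[n++] = (unsigned char) (v >> 16);
      if (pad < 2)
        out[n++] = (unsigned char) ((v >> 8) & 0xff);
      if (pad < 1)
        out[n++] = (unsigned char) (v & 0xff);
    }
  return (long) n;
}
```

For instance, the first twelve characters of the string in the example directive, "Tm9uIGVyYW0g", decode to the nine bytes "Non eram ".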
It makes no sense to support limit and gnu::offset parameters together
with it IMHO, why would somebody provide the full data and then
throw some away? prefix/suffix/if_empty are normally supported though,
but not intended to be used by the preprocessor.
This patch adds just the extension side, not the actual emitting of this
during -E or -E -fdirectives-only for now, that will be included in the
upcoming patch.
Compared to the earlier posted version of this extension, this patch
allows the string concatenation in the parameter argument (but still
doesn't allow escapes in the string, why would anyone use them when
only A-Za-z0-9+/= are valid). The patch also adds support for parsing
this even in -fpreprocessed compilation.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
libcpp/
* internal.h (struct cpp_embed_params): Add base64 member.
(_cpp_free_embed_params_tokens): Declare.
* directives.cc (DIRECTIVE_TABLE): Add IN_I flag to T_EMBED.
(save_token_for_embed, _cpp_free_embed_params_tokens): New functions.
(EMBED_PARAMS): Add gnu::base64 entry.
(_cpp_parse_embed_params): Parse gnu::base64 parameter. If
-fpreprocessed without -fdirectives-only, require #embed to have
gnu::base64 parameter. Diagnose conflict between gnu::base64 and
limit or gnu::offset parameters.
(do_embed): Use _cpp_free_embed_params_tokens.
* files.cc (finish_embed, base64_dec_fn): New functions.
(base64_dec): New array.
(B64D0, B64D1, B64D2, B64D3): Define.
(finish_base64_embed): New function.
(_cpp_stack_embed): Use finish_embed. Handle params->base64
using finish_base64_embed.
* macro.cc (builtin_has_embed): Call _cpp_free_embed_params_tokens.
gcc/
* doc/cpp.texi (Binary Resource Inclusion): Document gnu::base64
parameter.
gcc/testsuite/
* c-c++-common/cpp/embed-17.c: New test.
* c-c++-common/cpp/embed-18.c: New test.
* c-c++-common/cpp/embed-19.c: New test.
* c-c++-common/cpp/embed-27.c: New test.
* gcc.dg/cpp/embed-6.c: New test.
* gcc.dg/cpp/embed-7.c: New test.
2024-09-13 00:17:05 +08:00
|
|
|
extern void _cpp_free_embed_params_tokens (cpp_embed_params_tokens *);
|
libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863]
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (like several gigabytes large, that is still
infeasible in compile time and memory).
There are 2 reasons for this. One is that I think the way it is implemented
now in the patch is how we should handle the smaller #embed sizes,
dunno with which boundary, whether 32 bytes or 64 or something like that,
certainly handling the single byte cases which is something that can appear
anywhere in the source where constant integer literal can appear is
desirable and I think for a few bytes it isn't worth it to come up with
something smarter and users would like to e.g. see it in -E readably as
well (perhaps the slow vs. fast boundary should be determined by command
line option). And the other one is to be able to more easily find
regressions in behavior caused by the optimizations, so we have something
to get back in git to compare against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots has been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather than what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am I sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
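The difference between the two readings can be reproduced without #embed at all, by writing out by hand what each one amounts to. The sketch below does that for the foo example; the helper names and the seen array are invented for the illustration, and the byte values 118, 111, 105, 100 are simply those quoted above.

```c
#include <assert.h>

/* Records the arguments of the most recent call to foo.  */
static int seen[4];

static void
foo (int a, int b, int c, int d)
{
  seen[0] = a;
  seen[1] = b;
  seen[2] = c;
  seen[3] = d;
}

/* The plain-sequence reading: prefix and suffix attach to the first
   and last numbers, and the commas separate four arguments.  */
static void
call_with_plain_sequence (void)
{
  foo (172 + 118, 111, 105, 100 + 2);
}

/* The parenthesized reading: the commas become comma operators, so
   the whole thing is one value.  Compilers typically warn about the
   discarded operands here (-Wunused-value).  */
static int
parenthesized_value (void)
{
  return 172 + (118, 111, 105, 100) + 2;
}

/* Self-check tying the two readings to the numbers in the text.  */
static void
check_embed_argument_shapes (void)
{
  call_with_plain_sequence ();
  assert (seen[0] == 290 && seen[3] == 102);
  assert (parenthesized_value () == 274);
}
```

With the parenthesized reading a call such as foo (...) would receive a single argument and fail to compile against the four-parameter prototype, which is why only its value is computed here.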
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
2024-09-12 17:15:38 +08:00
|
|
|
extern bool _cpp_parse_embed_params (cpp_reader *, struct cpp_embed_params *);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_do_file_change (cpp_reader *, enum lc_reason, const char *,
|
2008-07-21 17:33:38 +08:00
|
|
|
linenum_type, unsigned int);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_pop_buffer (cpp_reader *);
|
2014-10-01 19:49:23 +08:00
|
|
|
extern char *_cpp_bracket_include (cpp_reader *);
|
cpphash.h (U): New define, to correct type of string constants.
gcc:
* cpphash.h (U): New define, to correct type of string constants.
(ustrcmp, ustrncmp, ustrlen, uxstrdup, ustrchr): New wrapper
routines, to do casts when passing unsigned strings to libc.
* cppexp.c, cppfiles.c, cpphash.c, cppinit.c, cpplib.c: Use them.
* cppfiles.c (_cpp_execute_include): Make filename an U_CHAR *.
* cpphash.c (_cpp_quote_string): Make string an U_CHAR *.
* cppinit.c (dump_special_to_buffer): Make macro name an U_CHAR *.
* cpplex.c (parse_ifdef, parse_include, validate_else): Make
second argument an U_CHAR *.
* cppinit.c (builtin_array): Make name and value U_CHAR *, add
length field, clean up initializer.
(ISTABLE): Add __extension__ to designated-
initializers version.
* cpplex.c (CHARTAB): Likewise.
* mbchar.c: Add dummy external declaration to the !MULTIBYTE_CHARS
case so the file won't be empty.
include:
* symcat.h: Remove #endif label.
From-SVN: r33657
2000-05-04 12:38:01 +08:00
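The idea of that change, keeping the preprocessor's strings as unsigned char and doing the casts to plain char in one place, might look roughly like this. The names below are hypothetical: the commit spells its macro U and its wrappers ustrcmp, ustrlen and so on, but U"..." has been a char32_t literal prefix since C11, so this sketch uses USTR and a demo_ prefix instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char uchar;  /* Stands in for U_CHAR.  */

/* Give a string constant the unsigned type used for source text.  */
#define USTR(s) ((const uchar *) (s))

/* Wrappers that confine the casts needed to call libc.  */
static inline int
demo_ustrcmp (const uchar *a, const uchar *b)
{
  assert (a != NULL && b != NULL);
  return strcmp ((const char *) a, (const char *) b);
}

static inline size_t
demo_ustrlen (const uchar *s)
{
  assert (s != NULL);
  return strlen ((const char *) s);
}
```

Callers then write demo_ustrcmp (name, USTR ("define")) and never cast at the call site.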
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In errors.cc */
|
diagnostics: escape non-ASCII source bytes for certain diagnostics
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.
Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.
The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored"
- for -Wnormalized= (and generate source ranges for such warnings)
The escaping is controlled by a new option:
-fdiagnostics-escape-format=[unicode|bytes]
For example, consider a diagnostic involving a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.
By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:
beforeπXafter
^
(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)
If the diagnostic sets the "escape" flag, it will be printed as:
before<U+03C0><BF>after
^~~~~~~~
with -fdiagnostics-escape-format=unicode (the default), or as:
before<CF><80><BF>after
^~~~~~~~
if the user supplies -fdiagnostics-escape-format=bytes.
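The two escaping styles in that example can be approximated with a small standalone routine. This is not the code in diagnostic-show-locus.c: decode_utf8 and escape_source are names invented for the sketch, only non-ASCII bytes are escaped, and the UTF-8 decoder is simplified (overlong forms and surrogates are not rejected).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Try to decode one UTF-8 sequence at S with N bytes available.
   Store the code point in *CP and return its length in bytes, or
   return 0 if S does not start a well-formed sequence.  */
static size_t
decode_utf8 (const unsigned char *s, size_t n, unsigned long *cp)
{
  size_t len, i;
  unsigned long c;

  if (s[0] < 0x80)
    {
      *cp = s[0];
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
  else
    return 0;  /* Stray continuation byte or invalid lead byte.  */
  if (len > n)
    return 0;
  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;
      c = (c << 6) | (s[i] & 0x3F);
    }
  *cp = c;
  return len;
}

/* Escape LEN bytes of SRC into DST (DSTSZ bytes).  ASCII is copied.
   If AS_UNICODE, decodable sequences become <U+XXXX> and other bytes
   <XX>; otherwise every non-ASCII byte becomes <XX>.  Output is
   truncated if DST is too small.  */
static void
escape_source (const unsigned char *src, size_t len, int as_unicode,
               char *dst, size_t dstsz)
{
  size_t i = 0, o = 0;

  assert (dstsz > 0);
  dst[0] = '\0';
  /* 11 = longest escape "<U+1FFFFF>" plus the terminating NUL.  */
  while (i < len && o + 11 < dstsz)
    {
      unsigned long cp;
      size_t k = decode_utf8 (src + i, len - i, &cp);

      if (src[i] < 0x80)
        {
          dst[o++] = (char) src[i++];
          dst[o] = '\0';
        }
      else if (as_unicode && k > 0)
        {
          o += (size_t) sprintf (dst + o, "<U+%04lX>", cp);
          i += k;
        }
      else
        o += (size_t) sprintf (dst + o, "<%02X>", (unsigned) src[i++]);
    }
}
```

Fed the bytes of the example line (0xCF 0x80 for U+03C0 followed by a stray 0xBF), the routine produces the two escaped forms shown above, one per mode.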
This only affects how the source is printed; it does not affect
how column numbers are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
gcc/c-family/ChangeLog:
* c-lex.c (c_lex_with_flags): When complaining about non-printable
CPP_OTHER tokens, set the "escape on output" flag.
gcc/ChangeLog:
* common.opt (fdiagnostics-escape-format=): New.
(diagnostics_escape_format): New enum.
(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
* diagnostic-format-json.cc (json_end_diagnostic): Add
"escape-source" attribute.
* diagnostic-show-locus.c
(exploc_with_display_col::exploc_with_display_col): Replace
"tabstop" param with a cpp_char_column_policy and add an "aspect"
param. Use these to compute m_display_col accordingly.
(struct char_display_policy): New struct.
(layout::m_policy): New field.
(layout::m_escape_on_output): New field.
(def_policy): New function.
(make_range): Update for changes to exploc_with_display_col ctor.
(default_print_decoded_ch): New.
(width_per_escaped_byte): New.
(escape_as_bytes_width): New.
(escape_as_bytes_print): New.
(escape_as_unicode_width): New.
(escape_as_unicode_print): New.
(make_policy): New.
(layout::layout): Initialize new fields. Update m_exploc ctor
call for above change to ctor.
(layout::maybe_add_location_range): Update for changes to
exploc_with_display_col ctor.
(layout::calculate_x_offset_display): Update for change to
cpp_display_width.
(layout::print_source_line): Pass policy
to cpp_display_width_computation. Capture cpp_decoded_char when
calling process_next_codepoint. Move printing of source code to
m_policy.m_print_cb.
(line_label::line_label): Pass in policy rather than context.
(layout::print_any_labels): Update for change to line_label ctor.
(get_affected_range): Pass in policy rather than context, updating
calls to location_compute_display_column accordingly.
(get_printed_columns): Likewise, also for cpp_display_width.
(correction::correction): Pass in policy rather than tabstop.
(correction::compute_display_cols): Pass m_policy rather than
m_tabstop to cpp_display_width.
(correction::m_tabstop): Replace with...
(correction::m_policy): ...this.
(line_corrections::line_corrections): Pass in policy rather than
context.
(line_corrections::m_context): Replace with...
(line_corrections::m_policy): ...this.
(line_corrections::add_hint): Update to use m_policy rather than
m_context.
(line_corrections::add_hint): Likewise.
(layout::print_trailing_fixits): Likewise.
(selftest::test_display_widths): New.
(selftest::test_layout_x_offset_display_utf8): Update to use
policy rather than tabstop.
(selftest::test_one_liner_labels_utf8): Add test of escaping
source lines.
(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
use policy rather than tabstop.
(selftest::test_overlapped_fixit_printing): Likewise.
(selftest::test_overlapped_fixit_printing_utf8): Likewise.
(selftest::test_overlapped_fixit_printing_2): Likewise.
(selftest::test_tab_expansion): Likewise.
(selftest::test_escaping_bytes_1): New.
(selftest::test_escaping_bytes_2): New.
(selftest::diagnostic_show_locus_c_tests): Call the new tests.
* diagnostic.c (diagnostic_initialize): Initialize
context->escape_format.
(convert_column_unit): Update to use default character width policy.
(selftest::test_diagnostic_get_location_text): Likewise.
* diagnostic.h (enum diagnostics_escape_format): New enum.
(diagnostic_context::escape_format): New field.
* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
(-fdiagnostics-format=): Add "escape-source" attribute to examples
of JSON output, and document it.
* input.c (location_compute_display_column): Pass in "policy"
rather than "tabstop", passing to
cpp_byte_column_to_display_column.
(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
* input.h (class cpp_char_column_policy): New forward decl.
(location_compute_display_column): Pass in "policy" rather than
"tabstop".
* opts.c (common_handle_option): Handle
OPT_fdiagnostics_escape_format_.
* selftest.c (temp_source_file::temp_source_file): New ctor
overload taking a size_t.
* selftest.h (temp_source_file::temp_source_file): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
"escape-source" attribute.
* c-c++-common/diagnostic-format-json-2.c: Likewise.
* c-c++-common/diagnostic-format-json-3.c: Likewise.
* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
* c-c++-common/diagnostic-format-json-5.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
* gcc.dg/encoding-issues-bytes.c: New test.
* gcc.dg/encoding-issues-unicode.c: New test.
* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
"escape-source" attribute.
* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
libcpp/ChangeLog:
* charset.c (convert_escape): Use encoding_rich_location when
complaining about nonprintable unknown escape sequences.
(cpp_display_width_computation::cpp_display_width_computation):
Pass in policy rather than tabstop.
(cpp_display_width_computation::process_next_codepoint): Add "out"
param and populate *out if non-NULL.
(cpp_display_width_computation::advance_display_cols): Pass NULL
to process_next_codepoint.
(cpp_byte_column_to_display_column): Pass in policy rather than
tabstop. Pass NULL to process_next_codepoint.
(cpp_display_column_to_byte_column): Pass in policy rather than
tabstop.
* errors.c (cpp_diagnostic_get_current_location): New function,
splitting out the logic from...
(cpp_diagnostic): ...here.
(cpp_warning_at): New function.
(cpp_pedwarning_at): New function.
* include/cpplib.h (cpp_warning_at): New decl for rich_location.
(cpp_pedwarning_at): Likewise.
(struct cpp_decoded_char): New.
(struct cpp_char_column_policy): New.
(cpp_display_width_computation::cpp_display_width_computation):
Replace "tabstop" param with "policy".
(cpp_display_width_computation::process_next_codepoint): Add "out"
param.
(cpp_display_width_computation::m_tabstop): Replace with...
(cpp_display_width_computation::m_policy): ...this.
(cpp_byte_column_to_display_column): Replace "tabstop" param with
"policy".
(cpp_display_width): Likewise.
(cpp_display_column_to_byte_column): Likewise.
* include/line-map.h (rich_location::escape_on_output_p): New.
(rich_location::set_escape_on_output): New.
(rich_location::m_escape_on_output): New.
* internal.h (cpp_diagnostic_get_current_location): New decl.
(class encoding_rich_location): New.
* lex.c (skip_whitespace): Use encoding_rich_location when
complaining about null characters.
(warn_about_normalization): Generate a source range when
complaining about improperly normalized tokens, rather than just a
point, and use encoding_rich_location so that the source code
is escaped on printing.
* line-map.c (rich_location::rich_location): Initialize
m_escape_on_output.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-10-19 06:55:31 +08:00
|
|
|
extern location_t cpp_diagnostic_get_current_location (cpp_reader *);
|
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In traditional.cc. */
|
2015-03-23 16:02:39 +08:00
|
|
|
extern bool _cpp_scan_out_logical_line (cpp_reader *, cpp_macro *, bool);
|
* cpplib.h, cpphash.h, cppcharset.c, cpperror.c, cppexp.c
* cppfiles.c, cpphash.c, cppinit.c, cpplex.c, cpplib.c
* cppmacro.c, cpppch.c, cpptrad.c, cppspec.c: Convert to
ISO C: new-style function declarations, no need for PARAMS,
no special punctuation on indirect function calls, use string
constant concatenation where convenient.
From-SVN: r68070
2003-06-17 14:17:44 +08:00
|
|
|
extern bool _cpp_read_logical_line_trad (cpp_reader *);
|
2004-11-28 05:59:38 +08:00
|
|
|
extern void _cpp_overlay_buffer (cpp_reader *pfile, const unsigned char *,
                                 size_t);
|
2003-06-17 14:17:44 +08:00
|
|
|
extern void _cpp_remove_overlay (cpp_reader *);
|
2018-08-18 00:07:19 +08:00
|
|
|
extern cpp_macro *_cpp_create_trad_definition (cpp_reader *);
|
2003-06-17 14:17:44 +08:00
|
|
|
extern bool _cpp_expansions_different_trad (const cpp_macro *,
                                            const cpp_macro *);
|
2004-11-28 05:59:38 +08:00
|
|
|
extern unsigned char *_cpp_copy_replacement_text (const cpp_macro *,
                                                  unsigned char *);
|
2003-06-17 14:17:44 +08:00
|
|
|
extern size_t _cpp_replacement_text_len (const cpp_macro *);
|
2002-05-18 04:16:48 +08:00
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In charset.cc. */
|
2005-03-15 08:36:33 +08:00
|
|
|
|
|
|
|
/* The normalization state at this point in the sequence.
   It starts initialized to all zeros, and at the end
   'level' is the normalization level of the sequence. */

struct normalize_state
{
|
2013-11-16 08:05:08 +08:00
|
|
|
/* The previous starter character. */
|
2005-03-15 08:36:33 +08:00
|
|
|
cppchar_t previous;
|
2013-11-16 08:05:08 +08:00
|
|
|
  /* The combining class of the previous character (whether or not a
     starter). */
|
2005-03-15 08:36:33 +08:00
|
|
|
  unsigned char prev_class;
  /* The lowest normalization level so far. */
  enum cpp_normalize_level level;
};
#define INITIAL_NORMALIZE_STATE { 0, 0, normalized_KC }
#define NORMALIZE_STATE_RESULT(st) ((st)->level)
|
|
|
|
|
2013-11-16 08:05:08 +08:00
|
|
|
/* We saw a character C that matches ISIDNUM(), update a
|
2005-03-15 08:36:33 +08:00
|
|
|
normalize_state appropriately. */
|
2013-11-16 08:05:08 +08:00
|
|
|
#define NORMALIZE_STATE_UPDATE_IDNUM(st, c) \
  ((st)->previous = (c), (st)->prev_class = 0)
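A minimal sketch of how the three normalize_state macros fit together. The struct and macros are copied from the declarations above so the block compiles on its own. The `cppchar_t` typedef and the `cpp_normalize_level` enum are stand-ins for the real definitions in cpplib.h, and `scan_ascii_identifier` is a hypothetical helper, not a libcpp function.

```c
#include <stddef.h>

/* Stand-ins for the definitions that live in cpplib.h (assumed here so
   the sketch is self-contained).  */
typedef unsigned int cppchar_t;
enum cpp_normalize_level
{
  normalized_KC = 0,
  normalized_C,
  normalized_identifier_C,
  normalized_none
};

/* Copied from the header above.  */
struct normalize_state
{
  cppchar_t previous;
  unsigned char prev_class;
  enum cpp_normalize_level level;
};
#define INITIAL_NORMALIZE_STATE { 0, 0, normalized_KC }
#define NORMALIZE_STATE_RESULT(st) ((st)->level)
#define NORMALIZE_STATE_UPDATE_IDNUM(st, c) \
  ((st)->previous = (c), (st)->prev_class = 0)

/* Hypothetical helper: feed every byte of an ASCII identifier through
   the state and report the resulting normalization level.  The update
   macro never touches LEVEL, so the initial level is returned.  */
static enum cpp_normalize_level
scan_ascii_identifier (const char *id)
{
  struct normalize_state nst = INITIAL_NORMALIZE_STATE;

  for (; *id != '\0'; id++)
    NORMALIZE_STATE_UPDATE_IDNUM (&nst, (cppchar_t) (unsigned char) *id);

  return NORMALIZE_STATE_RESULT (&nst);
}
```

Because NORMALIZE_STATE_UPDATE_IDNUM only resets the previous character and its combining class, plain identifier characters leave the level at its initial value; in this sketch only code that writes `level` directly could lower it.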
|
2005-03-15 08:36:33 +08:00
|
|
|
|
2015-07-03 02:54:41 +08:00
|
|
|
extern bool _cpp_valid_ucn (cpp_reader *, const unsigned char **,
                            const unsigned char *, int,
                            struct normalize_state *state,
|
On-demand locations within string-literals
gcc/c-family/ChangeLog:
* c-common.c: Include "substring-locations.h".
(get_cpp_ttype_from_string_type): New function.
(g_string_concat_db): New global.
(substring_loc::get_range): New method.
* c-common.h (g_string_concat_db): New declaration.
(class substring_loc): New class.
* c-lex.c (lex_string): When concatenating strings, capture the
locations of all tokens using a new obstack, and record the
concatenation locations within g_string_concat_db.
* c-opts.c (c_common_init_options): Construct g_string_concat_db
on the ggc-heap.
gcc/ChangeLog:
* input.c (string_concat::string_concat): New constructor.
(string_concat_db::string_concat_db): New constructor.
(string_concat_db::record_string_concatenation): New method.
(string_concat_db::get_string_concatenation): New method.
(string_concat_db::get_key_loc): New method.
(class auto_cpp_string_vec): New class.
(get_substring_ranges_for_loc): New function.
(get_source_range_for_substring): New function.
(get_num_source_ranges_for_substring): New function.
(class selftest::lexer_test_options): New class.
(struct selftest::lexer_test): New struct.
(class selftest::ebcdic_execution_charset): New class.
(selftest::ebcdic_execution_charset::s_singleton): New variable.
(selftest::lexer_test::lexer_test): New constructor.
(selftest::lexer_test::~lexer_test): New destructor.
(selftest::lexer_test::get_token): New method.
(selftest::assert_char_at_range): New function.
(ASSERT_CHAR_AT_RANGE): New macro.
(selftest::assert_num_substring_ranges): New function.
(ASSERT_NUM_SUBSTRING_RANGES): New macro.
(selftest::assert_has_no_substring_ranges): New function.
(ASSERT_HAS_NO_SUBSTRING_RANGES): New macro.
(selftest::test_lexer_string_locations_simple): New function.
(selftest::test_lexer_string_locations_ebcdic): New function.
(selftest::test_lexer_string_locations_hex): New function.
(selftest::test_lexer_string_locations_oct): New function.
(selftest::test_lexer_string_locations_letter_escape_1): New function.
(selftest::test_lexer_string_locations_letter_escape_2): New function.
(selftest::test_lexer_string_locations_ucn4): New function.
(selftest::test_lexer_string_locations_ucn8): New function.
(selftest::uint32_from_big_endian): New function.
(selftest::test_lexer_string_locations_wide_string): New function.
(selftest::uint16_from_big_endian): New function.
(selftest::test_lexer_string_locations_string16): New function.
(selftest::test_lexer_string_locations_string32): New function.
(selftest::test_lexer_string_locations_u8): New function.
(selftest::test_lexer_string_locations_utf8_source): New function.
(selftest::test_lexer_string_locations_concatenation_1): New
function.
(selftest::test_lexer_string_locations_concatenation_2): New
function.
(selftest::test_lexer_string_locations_concatenation_3): New
function.
(selftest::test_lexer_string_locations_macro): New function.
(selftest::test_lexer_string_locations_stringified_macro_argument):
New function.
(selftest::test_lexer_string_locations_non_string): New function.
(selftest::test_lexer_string_locations_long_line): New function.
(selftest::test_lexer_char_constants): New function.
(selftest::input_c_tests): Call the new test functions once per
case within the line_table test matrix.
* input.h (struct string_concat): New struct.
(struct location_hash): New struct.
(class string_concat_db): New class.
* substring-locations.h: New header.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-string-literals-1.c: New file.
* gcc.dg/plugin/diagnostic-test-string-literals-2.c: New file.
* gcc.dg/plugin/diagnostic_plugin_test_string_literals.c: New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above new files.
libcpp/ChangeLog:
* charset.c (cpp_substring_ranges::cpp_substring_ranges): New
constructor.
(cpp_substring_ranges::~cpp_substring_ranges): New destructor.
(cpp_substring_ranges::add_range): New method.
(cpp_substring_ranges::add_n_ranges): New method.
(_cpp_valid_ucn): Add "char_range" and "loc_reader" params; if
they are non-NULL, read position information from *loc_reader
and update char_range->m_finish accordingly.
(convert_ucn): Add "char_range", "loc_reader", and "ranges"
params. If loc_reader is non-NULL, read location information from
it, and update *ranges accordingly, using char_range.
Conditionalize the conversion into tbuf on tbuf being non-NULL.
(convert_hex): Likewise, conditionalizing the call to
emit_numeric_escape on tbuf.
(convert_oct): Likewise.
(convert_escape): Add params "loc_reader" and "ranges". If
loc_reader is non-NULL, read location information from it, and
update *ranges accordingly. Conditionalize the conversion into
tbuf on tbuf being non-NULL.
(cpp_interpret_string): Rename to...
(cpp_interpret_string_1): ...this, adding params "loc_readers" and
"out". Use "to" to conditionalize the initialization and usage of
"tbuf", such as running the converter. If "loc_readers" is
non-NULL, use the instances within it, reading location
information from them, and passing them to convert_escape; likewise
write to "out" if loc_readers is non-NULL. Check for leading
quote and issue an error if it is not present. Update boundary
check from "== limit" to ">= limit" to protect against erroneous
location values to calls that are not parsing string literals.
(cpp_interpret_string): Reimplement in terms of
cpp_interpret_string_1.
(noop_error_cb): New function.
(cpp_interpret_string_ranges): New function.
(cpp_string_location_reader::cpp_string_location_reader): New
constructor.
(cpp_string_location_reader::get_next): New method.
* include/cpplib.h (class cpp_string_location_reader): New class.
(class cpp_substring_ranges): New class.
(cpp_interpret_string_ranges): New prototype.
* internal.h (_cpp_valid_ucn): Add params "char_range" and
"loc_reader".
* lex.c (forms_identifier_p): Pass NULL for new params to
_cpp_valid_ucn.
From-SVN: r239175
2016-08-06 02:08:33 +08:00
|
|
|
                            cppchar_t *,
                            source_range *char_range,
                            cpp_string_location_reader *loc_reader);
|
2019-09-20 03:56:11 +08:00
|
|
|
|
|
|
|
extern bool _cpp_valid_utf8 (cpp_reader *pfile,
                             const uchar **pstr,
                             const uchar *limit,
                             int identifier_pos,
                             struct normalize_state *nst,
                             cppchar_t *cp);
|
|
|
|
|
cppcharset.c (one_utf8_to_cppchar, [...]): New functions.
* cppcharset.c (one_utf8_to_cppchar, one_cppchar_to_utf8,
one_utf8_to_utf32, one_utf32_to_utf8, one_utf8_to_utf16,
one_utf16_to_utf8, conversion_loop, convert_utf8_utf16,
convert_utf8_utf32, convert_utf16_utf8, convert_utf32_utf8,
convert_no_conversion, convert_using_iconv): New functions.
(APPLY_CONVERSION): New macro.
(struct conversion, conversion_tab): New data structure.
(init_iconv_desc): Check conversion_tab for a custom conversion
primitive before trying to use iconv.
(convert_cset): Deleted.
(cpp_init_iconv): Use UTF- terminology, not UCS-.
(_cpp_destroy_iconv): Update to match.
(_cpp_valid_ucn): We don't need iconv to implement UCNs.
(convert_ucn): Use one_cppchar_to_utf8 and APPLY_CONVERSION.
(convert_escape, cpp_interpret_string): Use APPLY_CONVERSION.
(_cpp_interpret_string_notranslate): New function, moved here
from cpplib.c.
* cpphash.h (convert_f, struct cset_converter): New types.
(struct cpp_reader): narrow_cset_desc and wide_cset_desc
are now struct cset_converter, not bare iconv_t.
Update prototypes.
* cpplib.c (interpret_string_notranslate): Moved to cppcharset.c;
all callers changed.
From-SVN: r69204
2003-07-11 07:16:31 +08:00
|
|
|
extern void _cpp_destroy_iconv (cpp_reader *);
|
2004-11-28 05:59:38 +08:00
|
|
|
extern unsigned char *_cpp_convert_input (cpp_reader *, const char *,
                                          unsigned char *, size_t, size_t,
|
2008-04-21 22:02:00 +08:00
|
|
|
const unsigned char **, off_t *);
|
2004-02-03 04:20:58 +08:00
|
|
|
extern const char *_cpp_default_encoding (void);
|
2005-03-12 18:44:06 +08:00
|
|
|
extern cpp_hashnode * _cpp_interpret_identifier (cpp_reader *pfile,
                                                  const unsigned char *id,
                                                  size_t len);
|
2003-04-20 15:29:23 +08:00
|
|
|
|
cppfiles.c: Include splay-tree.h, not hashtab.h.
* cppfiles.c: Include splay-tree.h, not hashtab.h.
(redundant_include_p, make_IHASH, hash_IHASH, eq_IHASH): Delete.
(destroy_include_file_node): New.
(_cpp_init_include_hash): Rename _cpp_init_include_table.
Create a splay tree, not a hash table.
(open_include_file): Look up the path in the include table,
do the multiple include optimization here, etc.
(cpp_included): Walk the path.
(find_include_file): Just walk the path calling
open_include_file, or call it directly for an absolute path.
(_cpp_fake_ihash): Rename _cpp_fake_include and update for new
scheme.
(read_include_file): Update for new scheme. Don't close the
file unless reading fails.
(_cpp_execute_include, cpp_read_file): Tweak for new scheme.
* cpphash.h (struct ihash, NEVER_REINCLUDE): Delete.
(struct include_file): New.
(NEVER_REREAD, DO_NOT_REREAD, CPP_IN_SYSTEM_HEADER): New
macros.
(CPP_PEDANTIC, CPP_WTRADITIONAL): Update.
Update prototypes.
* cppinit.c: Include splay-tree.h.
(cpp_reader_init, cpp_cleanup): Update.
* cpplib.h (struct cpp_buffer): Change ihash field to
'struct include_file *inc'. Remove system_header_p.
(struct cpp_reader): Change all_include_files to a
struct splay_tree_s *.
* cpplex.c: Update all references to cpp_buffer->ihash and/or
cpp_buffer->system_header_p.
(cpp_pop_buffer): Close file here, only if DO_NOT_REREAD.
From-SVN: r34636
2000-06-22 02:33:51 +08:00
|
|
|
/* Utility routines and macros. */
|
2004-11-28 05:59:38 +08:00
|
|
|
#define DSC(str) (const unsigned char *)str, sizeof str - 1
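A short sketch of what DSC is for: it expands a string literal into two arguments, a `const unsigned char *` pointer and the literal's length without the trailing NUL. The macro is copied from the line above; `bytes_equal` and `is_defined_keyword` are hypothetical helpers invented for illustration and are not libcpp functions.

```c
#include <stddef.h>
#include <string.h>

/* Copied from the header above.  */
#define DSC(str) (const unsigned char *)str, sizeof str - 1

/* Hypothetical helper: return 1 if the LEN bytes at BUF are exactly the
   LIT_LEN bytes at LIT.  */
static int
bytes_equal (const unsigned char *buf, size_t len,
             const unsigned char *lit, size_t lit_len)
{
  return len == lit_len && memcmp (buf, lit, len) == 0;
}

/* Hypothetical caller: DSC ("defined") supplies both the pointer and
   the length (7) in one go.  */
static int
is_defined_keyword (const unsigned char *buf, size_t len)
{
  return bytes_equal (buf, len, DSC ("defined"));
}
```

Since the length comes from `sizeof` on the literal, DSC only works with string literals or arrays; passing a pointer would yield the pointer size minus one.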
|
2000-06-22 02:33:51 +08:00
|
|
|
|
cpplib.h (cpp_pool, [...]): Move to cpphash.h (struct macro_args): Delete.
* cpplib.h (cpp_pool, mi_state, mi_ind, struct cpp_macro,
struct cpp_chunk, struct htab, struct toklist,
struct cpp_context, CPP_STACK_MAX, struct lexer_state,
struct spec_nodes, struct cpp_reader, CPP_OPTION, CPP_BUFFER,
CPP_BUF_LINE, CPP_BUF_COL, CPP_BUF_COLUMN, U, ustrcmp, ustrncmp,
ustrlen, uxstrdup, ustrchr, ufputs): Move to cpphash.h
(struct macro_args): Delete.
* cpphash.h: See above.
From-SVN: r38984
2001-01-13 22:32:59 +08:00
|
|
|
/* These are inline functions instead of macros so we can get type
   checking. */
|
2004-11-28 05:59:38 +08:00
|
|
|
static inline int ustrcmp (const unsigned char *, const unsigned char *);
static inline int ustrncmp (const unsigned char *, const unsigned char *,
                            size_t);
static inline size_t ustrlen (const unsigned char *);
|
2011-11-03 04:22:53 +08:00
|
|
|
static inline const unsigned char *uxstrdup (const unsigned char *);
static inline const unsigned char *ustrchr (const unsigned char *, int);
|
2004-11-28 05:59:38 +08:00
|
|
|
static inline int ufputs (const unsigned char *, FILE *);
|
2001-01-13 22:32:59 +08:00
|
|
|
|
2005-02-14 16:52:24 +08:00
|
|
|
/* Use a const char for the second parameter since it is usually a literal. */
static inline int ustrcspn (const unsigned char *, const char *);
|
|
|
|
|
2001-01-13 22:32:59 +08:00
|
|
|
static inline int
|
2004-11-28 05:59:38 +08:00
|
|
|
ustrcmp (const unsigned char *s1, const unsigned char *s2)
|
2001-01-13 22:32:59 +08:00
|
|
|
{
  return strcmp ((const char *)s1, (const char *)s2);
}

static inline int
|
2004-11-28 05:59:38 +08:00
|
|
|
ustrncmp (const unsigned char *s1, const unsigned char *s2, size_t n)
|
2001-01-13 22:32:59 +08:00
|
|
|
{
  return strncmp ((const char *)s1, (const char *)s2, n);
}
|
|
|
|
|
2005-02-14 16:52:24 +08:00
|
|
|
static inline int
ustrcspn (const unsigned char *s1, const char *s2)
{
  return strcspn ((const char *)s1, s2);
}
|
|
|
|
|
2001-01-13 22:32:59 +08:00
|
|
|
static inline size_t
|
2004-11-28 05:59:38 +08:00
|
|
|
ustrlen (const unsigned char *s1)
|
2001-01-13 22:32:59 +08:00
|
|
|
{
  return strlen ((const char *)s1);
}
|
|
|
|
|
2011-11-03 04:22:53 +08:00
|
|
|
static inline const unsigned char *
|
2004-11-28 05:59:38 +08:00
|
|
|
uxstrdup (const unsigned char *s1)
|
2001-01-13 22:32:59 +08:00
|
|
|
{
|
2011-11-03 04:22:53 +08:00
|
|
|
return (const unsigned char *) xstrdup ((const char *)s1);
|
2001-01-13 22:32:59 +08:00
|
|
|
}
|
|
|
|
|
2011-11-03 04:22:53 +08:00
|
|
|
static inline const unsigned char *
|
2004-11-28 05:59:38 +08:00
|
|
|
ustrchr (const unsigned char *s1, int c)
|
2001-01-13 22:32:59 +08:00
|
|
|
{
|
2011-11-03 04:22:53 +08:00
|
|
|
return (const unsigned char *) strchr ((const char *)s1, c);
|
2001-01-13 22:32:59 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline int
|
2004-11-28 05:59:38 +08:00
|
|
|
ufputs (const unsigned char *s, FILE *f)
|
2001-01-13 22:32:59 +08:00
|
|
|
{
  return fputs ((const char *)s, f);
}
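The wrappers above exist so that code working on `unsigned char` buffers can call the usual string routines without a cast at every call site, while still getting type checking. A minimal sketch of that usage follows; three of the wrappers are copied here so the block compiles on its own, and `first_word_len` is a hypothetical helper, not part of libcpp.

```c
#include <stddef.h>
#include <string.h>

/* Copies of three wrappers from the header above.  */
static inline int
ustrcmp (const unsigned char *s1, const unsigned char *s2)
{
  return strcmp ((const char *)s1, (const char *)s2);
}

static inline size_t
ustrlen (const unsigned char *s1)
{
  return strlen ((const char *)s1);
}

static inline int
ustrcspn (const unsigned char *s1, const char *s2)
{
  return strcspn ((const char *)s1, s2);
}

/* Hypothetical helper: number of bytes in LINE before the first space
   or tab, or the whole length if there is none.  */
static int
first_word_len (const unsigned char *line)
{
  return ustrcspn (line, " \t");
}
```

Passing a plain `char *` to any of these would draw a pointer-signedness diagnostic, which is the type checking the header comment refers to; a macro that cast its argument would silently accept it.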
|
|
|
|
|
2022-01-14 23:57:02 +08:00
|
|
|
/* In line-map.cc. */
|
Linemap infrastructure for virtual locations
This is the first instalment of a set whose goal is to track locations
of tokens across macro expansions. Tom Tromey did the original work
and attached the patch to PR preprocessor/7263. This opus is a
derivative of that original work.
This patch modifies the linemap module of libcpp to add virtual
locations support.
A virtual location is a mapped location that can resolve to several
different physical locations. It can always resolve to the spelling
location of a token. For tokens resulting from macro expansion it can
resolve to:
- either the location of the expansion point of the macro.
- or the location of the token in the definition of the
macro
- or, if the token is an argument of a function-like macro,
the location of the use of the matching macro parameter in
the definition of the macro
The patch creates a new type of line map called a macro map. For every
single macro expansion, there is a macro map that generates a virtual
location for every single resulting token of the expansion.
The good old type of line map we all know is now called an ordinary
map. That one still encodes spelling locations as it always has.
As a result, linemap_lookup has been extended to return a macro map when
given a virtual location resulting from a macro expansion. The layout
of struct line_map has changed to support this new type of map. So
did the layout of struct line_maps. Accessor macros have been
introduced to avoid messing with the implementation details of these
data structures directly. This helped already as we have been testing
different ways of arranging these data structures. Having to constantly
adjust client code that is too tied to the internals of line_map and
line_maps would have been even more painful.
Of course, many new public functions have been added to the linemap
module to handle the resolution of virtual locations.
This patch introduces the infrastructure but no part of the compiler
uses virtual locations yet.
However, the client code of the linemap data structures has been
adjusted as per the changes. E.g., it is no longer reliable for
client code to manipulate struct line_map directly if it just wants to
deal with spelling locations, because struct line_map can now
represent a macro map as well. In that case, it's better to use the
convenient API to resolve the initial (possibly virtual) location to a
spelling location (or to an ordinary map) and use that.
This is the reason why the patch adjusts the Java, Ada and Fortran
front ends.
Also, note that virtual locations are not supposed to be ordered for
relations '<' and '>' anymore. To test if a virtual location appears
"before" another one, one has to use a new operator exposed by the
line map interface. The patch updates the only spot (in the
diagnostics module) I have found that was making the assumption that
locations were ordered for these relations. This is the only change
that introduces a use of the new line map API in this patch, so I am
adding a regression test for it only.
From-SVN: r180081
2011-10-17 17:58:56 +08:00
|
|
|
|
|
|
|
/* Create and return a virtual location for a token that is part of a
   macro expansion-list at a macro expansion point. See the comment
   inside struct line_map_macro to see what an expansion-list exactly
   is.

   A call to this function must come after a call to
   linemap_enter_macro.

   MAP is the map into which the source location is created. TOKEN_NO
   is the index of the token in the macro replacement-list, starting
   at number 0.

   ORIG_LOC is the location of the token outside of this macro
   expansion. If the token comes originally from the macro
   definition, it is the locus in the macro definition; otherwise it
   is a location in the context of the caller of this macro expansion
   (which is a virtual location or a source location if the caller is
   itself a macro expansion or not).

   MACRO_DEFINITION_LOC is the location in the macro definition,
   either of the token itself or of a macro parameter that it
   replaces. */
|
2018-11-14 04:05:03 +08:00
|
|
|
location_t linemap_add_macro_token (const line_map_macro *,
|
|
|
|
unsigned int,
|
|
|
|
location_t,
|
|
|
|
location_t);
|
Linemap infrastructure for virtual locations
This is the first instalment of a set whose goal is to track locations
of tokens across macro expansions. Tom Tromey did the original work
and attached the patch to PR preprocessor/7263. This opus is a
derivative of that original work.
This patch modifies the linemap module of libcpp to add virtual
locations support.
A virtual location is a mapped location that can resolve to several
different physical locations. It can always resolve to the spelling
location of a token. For tokens resulting from macro expansion it can
resolve to:
- either the location of the expansion point of the macro.
- or the location of the token in the definition of the
macro
- or, if the token is an argument of a function-like macro,
the location of the use of the matching macro parameter in
the definition of the macro
The patch creates a new type of line map called a macro map. For every
single macro expansion, there is a macro map that generates a virtual
location for every single resulting token of the expansion.
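
As a rough illustration of the paragraph above, the following toy model shows how one virtual location could resolve to three different physical locations. It is only a sketch: ToyMacroMap, loc_t and the resolve_* helpers are hypothetical names invented for this example and are not the libcpp API; the layout (two stored locations per expanded token) is an assumption chosen to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a source location; hypothetical, not libcpp's location_t.
using loc_t = std::uint32_t;

// Toy macro map: one instance per macro expansion (hypothetical layout).
struct ToyMacroMap {
  loc_t start;            // first virtual location handed out by this map
  loc_t expansion_point;  // where the macro was invoked
  // For token i: locs[2*i] is its spelling location and
  // locs[2*i + 1] is its location in the macro definition.
  std::vector<loc_t> locs;

  // Record one token of the expansion and return its virtual location.
  loc_t add_token(loc_t spelling, loc_t in_definition) {
    loc_t virt = start + static_cast<loc_t>(locs.size() / 2);
    locs.push_back(spelling);
    locs.push_back(in_definition);
    return virt;
  }

  // True if VIRT was handed out by this map.
  bool contains(loc_t virt) const {
    return virt >= start && virt - start < locs.size() / 2;
  }

  // Every token of the expansion shares the same expansion point.
  loc_t resolve_to_expansion_point(loc_t virt) const {
    assert(contains(virt));
    return expansion_point;
  }

  // Where the token's characters were actually written.
  loc_t resolve_to_spelling(loc_t virt) const {
    assert(contains(virt));
    return locs[2 * (virt - start)];
  }

  // Where the token (or the parameter it replaces) sits in the definition.
  loc_t resolve_to_definition(loc_t virt) const {
    assert(contains(virt));
    return locs[2 * (virt - start) + 1];
  }
};
```

In this sketch a token that comes from the macro body has the same spelling and definition location, while a macro argument is spelled at the call site and maps back to the use of its parameter in the definition.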
The good old type of line map we all know is now called an ordinary
map. That one still encodes spelling locations as it always has.
As a result linemap_lookup has been extended to return a macro map when
given a virtual location resulting from a macro expansion. The layout
of struct line_map has changed to support this new type of map. So
did the layout of struct line_maps. Accessor macros have been
introduced to avoid messing with the implementation details of these
data structures directly. This has already helped, as we have been testing
different ways of arranging these data structures. Having to constantly
adjust client code that is too tied to the internals of line_map and
line_maps would have been even more painful.
Of course, many new public functions have been added to the linemap
module to handle the resolution of virtual locations.
This patch introduces the infrastructure but no part of the compiler
uses virtual locations yet.
However the client code of the linemap data structures has been
adjusted as per the changes. E.g., it is no longer reliable for
client code to manipulate struct line_map directly if it just wants to
deal with spelling locations, because struct line_map can now
represent a macro map as well. In that case, it's better to use the
convenient API to resolve the initial (possibly virtual) location to a
spelling location (or to an ordinary map) and use that.
This is the reason why the patch adjusts the Java, Ada and Fortran
front ends.
Also, note that virtual locations are not supposed to be ordered for
relations '<' and '>' anymore. To test if a virtual location appears
"before" another one, one has to use a new operator exposed by the
line map interface. The patch updates the only spot (in the
diagnostics module) I have found that was making the assumption that
locations were ordered for these relations. This is the only change
that introduces a use of the new line map API in this patch, so I am
adding a regression test for it only.
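
To make the ordering point concrete, here is a small sketch of why raw '<' stops being meaningful once virtual locations exist, and what a dedicated comparison might do instead. Everything in it is an assumption for illustration: ToyExpansion, find_expansion and toy_compare_locations are hypothetical names, and the numbering scheme (later expansions receiving smaller virtual numbers) is just one way the raw values could disagree with source order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a source location; hypothetical, not libcpp's location_t.
using loc_t = std::uint32_t;

// One macro expansion in the toy model.
struct ToyExpansion {
  loc_t first_virtual;    // virtual location of the expansion's first token
  loc_t n_tokens;         // number of tokens produced by the expansion
  loc_t expansion_point;  // ordinary location of the macro invocation
};

// Return the expansion that handed out LOC, or nullptr if LOC is ordinary.
const ToyExpansion *find_expansion(const std::vector<ToyExpansion> &maps,
                                   loc_t loc) {
  for (const ToyExpansion &e : maps) {
    assert(e.n_tokens > 0);  // toy invariant: no empty expansions
    if (loc >= e.first_virtual && loc - e.first_virtual < e.n_tokens)
      return &e;
  }
  return nullptr;
}

static int three_way(loc_t a, loc_t b) { return a < b ? -1 : (a > b ? 1 : 0); }

// Negative if A comes before B, zero if they coincide, positive otherwise.
int toy_compare_locations(const std::vector<ToyExpansion> &maps,
                          loc_t a, loc_t b) {
  const ToyExpansion *ea = find_expansion(maps, a);
  const ToyExpansion *eb = find_expansion(maps, b);
  // Tokens of the same expansion are ordered by their index in it.
  if (ea != nullptr && ea == eb)
    return three_way(a, b);
  // Otherwise compare where each one surfaces in ordinary source order.
  loc_t pa = ea ? ea->expansion_point : a;
  loc_t pb = eb ? eb->expansion_point : b;
  return three_way(pa, pb);
}
```

With an expansion at ordinary location 10 numbered from 5000 and a later one at 20 numbered from 4000, the raw values claim 4000 comes first, while the toy comparison orders them by expansion point.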
From-SVN: r180081
2011-10-17 17:58:56 +08:00
/* Return the source line number corresponding to source location
   LOCATION.  SET is the line map set LOCATION comes from.  If
   LOCATION is the location of a token that is part of the
   expansion-list of a macro expansion, return the line number of the
   macro expansion point.  */
int linemap_get_expansion_line (const line_maps *,
                                location_t);

/* Return the path of the file corresponding to source code location
   LOCATION.

   If LOCATION is the location of a token that is part of the
   replacement-list of a macro expansion, return the file path of the
   macro expansion point.

   SET is the line map set LOCATION comes from.  */
const char* linemap_get_expansion_filename (const line_maps *,
                                            location_t);

diagnostics: escape non-ASCII source bytes for certain diagnostics
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.
Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.
The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored"
- for -Wnormalized= (and generate source ranges for such warnings)
The escaping is controlled by a new option:
-fdiagnostics-escape-format=[unicode|bytes]
For example, consider a diagnostic involving a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.
By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:
beforeπXafter
^
(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)
If the diagnostic sets the "escape" flag, it will be printed as:
before<U+03C0><BF>after
^~~~~~~~
with -fdiagnostics-escape-format=unicode (the default), or as:
before<CF><80><BF>after
^~~~~~~~
if the user supplies -fdiagnostics-escape-format=bytes.
This only affects how the source is printed; it does not affect
how column numbers are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
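
The two escape formats described above can be sketched roughly as follows. This is not GCC's implementation: EscapeFormat, decode_utf8 and escape_source_line are hypothetical names, the UTF-8 decoder is deliberately minimal (it does not reject overlong encodings or surrogates), and ASCII bytes are assumed to pass through unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical mirror of the two option values.
enum class EscapeFormat { Unicode, Bytes };

// Decode one UTF-8 sequence starting at s[i].  Returns its length in
// bytes and stores the code point in CP, or returns 0 if the bytes at
// that position are not a well-formed sequence (simplified check).
static std::size_t decode_utf8(const std::string &s, std::size_t i,
                               std::uint32_t &cp) {
  assert(i < s.size());
  unsigned char b0 = static_cast<unsigned char>(s[i]);
  std::size_t len;
  if (b0 < 0x80) { cp = b0; return 1; }
  else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
  else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
  else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
  else return 0;  // stray trailing byte or invalid lead byte
  if (i + len > s.size()) return 0;
  for (std::size_t k = 1; k < len; ++k) {
    unsigned char b = static_cast<unsigned char>(s[i + k]);
    if ((b & 0xC0) != 0x80) return 0;
    cp = (cp << 6) | (b & 0x3F);
  }
  return len;
}

// Return LINE with every non-ASCII byte escaped according to FMT.
std::string escape_source_line(const std::string &line, EscapeFormat fmt) {
  std::string out;
  char buf[16];
  std::size_t i = 0;
  while (i < line.size()) {
    unsigned char b0 = static_cast<unsigned char>(line[i]);
    if (b0 < 0x80) { out += line[i]; ++i; continue; }  // ASCII: verbatim
    std::uint32_t cp = 0;
    std::size_t len = decode_utf8(line, i, cp);
    if (fmt == EscapeFormat::Unicode && len > 0) {
      std::snprintf(buf, sizeof buf, "<U+%04X>", static_cast<unsigned>(cp));
      out += buf;
      i += len;
    } else {
      // Bytes format, or an undecodable byte: escape byte by byte.
      std::size_t n = len > 0 ? len : 1;
      for (std::size_t k = 0; k < n; ++k) {
        std::snprintf(buf, sizeof buf, "<%02X>",
                      static_cast<unsigned>(
                          static_cast<unsigned char>(line[i + k])));
        out += buf;
      }
      i += n;
    }
  }
  return out;
}
```

Fed the example line from this message, the sketch yields before<U+03C0><BF>after in the Unicode format and before<CF><80><BF>after in the bytes format.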
gcc/c-family/ChangeLog:
* c-lex.c (c_lex_with_flags): When complaining about non-printable
CPP_OTHER tokens, set the "escape on output" flag.
gcc/ChangeLog:
* common.opt (fdiagnostics-escape-format=): New.
(diagnostics_escape_format): New enum.
(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
* diagnostic-format-json.cc (json_end_diagnostic): Add
"escape-source" attribute.
* diagnostic-show-locus.c
(exploc_with_display_col::exploc_with_display_col): Replace
"tabstop" param with a cpp_char_column_policy and add an "aspect"
param. Use these to compute m_display_col accordingly.
(struct char_display_policy): New struct.
(layout::m_policy): New field.
(layout::m_escape_on_output): New field.
(def_policy): New function.
(make_range): Update for changes to exploc_with_display_col ctor.
(default_print_decoded_ch): New.
(width_per_escaped_byte): New.
(escape_as_bytes_width): New.
(escape_as_bytes_print): New.
(escape_as_unicode_width): New.
(escape_as_unicode_print): New.
(make_policy): New.
(layout::layout): Initialize new fields. Update m_exploc ctor
call for above change to ctor.
(layout::maybe_add_location_range): Update for changes to
exploc_with_display_col ctor.
(layout::calculate_x_offset_display): Update for change to
cpp_display_width.
(layout::print_source_line): Pass policy
to cpp_display_width_computation. Capture cpp_decoded_char when
calling process_next_codepoint. Move printing of source code to
m_policy.m_print_cb.
(line_label::line_label): Pass in policy rather than context.
(layout::print_any_labels): Update for change to line_label ctor.
(get_affected_range): Pass in policy rather than context, updating
calls to location_compute_display_column accordingly.
(get_printed_columns): Likewise, also for cpp_display_width.
(correction::correction): Pass in policy rather than tabstop.
(correction::compute_display_cols): Pass m_policy rather than
m_tabstop to cpp_display_width.
(correction::m_tabstop): Replace with...
(correction::m_policy): ...this.
(line_corrections::line_corrections): Pass in policy rather than
context.
(line_corrections::m_context): Replace with...
(line_corrections::m_policy): ...this.
(line_corrections::add_hint): Update to use m_policy rather than
m_context.
(line_corrections::add_hint): Likewise.
(layout::print_trailing_fixits): Likewise.
(selftest::test_display_widths): New.
(selftest::test_layout_x_offset_display_utf8): Update to use
policy rather than tabstop.
(selftest::test_one_liner_labels_utf8): Add test of escaping
source lines.
(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
use policy rather than tabstop.
(selftest::test_overlapped_fixit_printing): Likewise.
(selftest::test_overlapped_fixit_printing_utf8): Likewise.
(selftest::test_overlapped_fixit_printing_2): Likewise.
(selftest::test_tab_expansion): Likewise.
(selftest::test_escaping_bytes_1): New.
(selftest::test_escaping_bytes_2): New.
(selftest::diagnostic_show_locus_c_tests): Call the new tests.
* diagnostic.c (diagnostic_initialize): Initialize
context->escape_format.
(convert_column_unit): Update to use default character width policy.
(selftest::test_diagnostic_get_location_text): Likewise.
* diagnostic.h (enum diagnostics_escape_format): New enum.
(diagnostic_context::escape_format): New field.
* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
(-fdiagnostics-format=): Add "escape-source" attribute to examples
of JSON output, and document it.
* input.c (location_compute_display_column): Pass in "policy"
rather than "tabstop", passing to
cpp_byte_column_to_display_column.
(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
* input.h (class cpp_char_column_policy): New forward decl.
(location_compute_display_column): Pass in "policy" rather than
"tabstop".
* opts.c (common_handle_option): Handle
OPT_fdiagnostics_escape_format_.
* selftest.c (temp_source_file::temp_source_file): New ctor
overload taking a size_t.
* selftest.h (temp_source_file::temp_source_file): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
"escape-source" attribute.
* c-c++-common/diagnostic-format-json-2.c: Likewise.
* c-c++-common/diagnostic-format-json-3.c: Likewise.
* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
* c-c++-common/diagnostic-format-json-5.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
* gcc.dg/encoding-issues-bytes.c: New test.
* gcc.dg/encoding-issues-unicode.c: New test.
* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
"escape-source" attribute.
* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
libcpp/ChangeLog:
* charset.c (convert_escape): Use encoding_rich_location when
complaining about nonprintable unknown escape sequences.
(cpp_display_width_computation::cpp_display_width_computation):
Pass in policy rather than tabstop.
(cpp_display_width_computation::process_next_codepoint): Add "out"
param and populate *out if non-NULL.
(cpp_display_width_computation::advance_display_cols): Pass NULL
to process_next_codepoint.
(cpp_byte_column_to_display_column): Pass in policy rather than
tabstop. Pass NULL to process_next_codepoint.
(cpp_display_column_to_byte_column): Pass in policy rather than
tabstop.
* errors.c (cpp_diagnostic_get_current_location): New function,
splitting out the logic from...
(cpp_diagnostic): ...here.
(cpp_warning_at): New function.
(cpp_pedwarning_at): New function.
* include/cpplib.h (cpp_warning_at): New decl for rich_location.
(cpp_pedwarning_at): Likewise.
(struct cpp_decoded_char): New.
(struct cpp_char_column_policy): New.
(cpp_display_width_computation::cpp_display_width_computation):
Replace "tabstop" param with "policy".
(cpp_display_width_computation::process_next_codepoint): Add "out"
param.
(cpp_display_width_computation::m_tabstop): Replace with...
(cpp_display_width_computation::m_policy): ...this.
(cpp_byte_column_to_display_column): Replace "tabstop" param with
"policy".
(cpp_display_width): Likewise.
(cpp_display_column_to_byte_column): Likewise.
* include/line-map.h (rich_location::escape_on_output_p): New.
(rich_location::set_escape_on_output): New.
(rich_location::m_escape_on_output): New.
* internal.h (cpp_diagnostic_get_current_location): New decl.
(class encoding_rich_location): New.
* lex.c (skip_whitespace): Use encoding_rich_location when
complaining about null characters.
(warn_about_normalization): Generate a source range when
complaining about improperly normalized tokens, rather than just a
point, and use encoding_rich_location so that the source code
is escaped on printing.
* line-map.c (rich_location::rich_location): Initialize
m_escape_on_output.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-10-19 06:55:31 +08:00
/* A subclass of rich_location for emitting a diagnostic
   at the current location of the reader, but flagging
   it with set_escape_on_output (true).  */
class encoding_rich_location : public rich_location
{
 public:
  encoding_rich_location (cpp_reader *pfile)
  : rich_location (pfile->line_table,
                   cpp_diagnostic_get_current_location (pfile))
  {
    set_escape_on_output (true);
  }

  encoding_rich_location (cpp_reader *pfile, location_t loc)
  : rich_location (pfile->line_table, loc)
  {
    set_escape_on_output (true);
  }
};

#ifdef __cplusplus
}
#endif

#endif /* ! LIBCPP_INTERNAL_H */