With modules, streamed entities have two new properties -- the module
that declares them and the module that instantiates them. Here
'instantiate' applies to more than just templates -- for instance an
implicit member fn. The two may well be the same module. This adds the
calls to the places that need them.
gcc/cp/
* class.c (layout_class_type): Call set_instantiating_module.
(build_self_reference): Likewise.
* decl.c (grokfndecl): Call set_originating_module.
(grokvardecl): Likewise.
(grokdeclarator): Likewise.
* pt.c (maybe_new_partial_specialization): Call
set_instantiating_module, propagate DECL_MODULE_EXPORT_P.
(lookup_template_class_1): Likewise.
(tsubst_function_decl): Likewise.
(tsubst_decl, instantiate_template_1): Likewise.
(build_template_decl): Propagate module flags.
(tsubst_template_decl): Likewise.
(finish_concept_definition): Call set_originating_module.
* module.cc (set_instantiating_module, set_originating_module): Stubs.
I thought I had implemented P1186R3, but apparently I didn't read it closely
enough to understand the point of the paper, namely that for a defaulted
operator<=>, if a member type doesn't have a viable operator<=>, we will use
its operator< and operator== if the defaulted operator has a specific
comparison category as its return type; the compiler can't guess if it
should be strong_ordering or something else, but the user can make that
choice explicit.
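The fallback can be sketched without <compare>. This is only an illustration of the synthesized comparison for a strong ordering, not the code in build_comparison_op; Ordering, Legacy, Wrapper, synth_three_way and compare are hypothetical names introduced for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for std::strong_ordering, so the sketch builds pre-C++20.
enum class Ordering { less = -1, equal = 0, greater = 1 };

// A member type that provides operator== and operator< but no operator<=>.
struct Legacy {
  int v;
  friend bool operator==(const Legacy &a, const Legacy &b) { return a.v == b.v; }
  friend bool operator<(const Legacy &a, const Legacy &b) { return a.v < b.v; }
};

// Three-way result synthesized from == and <, as for a strong ordering:
// equal if a == b, else less if a < b, else greater.
template <typename T>
Ordering synth_three_way(const T &a, const T &b) {
  if (a == b) return Ordering::equal;
  if (a < b) return Ordering::less;
  return Ordering::greater;
}

struct Wrapper {
  Legacy m;
  int n;
};

// Memberwise comparison: the first member that is not equal decides.
Ordering compare(const Wrapper &a, const Wrapper &b) {
  Ordering c = synth_three_way(a.m, b.m);
  if (c != Ordering::equal) return c;
  return synth_three_way(a.n, b.n);
}
```

In real C++20 code the user would instead declare the defaulted operator<=> with an explicit comparison category as its return type, which is what tells the compiler which category to synthesize.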
The libstdc++ test change was necessary because of the change in
genericize_spaceship from op0 > op1 to op1 < op0; this should be equivalent,
but isn't because of PR88173.
gcc/cp/ChangeLog:
PR c++/96299
* cp-tree.h (build_new_op): Add overload that omits some parms.
(genericize_spaceship): Add location_t parm.
* constexpr.c (cxx_eval_binary_expression): Pass it.
* cp-gimplify.c (genericize_spaceship): Pass it.
* method.c (genericize_spaceship): Handle class-type arguments.
(build_comparison_op): Fall back to op</== when appropriate.
gcc/testsuite/ChangeLog:
PR c++/96299
* g++.dg/cpp2a/spaceship-synth-neg2.C: Move error.
* g++.dg/cpp2a/spaceship-p1186.C: New test.
libstdc++-v3/ChangeLog:
PR c++/96299
* testsuite/18_support/comparisons/algorithms/partial_order.cc:
One more line needs to use VERIFY instead of static_assert.
Several recent C++ features are specified to try overload resolution, and if
no viable candidate is found, do something else. But our error return
doesn't distinguish between that situation and finding multiple viable
candidates that end up being ambiguous. We're already trying to separately
return the single function we found even if it ends up being ill-formed for
some reason; for ambiguity let's pass back error_mark_node, to be
distinguished from NULL_TREE meaning no viable candidate.
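A minimal sketch of the three-way result described above, under the assumption that candidates can be ranked by a single integer. Candidate, ambiguous_marker and pick_overload are hypothetical; ambiguous_marker plays the role of error_mark_node and nullptr the role of NULL_TREE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Candidate { int rank; };   // hypothetical: lower rank is a better match

// Hypothetical sentinel standing in for error_mark_node.
static Candidate ambiguous_marker{-1};

// Returns nullptr when nothing is viable, &ambiguous_marker when the best
// rank is shared by several candidates, and otherwise the unique best one.
Candidate *pick_overload(std::vector<Candidate> &viable) {
  if (viable.empty()) return nullptr;
  Candidate *best = &viable[0];
  bool tie = false;
  for (std::size_t i = 1; i < viable.size(); ++i) {
    if (viable[i].rank < best->rank) {
      best = &viable[i];
      tie = false;               // a strictly better candidate clears the tie
    } else if (viable[i].rank == best->rank) {
      tie = true;
    }
  }
  return tie ? &ambiguous_marker : best;
}
```

A caller that wants to "do something else" when overload resolution finds nothing can then test for nullptr alone and still diagnose the ambiguous case.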
gcc/cp/ChangeLog:
* call.c (build_new_op_1): Set *overload for ambiguity.
(build_new_method_call_1): Likewise.
When the atomic access involves a call to __sync_synchronize,
it is better to call __cxa_guard_acquire unconditionally,
since it handles the atomics too, or is a non-threaded
implementation when the target has no gthread support.
This also fixes a bug on the ARM EABI big-endian target,
where the wrong bit was previously checked.
2020-12-08 Bernd Edlinger <bernd.edlinger@hotmail.de>
* decl2.c (is_atomic_expensive_p): New helper function.
(build_atomic_load_byte): Rename to...
(build_atomic_load_type): ... and add new parameter type.
(get_guard_cond): Skip the atomic here if that is expensive.
Use the correct type for the atomic load on certain targets.
We need to expose build_cdtor_clones, it fortunately has the desired
API -- gosh, how did that happen? :) The template machinery will need
to cache path-of-instantiation information, so add two more fields to
the tinst_level struct. I also had to adjust the
match_mergeable_specialization API since adding it, so I am including that
change too.
gcc/cp/
* cp-tree.h (struct tinst_level): Add path & visible fields.
(build_cdtor_clones): Declare.
(match_mergeable_specialization): Use a spec_entry, add insert parm.
* class.c (build_cdtor_clones): Externalize.
* pt.c (push_tinst_level_loc): Clear new fields.
(match_mergeable_specialization): Adjust API.
Here are the couple of raw accessors I make use of in the module streaming.
gcc/
* tree.h (DECL_ALIGN_RAW): New.
(DECL_ALIGN): Use it.
(DECL_WARN_IF_NOT_ALIGN_RAW): New.
(DECL_WARN_IF_NOT_ALIGN): Use it.
(SET_DECL_WARN_IF_NOT_ALIGN): Likewise.
C++20 modules add some new rules about when the global initializers
of imported modules run. They must run no later than before any
initializers in the importer that appear after the import. To provide
this, each named module emits an idempotent global initializer that
calls the global initializer functions of its imports (these of course
may call further import initializers). This is the machinery in our
global-init emission to accomplish that, other than the actual
emission of calls, which is in the module file. The naming of this
global init is a new piece of the ABI.
FWIW, the module's emitter does some optimization to avoid calling a
direct import's initializer when it can determine that import is also
indirect.
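The shape of such an idempotent initializer can be sketched as follows. The function names and counters are hypothetical, and the real initializers are emitted by the compiler under the new mangled name rather than written by hand.

```cpp
#include <cassert>

// Hypothetical counters so the effect is observable.
static int base_runs = 0;
static int mid_runs = 0;

// Initializer of a module with no imports.
void init_base() {
  static bool done = false;
  if (done) return;        // idempotent: later calls are no-ops
  done = true;
  ++base_runs;             // stands in for the module's own dynamic initializers
}

// Initializer of a module that imports base.
void init_mid() {
  static bool done = false;
  if (done) return;
  done = true;
  init_base();             // imports are initialized first
  ++mid_runs;
}

// Initializer of an importer that names both mid and base.
void init_top() {
  static bool done = false;
  if (done) return;
  done = true;
  init_mid();              // reaches base indirectly
  init_base();             // redundant direct call, harmless because of the guard
}
```

The second call in init_top is the kind of call the emitter can drop when it knows base is also reached through mid.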
gcc/cp/
* decl2.c (start_objects): Refactor and adjust for named module
initializers.
(finish_objects): Likewise.
(generate_ctor_or_dtor_function): Likewise.
* module.cc (module_initializer_kind)
(module_add_import_initializers): Stubs.
The documentation says
For a named pattern, the condition may not depend on the data in
the insn being matched, but only the target-machine-type flags.
The i386 backend violates that by using flag_excess_precision and
flag_unsafe_math_optimizations in the conditions too, which is bad
when the optimize attribute or pragmas are used. The problem is that the
middle-end caches the enabled conditions for the optabs for a particular
switchable target, yet multiple functions can share the same
TARGET_OPTION_NODE while having different TREE_OPTIMIZATION_NODEs with different
flag_excess_precision or flag_unsafe_math_optimizations, so the enabled
conditions then match only one of those.
I think the best fix would be to have a single options node for both the
generic and target options; then such problems wouldn't exist, but that
would be very risky at this point and quite a large change.
So, instead, the following patch shadows the flag_excess_precision and
flag_unsafe_math_optimizations values in TargetVariables for use in the
instruction conditions, and during set_cfun artificially creates a new
TARGET_OPTION_NODE if flag_excess_precision and/or
flag_unsafe_math_optimizations differ from what is recorded in the current
TARGET_OPTION_NODE. The target nodes are hashed, so in the worst case we can
get 4 times as many target option nodes, if for each unique target option one
tried all the flag_excess_precision and flag_unsafe_math_optimizations values.
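To illustrate why hashing bounds the growth, here is a sketch of interning option nodes by a key that includes the two shadowed flags. NodeKey, NodeTable and intern are hypothetical and do not correspond to the actual hashing of TARGET_OPTION_NODEs in GCC.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Hypothetical key: target flags plus the two shadowed optimization flags.
using NodeKey = std::tuple<unsigned, int, bool>;

struct NodeTable {
  std::map<NodeKey, int> ids;   // stands in for the hashed target option nodes

  // Returns the existing node id for this combination, or creates a new one.
  int intern(unsigned target_flags, int excess_precision, bool unsafe_math) {
    NodeKey key{target_flags, excess_precision, unsafe_math};
    auto it = ids.find(key);
    if (it != ids.end()) return it->second;
    int id = static_cast<int>(ids.size());
    ids.emplace(key, id);
    return id;
  }
};
```

Functions with identical target and shadowed flags keep sharing one node; a new node appears only when one of the shadowed values actually differs.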
2020-12-08 Jakub Jelinek <jakub@redhat.com>
PR target/94440
* config/i386/i386.opt (ix86_excess_precision,
ix86_unsafe_math_optimizations): New TargetVariables.
* config/i386/i386.h (X87_ENABLE_ARITH, X87_ENABLE_FLOAT): Use
ix86_unsafe_math_optimizations instead of
flag_unsafe_math_optimizations and ix86_excess_precision instead of
flag_excess_precision.
* config/i386/i386.c (ix86_excess_precision): Rename to ...
(ix86_get_excess_precision): ... this.
(TARGET_C_EXCESS_PRECISION): Define to ix86_get_excess_precision.
* config/i386/i386-options.c (ix86_valid_target_attribute_tree,
ix86_option_override_internal): Update ix86_unsafe_math_optimizations
from flag_unsafe_math_optimizations and ix86_excess_precision
from flag_excess_precision when constructing target option nodes.
(ix86_set_current_function): If flag_unsafe_math_optimizations
or flag_excess_precision is different from the one recorded
in TARGET_OPTION_NODE, create a new target option node for the
current function and switch to that.
Adding includes to module.cc triggered the kind of build failure I
wanted to check for. In this case it was MODULE_VERSION not being
defined, and module.cc's internal #error triggering. I've relaxed the
check in Make-lang, so we provide MODULE_VERSION when DEVPHASE is not
empty (rather than when it is 'experimental'). AFAICT devphase is
empty for release builds, and the #error will force us to decide
whether modules is sufficiently baked at that point.
gcc/cp/
* Make-lang.in (MODULE_VERSION): Override when DEVPHASE not empty.
* module.cc: Comment.
These are the mangling changes for modules. They were developed in
collaboration with clang, which also implements the same ABI (or
plans to; I do not think the global init is in clang). The global
init mangling is captured in
https://github.com/itanium-cxx-abi/cxx-abi/issues/99
gcc/cp/
* cp-tree.h (mangle_module_substitution, mangle_identifier)
(mangle_module_global_init): Declare.
* mangle.c (struct globals): Add mod field.
(mangle_module_substitution, mangle_identifier)
(mangle_module_global_init): Define.
(write_module, maybe_write_module): New.
(write_name): Call it.
(start_mangling): Clear mod field.
(finish_mangling_internal): Adjust.
* module.cc (mangle_module, mangle_module_fini)
(get_originating_module): Stubs.
As mentioned in the preprocessor patches, there's a new kind of
preprocessor directive for modules, and it interacts with the
compiler-proper, as that has to stream in header-unit macro
information (when the directive is an import that names a
header-unit). This is that machinery. It's an FSM that inspects the
token stream and does the minimal parsing to detect such imports.
This ends up being called from the C++ parser's tokenizer and from the
-E tokenizer (via a lang hook). The actual module streaming is a stub
here.
gcc/cp/
* cp-tree.h (module_token_pre, module_token_cdtor)
(module_token_lang): Declare.
* lex.c: Include langhooks.
(struct module_token_filter): New.
* cp-tree.h (module_token_pre, module_token_cdtor)
(module_token_lang): Define.
* module.cc (get_module, preprocess_module, preprocessed_module):
Nop stubs.
Two recent AVX512 tests FAIL on Solaris/x86 with /bin/as:
FAIL: gcc.target/i386/avx512vpopcntdq-pr97770-2.c (test for excess errors)
Excess errors:
Assembler: avx512vpopcntdq-pr97770-2.c
"/var/tmp//ccM4Gt1a.s", line 171 : Illegal mnemonic
Near line: " vpopcntd (%eax), %zmm0"
"/var/tmp//ccM4Gt1a.s", line 171 : Syntax error
Near line: " vpopcntd (%eax), %zmm0"
FAIL: gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c (test for excess errors)
similarly.
Fixed as follows.
Tested on i386-pc-solaris2.11 with as and gas and x86_64-pc-linux-gnu.
2020-12-07 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc/testsuite:
* gcc.target/i386/avx512vpopcntdq-pr97770-2.c: Require
avx512vpopcntdq support.
* gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c: Require
avx512vpopcntdq, avx512vl support.
The new gcc.target/i386/pr98100.c test FAILs on Solaris/x86:
FAIL: gcc.target/i386/pr98100.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr98100.c:6:1: error: the call requires 'ifunc', which is not supported by this target
Fixed as follows.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
2020-12-07 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc/testsuite:
* gcc.target/i386/pr98100.c: Require ifunc support.
This makes sure to clear the vector pointer on release.
2020-12-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/98192
* tree-vect-slp.c (vect_build_slp_instance): Get scalar_stmts
by reference.
We require a vector-by-scalar shift; there's no appropriate target
selector, so use SSE2 for now.
2020-12-08 Richard Biener <rguenther@suse.de>
PR testsuite/95900
* gcc.dg/vect/bb-slp-pr95866.c: Require sse2 for the
BIT_FIELD_REF match.
These tests violated strict aliasing, fixed by using a union and
type punning through that.
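A minimal sketch of the technique, not the actual CALC code from the tests; and_bits is a hypothetical helper. The tests are C, where reading a different union member is permitted. Compiled as C++, the sketch relies on GCC's documented support for type punning through a union.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bitwise AND of the representations of two doubles.
double and_bits(double a, double b) {
  union { double d; std::uint64_t u; } x, y, r;
  x.d = a;
  y.d = b;
  r.u = x.u & y.u;   // operate on the integer view of the bits
  return r.d;        // read back through the union, not through a cast pointer
}
```

The aliasing violation being avoided is the pattern of casting a double pointer to an integer pointer and dereferencing it, which the optimizer may assume never aliases the original object.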
2020-12-08 Jakub Jelinek <jakub@redhat.com>
* gcc.target/i386/avx512dq-vandnpd-2.c (CALC): Use union
to avoid aliasing violations.
* gcc.target/i386/avx512dq-vandnps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vandpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vandps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vorpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vorps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vxorpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vxorps-2.c (CALC): Likewise.
This patch fixes two bugs in the -fopenmp-simd support. One is that
in C++ #pragma omp parallel master would actually create OMP_PARALLEL
in the IL, which is a big no-no for -fopenmp-simd; we should be creating
only the constructs -fopenmp-simd handles (mainly OMP_SIMD, OMP_LOOP which
is gimplified as simd in that case, declare simd/reduction and ordered simd).
The other bug was that the #pragma omp master taskloop simd combined construct
contains simd and thus should be recognized as #pragma omp simd (with only
the simd applicable clauses), but as master wasn't included in
omp_pragmas_simd, we'd ignore it completely instead.
2020-12-08 Jakub Jelinek <jakub@redhat.com>
PR c++/98187
* c-pragma.c (omp_pragmas): Remove "master".
(omp_pragmas_simd): Add "master".
* parser.c (cp_parser_omp_parallel): For parallel master with
-fopenmp-simd only, just call cp_parser_omp_master instead of
wrapping it in OMP_PARALLEL.
* c-c++-common/gomp/pr98187.c: New test.
This adds a missing check.
2020-12-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/98191
* tree-vect-slp.c (vect_slp_check_for_constructors): Do not
follow a non-SSA def chain.
* gcc.dg/torture/pr98191.c: New testcase.
This fixes sinking of loads when irreducible regions are involved
and the heuristic for finding stores on the path along the sink
breaks down, since it uses dominator queries.
2020-12-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/97559
* tree-ssa-sink.c (statement_sink_location): Never ignore
PHIs on sink paths in irreducible regions.
* gcc.dg/torture/pr97559-1.c: New testcase.
* gcc.dg/torture/pr97559-2.c: Likewise.
gcc/
2020-12-08 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
PR target/97872
* gimple-isel.cc (gimple_expand_vec_cond_expr): Try to fold
x CMP y ? -1 : 0 to x CMP y.
gcc/testsuite/
2020-12-08 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
PR target/97872
* gcc.target/arm/pr97872.c: New test.
This adds a missing check for the first inserted value.
2020-12-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/98180
* tree-vect-slp.c (vect_slp_check_for_constructors): Check the
first inserted value has a def.
This forces the scalarization of the testcase on PowerPC.
gcc/testsuite/ChangeLog:
PR target/96470
* gnat.dg/opt39.adb: Add dg-additional-options for PowerPC.
The very recent addition of the if_to_switch pass has partially disabled
the optimization added back in June to optimize_range_tests_to_bit_test,
as witnessed by the 3 new failures in the gnat.dg testsuite. It turns out
that both tree-ssa-reassoc.c and tree-switch-conversion.c can turn things
into bit tests so the optimization is added to bit_test_cluster::emit too.
The patch also contains a secondary optimization, whereby the full bit-test
sequence is sent to the folder before being gimplified in case there is only
one test, so that the optimal sequence (bt + jc on x86) can be emitted like
with optimize_range_tests_to_bit_test.
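For readers unfamiliar with the transformation, here is a hand-written sketch of a bit test with an entry test. The case set {1, 3, 4, 7} and the function in_set are hypothetical, and the sketch keeps the entry test separate rather than merging it into the bit test.

```cpp
#include <cassert>

// Hypothetical case set {1, 3, 4, 7}: bit (v - 1) of the mask is set for
// each value v, giving bits 0, 2, 3 and 6, i.e. 0x4D.
bool in_set(unsigned x) {
  const unsigned low = 1;
  const unsigned range = 6;                // highest value minus lowest
  const unsigned mask = 0x4Du;
  unsigned d = x - low;                    // wraps to a huge value when x < low
  if (d > range) return false;             // entry test
  return ((mask >> d) & 1u) != 0;          // single bit test
}
```

This replaces a chain of equality comparisons with one subtraction, one range check and one bit test.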
gcc/ChangeLog:
PR tree-optimization/96344
* tree-switch-conversion.c (bit_test_cluster::emit): Compute the
range only if an entry test is necessary. Merge the entry test in
the bit test when possible. Use PREC local variable consistently.
When there is only one test, do a single gimplification at the end.
We try to canonicalize the arch string for --with-arch, and the
script that does so is written in Python. As a result GCC requires
Python to build the RISC-V port, which is not expected as a GCC
build requirement.
So this patch makes it optional: detect Python and only use it
when it is available. Skipping canonicalization won't break any
functionality; it just might build one more redundant multilib.
gcc/ChangeLog:
PR target/98152
* config.gcc (riscv*-*-*): Check whether python, python3 or python2
is available, and skip with_arch canonicalization if no python is
available.
To handle atomic loads correctly, we need to move the code that
drops qualifiers in lvalue conversion after the code that
handles atomics.
2020-12-07 Martin Uecker <muecker@gwdg.de>
gcc/c/
PR c/97981
* c-typeck.c (convert_lvalue_to_rvalue): Move the code
that drops qualifiers to the end of the function.
gcc/testsuite/
PR c/97981
* gcc.dg/pr97981.c: New test.
* gcc.dg/pr60195.c: Adapt test.
The function that calls targetm.emit_call_builtin___clear_cache
asserts that each of the begin and end operands has either ptr_mode or
Pmode.
On most targets that is the same mode, but e.g. on aarch64 -mabi=ilp32
or a few others it is different. When a target has a clear cache
non-library handler, it will use create_address_operand which will do the
conversion to the right mode automatically, but when emitting a library
call, we just say the operands are ptr_mode even when they can be Pmode
too; in that case we need to convert explicitly.
2020-12-07 Jakub Jelinek <jakub@redhat.com>
PR target/98147
* builtins.c (default_emit_call_builtin___clear_cache): Call
convert_memory_address to ptr_mode on both begin and end.
* gcc.dg/pr98147.c: New test.
In this testcase we are crashing trying to gimplify a switch, because
the types of the switch condition and case constants have different
TYPE_PRECISIONs.
This started with my r5-3726 fix: SWITCH_STMT_TYPE is supposed to be the
original type of the switch condition before any conversions, so in the
C++ FE we need to use unlowered_expr_type to get the unlowered type of
enum bit-fields.
Normally, the switch type is subject to integral promotions, but here
we have a scoped enum type and those don't promote:
  enum class B { A };
  struct C { B c : 8; };
  switch (x.c) // type B
    case B::A: // type int, will be converted to B
Here TREE_TYPE is "signed char" but SWITCH_STMT_TYPE is "B". When
gimplifying this in gimplify_switch_expr, the index type is "B" and
we convert all the case values to "B" in preprocess_case_label_vec,
but SWITCH_COND is of type "signed char": gimple_switch_index should
be the (possibly promoted) type, not the original type, so we gimplify
the "x.c" SWITCH_COND to a SSA_NAME of type "signed char". And then
we crash because the precision of the index type doesn't match the
precision of the case value type.
I think it makes sense to do the following; at the end of pop_switch
we've already issued the switch warnings, and since scoped enums don't
promote, it should be okay to use the type of SWITCH_STMT_COND. The
r5-3726 change was about giving warnings for enum bit-fields anyway.
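A self-contained variant of the snippet above, for trying the construct out. It is not the contents of g++.dg/cpp0x/enum41.C, and classify is a hypothetical name. Some GCC versions may warn that the bit-field is too small to hold all values of the enum; that warning is unrelated to the crash.

```cpp
#include <cassert>

enum class B { A };
struct C { B c : 8; };

// Switches directly on a scoped-enum bit-field. The condition keeps the
// type B because scoped enums do not undergo integral promotion.
int classify(C x) {
  switch (x.c) {
    case B::A: return 1;
    default:   return 0;
  }
}
```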
gcc/cp/ChangeLog:
PR c++/98043
* decl.c (pop_switch): If SWITCH_STMT_TYPE is a scoped enum type,
set it to the type of SWITCH_STMT_COND.
gcc/testsuite/ChangeLog:
PR c++/98043
* g++.dg/cpp0x/enum41.C: New test.
verify_sequence_points uses verify_tree to recursively walk the
subexpressions of an expression, and while recursing, it also
keeps lists of expressions found after/before a sequence point.
For a large expression, the list can grow significantly. And
merge_tlist is at least O(n^2): for a list of length n it will
iterate n(n - 1) times, and call candidate_equal_p each time, and
that can recurse further. warn_for_collision also has to go
through the whole list. With a large-enough expression, the
compilation can easily get stuck here for 24 hours.
This patch is a simple kludge: if we see that the expression is
overly complex, don't even try.
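The idea of bailing out on oversized expressions can be sketched as a walk with a node budget. Node, within_budget and should_check are hypothetical; the patch itself adds verify_tree_lim_r, and the limit it uses is not stated here.

```cpp
#include <cassert>

struct Node {               // hypothetical binary expression node
  Node *lhs = nullptr;
  Node *rhs = nullptr;
};

// Walks the tree, spending one unit of budget per node visited.
// Returns false as soon as the budget is exhausted.
bool within_budget(const Node *n, int &budget) {
  if (!n) return true;
  if (budget-- <= 0) return false;
  return within_budget(n->lhs, budget) && within_budget(n->rhs, budget);
}

// Only run the expensive quadratic check when the expression is small enough.
bool should_check(const Node *root, int limit) {
  int budget = limit;
  return within_budget(root, budget);
}
```

Because the walk stops at the first node over budget, the cost of deciding to skip is bounded by the limit rather than by the size of the expression.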
gcc/c-family/ChangeLog:
PR c++/98126
* c-common.c (verify_tree_lim_r): New function.
(verify_sequence_points): Use it. Use nullptr instead of 0.
gcc/testsuite/ChangeLog:
PR c++/98126
* g++.dg/warn/Wsequence-point-4.C: New test.
This restores the dependent array changes I reverted, now that pr98116
appears fixed. As mentioned before, when deserializing a module we
need to construct arrays without using the dependent-type predicates
themselves.
gcc/cp/
* cp-tree.h (build_cplus_array_type): Add defaulted DEP parm.
* tree.c (set_array_type_common): Add DEP parm.
(build_cplus_array_type): Add DEP parm, determine dependency if
needed. Mark dependency of new types.
(cp_build_qualified_type_real): Adjust array-building call, assert
no surprising dependency.
(strip_typedefs): Likewise.
This fixes the underlying problem my recent (backed-out) changes to
array type creation uncovered. We had paths through
structural_comptypes that ignored alias templates, even when
significant. This adds the necessary checks.
PR c++/98116
gcc/cp/
* typeck.c (structural_comptypes): Move early outs to comptype.
Always check template-alias match when comparing_specializations.
(comptypes): Do early out checking here.
gcc/testsuite/
* g++.dg/template/pr98116.C: Remove dg-ice.
* g++.dg/template/pr98116-2.C: New.
Copy the location info from the passed in call stmt
to the newly built gimple call stmt.
2020-12-07 Bernd Edlinger <bernd.edlinger@hotmail.de>
* ipa-param-manipulation.c
(ipa_param_body_adjustments::modify_call_stmt): Set location info.
This adds the capability to vectorize a sequence of vector
BIT_INSERT_EXPRs, similar to how we vectorize vector constructors.
2020-12-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/98113
* tree-vectorizer.h (struct slp_root): New.
(_bb_vec_info::roots): New member.
* tree-vect-slp.c (vect_analyze_slp): Also walk BB info
roots.
(_bb_vec_info::_bb_vec_info): Adjust.
(_bb_vec_info::~_bb_vec_info): Likewise.
(vld_cmp): New.
(vect_slp_is_lane_insert): Likewise.
(vect_slp_check_for_constructors): Match a series of
BIT_INSERT_EXPRs as vector constructor.
(vect_slp_analyze_bb_1): Continue if BB info roots is
not empty.
(vect_slp_analyze_bb_1): Mark the whole BIT_INSERT_EXPR root
sequence as pure_slp.
* gcc.dg/vect/bb-slp-70.c: New testcase.
This avoids the degenerate case of a TYPE_MAX_VALUE latch iteration
count value causing wrong range info for the vector IV. There's
still the case of VF == 1 where, if we don't know whether we hit the
above case, we cannot emit a range.
2020-12-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/98117
* tree-vect-loop-manip.c (vect_gen_vector_loop_niters):
Properly handle degenerate niter when setting the vector
loop IV range.
* gcc.dg/torture/pr98117.c: New testcase.