No functional changes.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity): Do not check the
scope of anonymous access Itypes.
* gcc-interface/trans.cc (Identifier_to_gnu): Do not translate
the return type of a subprogram here.
The -gnatzr switch triggers the creation of distribution stubs for use
by the implementation of PolyORB. Now these stubs declare tagged types
and are generated at the very end of the analysis of compilation units,
after the static dispatch tables have been built, so these tables are
missing for the tagged types of the stubs.
Therefore this change defers the generation of static dispatch tables
for compilation units, which is the common case, until after the stubs
are (potentially) generated. For the other cases, in particular the
generic instances that are not compilation units, nothing is changed.
gcc/ada/
* exp_ch7.adb (Expand_N_Package_Body): Build static dispatch
tables only for units that are not compilation units, unless
they are generic instances. Do not push a scope for this.
(Expand_N_Package_Declaration): Build static dispatch tables
only for units that are both not compilation units and generic
instances.
* exp_disp.adb (Build_Static_Dispatch_Tables): Remove redundant
early return. Push a scope for package bodies.
* sem_ch10.adb: Add with and use clauses for Exp_Disp.
(Analyze_Compilation_Unit): Build static dispatch tables here.
Thunks are only referenced locally by dispatch tables and never inlined.
gcc/ada/
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Clear the Is_Public
flag on thunks.
When iterating over list elements with First/Next there is no need to
check if the list is present, because First intentionally returns Empty
if list is not present and the condition of subsequent loop will not be
satisfied.
Code cleanup; semantics is unaffected.
Occurrences of the redundant pattern were found with:
$ grep First -B 3 | less
and examining the output for the calls to Present.
gcc/ada/
* exp_ch13.adb, exp_ch5.adb, exp_ch9.adb, exp_strm.adb,
sem_ch10.adb, sem_ch13.adb, sem_ch5.adb, sem_ch6.adb,
sem_ch8.adb, sem_elab.adb, sem_eval.adb, sem_prag.adb,
sem_util.adb: Remove checks for the missing list before
iterating with First/Next; reindent code and refill comments.
This eliminates the use of the secondary stack to return specific tagged
types from functions in calls that are not dispatching on result, which
comprises returning controlled types, by introducing thunks whose only
purpose is to move the result from the primary to the secondary stack
for primitive functions that are controlling on result, and referencing
them in the dispatch table in lieu of the primitive functions.
The implementation reuses the existing machinery of interface thunks and
thus creates another kind of thunks, secondary stack thunks, which only
perform a call to the primitive function and return the result.
gcc/ada/
* einfo.ads (Has_Controlling_Result): Document new usage.
(Is_Thunk): Document secondary stack thunks.
(Returns_By_Ref): Adjust.
* exp_ch6.adb (Caller_Known_Size): Return true for tagged types.
(Expand_N_Extended_Return_Statement): Do not call Set_By_Ref.
(Expand_Simple_Function_Return): For a BIP return with an Alloc_Form
parameter, mark the node as returning on the secondary stack.
Replace call to Is_Limited_Interface with Is_Limited_View. Deal wit
secondary stack thunks. Do not call Set_By_Ref. Optimize the case
of a call to a function whose type also needs finalization.
(Needs_BIP_Task_Actuals): Replace Thunk_Entity with Thunk_Target.
(Needs_BIP_Finalization_Master): Cosmetic fixes.
(Needs_BIP_Alloc_Form): Check No_Secondary_Stack restriction and
return true for tagged types.
* exp_ch7.adb (Transient Scope Management): Update description.
* exp_disp.adb (Expand_Dispatching_Call): Always set Returns_By_Ref
on designated type if the call is dispatching on result. Tidy up.
(Expand_Interface_Thunk): Change type of Thunk_Code from Node_Id to
List_Id. Change type of local variables from Node_Id to Entity_Id.
Propagate Aliased_Present flag to create the formals and explicitly
set Has_Controlling_Result to False. Build a secondary stack thunk
if necessary in the function case.
(Expand_Secondary_Stack_Thunk): New function.
(Make_Secondary_DT): Build secondary stack thunks if necessary.
(Make_DT): Likewise.
(Register_Predefined_Primitive): Likewise.
(Register_Primitive): Likewise.
* exp_util.ads (Is_Secondary_Stack_Thunk): Declare.
(Thunk_Target): Likewise.
* exp_util.adb (Is_Secondary_Stack_Thunk): New function.
(Thunk_Target): Likewise.
* fe.h (Is_Secondary_Stack_Thunk): Declare.
(Thunk_Target): Likewise.
* gen_il-fields.ads (Opt_Field_Enum): Remove By_Ref.
* gen_il-gen-gen_nodes.adb (N_Simple_Return_Statement): Likewise.
(N_Extended_Return_Statement): Likewise.
* sem_ch6.adb (Analyze_Subprogram_Specification): Skip check for
abstract return type in the thunk case.
(Create_Extra_Formals): Replace Thunk_Entity with Thunk_Target.
* sem_disp.adb (Check_Controlling_Formals): Skip in the thunk case.
* sem_util.adb: Add use and with clauses for Exp_Ch6.
(Compute_Returns_By_Ref): Do not process procedures and only set
the flag for direct return by reference.
(Needs_Secondary_Stack): Do not return true for specific tagged
types and adjust comments accordingly.
* sinfo.ads (By_Ref): Delete.
(N_Simple_Return_Statement): Remove By_Ref.
(N_Extended_Return_Statement): Likewise.
* gcc-interface/ada-tree.h (TYPE_RETURN_UNCONSTRAINED_P): Delete.
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Do not use it.
Return by direct reference if the return type needs the secondary
stack as well as for secondary stack thunks.
* gcc-interface/gigi.h (fntype_same_flags_p): Remove parameter.
* gcc-interface/misc.cc (gnat_type_hash_eq): Adjust to above change.
* gcc-interface/trans.cc (finalize_nrv): Replace test on
TYPE_RETURN_UNCONSTRAINED_P with TYPE_RETURN_BY_DIRECT_REF_P.
(Subprogram_Body_to_gnu): Do not call maybe_make_gnu_thunk for
secondary stack thunks.
(Call_to_gnu): Do not test TYPE_RETURN_UNCONSTRAINED_P.
(gnat_to_gnu) <N_Simple_Return_Statement>: In the return by direct
reference case, test for the presence of Storage_Pool on the node
to build an allocator.
(maybe_make_gnu_thunk): Deal with Thunk_Entity and Thunk_Target.
* gcc-interface/utils.cc (fntype_same_flags_p): Remove parameter.
Local_Entity_Suppress and Global_Entity_Suppress variables referencing
tables were refactored to Local_Suppress_Stack_Top and
Global_Suppress_Stack_Top stacks back in 2007. Fix remaining references
to these variables.
gcc/ada/
* einfo.ads: Fix reference to Global_Entity_Suppress and
Local_Entity_Suppress variable in the comments.
* sem.ads: Likewise.
* sem_prag.adb: Likewise.
GNATprove changed the name of the pragma Annotate used to verify that
a subprogram always returns normally. It is now called Always_Return
instead of Terminating.
gcc/ada/
* libgnat/s-aridou.adb: Use Always_Return instead of Terminating
to annotate termination for GNATprove.
* libgnat/s-arit32.adb: Idem.
* libgnat/s-spcuop.ads: Idem.
Before this patch, the Functional Sets ans Maps were bounded both from
the user and the implementation points of view. To make them closer to
mathematical Sets ans Maps, this patch removes the bounds from the
contracts. Note that, in practice, they are still bounded by
Count_Type'Last, even if the user is not aware of it anymore.
This patch removed constraints on length of sets and maps from the
preconditions of functions. The function Length and Num_Overlaps now
return a Big_Natural.
gcc/ada/
* libgnat/a-cofuse.ads, libgnat/a-cofuse.adb,
libgnat/a-cofuma.ads, libgnat/a-cofuma.adb: Make Length and
Num_Overlaps return Big_Natural.
* libgnat/a-cforse.ads, libgnat/a-cforse.adb,
libgnat/a-cforma.adb, libgnat/a-cfhase.ads,
libgnat/a-cfhase.adb, libgnat/a-cfhama.adb,
libgnat/a-cfdlli.adb: Adapt code to handle Big_Integers instead
of Count_Type.
Function pointers must always be built with '[Unrestricted_]Access.
gcc/ada/
* exp_ch3.adb (Init_Secondary_Tags.Initialize_Tag): Initialize the
Offset_Func component by means of 'Unrestricted_Access.
This commit adds information allowing identification of the subprogram
surrounding the message emitted by gnat when using -gnatdJ along with
-fdiagnostics-format=json.
gcc/ada/
* errout.adb (Write_JSON_Span): Add subprogram name to emitted
JSON.
Inline_Always procedures should be kept public for proper inter unit
inlining.
gcc/ada/
* sem_ch7.adb (Set_Referencer_Of_Non_Subprograms): New local
procedure, used for code refactoring. Also take into account
Inline_Always pragma when deciding to make a symbol public for
C generation.
After the recent fix for detecting illegal use of ghost entities in
code, spurious errors could be raised on generic code with ghost, due to
wrong setting of the ghost flags on copied entities from the generic to
the instantiation.
gcc/ada/
* atree.adb (New_Copy): Reset flags related to ghost entities
before marking the new node.
This avoids making Expand_Interface_Thunk visible from the outside.
No functional changes.
gcc/ada/
* exp_ch6.adb (Freeze_Subprogram.Register_Predefined_DT_Entry): Move
procedure to...
* exp_disp.ads (Expand_Interface_Thunk): Move declaration to...
(Register_Predefined_Primitive): Declare.
* exp_disp.adb (Expand_Interface_Thunk): ...here.
(Register_Predefined_Primitive): ...here and change into a function
returning List_Id.
The static dispatch tables of library-level tagged types are either built
on the first object declaration or at the end of the declarative part of
the package spec or body. There is no real need for the former case, and
the tables are not built for other constructs that freeze (tagged) types.
Therefore this change removes the former case, thus causing the tables to
be always built at the end of the declarative part; that's orthogonal to
freezing and the tagged types are still frozen at the appropriate place.
Moreover, it wraps the code in the Actions list of a freeze node (like
for the nonstatic case) so that it is considered elaboration code by the
processing done in Sem_Elab and does not disturb it.
No functional changes.
gcc/ada/
* exp_ch3.adb (Expand_Freeze_Record_Type): Adjust comment.
(Expand_N_Object_Declaration): Do not build static dispatch tables.
* exp_disp.adb (Make_And_Insert_Dispatch_Table): New procedure.
(Build_Static_Dispatch_Tables): Call it to build the dispatch tables
and wrap them in the Actions list of a freeze node.
This feature is an architecture feature, not an OS feature, so enable
on vx7r2 for arm and aarch64 to coincide with what is done on similarly
capable targets.
gcc/ada/
* libgnat/system-vxworks7-arm.ads (Support_Atomic_Primitives):
Set True.
* libgnat/system-vxworks7-arm-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-aarch64.ads: Likewise.
* libgnat/system-vxworks7-aarch64-rtp-smp.ads: Likewise:
Testing Is_Frozen is not robust enough, so instead test that the full view
has been seen and that the Has_Completion flag is set on it.
gcc/ada/
* freeze.adb (Check_Expression_Function.Find_Constant): Make test
for deferred constants more robust.
Preconditions of Update procedures were always true when Offset was 0.
The changes enable to protect from Update_Error when Offset is 0.
gcc/ada/
* libgnat/i-cstrin.ads (Update): Update precondition.
References to ghost entities should only occur in ghost context. This
was not checked systematically on all references.
gcc/ada/
* sem_ch2.adb (Analyze_Identifier): Add checking for ghost
context.
* sem_ch5.adb (Analyze_Implicit_Label_Declaration): Treat
implicit labels like other entities by setting their ghost
status according to context.
* ghost.adb (Check_Ghost_Context): Adapt checking.
This patch adds preconditions to Update procedures, to protect from
Update_Error propagations.
gcc/ada/
* libgnat/i-cstrin.ads (Update): Add precondition.
The two flags apply to base types only like Has_Own_Invariants.
gcc/ada/
* sem_util.adb (Propagate_DIC_Attributes): Add ??? comment.
(Propagate_Invariant_Attributes): Likewise. Propagate the
Has_Inheritable_Invariants and Has_Inherited_Invariants to
the base type of the target type.
Systemitize Word_Size and Memory_Size declarations rather than hard code
with numerical values or OS specific Long_Integer size.
gcc/ada/
* libgnat/system-linux-arm.ads (Memory_Size): Compute based on
Word_Size.
Systemitize Word_Size and Memory_Size declarations rather than hard code
with numerical values or OS specific Long_Integer size.
gcc/ada/
* libgnat/system-vxworks7-aarch64-rtp-smp.ads (Word_Size):
Compute based on Standard'Word_Size. (Memory_Size): Compute
based on Word_Size.
* libgnat/system-vxworks7-arm-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp.ads: Likewise.
* libgnat/system-vxworks7-ppc-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-ppc-rtp.ads: Likewise.
* libgnat/system-vxworks7-ppc64-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-x86-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-x86-rtp.ads: Likewise.
This patch corrects an error in the compiler whereby gnatbind may crash
during calculation of file checksums in certain corner cases due to
uninitialized lookup tables.
gcc/ada/
* gnatbind.adb (Gnatbind): Add initialize call for Uintp
* gnatls.adb (Gnatls): Likewise.
* gprep.adb (Gnatprep): Likewise.
* make.adb (Initialize): Likewise.
We need to use Extended_Index for the Position parameter of the Element
function in formal vectors so it is compatible with other primitives of
the Iterable aspect.
gcc/ada/
* libgnat/a-cfinve.ads (Element): Change the type of the
Position parameter to Extended_Index.
* libgnat/a-cfinve.adb (Element): Idem.
* libgnat/a-cofove.ads (Element): Idem.
* libgnat/a-cofove.adb (Element): Idem.
This patch adds SPARK annotations to subprograms from
System.Address_To_Access_Conversions. To_Pointer is considered to have
no global items, if the returned value has no aliases. To_Address is
forbidden in SPARK because addresses are not handled.
gcc/ada/
* libgnat/s-atacco.ads (To_Pointer): Add Global => null.
(To_Address): Add SPARK_Mode => Off.
This patch adds Global contracts and preconditions to subprograms of
Interfaces.C.Strings. Effects on allocated memory are modelled
through an abstract state, C_Memory. The preconditions protect against
Dereference_Error, but not Storage_Error (which is not handled by
SPARK). This patch also disables the use of To_Chars_Ptr, which
creates an alias between an ownership pointer and the abstract state,
and the use of Free, in SPARK code. Thus, memory leaks will happen
if the user creates the Chars_Ptr using New_Char_Array and New_String.
gcc/ada/
* libgnat/i-cstrin.ads (To_Chars_Ptr): Add SPARK_Mode => Off.
(Free): Likewise.
(New_Char_Array): Add global contracts and Volatile attribute.
(New_String): Likewise.
(Value, Strlen, Update): Add global contracts and preconditions.
* libgnat/i-cstrin.adb: Add SPARK_Mode => Off to the package
body.
As the following testcase shows, our x86 backend support for optimizing
out useless masking of shift/rotate counts when using instructions
that naturally modulo the count themselves is insufficient.
The *_mask define_insn_and_split patterns use
(subreg:QI (and:SI (match_operand:SI) (match_operand "const_int_operand")))
for the masking, but that can catch only the case where the masking
is done in SImode, so typically in SImode in the source.
We then have another set of patterns, *_mask_1, which use
(and:QI (match_operand:QI) (match_operand "const_int_operand"))
If the masking is done in DImode or in theory in HImode, we don't match
it.
The following patch does 4 different things to improve this:
1) drops the mode from AND and MATCH_OPERAND inside of the subreg:QI
and replaces that by checking that the register shift count has
SWI48 mode - I think doing it this way is cheaper than adding
another mode iterator to patterns which use already another mode
iterator and sometimes a code iterator as well
2) the doubleword shift patterns were only handling the case where
the shift count is masked with a constant that has the most significant
bit clear, i.e. where we know the shift count is less than half the
number of bits in double-word. If the mask is equal to half the
number of bits in double-word minus 1, the masking was optimized
away, otherwise the AND was kept.
But if the most significant bit isn't clear, e use a word-sized shift
and SHRD instruction, where the former does the modulo and the latter
modulo with 64 / 32 depending on what mode the CPU is in (so 64 for
128-bit doubleword and 32 or 64-bit doubleword). So we can also
optimize away the masking when the mask has all the relevant bits set,
masking with the most significant bit will remain for the cmove
test.
3) as requested, this patch adds a bunch of force_reg calls before
gen_lowpart
4) 1-3 above unfortunately regressed
+FAIL: gcc.target/i386/bt-mask-2.c scan-assembler-not and[lq][ \\t]
+FAIL: gcc.target/i386/pr57819.c scan-assembler-not and[lq][ \\t]
where we during combine match the new pattern we didn't match
before and in the end don't match the pattern we were testing for.
These 2 tests are fixed by the *jcc_bt<mode>_mask_1 pattern
addition and small tweak to target rtx_costs, because even with
the pattern around we'd refuse to match it because it appeared to
have higher instruction cost
2022-06-02 Jakub Jelinek <jakub@redhat.com>
PR target/105778
* config/i386/i386.md (*ashl<dwi>3_doubleword_mask): Remove :SI
from AND and its operands and just verify operands[2] has HImode,
SImode or for TARGET_64BIT DImode. Allow operands[3] to be a mask
with all low 6 (64-bit) or 5 (32-bit) bits set and in that case
just throw away the masking. Use force_reg before calling
gen_lowpart.
(*ashl<dwi>3_doubleword_mask_1): Allow operands[3] to be a mask
with all low 6 (64-bit) or 5 (32-bit) bits set and in that case
just throw away the masking.
(*ashl<mode>3_doubleword): Rename to ...
(ashl<mode>3_doubleword): ... this.
(*ashl<mode>3_mask): Remove :SI from AND and its operands and just
verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
Use force_reg before calling gen_lowpart.
(*<insn><mode>3_mask): Likewise.
(*<insn><dwi>3_doubleword_mask): Likewise. Allow operands[3] to be
a mask with all low 6 (64-bit) or 5 (32-bit) bits set and in that
case just throw away the masking. Use force_reg before calling
gen_lowpart.
(*<insn><dwi>3_doubleword_mask_1): Allow operands[3] to be a mask
with all low 6 (64-bit) or 5 (32-bit) bits set and in that case just
throw away the masking.
(*<insn><mode>3_doubleword): Rename to ...
(<insn><mode>3_doubleword): ... this.
(*<insn><mode>3_mask): Remove :SI from AND and its operands and just
verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
Use force_reg before calling gen_lowpart.
(splitter after it): Remove :SI from AND and its operands and just
verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
(*<btsc><mode>_mask, *<btsc><mode>_mask): Remove :SI from AND and its
operands and just verify operands[1] has HImode, SImode or for
TARGET_64BIT DImode. Use force_reg before calling gen_lowpart.
(*jcc_bt<mode>_mask_1): New define_insn_and_split pattern.
* config/i386/i386.cc (ix86_rtx_costs): For ZERO_EXTRACT with
ZERO_EXTEND QI->SI in last operand ignore the cost of the ZERO_EXTEND.
* gcc.target/i386/pr105778.c: New test.
This relaxes the conditions on SLPing extracts from existing vectors
leveraging the relaxed VEC_PERM conditions on the input vs output
vector type compatibility. It also handles lowpart extracts
and concats without VEC_PERMs now.
2022-05-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/101668
* tree-vect-slp.cc (vect_build_slp_tree_1): Allow BIT_FIELD_REFs
for vector types with compatible lane types.
(vect_build_slp_tree_2): Deal with this.
(vect_add_slp_permutation): Adjust. Emit lowpart/concat
special cases without VEC_PERM.
(vectorizable_slp_permutation): Select the operand vector
type and relax requirements. Handle identity permutes
with mismatching operand types.
* optabs-query.cc (can_vec_perm_const_p): Only allow variable
permutes for op_mode == mode.
* gcc.target/i386/pr101668.c: New testcase.
* gcc.dg/vect/bb-slp-pr101668.c: Likewise.
This also fixes the type of the irange used for unswitching of
switch statements.
PR tree-optimization/105802
* tree-ssa-loop-unswitch.cc (find_unswitching_predicates_for_bb):
Make sure to also compute the range in the type of the switch index.
* g++.dg/opt/pr105802.C: New testcase.
Aligne __EH_FRAME_BEGIN__ to pointer size since gcc/unwind-dw2-fde.h has
/* The first few fields of a CIE. The CIE_id field is 0 for a CIE,
to distinguish it from a valid FDE. FDEs are aligned to an addressing
unit boundary, but the fields within are unaligned. */
struct dwarf_cie
{
uword length;
sword CIE_id;
ubyte version;
unsigned char augmentation[];
} __attribute__ ((packed, aligned (__alignof__ (void *))));
/* The first few fields of an FDE. */
struct dwarf_fde
{
uword length;
sword CIE_delta;
unsigned char pc_begin[];
} __attribute__ ((packed, aligned (__alignof__ (void *))));
which indicates that CIE/FDE should be aligned at the pointer size.
PR libgcc/27576
* crtstuff.c (__EH_FRAME_BEGIN__): Aligned to pointer size.
$ac_cv_prog_OBJDUMP contains the --host OBJDUMP that
libtool has inferred. Current config/gcc-plugin.m4 does
not respect the user's choice for OBJDUMP.
PR plugins/95648
config/
* gcc-plugin.m4: Use libtool's $ac_cv_prog_OBJDUMP.
gcc/
* configure: Regenerate.
libcc1/
* configure: Regenerate.
RTL DSE tracks redundant constant stores within a basic block. When RTL
loop invariant motion hoists a constant initialization out of the loop
into a separate basic block, the constant store value becomes unknown
within the original basic block. When recording store for RTL DSE, check
if the source register is set only once to a constant by a non-partial
unconditional load. If yes, record the constant as the constant store
source. It eliminates unrolled zero stores after memset 0 in a loop
where a vector register is used as the zero store source.
gcc/
PR rtl-optimization/105638
* df-core.cc (df_find_single_def_src): Moved and renamed from
find_single_def_src in loop-iv.cc. Change the argument to rtx
and use rtx_equal_p. Return null for partial or conditional
defs.
* df.h (df_find_single_def_src): New prototype.
* dse.cc (record_store): Use the constant source if the source
register is set only once.
* loop-iv.cc (find_single_def_src): Moved to df-core.cc.
(replace_single_def_regs): Replace find_single_def_src with
df_find_single_def_src.
gcc/testsuite/
PR rtl-optimization/105638
* g++.target/i386/pr105638.C: New test.
In r12-3643 I improved our handling of type names after . or -> when
unqualified lookup doesn't find anything, but it needs to handle auto
specially.
PR c++/105734
gcc/cp/ChangeLog:
* parser.cc (cp_parser_postfix_dot_deref_expression): Use typeof
if the expression has auto type.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/auto57.C: New test.
This testcase demonstrates that the issue in PR105623 is not limited to
templates, so we should do the marking in a less template-specific place.
PR c++/105779
gcc/cp/ChangeLog:
* call.cc (resolve_args): Call mark_single_function here.
* pt.cc (unify_one_argument): Not here.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/auto-fn63.C: New test.
Further cleanup option processing. Remove the duplication of global
variables for CPU and tune settings so that CPU option processing is
simplified even further. Move global variables that need save and
restore due to target option processing into aarch64.opt. This removes
the need for explicit saving/restoring and unnecessary reparsing of
options.
gcc/
* config/aarch64/aarch64.opt (explicit_tune_core): Rename to
selected_tune.
(explicit_arch): Rename to selected_arch.
(x_aarch64_override_tune_string): Remove.
(aarch64_ra_sign_key): Add as TargetVariable so it gets saved/restored.
(aarch64_override_tune_string): Add Save so it gets saved/restored.
* config/aarch64/aarch64.h (aarch64_architecture_version): Remove.
* config/aarch64/aarch64.cc (aarch64_architecture_version): Remove.
(processor): Remove archtecture_version field.
(selected_arch): Remove global.
(selected_cpu): Remove global.
(selected_tune): Remove global.
(aarch64_ra_sign_key): Move global to aarch64.opt so it is saved.
(aarch64_override_options_internal): Use aarch64_get_tune_cpu.
(aarch64_override_options): Further simplify code to only set
selected_arch and selected_tune globals.
(aarch64_option_save): Remove now that target options are saved.
(aarch64_option_restore): Remove redundant target option restores.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Use
AARCH64_ISA_V9.
* config/aarch64/aarch64-opts.h (aarch64_key_type): Add, moved from...
* config/aarch64/aarch64-protos.h (aarch64_key_type): Remove.
(aarch64_ra_sign_key): Remove.
A comparison with a constant is most likely always faster than
.MUL_OVERFLOW from which we only check whether it overflowed and not the
multiplication result, and even if not, it is simpler operation on GIMPLE
and even if a target exists where such multiplications with overflow checking
are cheaper than comparisons, because comparisons are so much more common
than overflow checking multiplications, it would be nice if it simply
arranged for comparisons to be emitted like those multiplications on its
own...
2022-06-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/30314
* match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
x > ~(utype)0 / cst): New simplification.
* gcc.dg/tree-ssa/pr30314.c: New test.
The guard generation for a static var init was overly verbose. We can
use a bit of RAII and avoid some rechecking. Also in the !cxa_atexit
case, the only difference is whether can become whether to use
post-inc or pre-dec.
gcc/cp/
* decl2.cc (fix_temporary_vars_context_r): Use data argument
for new context.
(one_static_initialization_or_destruction): Adjust tree walk
call. Refactor guard generation.
The static init/fini generation is showing some bitrot. This cleans
up several places to use C++, and also take advantage of already
having checked a variable for non-nullness.
gcc/cp/
* decl2.cc (ssdf_decl): Delete global.
(start_static_storage_duration_function): Use some RAII.
(do_static_initialization_or_destruction): Likewise.
(c_parse_final_cleanups): Likewise. Avoid rechecking 'vars'.
The end-of-compilation static init code generation functions are:
* Inconsistent in argument ordering (swapping 'is-init' and 'priority',
wrt each other and other arguments).
* Inconsistent in naming. mostly calling the is-init argument 'initp',
but sometimes calling it 'constructor_p' and in the worst case using
a transcoded 'methody_type' character, and naming the priority
argument 'initp'.
* Inconsistent in typing. Sometimes the priority is unsigned,
sometimes signed. And the initp argument can of course be a bool.
* Several of the function comments have bit-rotted.
This addresses those oddities. Name is-init 'initp', name priority
'priority'. Place initp first, make priority unsigned.
gcc/cp/
* decl2.cc (start_objects): Replace 'method_type' parameter
with 'initp' boolean, rename and retype 'priority' parameter.
(finish_objects): Likewise. Do not expand here.
(one_static_initialization_or_destruction): Move 'initp'
parameter first.
(do_static_initialization_or_destruction): Likewise.
(generate_ctor_or_dtor_function): Rename 'initp' parameter.
Adjust start_objects/finish_obects calls and expand here.
(generate_ctor_and_dtor_functions_for_priority): Adjust calls.
(c_parse_final_cleanups): Likewise.
(vtv_start_verification_constructor_init): Adjust.
(vtv_finish_verification_constructor_init): Use finish_objects.
This avoids matching strlen to a pointer result, avoiding ICEing
because of an integer adjustment using PLUS_EXPR on pointers.
2022-06-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/105786
* tree-loop-distribution.cc
(loop_distribution::transform_reduction_loop): Only do strlen
replacement for integer type reductions.
* gcc.dg/torture/pr105786.c: New testcase.
The following testcase ICEs because we use different types in comparison,
idx has int type, while CASE_LOW has char type.
While I believe all CASE_{LOW,HIGH} in the same switch have to use the same
or compatible type, the index expression can have a promoted type as happens
in this testcase. Other spots that handle switches do such foldings too.
2022-06-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105770
* tree-ssa-loop-unswitch.cc (find_unswitching_predicates_for_bb): Cast
CASE_LOW and CASE_HIGH to TREE_TYPE (idx) before comparisons with idx.
* gcc.dg/pr105770.c: New test.
This patch revamps the range allocator to handle generic vrange's.
I've cleaned it up somehow to make it obvious the various things you
can allocate with it. I've also moved away from overloads into
distinct names when appropriate.
The various entry points are now:
// Allocate a range of TYPE.
vrange *alloc_vrange (tree type);
// Allocate a memory block of BYTES.
void *alloc (unsigned bytes);
// Return a clone of SRC.
template <typename T> T *clone (const T &src);
It is now possible to allocate a clone of an irange, or any future
range types:
irange *i = allocator.clone <irange> (some_irange);
frange *f = allocator.clone <frange> (some_frange);
You can actually do so without the <>, but I find it clearer to
specify the vrange type.
So with it you can allocate a specific range type, or vrange, or a
block of memory.
I have rewritten the C style casts to C++ casts, since casts tend to
be hints of problematic designs. With the C++ casts you can at least
grep for them easier. Speak of which, the next patch, which converts
ranger to vrange, will further clean this space by removing some
unnecessary casts.
Tested on x86-64 Linux and ppc64le Linux.
* gimple-range-cache.cc (sbr_vector::sbr_vector): Adjust for
vrange allocator.
(sbr_vector::grow): Same.
(sbr_vector::set_bb_range): Same.
(sbr_sparse_bitmap::sbr_sparse_bitmap): Same.
(sbr_sparse_bitmap::set_bb_range): Same.
(block_range_cache::~block_range_cache): Same.
(block_range_cache::set_bb_range): Same.
(ssa_global_cache::ssa_global_cache): Same.
(ssa_global_cache::~ssa_global_cache): Same.
(ssa_global_cache::set_global_range): Same.
* gimple-range-cache.h (block_range_cache): Same.
(ssa_global_cache): Same.
* gimple-range-edge.cc
(gimple_outgoing_range::calc_switch_ranges): Same.
* gimple-range-edge.h (gimple_outgoing_range): Same.
* gimple-range-infer.cc (infer_range_manager::get_nonzero):
Same.
(infer_range_manager::add_range): Same.
* gimple-range-infer.h (class infer_range_manager): Same.
* value-range.h (class irange_allocator): Rename to...
(class vrange_allocator): ...this.
(irange_allocator::irange_allocator): New.
(vrange_allocator::vrange_allocator): New.
(irange_allocator::~irange_allocator): New.
(vrange_allocator::~vrange_allocator): New.
(irange_allocator::get_memory): Rename to...
(vrange_allocator::alloc): ...this.
(vrange_allocator::alloc_vrange): Rename from...
(irange_allocator::allocate): ...this.
(vrange_allocator::alloc_irange): New.
This patch provides the infrastructure to make range-ops type agnostic.
First, the range_op_handler function has been replaced with an object
of the same name. It's coded in such a way to minimize changes to the
code base, and to encapsulate the dispatch code.
Instead of:
range_operator *op = range_op_handler (code, type);
if (op)
op->fold_range (...);
We now do:
range_op_handler op (code, type);
if (op)
op->fold_range (...);
I've folded gimple_range_handler into the range_op_handler class,
since it's also a query into the range operators.
Instead of:
range_operator *handler = gimple_range_handler (stmt);
We now do:
range_op_handler handler (stmt);
This all has the added benefit of moving all the dispatch code into an
independent class and avoid polluting range_operator (which we'll
further split later when frange and prange come live).
There's this annoying "using" keyword that's been added to each
operator due to hiding rules in C++. The issue is that we will have
different virtual versions of fold_range() for each combination of
operands. For example:
// Traditional binary op on irange's.
fold_range (irange &lhs, const irange &op1, const irange &op2);
// For POINTER_DIFF_EXPR:
fold_range (irange &lhs, const prange &op1, const prange &op2);
// Cast from irange to prange.
fold_range (prange &lhs, const irange &op1, const irange &op2);
Overloading virtuals when there are multiple same named methods causes
hidden virtuals warnings from -Woverloaded-virtual, thus the using
keyword. An alternative would be to have different names:
fold_range_III, fold_range_IPP, fold_range_PII, but that's uglier
still.
Tested on x86-64 & ppc64le Linux.
gcc/ChangeLog:
* gimple-range-edge.cc (gimple_outgoing_range_stmt_p): Adjust for
vrange and convert range_op_handler function calls to use the
identically named object.
* gimple-range-fold.cc (gimple_range_operand1): Same.
(gimple_range_operand2): Same.
(fold_using_range::fold_stmt): Same.
(fold_using_range::range_of_range_op): Same.
(fold_using_range::range_of_builtin_ubsan_call): Same.
(fold_using_range::relation_fold_and_or): Same.
(fur_source::register_outgoing_edges): Same.
* gimple-range-fold.h (gimple_range_handler): Remove.
* gimple-range-gori.cc (gimple_range_calc_op1): Adjust for vrange.
(gimple_range_calc_op2): Same.
(range_def_chain::get_def_chain): Same.
(gori_compute::compute_operand_range): Same.
(gori_compute::condexpr_adjust): Same.
* gimple-range.cc (gimple_ranger::prefill_name): Same.
(gimple_ranger::prefill_stmt_dependencies): Same.
* range-op.cc (get_bool_state): Same.
(class operator_equal): Add using clause.
(class operator_not_equal): Same.
(class operator_lt): Same.
(class operator_le): Same.
(class operator_gt): Same.
(class operator_ge): Same.
(class operator_plus): Same.
(class operator_minus): Same.
(class operator_mult): Same.
(class operator_exact_divide): Same.
(class operator_lshift): Same.
(class operator_rshift): Same.
(class operator_cast): Same.
(class operator_logical_and): Same.
(class operator_bitwise_and): Same.
(class operator_logical_or): Same.
(class operator_bitwise_or): Same.
(class operator_bitwise_xor): Same.
(class operator_trunc_mod): Same.
(class operator_logical_not): Same.
(class operator_bitwise_not): Same.
(class operator_cst): Same.
(class operator_identity): Same.
(class operator_unknown): Same.
(class operator_abs): Same.
(class operator_negate): Same.
(class operator_addr_expr): Same.
(class pointer_or_operator): Same.
(operator_plus::op1_range): Adjust for vrange.
(operator_minus::op1_range): Same.
(operator_mult::op1_range): Same.
(operator_cast::op1_range): Same.
(operator_bitwise_not::fold_range): Same.
(operator_negate::fold_range): Same.
(range_op_handler): Rename to...
(get_handler): ...this.
(range_op_handler::range_op_handler): New.
(range_op_handler::fold_range): New.
(range_op_handler::op1_range): New.
(range_op_handler::op2_range): New.
(range_op_handler::lhs_op1_relation): New.
(range_op_handler::lhs_op2_relation): New.
(range_op_handler::op1_op2_relation): New.
(range_cast): Adjust for vrange.
* range-op.h (range_op_handler): Remove function.
(range_cast): Adjust for vrange.
(class range_op_handler): New.
(get_bool_state): Adjust for vrange.
(empty_range_varying): Same.
(relop_early_resolve): Same.
* tree-data-ref.cc (compute_distributive_range): Same.
* tree-vrp.cc (get_range_op_handler): Remove.
(range_fold_binary_symbolics_p): Use range_op_handler class
instead of get_range_op_handler.
(range_fold_unary_symbolics_p): Same.
(range_fold_binary_expr): Same.
(range_fold_unary_expr): Same.
* value-query.cc (range_query::get_tree_range): Adjust for vrange.
Now that we have generic ranges, we need a way to define generic local
temporaries on the stack for intermediate calculations in the ranger
and elsewhere. We need temporaries analogous to int_range_max, but
for any of the supported types (currently just integers, but soon
integers, pointers, and floats).
The Value_Range object is such a temporary. It is designed to be
transparently used as a vrange. It shares vrange's abstract API, and
implicitly casts itself to a vrange when passed around.
The ultimate name will be value_range, but we need to remove legacy
first for that to happen. Until then, Value_Range will do.
Sample usage is as follows. Instead of:
extern void foo (vrange &);
int_range_max t;
t.set_nonzero (type);
foo (t);
one does:
Value_Range t (type);
t.set_nonzero (type);
foo (t);
You can also delay initialization, for use in loops for example:
Value_Range t;
...
t.set_type (type);
t.set_varying (type);
Creating an supported range type, will result in an unsupported_range
object being created, which will trap if anything but set_undefined()
and undefined_p() are called on it. There's no size penalty for the
unsupported_range, since its immutable and can be shared across
instances.
Since supports_type_p() is called at construction time for each
temporary, I've removed the non-zero check from this function, which
was mostly unneeded. I fixed the handful of callers that were
passing null types, and in the process sped things up a bit.
As more range types come about, the Value_Range class will be augmented
to support them by adding the relevant bits in the initialization
code, etc.
Tested on x86-64 & ppc64le Linux.
gcc/ChangeLog:
* gimple-range-fold.h (gimple_range_type): Check type before
calling supports_type_p.
* gimple-range-path.cc (path_range_query::range_of_stmt): Same.
* value-query.cc (range_query::get_tree_range): Same.
* value-range.cc (Value_Range::lower_bound): New.
(Value_Range::upper_bound): New.
(Value_Range::dump): New.
* value-range.h (class Value_Range): New.
(irange::supports_type_p): Do not check if type is non-zero.
This is a series of patches making ranger type agnostic in preparation
for contributing support for other types of ranges (pointers and
floats initially).
The first step in this process is to implement vrange, an abstract
class that will be exclusively used by ranger, and from which all
ranges will inherit. Vrange provides the minimum operations for
ranger to work. The current virtual methods are what we've used to
implement frange (floats) and prange (pointers), but we may restrict
the virtual methods further as other ranges come about
(i.e. set_nonnegative() has no meaning for a future string range).
This patchset also provides a mechanism for declaring local type
agnostic ranges that can transparently hold an irange, frange,
prange's, etc, and a dispatch mechanism for range-ops to work with
various range types. More details in the relevant patches.
FUTURE PLAN
===========
The plan after this is to contribute a bare bones implementation for
floats (frange) that will provide relationals, followed by a
separation of integers and pointers (irange and prange). Once this is
in place, we can further enhance both floats and pointers. For
example, pointer tracking, pointer plus optimizations, and keeping
track of NaN's, etc.
Once frange and prange come live, all ranger clients will immediately
benefit from these enhancements. For instance, in our local branch,
the threader is already float aware with regards to relationals.
We expect to wait a few weeks before starting to contribute further
enhancements to give the tree a time to stabilize, and Andrew time to
rebase his upcoming patches :-P.
NOTES
=====
In discussions with Andrew, it has become clear that with vrange
coming about, supports_type_p() is somewhat ambiguous. Prior to
vrange it has been used to (a) determine if a type is supported by
ranger, (b) as a short-cut for checking if a type is pointer or integer,
as well as (c) to see if a given range can hold a type. These things
have had the same meaning in irange, but are slightly different with
vrange. I will address this in a follow-up patch.
Speaking of supported types, we now provide an unsupported_range
for passing around ranges for unsupported types. We've been silently
doing this for a while, in both vr-values by creating VARYING for
unsupported types with error_mark_node end points, and in ranger when
we pass an unsupported range before we realize in range_of_expr that
it's unsupported. This class just formalizes what we've already been
doing in an irange, but making it explicit that you can't do anything
with these ranges except pass them. Any other operation traps.
There is no GTY support for vrange yet, as we don't store it long
term. When we contribute support for global ranges (think
SSA_NAME_RANGE_INFO but for generic ranges), we will include it. There
was just no need to pollute this patchset with it.
TESTING
=======
The patchset has been tested on x86-64 Linux as well as ppc64 Linux.
I have also verified that we fold the same number of conditionals in
evrp as well as thread the same number of paths. There should be no
user visible changes.
We have also benchmarked the work, with the final numbers being an
*improvement* of 1.92% for evrp, and 0.82% for VRP. Overall
compilation has a miniscule improvement. This is despite the extra
indirection level.
The improvements are mostly because of small cleanups required for the
generalization of ranges. As a sanity check, I stuck kcachegrind on a
few sample .ii files to see where the time was being gained. Most of
the gain came from gimple_range_global() being 19% faster. This
function is called a lot, and it was constructing a legacy
value_range, then returning it by value, which the caller then had to
convert to an irange. This is in line with other pending work:
anytime we get rid of legacy, we gain time.
I will wait a few days before committing to welcome any comments.
gcc/ChangeLog:
* value-range-equiv.cc (value_range_equiv::set): New.
* value-range-equiv.h (class value_range_equiv): Make set method
virtual.
Remove default bitmap argument from set method.
* value-range.cc (vrange::contains_p): New.
(vrange::singleton_p): New.
(vrange::operator=): New.
(vrange::operator==): New.
(irange::fits_p): Move to .cc file.
(irange::set_nonnegative): New.
(unsupported_range::unsupported_range): New.
(unsupported_range::set): New.
(unsupported_range::type): New.
(unsupported_range::set_undefined): New.
(unsupported_range::set_varying): New.
(unsupported_range::dump): New.
(unsupported_range::union_): New.
(unsupported_range::intersect): New.
(unsupported_range::zero_p): New.
(unsupported_range::nonzero_p): New.
(unsupported_range::set_nonzero): New.
(unsupported_range::set_zero): New.
(unsupported_range::set_nonnegative): New.
(unsupported_range::fits_p): New.
(irange::set): Call irange::set_undefined.
(irange::verify_range): Check discriminator field.
(irange::dump): Dump [irange] marker.
(irange::debug): Move to...
(vrange::debug): ...here.
(dump_value_range): Accept vrange.
(debug): Same.
* value-range.h (enum value_range_discriminator): New.
(class vrange): New.
(class unsupported_range): New.
(struct vrange_traits): New.
(is_a): New.
(as_a): New.
(class irange): Inherit from vrange.
(dump_value_range): Adjust for vrange.
(irange::kind): Rename to...
(vrange::kind): ...this.
(irange::varying_p): Rename to...
(vrange::varying_p): ...this.
(irange::undefined_p): Rename to...
(vrange::undefined_p): ...this.
(irange::irange): Set discriminator.
(irange::union_): Convert to irange before passing to irange
method.
(irange::intersect): Same.
(vrange::supports_type_p): New.
* vr-values.cc (vr_values::extract_range_from_binary_expr): Pass
NULL bitmap argument to value_range_equiv::set.
(vr_values::extract_range_basic): Same.