Commit Graph

201957 Commits

Author SHA1 Message Date
David Malcolm
4f01ae3761 diagnostics: add support for "text art" diagrams
Existing text output in GCC has to be implemented by writing
sequentially to a pretty_printer instance.  This makes it
hard to implement some kinds of diagnostic output (see e.g.
diagnostic-show-locus.cc).

This patch adds more flexible ways of creating text output:
- a canvas class, which can be "painted" to via random-access (rather
than sequentially)
- a table class for 2D grid layout, supporting items that span
multiple rows/columns
- a widget class for organizing diagrams hierarchically.

The patch also expands GCC's diagnostics subsystem so that diagnostics
can have "text art" diagrams - think ASCII art, but potentially
including some Unicode characters, such as box-drawing chars.

The new code is in a new "gcc/text-art" subdirectory and "text_art"
namespace.

The patch adds a new "-fdiagnostics-text-art-charset=VAL" option, with
values:
- "none": don't emit diagrams (added to -fdiagnostics-plain-output)
- "ascii": use pure ASCII in diagrams
- "unicode": allow for conservative use of unicode drawing characters
(such as box-drawing characters).
- "emoji" (the default): as "unicode", but potentially allow for
conservative use of emoji in the output (such as U+26A0 WARNING SIGN).
I made it possible to disable emoji separately from unicode as I believe
there's a generation gap in acceptance of these characters (some older
programmers have a visceral reaction against them, whereas younger
programmers may have no problem with them).

Diagrams are emitted to stderr by default.  With SARIF output they are
captured as a location in "relatedLocations", with the diagram as a
code block in Markdown within a "markdown" property of a message.

This patch doesn't add any such diagram usage to GCC, saving that for
followups, apart from adding a plugin to the test suite to exercise the
functionality.

contrib/ChangeLog:
	* unicode/gen-box-drawing-chars.py: New file.
	* unicode/gen-combining-chars.py: New file.
	* unicode/gen-printable-chars.py: New file.

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add text-art/box-drawing.o,
	text-art/canvas.o, text-art/ruler.o, text-art/selftests.o,
	text-art/style.o, text-art/styled-string.o, text-art/table.o,
	text-art/theme.o, and text-art/widget.o.
	* color-macros.h (COLOR_FG_BRIGHT_BLACK): New.
	(COLOR_FG_BRIGHT_RED): New.
	(COLOR_FG_BRIGHT_GREEN): New.
	(COLOR_FG_BRIGHT_YELLOW): New.
	(COLOR_FG_BRIGHT_BLUE): New.
	(COLOR_FG_BRIGHT_MAGENTA): New.
	(COLOR_FG_BRIGHT_CYAN): New.
	(COLOR_FG_BRIGHT_WHITE): New.
	(COLOR_BG_BRIGHT_BLACK): New.
	(COLOR_BG_BRIGHT_RED): New.
	(COLOR_BG_BRIGHT_GREEN): New.
	(COLOR_BG_BRIGHT_YELLOW): New.
	(COLOR_BG_BRIGHT_BLUE): New.
	(COLOR_BG_BRIGHT_MAGENTA): New.
	(COLOR_BG_BRIGHT_CYAN): New.
	(COLOR_BG_BRIGHT_WHITE): New.
	* common.opt (fdiagnostics-text-art-charset=): New option.
	(diagnostic-text-art.h): New SourceInclude.
	(diagnostic_text_art_charset): New Enum and EnumValues.
	* configure: Regenerate.
	* configure.ac (gccdepdir): Add text-art to loop.
	* diagnostic-diagram.h: New file.
	* diagnostic-format-json.cc (json_emit_diagram): New.
	(diagnostic_output_format_init_json): Wire it up to
	context->m_diagrams.m_emission_cb.
	* diagnostic-format-sarif.cc: Include "diagnostic-diagram.h" and
	"text-art/canvas.h".
	(sarif_result::on_nested_diagnostic): Move code to...
	(sarif_result::add_related_location): ...this new function.
	(sarif_result::on_diagram): New.
	(sarif_builder::emit_diagram): New.
	(sarif_builder::make_message_object_for_diagram): New.
	(sarif_emit_diagram): New.
	(diagnostic_output_format_init_sarif): Set
	context->m_diagrams.m_emission_cb to sarif_emit_diagram.
	* diagnostic-text-art.h: New file.
	* diagnostic.cc: Include "diagnostic-text-art.h",
	"diagnostic-diagram.h", and "text-art/theme.h".
	(diagnostic_initialize): Initialize context->m_diagrams and
	call diagnostics_text_art_charset_init.
	(diagnostic_finish): Clean up context->m_diagrams.m_theme.
	(diagnostic_emit_diagram): New.
	(diagnostics_text_art_charset_init): New.
	* diagnostic.h (text_art::theme): New forward decl.
	(class diagnostic_diagram): Likewise.
	(diagnostic_context::m_diagrams): New field.
	(diagnostic_emit_diagram): New decl.
	* doc/invoke.texi (Diagnostic Message Formatting Options): Add
	-fdiagnostics-text-art-charset=.
	(-fdiagnostics-plain-output): Add
	-fdiagnostics-text-art-charset=none.
	* gcc.cc: Include "diagnostic-text-art.h".
	(driver_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
	* opts-common.cc (decode_cmdline_options_to_array): Add
	"-fdiagnostics-text-art-charset=none" to expanded_args for
	-fdiagnostics-plain-output.
	* opts.cc: Include "diagnostic-text-art.h".
	(common_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
	* pretty-print.cc (pp_unicode_character): New.
	* pretty-print.h (pp_unicode_character): New decl.
	* selftest-run-tests.cc: Include "text-art/selftests.h".
	(selftest::run_tests): Call text_art_tests.
	* text-art/box-drawing-chars.inc: New file, generated by
	contrib/unicode/gen-box-drawing-chars.py.
	* text-art/box-drawing.cc: New file.
	* text-art/box-drawing.h: New file.
	* text-art/canvas.cc: New file.
	* text-art/canvas.h: New file.
	* text-art/ruler.cc: New file.
	* text-art/ruler.h: New file.
	* text-art/selftests.cc: New file.
	* text-art/selftests.h: New file.
	* text-art/style.cc: New file.
	* text-art/styled-string.cc: New file.
	* text-art/table.cc: New file.
	* text-art/table.h: New file.
	* text-art/theme.cc: New file.
	* text-art/theme.h: New file.
	* text-art/types.h: New file.
	* text-art/widget.cc: New file.
	* text-art/widget.h: New file.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-text-art-ascii-bw.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-ascii-color.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-none.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-unicode-bw.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-unicode-color.c: New test.
	* gcc.dg/plugin/diagnostic_plugin_test_text_art.c: New test plugin.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add them.

libcpp/ChangeLog:
	* charset.cc (get_cppchar_property): New function template, based
	on...
	(cpp_wcwidth): ...this function.  Rework to use the above.
	Include "combining-chars.inc".
	(cpp_is_combining_char): New function.
	Include "printable-chars.inc".
	(cpp_is_printable_char): New function.
	* combining-chars.inc: New file, generated by
	contrib/unicode/gen-combining-chars.py.
	* include/cpplib.h (cpp_is_combining_char): New function decl.
	(cpp_is_printable_char): New function decl.
	* printable-chars.inc: New file, generated by
	contrib/unicode/gen-printable-chars.py.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21 21:49:00 -04:00
David Malcolm
985d6480fe testsuite: move handle-multiline-outputs to before check for blank lines
I have followup patches that require checking for multiline patterns
that have blank lines within them, so this moves the handling of
multiline patterns before the check for blank lines, allowing for such
multiline patterns.

Doing so uncovers some issues with existing multiline directives, which
the patch fixes.

gcc/testsuite/ChangeLog:
	* c-c++-common/Wlogical-not-parentheses-2.c: Split up the
	multiline directive.
	* gcc.dg/analyzer/malloc-macro-inline-events.c: Remove redundant
	dg-regexp directives.
	* gcc.dg/missing-header-fixit-5.c: Split up the multiline
	directives.
	* lib/gcc-dg.exp (gcc-dg-prune): Move call to
	handle-multiline-outputs from prune_gcc_output to here.
	* lib/multiline.exp (dg-end-multiline-output): Move call to
	maybe-handle-nn-line-numbers from prune_gcc_output to here.
	* lib/prune.exp (prune_gcc_output): Move calls to
	maybe-handle-nn-line-numbers and handle-multiline-outputs from
	here to the above.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21 21:48:59 -04:00
Ian Lance Taylor
cb760f66e0 compiler: determine types of Slice_{value,info} expressions
This fixes an accidental omission in the determine types pass.

Test case is https://go.dev/cl/505015.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504797
2023-06-21 17:52:47 -07:00
GCC Administrator
80e9ca0e36 Daily bump. 2023-06-22 00:17:09 +00:00
Uros Bizjak
ce47d3c2cf function: Change return type of predicate function from int to bool
Also change some internal variables to bool and some functions to void.

gcc/ChangeLog:

	* function.h (emit_initial_value_sets):
	Change return type from int to void.
	(aggregate_value_p): Change return type from int to bool.
	(prologue_contains): Ditto.
	(epilogue_contains): Ditto.
	(prologue_epilogue_contains): Ditto.
	* function.cc (temp_slot): Make "in_use" variable bool.
	(make_slot_available): Update for changed "in_use" variable.
	(assign_stack_temp_for_type): Ditto.
	(emit_initial_value_sets): Change return type from int to void
	and update function body accordingly.
	(instantiate_virtual_regs): Ditto.
	(rest_of_handle_thread_prologue_and_epilogue): Ditto.
	(safe_insn_predicate): Change return type from int to bool.
	(aggregate_value_p): Change return type from int to bool
	and update function body accordingly.
	(prologue_contains): Change return type from int to bool.
	(prologue_epilogue_contains): Ditto.
2023-06-21 21:56:14 +02:00
Alexander Monakov
1c1dd39625 c-family: implement -ffp-contract=on
Implement -ffp-contract=on for C and C++ without changing default
behavior (=off for -std=cNN, =fast for C++ and -std=gnuNN).
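
As a small illustration (a sketch, not the patch's own testcase), contraction
means a multiply-add like the one below may be evaluated as a single fused
operation with no intermediate rounding of the product when -ffp-contract=on
is in effect:

double
contract_example (double a, double b, double c)
{
  /* Candidate for contraction into a single fused multiply-add with
     -ffp-contract=on; with =off the product is rounded separately.  */
  return a * b + c;
}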

gcc/c-family/ChangeLog:

	* c-gimplify.cc (fma_supported_p): New helper.
	(c_gimplify_expr) [PLUS_EXPR, MINUS_EXPR]: Implement FMA
	contraction.

gcc/ChangeLog:

	* common.opt (fp_contract_mode) [on]: Remove fallback.
	* config/sh/sh.md (*fmasf4): Correct flag_fp_contract_mode test.
	* doc/invoke.texi (-ffp-contract): Update.
	* trans-mem.cc (diagnose_tm_1): Skip internal function calls.
2023-06-21 21:31:25 +03:00
Paul Thomas
577223aebc Fortran: Fix some bugs in associate [PR87477]
2023-06-21  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/87477
	PR fortran/88688
	PR fortran/94380
	PR fortran/107900
	PR fortran/110224
	* decl.cc (char_len_param_value): Fix memory leak.
	(resolve_block_construct): Remove unnecessary static decls.
	* expr.cc (gfc_is_ptr_fcn): New function.
	(gfc_check_vardef_context): Use it to permit pointer function
	result selectors to be used for associate names in variable
	definition context.
	* gfortran.h: Prototype for gfc_is_ptr_fcn.
	* match.cc (build_associate_name): New function.
	(gfc_match_select_type): Use the new function to replace inline
	version and to build a new associate name for the case where
	the supplied associate name is already used for that purpose.
	* resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow
	associate names with pointer function targets to be used in
	variable definition context.
	* trans-decl.cc (gfc_get_symbol_decl): Unlimited polymorphic
	variables need deferred initialisation of the vptr.
	(gfc_trans_deferred_vars): Do the vptr initialisation.
	* trans-stmt.cc (trans_associate_var): Ensure that a pointer
	associate name points to the target of the selector and not
	the selector itself.

gcc/testsuite/
	PR fortran/87477
	PR fortran/107900
	* gfortran.dg/pr107900.f90 : New test

	PR fortran/110224
	* gfortran.dg/pr110224.f90 : New test

	PR fortran/88688
	* gfortran.dg/pr88688.f90 : New test

	PR fortran/94380
	* gfortran.dg/pr94380.f90 : New test

	PR fortran/95398
	* gfortran.dg/pr95398.f90 : Set -std=f2008, bump the line
	numbers in the error tests by two and change the text in two.
2023-06-21 17:05:58 +01:00
Paul Thomas
caf0892eea Fortran: Seg fault passing string to type cptr dummy [PR108961].
2023-06-21  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/108961
	* trans-expr.cc (gfc_conv_procedure_call): The hidden string
	length must not be passed to a formal arg of type(cptr).

gcc/testsuite/
	PR fortran/108961
	* gfortran.dg/pr108961.f90: New test.
2023-06-21 17:01:57 +01:00
Uros Bizjak
b9401c3a32 vect: Add testcases for unsigned conversions [PR110018]
Also test conversions with unsigned types.

	PR target/110018

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr110018-1.c: Use explicit signed types.
	* gcc.target/i386/pr110018-2.c: New test.
2023-06-21 16:34:39 +02:00
Kyrylo Tkachov
b375c5340b aarch64: Avoid same input and output Z register for gather loads
The architecture recommends that load-gather instructions avoid using the same
Z register for the load address and the destination, and the Software Optimization
Guides for Arm cores recommend that as well.
This means that for code like:

svuint64_t
food (svbool_t p, uint64_t *in, svint64_t offsets, svuint64_t a)
{
  return svadd_u64_x (p, a, svld1_gather_offset(p, in, offsets));
}

we'll want to avoid generating the current:
food:
        ld1d    z0.d, p0/z, [x0, z0.d] // Z0 reused as input and output.
        add     z0.d, z1.d, z0.d
        ret

However, we still want to avoid generating extra moves where there were
none before, so the tight aarch64-sve-acle.exp tests for load gathers
should still pass as they are.

This patch implements that recommendation for the load gather patterns by:
* duplicating the alternatives
* marking the output operand as early clobber
* Tying the input Z register operand in the original alternatives to 0
* Penalising the original alternatives with '?'

This results in a large-ish patch in terms of diff lines but the new
compact syntax (thanks Tamar) makes it quite a readable and regular change.

The benchmark numbers on a Neoverse V1 on fprate look okay:
	        diff
503.bwaves_r	0.00%
507.cactuBSSN_r	0.00%
508.namd_r	0.00%
510.parest_r	0.55%
511.povray_r	0.22%
519.lbm_r	0.00%
521.wrf_r	0.00%
526.blender_r	0.00%
527.cam4_r	0.56%
538.imagick_r	0.00%
544.nab_r	0.00%
549.fotonik3d_r	0.00%
554.roms_r	0.00%
fprate	        0.10%

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>):
	Add alternatives to prefer to avoid same input and output Z register.
	(mask_gather_load<mode><v_int_container>): Likewise.
	(*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise.
	(*mask_gather_load<mode><v_int_container>_sxtw): Likewise.
	(*mask_gather_load<mode><v_int_container>_uxtw): Likewise.
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>):
	Likewise.
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>):
	Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_sxtw): Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_uxtw): Likewise.
	(@aarch64_ldff1_gather<mode>): Likewise.
	(@aarch64_ldff1_gather<mode>): Likewise.
	(*aarch64_ldff1_gather<mode>_sxtw): Likewise.
	(*aarch64_ldff1_gather<mode>_uxtw): Likewise.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode>
	<VNx4_NARROW:mode>): Likewise.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>): Likewise.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>_sxtw): Likewise.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>_uxtw): Likewise.
	* config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise.
	(@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode>
	<SVE_PARTIAL_I:mode>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/gather_earlyclobber.c: New test.
	* gcc.target/aarch64/sve2/gather_earlyclobber.c: New test.
2023-06-21 13:43:26 +01:00
Kyrylo Tkachov
4d9d207c66 aarch64: Convert SVE gather patterns to compact syntax
This patch converts the SVE load gather patterns to the new compact syntax
that Tamar introduced. This allows for a future patch I want to contribute
to add more alternatives that are better viewed in the more compact form.

The lines in some patterns are >80 characters long now, but I think that's unavoidable
and those patterns already had overly long constraint strings.

No functional change intended.
Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>):
	Convert to compact alternatives syntax.
	(mask_gather_load<mode><v_int_container>): Likewise.
	(*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise.
	(*mask_gather_load<mode><v_int_container>_sxtw): Likewise.
	(*mask_gather_load<mode><v_int_container>_uxtw): Likewise.
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>):
	Likewise.
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>):
	Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_sxtw): Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_uxtw): Likewise.
	(@aarch64_ldff1_gather<mode>): Likewise.
	(@aarch64_ldff1_gather<mode>): Likewise.
	(*aarch64_ldff1_gather<mode>_sxtw): Likewise.
	(*aarch64_ldff1_gather<mode>_uxtw): Likewise.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode>
	<VNx4_NARROW:mode>): Likewise.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>): Likewise.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>_sxtw): Likewise.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>_uxtw): Likewise.
	* config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise.
	(@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode>
	<SVE_PARTIAL_I:mode>): Likewise.
2023-06-21 13:40:15 +01:00
Kyrylo Tkachov
31cd5f9ae4 Revert "aarch64: Convert SVE gather patterns to compact syntax"
This reverts commit bb3c69058a.
2023-06-21 13:38:56 +01:00
Ju-Zhe Zhong
4b23d10ce8 Move can_vec_mask_load_store_p and get_len_load_store_mode from "optabs-query" into "optabs-tree"
Since we want both can_vec_mask_load_store_p and get_len_load_store_mode
to be able to see "internal_fn", move these two functions into optabs-tree.

gcc/ChangeLog:

	* optabs-query.cc (can_vec_mask_load_store_p): Move to optabs-tree.cc.
	(get_len_load_store_mode): Ditto.
	* optabs-query.h (can_vec_mask_load_store_p): Move to optabs-tree.h.
	(get_len_load_store_mode): Ditto.
	* optabs-tree.cc (can_vec_mask_load_store_p): New function.
	(get_len_load_store_mode): Ditto.
	* optabs-tree.h (can_vec_mask_load_store_p): Ditto.
	(get_len_load_store_mode): Ditto.
	* tree-if-conv.cc: Include optabs-tree instead of optabs-query.
2023-06-21 20:29:10 +08:00
Richard Biener
b54d0f2959 Less strip_offset in IVOPTs
This avoids one strip_offset use in add_iv_candidate_for_use where
we know it operates on a sizetype quantity.

	* tree-ssa-loop-ivopts.cc (add_iv_candidate_for_use): Use
	split_constant_offset for the POINTER_PLUS_EXPR case.
2023-06-21 13:38:10 +02:00
Richard Biener
5d88932657 Less strip_offset in IVOPTs
This avoids a strip_offset use in record_group_use where we know
it operates on addresses.

	* tree-ssa-loop-ivopts.cc (record_group_use): Use
	split_constant_offset.
2023-06-21 13:38:09 +02:00
Richard Biener
fb0447b1f6 Hide IVOPTs strip_offset
PR110243 shows strip_offset has some correctness issues, the following
avoids using it from loop distribution which can use the more correct
split_constant_offset from data-ref analysis instead.  The patch then
un-exports the function from IVOPTs.

	* tree-loop-distribution.cc (classify_builtin_st): Use
	split_constant_offset.
	* tree-ssa-loop-ivopts.h (strip_offset): Remove.
	* tree-ssa-loop-ivopts.cc (strip_offset): Make static.
2023-06-21 13:38:02 +02:00
Kyrylo Tkachov
bb3c69058a aarch64: Convert SVE gather patterns to compact syntax
This patch converts the SVE load gather patterns to the new compact syntax
that Tamar introduced. This allows for a future patch I want to contribute
to add more alternatives that are better viewed in the more compact form.

The lines in some patterns are >80 characters long now, but I think that's unavoidable
and those patterns already had overly long constraint strings.

No functional change intended.
Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>):
	Convert to compact alternatives syntax.
	(mask_gather_load<mode><v_int_container>): Likewise.
	(*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise.
	(*mask_gather_load<mode><v_int_container>_sxtw): Likewise.
	(*mask_gather_load<mode><v_int_container>_uxtw): Likewise.
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>):
	Likewise.
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>):
	Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_sxtw): Likewise.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode>
	<SVE_2BHSI:mode>_uxtw): Likewise.
	(@aarch64_ldff1_gather<mode>): Likewise.
	(@aarch64_ldff1_gather<mode>): Likewise.
	(*aarch64_ldff1_gather<mode>_sxtw): Likewise.
	(*aarch64_ldff1_gather<mode>_uxtw): Likewise.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode>
	<VNx4_NARROW:mode>): Likewise.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>): Likewise.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>_sxtw): Likewise.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode>
	<VNx2_NARROW:mode>_uxtw): Likewise.
	* config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise.
	(@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode>
	<SVE_PARTIAL_I:mode>): Likewise.
2023-06-21 12:03:22 +01:00
Tamar Christina
b8b19729e6 docs: replace backslashchar [PR 110329].
It seems like @backslashchar{} is a relatively new addition
to texinfo.  Other parts of the docs use @samp{\} so use it
here too so older distros work.

gcc/ChangeLog:

	PR other/110329
	* doc/md.texi: Replace backslashchar.
2023-06-21 10:34:54 +01:00
Richard Biener
24c125fe47 [i386] Reject too large vectors for partial vector vectorization
The following works around the x86 backend not making the
vectorizer compare the costs of the different possible vector
sizes the backend advertises through the vector_modes hook.  When
enabling masked epilogues or main loops this means we will
select the preferred vector mode, which is usually the largest, even
for loops that do not iterate anywhere near as many times as the
vector has lanes.  When not using masking the vectorizer would reject
any mode resulting in a VF bigger than the number of iterations,
but with masking the excess lanes are simply masked out.

So this overloads the finish_cost function and matches for
the problematic case, forcing a high cost to make us try a
smaller vector size.
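
As a hypothetical illustration (not the committed testcase), a loop like the
following has far fewer iterations than a 512-bit vector has lanes, so with
masking most lanes would be wasted and a smaller vector size is preferable:

void
low_trip_count (float *restrict a, float *restrict b)
{
  /* Only 4 iterations: a fully masked 16-lane zmm loop would leave most
     lanes masked out on every iteration.  */
  for (int i = 0; i < 4; i++)
    a[i] += b[i];
}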

	* config/i386/i386.cc (ix86_vector_costs::finish_cost):
	Overload.  For masked main loops make sure the vectorization
	factor isn't more than double the number of iterations.

	* gcc.target/i386/vect-partial-vectors-1.c: New testcase.
	* gcc.target/i386/vect-partial-vectors-2.c: Likewise.
2023-06-21 09:10:13 +02:00
Jan Beulich
864c6471bd x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F
There's no reason to constrain this to AVX512VL, unless instructed so by
-mprefer-vector-width=, as the wider operation is unusable for more
narrow operands only when the possible memory source is a non-broadcast
one. This way even the scalar copysign<mode>3 can benefit from the
operation being a single-insn one (leaving aside moves which the
compiler decides to insert for unclear reasons, and leaving aside the
fact that bcst_mem_operand() is too restrictive for broadcast to be
embedded right into VPTERNLOG*).
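
For instance (an illustrative sketch, not a test from this patch), a scalar
copysign such as the following is the kind of single-insn beneficiary meant
above, since the underlying bit selection maps onto one VPTERNLOG:

#include <math.h>

double
sign_copy (double x, double y)
{
  /* Select the magnitude bits of x and the sign bit of y.  */
  return copysign (x, y);
}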

While there also bring *<avx512>_vternlog<mode>_all's in sync with that
of the three splitters.

Along with this also request value duplication in
ix86_expand_copysign()'s call to ix86_build_signbit_mask(), eliminating
excess space allocation in .rodata.*, filled with zeros which are never
read.

gcc/

	* config/i386/i386-expand.cc (ix86_expand_copysign): Request
	value duplication by ix86_build_signbit_mask() when AVX512F and
	not HFmode.
	* config/i386/sse.md (*<avx512>_vternlog<mode>_all): Convert to
	2-alternative form. Adjust "mode" attribute. Add "enabled"
	attribute.
	(*<avx512>_vpternlog<mode>_1): Also permit when TARGET_AVX512F
	&& !TARGET_PREFER_AVX256.
	(*<avx512>_vpternlog<mode>_2): Likewise.
	(*<avx512>_vpternlog<mode>_3): Likewise.

gcc/testsuite/

	* gcc.target/i386/avx512f-copysign.c: New test.
2023-06-21 08:03:05 +02:00
Jan Beulich
67061960b6 x86: add -mprefer-vector-width=512 to new avx512f-dupv2di.c testcase
This is to cover testing also being done with -march=cascadelake.

gcc/testsuite/

	* gcc.target/i386/avx512f-dupv2di.c: Add
	-mprefer-vector-width=512.
2023-06-21 08:01:32 +02:00
liuhongt
6f19cf7526 Use intermediate integer type for float_expr/fix_trunc_expr when no direct optab exists.
We already use an intermediate type in the WIDEN case, but not for NONE;
this patch extends that.

gcc/ChangeLog:

	PR target/110018
	* tree-vect-stmts.cc (vectorizable_conversion): Use an
	intermediate integer type for float_expr/fix_trunc_expr when
	no direct optab exists.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr110018-1.c: New test.
2023-06-21 10:16:07 +08:00
GCC Administrator
bfc6d29f8b Daily bump. 2023-06-21 00:17:14 +00:00
Tamar Christina
f5d0cec170 gensupport: drop support for define_cond_exec from compact syntax
define_cond_exec does not support the special @@ syntax
and so can't support {@.  As such just remove support
for it.

gcc/ChangeLog:

	PR bootstrap/110324
	* gensupport.cc (convert_syntax): Explicitly check for RTX code.
2023-06-20 23:31:40 +01:00
Lewis Hyatt
4f3be7cbeb libcpp: Improve location for macro names [PR66290]
When libcpp reports diagnostics whose locus is a macro name (such as for
-Wunused-macros), it uses the location in the cpp_macro object that was
stored by _cpp_new_macro. This is currently set to pfile->directive_line,
which contains the line number only and no column information. This patch
changes the stored location to the src_loc for the token defining the macro
name, which includes the location and range information.
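
A tiny illustration (a sketch, not one of the new tests): compiled with
-Wunused-macros, the warning for the definition below now points at the
macro name itself, with column and range information, instead of only the
line of the #define.

/* The diagnostic locus is now the token "UNUSED_MACRO" itself.  */
#define UNUSED_MACRO 1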

libcpp/ChangeLog:

	PR c++/66290
	* macro.cc (_cpp_create_definition): Add location argument.
	* internal.h (_cpp_create_definition): Adjust prototype.
	* directives.cc (do_define): Pass new location argument to
	_cpp_create_definition.
	(do_undef): Stop passing inferior location to cpp_warning_with_line;
	the default from cpp_warning is better.
	(cpp_pop_definition): Pass new location argument to
	_cpp_create_definition.
	* pch.cc (cpp_read_state): Likewise.

gcc/testsuite/ChangeLog:

	PR c++/66290
	* c-c++-common/cpp/macro-ranges.c: New test.
	* c-c++-common/cpp/line-2.c: Adapt to check for column information
	on macro-related libcpp warnings.
	* c-c++-common/cpp/line-3.c: Likewise.
	* c-c++-common/cpp/macro-arg-count-1.c: Likewise.
	* c-c++-common/cpp/pr58844-1.c: Likewise.
	* c-c++-common/cpp/pr58844-2.c: Likewise.
	* c-c++-common/cpp/warning-zero-location.c: Likewise.
	* c-c++-common/pragma-diag-14.c: Likewise.
	* c-c++-common/pragma-diag-15.c: Likewise.
	* g++.dg/modules/macro-2_d.C: Likewise.
	* g++.dg/modules/macro-4_d.C: Likewise.
	* g++.dg/modules/macro-4_e.C: Likewise.
	* g++.dg/spellcheck-macro-ordering.C: Likewise.
	* gcc.dg/builtin-redefine.c: Likewise.
	* gcc.dg/cpp/Wunused.c: Likewise.
	* gcc.dg/cpp/redef2.c: Likewise.
	* gcc.dg/cpp/redef3.c: Likewise.
	* gcc.dg/cpp/redef4.c: Likewise.
	* gcc.dg/cpp/ucnid-11-utf8.c: Likewise.
	* gcc.dg/cpp/ucnid-11.c: Likewise.
	* gcc.dg/cpp/undef2.c: Likewise.
	* gcc.dg/cpp/warn-redefined-2.c: Likewise.
	* gcc.dg/cpp/warn-redefined.c: Likewise.
	* gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
	* gcc.dg/cpp/warn-unused-macros.c: Likewise.
2023-06-20 16:58:12 -04:00
Richard Sandiford
079f31c553 aarch64: Fix gcc.target/aarch64/sve/pcs failures
Several gcc.target/aarch64/sve/pcs tests started failing after
6a2e8dcbbd, because the tests weren't robust against whether
an indirect argument register or the stack pointer was used as
the base for stores.

The patch allows either base register when there is only one
indirect argument.  It disables -fcprop-registers in cases where
there are sometimes multiple indirect arguments, since the name
of the argument register is then an important part of the test.

Disabling -fcprop-registers gives poor final register allocation,
since:

* combine's make_more_copies hack adds extra redundant moves
* code with those moves is not allocated as well as moves without them
* we often rely on -fcprop-registers to clean up the allocation later

The patch therefore disables combine in the same tests as
cprop-registers.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/args_1.c: Match moves from the stack
	pointer to indirect argument registers and allow either to be used
	as the base register in subsequent stores.
	* gcc.target/aarch64/sve/pcs/args_8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_2.c: Allow the store of the
	indirect argument to happen via the argument register or the
	stack pointer.
	* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_bf16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_bf16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_bf16.c: Disable
	-fcprop-registers and combine.
	* gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_bf16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise.
	* gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
2023-06-20 21:48:38 +01:00
Richard Sandiford
580b74a791 aarch64: Robustify stack tie handling
The SVE handling of stack clash protection copied the stack
pointer to X11 before the probe and set up X11 as the CFA
for unwind purposes:

    /* This is done to provide unwinding information for the stack
       adjustments we're about to do, however to prevent the optimizers
       from removing the R11 move and leaving the CFA note (which would be
       very wrong) we tie the old and new stack pointer together.
       The tie will expand to nothing but the optimizers will not touch
       the instruction.  */
    rtx stack_ptr_copy = gen_rtx_REG (Pmode, STACK_CLASH_SVE_CFA_REGNUM);
    emit_move_insn (stack_ptr_copy, stack_pointer_rtx);
    emit_insn (gen_stack_tie (stack_ptr_copy, stack_pointer_rtx));

    /* We want the CFA independent of the stack pointer for the
       duration of the loop.  */
    add_reg_note (insn, REG_CFA_DEF_CFA, stack_ptr_copy);
    RTX_FRAME_RELATED_P (insn) = 1;

-fcprop-registers is now smart enough to realise that X11 = SP,
replace X11 with SP in the stack tie, and delete the instruction
created above.

This patch tries to prevent that by making stack_tie fussy about
the register numbers.  It fixes failures in
gcc.target/aarch64/sve/pcs/stack_clash*.c.

gcc/
	* config/aarch64/aarch64.md (stack_tie): Hard-code the first
	register operand to the stack pointer.  Require the second register
	operand to have the number specified in a separate const_int operand.
	* config/aarch64/aarch64.cc (aarch64_emit_stack_tie): New function.
	(aarch64_allocate_and_probe_stack_space): Use it.
	(aarch64_expand_prologue, aarch64_expand_epilogue): Likewise.
	(aarch64_expand_epilogue): Likewise.
2023-06-20 21:48:38 +01:00
Jakub Jelinek
f8f68c4ca6 tree-ssa-math-opts: Small uaddc/usubc pattern matching improvement [PR79173]
In the following testcase we fail to pattern recognize the least significant
.UADDC call.  The reason is that arg3 in that case is
  _3 = .ADD_OVERFLOW (...);
  _2 = __imag__ _3;
  _1 = _2 != 0;
  arg3 = (unsigned long) _1;
and while before the changes arg3 has a single use in some .ADD_OVERFLOW
later on, we add a .UADDC call next to it (and gsi_remove/gsi_replace only
what is strictly necessary and leave quite a few dead stmts around which
next DCE cleans up) and so all of a sudden it isn't used just once, but twice
(.ADD_OVERFLOW and .UADDC) and so uaddc_cast fails.  While we could tweak
uaddc_cast and not require has_single_use in these uses, there is also
no vrp that would figure out that because __imag__ _3 is in [0, 1] range,
it can just use arg3 = __imag__ _3; and drop the comparison and cast.

We already search if either arg2 or arg3 is ultimately set from __imag__
of .{{ADD,SUB}_OVERFLOW,U{ADD,SUB}C} call, so the following patch just
remembers the lhs of __imag__ from that case and uses it later.
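
For context, a hedged sketch of the kind of double-word addition this
pattern matching is aimed at (in the spirit of PR79173, not the committed
pr79173-1.C test itself):

unsigned long
uaddc_sketch (unsigned long a0, unsigned long a1,
              unsigned long b0, unsigned long b1,
              unsigned long *hi)
{
  unsigned long lo = a0 + b0;     /* low word, may wrap */
  unsigned long carry = lo < a0;  /* carry out of the low addition */
  *hi = a1 + b1 + carry;          /* high word consumes the carry */
  return lo;
}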

2023-06-20  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/79173
	* tree-ssa-math-opts.cc (match_uaddc_usubc): Remember lhs of
	IMAGPART_EXPR of arg2/arg3 and use that as arg3 if it has the right
	type.

	* g++.target/i386/pr79173-1.C: New test.
2023-06-20 20:17:41 +02:00
Uros Bizjak
4c7d264eee calls: Change return type of predicate function from int to bool
Also change some internal variables and some function arguments to bool.

gcc/ChangeLog:

	* calls.h (setjmp_call_p): Change return type from int to bool.
	* calls.cc (struct arg_data): Change "pass_on_stack" to bool.
	(store_one_arg): Change return type from int to bool
	and adjust function body accordingly.  Change "sibcall_failure"
	variable to bool.
	(finalize_must_preallocate): Ditto.  Change *must_preallocate pointer
	argument  to bool.  Change "partial_seen" variable to bool.
	(load_register_parameters):  Change *sibcall_failure
	pointer argument to bool.
	(check_sibcall_argument_overlap_1): Change return type from int to bool
	and adjust function body accordingly.
	(check_sibcall_argument_overlap):  Ditto.  Change
	"mark_stored_args_map" argument to bool.
	(emit_call_1): Change "already_popped" variable to bool.
	(setjmp_call_p): Change return type from int to bool
	and adjust function body accordingly.
	(initialize_argument_information): Change *must_preallocate
	pointer argument to bool.
	(expand_call): Change "pcc_struct_value", "must_preallocate"
	and "sibcall_failure" variables to bool.
	(emit_library_call_value_1): Change "pcc_struct_value"
	variable to bool.
2023-06-20 19:44:14 +02:00
Ian Lance Taylor
efecb298d8 runtime: use a C function to call mmap
The final argument to mmap, of type off_t, varies.
In CL 445375 we changed it to always use the C off_t type,
but that broke 32-bit big-endian Linux systems.  On those systems,
using the C off_t type requires calling the mmap64 function.
In C this is automatically handled by the <sys/mman.h> file.
In Go, we would have to change the magic //extern comment to
call mmap64 when appropriate.  Rather than try to get that right,
we instead go through a C function that uses C implicit type
conversions to pick the right type.
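
A rough sketch of the idea, with a hypothetical wrapper name and build flag
(the real libgo helper may differ): because the prototype's last parameter
is the C off_t, the header's large-file handling picks mmap64 where needed
and the Go-side integer argument is converted implicitly.

#define _FILE_OFFSET_BITS 64   /* assumed: 64-bit off_t on 32-bit targets */
#include <sys/mman.h>

void *
go_mmap_wrapper (void *addr, size_t len, int prot, int flags,
                 int fd, off_t off)
{
  /* <sys/mman.h> redirects this to mmap64 when off_t is 64 bits wide
     on a 32-bit system.  */
  return mmap (addr, len, prot, flags, fd, off);
}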

Fixes PR go/110297

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504415
2023-06-20 09:55:58 -07:00
Martin Jambor
0be3a051c0 ipa-sra: Disable candidates with no known callers (PR 110276)
In IPA-SRA we use the can_be_local_p () predicate rather than just the
plain local call graph flag in order to figure out whether the node is a
part of an external API that we cannot change.  Although there are
cases where this can allow more transformations, it also means we can
analyze functions which have no callers at all, which is pointless.

Moreover, it makes an assert of hint propagation trigger, which checks
that we have looked at callers before processing hints that come from
them.  This has been reported as PR 110276.

This patch simply adds a check that a node has at least one caller
into the early checks and makes the node a non-candidate for any
transformation if it does not.

gcc/ChangeLog:

2023-06-16  Martin Jambor  <mjambor@suse.cz>

	PR ipa/110276
	* ipa-sra.cc (struct caller_issues): New field there_is_one.
	(check_for_caller_issues): Set it.
	(check_all_callers_for_issues): Check it.

gcc/testsuite/ChangeLog:

2023-06-16  Martin Jambor  <mjambor@suse.cz>

	PR ipa/110276
	* gcc.dg/ipa/pr110276.c: New test.
2023-06-20 18:16:29 +02:00
Martin Jambor
7f986e2ed9 ipa-cp: Avoid long linear searches through DECL_ARGUMENTS
There have been concerns that linear searches through DECL_ARGUMENTS
that are often necessary to compute the index of a particular
PARM_DECL, which is the key to results of IPA-CP, can happen often
enough to be a compile time issue, especially if we plug the results
into value numbering, as I intend to do with a follow-up patch.

This patch creates a vector sorted according to PARM_DECLs to do the look-up
for all functions which have some information discovered by IPA-CP and which
have 32 parameters or more.  32 is a hard-wired magical constant here to
capture the trade-off between the memory allocation overhead and length of the
linear search.  I do not think it is worth making it a --param but if people
think it appropriate, I can turn it into one.
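
A loose sketch of the technique with hypothetical names (not the GCC
implementation): (DECL_UID, parameter index) pairs are kept sorted by uid,
so the index is found by binary search instead of a walk over
DECL_ARGUMENTS.

#include <stdlib.h>

struct uid_idx { unsigned uid; unsigned idx; };

static int
cmp_uid (const void *a, const void *b)
{
  unsigned ua = ((const struct uid_idx *) a)->uid;
  unsigned ub = ((const struct uid_idx *) b)->uid;
  return ua < ub ? -1 : ua > ub;
}

/* Look up UID in MAP of N entries sorted by uid; return the parameter
   index, or -1 if the uid is not present.  */
static int
lookup_param_index (const struct uid_idx *map, size_t n, unsigned uid)
{
  struct uid_idx key = { uid, 0 };
  const struct uid_idx *e = bsearch (&key, map, n, sizeof *map, cmp_uid);
  return e ? (int) e->idx : -1;
}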

gcc/ChangeLog:

2023-05-31  Martin Jambor  <mjambor@suse.cz>

	* ipa-prop.h (ipa_uid_to_idx_map_elt): New type.
	(struct ipcp_transformation): Rearrange members	according to
	C++ class coding convention, add m_uid_to_idx,
	get_param_index and maybe_create_parm_idx_map.
	* ipa-cp.cc (ipcp_transformation::get_param_index): New function.
	(compare_uids): Likewise.
	(ipcp_transformation::maybe_create_parm_idx_map): Likewise.
	* ipa-prop.cc (ipcp_get_parm_bits): Use get_param_index.
	(ipcp_update_bits): Accept TS as a parameter, assume it is not NULL.
	(ipcp_update_vr): Likewise.
	(ipcp_transform_function): Call, maybe_create_parm_idx_map of TS, bail
	out quickly if empty, pass it to ipcp_update_bits and ipcp_update_vr.
2023-06-20 18:16:14 +02:00
Carl Love
86df278de1 rs6000: Add builtins for IEEE 128-bit floating point values
Add support for the following builtins:

 __vector unsigned long long int scalar_extract_exp_to_vec (__ieee128);
 __vector unsigned __int128 scalar_extract_sig_to_vec (__ieee128);
 __ieee128 scalar_insert_exp (__vector unsigned __int128,
 			      __vector unsigned long long);

The instructions used in the builtins operate on vector registers.  Thus
the result must be moved to a scalar type.  There is no clean, performant
way to do this.  The user code typically needs the result as a vector
anyway.
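
A hedged usage sketch based only on the prototypes quoted above (the exact
headers, typedefs and -mcpu/-mfloat128 options required are assumptions not
spelled out by this commit):

__vector unsigned long long
extract_exp (__ieee128 x)
{
  /* Exponent field of the IEEE 128-bit value, left in a vector register.  */
  return scalar_extract_exp_to_vec (x);
}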

gcc/
	* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin):
	Rename CODE_FOR_xsxsigqp_tf to CODE_FOR_xsxsigqp_tf_ti.
	Rename CODE_FOR_xsxsigqp_kf to CODE_FOR_xsxsigqp_kf_ti.
	Rename CODE_FOR_xsxexpqp_tf to CODE_FOR_xsxexpqp_tf_di.
	Rename CODE_FOR_xsxexpqp_kf to CODE_FOR_xsxexpqp_kf_di.
	(CODE_FOR_xsxexpqp_kf_v2di, CODE_FOR_xsxsigqp_kf_v1ti,
	CODE_FOR_xsiexpqp_kf_v2di): Add case statements.
	* config/rs6000/rs6000-builtins.def
	(__builtin_vsx_scalar_extract_exp_to_vec,
	__builtin_vsx_scalar_extract_sig_to_vec,
	__builtin_vsx_scalar_insert_exp_vqp): Add new builtin definitions.
	Rename xsxexpqp_kf, xsxsigqp_kf, xsiexpqp_kf to xsxexpqp_kf_di,
	xsxsigqp_kf_ti, xsiexpqp_kf_di respectively.
	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
	Update case RS6000_OVLD_VEC_VSIE to handle MODE_VECTOR_INT for new
	overloaded instance. Update comments.
	* config/rs6000/rs6000-overload.def
	(__builtin_vec_scalar_insert_exp): Add new overload definition with
	vector arguments.
	(scalar_extract_exp_to_vec, scalar_extract_sig_to_vec): New
	overloaded definitions.
	* config/rs6000/vsx.md (V2DI_DI): New mode iterator.
	(DI_to_TI): New mode attribute.
	Rename xsxexpqp_<mode> to xsxexpqp_<IEEE128:mode>_<V2DI_DI:mode>.
	Rename xsxsigqp_<mode> to xsxsigqp_<IEEE128:mode>_<VEC_TI:mode>.
	Rename xsiexpqp_<mode> to xsiexpqp_<IEEE128:mode>_<V2DI_DI:mode>.
	* doc/extend.texi (scalar_extract_exp_to_vec,
	scalar_extract_sig_to_vec): Add documentation for new builtins.
	(scalar_insert_exp): Add new overloaded builtin definition.

gcc/testsuite/
	* gcc.target/powerpc/bfp/scalar-extract-exp-8.c: New test case.
	* gcc.target/powerpc/bfp/scalar-extract-sig-8.c: New test case.
	* gcc.target/powerpc/bfp/scalar-insert-exp-16.c: New test case.
2023-06-20 11:42:40 -04:00
Jonathan Wakely
b4f1e4a644 libstdc++: Remove redundant code in std::to_array
libstdc++-v3/ChangeLog:

	* include/std/array (to_array(T(&)[N])): Remove redundant
	condition.
	(to_array(T(&&)[N])): Remove redundant std::move.
2023-06-20 15:35:48 +01:00
Robin Dapp
649c640cc4 RISC-V: testsuite: Add missing -mabi=lp64d.
This fixes more cases of missing -mabi=lp64d.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c: Add
	-mabi=lp64d.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Dito.
2023-06-20 16:16:39 +02:00
Li Xu
cb421ffff6 RISC-V: Set the natural size of constant vector mask modes to one RVV data vector.
If we reinterpret vnx2bi as vnx16qi, vnx16qi must occupy no more of the underlying
registers than vnx2bi.

Consider this following case:
void test_vreinterpret_v_b64_i8m1 (uint8_t *in, int8_t *out)
{
  vbool64_t vmask = __riscv_vlm_v_b64 (in, 2);
  vint8m1_t vout = __riscv_vreinterpret_v_b64_i8m1 (vmask);
  __riscv_vse8_v_i8m1(out, vout, 16);
}

compiler parameters: -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -O3
Compilation fails with:
test_vreinterpret_v_b64_i8m1during RTL pass: expand

test.c: In function 'test_vreinterpret_v_b64_i8m1':
test.c:11:22: internal compiler error: in gen_lowpart_general, at rtlhooks.cc:57
   11 |     vint8m1_t vout = __riscv_vreinterpret_v_b64_i8m1(src);
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0xf11876 gen_lowpart_general(machine_mode, rtx_def*)
        ../.././riscv-gcc/gcc/rtlhooks.cc:57
0x191435e gen_vreinterpretvnx16qi(rtx_def*, rtx_def*)
        ../.././riscv-gcc/gcc/config/riscv/vector.md:486
0xe08858 maybe_expand_insn(insn_code, unsigned int, expand_operand*)
        ../.././riscv-gcc/gcc/optabs.cc:8213
0x1471209 riscv_vector::function_expander::generate_insn(insn_code)
        ../.././riscv-gcc/gcc/config/riscv/riscv-vector-builtins.cc:3813
0x147629c riscv_vector::function_expander::expand()
        ../.././riscv-gcc/gcc/config/riscv/riscv-vector-builtins.h:520
0x147629c riscv_vector::expand_builtin(unsigned int, tree_node*, rtx_def*)
        ../.././riscv-gcc/gcc/config/riscv/riscv-vector-builtins.cc:4103
0x9868f9 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
        ../.././riscv-gcc/gcc/builtins.cc:7342

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_regmode_natural_size): Set the natural
	size of vector mask modes to one RVV register.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vreinterpet-fixed.c: New test.
2023-06-20 22:14:22 +08:00
Juzhe-Zhong
1c0b118bab RISC-V: Optimize codegen of VLA SLP
Add comments for Robin:
We want to create a pattern where value[ix] = floor (ix / NPATTERNS) * NPATTERNS.
As NPATTERNS is always a power of two we can rewrite this as
value[ix] = ix & -NPATTERNS.

Recently, I figured out a better approach to the codegen for VLA stepped vectors.

Here is a detailed description:

Case 1:
void
f (uint8_t *restrict a, uint8_t *restrict b)
{
  for (int i = 0; i < 100; ++i)
    {
      a[i * 8] = b[i * 8 + 37] + 1;
      a[i * 8 + 1] = b[i * 8 + 37] + 2;
      a[i * 8 + 2] = b[i * 8 + 37] + 3;
      a[i * 8 + 3] = b[i * 8 + 37] + 4;
      a[i * 8 + 4] = b[i * 8 + 37] + 5;
      a[i * 8 + 5] = b[i * 8 + 37] + 6;
      a[i * 8 + 6] = b[i * 8 + 37] + 7;
      a[i * 8 + 7] = b[i * 8 + 37] + 8;
    }
}

We need to generate the stepped vector:
NPATTERNS = 8.
{ 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8 }

Before this patch:
vid.v    v4         ;; {0,1,2,3,4,5,6,7,...}
vsrl.vi  v4,v4,3    ;; {0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,...}
li       a3,8       ;; {8}
vmul.vx  v4,v4,a3   ;; {0,0,0,0,0,0,0,8,8,8,8,8,8,8,8,...}

After this patch:
vid.v    v4                    ;; {0,1,2,3,4,5,6,7,...}
vand.vi  v4,v4,-8(-NPATTERNS)  ;; {0,0,0,0,0,0,0,8,8,8,8,8,8,8,8,...}

Case 2:
void
f (uint8_t *restrict a, uint8_t *restrict b)
{
  for (int i = 0; i < 100; ++i)
    {
      a[i * 8] = b[i * 8 + 3] + 1;
      a[i * 8 + 1] = b[i * 8 + 2] + 2;
      a[i * 8 + 2] = b[i * 8 + 1] + 3;
      a[i * 8 + 3] = b[i * 8 + 0] + 4;
      a[i * 8 + 4] = b[i * 8 + 7] + 5;
      a[i * 8 + 5] = b[i * 8 + 6] + 6;
      a[i * 8 + 6] = b[i * 8 + 5] + 7;
      a[i * 8 + 7] = b[i * 8 + 4] + 8;
    }
}

We need to generate the stepped vector:
NPATTERNS = 4.
{ 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, ... }

Before this patch:
li       a6,134221824
slli     a6,a6,5
addi     a6,a6,3        ;; 64-bit: 0x0003000200010000
vmv.v.x  v6,a6          ;; {3, 2, 1, 0, ... }
vid.v    v4             ;; {0, 1, 2, 3, 4, 5, 6, 7, ... }
vsrl.vi  v4,v4,2        ;; {0, 0, 0, 0, 1, 1, 1, 1, ... }
li       a3,4           ;; {4}
vmul.vx  v4,v4,a3       ;; {0, 0, 0, 0, 4, 4, 4, 4, ... }
vadd.vv  v4,v4,v6       ;; {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, ... }

After this patch:
li	a3,-536875008
slli	a3,a3,4
addi	a3,a3,1
slli	a3,a3,16
vmv.v.x	v2,a3           ;; {3, 1, -1, -3, ... }
vid.v	v4              ;; {0, 1, 2, 3, 4, 5, 6, 7, ... }
vadd.vv	v4,v4,v2        ;; {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, ... }

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Optimize codegen.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/slp-1.c: Adapt testcase.
	* gcc.target/riscv/rvv/autovec/partial/slp-16.c: New test.
	* gcc.target/riscv/rvv/autovec/partial/slp_run-16.c: New test.
2023-06-20 21:59:22 +08:00
Robin Dapp
b26f1735cb RISC-V: testsuite: Add -Wno-psabi to vec_set/vec_extract testcases.
This fixes some fallout from the recent psabi changes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Add
	-Wno-psabi.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c:
	Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Dito.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Dito.
2023-06-20 15:04:19 +02:00
Lehua Ding
4a6c44f4ad RISC-V: Fix compiler warning of riscv_arg_has_vector
Hi,

This little patch fixes a compiler warning that my previous patch
introduced; sorry for introducing this issue.

Best,
Lehua

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_arg_has_vector): Add default
	switch handler.
2023-06-20 15:04:19 +02:00
Robin Dapp
37c167e89b RISC-V: testsuite: Fix vmul test expectation and fix -ffast-math.
I forgot to check for vfmul in the multiplication tests as well as
some -ffast-math arguments.  Fix this.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vadd-run.c: Add
	-ffast-math.
	* gcc.target/riscv/rvv/autovec/binop/vadd-zvfh-run.c: Dito.
	* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Remove
	-ffast-math
	* gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv.c: Check for
	vfmul.
	* gcc.target/riscv/rvv/autovec/binop/vmul-rv64gcv.c: Dito.
2023-06-20 15:04:09 +02:00
Tobias Burnus
99e3214f58 Fortran: Fix parse-dump-tree for OpenMP ALLOCATE clause
Commit r14-1301-gd64e8e1224708e added u2.allocator to gfc_omp_namelist
for better readability and to permit using namelist->expr for code
like the following:
  !$omp allocators allocate(align(32) : dt%alloc_comp)
    allocate (dt%alloc_comp(5))
  !$omp allocate(dt%alloc_comp2) align(64)
    allocate (dt%alloc_comp2(10))
However, for the parse-tree dump the change was incomplete.

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_namelist): Fix dump of the allocator
	modifier of OMP_LIST_ALLOCATE.
2023-06-20 13:49:54 +02:00
Eric Botcazou
6f695bfd73 ada: Minor tweaks
gcc/ada/

	* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Pass
	the NULL_TREE explicitly and test imported_p in lieu of
	Is_Imported. <E_Function>: Remove public_flag local variable and
	make extern_flag local variable a constant.
2023-06-20 13:25:28 +02:00
Yannick Moy
c11ef75cb2 ada: Fix crash on inlining in GNATprove
After the recent change on detection of non-inlining, calls inside
the iterator part of a quantified expression were not considered
as preventing inlining anymore, leading to a crash later on inside
GNATprove. Now fixed.

gcc/ada/

	* sem_res.adb (Resolve_Call): Fix change that replaced test for
	quantified expressions by the test for potentially unevaluated
	contexts. Both should be performed.
2023-06-20 13:25:28 +02:00
Eric Botcazou
865c5db7cb ada: Further fixes to handling of private views in instances
This removes more bypasses for private views in instances that are present
in type predicates (Conforming_Types, Covers, Specific_Type and Wrong_Type),
which in exchange requires additional work in Sem_Ch12 to restore the proper
view of types during the instantiation of generic bodies.

The main mechanism for this is the Has_Private_View flag, but it comes with
the limitations that 1) there must be a direct reference to the global type
in the generic construct (either a reference to a global object of this type
or the explicit declaration of a local object of this type), which is not
always the case e.g. for loop parameters and 2) it can deal with a single
type at a time, e.g. it cannot deal with an array type and its component
type if their respective views are not the same in the instance.

To overcome the second limitation, a new Has_Secondary_Private_View flag
is introduced to deal with a secondary type, which as of this writing is
either the component type of an array type or the designated type of an
access type (together they make up the vast majority of the problematic
cases for the Has_Private_View flag alone). This new mechanism subsumes
a specific treatment for them that was added in Copy_Generic_Node a few
years ago, although a specific treatment still needs to be preserved for
comparison and equality operators in a narrower case.

Additional handling is also introduced to overcome the first limitation
for loop parameters in Copy_Generic_Node, and a relaxed condition is used
in Exp_Ch7.Convert_View to generate an unchecked conversion between views.

gcc/ada/

	* exp_ch7.adb (Convert_View): Detect more cases of mismatches for
	private types and use Implementation_Base_Type as main criterion.
	* gen_il-fields.ads (Opt_Field_Enum): Add
	Has_Secondary_Private_View
	* gen_il-gen-gen_nodes.adb (N_Expanded_Name): Likewise.
	(N_Direct_Name): Likewise.
	(N_Op): Likewise.
	* sem_ch12.ads (Check_Private_View): Document the usage of second
	flag Has_Secondary_Private_View.
	* sem_ch12.adb (Get_Associated_Entity): New function to retrieve
	the ultimate associated entity, if any.
	(Check_Private_View): Implement Has_Secondary_Private_View
	support.
	(Copy_Generic_Node): Remove specific treatment for Component_Type
	of an array type and Designated_Type of an access type. Add
	specific treatment for comparison and equality operators, as well
	as iterator and loop parameter specifications.
	(Instantiate_Type): Implement Has_Secondary_Private_View support.
	(Requires_Delayed_Save): Call Get_Associated_Entity.
	(Set_Global_Type): Implement Has_Secondary_Private_View support.
	* sem_ch6.adb (Conforming_Types): Remove bypass for private views
	in instances.
	* sem_type.adb (Covers): Return true if Is_Subtype_Of does so.
	Remove bypass for private views in instances.
	(Specific_Type): Likewise.
	* sem_util.adb (Wrong_Type): Likewise.
	* sinfo.ads (Has_Secondary_Private_View): Document new flag.
2023-06-20 13:25:28 +02:00
Ronan Desplanques
31edd39bc4 ada: Remove outdated comment
The Preelaborate pragma the removed comment was referring to was
indeed present in AI 167, as well as in clause 5.3 of the rationale
for Ada 2012, but it never made it into the 2012 version of the
reference manual.

gcc/ada/

	* libgnarl/s-mudido.ads: Remove outdated comment.
2023-06-20 13:25:28 +02:00
Tobias Burnus
0607e93490 Fortran's gfc_match_char: %S to match symbol with host_assoc
gfc_match ("... %s ...", ...) matches a gfc_symbol but with
host_assoc = 0.  This commit adds '%S' as a variant that matches
with host_assoc = 1.

gcc/fortran/ChangeLog:

	* match.cc (gfc_match_char): Match with '%S' a symbol
	with host_assoc = 1.
2023-06-20 13:23:40 +02:00
Richard Biener
9d597e0075 Improve DSE to handle stores before __builtin_unreachable ()
DSE isn't good at identifying program points that end the lifetime
of variables that are not associated with virtual operands.  But
at least for those that end basic blocks we can handle the simple
case where this ending is in the same basic block as the definition
we want to elide.  That should catch quite a few common cases already.
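
A minimal sketch in the spirit of the new testcase (not ssa-dse-47.c
itself):

int g;

void
f (void)
{
  g = 1;                     /* dead: the store can never be observed */
  __builtin_unreachable ();  /* same block ends here, so DSE may drop it */
}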

	* tree-ssa-dse.cc (dse_classify_store): When we found
	no defs and the basic-block with the original definition
	ends in __builtin_unreachable[_trap] the store is dead.

	* gcc.dg/tree-ssa/ssa-dse-47.c: New testcase.
	* c-c++-common/asan/pr106558.c: Avoid undefined behavior
	due to missing return.
2023-06-20 12:48:24 +02:00
Richard Biener
85107abeb7 Update virtual SSA form manually where easily possible in phiprop
This keeps the virtual SSA form up-to-date in phiprop when easily possible.
Only when we deal with aggregate copies would the work be too
heavy-handed in general.

	* tree-ssa-phiprop.cc (phiprop_insert_phi): For simple loads
	keep the virtual SSA form up-to-date.
2023-06-20 12:48:23 +02:00
Kyrylo Tkachov
63aaff9b3a aarch64: Optimise ADDP with same source operands
We've been asked to optimise the case in this patch's testcase: a 64-bit ADDP of
the low and high halves of the same 128-bit vector.  This can be done by a
single .4s ADDP followed by just reading the bottom 64 bits. A splitter for
this is quite straightforward now that all the vec_concat stuff is collapsed
by simplify-rtx.
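
For reference, a hedged sketch (not necessarily the committed
addp-same-low_1.c) of the kind of source that exhibits the pattern:

#include <arm_neon.h>

int32x2_t
addp_low_high (int32x4_t x)
{
  /* Pairwise add of the low and high halves of the same vector.  */
  return vpadd_s32 (vget_low_s32 (x), vget_high_s32 (x));
}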

With this patch we generate a single:
	addp	v0.4s, v0.4s, v0.4s
instead of:
        dup     d31, v0.d[1]
        addp    v0.2s, v0.2s, v31.2s
        ret

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (*aarch64_addp_same_reg<mode>):
	New define_insn_and_split.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/addp-same-low_1.c: New test.
2023-06-20 11:03:47 +01:00
Tamar Christina
36de416df8 AArch64: remove test comment from *mov<mode>_aarch64
I accidentally left a test comment in the final version of the patch.
This removes the comment.

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*mov<mode>_aarch64): Drop test comment.
2023-06-20 08:54:42 +01:00