191830 Commits

Author SHA1 Message Date
Tom de Vries
f0ae4257e3 [nvptx] Xfail sibcall execution tests
On nvptx I see the following FAIL:
...
FAIL: gcc.dg/sibcall-3.c execution test
...

The test-case states that "this test is xfailed on targets without sibcall
patterns".

The nvptx port doesn't have a sibcall pattern, so add an xfail.  Likewise in
two similar test-cases.

Tested on nvptx.

gcc/testsuite/ChangeLog:

2022-02-20  Tom de Vries  <tdevries@suse.de>

	* gcc.dg/sibcall-10.c: Xfail execution test for nvptx.
	* gcc.dg/sibcall-3.c: Same.
	* gcc.dg/sibcall-4.c: Same.
2022-02-22 10:14:59 +01:00
Tom de Vries
7d3e649895 [nvptx, testsuite] Remove mptx settings in gcc.target/nvptx tests
Some test-cases in gcc/testsuite/gcc.target/nvptx contain mptx
settings, which are paired with misa settings, in order to have the mptx
version support the misa version.

Since commit decde11183 ("[nvptx] Choose -mptx default based on -misa"),
this is no longer necessary.

Remove the mptx settings.

Tested on nvptx.

gcc/testsuite/ChangeLog:

2022-02-20  Tom de Vries  <tdevries@suse.de>

	* gcc.target/nvptx/float16-1.c: Drop -mptx setting.
	* gcc.target/nvptx/float16-2.c: Same.
	* gcc.target/nvptx/float16-3.c: Same.
	* gcc.target/nvptx/float16-4.c: Same.
	* gcc.target/nvptx/float16-5.c: Same.
	* gcc.target/nvptx/float16-6.c: Same.
	* gcc.target/nvptx/tanh-1.c: Same.
2022-02-22 10:14:18 +01:00
Richard Biener
90d693bdc9 target/99881 - x86 vector cost of CTOR from integer regs
This uses the now passed SLP node to the vectorizer costing hook
to adjust vector construction costs for the cost of moving an
integer component from a GPR to a vector register when that's
required for building a vector from components.  A crucial difference
here is whether the component is loaded from memory or extracted
from a vector register as in those cases no intermediate GPR is involved.

The pr99881.c testcase can be Un-XFAILed with this patch, the
pr91446.c testcase now produces scalar code which looks superior
to me so I've adjusted it as well.

2022-02-18  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104582
	PR target/99881
	* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
	Cost GPR to vector register moves for integer vector construction.

	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c: New.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-2.c: Likewise.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c: Likewise.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c: Likewise.
	* gcc.target/i386/pr99881.c: Un-XFAIL.
	* gcc.target/i386/pr91446.c: Adjust to not expect vectorization.
2022-02-22 07:48:45 +01:00
Richard Biener
f24dfc7617 tree-optimization/104582 - make SLP node available in vector cost hook
This adjusts the vectorizer costing API to allow passing down the
SLP node the vector stmt is created from.

2022-02-18  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104582
	* tree-vectorizer.h (stmt_info_for_cost::node): New field.
	(vector_costs::add_stmt_cost): Add SLP node parameter.
	(dump_stmt_cost): Likewise.
	(add_stmt_cost): Likewise, new overload and adjust.
	(add_stmt_costs): Adjust.
	(record_stmt_cost): New overload.
	* tree-vectorizer.cc (dump_stmt_cost): Dump the SLP node.
	(vector_costs::add_stmt_cost): Adjust.
	* tree-vect-loop.cc (vect_estimate_min_profitable_iters):
	Adjust.
	* tree-vect-slp.cc (vect_prologue_cost_for_slp): Record
	the SLP node for costing.
	(vectorizable_slp_permutation): Likewise.
	* tree-vect-stmts.cc (record_stmt_cost): Adjust and add
	new overloads.
	* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
	Adjust.
	* config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost):
	Adjust.
	* config/rs6000/rs6000.cc (rs6000_vector_costs::add_stmt_cost):
	Adjust.
	(rs6000_cost_data::adjust_vect_cost_per_loop): Likewise.
2022-02-22 07:48:38 +01:00
Richard Biener
61fc5e098e tree-optimization/104582 - Simplify vectorizer cost API and fixes
This simplifies the vectorizer cost API by providing overloads
to add_stmt_cost and record_stmt_cost suitable for scalar stmt
and branch stmt costing which do not need information like
a vector type or alignment.  It also fixes two mistakes where
costs for versioning tests were recorded as vector stmt rather
than scalar stmt.

This is a first patch to simplify the actual fix for PR104582.

2022-02-18  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104582
	* tree-vectorizer.h (add_stmt_cost): New overload.
	(record_stmt_cost): Likewise.
	* tree-vect-loop.cc (vect_compute_single_scalar_iteration_cost):
	Use add_stmt_costs.
	(vect_get_known_peeling_cost): Use new overloads.
	(vect_estimate_min_profitable_iters): Likewise.  Consistently
	use scalar_stmt for costing versioning checks.
	* tree-vect-stmts.cc (record_stmt_cost): New overload.
2022-02-22 07:48:31 +01:00
Hongyu Wang
0435b978f9 i386: Relax cmpxchg instruction under -mrelax-cmpxchg-loop [PR103069]
cmpxchg is commonly used in spin loops, and some user code such as
pthread uses cmpxchg directly as the loop condition, which causes
heavy cache bouncing.

This patch extends the previous implementation to relax all cmpxchg
instructions under -mrelax-cmpxchg-loop by adding an extra atomic load
and compare, and by emulating the behavior of a failed cmpxchg.

For original spin loop which looks like

loop: mov    %eax,%r8d
      or     $1,%r8d
      lock cmpxchg %r8d,(%rdi)
      jne    loop

It will now turn into

loop: mov    %eax,%r8d
      or     $1,%r8d
      mov    (%r8),%rsi <--- load lock first
      cmp    %rsi,%rax <--- compare with expected input
      jne    .L2 <--- lock ne expected
      lock cmpxchg %r8d,(%rdi)
      jne    loop
  L2: mov    %rsi,%rax <--- perform the behavior of failed cmpxchg
      jne    loop

under -mrelax-cmpxchg-loop.
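
The same idea can be expressed at source level.  The following is a minimal
sketch in C11, not the code GCC emits: relaxed_cas and set_low_bit are
hypothetical names.  The helper loads the location first and only issues the
locked compare-exchange when the loaded value matches the expected one;
otherwise it emulates a failed cmpxchg by storing the observed value into
*expected.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: compare-and-swap preceded by a plain atomic load. */
static bool relaxed_cas(_Atomic unsigned *ptr, unsigned *expected,
                        unsigned desired)
{
    assert(ptr && expected);
    unsigned seen = atomic_load(ptr);      /* load the lock word first */
    if (seen != *expected) {
        *expected = seen;                  /* behave like a failed cmpxchg */
        return false;
    }
    /* Only now pay for the locked read-modify-write. */
    return atomic_compare_exchange_strong(ptr, expected, desired);
}

/* Spin loop shaped like the "or $1" loop above; returns the old value. */
static unsigned set_low_bit(_Atomic unsigned *ptr)
{
    unsigned old = atomic_load(ptr);
    while (!relaxed_cas(ptr, &old, old | 1u))
        ;                                  /* old was refreshed; retry */
    return old;
}
```

When the lock word does not hold the expected value, the retry path never
executes the locked instruction, which is where the cache traffic comes from.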

gcc/ChangeLog:

	PR target/103069
	* config/i386/i386-expand.cc (ix86_expand_atomic_fetch_op_loop):
	Split atomic fetch and loop part.
	(ix86_expand_cmpxchg_loop): New expander for cmpxchg loop.
	* config/i386/i386-protos.h (ix86_expand_cmpxchg_loop): New
	prototype.
	* config/i386/sync.md (atomic_compare_and_swap<mode>): Call new
	expander under TARGET_RELAX_CMPXCHG_LOOP.
	(atomic_compare_and_swap<mode>): Likewise for doubleword modes.

gcc/testsuite/ChangeLog:

	PR target/103069
	* gcc.target/i386/pr103069-2.c: Adjust result check.
	* gcc.target/i386/pr103069-3.c: New test.
	* gcc.target/i386/pr103069-4.c: Likewise.
2022-02-22 09:30:19 +08:00
GCC Administrator
5c105adbf2 Daily bump. 2022-02-22 00:16:33 +00:00
Ian Lance Taylor
a7eeaa4898 runtime/internal/syscall: build dummy package if not Linux
Fixes libgo build on non-Linux systems.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/387134
2022-02-21 13:24:38 -08:00
Dan Li
ce09ab17dd aarch64: Add compiler support for Shadow Call Stack
Shadow Call Stack can be used to protect the return address of a
function at runtime, and clang already supports this feature[1].

To enable SCS in user mode, support beyond the compiler is also
required (as discussed in [2]). This patch only adds basic
support for SCS on the compiler side, and makes it convenient
for users to enable SCS.

For the Linux kernel, only compiler support is required.

[1] https://clang.llvm.org/docs/ShadowCallStack.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768

Signed-off-by: Dan Li <ashimida@linux.alibaba.com>

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (SLOT_REQUIRED):
	Change wb_candidate[12] to wb_push_candidate[12].
	(aarch64_layout_frame): Likewise, and
	change callee_adjust when scs is enabled.
	(aarch64_save_callee_saves):
	Change wb_candidate[12] to wb_push_candidate[12].
	(aarch64_restore_callee_saves):
	Change wb_candidate[12] to wb_pop_candidate[12].
	(aarch64_get_separate_components):
	Change wb_candidate[12] to wb_push_candidate[12].
	(aarch64_expand_prologue): Push x30 onto SCS before it's
	pushed onto stack.
	(aarch64_expand_epilogue): Pop x30 from SCS, while
	preventing it from being popped from the regular stack again.
	(aarch64_override_options_internal): Add SCS compile option check.
	(TARGET_HAVE_SHADOW_CALL_STACK): New hook.
	* config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled,
	wb_pop_candidate[12], and rename wb_candidate[12] to
	wb_push_candidate[12].
	* config/aarch64/aarch64.md (scs_push): New template.
	(scs_pop): Likewise.
	* doc/invoke.texi: Document -fsanitize=shadow-call-stack.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in: Add hook have_shadow_call_stack.
	* flag-types.h (enum sanitize_code):
	Add SANITIZE_SHADOW_CALL_STACK.
	* opts.cc (parse_sanitizer_options): Add shadow-call-stack
	and exclude SANITIZE_SHADOW_CALL_STACK.
	* target.def: New hook.
	* toplev.cc (process_options): Add SCS compile option check.
	* ubsan.cc (ubsan_expand_null_ifn): Enum type conversion.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/shadow_call_stack_1.c: New test.
	* gcc.target/aarch64/shadow_call_stack_2.c: New test.
	* gcc.target/aarch64/shadow_call_stack_3.c: New test.
	* gcc.target/aarch64/shadow_call_stack_4.c: New test.
	* gcc.target/aarch64/shadow_call_stack_5.c: New test.
	* gcc.target/aarch64/shadow_call_stack_6.c: New test.
	* gcc.target/aarch64/shadow_call_stack_7.c: New test.
	* gcc.target/aarch64/shadow_call_stack_8.c: New test.
2022-02-21 20:01:14 +00:00
Tom de Vries
02aedc6f26 [nvptx] Initialize ptx regs
With nvptx target, driver version 510.47.03 and board GT 1030, we run into:
...
FAIL: gcc.c-torture/execute/pr53465.c -O1 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O2 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O3 -g execution test
...
while the test-cases pass with nvptx-none-run -O0.

The problem is that the generated ptx contains a read from an uninitialized
ptx register, and the driver JIT doesn't handle this well.

For -O2 and -O3, we can get rid of the FAIL using --param
logical-op-non-short-circuit=0.  But not for -O1.

At -O1, the test-case minimizes to:
...
void __attribute__((noinline, noclone))
foo (int y) {
  int c;
  for (int i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        __builtin_abort ();
      c = d;
    }
}

int main () {
  foo (2); return 0;
}
...

Note that the test-case does not contain an uninitialized use.  In the first
iteration, i is 0 and consequently c is not read.  In the second iteration, c
is read, but by that time it's already initialized by 'c = d' from the first
iteration.
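
The reasoning above can be checked mechanically.  The following is a small
sketch, not part of the testsuite: reads_c_uninitialized is a hypothetical
instrumented copy of foo that tracks whether c has been assigned before each
read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical instrumented copy of foo: returns true if c would be read
   before any assignment to it. */
static bool reads_c_uninitialized(int y)
{
    int c = 0;            /* dummy value; c_set tracks the real state */
    bool c_set = false;

    for (int i = 0; i < y; i++) {
        int d = i + 1;
        if (i) {          /* short-circuit: c is only read when i != 0 */
            if (!c_set)
                return true;
            assert(d > c);    /* mirrors __builtin_abort not firing */
        }
        c = d;
        c_set = true;
    }
    return false;
}
```

For any y the function returns false, because the only iteration in which c
is still unset is the one where i is 0 and the read is skipped.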

AFAICT the problem is introduced as follows: the conditional use of c in the
loop body is translated into an unconditional use of c in the loop header:
...
  # c_1 = PHI <c_4(D)(2), c_9(6)>
...
which forwprop1 propagates the 'c_9 = d_7' assignment into:
...
  # c_1 = PHI <c_4(D)(2), d_7(6)>
...
which ends up being translated by expand into an unconditional:
...
(insn 13 12 0 (set (reg/v:SI 22 [ c ])
        (reg/v:SI 23 [ d ])) -1
     (nil))
...
at the start of the loop body, creating an uninitialized read of d on the
path from loop entry.

By disabling coalesce_ssa_name, we get the more usual copies on the incoming
edges.  The copy on the loop entry path still does an uninitialized read, but
that one's now initialized by init-regs.  The test-case passes, also when
disabling init-regs, so it's possible that the JIT driver doesn't object to
this type of uninitialized read.

Now that we characterized the problem to some degree, we need to fix this,
because either:
- we're violating an undocumented ptx invariant, and this is a compiler bug,
  or
- this is a driver JIT bug and we need to work around it.

There are essentially two strategies to address this:
- stop the compiler from creating uninitialized reads
- patch up uninitialized reads using additional initialization

The former will probably involve:
- making some optimizations more conservative in the presence of
  uninitialized reads, and
- disabling some other optimizations (where making them more conservative is
  not possible, or cannot easily be achieved).
This will probably have a cost penalty for code that does not suffer from
the original problem.

The latter has the problem that it may paper over uninitialized reads
in the source code, or indeed over ones that were incorrectly introduced
by the compiler.  But it has the advantage that it allows for the problem to
be addressed at a single location.

There's an existing pass, init-regs, which implements a form of the latter,
but it doesn't work for this example because it only inserts additional
initialization for uses that have no reaching definition at all.

Fix this by adding initialization of uninitialized ptx regs in reorg.

Control the new functionality using -minit-regs=<0|1|2|3>, meaning:
- 0: disabled.
- 1: add initialization of all regs at the entry bb
- 2: add initialization of uninitialized regs at the entry bb
- 3: add initialization of uninitialized regs close to the use
and defaulting to 3.

Tested on nvptx.

gcc/ChangeLog:

2022-02-17  Tom de Vries  <tdevries@suse.de>

	PR target/104440
	* config/nvptx/nvptx.cc (workaround_uninit_method_1)
	(workaround_uninit_method_2, workaround_uninit_method_3)
	(workaround_uninit): New function.
	(nvptx_reorg): Use workaround_uninit.
	* config/nvptx/nvptx.opt (minit-regs): New option.
2022-02-21 16:49:37 +01:00
Patrick Palka
e74d764e17 c++: Add testcase for already fixed PR [PR85493]
The a1 and a2 cases were fixed (by diagnosing the invalid expression)
with r11-434, and the a3 case with r8-7625.

	PR c++/85493

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/decltype80.C: New test.
2022-02-21 09:20:23 -05:00
Andre Vieira
d34cdec567 rtl-optimization/104498: Fix comparing symbol reference
gcc/ChangeLog:

	PR rtl-optimization/104498
	* alias.cc (compare_base_symbol_refs): Correct distance computation
	when swapping x and y.
2022-02-21 09:41:53 +00:00
Andrew Pinski
e01530ec1e c: [PR104506] Fix ICE after error due to change of type to error_mark_node
The problem here is we end up with an error_mark_node when calling
useless_type_conversion_p and that ICEs. STRIP_NOPS/tree_nop_conversion
has had a check for the inner type being an error_mark_node since g9a6bb3f78c96
(2000). This just adds the check also to tree_ssa_useless_type_conversion.
STRIP_USELESS_TYPE_CONVERSION is mostly used inside the gimplifier
and the places where it is used outside of the gimplifier would not
be adding too much overhead.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

	PR c/104506

gcc/ChangeLog:

	* tree-ssa.cc (tree_ssa_useless_type_conversion):
	Check the inner type before calling useless_type_conversion_p.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr104506-1.c: New test.
	* gcc.dg/pr104506-2.c: New test.
	* gcc.dg/pr104506-3.c: New test.
2022-02-21 09:05:50 +00:00
GCC Administrator
c42f1e7734 Daily bump. 2022-02-21 00:16:24 +00:00
Iain Buclaw
1d98337c6b d: Remove handling of deleting GC allocated classes.
Now that the `delete' keyword has been removed from the front-end, only
compiler-generated uses of DeleteExp reach the code generator via the
auto-destruction of `scope class' variables.

The run-time library helpers that previously were used to delete GC
class objects can now be removed from the compiler.

gcc/d/ChangeLog:

	* expr.cc (ExprVisitor::visit (DeleteExp *)): Remove handling of
	deleting GC allocated classes.
	* runtime.def (DELCLASS): Remove.
	(DELINTERFACE): Remove.
2022-02-21 00:12:01 +01:00
Iain Buclaw
6384eff56d d: Merge upstream dmd cb49e99f8, druntime 55528bd1, phobos 1a3e80ec2.
D front-end changes:

    - Import dmd v2.099.0-beta.1.
    - It's now an error to use `alias this' for partial assignment.
    - The `delete' keyword has been removed from the language.
    - Using `this' and `super' as types has been removed from the
      language, the parser no longer specially handles this wrong code
      with an informative error.

D Runtime changes:

    - Import druntime v2.099.0-beta.1.

Phobos changes:

    - Import phobos v2.099.0-beta.1.

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd cb49e99f8.
	* dmd/VERSION: Update version to v2.099.0-beta.1.
	* decl.cc (layout_class_initializer): Update call to NewExp::create.
	* expr.cc (ExprVisitor::visit (DeleteExp *)): Remove handling of
	deleting arrays and pointers.
	(ExprVisitor::visit (DotVarExp *)): Convert complex types to the
	front-end library type representing them.
	(ExprVisitor::visit (StringExp *)): Use getCodeUnit instead of charAt
	to get the value of each index in a string expression.
	* runtime.def (DELMEMORY): Remove.
	(DELARRAYT): Remove.
	* types.cc (TypeVisitor::visit (TypeEnum *)): Handle anonymous enums.

libphobos/ChangeLog:

	* libdruntime/MERGE: Merge upstream druntime 55528bd1.
	* src/MERGE: Merge upstream phobos 1a3e80ec2.
	* testsuite/libphobos.hash/test_hash.d: Update.
	* testsuite/libphobos.betterc/test19933.d: New test.
2022-02-20 23:37:32 +01:00
Harald Anlauf
e49508ac6b Fortran: improve check of pointer initialization in DATA statements
gcc/fortran/ChangeLog:

	PR fortran/77693
	* data.cc (gfc_assign_data_value): If a variable in a data
	statement has the POINTER attribute, check for allowed initial
	data target that is compatible with pointer assignment.
	* gfortran.h (IS_POINTER): New macro.

gcc/testsuite/ChangeLog:

	PR fortran/77693
	* gfortran.dg/data_pointer_2.f90: New test.
2022-02-20 22:34:21 +01:00
GCC Administrator
1f96b5eeef Daily bump. 2022-02-20 00:16:22 +00:00
Tom de Vries
69cb3f2abb [nvptx] Use _ as destination operand of atom.exch
We currently generate this code for an atomic store:
...
.reg.u32 %r21;
atom.exch.b32 %r21,[%r22],%r23;
...
where %r21 is set but unused.

Use the ptx bit bucket operand '_' instead, such that we have:
...
atom.exch.b32 _,[%r22],%r23;
...

[ Note that the same problem still occurs for this code:
...
void atomic_store (int *ptr, int val) {
  __atomic_exchange_n (ptr, val, MEMMODEL_RELAXED);
}
... ]
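
For comparison, here are the two source forms side by side, as a sketch using
the GCC __atomic builtins with the user-level __ATOMIC_RELAXED constant.  The
function names are hypothetical.  According to the message above, the first
form is the one this change improves, while the second (an exchange whose
result is discarded) still produces an unused destination register.

```c
#include <assert.h>

/* Plain atomic store: no result is produced. */
static void store_relaxed(int *ptr, int val)
{
    assert(ptr);
    __atomic_store_n(ptr, val, __ATOMIC_RELAXED);
}

/* Atomic exchange with the previous value explicitly thrown away. */
static void exchange_discard(int *ptr, int val)
{
    assert(ptr);
    (void)__atomic_exchange_n(ptr, val, __ATOMIC_RELAXED);
}
```

Both functions leave val in *ptr; they differ only in whether the previous
value is fetched.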

Tested on nvptx.

gcc/ChangeLog:

2022-02-19  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.cc (nvptx_reorg_uniform_simt): Handle SET insn.
	* config/nvptx/nvptx.md
	(define_insn "nvptx_atomic_store<mode>"): Rename to ...
	(define_insn "nvptx_atomic_store_sm70<mode>"): This.
	(define_insn "nvptx_atomic_store<mode>"): New define_insn.
	(define_expand "atomic_store<mode>"): Handle rename.  Use
	nvptx_atomic_store instead of atomic_exchange.

gcc/testsuite/ChangeLog:

2022-02-19  Tom de Vries  <tdevries@suse.de>

	* gcc.target/nvptx/atomic-store-1.c: Update.
2022-02-19 20:05:56 +01:00
Tom de Vries
9ed52438b8 [nvptx] Don't skip atomic insns in nvptx_reorg_uniform_simt
In nvptx_reorg_uniform_simt we have a loop:
...
  for (insn = get_insns (); insn; insn = next)
    {
      next = NEXT_INSN (insn);
      if (!(CALL_P (insn) && nvptx_call_insn_is_syscall_p (insn))
         && !(NONJUMP_INSN_P (insn)
              && GET_CODE (PATTERN (insn)) == PARALLEL
              && get_attr_atomic (insn)))
       continue;
...
that intends to handle syscalls and atomic insns.

However, this also silently skips the atomic insn nvptx_atomic_store, which
has GET_CODE (PATTERN (insn)) == SET.

This does not cause problems, because the nvptx_atomic_store actually maps
onto a "st" insn, and therefore is not atomic and doesn't need to be handled
by nvptx_reorg_uniform_simt.

Fix this by:
- explicitly setting nvptx_atomic_store's atomic attribute to false,
- rewriting the skip condition to make sure all insn
  with atomic attribute are handled, and
- asserting that all handled insns are PARALLEL.

Tested on nvptx.

gcc/ChangeLog:

2022-02-19  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.cc (nvptx_reorg_uniform_simt): Handle all
	insns with atomic attribute.  Assert that all handled insns are
	PARALLELs.
	* config/nvptx/nvptx.md (define_insn "nvptx_atomic_store<mode>"):
	Set atomic attribute to false.

gcc/testsuite/ChangeLog:

2022-02-19  Tom de Vries  <tdevries@suse.de>

	* gcc.target/nvptx/uniform-simt-3.c: New test.
2022-02-19 19:57:12 +01:00
Tom de Vries
8e5c34ab45 [nvptx] Use nvptx_warpsync / nvptx_uniform_warp_check for -muniform-simt
With the default ptx isa 6.0, we have for uniform-simt-1.c:
...
        @%r33   atom.global.cas.b32     %r26, [a], %r28, %r29;
                shfl.sync.idx.b32       %r26, %r26, %r32, 31, 0xffffffff;
...

The atomic insn is predicated by -muniform-simt, and the subsequent insn does
a warp sync, at which point the warp is uniform again.

But with -mptx=3.1, we have instead:
...
        @%r33   atom.global.cas.b32     %r26, [a], %r28, %r29;
                shfl.idx.b32    %r26, %r26, %r32, 31;
...

The shfl does not sync the warp, and we want the warp to go back to executing
uniformly asap.  We cannot enforce this, but at least check this using
nvptx_uniform_warp_check, similar to how that is done for openacc.

Likewise, detect the case that no shfl insn is emitted, and add a
nvptx_uniform_warp_check or nvptx_warpsync.

gcc/ChangeLog:

2022-02-19  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.cc (nvptx_unisimt_handle_set): Change return
	type to bool.
	(nvptx_reorg_uniform_simt): Insert nvptx_uniform_warp_check or
	nvptx_warpsync, if necessary.

gcc/testsuite/ChangeLog:

2022-02-19  Tom de Vries  <tdevries@suse.de>

	* gcc.target/nvptx/uniform-simt-1.c: Add scan-assembler test.
	* gcc.target/nvptx/uniform-simt-2.c: New test.
2022-02-19 19:57:12 +01:00
Jakub Jelinek
9e3bbb4a80 asan: Mark instrumented vars addressable [PR102656]
We ICE on the following testcase, because the asan1 pass decides to
instrument
  <retval>.x = 0;
and does that by
  _13 = &<retval>.x;
  .ASAN_CHECK (7, _13, 4, 4);
  <retval>.x = 0;
and later sanopt pass turns that into:
  _39 = (unsigned long) &<retval>.x;
  _40 = _39 >> 3;
  _41 = _40 + 2147450880;
  _42 = (signed char *) _41;
  _43 = *_42;
  _44 = _43 != 0;
  _45 = _39 & 7;
  _46 = (signed char) _45;
  _47 = _46 + 3;
  _48 = _47 >= _43;
  _49 = _44 & _48;
  if (_49 != 0)
    goto <bb 10>; [0.05%]
  else
    goto <bb 9>; [99.95%]

  <bb 10> [local count: 536864]:
  __builtin___asan_report_store4 (_39);

  <bb 9> [local count: 1073741824]:
  <retval>.x = 0;
The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
even when we take its address in (unsigned long) &<retval>.x.
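
The sanopt sequence above is easier to follow as plain C.  This is a sketch
with hypothetical names, not the asan runtime: the shadow byte is passed in
as a parameter instead of being loaded from the computed shadow address, so
the logic can be exercised without the sanitizer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shadow offset seen in the dump above: 2147450880 == 0x7fff8000. */
#define SHADOW_OFFSET UINT64_C(0x7fff8000)
static_assert(SHADOW_OFFSET == 2147450880u, "matches the constant in the dump");

/* _40 and _41: address of the shadow byte that covers ADDR. */
static uint64_t shadow_address(uint64_t addr)
{
    return (addr >> 3) + SHADOW_OFFSET;
}

/* _44 to _49: true if a 4-byte store at ADDR has to be reported, given the
   value of its shadow byte. */
static bool store4_is_bad(uint64_t addr, int8_t shadow)
{
    int8_t last = (int8_t)(addr & 7) + 3;   /* offset of the last byte stored */
    return shadow != 0 && last >= shadow;
}
```

A shadow byte of 0 means the whole 8-byte granule is addressable, so the
store is never reported; a value of 4 allows a 4-byte store at offset 0 of
the granule but not at offset 1.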

Now, instrument_derefs has code to avoid the instrumentation altogether
if we can prove the access is within bounds of an automatic variable in the
current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
use after scope), but we do it solely for VAR_DECLs.

I think we should treat RESULT_DECLs exactly like that too, which is what
the following patch does.  I must say I'm unsure about PARM_DECLs, those can
have different cases, either they are fully or partially passed in
registers, then if we take parameter's address, they are in a local copy
inside of a function and so work like those automatic vars.  But if they
are fully passed in memory, we typically just take address of the slot
and in that case they live in the caller's frame.  It is true we don't
(can't) put any asan padding in between the arguments, so all asan could
detect in that case is if caller passes fewer on stack arguments or smaller
arguments than callee accepts.  Anyway, as I'm unsure, I haven't added
PARM_DECLs to that case.

And another thing is, when we actually build_fold_addr_expr, we need to
mark_addressable the inner if it isn't addressable already.

2022-02-19  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/102656
	* asan.cc (instrument_derefs): If inner is a RESULT_DECL and access is
	known to be within bounds, treat it like automatic variables.
	If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
	current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
	it addressable.

	* g++.dg/asan/pr102656.C: New test.
2022-02-19 09:03:57 +01:00
GCC Administrator
5a9ba3f27f Daily bump. 2022-02-19 00:16:17 +00:00
Ian Lance Taylor
3343e7e2c4 libgo: update Hurd support
Patches from Svante Signell for PR go/104290.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/386797
2022-02-18 15:33:32 -08:00
Pat Haugen
4984f882f4 Mark Power10 fusion option undocumented and remove sub-options.
gcc/
	* config/rs6000/rs6000.opt (mpower10-fusion): Mark Undocumented.
	(mpower10-fusion-ld-cmpi, mpower10-fusion-2logical,
	mpower10-fusion-logical-add, mpower10-fusion-add-logical,
	mpower10-fusion-2add, mpower10-fusion-2store): Remove.
	* config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER,
	OTHER_P9_VECTOR_MASKS): Remove Power10 fusion sub-options.
	* config/rs6000/rs6000.cc (rs6000_option_override_internal,
	power10_sched_reorder): Likewise.
	* config/rs6000/genfusion.pl (gen_ld_cmpi_p10, gen_logical_addsubf,
	gen_addadd): Likewise
	* config/rs6000/fusion.md: Regenerate.
2022-02-18 15:38:23 -06:00
Ian Lance Taylor
20a33efdf3 libgo: update to Go1.18rc1 release
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/386594
2022-02-18 13:12:08 -08:00
H.J. Lu
1931cbad49 pieces-memset-21.c: Expect vzeroupper for ia32
Update gcc.target/i386/pieces-memset-21.c to expect vzeroupper for ia32
caused by

commit fe79d652c9
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Feb 17 14:40:16 2022 +0100

    target/104581 - compile-time regression in mode-switching

	PR target/104581
	* gcc.target/i386/pieces-memset-21.c: Expect vzeroupper for ia32.
2022-02-18 10:36:53 -08:00
Jakub Jelinek
df5ed150ee rs6000: Fix up posix_memalign call in _mm_malloc [PR104598]
The uglification changes went one spot too far and also uglified the
name of the called function: posix_memalign must be called under that
name, not as a non-existent function.
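
To illustrate the distinction, here is a minimal sketch of an
_mm_malloc-style wrapper; it is not the contents of mm_malloc.h, and
mm_malloc_sketch is a hypothetical name.  An installed header would spell its
own parameters and locals with reserved names (a leading double underscore)
so that user macros cannot clash with them, but the function it calls has to
keep its real name.  The clamp to sizeof (void *) is an assumption made to
satisfy posix_memalign's requirements.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns SIZE bytes aligned to ALIGNMENT, or NULL. */
static void *mm_malloc_sketch(size_t size, size_t alignment)
{
    void *ptr = NULL;

    /* posix_memalign needs a power of two that is a multiple of
       sizeof (void *). */
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);
    assert((alignment & (alignment - 1)) == 0);

    /* The callee is a real library function, so its name stays as-is. */
    if (posix_memalign(&ptr, alignment, size) != 0)
        return NULL;
    return ptr;
}
```

Memory obtained this way is released with free.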

2022-02-18  Jakub Jelinek  <jakub@redhat.com>

	PR target/104257
	PR target/104598
	* config/rs6000/mm_malloc.h (_mm_malloc): Call posix_memalign
	rather than __posix_memalign.
2022-02-18 17:21:43 +01:00
Richard Biener
fe79d652c9 target/104581 - compile-time regression in mode-switching
The x86 backend piggy-backs on mode-switching for insertion of
vzeroupper.  A recent improvement there was implemented in a way
to walk possibly the whole basic-block for all DF reg def definitions
in its mode_needed hook which is called for each instruction in
a basic-block during mode-switching local analysis.

The following mostly reverts this improvement.  It needs to be
re-done in a way more consistent with a local dataflow which
probably means making targets aware of the state of the local
dataflow analysis.

2022-02-17  Richard Biener  <rguenther@suse.de>

	PR target/104581
	* config/i386/i386.cc (ix86_avx_u128_mode_source): Remove.
	(ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY instead
	of calling ix86_avx_u128_mode_source which would eventually
	have returned AVX_U128_ANY in some very special case.

	* gcc.target/i386/pr101456-1.c: XFAIL.
2022-02-18 07:58:54 +01:00
Richard Biener
422d1d378e tree-optimization/96881 - CD-DCE and CLOBBERs
CD-DCE does not consider CLOBBERs as necessary in the attempt
to not prevent DCE of SSA defs it uses.  A side-effect of that
is that it also removes all its control dependences if they are
not made necessary by other means.  When we later try to preserve
as many CLOBBERs as possible we have to make sure we also
preserved the controlling conditions, otherwise a CLOBBER can
now appear on a path where it was not executed before, leading
to wrong code as seen in the testcase.

I've tried to continue to handle both direct and indirect
CLOBBERs optimistically, allowing CD-DCE to remove control
flow that just controls CLOBBERs but that regresses for
example the stack coalescing test g++.dg/opt/pr80032.C.
The pattern there is
  if (pred) D.2512 = CLOBBER; else D.2512 = CLOBBER;
basically we have all paths leading to the same clobber but
we could safely cut some branches which we do not realize
early enough.  This regression can be mitigated by no longer
considering direct CLOBBERs optimistically - the original
motivation for the CD-DCE handling wasn't removal of control
flow but SSA defs of the address.

Handling indirect vs. direct clobbers differently feels
somewhat wrong, still the patch goes with this solution.

2022-02-15  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/96881
	* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Comment
	CLOBBER handling.
	(control_parents_preserved_p): New function.
	(eliminate_unnecessary_stmts): Check that we preserved control
	parents before retaining a CLOBBER.
	(perform_tree_ssa_dce): Pass down aggressive flag
	to eliminate_unnecessary_stmts.

	* g++.dg/torture/pr96881-1.C: New testcase.
	* g++.dg/torture/pr96881-2.C: Likewise.
2022-02-18 07:58:54 +01:00
Patrick Palka
36278f48cb c++: implicit 'this' in noexcept-spec within class tmpl [PR94944]
Here when instantiating the noexcept-spec we fail to resolve the
implicit object for the member call A<T>::f() ultimately because
maybe_instantiate_noexcept sets current_class_ptr/ref to the dependent
'this' (of type B<T>) rather than the specialized 'this' (of type B<int>).

This patch fixes this by making maybe_instantiate_noexcept set
current_class_ptr/ref to the specialized 'this' instead, consistent
with what tsubst_function_type does when substituting into the trailing
return type of a non-static member function.

	PR c++/94944

gcc/cp/ChangeLog:

	* pt.cc (maybe_instantiate_noexcept): For non-static member
	functions, set current_class_ptr/ref to the specialized 'this'
	instead.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/noexcept34.C: Adjust expected diagnostics.
	* g++.dg/cpp0x/noexcept75.C: New test.
2022-02-17 20:20:24 -05:00
GCC Administrator
0bdb049877 Daily bump. 2022-02-18 00:16:39 +00:00
Jonathan Wakely
12a88e6e20 libstdc++: Deprecate non-standard std::vector<bool>::insert(pos) [PR104559]
The SGI STL and pre-1998 drafts of the C++ standard had a default
argument for vector<bool>::insert(iterator, const bool&) which was
removed by N1051. The default argument is still present in libstdc++ for
some reason. There are no tests verifying it as an extension, so I don't
think it has been kept intentionally.

This removes the default argument but adds an overload without the
second parameter, and adds the deprecated attribute to it. This allows
any code using it to keep working (for now) but with a warning.

libstdc++-v3/ChangeLog:

	PR libstdc++/104559
	* doc/xml/manual/evolution.xml: Document deprecation.
	* doc/html/manual/api.html: Regenerate.
	* include/bits/stl_bvector.h (insert(const_iterator, const bool&)):
	Remove default argument.
	(insert(const_iterator)): New overload with deprecated attribute.
	* testsuite/23_containers/vector/bool/modifiers/insert/104559.cc:
	New test.
2022-02-17 23:44:25 +00:00
Jason Merrill
2c9b7077b7 c++: inlining explicit instantiations [PR104539]
The PR10968 fix cleared DECL_COMDAT to force output of explicit
instantiations.  Then the PR59469 fix added a call to mark_needed, after
which we no longer need to clear DECL_COMDAT, and leaving it set allows us
to inline explicit instantiations without worrying about symbol
interposition.

I suppose there's an argument to be made that an explicit instantiation
declaration (extern template) should clear DECL_COMDAT, since that suggests
that there will be only a single instantiation somewhere that could be
subject to interposition, but that doesn't change the 'inline' semantics,
and it seems cleaner to treat template instantiations uniformly.

	PR c++/104539

gcc/cp/ChangeLog:

	* pt.cc (mark_decl_instantiated): Don't clear DECL_COMDAT.

gcc/testsuite/ChangeLog:

	* g++.dg/ipa/inline-4.C: New test.
2022-02-17 17:50:59 -05:00
Jason Merrill
1b71bc7c8b tree: tweak warn_deprecated_use
While looking at PR90451 I noticed that this function was failing to find
the attributes if called with a variant of the struct.

gcc/ChangeLog:

	* tree.cc (warn_deprecated_use): Look for TYPE_STUB_DECL
	on TYPE_MAIN_VARIANT.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/deprecated-16.C: New test.
2022-02-17 17:50:59 -05:00
Jonathan Wakely
36100e0e95 libstdc++: Make std::error_code printer more robust
This attempts to implement a partial workaround for the GDB bug
https://sourceware.org/bugzilla/show_bug.cgi?id=28856 which causes GDB
to crash when printing a frame with a std::error_code argument.

By recognising the known error categories defined in the library and
hardcoding their names we do not need to call cat->name() on the
category.  This has the additional benefit of also working when
debugging a core file rather than a running process. For those known
categories we can also cast the int value to the corresponding error
code enum (e.g. future_errc) so that we show an enumerator instead of
just an integer.
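
Seen from the program side, the idea can be sketched as below. This is C++ rather than the Python printer itself, and the helper names are hypothetical. It shows that a known category can be recognised by identity and its integer value cast back to the matching enum.

```cpp
#include <cassert>
#include <ios>
#include <string>
#include <system_error>

// Hypothetical helper: true when ec belongs to the iostream category.
// Categories are compared by identity, so name() is not needed here.
inline bool is_iostream_error(const std::error_code& ec)
{
  return ec.category() == std::iostream_category();
}

// Hypothetical helper: recover the enumerator from the stored int.
// Only meaningful once the category is known to be the iostream one.
inline std::io_errc as_io_errc(const std::error_code& ec)
{
  return static_cast<std::io_errc>(ec.value());
}

inline void error_code_example()
{
  std::error_code ec = std::make_error_code(std::io_errc::stream);
  assert(is_iostream_error(ec));
  assert(std::string(ec.category().name()) == "iostream");
  assert(as_io_errc(ec) == std::io_errc::stream);
}
```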

For program-defined categories we just use the name of the dynamic type
to identify the category, and print the value as an integer. Once the
GDB bug is fixed and the virtual name() function can be called safely,
that would be preferable. For now it's better to have an imperfect
printer that doesn't crash GDB.

This rewritten StdErrorCodePrinter needs gdb.Value.dynamic_type, so is
only registered if that is supported, which means GDB 7.7 and later.

libstdc++-v3/ChangeLog:

	* python/libstdcxx/v6/printers.py (StdErrorCodePrinter): Replace
	code that calls cat->name() on std::error_category objects.
	Identify known categories by symbol name and use a hardcoded
	name. Print error code values as enumerators where appropriate.
	* testsuite/libstdc++-prettyprinters/cxx11.cc: Adjust expected
	name of custom category. Check io_errc and future_errc errors.
2022-02-17 22:22:14 +00:00
Jason Merrill
c352ef0ed9 c++: avoid duplicate deprecated warning [PR90451]
We were getting the deprecated warning twice for the same call because we
called mark_used first in finish_qualified_id_expr and then again in
build_over_call.  Let's not call it the first time; C++17 clarified that a
function is used only when it is selected from an overload set, which
happens later.

Then I had to add a few more uses in places that don't do anything further
with the expression (convert_to_void, finish_decltype_type), and places that
use the expression more unusually (cp_build_addr_expr_1,
convert_nontype_argument).  The new mark_single_function is mostly so
that I only have to put the comment in one place.

	PR c++/90451

gcc/cp/ChangeLog:

	* decl2.cc (mark_single_function): New.
	* cp-tree.h: Declare it.
	* typeck.cc (cp_build_addr_expr_1): mark_used when making a PMF.
	* semantics.cc (finish_qualified_id_expr): Not here.
	(finish_id_expression_1): Or here.
	(finish_decltype_type): Call mark_single_function.
	* cvt.cc (convert_to_void): And here.
	* pt.cc (convert_nontype_argument): And here.
	* init.cc (build_offset_ref): Adjust assert.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/deprecated-14.C: New test.
	* g++.dg/warn/deprecated-15.C: New test.
2022-02-17 16:22:27 -05:00
Paul A. Clarke
efbb17db52 rs6000: __Uglify non-uglified local variables in headers
Properly prefix (with "__") all local variables in shipped headers for x86
compatibility intrinsics implementations.  This avoids possible problems with
usages like:
```
```
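
The kind of clash this guards against might look like the following sketch. It is an illustration with made-up names (`result`, `header_add`), not the example from the original message and not code from the rs6000 headers.

```cpp
#include <cassert>

// A user macro that happens to match a plausible local variable name.
#define result 1 + 1

// Stand-in for a function in a shipped header (hypothetical).  Had the
// local been spelled `result`, the macro above would have rewritten it
// and broken compilation.  Names starting with "__" are reserved for
// the implementation, so a conforming user macro cannot collide.
static inline int header_add(int __a, int __b)
{
  int __result = __a + __b;
  return __result;
}

inline void uglify_example()
{
  assert(header_add(2, 3) == 5);
}
```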

2022-02-16  Paul A. Clarke  <pc@us.ibm.com>

gcc
	PR target/104257
	* config/rs6000/bmi2intrin.h: Uglify local variables.
	* config/rs6000/emmintrin.h: Likewise.
	* config/rs6000/mm_malloc.h: Likewise.
	* config/rs6000/mmintrin.h: Likewise.
	* config/rs6000/pmmintrin.h: Likewise.
	* config/rs6000/smmintrin.h: Likewise.
	* config/rs6000/tmmintrin.h: Likewise.
	* config/rs6000/xmmintrin.h: Likewise.
2022-02-17 13:13:05 -06:00
Robin Dapp
fac15bf848 rs6000: Workaround for new ifcvt behavior [PR104335].
Since r12-6747-gaa8cfe785953a0 ifcvt passes a "cc comparison"
i.e. the representation of the result of a comparison to the
backend.  rs6000_emit_int_cmove () is not prepared to handle this.
Therefore, this patch makes it return false in such a case.

	PR target/104335

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_int_cmove): Return false
	if the expected comparison's first operand is of mode MODE_CC.
2022-02-17 19:59:51 +01:00
Jonathan Wakely
73a118c209 c-family: Remove names of unused parameters
C++ allows unnamed parameters, which means we don't need to call them
'dummy' and mark them with the unused attribute.
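
A small before/after sketch of the change, using a hypothetical `reader` type and handler names rather than the real c-pragma.cc declarations. The "before" form uses the GNU attribute syntax as an assumption about how the parameter was marked.

```cpp
#include <cassert>

struct reader;  // hypothetical stand-in for the real reader type

static int calls = 0;

// Before: the parameter is named only so it can be marked unused.
static void handle_before(reader *dummy __attribute__((unused)))
{
  ++calls;
}

// After: C++ lets the parameter stay unnamed, so no attribute is needed.
static void handle_after(reader *)
{
  ++calls;
}

inline void unnamed_parameter_example()
{
  handle_before(nullptr);
  handle_after(nullptr);
  assert(calls >= 2);
}
```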

gcc/c-family/ChangeLog:

	* c-pragma.cc (handle_pragma_pack): Remove parameter name.
	(handle_pragma_weak): Likewise.
	(handle_pragma_scalar_storage_order): Likewise.
	(handle_pragma_redefine_extname): Likewise.
	(handle_pragma_visibility): Likewise.
	(handle_pragma_diagnostic): Likewise.
	(handle_pragma_target): Likewise.
	(handle_pragma_optimize): Likewise.
	(handle_pragma_push_options): Likewise.
	(handle_pragma_pop_options): Likewise.
	(handle_pragma_reset_options): Likewise.
	(handle_pragma_message): Likewise.
	(handle_pragma_float_const_decimal64): Likewise.
2022-02-17 17:48:04 +00:00
Eric Botcazou
bc6d2f460a Add missing target selector
gcc/testsuite/
	PR target/79754
	* gcc.target/i386/pr79754.c: Add target dfp.
2022-02-17 18:36:43 +01:00
Ian Lance Taylor
3f2a6b041d net: add hurd build tag for setReadMsgCloseOnExec
Patch from Svante Signell.

	PR go/103573
	PR go/104290

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/386216
2022-02-17 09:30:02 -08:00
Mark Wielaard
d3b2ead595 libiberty rust-demangle, ignore .suffix
Rust symbols can have a .suffix because of compiler transformations.
These can be ignored in the demangled name, which is what this patch
implements, by stopping at the first dot for v0 symbols and searching
backwards to the ending 'E' for legacy symbols.

An alternative implementation could be to follow what C++ does and
represent these as [clone .suffix] tagged onto the demangled name.
But this seems somewhat confusing since it results in a demangled
name that cannot be mangled again. And it would mean trying to
decode compiler internal naming.
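
The trimming rule can be sketched as follows. This is a standalone illustration, not the libiberty code: `strip_rust_suffix` is a hypothetical helper, and the legacy branch assumes the ending 'E' is the last 'E' that is either the final character or directly followed by a '.'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: return the part of a mangled Rust symbol that
// would be handed to the demangler, with any .suffix dropped.
inline std::string strip_rust_suffix(const std::string& sym)
{
  // v0 symbols start with "_R": stop at the first dot, if any.
  if (sym.compare(0, 2, "_R") == 0)
    return sym.substr(0, sym.find('.'));

  // Legacy symbols start with "_ZN" and end with 'E': scan backwards
  // for an 'E' at the very end or immediately before a '.'.
  if (sym.compare(0, 3, "_ZN") == 0)
    for (std::size_t i = sym.size(); i > 0; --i)
      {
        bool ends_here = (i == sym.size()) || sym[i] == '.';
        if (sym[i - 1] == 'E' && ends_here)
          return sym.substr(0, i);
      }

  // Anything else is left untouched.
  return sym;
}

inline void strip_rust_suffix_example()
{
  assert(strip_rust_suffix("_ZN3foo3barE.llvm.123") == "_ZN3foo3barE");
  assert(strip_rust_suffix("_RNvC3foo3bar.llvm.123") == "_RNvC3foo3bar");
}
```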

https://bugs.kde.org/show_bug.cgi?id=445916
https://github.com/rust-lang/rust/issues/60705

libiberty/ChangeLog:

	* rust-demangle.c (rust_demangle_callback): Ignore everything
	after '.' char in sym for v0. For legacy symbols search
	backwards to find the last 'E' before any '.'.
	* testsuite/rust-demangle-expected: Add new .suffix testcases.
2022-02-17 18:06:24 +01:00
Vladimir N. Makarov
db69f666a7 [PR104447] LRA: Do not split non-alloc hard regs.
LRA tried to split a non-allocatable hard reg for reload pseudos again
and again until the number of assignment passes reached the limit.  The
patch fixes this.

gcc/ChangeLog:

	PR rtl-optimization/104447
	* lra-constraints.cc (spill_hard_reg_in_range): Initialize the
	ignore hard reg set from lra_no_alloc_regs.

gcc/testsuite/ChangeLog:

	PR rtl-optimization/104447
	* gcc.target/i386/pr104447.c: New.
2022-02-17 11:33:33 -05:00
Patrick Palka
6bbd8afee0 c++: double non-dep folding from finish_compound_literal [PR104565]
In finish_compound_literal, we perform non-dependent expr folding before
the call to check_narrowing ever since r9-5973.  But ever since r10-7096,
check_narrowing also performs non-dependent expr folding of its own.
This double folding means tsubst will see non-templated trees during the
second folding, which causes a spurious error in the below testcase.

This patch removes the former folding operation; it seems obviated by
the latter one.

	PR c++/104565

gcc/cp/ChangeLog:

	* semantics.cc (finish_compound_literal): Don't perform
	non-dependent expr folding before calling check_narrowing.

gcc/testsuite/ChangeLog:

	* g++.dg/template/non-dependent22.C: New test.
2022-02-17 08:35:23 -05:00
liuhongt
754dce903c Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.
It's not equivalent to transform

 (cond (cmp @1 @2) (convert@3 @4) (convert@5 @6))

 to

 (convert (cmp @1 @2) (convert)@4 @6)

when (convert@3 @4) is an extension, because one conversion may be a
zero_extend and the other a sign_extend.
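
A scalar sketch of why the two forms can differ when the sources have the same width but different signedness. The function names are hypothetical and the code only models the arithmetic, not the vectorizer pattern itself.

```cpp
#include <cassert>
#include <cstdint>

// Original form: each arm is widened with its own extension
// (sign-extend for the int32_t arm, zero-extend for the uint32_t arm).
inline std::int64_t widen_then_select(bool c, std::int32_t a, std::uint32_t b)
{
  return c ? static_cast<std::int64_t>(a) : static_cast<std::int64_t>(b);
}

// Transformed form: select in a single narrow type first, then widen
// once.  Choosing uint32_t forces a zero-extend on both arms.
inline std::int64_t select_then_widen(bool c, std::int32_t a, std::uint32_t b)
{
  std::uint32_t narrow = c ? static_cast<std::uint32_t>(a) : b;
  return static_cast<std::int64_t>(narrow);
}

inline void extension_mismatch_example()
{
  // For a negative signed input the two forms disagree.
  assert(widen_then_select(true, -1, 0u) == -1);
  assert(select_then_widen(true, -1, 0u) == 4294967295LL);
}
```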

gcc/ChangeLog:

	PR tree-optimization/104551
	PR tree-optimization/103771
	* match.pd (cond_expr_convert_p): Add types_match check when
	convert is extension.
	* tree-vect-patterns.cc
	(gimple_cond_expr_convert_p): Adjust comments.
	(vect_recog_cond_expr_convert_pattern): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr104551.c: New test.
2022-02-17 18:58:22 +08:00
Jakub Jelinek
1c2b44b523 valtrack: Avoid creating raw SUBREGs with VOIDmode argument [PR104557]
After the recent r12-7240 simplify_immed_subreg changes, we bail on more
simplify_subreg calls than before, e.g. apparently for decimal modes
in the NaN representations  we almost never preserve anything except the
canonical {q,s}NaNs.
simplify_gen_subreg will punt in such cases because a SUBREG with VOIDmode
is not valid, but debug_lowpart_subreg wants to attempt even harder, even
if e.g. target indicates certain mode combinations aren't valid for the
backend, dwarf2out can still handle them.  But a SUBREG from a VOIDmode
operand is just too much, the inner mode is lost there.  We'd need some
new rtx that would be able to represent those cases.
For now, just punt in those cases.

2022-02-17  Jakub Jelinek  <jakub@redhat.com>

	PR debug/104557
	* valtrack.cc (debug_lowpart_subreg): Don't call gen_rtx_raw_SUBREG
	if expr has VOIDmode.

	* gcc.dg/dfp/pr104557.c: New test.
2022-02-17 11:14:38 +01:00
Jakub Jelinek
f99ad11af9 openmp: Ensure proper diagnostics for -> in map/to/from clauses [PR104532]
The following patch uses the functions normal CPP_DEREF parsing uses,
i.e. convert_lvalue_to_rvalue and build_indirect_ref, instead of
blindly calling build_simple_mem_ref, so that if the variable does not
have correct type, we properly diagnose it instead of ICEing on it.

2022-02-17  Jakub Jelinek  <jakub@redhat.com>

	PR c/104532
	* c-parser.cc (c_parser_omp_variable_list): For CPP_DEREF, use
	convert_lvalue_to_rvalue and build_indirect_ref instead of
	build_simple_mem_ref.

	* gcc.dg/gomp/pr104532.c: New test.
2022-02-17 10:29:06 +01:00
liuhongt
550cabd002 Clean up MPX-related bit_{MPX,BNDREGS,BNDCSR}.
gcc/ChangeLog:

	* config/i386/cpuid.h (bit_MPX): Removed.
	(bit_BNDREGS): Ditto.
	(bit_BNDCSR): Ditto.
2022-02-17 15:47:33 +08:00
Ian Lance Taylor
837eb12629 libbacktrace: gather address ranges from skeleton units
	* dwarf.c (find_address_ranges): Handle skeleton units.
	(read_function_entry): Likewise.
2022-02-16 20:21:48 -08:00