We use this in the sim tree currently. Rather than require people to
have pkg-config installed, include it in the config/ dir.
config/ChangeLog:
* pkg.m4: New file from pkg-config-0.29.2.
Ideally, the linker will be queried for its version and that will be
used to determine capabilities that cannot be discovered from
reasonable configuration testing.
When building cross tools, this might not be possible, and we have
strategies for providing useful defaults. These are adjusted here to
reflect current choices.
gcc/ChangeLog:
* config/darwin.h (MIN_LD64_NO_COAL_SECTS): Adjust.
Amend handling for LD64_VERSION fallback defaults.
The darwinN.h headers (with the sole exception of darwin7.h,
which contains a target macro definition) now only contain
values that set fall-backs for cross-compilations. These can
be provided from the config.gcc script, which means we no longer
need the darwinN.h headers, so delete them.
gcc/ChangeLog:
* config.gcc: Compute default version information
from the configured target. Likewise defaults for
ld64.
* config/darwin10.h: Removed.
* config/darwin12.h: Removed.
* config/darwin9.h: Removed.
* config/rs6000/darwin8.h: Removed.
Darwin defines ASM_OUTPUT_ALIGNED_DECL_COMMON which is used in
preference to ASM_OUTPUT_ALIGNED_COMMON, which makes the latter
definition dead code. Remove this.
gcc/ChangeLog:
* config/darwin9.h (ASM_OUTPUT_ALIGNED_COMMON): Delete.
We now need a modern (C++11) toolchain to bootstrap GCC, so there's no
need to skip the stack protect for Darwin < 9.
gcc/ChangeLog:
* config/darwin9.h (STACK_CHECK_STATIC_BUILTIN): Move from here..
* config/darwin.h (STACK_CHECK_STATIC_BUILTIN): .. to here.
There is no need to make the LINK_GCC_C_SEQUENCE_SPEC conditional on
configuration parameters; it is adequately conditionalized on the
macosx-version-min.
gcc/ChangeLog:
* config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC): Move from
here...
* config/darwin.h (LINK_GCC_C_SEQUENCE_SPEC): ... to here.
The darwinN.h headers were (presumably) introduced to allow specs to be
adjusted when there was no -mmacosx-version-min handling, or that was
considered unreliable.
We have version-specific specs for the values that have configuration
data, and the version is set in the driver (so may be considered
reliably present).
Some of the 'darwinN.h' content has become dead code, and the remainder
is either conditionalised on version information or sets values
used as fall-backs in cross-compilations.
With the changes needed for Darwin20 / macOS 11 the 'darwinN.h' headers
are now too unwieldy to be useful - so this series moves the relevant
specs definitions to the common 'darwin.h' header and then finally uses
the config.gcc script to supply the fall-back defaults for
cross-compilations.
We can then delete all but the main header, since the darwinN.h are
unused.
This change moves a spec from darwin10.h to the main darwin.h
target header.
gcc/ChangeLog:
* config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC): Move the spec
for the Darwin10 unwinder stub from here ...
* config/darwin.h (LINK_COMMAND_SPEC_A): ... to here.
The toolchain now requires a C++11 compiler to bootstrap and
none of the older Darwin toolchains which were based on stabs
debugging are suitable. We can simplify the debug setup now.
gcc/ChangeLog:
* config/darwin.h (DSYMUTIL_SPEC): Default to DWARF.
(ASM_DEBUG_SPEC): Only define if the assembler supports
stabs.
(PREFERRED_DEBUGGING_TYPE): Default to DWARF.
(DARWIN_PREFER_DWARF): Define.
* config/darwin9.h (PREFERRED_DEBUGGING_TYPE): Remove.
(DARWIN_PREFER_DWARF): Likewise.
(DSYMUTIL_SPEC): Likewise.
(COLLECT_RUN_DSYMUTIL): Likewise.
(ASM_DEBUG_SPEC): Likewise.
(ASM_DEBUG_OPTION_SPEC): Likewise.
An invalid declaration of a CLASS instance can lead to an internal state
with inconsistent attributes during parsing that needs to be handled with
sufficient care when processing subsequent statements. Avoid a lookup of
the vtab entry for such cases.
gcc/fortran/ChangeLog:
* class.c (gfc_find_vtab): Add check on attribute is_class.
The tests use -mfp16-format=alternative, and so should not be run
if that option isn't supported.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp
(check_effective_target_arm_fp16_alternative_ok_nocache):
Return zero for *-*-vxworks7r* targets.
* gcc.target/arm/aapcs/vfp22.c: Require arm_fp16_alternative_ok.
* gcc.target/arm/aapcs/vfp23.c: Likewise.
* gcc.target/arm/aapcs/vfp24.c: Likewise.
* gcc.target/arm/aapcs/vfp25.c: Likewise.
This test fails during execution on VxWorks 7 when using
C++14 and C++17.
for gcc/testsuite/ChangeLog
* g++.dg/init/new26.C: Fix overriding of the delete operator
for c++14 profile.
The only TLS model supported in VxWorks kernel mode is local-exec.
for gcc/testsuite/ChangeLog
* g++.dg/tls/pr79288.C: Skip on vxworks_kernel (TLS model
not supported).
If the target is configured such that -mlong-call is passed
by default, the function calls these tests are trying to detect
by scanning the assembly file are performed using long calls,
like so:
| foo:
| @ memset-inline-2.c:12: memset (a, -1, 14);
| mov r2, #14 @,
| mvn r1, #0 @,
| ldr r0, .L2 @,
| ldr r3, .L2+4 @ tmp112,
| bx r3 @ tmp112
Looking at .L2 (and in particular at .L2+4):
| .L2:
| .word a
| .word memset <<<---
This change adds -mno-long-calls to the list of compiler options
to make sure we generate short call code, allowing the assembly
matching to pass.
This is added unconditionally to the dg-options (as opposed to using
dg-additional-options) because this test is already specific to ARM
targets, and -mno-long-calls is available on all ARM targets.
for gcc/testsuite/ChangeLog
* gcc.target/arm/memset-inline-2.c: Add -mno-long-calls to
the test's dg-options.
* gcc.target/arm/pr78255-2.c: Likewise.
The conflicting definition of OK is present in VxWorks RTP headers too.
for gcc/testsuite/ChangeLog
* g++.old-deja/g++.mike/p658.C: Also undefine OK on VxWorks RTP.
In VxWorks 7, UINT32 is defined in both modes, kernel and rtp. Adjust
the workaround accordingly.
for gcc/testsuite/ChangeLog
* g++.dg/opt/20050511-1.C: Work around UINT32 in vxworks rtp
headers too.
Linking in vxworks kernel-mode is partial linking, so missing symbols
are not detected.
for gcc/testsuite/ChangeLog
* g++.old-deja/g++.pt/const2.C: Skip on vxworks kernel.
VxWorks headers define ERROR as a macro, which conflicts with the use
in the test.
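The clash can be pictured with a small sketch. The #define below only
stands in for the macro a VxWorks header would supply (its value is an
assumption), and the enumerator names are hypothetical rather than taken
from copyprop.C:

```c
#include <assert.h>

/* Stand-in for a system header that defines ERROR as a macro.
   Assumption: the value is illustrative only.  */
#define ERROR (-1)

/* The guard a test needs before it reuses the name.  */
#ifdef ERROR
#undef ERROR
#endif

/* Hypothetical use of ERROR as an ordinary identifier.  Without the
   #undef above, this enumerator would expand to (-1) and fail to parse.  */
enum state { RUNNING, ERROR };

static int is_error (enum state s)
{
  return s == ERROR;
}

/* Usage: ERROR is now the enumerator, not the header's macro.  */
static void check_error_guard (void)
{
  assert (is_error (ERROR));
  assert (!is_error (RUNNING));
}
```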
for gcc/testsuite/ChangeLog
* g++.dg/tree-ssa/copyprop.C: Undefine ERROR if defined.
The vxworks kernel-mode linking is partial linking, so it cannot
detect missing symbols.
for gcc/testsuite/ChangeLog
* g++.dg/other/anon5.C: Skip on vxworks kernel.
Adjust vxworks initpri expectations, given that vxworks7 has switched
to .init_array.
for gcc/testsuite/ChangeLog
* gcc.dg/vxworks/initpri1.c: Tighten VxWorks version check.
* gcc.dg/vxworks/initpri2.c: Likewise.
This test currently fails on VxWorks 7 SR06x0 targets when in kernel
mode, because it expects a discrepancy between built-in and system
intmax_t for all VxWorks targets when in kernel mode. Fortunately,
this has now been fixed when targeting VxWorks 7 SR06x0, so this
commit adjusts the "dg-error" condition to exclude newer versions of
VxWorks 7.
for gcc/testsuite/ChangeLog
* gcc.dg/intmax_t-1.c: Do not expect an error on *-*-vxworks7r*
targets.
Match xfail on kernel instead of rtp mode.
for gcc/testsuite/ChangeLog
* gcc.dg/pthread-init-1.c: Fix the VxWorks xfail filters.
* gcc.dg/pthread-init-2.c: Ditto.
Explicitly disable some vxworks-missing features in the testsuite, that
the current feature tests detect as present.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_weak_available,
check_fork_available, check_effective_target_lto,
check_effective_target_mempcpy): Add vxworks filters.
The implicit -mlong-calls used in our vxworks configurations changes
the call sequences from those expected in the mve_libcall testcases.
This patch brings the test output in line with the expectations, with
an explicit -mno-long-calls.
for gcc/testsuite/ChangeLog
* gcc.target/arm/mve/intrinsics/mve_libcall1.c: Pass an
explicit -mno-long-calls.
* gcc.target/arm/mve/intrinsics/mve_libcall2.c: Likewise.
The implicit -mlong-calls from our vxworks configurations makes the
tail-call instructions differ from those expected by the
no_unique_address tests in gcc.target/arm.
This patch adds -mno-long-calls to the compilation commands, so that
we generate the expected sequences.
for gcc/testsuite/ChangeLog
* g++.target/arm/no_unique_address_1.C: Add -mno-long-calls.
* g++.target/arm/no_unique_address_2.C: Likewise.
The headmerge tests pass a constant to conditional calls, so that the
same constant is always passed to a function, though it's a different
function depending on which path is taken.
The test checks that the constant appears only once in the assembly
output, as a means to verify that the insns setting up the argument
are unified: they appear as separate insns up to jump2, where
crossjump identifies a common prefix to all conditional paths and
unifies them.
Alas, with -mlong-calls, which we enable in our arm-vxworks
configurations, the argument register is loaded after loading the
callee address into another register. Since each path calls a
different function, there's no common initial code sequence for
crossjump to unify, and the argument register set up remains separate,
so the test fails.
Though it would surely be desirable for the compiler to perform the
unification of the argument register setting up, this patch merely
avoids the effects of -mlong-calls, with an explicit -mno-long-calls.
for gcc/testsuite/ChangeLog
* gcc.target/arm/headmerge-1.c: Add -mno-long-calls.
* gcc.target/arm/headmerge-2.c: Likewise.
The implicit -mlong-calls used in our arm-vxworks configurations
changes the register allocation patterns in the arm/fp16-aapcs-2.c
test: r3 ends up used in the long-call sequence, and we end up using
ip as a temporary, which doesn't match the expected mov patterns.
This patch adds an explicit -mno-long-calls for the generated code to
match the expectation.
for gcc/testsuite/ChangeLog
* gcc.target/arm/fp16-aapcs-2.c: Use -mno-long-calls.
On some targets, there are no < 8191; and >= 8191; strings,
but < 8191) and >= 8191), so just remove the ; from the regexps.
2021-01-01 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98489
PR tree-optimization/56719
* gcc.dg/tree-ssa/pr56719.c: Remove semicolon from
scan-tree-dump-times regexps.
In this testcase we end up with:
unsigned long long x = ...;
char y = (char) (x << 37);
The overwidening pattern realised that only the low 8 bits
of x << 37 are needed, but then tried to turn that into:
unsigned long long x = ...;
char y = (char) x << 37;
which gives an out-of-range shift. In this case y can simply
be replaced by zero, but as the comment in the patch says,
it's kind-of awkward to do that in the middle of vectorisation.
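A minimal sketch of why the narrowed shift is invalid and why y is
always zero, assuming an 8-bit char and a 64-bit unsigned long long.
The helper names are hypothetical and unsigned char is used so the
result does not depend on the signedness of plain char:

```c
#include <assert.h>
#include <limits.h>

/* Keep only the low CHAR_BIT bits of x << 37, as the cast in the
   testcase does.  Every bit below position 37 of the shifted value
   is zero, so the result is zero for any x.  */
static unsigned char low_byte_of_shift (unsigned long long x)
{
  return (unsigned char) (x << 37);
}

/* A shift is only valid when the count is smaller than the precision
   of the type it is performed in.  */
static int shift_fits (unsigned precision, unsigned count)
{
  return count < precision;
}

static void check_shift (void)
{
  assert (low_byte_of_shift (1ULL) == 0);
  assert (low_byte_of_shift (~0ULL) == 0);
  assert (shift_fits (64, 37));        /* the original 64-bit shift */
  assert (!shift_fits (CHAR_BIT, 37)); /* the narrowed char shift */
}
```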
Most of the overwidening stuff is about keeping operations
as narrow as possible, which is important for vectorisation
but could be counter-productive for scalars (especially on
RISC targets). In contrast, optimising y to zero in the above
feels like an independent optimisation that would benefit scalar
code and that should happen before vectorisation.
gcc/
PR tree-optimization/98302
* tree-vect-patterns.c (vect_determine_precisions_from_users): Make
sure that the precision remains greater than the shift count.
gcc/testsuite/
PR tree-optimization/98302
* gcc.dg/vect/pr98302.c: New test.
This PR is about a case in which the vectoriser was feeding
incorrect alignment information to tree-data-ref.c, leading
to incorrect runtime alias checks. The alignment was taken
from the TREE_TYPE of the DR_REF, which in this case was a
COMPONENT_REF with a normally-aligned type. However, the
underlying MEM_REF was only byte-aligned.
This patch uses dr_alignment to calculate the (byte) alignment
instead, just like we do when creating vector MEM_REFs.
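One way such a mismatch between type alignment and access alignment can
arise is a packed structure. This is only a sketch with a hypothetical
record; the testcase in the PR may be shaped differently:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: packing removes the padding before 'value'
   and lowers the alignment of the whole structure to one byte.  */
struct __attribute__ ((packed)) rec
{
  char tag;
  int value;
};

/* Alignment that the type of the member normally promises.  */
static size_t type_alignment (void)
{
  return _Alignof (int);
}

/* Alignment that an access to rec.value can actually rely on: the
   containing object is only guaranteed this much.  */
static size_t access_alignment (void)
{
  return _Alignof (struct rec);
}

static void check_alignment (void)
{
  assert (offsetof (struct rec, value) == 1);
  assert (access_alignment () == 1);
  /* Assumption: int needs more than byte alignment on this target.  */
  assert (type_alignment () > access_alignment ());
}
```

Using the type's alignment here would overstate what is known about the
address, which is the kind of error the runtime alias check inherited.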
gcc/
PR tree-optimization/94994
* tree-vect-data-refs.c (vect_vfa_align): Use dr_alignment.
gcc/testsuite/
PR tree-optimization/94994
* gcc.dg/vect/pr94994.c: New test.
The static GET_MODE_MASKs for SVE vectors are based on the
static precisions, which in turn are based on 128-bit SVE.
The precisions are later updated based on -msve-vector-bits
(usually to become variable length), but the GET_MODE_MASK
stayed the same. This caused combine to fold:
(*_extract:DI (subreg:DI (reg:VNxMM R) 0) ...)
to zero because the extracted bits appeared to be insignificant.
gcc/
PR rtl-optimization/98214
* genmodes.c (emit_insn_modes_h): Emit a definition of CONST_MODE_MASK.
(emit_mode_mask): Treat mode_mask_array as non-constant if adj_nunits.
(emit_mode_adjustments): Update GET_MODE_MASK when updating
GET_MODE_NUNITS.
* machmode.h (mode_mask_array): Use CONST_MODE_MASK.
The following patch adds some clz simplifications. If
clz is 0, then the MSB of the argument is set, and if clz is prec-1, then
the argument is 1.
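The two equivalences can be checked with a short sketch, assuming GCC's
__builtin_clz on unsigned int and its modulo behaviour when converting
an out-of-range unsigned value to int. The helper names are
hypothetical, and zero is avoided because __builtin_clz (0) is
undefined:

```c
#include <assert.h>
#include <limits.h>

enum { PREC = sizeof (unsigned) * CHAR_BIT };

/* clz (X) == 0 and its replacement (int) X < 0.  */
static int msb_set_clz (unsigned x) { return __builtin_clz (x) == 0; }
static int msb_set_cmp (unsigned x) { return (int) x < 0; }

/* clz (X) == prec - 1 and its replacement X == 1.  */
static int is_one_clz (unsigned x) { return __builtin_clz (x) == PREC - 1; }
static int is_one_cmp (unsigned x) { return x == 1; }

/* Compare both forms over a few nonzero sample values.  */
static void check_clz_forms (void)
{
  static const unsigned samples[] =
    { 1u, 2u, 3u, UINT_MAX >> 1, 1u << (PREC - 1), UINT_MAX };
  unsigned i;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (msb_set_clz (samples[i]) == msb_set_cmp (samples[i]));
      assert (is_one_clz (samples[i]) == is_one_cmp (samples[i]));
    }
}
```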
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94802
* match.pd (clz(X) == 0 -> (int)X < 0): New simplification.
(clz(X) == (prec-1) -> X == 1): Likewise.
* gcc.dg/tree-ssa/pr94802-1.c: New test.
The following patch adds two simplifications to recognize idioms
for ABS_EXPR and ABSU_EXPR, respectively.
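As a sketch of the source-level idioms involved (the function names are
hypothetical): -(X < 0) is -1 for negative X and 0 otherwise, so OR-ing
in 1 yields a multiplier of -1 or 1.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Signed form, equivalent to abs (x).  Like abs, it overflows for
   INT_MIN, so that input is not exercised below.  */
static int abs_idiom (int x)
{
  return (-(x < 0) | 1) * x;
}

/* Unsigned form: the 1U operand makes the arithmetic unsigned, so the
   result is well defined for INT_MIN too.  */
static unsigned absu_idiom (int x)
{
  return (-(x < 0) | 1U) * x;
}

static void check_abs_idioms (void)
{
  int x;

  for (x = -100; x <= 100; x++)
    {
      assert (abs_idiom (x) == abs (x));
      assert (absu_idiom (x) == (unsigned) abs (x));
    }
  assert (absu_idiom (INT_MIN) == (unsigned) INT_MAX + 1u);
}
```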
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94785
* match.pd ((-(X < 0) | 1) * X -> abs (X)): New simplification.
((-(X < 0) | 1U) * X -> absu (X)): Likewise.
* gcc.dg/tree-ssa/pr94785.c: New test.
The following testcase is miscompiled, because niter analysis miscomputes
the number of iterations to 0.
The problem is that niter analysis uses mpz_t (wonder why, wouldn't
widest_int do the same job?) and when wi::to_mpz is called e.g. on the
TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result with the
wrong value.
wi::to_mpz has code to handle negative wide_ints in signed types by
inverting all bits, importing to mpz and complementing it, which is fine,
but doesn't handle correctly the case when the wide_int's len (times
HOST_BITS_PER_WIDE_INT) is smaller than precision when wi::neg_p.
E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented
in wide_int as 0xffffffffffffffff len 1, and wi::to_mpz would create
0xffffffffffffffff mpz_t value from that.
This patch handles it by adding the needed -1 host wide int words (and also
has code to deal with precisions that aren't a multiple of
HOST_BITS_PER_WIDE_INT).
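The idea can be modelled outside GCC with plain 64-bit words. This is a
simplified sketch, not the wide-int.cc code: expand_words and its
parameters are hypothetical, and it assumes 1 <= len <= the number of
words needed for the precision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand a compressed value of LEN words to the full PRECISION bits of
   an unsigned type.  Words beyond LEN are implied by the top bit of the
   last stored word: all ones if it is set, all zeros otherwise.
   Returns the number of words written to OUT.  */
static size_t expand_words (const uint64_t *val, size_t len,
                            unsigned precision, uint64_t *out)
{
  size_t nwords = (precision + 63) / 64;
  uint64_t fill = (val[len - 1] >> 63) ? UINT64_MAX : 0;
  unsigned excess;
  size_t i;

  for (i = 0; i < nwords; i++)
    out[i] = i < len ? val[i] : fill;

  /* Drop bits above PRECISION when it is not a multiple of 64.  */
  excess = (unsigned) (nwords * 64 - precision);
  if (excess != 0)
    out[nwords - 1] &= UINT64_MAX >> excess;
  return nwords;
}

static void check_expand (void)
{
  uint64_t max128[1] = { UINT64_MAX };  /* 128-bit all-ones, len 1 */
  uint64_t out[2];

  assert (expand_words (max128, 1, 128, out) == 2);
  assert (out[0] == UINT64_MAX && out[1] == UINT64_MAX);

  assert (expand_words (max128, 1, 100, out) == 2);
  assert (out[1] == (UINT64_MAX >> 28));
}
```

Converting only the single stored word, as the old code effectively did,
would give 0xffffffffffffffff instead of the full 128-bit maximum.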
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98474
* wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type
is unsigned and excess negative, append set bits after len until
precision.
* gcc.c-torture/execute/pr98474.c: New test.
The following testcase is diagnosed by UBSan as invalid, even though it is
valid.
We have a derived type Base2 at offset 1 with alignment 1 and do:
(const Derived &) ((const Base2 *) this + -1)
but before ubsan in the FE gets a chance to instrument it, the folder
optimizes that into:
(const Derived &) this + -1
and so we require that this has 8-byte alignment which Derived class needs.
Fixed by avoiding such an optimization when -fsanitize=alignment is in
effect if it would affect the alignments (and guarded with !in_gimple_form
because we don't really care during GIMPLE, though pointer conversions are
useless then and so such folding isn't needed very much during GIMPLE).
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR c++/98206
* fold-const.c: Include asan.h.
(fold_unary_loc): Don't optimize (ptr_type) (((ptr_type2) x) p+ y)
into ((ptr_type) x) p+ y if sanitizing alignment in GENERIC and
ptr_type points to type with higher alignment than ptr_type2.
* g++.dg/ubsan/align-4.C: New test.
The following patch adds an optimization mentioned in PR56719 #c8.
We already have the x != 0 && y != 0 && z != 0 into (x | y | z) != 0
and x != -1 && y != -1 && z != -1 into (x & y & z) != -1
optimizations, this patch just extends that to
x < C && y < C && z < C for power of two constants C into
(x | y | z) < C (for unsigned comparisons).
I didn't want to create too many buckets (there can be TYPE_PRECISION such
constants), so the patch instead just uses one bucket for all such
constants and loops over that bucket up to TYPE_PRECISION times.
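A small sketch of the equivalence, with hypothetical helper names. For
C == 2^k, a value is below C exactly when it has no bit set at position
k or above, and OR-ing the operands preserves that property:

```c
#include <assert.h>

/* The original chain of range tests.  */
static int below_separate (unsigned x, unsigned y, unsigned z, unsigned c)
{
  return x < c && y < c && z < c;
}

/* The merged form, valid when c is a power of two and the comparisons
   are unsigned.  */
static int below_merged (unsigned x, unsigned y, unsigned z, unsigned c)
{
  return (x | y | z) < c;
}

static void check_range_merge (void)
{
  unsigned x, y, z;

  /* Both forms agree for every small input with c == 16.  */
  for (x = 0; x < 40; x++)
    for (y = 0; y < 40; y++)
      for (z = 0; z < 40; z++)
        assert (below_separate (x, y, z, 16) == below_merged (x, y, z, 16));

  /* With a non-power-of-two bound they can differ: 1 | 2 | 4 == 7.  */
  assert (below_separate (1, 2, 4, 5) == 1);
  assert (below_merged (1, 2, 4, 5) == 0);
}
```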
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/56719
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Also optimize
x < C && y < C && z < C when C is a power of two constant into
(x | y | z) < C.
* gcc.dg/tree-ssa/pr56719.c: New test.
Symbols with extern(D) linkage are now mangled using back references to
types and identifiers if these occur more than once in the mangled name
as emitted before. This reduces symbol length, especially with chained
expressions of templated functions with Voldemort return types.
For example, the average symbol length of the 127000+ symbols created by
a libphobos unittest build is reduced by a factor of about 3, while the
longest symbol shrinks from 416133 to 1142 characters.
Reviewed-on: https://github.com/dlang/dmd/pull/12079
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 2bd4fc3fe.