Currently the unaligned YMM and ZMM load and store costs are cheaper than
the aligned ones, which causes the vectorizer to purposely mis-align
accesses by adding an alignment prologue. It looks like the unaligned
costs were simply copied from the bogus znver4 costs. The following makes
the unaligned costs equal to the aligned costs, as in the fixed znver4
version.
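For illustration only, a minimal sketch of the shape of the change. The
field names follow i386's processor_costs; the numeric values are made
up and are not znver5's real entries:

  /* Sketch: the vector load/store cost rows of a processor cost table,
     indexed by access size (32, 64, 128, 256 and 512 bits).  */
  struct vec_mem_costs
  {
    int sse_load[5];
    int sse_unaligned_load[5];
    int sse_store[5];
    int sse_unaligned_store[5];
  };

  /* Before: the 256/512-bit unaligned entries are cheaper than the
     aligned ones, so the vectorizer prefers mis-aligned accesses.  */
  const vec_mem_costs before = {
    { 6, 6, 6, 10, 20 }, { 6, 6, 6,  6, 12 },
    { 8, 8, 8, 12, 24 }, { 8, 8, 8,  8, 16 },
  };

  /* After: the unaligned rows simply mirror the aligned rows.  */
  const vec_mem_costs after = {
    { 6, 6, 6, 10, 20 }, { 6, 6, 6, 10, 20 },
    { 8, 8, 8, 12, 24 }, { 8, 8, 8, 12, 24 },
  };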
* config/i386/x86-tune-costs.h (znver5_cost): Update unaligned
load and store cost from the aligned costs.
(cherry picked from commit 896393791e)
An address override only applies to the (reg32) part in the thread
address fs:(reg32). Don't rewrite a thread address like
(set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 98 [ __gmpfr_emax.0_1 ])
(mem/c:SI (plus:SI (plus:SI (unspec:SI [
(const_int 0 [0])
] UNSPEC_TP)
(reg:SI 107))
(const:SI (unspec:SI [
(symbol_ref:SI ("previous_emax") [flags 0x1a] <var_decl 0x7fffe9a11cf0 previous_emax>)
] UNSPEC_DTPOFF))) [1 previous_emax+0 S4 A32])))
when an address override is used, in order to avoid generating an
invalid memory operand like
cmpl %fs:previous_emax@dtpoff(%eax), %r12d
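A sketch of the shape of the new early return; the exact placement
inside ix86_rewrite_tls_address_1 and the accessors used are
assumptions, not the committed hunk:

  /* Sketch: if the TLS address is already "thread register plus an
     integer register", i.e. (plus (unspec [...] UNSPEC_TP) (reg)),
     an address override is in use; leave the address alone instead
     of rewriting it into an invalid %fs:...@dtpoff(reg) operand.  */
  if (GET_CODE (addr) == PLUS
      && GET_CODE (XEXP (addr, 0)) == UNSPEC
      && XINT (XEXP (addr, 0), 1) == UNSPEC_TP
      && REG_P (XEXP (addr, 1)))
    return addr;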
gcc/
PR target/116839
* config/i386/i386.cc (ix86_rewrite_tls_address_1): Make it
static. Return if TLS address is thread register plus an integer
register.
gcc/testsuite/
PR target/116839
* gcc.target/i386/pr116839.c: New file.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c79cc30862)
Currently subregs originating from *tf_to_fprx2_0 and *tf_to_fprx2_1
survive register allocation. This in turn leads to wrong register
renaming. Keeping the current approach would mean we need two insns for
*tf_to_fprx2_0 and *tf_to_fprx2_1, respectively, something along the
lines of
(define_insn "*tf_to_fprx2_0"
[(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" "=f") 0)
(unspec:DF [(match_operand:TF 1 "general_operand" "v")]
UNSPEC_TF_TO_FPRX2_0))]
"TARGET_VXE"
"#")
(define_insn "*tf_to_fprx2_0"
[(set (match_operand:DF 0 "nonimmediate_operand" "=f")
(unspec:DF [(match_operand:TF 1 "general_operand" "v")]
UNSPEC_TF_TO_FPRX2_0))]
"TARGET_VXE"
"vpdi\t%v0,%v1,%v0,1
[(set_attr "op_type" "VRR")])
and similar for *tf_to_fprx2_1. Note that before register allocation
operand 0 has mode FPRX2, and afterwards DF once subregs have been
eliminated.
Since we always copy a whole vector register into a floating-point
register pair, another way to fix this is to merge *tf_to_fprx2_0 and
*tf_to_fprx2_1 into a single insn, which means we don't have to use
subregs at all. The downside of this is that the assembler template now
contains two instructions. The upside is that we don't have to come up
with some artificial insn before RA, which is arguably more
readable/maintainable. That is what this patch implements.
In commit r11-4872-ge627cda5686592, the output operand specifier %V was
introduced; it is now used in tf_to_fprx2 only. Instead of coming up
with its counterpart %F for floating-point registers, which would also
only be used in tf_to_fprx2, I print the operands directly. This
renders %V unused, which is why it is removed by this patch.
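As a purely illustrative example (not one of the adapted tests) of when
this copy is needed: with the vector facility a TFmode value is computed
in a vector register, but the ABI hands the result back in a
floating-point register pair, which is what tf_to_fprx2 implements.

  /* Illustration only: with -march=z14 the multiplication is done in a
     vector register, while the long double result is returned in the
     floating-point register pair f0/f2, so a vector-register-to-FPR-pair
     copy is emitted on return.  */
  long double
  scale (long double x)
  {
    return x * 2.0L;
  }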
gcc/ChangeLog:
PR target/115860
* config/s390/s390.cc (print_operand): Remove operand specifier
%V.
* config/s390/s390.md (UNSPEC_TF_TO_FPRX2): New.
* config/s390/vector.md (*tf_to_fprx2_0): Remove.
(*tf_to_fprx2_1): Remove.
(tf_to_fprx2): New.
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/long-double-asm-abi.c: Adapt
scan-assembler directive.
* gcc.target/s390/vector/long-double-to-i64.c: Adapt
scan-assembler directive.
* gcc.target/s390/pr115860-1.c: New test.
(cherry picked from commit 46c2538435)
For the AQ and AR constraints, ensure that the displacement is still
valid after adding any positive offset smaller than the size of the
object being referenced.
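A minimal model of the tightened check; it assumes a 12-bit unsigned
short displacement, and the real s390_mem_constraint code differs:

  /* Model only: every byte of an N-byte access must still have a valid
     displacement, not just its first byte.  */
  static bool
  disp_ok_for_whole_access (long disp, long size)
  {
    return disp >= 0 && disp + size - 1 <= 0xfff;
  }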
gcc/ChangeLog:
* config/s390/s390.cc (s390_mem_constraint): Check displacement
for AQ and AR constraints.
(cherry picked from commit 1a71ff3b89)
gcc/fortran/ChangeLog:
PR fortran/100273
* trans-decl.cc (gfc_create_module_variable): Handle module
variable also when it is needed for the result specification
of a contained function.
gcc/testsuite/ChangeLog:
PR fortran/100273
* gfortran.dg/pr100273.f90: New test.
(cherry picked from commit 1f462b5072)
When a memory copy operation is analyzed by analyze_ssa_name, if both the
load and store are made through the same SSA name, the store is overlooked.
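A hypothetical illustration of such a copy (this is not the committed
modref-4.c): both the load and the store are based on the same SSA name
p, so the store side must be recorded as well.

  struct inner { int a[4]; };
  struct outer { struct inner x, y; };

  /* Aggregate copy: loads from p->y and stores to p->x, both through
     the single pointer SSA name for p.  */
  void
  copy_halves (struct outer *p)
  {
    p->x = p->y;
  }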
gcc/
* ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Always
process both the load and the store of a memory copy operation.
gcc/testsuite/
* gcc.dg/ipa/modref-4.c: New test.
In s390_expand_insv(), when generating code for ICM et al., src may be
a MEM, and gen_lowpart() might force src into a register, so that we
end up with patterns which do not match anymore. Use adjust_address()
instead in order to preserve the MEM.
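A sketch of that distinction; the helper name is hypothetical and only
the adjust_address()/gen_lowpart() contrast is the point:

  /* Hypothetical helper, sketch only: narrow SRC to MODE while keeping
     a MEM a MEM.  gen_lowpart () may force the MEM into a new pseudo,
     after which the ICM pattern no longer matches.  */
  static rtx
  narrow_keeping_mem (rtx src, machine_mode mode)
  {
    if (MEM_P (src))
      return adjust_address (src, mode, 0);
    return gen_lowpart (mode, src);
  }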
Furthermore, it is not straightforward to enforce a subreg. For
example, in case of a paradoxical subreg, gen_lowpart() may return a
register. To compensate for this, s390_gen_lowpart_subreg() emits a
reference to a pseudo which does not coincide with its definition,
which is wrong. Additionally, if dest is a paradoxical subreg, do not
try to emit a strict_low_part, since it could mean that dest was not
initialized, even though this might be fixed up later by init-regs.
The splitters for the insns *get_tp_64, *zero_extendhisi2_31,
*zero_extendqisi2_31 and *zero_extendqihi2_31 are applied after reload.
Thus, operands[0] is a hard register and gen_lowpart (m, operands[0])
just returns the hard register for mode m, which is fine to use as an
argument for strict_low_part, i.e., we do not need to enforce subregs
here since after reload subregs are supposed to be eliminated anyway.
This fixes gcc.dg/torture/pr111821.c.
gcc/ChangeLog:
* config/s390/s390-protos.h (s390_gen_lowpart_subreg): Remove.
* config/s390/s390.cc (s390_gen_lowpart_subreg): Remove.
(s390_expand_insv): Use adjust_address() and emit a
strict_low_part only in case of a natural subreg.
* config/s390/s390.md: Use gen_lowpart() instead of
s390_gen_lowpart_subreg().
(cherry picked from commit 9ebc9fbddd)
This patch is backported from GCC15 with some tweaks.
Since r15-3539, requests have been coming in to add documentation for
the other alias options. This patch adds all of them, including corei7,
corei7-avx, core-avx-i, core-avx2, atom and slm.
Also in the patch, I reordered that part of the documentation, since
currently all the CPUs/products are just all over the place. I
regrouped them into date-to-now products (from the very first CPU to
the latest Panther Lake), P-core (since the client products became
hybrid cores, starting from Sapphire Rapids) and E-core (since
Bonnell). In GCC 14 and earlier GCC, the Xeon Phi CPUs are still there,
so I put them after the E-core CPUs.
In the patch I also refined the product names in the documentation.
gcc/ChangeLog:
* doc/invoke.texi: Add corei7, corei7-avx, core-avx-i,
core-avx2, atom, and slm. Reorder the -march documentation by
splitting them into date-to-now products, P-core, E-core and
Xeon Phi. Refine the product names in documentation.
r12-3495 added maybe_warn_about_constant_value, which crashes if it
gets a nameless VAR_DECL, and that is what happens in this PR: the
nameless VAR_DECL is created in cp_parser_decomposition_declaration.
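For illustration of the construct involved (this is not the committed
constexpr-116676.C): the variable the parser creates to hold the
decomposed object carries no DECL_NAME.

  struct A { int i, j; };
  constexpr A a{1, 2};

  // The decomposition declaration below makes
  // cp_parser_decomposition_declaration create an unnamed backing
  // VAR_DECL for the whole object; warning code must not assume
  // DECL_NAME is set on it.
  const auto &[x, y] = a;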
PR c++/116676
gcc/cp/ChangeLog:
* constexpr.cc (maybe_warn_about_constant_value): Check DECL_NAME.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-116676.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit dfe0d4389a)
Don't use a temporary for a PARALLEL BLKmode argument whose EXPR_LIST
expression is in a TImode register. Otherwise, the TImode variable
would be put in the GPR save area, which guarantees only 8-byte
alignment.
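As a hypothetical illustration of the alignment concern only (this is
not the committed pr116621.c and may not exercise the exact PARALLEL
shape the fix targets): a 16-byte-aligned value fetched with va_arg
must not be forced through the 8-byte-aligned register save area.

  #include <stdarg.h>

  /* Illustration only: VALUE has 16-byte alignment; spilling it into
     the GPR save area, which is only 8-byte aligned, would break an
     aligned TImode access to it.  */
  __int128
  first_vararg (int n, ...)
  {
    va_list ap;
    va_start (ap, n);
    __int128 value = va_arg (ap, __int128);
    va_end (ap);
    return value;
  }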
gcc/
PR target/116621
* config/i386/i386.cc (ix86_gimplify_va_arg): Don't use temp for
a PARALLEL BLKmode container of an EXPR_LIST expression in a
TImode register.
gcc/testsuite/
PR target/116621
* gcc.target/i386/pr116621.c: New test.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit fa7bbb065c)
Update analyze_parms not to disable function parameter analysis for
-ffat-lto-objects. Tested on x86-64: there are no differences in zstd
built with "-O2 -flto=auto -g" vs. "-O2 -flto=auto -g
-ffat-lto-objects".
PR ipa/116410
* ipa-modref.cc (analyze_parms): Always analyze function parameter
for LTO.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 2f1689ea8e)
The non-optimized version of the intrinsic has a typo in the mask type,
which causes the high bits of the __mmask32 result to be unexpectedly
zeroed.
The test does not fail under -O0 with the current 1b testcase since the
testcase is wrong: avx512-mask-type.h needs to be included after SIZE
is defined, otherwise the mask type is always __mmask8. The same
problem exists in the AVX10.2 testcases; I will write a separate patch
to fix that.
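A hypothetical use showing why the full __mmask32 matters (this is not
the new 1c testcase); at -O0 the macro form of the intrinsic is used,
which is the variant that had the typo:

  #include <immintrin.h>

  /* Returns one bit per 16-bit lane; lanes 8..31 were lost when the
     -O0 macro version truncated the result to __mmask8.  The imm8
     value 0x18 selects positive and negative infinity as the classes
     to test for.  */
  __mmask32
  lanes_that_are_inf (__m512h x)
  {
    return _mm512_fpclass_ph_mask (x, 0x18);
  }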
gcc/ChangeLog:
* config/i386/avx512fp16intrin.h
(_mm512_mask_fpclass_ph_mask): Correct mask type to __mmask32.
(_mm512_fpclass_ph_mask): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vfpclassph-1c.c: New test.
For function arguments/returns, when the value has BLKmode, it is put
in a PARALLEL with an EXPR_LIST, and the EXPR_LIST contains the real
mode and registers.
The current ix86_check_avx_upper_register only checks for SSE_REG_P and
fails to handle that. The patch extends the check to each subrtx.
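A sketch of the reworked check, close to what the ChangeLog describes,
but the helper name and exact predicate here are assumptions rather
than the committed hunk:

  /* Sketch: return true if EXP contains, at any depth, an SSE register
     whose mode is wider than 128 bits, so that a register inside a
     (parallel [(expr_list (reg) ...)]) argument container is also
     caught.  */
  static bool
  check_avx_upper_register (const_rtx exp)
  {
    subrtx_iterator::array_type array;
    FOR_EACH_SUBRTX (iter, array, exp, NONCONST)
      {
        const_rtx x = *iter;
        if (SSE_REG_P (x)
            && !EXT_REX_SSE_REG_P (x)
            && GET_MODE_BITSIZE (GET_MODE (x)) > 128)
          return true;
      }
    return false;
  }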
gcc/ChangeLog:
PR target/116512
* config/i386/i386.cc (ix86_check_avx_upper_register): Iterate
subrtx to scan for avx upper register.
(ix86_check_avx_upper_stores): Inline old
ix86_check_avx_upper_register.
(ix86_avx_u128_mode_needed): Ditto, and replace
FOR_EACH_SUBRTX with call to new
ix86_check_avx_upper_register.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr116512.c: New test.
(cherry picked from commit ab214ef734)