Richard Sandiford 648fc673e6 aarch64: Add support for SVE_B16B16
This patch adds support for the SVE_B16B16 extension, which provides
non-widening BF16 versions of existing instructions.

Mostly it's just a simple extension of iterators.  The main
complications are:

(1) The new instructions have no immediate forms.  This is easy to
    handle for the cond_* patterns (the ones that have an explicit
    else value) since those are already divided into register and
    non-register versions.  All we need to do is tighten the predicates.

    However, the @aarch64_pred_<optab><mode> patterns handle the
    immediates directly.  Rather than complicate them further,
    it seemed best to add a single @aarch64_pred_<optab><mode> for
    all BF16 arithmetic.

(2) There is no BFSUBR, so the usual method of handling reversed
    operands breaks down.  The patch deals with this using some
    new attributes that together disable the "BFSUBR" alternative.

(3) Similarly, there are no BFMAD or BFMSB instructions, so we need
    to disable those forms in the BFMLA and BFMLS patterns.

The patch includes support for generic bf16 vectors too.

It would be possible to use these instructions for scalars, as with
the recent FLOGB patch, but that's left as future work.
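
As a minimal sketch of how user code might depend on this support, the C
fragment below keys off the __ARM_FEATURE_SVE_B16B16 macro that the ChangeLog
says is conditionally defined.  The function names add_bf16 and
have_sve_b16b16 are hypothetical, and the use of svadd_bf16_x is an
assumption based on the add_bf16.c ACLE test listed in the ChangeLog; it is
not taken from the patch itself.  Without the feature, the fragment still
compiles and simply reports that the extension is unavailable.

```c
#if defined(__ARM_FEATURE_SVE_B16B16)
#include <arm_sve.h>

/* Hypothetical wrapper: predicated, non-widening BF16 addition.
   Assumption: svadd_bf16_x is among the SVE_B16B16 forms of existing
   intrinsics that the patch adds.  */
svbfloat16_t
add_bf16 (svbool_t pg, svbfloat16_t a, svbfloat16_t b)
{
  return svadd_bf16_x (pg, a, b);
}
#endif

/* Hypothetical helper: return 1 if this translation unit was compiled
   with SVE_B16B16 enabled, 0 otherwise.  */
int
have_sve_b16b16 (void)
{
#if defined(__ARM_FEATURE_SVE_B16B16)
  return 1;
#else
  return 0;
#endif
}
```

Guarding on the feature macro rather than on a compiler version keeps the
fallback path buildable on targets and toolchains that lack the extension.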

gcc/
	* config/aarch64/aarch64-option-extensions.def
	(sve-b16b16): New extension.
	* doc/invoke.texi: Document it.
	* config/aarch64/aarch64.h (TARGET_SME_B16B16, TARGET_SVE2_OR_SME2)
	(TARGET_SSVE_B16B16): New macros.
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
	Conditionally define __ARM_FEATURE_SVE_B16B16.
	* config/aarch64/aarch64-sve-builtins-sve2.def: Add AARCH64_FL_SVE2
	to the SVE2p1 requirements.  Add SVE_B16B16 forms of existing
	intrinsics.
	* config/aarch64/aarch64-sve-builtins.cc (type_suffixes): Treat
	bfloat as a floating-point type.
	(TYPES_h_bfloat): New macro.
	* config/aarch64/aarch64.md (is_bf16, is_rev, supports_bf16_rev)
	(mode_enabled): New attributes.
	(enabled): Test mode_enabled.
	* config/aarch64/iterators.md (SVE_FULL_F_BF): New mode iterator.
	(SVE_CLAMP_F): Likewise.
	(SVE_Fx24): Add BF16 modes when TARGET_SSVE_B16B16.
	(sve_lane_con): Handle BF16 modes.
	(b): Handle SF and DF modes.
	(is_bf16): New mode attribute.
	(supports_bf16, supports_bf16_rev): New int attributes.
	* config/aarch64/predicates.md
	(aarch64_sve_float_maxmin_immediate): Reject BF16 modes.
	* config/aarch64/aarch64-sve.md
	(*post_ra_<sve_fp_op><mode>3): Add BF16 support, and likewise
	for the associated define_split.
	(<optab:SVE_COND_FP_BINARY_OPTAB><mode>): Add BF16 support.
	(@cond_<optab:SVE_COND_FP_BINARY><mode>): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_2_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_2_strict): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_3_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_3_strict): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_any_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_any_strict): Likewise.
	(@aarch64_mul_lane_<mode>): Likewise.
	(<optab:SVE_COND_FP_TERNARY><mode>): Likewise.
	(@aarch64_pred_<optab:SVE_COND_FP_TERNARY><mode>): Likewise.
	(@cond_<optab:SVE_COND_FP_TERNARY><mode>): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_4_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_4_strict): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_any_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_any_strict): Likewise.
	(@aarch64_<optab:SVE_FP_TERNARY_LANE>_lane_<mode>): Likewise.
	* config/aarch64/aarch64-sve2.md
	(@aarch64_pred_<optab:SVE_COND_FP_BINARY><mode>): Define BF16 version.
	(@aarch64_sve_fclamp<mode>): Add BF16 support.
	(*aarch64_sve_fclamp<mode>_x): Likewise.
	(*aarch64_sve_<maxmin_uns_op><SVE_Fx24:mode>): Likewise.
	(*aarch64_sve_single_<maxmin_uns_op><SVE_Fx24:mode>): Likewise.
	* config/aarch64/aarch64.cc (aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Return false for BF16 modes.

gcc/testsuite/
	* lib/target-supports.exp: Test the assembler for sve-b16b16 support.
	* gcc.target/aarch64/pragma_cpp_predefs_4.c: Test the new B16B16
	macros.
	* gcc.target/aarch64/sve/fmad_1.c: Test bfloat16 too.
	* gcc.target/aarch64/sve/fmla_1.c: Likewise.
	* gcc.target/aarch64/sve/fmls_1.c: Likewise.
	* gcc.target/aarch64/sve/fmsb_1.c: Likewise.
	* gcc.target/aarch64/sve/cond_mla_9.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/clamp_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/clamp_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/max_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/max_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/maxnm_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/maxnm_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/min_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/min_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/minnm_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/minnm_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_1.c: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_1.h: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_2.c: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_3.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/add_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/max_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/maxnm_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/min_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/minnm_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mla_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mla_lane_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mls_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mls_lane_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mul_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mul_lane_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/sub_bf16.c: Likewise.
2024-11-20 13:27:40 +00:00
.forgejo top-level: Add pull request template for Forgejo 2024-10-23 19:45:09 +01:00
.github
c++tools Daily bump. 2024-05-09 10:58:01 +00:00
config Daily bump. 2024-04-17 00:18:45 +00:00
contrib Daily bump. 2024-11-19 00:19:52 +00:00
fixincludes Daily bump. 2024-07-12 00:17:52 +00:00
gcc aarch64: Add support for SVE_B16B16 2024-11-20 13:27:40 +00:00
gnattools Daily bump. 2024-07-08 00:17:01 +00:00
gotools Daily bump. 2024-04-16 00:18:06 +00:00
include Daily bump. 2024-11-02 00:19:21 +00:00
INSTALL
libada Update copyright years. 2024-01-03 12:19:35 +01:00
libatomic Daily bump. 2024-11-19 00:19:52 +00:00
libbacktrace Daily bump. 2024-10-26 00:19:39 +00:00
libcc1 Daily bump. 2024-09-21 00:16:55 +00:00
libcody
libcpp Daily bump. 2024-11-19 00:19:52 +00:00
libdecnumber Daily bump. 2024-04-03 00:17:29 +00:00
libffi Daily bump. 2024-10-26 00:19:39 +00:00
libgcc Daily bump. 2024-11-20 00:19:59 +00:00
libgfortran Daily bump. 2024-10-08 00:19:04 +00:00
libgm2 Daily bump. 2024-05-30 00:16:44 +00:00
libgo syscall: don't define syscall stub on Hurd 2024-10-30 11:33:07 -07:00
libgomp Daily bump. 2024-11-19 00:19:52 +00:00
libgrust Daily bump. 2024-08-02 00:18:55 +00:00
libiberty Daily bump. 2024-11-20 00:19:59 +00:00
libitm Daily bump. 2024-11-19 00:19:52 +00:00
libobjc Daily bump. 2024-09-24 00:18:14 +00:00
libphobos Daily bump. 2024-11-19 00:19:52 +00:00
libquadmath Daily bump. 2024-08-29 00:19:25 +00:00
libsanitizer libsanitizer: Update LOCAL_PATCHES 2024-11-12 21:59:49 +08:00
libssp Daily bump. 2024-05-09 10:58:01 +00:00
libstdc++-v3 libstdc++: Use const_iterator in std::set::find<K> return type 2024-11-20 06:44:50 +00:00
libvtv Daily bump. 2024-11-19 00:19:52 +00:00
lto-plugin Daily bump. 2024-08-24 00:18:13 +00:00
maintainer-scripts Daily bump. 2024-07-20 00:17:53 +00:00
zlib
.b4-config Add config file so b4 uses inbox.sourceware.org automatically 2024-07-28 11:13:16 +01:00
.dir-locals.el dir-locals: apply our C settings in C++ also 2024-07-31 20:38:27 +02:00
.gitattributes
.gitignore Git ignores .vscode 2024-09-12 22:51:00 +08:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-11-20 00:19:59 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in config-ml.in: Fix multi-os-dir search 2024-05-06 12:08:28 +08:00
config.guess
config.rpath
config.sub
configure Add libdiagnostics (v4) 2024-11-18 17:08:36 -05:00
configure.ac Add libdiagnostics (v4) 2024-11-18 17:08:36 -05:00
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh ltmain.sh: allow more flags at link-time 2024-09-25 19:05:24 +01:00
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS [MAINTAINERS] Add myself to write after approval and DCO. 2024-10-30 14:31:09 +05:30
Makefile.def gccrs: Fix missing build dependency 2024-01-16 16:23:02 +01:00
Makefile.in Makefile.tpl: fix whitespace in licence header 2024-08-22 03:41:12 +01:00
Makefile.tpl Makefile.tpl: fix whitespace in licence header 2024-08-22 03:41:12 +01:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt Remove Debian from SECURITY.txt 2024-11-19 12:27:33 +01:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.