mirror of
https://gcc.gnu.org/git/gcc.git
synced 2024-12-03 00:34:21 +08:00
425ad87dcf
On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches wrote: > Ah, ok, thanks for the extra context. > > So AIUI the problem when recording xmm2<-di isn't just: > > [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src)) > > but also that: > > [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode) > > For example, all registers in this sequence can be part of the same chain: > > (set (reg:HI R1) (reg:HI R0)) > (set (reg:SI R2) (reg:SI R1)) // [A] > (set (reg:DI R3) (reg:DI R2)) // [A] > (set (reg:SI R4) (reg:SI R[0-3])) > (set (reg:HI R5) (reg:HI R[0-4])) > > But: > > (set (reg:SI R1) (reg:SI R0)) > (set (reg:HI R2) (reg:HI R1)) > (set (reg:SI R3) (reg:SI R2)) // [A] && [B] > > is problematic because it dips below the precision of the oldest regno > and then increases again. > > When this happens, I guess we have two choices: > > (1) what the patch does: treat R3 as the start of a new chain. > (2) pretend that the copy occured in vd->e[sr].mode instead > (i.e. copy vd->e[sr].mode to vd->e[dr].mode) > > I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P. > Maybe the optimisation provided by (2) compared to (1) isn't common > enough to be worth the complication. > > I think we should test [B] as well as [A] though. The pass is set > up to do some quite elaborate mode changes and I think rejecting > [A] on its own would make some of the other code redundant. > It also feels like it should be a seperate “if” or “else if”, > with its own comment. Unfortunately, we now have a testcase that shows that testing also [B] is a problem (unfortunately now latent on the trunk, only reproduces on 10 and 11 branches). The comment in the patch tries to list just the interesting instructions, we have a 64-bit value, copy low 8 bit of those to another register, copy full 64 bits to another register and then clobber the original register. Before that (set (reg:DI r14) (const_int ...)) we have a chain DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain, so we have QI si, DI bp , si being the oldest_regno. Next DI si is copied into DI dx. Only the low 8 bits of that are defined, the rest is unspecified, but we would add DI dx into that same chain at the end, so QI si, DI bp, DI dx [*]. Next si is overwritten, so the chain is DI bp, DI dx. And then we see (set (reg:DI dx) (reg:DI bp)) and remove it as redundant, because we think bp and dx are already equivalent, when in reality that is true only for the lowpart 8 bits. I believe the [*] marked step above is where the bug is. The committed regcprop.c (copy_value) change (but only committed to trunk/11, not to 10) added else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)) && partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)) return; and while the first partial_subreg_p call returns true, the second one doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be true and we'd return, but as that reg got clobbered, si became the oldest regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false. But as the testcase shows, what is the oldest_regno in the chain is something that changes over time, so relying on it for anything is problematic, something could have a different oldest_regno and later on get a different oldest_regno (perhaps with different mode) because the oldest_regno got overwritten and it can change both ways. The following patch effectively implements your (2) above. 2021-05-15 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/100342 * regcprop.c (copy_value): When copying a source reg in a wider mode than it has recorded for the value, adjust recorded destination mode too or punt if !REG_CAN_CHANGE_MODE_P. * gcc.target/i386/pr100342.c: New test. |
||
---|---|---|
c++tools | ||
config | ||
contrib | ||
fixincludes | ||
gcc | ||
gnattools | ||
gotools | ||
include | ||
INSTALL | ||
intl | ||
libada | ||
libatomic | ||
libbacktrace | ||
libcc1 | ||
libcody | ||
libcpp | ||
libdecnumber | ||
libffi | ||
libgcc | ||
libgfortran | ||
libgo | ||
libgomp | ||
libiberty | ||
libitm | ||
libobjc | ||
liboffloadmic | ||
libphobos | ||
libquadmath | ||
libsanitizer | ||
libssp | ||
libstdc++-v3 | ||
libvtv | ||
lto-plugin | ||
maintainer-scripts | ||
zlib | ||
.dir-locals.el | ||
.gitattributes | ||
.gitignore | ||
ABOUT-NLS | ||
ar-lib | ||
ChangeLog | ||
ChangeLog.jit | ||
ChangeLog.tree-ssa | ||
compile | ||
config-ml.in | ||
config.guess | ||
config.rpath | ||
config.sub | ||
configure | ||
configure.ac | ||
COPYING | ||
COPYING3 | ||
COPYING3.LIB | ||
COPYING.LIB | ||
COPYING.RUNTIME | ||
depcomp | ||
install-sh | ||
libtool-ldflags | ||
libtool.m4 | ||
lt~obsolete.m4 | ||
ltgcc.m4 | ||
ltmain.sh | ||
ltoptions.m4 | ||
ltsugar.m4 | ||
ltversion.m4 | ||
MAINTAINERS | ||
Makefile.def | ||
Makefile.in | ||
Makefile.tpl | ||
missing | ||
mkdep | ||
mkinstalldirs | ||
move-if-change | ||
multilib.am | ||
README | ||
symlink-tree | ||
test-driver | ||
ylwrap |
This directory contains the GNU Compiler Collection (GCC). The GNU Compiler Collection is free software. See the files whose names start with COPYING for copying permission. The manuals, and some of the runtime libraries, are under different terms; see the individual source files for details. The directory INSTALL contains copies of the installation information as HTML and plain text. The source of this information is gcc/doc/install.texi. The installation information includes details of what is included in the GCC sources and what files GCC installs. See the file gcc/doc/gcc.texi (together with other files that it includes) for usage and porting information. An online readable version of the manual is in the files gcc/doc/gcc.info*. See http://gcc.gnu.org/bugs/ for how to report bugs usefully. Copyright years on GCC source files may be listed using range notation, e.g., 1987-2012, indicating that every year in the range, inclusive, is a copyrightable year that could otherwise be listed individually.