Replace VRP threader with a hybrid forward threader.

This patch implements the new hybrid forward threader and replaces the
embedded VRP threader with it.

With all the pieces that have gone in, the implementation of the hybrid
threader is straightforward: convert the current state into
SSA imports that the solver will understand, and let the path solver
precompute ranges and relations for the path.  After this setup is done,
we can use the range_query API to solve gimple statements in the threader.
The forward threader is now engine-agnostic, so there are no changes to
the threader per se.
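
As a rough illustration of the flow described above (collect the names
of interest, precompute their ranges along one candidate path, then ask
whether the final conditional folds to a constant), here is a toy,
self-contained C++17 sketch.  ToyRange, ToyEdge, ToyPathQuery and
fold_less_than are hypothetical names invented for this example; they
are not the GCC classes and only mimic the shape of the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy closed interval [lo, hi]; a stand-in for an integer range.
struct ToyRange {
  long lo = LONG_MIN;
  long hi = LONG_MAX;
  bool empty() const { return lo > hi; }
  void intersect(const ToyRange &o) {
    lo = std::max(lo, o.lo);
    hi = std::min(hi, o.hi);
  }
};

// Taking this edge of the candidate path implies NAME lies in RANGE.
struct ToyEdge {
  std::string name;
  ToyRange range;
};

// Hypothetical stand-in for a path solver: it only tracks the names
// listed as imports and narrows them with each edge on the path.
class ToyPathQuery {
public:
  void precompute(const std::vector<ToyEdge> &path,
                  const std::vector<std::string> &imports) {
    m_ranges.clear();
    for (const std::string &n : imports)
      m_ranges[n] = ToyRange{};          // start with "anything"
    for (const ToyEdge &e : path) {
      auto it = m_ranges.find(e.name);
      if (it != m_ranges.end())          // ignore names nobody imported
        it->second.intersect(e.range);
    }
  }

  // Fold "NAME < BOUND" on the path: true/false when the range decides
  // it, std::nullopt when it does not (or the path is infeasible).
  std::optional<bool> fold_less_than(const std::string &name,
                                     long bound) const {
    auto it = m_ranges.find(name);
    if (it == m_ranges.end() || it->second.empty())
      return std::nullopt;
    if (it->second.hi < bound)
      return true;
    if (it->second.lo >= bound)
      return false;
    return std::nullopt;
  }

private:
  std::map<std::string, ToyRange> m_ranges;
};
```

When fold_less_than returns a value, the outgoing edge of the final
conditional is known on that path, which is what makes the path a
threading candidate; std::nullopt means the path tells us nothing.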

I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP,
because they will also be used in the evrp removal of the DOM/threader,
which is my next task.

Most of the patch is actually test changes.  I have gone through every
single one and verified that we're correct.  Most were trivial dump
file name changes, but others required going through the IL and
certifying that the different IL was expected.

For example, in pr59597.c, we have one fewer thread because the
ASSERT_EXPR was getting in the way and making it seem like the path was
not crossing loops.  The hybrid threader sees the correct representation
of the IL, and avoids threading this one case.

The final numbers are a 12.16% improvement in jump threads immediately
after VRP, and a 0.82% improvement in overall jump threads.  The
performance drop is 0.6% (plus the 1.43% hit from moving the embedded
threader into its own pass).  As I've said, I'd prefer to keep the
threader in its own pass, but if this is an issue, we can address this
with a shared ranger when VRP is replaced with an evrp instance
(upcoming).

Note that these numbers are slightly different from what I originally
posted.  A few correctness tweaks, plus restricting loop threads, made
the difference.  That being said, I was aiming for par.  A 12% gain is
just gravy ;-).  When we merge the threaders, we should see even better
numbers, and we'll have the benefit of an entire release stress testing
the solver.

As I mentioned in my introductory note, paths ending in a MEM_REF
conditional are not handled yet.  In practice, this didn't make a
difference, as such paths are so rare.  However, as a follow-up, I will
distill a test and add a suitable PR to keep us honest.

There is a one-line change to libgomp/team.c silencing a new
used-uninitialized warning.  As my previous work with the threaders has
shown, warnings flare up after each improvement to jump threading.  I
expect this to be no different.  I've promised Jakub to investigate
fully, so I will analyze and add the appropriate PR for the warning
experts.
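
For readers unfamiliar with this class of warning, the following is a
hypothetical C++ reduction of the general pattern, not the libgomp
code.  A pointer is assigned only under a condition and used only under
the same condition; the explicit initializer plays the role of the
"= NULL" added in team.c.  Whether a given compiler version would warn
without the initializer is an assumption, not something verified here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reduction; the names are invented for illustration.
int sum_when_enabled(bool enabled, const int *values, std::size_t n) {
  // Without this initializer, a path-sensitive analysis may fail to
  // prove that every use below is guarded by the same predicate as
  // the assignment, and report a possibly uninitialized use.
  const int *cursor = nullptr;
  if (enabled)
    cursor = values;

  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    if (enabled)            // same predicate that guards the assignment
      total += cursor[i];
  return total;
}
```

The initializer does not change behavior on any executed path; it only
gives the variable a defined value on paths where it is never read.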

Oh yeah, the new pass dump is called vrp-thread[12] to match each
VRP[12] pass.  However, there's no reason for it to either be named
vrp-thread, or for it to live in tree-vrp.c.
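
To look at the new dump, the adjusted tests in this patch pass options
such as -fdump-tree-vrp-thread1-details and scan the "vrp-thread1"
dump.  The function below is a hypothetical candidate with two
correlated conditionals; whether this exact function is threaded by the
new pass (rather than by an earlier threader, or not at all) is an
assumption and has not been checked.

```cpp
#include <cassert>

// Compile with, e.g., -O2 -fdump-tree-vrp-thread1-details and inspect
// the vrp-thread1 dump.  The option spelling is taken from the tests
// in this patch; the dump contents for this function are unverified.
int classify(int x) {
  int is_small = 0;
  if (x < 10)
    is_small = 1;
  // This test is fully determined by the one above, so a path through
  // the first conditional already knows which way this one goes.
  if (is_small)
    return x + 1;
  return x - 1;
}
```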

Tested on x86-64 Linux.

OK?

p.s. "Did I say 5 weeks?  My bad, I meant 5 months."

gcc/ChangeLog:

	* passes.def (pass_vrp_threader): New.
	* tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader.
	* tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New.
	(hybrid_jt_simplifier::hybrid_jt_simplifier): New.
	(hybrid_jt_simplifier::simplify): New.
	(hybrid_jt_simplifier::compute_ranges_from_state): New.
	* tree-ssa-threadedge.h (class hybrid_jt_state): New.
	(class hybrid_jt_simplifier): New.
	* tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump
	threader.
	(class hybrid_threader): New.
	(hybrid_threader::hybrid_threader): New.
	(hybrid_threader::~hybrid_threader): New.
	(hybrid_threader::before_dom_children): New.
	(hybrid_threader::after_dom_children): New.
	(execute_vrp_threader): New.
	(class pass_vrp_threader): New.
	(make_pass_vrp_threader): New.

libgomp/ChangeLog:

	* team.c: Initialize start_data.
	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
	* testsuite/libgomp.graphite/force-parallel-8.c: Adjust.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr55107.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust.
	* gcc.dg/tree-ssa/pr21559.c: Adjust.
	* gcc.dg/tree-ssa/pr59597.c: Adjust.
	* gcc.dg/tree-ssa/pr61839_1.c: Adjust.
	* gcc.dg/tree-ssa/pr61839_3.c: Adjust.
	* gcc.dg/tree-ssa/pr71437.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust.
	* gcc.dg/tree-ssa/ssa-thread-14.c: Adjust.
	* gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust.
	* gcc.dg/tree-ssa/vrp106.c: Adjust.
	* gcc.dg/tree-ssa/vrp55.c: Adjust.
Aldy Hernandez 2021-09-21 10:27:53 +02:00
parent dd11aab646
commit 0288527f47
27 changed files with 270 additions and 63 deletions

View File

@@ -212,6 +212,7 @@ along with GCC; see the file COPYING3. If not see
 NEXT_PASS (pass_merge_phi);
 NEXT_PASS (pass_thread_jumps);
 NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
+NEXT_PASS (pass_vrp_threader);
 NEXT_PASS (pass_dse);
 NEXT_PASS (pass_dce);
 /* pass_stdarg is always run and at this point we execute
@@ -337,6 +338,7 @@ along with GCC; see the file COPYING3. If not see
 NEXT_PASS (pass_strlen);
 NEXT_PASS (pass_thread_jumps);
 NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
+NEXT_PASS (pass_vrp_threader);
 /* Threading can leave many const/copy propagations in the IL.
    Clean them up. Instead of just copy_prop, we use ccp to
    compute alignment and nonzero bits. */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-fno-split-loops" } */
+/* { dg-additional-options "-fno-split-loops -w" } */
 typedef unsigned short uint16_t;

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+/* { dg-options "-Ofast -fdump-tree-vrp-thread1" } */
 void g (int);
 void g1 (int);
@@ -27,4 +27,4 @@ f (long a, long b, long c, long d, long x)
 g (a);
 }
-/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+/* { dg-options "-Ofast -fdump-tree-vrp-thread1" } */
 void g (void);
 void g1 (void);
@@ -20,4 +20,4 @@ f (long a, long b, long c, long d, int x)
 }
 }
-/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+/* { dg-options "-Ofast -fdump-tree-vrp-thread1" } */
 void g (void);
 void g1 (void);
@@ -22,4 +22,4 @@ f (long a, long b, long c, long d, int x)
 }
 }
-/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+/* { dg-options "-Ofast -fdump-tree-vrp-thread1" } */
 void g (int);
 void g1 (int);
@@ -37,4 +37,4 @@ f (long a, long b, long c, long d, int x)
 g (c + d);
 }
-/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-evrp-details -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details -fdump-tree-vrp-thread1-details" } */
 static int blocksize = 4096;
@@ -39,6 +39,6 @@ void foo (void)
    statement. We also realize that the final bytes == 0 test is useless,
    and thread over it. We also know that toread != 0 is useless when
    entering while loop and thread over it. */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 3 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 3 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-vrp1-details" } */
+/* { dg-options "-Ofast -fdump-tree-vrp-thread1-details" } */
 typedef unsigned short u16;
 typedef unsigned char u8;
@@ -56,6 +56,11 @@ main (int argc, char argv[])
 return crc;
 }
-/* { dg-final { scan-tree-dump-times "Registering jump thread" 3 "vrp1" } } */
-/* { dg-final { scan-tree-dump-not "joiner" "vrp1" } } */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 3 "vrp1" } } */
+/* Previously we had 3 jump threads, but one of them crossed loops.
+   The reason the old threader was allowing it, was because there was
+   an ASSERT_EXPR getting in the way. Without the ASSERT_EXPR, we
+   have an empty pre-header block as the final block in the thread,
+   which the threader will simply join with the next block which *is*
+   in a different loop. */
+/* { dg-final { scan-tree-dump-times "Registering jump thread" 2 "vrp-thread1" } } */
+/* { dg-final { scan-tree-dump-not "joiner" "vrp-thread1" } } */

View File

@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839. */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1 -fdisable-tree-evrp -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 /* { dg-require-effective-target int32plus } */
 __attribute__ ((noinline))
@@ -38,7 +38,11 @@ int main ()
 }
 /* Scan for c = 972195717) >> [0, 1] in function foo. */
-/* { dg-final { scan-tree-dump-times "486097858 : 972195717" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "486097858 : 972195717" 1 "vrp-thread1" } } */
+/* Previously we were checking for two ?: with constant PHI arguments,
+   but now we collapse them into one. */
 /* Scan for c = 972195717) >> [2, 3] in function bar. */
-/* { dg-final { scan-tree-dump-times "243048929 : 121524464" 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "243048929 : 121524464" 1 "vrp-thread1" } } */
 /* { dg-final { scan-tree-dump-times "486097858" 0 "optimized" } } */

View File

@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839. */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1 -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 __attribute__ ((noinline))
 int foo (int a, unsigned b)
@@ -22,5 +22,5 @@ int main ()
 }
 /* Scan for c [12, 13] << 8 in function foo. */
-/* { dg-final { scan-tree-dump-times "3072 : 3328" 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "3072 : 3328" 1 "vrp-thread1" } } */
 /* { dg-final { scan-tree-dump-times "3072" 0 "optimized" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-ffast-math -O3 -fdump-tree-vrp1-details" } */
+/* { dg-options "-ffast-math -O3 -fdump-tree-vrp-thread1-details" } */
 int I = 50, J = 50;
 int S, L;
@@ -39,4 +39,4 @@ void foo (int K)
 bar (LD, SD);
 }
 }
-/* { dg-final { scan-tree-dump-times "Threaded jump " 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump " 2 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1 -fdisable-tree-thread1 -fdisable-tree-thread2" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1 -fdisable-tree-thread1 -fdisable-tree-thread2 -fdisable-tree-vrp-thread1 " } */
 static int *bb_ticks;
 extern void frob (void);

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dom2-details -w --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details -w --param logical-op-non-short-circuit=1 -fdisable-tree-vrp-thread1" } */
 unsigned char
 validate_subreg (unsigned int offset, unsigned int isize, unsigned int osize, int zz, int qq)
 {

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-thread1-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
 #include "ssa-dom-thread-4.c"
@@ -24,4 +24,4 @@
 /* There used to be 6 jump threads found by thread1, but they all
    depended on threading through distinct loops in ethread. */
-/* { dg-final { scan-tree-dump-times "Threaded" 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded" 2 "vrp-thread1" } } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-stats -fdump-tree-dom2-stats" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1-stats -fdump-tree-dom2-stats" } */
 void bla();
@@ -16,6 +16,6 @@ void thread_entry_through_header (void)
 /* There's a single jump thread that should be handled by the VRP
    jump threading pass. */
-/* { dg-final { scan-tree-dump-times "Jumps threaded: 1" 1 "vrp1"} } */
-/* { dg-final { scan-tree-dump-times "Jumps threaded: 2" 0 "vrp1"} } */
+/* { dg-final { scan-tree-dump-times "Jumps threaded: 1" 1 "vrp-thread1"} } */
+/* { dg-final { scan-tree-dump-times "Jumps threaded: 2" 0 "vrp-thread1"} } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom2"} } */

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-dom2-details -std=gnu89 --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1-details -fdump-tree-dom2-details -std=gnu89 --param logical-op-non-short-circuit=1" } */
 struct bitmap_head_def;
 typedef struct bitmap_head_def *bitmap;
 typedef const struct bitmap_head_def *const_bitmap;
@@ -58,4 +58,5 @@ bitmap_ior_and_compl (bitmap dst, const_bitmap a, const_bitmap b,
    code we missed the edge when the first conditional is false
    (b_elt is zero, which means the second conditional is always
    zero. VRP1 catches all three. */
-/* { dg-final { scan-tree-dump-times "Threaded" 3 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Registering jump thread" 2 "vrp-thread1" } } */
+/* { dg-final { scan-tree-dump-times "Path crosses loops" 1 "vrp-thread1" } } */

View File

@@ -1,7 +1,7 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-O2 -fdump-tree-vrp-details --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-O2 -fdump-tree-vrp-thread1-details --param logical-op-non-short-circuit=1" } */
 /* { dg-additional-options "-fdisable-tree-thread1" } */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 8 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 8 "vrp-thread1" } } */
 void foo (void);
 void bar (void);

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details -fdelete-null-pointer-checks" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1-details -fdelete-null-pointer-checks" } */
 /* { dg-skip-if "" keeps_null_pointer_checks } */
 void oof (void);
@@ -29,5 +29,5 @@ build_omp_regions_1 (basic_block bb, struct omp_region *parent,
 /* ARM Cortex-M defined LOGICAL_OP_NON_SHORT_CIRCUIT to false,
    so skip below test. */
-/* { dg-final { scan-tree-dump-times "Threaded" 1 "vrp1" { target { ! arm_cortex_m } } } } */
+/* { dg-final { scan-tree-dump-times "Threaded" 1 "vrp-thread1" { target { ! arm_cortex_m } } } } */

View File

@@ -1,6 +1,6 @@
 /* PR tree-optimization/18046 */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1-details" } */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 1 "vrp-thread1" } } */
 /* During VRP we expect to thread the true arm of the conditional through the switch
    and to the BB that corresponds to the 7 ... 9 case label. */
 extern void foo (void);

View File

@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-blocks-vops-details -fdelete-null-pointer-checks" } */
+/* { dg-options "-O2 -fdump-tree-vrp-thread1-blocks-vops-details -fdelete-null-pointer-checks" } */
 void arf (void);
@@ -12,6 +12,6 @@ fu (char *p, int x)
 arf ();
 }
-/* { dg-final { scan-tree-dump-times "Threaded jump" 1 "vrp1" { target { ! keeps_null_pointer_checks } } } } */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 0 "vrp1" { target { keeps_null_pointer_checks } } } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 1 "vrp-thread1" { target { ! keeps_null_pointer_checks } } } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 0 "vrp-thread1" { target { keeps_null_pointer_checks } } } } */

View File

@@ -462,6 +462,7 @@ extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_early_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_vrp_threader (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_reassoc (gcc::context *ctxt);

View File

@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see
 #include "vr-values.h"
 #include "gimple-ssa-evrp-analyze.h"
 #include "gimple-range.h"
+#include "gimple-range-path.h"
 /* To avoid code explosion due to jump threading, we limit the
    number of statements we are going to copy. This variable
@@ -1397,3 +1398,73 @@ jt_state::register_equivs_stmt (gimple *stmt, basic_block bb,
   register_equiv (gimple_get_lhs (stmt), cached_lhs,
                   /*update_range=*/false);
 }
+
+// Hybrid threader implementation.
+
+void
+hybrid_jt_state::register_equivs_stmt (gimple *, basic_block, jt_simplifier *)
+{
+  // Ranger has no need to simplify anything to improve equivalences.
+}
+
+hybrid_jt_simplifier::hybrid_jt_simplifier (gimple_ranger *r,
+                                            path_range_query *q)
+{
+  m_ranger = r;
+  m_query = q;
+}
+
+tree
+hybrid_jt_simplifier::simplify (gimple *stmt, gimple *, basic_block,
+                                jt_state *state)
+{
+  int_range_max r;
+
+  compute_ranges_from_state (stmt, state);
+
+  if (gimple_code (stmt) == GIMPLE_COND
+      || gimple_code (stmt) == GIMPLE_ASSIGN)
+    {
+      tree ret;
+      if (m_query->range_of_stmt (r, stmt) && r.singleton_p (&ret))
+        return ret;
+    }
+  else if (gimple_code (stmt) == GIMPLE_SWITCH)
+    {
+      gswitch *switch_stmt = dyn_cast <gswitch *> (stmt);
+      tree index = gimple_switch_index (switch_stmt);
+      if (m_query->range_of_expr (r, index, stmt))
+        return find_case_label_range (switch_stmt, &r);
+    }
+  return NULL;
+}
+
+// Use STATE to generate the list of imports needed for the solver,
+// and calculate the ranges along the path.
+
+void
+hybrid_jt_simplifier::compute_ranges_from_state (gimple *stmt, jt_state *state)
+{
+  auto_bitmap imports;
+  gori_compute &gori = m_ranger->gori ();
+
+  state->get_path (m_path);
+
+  // Start with the imports to the final conditional.
+  bitmap_copy (imports, gori.imports (m_path[0]));
+
+  // Add any other interesting operands we may have missed.
+  if (gimple_bb (stmt) != m_path[0])
+    {
+      for (unsigned i = 0; i < gimple_num_ops (stmt); ++i)
+        {
+          tree op = gimple_op (stmt, i);
+          if (op
+              && TREE_CODE (op) == SSA_NAME
+              && irange::supports_type_p (TREE_TYPE (op)))
+            bitmap_set_bit (imports, SSA_NAME_VERSION (op));
+        }
+    }
+  m_query->precompute_ranges (m_path, imports);
+}

View File

@@ -53,6 +53,26 @@ public:
   virtual tree simplify (gimple *, gimple *, basic_block, jt_state *) = 0;
 };
+
+class hybrid_jt_state : public jt_state
+{
+private:
+  void register_equivs_stmt (gimple *, basic_block, jt_simplifier *) override;
+};
+
+class hybrid_jt_simplifier : public jt_simplifier
+{
+public:
+  hybrid_jt_simplifier (class gimple_ranger *r, class path_range_query *q);
+
+private:
+  tree simplify (gimple *stmt, gimple *, basic_block, jt_state *) override;
+  void compute_ranges_from_state (gimple *stmt, jt_state *);
+
+  gimple_ranger *m_ranger;
+  path_range_query *m_query;
+  auto_vec<basic_block> m_path;
+};
+
 // This is the high level threader. The entry point is
 // thread_outgoing_edges(), which calculates and registers paths to be
 // threaded. When all candidates have been registered,

View File

@@ -66,6 +66,8 @@ along with GCC; see the file COPYING3. If not see
 #include "range-op.h"
 #include "value-range-equiv.h"
 #include "gimple-array-bounds.h"
+#include "gimple-range.h"
+#include "gimple-range-path.h"
 #include "tree-ssa-dom.h"
 /* Set of SSA names found live during the RPO traversal of the function
@@ -4591,11 +4593,6 @@ execute_vrp (struct function *fun, bool warn_array_bounds_p)
       array_checker.check ();
     }
-  /* We must identify jump threading opportunities before we release
-     the datastructures built by VRP. */
-  vrp_jump_threader threader (fun, &vrp_vr_values);
-  threader.thread_jumps ();
   simplify_casted_conds (fun, &vrp_vr_values);
   free_numbers_of_iterations_estimates (fun);
@@ -4605,21 +4602,6 @@ execute_vrp (struct function *fun, bool warn_array_bounds_p)
      does not properly handle ASSERT_EXPRs. */
   assert_engine.remove_range_assertions ();
-  /* If we exposed any new variables, go ahead and put them into
-     SSA form now, before we handle jump threading. This simplifies
-     interactions between rewriting of _DECL nodes into SSA form
-     and rewriting SSA_NAME nodes into SSA form after block
-     duplication and CFG manipulation. */
-  update_ssa (TODO_update_ssa);
-  /* We identified all the jump threading opportunities earlier, but could
-     not transform the CFG at that time. This routine transforms the
-     CFG and arranges for the dominator tree to be rebuilt if necessary.
-     Note the SSA graph update will occur during the normal TODO
-     processing by the pass manager. */
-  threader.thread_through_all_blocks ();
   scev_finalize ();
   loop_optimizer_finalize ();
   return 0;
@@ -4669,3 +4651,124 @@ make_pass_vrp (gcc::context *ctxt)
 {
   return new pass_vrp (ctxt);
 }
+
+// This is the dom walker for the hybrid threader. The reason this is
+// here, as opposed to the generic threading files, is because the
+// other client would be DOM, and they have their own custom walker.
+
+class hybrid_threader : public dom_walker
+{
+public:
+  hybrid_threader ();
+  ~hybrid_threader ();
+
+  void thread_jumps (function *fun)
+  {
+    walk (fun->cfg->x_entry_block_ptr);
+  }
+  void thread_through_all_blocks ()
+  {
+    m_threader->thread_through_all_blocks (false);
+  }
+
+private:
+  edge before_dom_children (basic_block) override;
+  void after_dom_children (basic_block bb) override;
+
+  hybrid_jt_simplifier *m_simplifier;
+  jump_threader *m_threader;
+  jt_state *m_state;
+  gimple_ranger *m_ranger;
+  path_range_query *m_query;
+};
+
+hybrid_threader::hybrid_threader () : dom_walker (CDI_DOMINATORS, REACHABLE_BLOCKS)
+{
+  loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
+  scev_initialize ();
+  calculate_dominance_info (CDI_DOMINATORS);
+  mark_dfs_back_edges ();
+
+  m_ranger = new gimple_ranger;
+  m_query = new path_range_query (*m_ranger, /*resolve=*/true);
+  m_simplifier = new hybrid_jt_simplifier (m_ranger, m_query);
+  m_state = new hybrid_jt_state;
+  m_threader = new jump_threader (m_simplifier, m_state);
+}
+
+hybrid_threader::~hybrid_threader ()
+{
+  delete m_simplifier;
+  delete m_threader;
+  delete m_state;
+  delete m_ranger;
+
+  scev_finalize ();
+  loop_optimizer_finalize ();
+}
+
+edge
+hybrid_threader::before_dom_children (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  int_range<2> r;
+
+  for (gsi = gsi_start_nondebug_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      m_ranger->range_of_stmt (r, stmt);
+    }
+  return NULL;
+}
+
+void
+hybrid_threader::after_dom_children (basic_block bb)
+{
+  m_threader->thread_outgoing_edges (bb);
+}
+
+static unsigned int
+execute_vrp_threader (function *fun)
+{
+  hybrid_threader threader;
+  threader.thread_jumps (fun);
+  threader.thread_through_all_blocks ();
+  return 0;
+}
+
+namespace {
+
+const pass_data pass_data_vrp_threader =
+{
+  GIMPLE_PASS, /* type */
+  "vrp-thread", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_VRP, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  ( TODO_cleanup_cfg | TODO_update_ssa ), /* todo_flags_finish */
+};
+
+class pass_vrp_threader : public gimple_opt_pass
+{
+public:
+  pass_vrp_threader (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_vrp_threader, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_vrp_threader (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_vrp != 0; }
+  virtual unsigned int execute (function *fun)
+    { return execute_vrp_threader (fun); }
+};
+
+} // namespace {
+
+gimple_opt_pass *
+make_pass_vrp_threader (gcc::context *ctxt)
+{
+  return new pass_vrp_threader (ctxt);
+}

View File

@@ -312,7 +312,7 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
                  unsigned flags, struct gomp_team *team,
                  struct gomp_taskgroup *taskgroup)
 {
-  struct gomp_thread_start_data *start_data;
+  struct gomp_thread_start_data *start_data = NULL;
   struct gomp_thread *thr, *nthr;
   struct gomp_task *task;
   struct gomp_task_icv *icv;

View File

@@ -1,5 +1,5 @@
 /* Autopar with IF conditions. */
-/* { dg-additional-options "-fdisable-tree-thread1" } */
+/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1" } */
 void abort();

View File

@@ -1,4 +1,4 @@
-/* { dg-additional-options "-fdisable-tree-thread1" } */
+/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1" } */
 #define N 1500