mirror of
https://gcc.gnu.org/git/gcc.git
synced 2024-12-04 01:04:26 +08:00
Remove obsolete file, whose contents has moved to the web pages as
projects.html. From-SVN: r27552
This commit is contained in:
parent
afe958103c
commit
96559bc552
431
gcc/PROJECTS
431
gcc/PROJECTS
@ -1,431 +0,0 @@
|
||||
Haifa scheduler (haifa-sched.c, loop.[ch], unroll.[ch], genattrtab.c):
|
||||
(contact law@cygnus.com before starting any serious haifa work)
|
||||
|
||||
* Fix all the formatting problems. Simple, mindless work.
|
||||
|
||||
* Fix/add comments throughout the code. Many of the comments are from
|
||||
the old scheduler and are out of date and misleading. Many new hunks
|
||||
of code don't have sufficient comments and documentation. Those which
|
||||
do have comments need to be rewritten to use complete sentences and
|
||||
proper formatting.
|
||||
|
||||
* Someone needs make one (or more) passes over the scheduler as a whole to
|
||||
just clean it up. Try to move the machine dependent bits into the target
|
||||
files where they belong, avoid re-creating functions where or near
|
||||
equivalents already exist (ie is_conditional_branch and friends), etc., etc.
|
||||
|
||||
* Document the new scheduling options. Remove those options which are
|
||||
not really useful (like reverse scheduling for example). In general
|
||||
the haifa scheduler adds _way_ too many options. I'm definitely of the
|
||||
opinion that gcc already has too many -foptions, and haifa doesn't help
|
||||
that situation.
|
||||
|
||||
* Testing and benchmarking. We've converted a few ports to using the
|
||||
Haifa scheduler (hppa, sparc, ppc, alpha). We need to continue testing
|
||||
and benchmarking the new scheduler on additional targets.
|
||||
|
||||
We need to have some kind of docs for how to best describe a machine to
|
||||
the haifa scheduler to get good performance. Some existing ports have
|
||||
been tuned to deal with the old scheduler -- they may need to be tuned
|
||||
to generate good schedules with haifa.
|
||||
|
||||
|
||||
|
||||
Improvements to global cse and partial redundancy elimination:
|
||||
|
||||
The current implementation of global cse uses a lazy code motion algorithm
|
||||
from Muchnick's "Advanced Compiler Design and Implementation".
|
||||
|
||||
Longer term we want to convert to an edge based LCM algorithm using the
|
||||
new structures defined by flow.c. This allows for better expression
|
||||
placement and provides edge splitting "for free".
|
||||
|
||||
lcm also provides the underlying framework for several additional optimizations
|
||||
such as shrink wrapping, spill code motion, dead store elimination, and generic
|
||||
load/store motion (all the other examples are subcases of load/store motion).
|
||||
|
||||
It can probably also be used to improve the reg-stack pass of the compiler.
|
||||
|
||||
Contact law@cygnus.com if you're interested in working on lazy code motion.
|
||||
|
||||
-------------
|
||||
|
||||
The old PROJECTS file. Stuff I know has been done has been deleted.
|
||||
Stuff in progress has a contact name associated with it.
|
||||
has been
|
||||
|
||||
1. Better optimization.
|
||||
|
||||
* Constants in unused inline functions
|
||||
|
||||
It would be nice to delay output of string constants so that string
|
||||
constants mentioned in unused inline functions are never generated.
|
||||
Perhaps this would also take care of string constants in dead code.
|
||||
|
||||
The difficulty is in finding a clean way for the RTL which refers
|
||||
to the constant (currently, only by an assembler symbol name)
|
||||
to point to the constant and cause it to be output.
|
||||
|
||||
* Optimize a sequence of if statements whose conditions are exclusive.
|
||||
|
||||
It is possible to optimize
|
||||
|
||||
if (x == 1) ...;
|
||||
if (x == 2) ...;
|
||||
if (x == 3) ...;
|
||||
|
||||
into
|
||||
|
||||
if (x == 1) ...;
|
||||
else if (x == 2) ...;
|
||||
else if (x == 3) ...;
|
||||
|
||||
provided that x is not altered by the contents of the if statements.
|
||||
|
||||
It's not certain whether this is worth doing. Perhaps programmers
|
||||
nearly always write the else's themselves, leaving few opportunities
|
||||
to improve anything.
|
||||
|
||||
* Un-cse.
|
||||
|
||||
Perhaps we should have an un-cse step right after cse, which tries to
|
||||
replace a reg with its value if the value can be substituted for the
|
||||
reg everywhere, if that looks like an improvement. Which is if the
|
||||
reg is used only a few times. Use rtx_cost to determine if the
|
||||
change is really an improvement.
|
||||
|
||||
* Clean up how cse works.
|
||||
|
||||
The scheme is that each value has just one hash entry. The
|
||||
first_same_value and next_same_value chains are no longer needed.
|
||||
|
||||
For arithmetic, each hash table elt has the following slots:
|
||||
|
||||
* Operation. This is an rtx code.
|
||||
* Mode.
|
||||
* Operands 0, 1 and 2. These point to other hash table elements.
|
||||
|
||||
So, if we want to enter (PLUS:SI (REG:SI 30) (CONST_INT 104)), we
|
||||
first enter (CONST_INT 104) and find the entry that (REG:SI 30) now
|
||||
points to. Then we put these elts into operands 0 and 1 of a new elt.
|
||||
We put PLUS and SI into the new elt.
|
||||
|
||||
Registers and mem refs would never be entered into the table as such.
|
||||
However, the values they contain would be entered. There would be a
|
||||
table indexed by regno which points at the hash entry for the value in
|
||||
that reg.
|
||||
|
||||
The hash entry index now plays the role of a qty number.
|
||||
We still need qty_first_reg, reg_next_eqv, etc. to record which regs
|
||||
share a particular qty.
|
||||
|
||||
When a reg is used whose contents are unknown, we need to create a
|
||||
hash table entry whose contents say "unknown", as a place holder for
|
||||
whatever the reg contains. If that reg is added to something, then
|
||||
the hash entry for the sum will refer to the "unknown" entry. Use
|
||||
UNKNOWN for the rtx code in this entry. This replaces make_new_qty.
|
||||
|
||||
For a constant, a unique hash entry would be made based on the
|
||||
value of the constant.
|
||||
|
||||
What about MEM? Each time a memory address is referenced, we need a
|
||||
qty (a hash table elt) to represent what is in it. (Just as for a
|
||||
register.) If this isn't known, create one, just as for a reg whose
|
||||
contents are unknown.
|
||||
|
||||
We need a way to find all mem refs that still contain a certain value.
|
||||
Do this with a chain of hash elts (for memory addresses) that point to
|
||||
locations that hold the value. The hash elt for the value itself should
|
||||
point to the start of the chain. It would be good for the hash elt
|
||||
for an address to point to the hash elt for the contents of that address
|
||||
(but this ptr can be null if the contents have never been entered).
|
||||
|
||||
With this data structure, nothing need ever be invalidated except
|
||||
the lists of which regs or mems hold a particular value. It is easy
|
||||
to see if there is a reg or mem that is equiv to a particular value.
|
||||
If the value is constant, it is always explicitly constant.
|
||||
|
||||
* Support more general tail-recursion among different functions.
|
||||
|
||||
This might be possible under certain circumstances, such as when
|
||||
the argument lists of the functions have the same lengths.
|
||||
Perhaps it could be done with a special declaration.
|
||||
|
||||
You would need to verify in the calling function that it does not
|
||||
use the addresses of any local variables and does not use setjmp.
|
||||
|
||||
* Put short statics vars at low addresses and use short addressing mode?
|
||||
|
||||
Useful on the 68000/68020 and perhaps on the 32000 series,
|
||||
provided one has a linker that works with the feature.
|
||||
This is said to make a 15% speedup on the 68000.
|
||||
|
||||
* Keep global variables in registers.
|
||||
|
||||
Here is a scheme for doing this. A global variable, or a local variable
|
||||
whose address is taken, can be kept in a register for an entire function
|
||||
if it does not use non-constant memory addresses and (for globals only)
|
||||
does not call other functions. If the entire function does not meet
|
||||
this criterion, a loop may.
|
||||
|
||||
The VAR_DECL for such a variable would have to have two RTL expressions:
|
||||
the true home in memory, and the pseudo-register used temporarily.
|
||||
It is necessary to emit insns to copy the memory location into the
|
||||
pseudo-register at the beginning of the function or loop, and perhaps
|
||||
back out at the end. These insns should have REG_EQUIV notes so that,
|
||||
if the pseudo-register does not get a hard register, it is spilled into
|
||||
the memory location which exists in any case.
|
||||
|
||||
The easiest way to set up these insns is to modify the routine
|
||||
put_var_into_stack so that it does not apply to the entire function
|
||||
(sparing any loops which contain nothing dangerous) and to call it at
|
||||
the end of the function regardless of where in the function the
|
||||
address of a local variable is taken. It would be called
|
||||
unconditionally at the end of the function for all relevant global
|
||||
variables.
|
||||
|
||||
For debugger output, the thing to do is to invent a new binding level
|
||||
around the appropriate loop and define the variable name as a register
|
||||
variable with that scope.
|
||||
|
||||
* Live-range splitting.
|
||||
|
||||
Currently a variable is allocated a hard register either for the full
|
||||
extent of its use or not at all. Sometimes it would be good to
|
||||
allocate a variable a hard register for just part of a function; for
|
||||
example, through a particular loop where the variable is mostly used,
|
||||
or outside of a particular loop where the variable is not used. (The
|
||||
latter is nice because it might let the variable be in a register most
|
||||
of the time even though the loop needs all the registers.)
|
||||
|
||||
Contact meissner@cygnus.com before starting any work on live range
|
||||
splitting.
|
||||
|
||||
* Detect dead stores into memory?
|
||||
|
||||
A store into memory is dead if it is followed by another store into
|
||||
the same location; and, in between, there is no reference to anything
|
||||
that might be that location (including no reference to a variable
|
||||
address).
|
||||
|
||||
This can be modeled as a partial redundancy elimination/lazy code motion
|
||||
problem. Contact law@cygnus.com before working on dead store elimination
|
||||
optimizations.
|
||||
|
||||
* Loop optimization.
|
||||
|
||||
Strength reduction and iteration variable elimination could be
|
||||
smarter. They should know how to decide which iteration variables are
|
||||
not worth making explicit because they can be computed as part of an
|
||||
address calculation. Based on this information, they should decide
|
||||
when it is desirable to eliminate one iteration variable and create
|
||||
another in its place.
|
||||
|
||||
It should be possible to compute what the value of an iteration
|
||||
variable will be at the end of the loop, and eliminate the variable
|
||||
within the loop by computing that value at the loop end.
|
||||
|
||||
When a loop has a simple increment that adds 1,
|
||||
instead of jumping in after the increment,
|
||||
decrement the loop count and jump to the increment.
|
||||
This allows aob insns to be used.
|
||||
|
||||
* Using constraints on values.
|
||||
|
||||
Many operations could be simplified based on knowledge of the
|
||||
minimum and maximum possible values of a register at any particular time.
|
||||
These limits could come from the data types in the tree, via rtl generation,
|
||||
or they can be deduced from operations that are performed. For example,
|
||||
the result of an `and' operation one of whose operands is 7 must be in
|
||||
the range 0 to 7. Compare instructions also tell something about the
|
||||
possible values of the operand, in the code beyond the test.
|
||||
|
||||
Value constraints can be used to determine the results of a further
|
||||
comparison. They can also indicate that certain `and' operations are
|
||||
redundant. Constraints might permit a decrement and branch
|
||||
instruction that checks zeroness to be used when the user has
|
||||
specified to exit if negative.
|
||||
|
||||
* Change the type of a variable.
|
||||
|
||||
Sometimes a variable is declared as `int', it is assigned only once
|
||||
from a value of type `char', and then it is used only by comparison
|
||||
against constants. On many machines, better code would result if
|
||||
the variable had type `char'. If the compiler could detect this
|
||||
case, it could change the declaration of the variable and change
|
||||
all the places that use it.
|
||||
|
||||
* Better handling for very sparse switches.
|
||||
|
||||
There may be cases where it would be better to compile a switch
|
||||
statement to use a fixed hash table rather than the current
|
||||
combination of jump tables and binary search.
|
||||
|
||||
* Order of subexpressions.
|
||||
|
||||
It might be possible to make better code by paying attention
|
||||
to the order in which to generate code for subexpressions of an expression.
|
||||
|
||||
* More code motion.
|
||||
|
||||
Consider hoisting common code up past conditional branches or tablejumps.
|
||||
|
||||
Contact law@cygnus.com before working on code hoisting.
|
||||
|
||||
* Trace scheduling.
|
||||
|
||||
This technique is said to be able to figure out which way a jump
|
||||
will usually go, and rearrange the code to make that path the
|
||||
faster one.
|
||||
|
||||
* Distributive law.
|
||||
|
||||
The C expression *(X + 4 * (Y + C)) compiles better on certain
|
||||
machines if rewritten as *(X + 4*C + 4*Y) because of known addressing
|
||||
modes. It may be tricky to determine when, and for which machines, to
|
||||
use each alternative.
|
||||
|
||||
Some work has been done on this, in combine.c.
|
||||
|
||||
* Can optimize by changing if (x) y; else z; into z; if (x) y;
|
||||
if z and x do not interfere and z has no effects not undone by y.
|
||||
This is desirable if z is faster than jumping.
|
||||
|
||||
* For a two-insn loop on the 68020, such as
|
||||
foo: movb a2@+,a3@+
|
||||
jne foo
|
||||
it is better to insert dbeq d0,foo before the jne.
|
||||
d0 can be a junk register. The challenge is to fit this into
|
||||
a portable framework: when can you detect this situation and
|
||||
still be able to allocate a junk register?
|
||||
|
||||
2. Simpler porting.
|
||||
|
||||
Right now, describing the target machine's instructions is done
|
||||
cleanly, but describing its addressing mode is done with several
|
||||
ad-hoc macro definitions. Porting would be much easier if there were
|
||||
an RTL description for addressing modes like that for instructions.
|
||||
Tools analogous to genflags and genrecog would generate macros from
|
||||
this description.
|
||||
|
||||
There would be one pattern in the address-description file for each
|
||||
kind of addressing, and this pattern would have:
|
||||
|
||||
* the RTL expression for the address
|
||||
* C code to verify its validity (since that may depend on
|
||||
the exact data).
|
||||
* C code to print the address in assembler language.
|
||||
* C code to convert the address into a valid one, if it is not valid.
|
||||
(This would replace LEGITIMIZE_ADDRESS).
|
||||
* Register constraints for all indeterminates that appear
|
||||
in the RTL expression.
|
||||
|
||||
3. Other languages.
|
||||
|
||||
Front ends for Pascal, Fortran, Algol, Cobol, Modula-2 and Ada are
|
||||
desirable.
|
||||
|
||||
Pascal, Modula-2 and Ada require the implementation of functions
|
||||
within functions. Some of the mechanisms for this already exist.
|
||||
|
||||
4. More extensions.
|
||||
|
||||
* Generated unique labels. Have some way of generating distinct labels
|
||||
for use in extended asm statements. I don't know what a good syntax would
|
||||
be.
|
||||
|
||||
* A way of defining a structure containing a union, in which the choice of
|
||||
union alternative is controlled by a previous structure component.
|
||||
|
||||
Here is a possible syntax for this.
|
||||
|
||||
struct foo {
|
||||
enum { INT, DOUBLE } code;
|
||||
auto union { case INT: int i; case DOUBLE: double d;} value : code;
|
||||
};
|
||||
|
||||
* Allow constructor expressions as lvalues, like this:
|
||||
|
||||
(struct foo) {a, b, c} = foo();
|
||||
|
||||
This would call foo, which returns a structure, and then store the
|
||||
several components of the structure into the variables a, b, and c.
|
||||
|
||||
5. Generalize the machine model.
|
||||
|
||||
* Some new compiler features may be needed to do a good job on machines
|
||||
where static data needs to be addressed using base registers.
|
||||
|
||||
* Some machines have two stacks in different areas of memory, one used
|
||||
for scalars and another for large objects. The compiler does not
|
||||
now have a way to understand this.
|
||||
|
||||
6. Useful warnings.
|
||||
|
||||
* Warn about statements that are undefined because the order of
|
||||
evaluation of increment operators makes a big difference. Here is an
|
||||
example:
|
||||
|
||||
*foo++ = hack (*foo);
|
||||
|
||||
7. Better documentation of how GCC works and how to port it.
|
||||
|
||||
Here is an outline proposed by Allan Adler.
|
||||
|
||||
I. Overview of this document
|
||||
II. The machines on which GCC is implemented
|
||||
A. Prose description of those characteristics of target machines and
|
||||
their operating systems which are pertinent to the implementation
|
||||
of GCC.
|
||||
i. target machine characteristics
|
||||
ii. comparison of this system of machine characteristics with
|
||||
other systems of machine specification currently in use
|
||||
B. Tables of the characteristics of the target machines on which
|
||||
GCC is implemented.
|
||||
C. A priori restrictions on the values of characteristics of target
|
||||
machines, with special reference to those parts of the source code
|
||||
which entail those restrictions
|
||||
i. restrictions on individual characteristics
|
||||
ii. restrictions involving relations between various characteristics
|
||||
D. The use of GCC as a cross-compiler
|
||||
i. cross-compilation to existing machines
|
||||
ii. cross-compilation to non-existent machines
|
||||
E. Assumptions which are made regarding the target machine
|
||||
i. assumptions regarding the architecture of the target machine
|
||||
ii. assumptions regarding the operating system of the target machine
|
||||
iii. assumptions regarding software resident on the target machine
|
||||
iv. where in the source code these assumptions are in effect made
|
||||
III. A systematic approach to writing the files tm.h and xm.h
|
||||
A. Macros which require special care or skill
|
||||
B. Examples, with special reference to the underlying reasoning
|
||||
IV. A systematic approach to writing the machine description file md
|
||||
A. Minimal viable sets of insn descriptions
|
||||
B. Examples, with special reference to the underlying reasoning
|
||||
V. Uses of the file aux-output.c
|
||||
VI. Specification of what constitutes correct performance of an
|
||||
implementation of GCC
|
||||
A. The components of GCC
|
||||
B. The itinerary of a C program through GCC
|
||||
C. A system of benchmark programs
|
||||
D. What your RTL and assembler should look like with these benchmarks
|
||||
E. Fine tuning for speed and size of compiled code
|
||||
VII. A systematic procedure for debugging an implementation of GCC
|
||||
A. Use of GDB
|
||||
i. the macros in the file .gdbinit for GCC
|
||||
ii. obstacles to the use of GDB
|
||||
a. functions implemented as macros can't be called in GDB
|
||||
B. Debugging without GDB
|
||||
i. How to turn off the normal operation of GCC and access specific
|
||||
parts of GCC
|
||||
C. Debugging tools
|
||||
D. Debugging the parser
|
||||
i. how machine macros and insn definitions affect the parser
|
||||
E. Debugging the recognizer
|
||||
i. how machine macros and insn definitions affect the recognizer
|
||||
|
||||
ditto for other components
|
||||
|
||||
VIII. Data types used by GCC, with special reference to restrictions not
|
||||
specified in the formal definition of the data type
|
||||
IX. References to the literature for the algorithms used in GCC
|
||||
|
Loading…
Reference in New Issue
Block a user