Report No. 172

A FAST TRANSLATION TECHNIQUE WHICH PARTIALLY OPTIMIZES THE OBJECT CODE

by

C. W. Gear

April 5, 1965

Department of Computer Science
University of Illinois
Urbana, Illinois

Abstract

A three-pass compiler is briefly described with the following properties: The last two passes scan an intermediate language produced by the preceding pass in essentially the reverse of the order in which it was generated, so that the first pass is the only one which has to read the relatively bulky problem oriented input. The double scan, one in each direction, performed by the first two passes allows the compiler to remove locally constant expressions and recursively calculable expressions from loops and to do the important part of common subexpression recognition. Optimization such as the effective use of index registers, although as important, is not discussed since the object code which would be most efficient is highly machine dependent. The discussion is in terms of a FORTRAN-like language, although the technique is applicable to most algebraic languages.

1. Introduction

Existing compilers tend to fall into two classes: those that are relatively fast, very often syntax directed, one or two pass programs with either an inefficient or an interpretive object, and those which spend a large amount of time optimizing the object for the particular machine. Descriptions of the latter do not often appear in print because of their highly machine dependent nature. (However, descriptions of the IBM FORTRAN II [4] and the KDF 9 [3] optimizing ALGOL compilers are available.) The arguments in favor of this situation are well known: code checks are run on the fast compiler and then an optimized object is obtained from the other for production. It has been the experience of this author that a large number of medium-sized jobs seldom remain unchanged; before each production run small changes are made at the source language level. Both because of this, and because the source language is a condensed, readily comprehensible representation of the job, it is highly desirable to seek a middle ground with some of the features of both worlds.

The purpose of this paper is to describe a method for the high-speed compilation of code which has been rearranged so as to optimize it in terms of the number of basic operations performed. This technique is machine independent. Machine-dependent features, such as index registers, finite length operand stacks, etc., are not discussed.

The object code of a compiler can be made more efficient in a number of ways. They include the location of locally constant expressions or expressions that can be calculated by recursion, the location of common subexpressions, and the rearranging of the calculation to facilitate either of these objectives, so that code can be removed from loops or duplicated coding can be eliminated.
Non-buffered input from auxiliary storage can also be made more efficient within the operating system by pre-calling the input package to load a buffer. For example, the FORTRAN statement READ TAPE N, i/o list will cause a wait by the main frame until tape N has been read into a buffer so that it can be copied to the i/o list. This wait can be reduced by calling for the buffer fill earlier.

In order to perform these types of optimization it is necessary to analyze all statements in a neighborhood of a given statement. This neighborhood is the extent of the loop or block of program from which it is desired to remove code. Thus most optimization procedures involve a considerable amount of rescanning, or the construction of very large tables which must be examined more or less at random. The requirements for high-speed compilation are that a minimum of scans of the entire data be used, and that it not be necessary to refer at random to sections of the program remote from the one being considered, or equivalently, to very large tables containing sections of the program. For high-speed computer systems now being designed, it is a further advantage to refer to memory in a sequential manner rather than in a highly data dependent manner, since a larger degree of memory-main frame overlap is then possible using the various look-ahead schemes.

Thus the requirements of high-speed compilation and efficient object code production are somewhat incompatible. Any solution is bound to reflect the assessment of the individual as to their relative importance. In this paper high-speed compilation is stressed; optimization is performed within this constraint. This paper will not deal with the techniques involved in the standard compilation process since they are both well known and machine dependent. It will only point out additions to the processes which achieve optimization.

2. An Outline of the Program

An example will clarify the type of optimization being sought. The following section of a FORTRAN program

      DIMENSION A(15,5)
      DO 1 I = M1,N1,L1
      DO 2 J = M2,N2,L2
    2 A(I,J) = A(I,J)*C*15.5*D*.5
    1 C = C + 1.

might compile on a fast "single-pass" compiler as the machine language equivalent of

            Move M1 to I
    LOOP1   Move M2 to J
    LOOP2   (J-1)*15 + I-1 + A -> T1
            (J-1)*15 + I-1 + A -> T2
            Load indirect from T2
            Multiply by C
            Multiply by 15.5
            Multiply by D
            Multiply by 0.5
            Store indirect into T1
            J + L2 -> J
            Subtract N2 and transfer to LOOP2 if result <= 0
            C + 1. -> C
            I + L1 -> I
            Subtract N1 and transfer to LOOP1 if result <= 0

where T1 and T2 are temporary storage cells, and A is the memory cell containing A(1,1). A fairly efficient object version is given in the form below.

            L1 -> X4
            A - 16 + 15*M2 + M1 -> X3
            L2*15 -> X2
            D*7.75 -> T2
            M1 -> I
            Transfer to LOOP1+1
    LOOP1   X3 + X4 -> X3
            X3 -> X1
            T2*C -> T1
            M2 -> J
            Transfer to LOOP2+1
    LOOP2   X1 + X2 -> X1
            Load indirect from X1
            Multiply by T1
            Store indirect in X1
            J + L2 -> J
            Subtract N2 and transfer to LOOP2 if result <= 0
            C + 1. -> C
            I + L1 -> I
            Subtract N1 and transfer to LOOP1 if result <= 0

The basic steps that have been taken in this example are:

a) The locally constant expression C*15.5*D*.5 was removed from the inner loop since C and D are not changed in that loop. Since C, but not D, is changed in the outer loop, the 15.5*D*.5 part can also be removed from that loop.

b) The expression (J-1)*15 + I-1 + A is recognized to be recursively calculable in the inner loop by starting at (M2-1)*15 + I-1 + A and adding increments of 15*L2.
Then the starting expression is also recognized as recursively calculable in the outer loop.

c) The expression 15.5*D*.5 is rearranged to get D*7.75.

d) The expression (J-1)*15 + I-1 + A is recognized to occur twice.

The first two of these steps may only be taken if the variables concerned are either not changed inside the loop, or are changed in an incremental manner. It is therefore necessary to examine the range of each loop for potential changes. The assumptions made about function or subroutine calls are crucial in this problem. If side effects are allowed, then it is necessary either to tabulate information about the changes effected by each such procedure, or else to assume that everything may be changed and abandon this optimization in any loop containing such a call.

In order to gather the information required about each loop in the program in a straightforward manner, Pass I maintains a stack S_V, each of whose entries is a simple list of variable names. The top level of the stack corresponds to the iterative loop of the program currently being scanned, and lower levels correspond to each of the loops within which the current one is nested. As each statement is scanned, the names of variables which are changed by the statement are added to the top level of S_V. When the end of a loop is reached, the top level contains the list of all changed variables. This list is saved and also added to the second level of S_V, so that when the old top level is discarded, the new top level contains the current set of changed variables for the next outer loop.

Pass I also translates the source input into a more compact internal form where, for example, arithmetic expressions are in a Polish postfix notation. In order not to have to save each of the lists of changed variables in main core, they also are added to the intermediate language output from Pass I. They occur at the end of the block to which they refer. To handle recursive calculation, a second stack S_C (which could be part of S_V) is maintained by Pass I. The loop parameters are entered into the top of this stack when they are read so that they can also be moved to the end of the loop, where a test can be made to see if the controlled variable or the step quantity is changed inside the loop. Pass I will also perform the syntax checking of the source language, and, for efficiency, it should replace all identifiers by an internal form such as a table address.

Pass II basically scans the output from Pass I in reverse order. This makes the list of changed variable names and other loop information available to the program just as the first (in time) statement of a loop is read. Pass II maintains a stack S_V by placing the changed variable list on top of S_V when the end of each loop is entered, and removing it when the beginning of the loop is left. Thus during the processing of any loop, the top of the stack S_V contains the list of variables changed in that loop. Similarly a stack S_C contains at its top the current loop control parameters. Additional stacks S_L and S_R, whose entries are also lists, are maintained by Pass II. These stacks contain respectively the lists of strings representing the local constants and the recursive increments which are in the process of being moved to the beginning of a loop.

The output of Pass II is in a machine language with symbolic addresses. The length of the program being generated can be calculated during the second pass, so that at its completion, variable storage can be assigned. Pass III will then produce a relocatable binary object.
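As an illustration of the bookkeeping just described, the following is a minimal sketch of the Pass I changed-variable stack S_V, written here in Python with invented names; it is not the original implementation, which would have been hand coded for the machine.

    # A sketch of the Pass I stack S_V: one list of changed names per
    # open loop.  Names and representation are illustrative only.
    class ChangedVariableStack:
        def __init__(self):
            self.levels = []                # one entry per open loop

        def open_loop(self):
            self.levels.append(set())       # new top level for this loop

        def note_change(self, name):        # called for each assigned name
            if self.levels:
                self.levels[-1].add(name)

        def close_loop(self):
            # The finished top level is written to the intermediate
            # output at the end of the loop's block, and also merged
            # into the level below, so the new top again holds the
            # current changed set for the enclosing loop.
            closed = self.levels.pop()
            if self.levels:
                self.levels[-1] |= closed
            return closed

    # For the DO nest of Section 2: the inner loop changes A and J.
    s_v = ChangedVariableStack()
    s_v.open_loop()                         # outer I loop
    s_v.open_loop()                         # inner J loop
    s_v.note_change("A"); s_v.note_change("J")
    inner = s_v.close_loop()                # {"A", "J"}; now merged below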
The method described below involves operations such as concatenating elements and strings to form larger strings, where these elements or strings are members of stacks. Deliberately, no mention is made of how to mechanize this since it is highly machine dependent. On some machines it is better to chain everything and not to move elements unless absolutely necessary; on other machines the opposite holds. These tricks of the programmer's trade can easily save or cost a factor of 2 in speed, so a judgment of the method should include an evaluation of their potential value in this problem.

3. Removing Locally Constant Subexpressions

This section indicates the manner in which Pass II scans arithmetic expressions in order to locate locally constant subexpressions. The information it has available is contained in the top entry of the stack S_V, which contains a list of changed names. For speed it is convenient on most machines to translate this information so that it is represented by a bit in the name table.

Arithmetic expressions have been translated by Pass I into a Polish postfix form. Although the main Pass II scan is proceeding backwards, it is more convenient to scan expressions forward. This can be handled either by having Pass I reverse the order of the expression or, probably more conveniently, by having Pass I generate a pointer at the end of an expression indicating the beginning of the expression.

As the Polish postfix form is scanned from left to right, operand names are read and placed in an operand stack S_O. As they are put in this stack they are marked as changed variables or not, depending on whether or not they appear in the top of S_V. When operations are read in from the Polish postfix string they must be compiled. The nature of the compiling action depends on the operands involved. Consider only binary operations. (n-ary operations follow the same principle.) If neither of the top two operands in S_O is changed, no code is generated; rather, the top two levels are replaced with a single entry representing the combination. For example, if the top two levels are both locally constant variables (LCV) they are replaced by a single locally constant expression LCE. This is mechanized by forming a Polish postfix string consisting of the top two operands followed by the operation, placing this in main store, and placing a pointer to it in the stack with the mark LCE. Thus the entries in the operand stack can be any one of the classes

    C    (constant, e.g., 15.5)
    LCV  (locally constant variable)
    LCE  (locally constant expression)
    V    (variable, one that is changed in the loop)

The rules for the combination of these by any binary operation are:

    1st and 2nd level operands    Action
    --------------------------    ------
    C - C                         Combine at compile time to get a new
                                  constant.

    C - LCV, C - LCE,             Form a new string consisting of the
    LCV - LCV, LCV - LCE,         concatenation of the 1st level of S_O,
    and LCE - LCE                 the 2nd level of S_O, and the operation.
                                  The single stack entry becomes an LCE
                                  pointing to this string.

    C - V, LCV - V, V - V         Compile object code directly. The
                                  single stack entry becomes V type.

    LCE - V                       Allocate a temporary storage cell with
                                  name T, say. Compile code for T, V,
                                  operation. Add 'T =' to the end of the
                                  string representing the LCE and add
                                  this new string to the top level of the
                                  S_L stack, the stack of locally constant
                                  expressions. The single S_O stack entry
                                  becomes V type.
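A minimal executable sketch of these rules may help; it is written in Python with invented names (OPS, new_temp, the (kind, tokens) stack entries are all inventions of this illustration) and should not be read as the original routine.

    import operator

    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}
    _temp_count = 0

    def new_temp():                        # invented helper: fresh cell names
        global _temp_count
        _temp_count += 1
        return "T%d" % _temp_count

    def combine(stack, op, s_l, code):
        # Stack entries are (kind, tokens) pairs, kind one of "C", "LCV",
        # "LCE", "V"; tokens is a Polish postfix token list.
        b = stack.pop()                    # top level of S_O
        a = stack.pop()                    # next level of S_O
        if a[0] == "C" and b[0] == "C":
            # C - C: combine at compile time to get a new constant.
            stack.append(("C", [OPS[op](a[1][0], b[1][0])]))
        elif "V" not in (a[0], b[0]):
            # C/LCV/LCE mixtures: concatenate the two strings and the
            # operation; the single entry becomes an LCE.
            stack.append(("LCE", a[1] + b[1] + [op]))
        elif "LCE" in (a[0], b[0]):
            # LCE - V: hoist the LCE into a temporary computed ahead of
            # the loop; its assignment string joins the top list of S_L.
            lce, v = (a, b) if a[0] == "LCE" else (b, a)
            t = new_temp()
            s_l[-1].append(lce[1] + [t, "="])
            code.append((op, [t], v[1]))
            stack.append(("V", ["<acc>"]))
        else:
            # C - V, LCV - V, V - V: compile object code directly.
            code.append((op, a[1], b[1]))
            stack.append(("V", ["<acc>"]))

Applied to the factors of the Section 2 example (C changed in the loop, D not), the constant part grows into a single LCE before any code is emitted:

    stack, s_l, code = [("C", [15.5]), ("LCV", ["D"])], [[]], []
    combine(stack, "*", s_l, code)         # -> LCE  15.5 D *
    stack.append(("C", [0.5]))
    combine(stack, "*", s_l, code)         # -> LCE  15.5 D * 0.5 *
    stack.append(("V", ["C"]))
    combine(stack, "*", s_l, code)         # hoists  15.5 D * 0.5 * T1 =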
When the beginning of the loop is reached in the Pass II scan, the loop is closed, the top level of S_V is discarded, and the Polish postfix assignment statements are removed from the top of the S_L stack and compiled to yield the program which generates the local constants. Since these expressions may be partially or wholly local constants in this outer loop, they should be handled in the same manner as Pass I generated expressions; that is, all that is necessary is to switch input from the intermediate language of Pass I to this list until it is exhausted. Since the temporary storage name generated for a locally constant expression is not marked as changed, the rules outlined above will cause an expression to be removed from all loops in which it does not change.

General n-ary operations such as functions can be handled by the obvious extension of the process. Indexing is an extension of functions. If the indexing expressions are local constants then the address of the element can be calculated outside of the loop. If in addition no elements of the array are changed, then the element could be part of a larger locally constant expression.

4. Recursive Calculation

Recursive calculation of addresses has been discussed by Samelson and Bauer [5] for ALGOL. Restrictions essentially similar to those of FORTRAN are imposed on the subscripts. The method proposed here is to calculate all expressions recursively where that can be done. If the expression happens to be an index expression, then a compiler for a particular machine can take note of it and use index registers or other features to perform the calculation.

The technique is essentially identical to the one used for identifying local constants. A new type of operand stack entry is defined, the "Step Variable" (SV). This is a variable which is incremented by constant amounts during the execution of a loop. The controlled variable in the loop is of this type provided that neither it nor the step quantity is changed inside the loop. This is the reason for moving the controlled variable name and loop parameters to the end of the loop by the S_C stack mechanism in Pass I. As the end of a loop is entered, the new top level is added to the S_C stack, and the loop count mechanism is compiled. If the expression used for the step quantity is a fixed point constant or local constant, the controlled variable is checked against the changed list. If it is not changed in the loop (except by the loop control) it is flagged in the table as a step variable for the duration of this loop. (If the controlled variable of the next outer loop is also flagged as a step variable, its flag must be turned off until the beginning of the loop has been reached.)

Another type of operand stack entry is defined, a "Step Expression" (SE). An SE is a combination, by addition or subtraction, of any two SV's, SE's, LCE's, LCV's or C's, one of which is an SV or SE; or a combination, by multiplication, of an SE or SV with an LCE, LCV or C type. Thus a step expression can be expressed in the form LCEA + LCEB*I, where I is the step variable and LCEA and LCEB are two C's, LCV's or LCE's. Since I is being incremented by a locally constant amount, say INC, the value of the step expression can be obtained by starting at LCEA + LCEB*S and incrementing each time by LCEB*INC, where S is the starting value of I. The procedure is mechanized by marking a variable in the operand stack S_O as an SV if it is so marked in the variable table.
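Before the combination rules are described, the LCEA + LCEB*I form can be made concrete with a small sketch (Python; the representation and the temporaries T1 and T2 are inventions of this illustration) applied to the inner-loop subscript of the Section 2 example, where J is the step variable running from M2 in steps of L2:

    # An SE carried as a pair of locally constant postfix strings
    # standing for LCEA + LCEB*J; illustrative representation only.
    lcea = ["I", 1, "-", "'A'", "+", 15, "-"]   # I - 1 + A - 15
    lceb = [15]                                  # coefficient of J

    # Hoisted ahead of the loop:
    start = lcea + lceb + ["M2", "*", "+", "T1", "="]  # T1 = LCEA + LCEB*M2
    step  = lceb + ["L2", "*", "T2", "="]              # T2 = LCEB*L2
    # Compiled at the head of each pass around the loop:
    increment = ["T1", "T2", "+", "T1", "="]           # T1 = T1 + T2

This is exactly the X1/X2 recursion of the optimized code in Section 2; the general mechanism is as follows.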
When two (or more) operands are combined which result in an SE, the entry in the stack is so marked, and a pointer is placed there which points to a pair of strings. The first string represents LCEA and the second represents LCEB. When the operation is an addition (or subtraction), the new lists are formed by concatenating the pairs of strings representing each operand together with a trailing plus (minus) sign. If either of the expressions is simpler than an SE, then its LCEA and LCEB are implied in the obvious way. When the operation is multiplication of an SE or SV by a C, LCV or LCE, the new LCEA is the string LCEA {C | LCV | LCE} * and the new LCEB is similarly LCEB {C | LCV | LCE} *.

In all other cases involving an SE or SV, some object code is to be compiled. If the operand is either an SV or an SE with its LCEB equal to +1, then there is no point in handling it as other than a changed variable (with possibly a locally constant additive LCEA). Apart from these cases, it is necessary to assign a temporary storage cell, say T1, to the SE for the purpose of compiling the current operation. A second temporary storage cell T2 must also be provided for the increment storage. The strings LCEA LCEB S * + T1 = and LCEB INC * T2 = are added to the top list of the stack S_L of locally constant expressions, and T1 T2 + T1 = is added to the top of the S_R stack of recursive increments.

When the beginning of the loop is reached in the scan, it is necessary to prepare the incrementing orders by compiling the top level of the S_R stack and then discarding it. The loop is then closed by removing the top level of the S_C stack and compiling the loop initialization, plus the transfer around the incrementation orders, before compiling the top list of the S_L stack.

Just as the multiplication operation has been reduced to successive additions, the exponentiation operation could be reduced to successive multiplications, but this author feels that it is of doubtful value, both because of rounding error and frequency of occurrence.

5. Rearranging the Calculation of an Expression

Whether or not this should be done is open to question numerically. The fact that some compilers work from left to right, others right to left, and still others in some optimizing manner means that if the programmer desires to calculate A+B+C in a specific order, he should indicate it with parentheses, e.g., A+(B+C), if he intends to use the program in more than one installation.

In order to indicate the difference between the three expressions A+B+C, A+(B+C) and (A+B)+C in the Polish postfix string, a convention must be used which allows the rearrangement of calculation in the first instance. Such a convention is to allow n-ary versions of operators. Thus A+B+C becomes A B C +3, where the subscript 3 indicates that + is the summation of the 3 elements A, B and C. A+(B+C) and (A+B)+C would then uniquely be A B C + + and A B + C +, where the subscript 2 for binary operations has been dropped. In order that this convention also handle the subtraction (and division) operations, the "sign" of the operation is placed over the last element representing that operand. Thus A-B+C is A B̄ C +3, while A-(B+C) is A B C +̄ + and A+(-B+C) is A B̄ C + +. Similarly A/B*C and A/(B*C) are A B̄ C *3 and A B C *̄ * respectively.

When the operation +3 in the expression A B C +3 is to be compiled by Pass II, the objective is to rearrange the elements so that the minimum of work is done at execution time. This has been discussed in the case of constants by Floyd [1] and Gear [2].
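In token form the convention might be carried roughly as follows (a sketch only: the trailing digit for the operand count and the trailing ~ standing in for the overbar are notations invented for this illustration):

    # Flat token strings for the five expressions above.
    A_minus_B_plus_C   = ["A", "B~", "C", "+3"]        # A B̄ C +3
    A_minus_sum_BC     = ["A", "B", "C", "+~", "+"]    # A B C +̄ +
    A_plus_negB_plus_C = ["A", "B~", "C", "+", "+"]    # A B̄ C + +
    A_over_B_times_C   = ["A", "B~", "C", "*3"]        # A B̄ C *3
    A_over_prod_BC     = ["A", "B", "C", "*~", "*"]    # A B C *̄ *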
There is a hierarchy of operands that are possible in the stack at any time. This ordering is

    Constant
    Locally Constant Expression (including Locally Constant Variable)
    Step Expression (including Step Variable)
    Variable

When an n-ary commutative operation + or * is encountered by Pass II, the top n levels of the operand stack S_O are examined to find all occurrences of constants first; these are combined at compile time and n is appropriately reduced. If it is not now 1, the remaining levels are examined for LCE's or LCV's. If there are any, they are combined with any constant to give a new LCE. If n still has not been reduced to 1, the search continues for SE's or SV's which can be legitimately combined into SE's. Finally, if n is not 1, code is compiled and the stack entry becomes a variable.

6. Common Subexpressions

Floyd [1] has described one way of locating common subexpressions within a single expression. This method could be extended to a number of expressions within a program segment at a considerable cost in main storage requirements. It seems doubtful that any simple technique can be found to handle this problem without considerable searching and comparing. This author feels that in general the pay-off for the amount of work involved is marginal, for the following reasons:

a) If the common subexpression is very short, e.g., A+B or A*B, the gain in execution speed is small since in most cases a store order is involved to save the result in place of the common arithmetic operation.

b) If the common subexpression is lengthy, the laziness of the programmer can be relied upon to do the necessary simplification; e.g.,

    ... (A+B)*(C+D)/(E+(A+B)*(C+D)) ...

is likely to appear as

    T = (A+B)*(C+D)
    ... T/(E+T) ...

which is a considerable saving in writing.

However, when a feature of the language expresses a fairly involved expression in a simple manner, the recognition of common subexpressions becomes advantageous. Such a case is indexing. For example, the assignment statement for three-dimensional Gauss-Seidel iteration is

    A(I,J,K) = (A(I,J,K-1) + A(I,J,K+1) + A(I,J-1,K)
              + A(I,J+1,K) + A(I-1,J,K) + A(I+1,J,K))/6.

This contains the expression ((K-1)*N2 + J-1)*N1 + I-1 + A, plus various constants, seven times, where N1 and N2 are the dimensions of the first two subscripts of A. The method proposed here is not to look for such common subexpressions unless they are either local constants or step functions. If this is done, then each time that a new step function or local constant is generated, it should be compared against the existing set in the top level of S_L before being entered there. A step function contains two local constants. If the step part LCEB agrees with the step part of another step function (the names of these are located in S_R), then only one incremented function is needed; the other can be determined by calculating the difference between the two constant parts LCEA and adding this relocation at the time it is used. If these are identical, so much the better.

In many situations, such as the example above, the two constant parts LCEA of two step functions will only differ by a fixed constant, but this fact will be hidden by the arrangement of the Polish postfix strings representing them. For example, if I is the step variable, and it is being incremented by 1 starting at 1, the LCEA part of A(I,J,K-1) is

    K 2̄ + N2 * J 1̄ + + N1 * 1̄ 'A' + +

whereas for A(I+1,J,K) it is

    K 1̄ + N2 * J 1̄ + + N1 * 'A' +

where 'A' is the address of A(1,1,1). A direct comparison will not easily yield the fact that these differ by N2*N1 + 1.

To aid this recognition, it is proposed that the LCE's be reorganized as they are generated. A suitable form is such that local constants always appear before constants in additive groups. They can easily be rearranged as they are removed from the S_O stack during formation. Additionally, when an LCE of the form "LCE string, constant, +" is multiplied by a constant such as N2, the value of constant * N2 should be calculated, yielding, say, C'. The new LCE is then "LCE string, N2, *, C', +". In this way, additive constants are moved to the right hand side of the string (and in an expression like (I+3)*7+J+2, additions are saved). A direct comparison of the LCE's can now be used to determine if they differ only by a constant. If this is so, the second expression can be calculated from the first by simple addition. Whether this is better done outside the loop or at the time of use depends on the machine characteristics.
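The reorganization can be sketched as follows (Python; the (tokens, constant) pairing and the helper names are inventions of this illustration, with N1 and N2 given arbitrary compile-time values). It shows how the two LCEA's above become directly comparable:

    # An LCE is held as (tokens, constant): the additive constant is
    # kept folded at the right-hand end of the string.
    N1, N2 = 10, 20                     # dimensions of A, known at compile time

    def var(name, k=0):                 # an LCV with an additive constant
        return ([name], k)

    def add(l1, l2):                    # concatenate strings, add constants
        joiner = ["+"] if l1[0] and l2[0] else []
        return (l1[0] + l2[0] + joiner, l1[1] + l2[1])

    def mul_const(lce, c):              # "LCE string, N2, *, C', +"
        return (lce[0] + [c, "*"], lce[1] * c)

    # LCEA of A(I,J,K-1):  ((K-2)*N2 + J-1)*N1 - 1 + 'A'
    a1 = add(add(mul_const(add(mul_const(var("K", -2), N2),
                               var("J", -1)), N1),
                 ([], -1)),
             var("'A'"))
    # LCEA of A(I+1,J,K):  ((K-1)*N2 + J-1)*N1 + 'A'
    a2 = add(mul_const(add(mul_const(var("K", -1), N2),
                           var("J", -1)), N1),
             var("'A'"))

    assert a1[0] == a2[0]               # identical strings, so the pair
    assert a2[1] - a1[1] == N2*N1 + 1   # differs only by the relocation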
7. Auxiliary Storage Input

The reverse pass feature of this method can be used to handle this optimizing problem at a simple level. During the first pass, the input statement is moved forward over statements as far as possible such that

a) No transfers into or out of these statements are possible (determined by the nonexistence of statement numbers or control transfers in FORTRAN),

b) No variables used or changed by these statements are in the input list, and

c) None of these statements refers to the same i/o unit.

During the second pass, the call to the buffering program or monitor is moved back to be the first statement of the program segment, that is, immediately following the first transfer into or out of the program.

8. Conclusion

Because the output of one pass is read backwards by the next pass, this method is particularly suited to machines with tapes that can be read in either direction. Disk files or drums are also suitable for the intermediate storage, and can lead to very fast second and third passes.

On a machine with a large main memory many methods of optimization which use large parts of core are feasible. Since the first pass is usually highly input limited, it might be an advantage to use the method outlined above on several programs simultaneously; that is, the compiler program is time-shared by several inputs and auxiliary devices, each using a small piece of memory for data storage.

Partial optimization of the object code in a machine independent fashion is certainly feasible using a small amount of main memory with a reversible auxiliary store. On most machines it is likely that the input time for the source program will dominate the compile speed. Optimization of the use of machine features such as index registers is a harder problem, probably requiring two further passes, one in each direction, since in the proposed compiler, information about which addresses would best be in index registers is not available until Pass II, and Pass II must generate code whose length can be determined by the end of Pass II.
"Systems Manual for 70k FORTRAN and 709 F0RTRAN," Applied Programming Department, IBM, April, i960. [5] Samelson, K. and F. L. Bauer. "Sequential Formula Translation," Comm. ACM, 3 (February, i960), p. 76. -15- c f „ r002 V171 1B709651 llllac III « pin"" ' ■ * 30112088398208 Ml