nam HI Mm H Hi BUSH if nSlffilG H m i H* i i ' •-'." I ; 5 mi 5?iHi icy ■ ' i The person charging this material is re- sponsible for its return to the library from which it was withdrawn on or before the Latest Date stamped below. Theff, mutilation, and underlining of books are reasons for disciplinary action and may result in dismissal from the University. UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN Digitized by the Internet Archive in 2013 http://archive.org/details/automaticgenerat458beal £//^ I Report No. ^58 7^ 1 /yu-ZA THE AUTOMATIC GENERATION OF DETERMINISTIC PARSING ALGORITHMS *y Alan James Beals June 21, 1971 ILLIAC IV Document No. 2^8 Report No. ^58 THE AUTOMATIC GENERATION OF DETERMINISTIC PARSING ALGORITHMS Alan James Beals May 25, 1971 Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 6l801 This work was supported in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center under Contract No. USAF 30(602) -klkk and submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, 1971. ABSTRACT This paper describes an algorithm for the conversion of a grammar in the form of a set of context free productions into a deterministic pars- ing algorithm described by a set of modified Floyd productions. The algorithm is extended in such a way that it may easily become a part of a complete translator writing system and make use of the information available in the semantic part of such a system. The paper also includes a determination of the subclass of context free grammars which can be successfully converted and a comparison of this algorithm with some other methods of generating parsing algorithms from context free grammars. IV TABLE OF CONTENTS Page 1. INTRODUCTION 1 2. NOTATION AND DEFINITIONS 5 3. THE BASIC ALGORITHM. ............ 10 3.1 Label (Group) Determination . 11 3«2 Descriptor Set Generation 12 3«3 Descriptor to FPL Production Mapping 13 3-^ Preclusion and Error Detection 15 3«5 Summary and Example ............. 15 3.5.1 Input Syntax . . 16 3.5*2 Labels (Groups) Needed 16 3.5«3 Descriptor Sets 16 3.5.^ Resulting FPL Production Set 17 k. COMBINATION OF FPL PRODUCTIONS 18 k.l CT(m) Groups 19 k.2 Marker Symbol Combination . 21 k.3 Initial CF Grammar Revision 25 k.k Summary and Example 27 k.k.l Input Syntax 28 k.k, 2 Grammar Revisions 29 k.k. 3 FPL Production Subset 29 k.k.k Combinations 30 5. LOOKAHEAD CONSTRUCTION 31 V Page 6. CF GRAMMARS ACCEPTED 39 6.1 Strong LR (k) Grammars 39 6.2 Explicit Lookahead Adequacy kk 6.3 Implicit Lookback Adequacy ^9 6.k Conclusion 57 7. BELATIVE EFFECTIVENESS 58 7.1 Upper Bounds 58 7.1.1 FPL Production Set . . 60 7.1.2 Comparison 62 7.2 Average Times 63 7.3 Subjective Evaluation ........ 65 "J.h Summary 68 8. THE IMPLEMENTED TWS 70 8.1 Syntax 70 8.2 Semantics 75 8.3 Other Considerations . 79 8.^4- Summary 8l 9. CONCLUSION ' 83 APPENDICES 87 A. TWINKLE SYNTAX OF DEMALGL . 87 B.. PR0TAB OF DEMALGL 90 C. FPL PRODUCTION SET FOR DEMALGL 97 D. IMPLEMENTATION FLOW CHARTS 139 LIST OF REFERENCES 151 VITA .153 VI LIST OF FIGURES Figure Page 1. Flow Chart for Lookahead Buffer Usage 33 2. Flow Chart for Lookahead Generation 35 1. INTRODUCTION It is the opinion here that one of the major restrictions to getting maximum use out of any computer is the lack of programming ability on the part of those people who have problems in need of solu- tion. The primary cause of this inability is the unavailability of problem oriented programming languages. This frequently leads to one or more of the following: the problem solver is uncertain about ca- pabilities and features of the specific machine and general purpose language available and thus fails to use the computer in many cases where it could be extremely valuable, or he does use the machine, but writes his own programs in a relatively simple and inefficient manner, or finally, he turns the problem over to a programmer who has the ability to get maximum use of the machine but does not have a thor- ough understanding of all the aspects of the problem, and thus can computerize only those specific aspects of a problem given him. The solution to this is the development of programming languages which correspond as closely as possible to the notation and terminology of specific problem areas. At the present time, very little is being done along these lines, partly because of the large effort required to implement a compiler or translator for a new lan- guage. The man hours required for such projects can seldom be justi- fied when the results will achieve the relatively limited usage generally accorded a specific problem oriented language. The obvious answer is to mechanize the task of compiler writing through the use of a translator writing system (TWS) whose input is the specification of a programming language and a machine, and whose output is a com- piler for that language which generates executable code for the spec- ified machine. Such a TWS "built compiler has its task broken into three relatively distinct parts. The first is to recognize the validity and, more importantly, the structure of an input program. This is referred to as the syntactic or parsing phase. The second is the se- mantic phase which associates, in some way, a meaning with each of the various subconstructs possible in the language. Meaning in this sense can be defined as the translation activity necessary whenever a specific subconstruct is recognized. The third phase is the actual generation of executable code for the machine specified. This third phase is almost impossible to automate at this time due to the lack of commonality between the instruction sets and other hardware features of various different machines. The best exist- ing approach is for the semantic phase of compilation to generate some sort of formal intermediate language which is then processed by a code generator handwritten for a specific machine. The semantic phase is also difficult to automate at present. The main reason for this is the failure, to date, to develop a good formalization of semantic specifications. Some efforts have been made in this direction, notably by Lewis and Stearns [l], Feldman [2] and Lucas and Walk [3]- These, however, still fall short of what is neces- sary for complete automation of the semantics process. The most com- mon approach to semantic processing is that used in the TWS to be de- scribed here. That is to transfer control, at points in the parsing process specified by the language designer, to hand coded routines which accomplish the semantic functions. The algorithm which converts context free (CF) productions to modified Floyd Production Language (FPL) productions, which is de- scribed here, was developed to automate the parsing phase of compiler generation. It grew from the desire to implement a practical and effi- cient TWS. The goals of this TWS are the same as those of any TWS. The most important of these is the "building of compilers which are fast, occupy as little space as possible and are capable of generating effi- cient object code. Input to a TWS should be in as simple a form as pos- sible. A TWS should also be capable of processing the grammars of a large class of languages. Out of the broad class of problems associated with translator writing systems, this thesis concentrates on one, namely the generation of parsing algorithms. Consideration was given to several different parsing algorithms as potential bases for the system. Among these were the algorithms of McKeeman [k~\ , Wirth and Weber [5], Ingerman [6], Brooker and Morris [7] and Trout [8]. Without precluding subsequent incorporation of some parts of these methods, it was decided that some version of the production language (FPL) of Floyd [9], Evans [10] and Feldman [2] was potentially most likely to meet the first of the above criteria. The input to such a system is, however, more complex than was desirable. Context free (CF) productions were chosen as the most widely known and easily understood method of language syntax specifications. The most commonly used notation for this type of syntactic specifica- tion is the Backus-Naur Form (BNF) as used in the ALG0L 60 report [11]. Attempts at conversion of CF grammars to FPL by Earley [12] and DeRemer [13] were considered and that of DeRemer chosen as the more promising starting point. No attempt is actually made to generate parsers for all CF grammars. The goal, instead, is to generate parsers for a class of grammars which approaches, as nearly as possible, the LR(k) grammars defined by Knuth [1^]. This restriction is made for two reasons. First, it is the belief here that attempts to process a larger subset of CF grammars are unlikely to achieve practical results; and second, it is felt that, if a designer set out to design an unambiguous grammar to specify the structural properties of a programming language, his re- sult will be a grammar whose sentences can be parsed during a single, deterministric scan from left to right j i.e., an LR(k) grammar. In- tuitively, we feel that this situation exists because the language is meant to be written and read by humans, most of whom are used to read- ing and writing natural languages from left to right. This thesis consists of two parts. The first, chapters 2 through 5? is a precise description of an algorithm for the conver- sion of a set of CF productions to a set of deterministic FPL produc- tions. The second, chapters 6 through 8, is an evaluation of this algorithm. Chapter 6 attempts to identify and classify that subset of CF grammars which the algorithm can successfully convert. Chapter 7 compares the resulting parsing algorithms to those generated by the recent somewhat similar systems of DeRemer [15], Korenjak [l6] and Earley [17]. Chapter 8 considers the effectiveness of the algorithm as a part of an implemented TWS. 2. NOTATION AND DEFINITIONS The notation and definitions of this chapter will "be used throughout the remainder of this work. All notation and definitions used in the following text are not included in this chapter since some would be meaningless if not introduced in the proper context. We consider a language L to be defined by a context free (CF) phrase structure grammar G = (V T , V N , S, P) where V = the set of terminal symbols of L, which will be represented by lower case Latin letters, V.- = the set of nonterminal symbols, which will be represented by upper case Latin letters from the beginning and middle of the alphabet, S £ V._ is a designated nonterminal symbol called the objective symbol and P is a numbered set of rules, or productions, of the form A -* a , where A e V and a e (V U V ) . Such a CF grammar is automatically extended as follows: #, an end of string marker, is added to V„, Z is added to V and replaces S as the objective symbol and E -> ffS jf is added to P as production number zero. Symbols which may be either terminal or nonterminal will be denoted by upper case Latin letters from the end of the alphabet. Strings of symbols will be denoted by .Greek letters. A string which consists of a single symbol X repeated n times will be denoted X . The input string to be parsed is assumed to be of the form o # where a is made up of terminal symbols only. A symbol X (X e V = V„ U V_) is called a headsymbol of a nonterminal A if A = X, or A -> X . . . , or there exists a subset of P of the form A -> A ... A -> X . . . n > 1. n Analogously, a symbol X e V is called a tailsymbol of a nonterminal A if A = X, or A -» ... X, or there exists a subset of P of the .form A -> ... A A-, ... Ap A -* ... X n > 1. n — An explicitly left recursive occurrence of a nonterminal symbol A is that occurrence as the first symbol on the righthand side (RHS) of a CF production of the form A -+ A ... An implicitly left recursive occurrence of A is an occurrence of A as the first symbol on a RHS in a subset of P of the form A -* *i A l * A 2 • A ■+ A ... n > 1 n — A left recursive occurrence,, either explicit or implicit, is defined analogously. 8 A descriptor is associated with each specific occurrence of a symbol X on the RHS of a CF production. The descriptor (n } n) represents the occurrence of X as the n symbol on the RHS of the CF production numbered jt. The context free productions (rules of P) will be referred to simply as productions, or for clarity, CF productions. The FPL statements will be referred to as productions (when the context makes clear which type of production is meant), or FPL productions. An FPL production will be written in the form LI: X a 2 Q | * L n The label LI may or may not be present. Most parsing algorithms make use of a parsing stack into which input symbols are pushed and in which reductions are made. For a set of FPL productions which is generated by this CF to FPL coversion algorithm, this stack consists of only a single symbol, called a parser symbol, or p-symbol. The X in the FPL production above, if present, is a symbol which is to be compared to this p-symbol. If the p-symbol is not X, processing is transferred to the beginning of the next production in sequence. The a. are strings of terminal symbols which are referred to as lookahead strings. If no lookahead string matches the next symbols in the input stream, processing is transferred to the beginning of the next production in sequence. The Q,, if present, represents one of two parsing activities. It takes the form «- Z when one of three types of marker symbols is to be pushed into a special marker stack. It takes the form -» A(n) when the p- symbol is to be set to A and the top n (n > 0) symbols are to be popped from the marker stack. The * may or may not appear and, if present, indicates that the next input symbol is to replace the parser symbol. This is re- ferred to as a scan. The L is either the label of the next production to be used or the special symbol DNT-A (to be explained in detail later) which indicates that the nonterminal A has just been recognized and the marker stack must be used to calculate dynamically the label of the next pro- duction to be used. As will be seen in chapter 3? there is a specific FPL production for each descriptor (jt, n). Thus there is a one to one cor- respondence between a specific occurrence of a symbol on the RHS of a CF production, a descriptor and a particular FPL production. In certain portions of the text which follows, these terms will be used somewhat interchangeably. 10 3. THE BASIC ALGORITHM Intuitively, the basis of the algorithm is that once a par- ticular symbol X has been recognized and found to correspond to that th occurrence of X as the n symbol on the RHS of the CF production num- bered -n, one of three situations must exist. The first possibility is st that the n + 1 symbol on RHS it is a terminal symbol, t. In this case, the next symbol is scanned and control is transferred to an FPL production which checks for t and takes appropriate action if it is st found. The second possibility is that the n + 1 symbol on RHS it is a nonterminal symbol, A. In this case, a marker symbol corresponding to this specific occurrence of A is pushed into the marker stack, the next input symbol is scanned and control is transferred to the first of a group of FPL productions which tests for all possible terminal beginnings (head symbols) of A. The last possibility is that this X is the last symbol of an n-symbol RHS which defines a nonterminal A. In this case, the parser symbol is set to A and an appropriate number of marker symbols are popped from the marker stack. If the top marker symbol is for A, control is transferred to the start of a group of FPL productions which includes tests for all left recursive occurrences of A and a test for the specific occurrence of A which caused the marker to be pushed. If the top marker symbol is for B (*A) then control is transferred to the start of a group of FPL productions which tests for all possible nonterminal beginnings (head symbols) of B, except for left recursive occurrences of B itself. What follows is a formal description of these intuitive ideas. 11 3-1 Label (Group) Determination The FPL productions are grouped and a label is attached to the first production of each group. The labels are thus associated with entire groups of productions rather than with single productions. The first step in the algorithm is to determine the labels of all of the FPL production groups which might be required in the th course of a parse. Let X (it) represent the n symbol on the RHS of the CF production numbered re. There are four rules for determining which labels (groups) must be created. For each A e V : label TH-A (Terminal Heads) exists if 3-1.1 :j;n:, n such that A = X (ir) and n > 2. This group would be required when the first n-1 symbols of the CF pro- duction numbered tc had been recognized causing the need to initialize a search for A. For each A e V„: label NTH-A (Nonterminal Heads) exists if 3-1.2 3 it f n such that A = X (it) and n > 2. This group would be required when the top symbol on the marker stack indicates that a search for A is in progress and a nonterminal symbol other than A has just been recognized. 12 For each A e V : label NT (it, n) (Nonterminal at (rt,n)) exists 3* 1*3 for each it, n such that A = X (it) and n > 2. ' n v — This group would be required when the top symbol on the marker stack th indicates that a search for the occurrence of A as the n symbol on the RHS of the CF production numbered jt is in progress and the non- terminal A has just been recognized. For each t € V_: label T (jt, n) (Terminal at (jt, n)) exists 3»1-^ for each it, n such that t = X (it) and n > 2. ' n v J — This group would be required when the first n-1 symbols of the CF pro- duction numbered jt had been recognized and a test for t must be next. 3.2 Descriptor Set Generation The next step in the CF productions to FPL productions con- version algorithm is to generate the set of descriptors, D = {(it,, n. ) \ i = 1,2,..., m) for which corresponding FPL productions must be created for each FPL group Q. The group labels are descriptive in the sense that the four types of labels (TH, NTH, NT and T) generate descriptor sets in the following four different ways: D TH-A = { ^ f1 ^ I X l ^ e ' V T and the LHS 3 * 2,1 of it is a head symbol of A} 13 D NTH-A = {(ltA) I X l M € V X 1 (rt) * A and 3 ' 2 ' 2 the LHS of it Is a head symbol of A} D NT(* ? n) = ((*Sn ? ) I (*,n) = (*>') or (*',n«) 3-2.3 describes a left recursive occurrence of X (jt)} 3.3 Descriptor to FPL Production Mapping For each group, the descriptors are mapped on a one to one basis into FPL productions and the appropriate group label is attached to the first such production. As was suggested at the beginning of this chapter, there are three different mapping rules which could apply- to the descriptor (tc, n), depending on what, if anything, occurs as the st n + 1 symbol on the RHS of the CF production numbered it. Assume a is a string of symbols of length n-1. The three mapping rules are: If symbol n + 1 is a nonterminal (i.e. 3-3-1 jt is A -> a X B . . . ) then (tc, n) maps into X I «- B (it, n+l)[ * TH - B. If symbol n + 1 is a terminal (i.e. 3-3.2 jr is A •* a X t . . . ) then (tc, n) maps into X I I * T(jt r n+1). Ik If CF production jt has only n RHS 3-3.3 symbols (i.e. it is A -> a X) then (jt, n) maps into X | -* A (k) | DNT-A where k is the number of nonterminal symbols in a X other than the first symbol and B (jt', n') at the top of the marker stack implies the following definition of DNT-A. If A = B , DNT-A = NT(jt', n') else 3.3-^ if A * B , DNT-A = NTH - B. All of the FPL productions in an NT (jt, n) group have the same parser symbol field (see 3- 2. 3)- Transfer is made to an NT (jt, n) group only when the parser symbol has just been set to this particular nonterminal symbol (see 3«3«^-)« Therefore., when the above mapping rules are applied to the descriptors for an NT (jt, n) group, the parser symbol field should be left empty. In addition, two. special labelled productions must be gen- erated. They are START: | *- Z (0,0), «- S (0,2) | * TH-S and NT (0,0): [ f EXIT. 15 3»k Preclusion and Error Detection If, within any group of FPL productions, the parser symbol of one production is the same as the parser symbol of another, the first in sequence precludes any possibility that the second could be successfully applied. Methods for the elimination of such preclusion are the subject of chapters k and 5* Each group is designed to contain all possible correct alter- natives at the point reached in the parse upon entry to the group. Thus an error in the input stream would cause entry to some group in which no production would apply and control would go to the first pro- duction after the group. For this reason an error production is in- serted at the end of each group. 3.5 Summary and Example The algorithm consists of the following five steps: 1. Read the CF grammar of L and add the special 0-th production Z «* #S # ; 2. Determine the labels of the FPL groups re- quired (3.1); 3. Generate descriptor sets for each group (3*2); h. Map the descrptors into FPL productions and create the special productions START and NT (0,0); 5. Do the necessary preclusion elimination and add the FPL error production to each group. 16 The following example has been chosen to contain one pre- clusion and illustrate rules one through four. It is assumed in this example that each group is followed by an error production and k has a value of one. 3.5-1 Input Syntax CF Production Number LHS Z 1 S 2 A 3 A k B 5 B 6 D Position 1 2 3 # s # A B c a D b B b d 3.5.2 Labels (Groups) Needed TH - S NT (0,2) T (5,2) TH - B NT. (1,2 ) START NTH - S T (0,3) NT (0,0) NTH - B T (1,3) 3.5.3 Descriptor Sets D TH-S D TH-B D. NTH-S D. NTH-B {(2,1), (6,1)} ((^1)3 {(1,1), (3,1)} { } 17 D. D D. D NT(0,2) ^(1,2) T(0,3) T(l,3) T(5,2) START D. NT(0,0) {(0,2)} {(1,2), (5,1)} ((0,3)} {(1,3)} {(5,2)} special special 3-5.4 Resulting FPL Production Set START: N Vo) TH-S: TH-B: NTH-S: NTH-B: WT (0,2) : NT (1,2) T (0,3) (1,3) ? (5,2) I - 2 (0,0), <- S (0,2) b| a| dI #1 -> D B •♦ A l (o) (0) ! (o) (1,2) (0) -> s '(1) (1) \o) * TH-S EXIT DNT-A DNT-D DNT-B * TH-B DNT-A * t (0,3) * "p -x- t (1,3) (5,2) preclusion DNT-Z DNT-S DNT-B 18 k. COMBINATION OF FPL PRODUCTIONS Efficient elimination of preclusions is the key to the prac- tical success of this conversion algorithm since the class of grammars which do not generate preclusions is extremely small. A left recursive occurrence, for example, of any nonterminal symbol causes a preclusion in all NT (n, n) groups for that symbol.. It is, in fact, the failure to eliminate a preclusion which defines a CF grammar as unacceptable to the conversion algorithm. Thus, the more effective the preclusion elimination, the larger will be the class of convertible grammars. There are two more or less independent methods of eliminat- ing preclusion. The first of these is the combination of two or more conflicting FPL productions into a single production. The second is the generation of strings of lookahead symbols for each of the con- flicting productions. The order in which these two methods are applied makes almost no difference. Due, at least in part, to the details of our particular implementation, it was found that the most effective and efficient approach is to try differentiation by one symbol look- ahead first. If this failed, combination is attempted. As a last resort, a k symbol lookahead is attempted. One feature of the conversion algorithm, used to eliminate preclusion, is the combination of two or more FPL productions in a group into a single production. Two (or more) productions are com- bined when the first in sequence precludes the second (the rest). Each production which is formed by such a combination, creates the need for one or more additional groups. As will be seen, these new 19 group types are, in each case, unions of those groups which would otherwise be transferred to either directly or indirectly (through a later application of the dynamic DNT transfer) by the simple produc- tions being combined. k.l CT(m) Groups The simplest combination possible is of two or more FPL pro- ductions formed by mapping rule 3«3«2 (when the next EHS symbols are terminals). If, within a group, the productions X | | * T 0^, n ± ) X I * T ( v \ } should appear, and any preceding attempt at differentiation by look- ahead has failed, then they can be combined into the single production X | I * CT(m) th where this is the m combination to have been made. Associated with the integer m is the set of labels L = {T(it., n.) | i = 1,2,..., k} U.l.l 20 The descriptor set D_ m , % is then defined by CT(m) Dpm/ \ = U D . U.1.2 CT(m) q 0. e L m Another, more intuitive, definition of D_ m/ x is CT(mj D CT(m) = ^ lCf n ^ ' (*' n_1 ^ is descrl P tor of one of the productions being combined}. Group CT(m) is, thus, precisely the union of the various groups which could have been transferred to, had the combination not been made. Some of the productions of the CT(m) group could have the same parser symbolo In fact, if lookahead has been previously attempted prior to the use of combination, a CT(m) group will always be of the form CT(m): t | ... t I As is easily seen, the preclusion has not been eliminated, but simply moved to a different group. This group, however, is entered at a point one symbol farther into the input stream and therefore, the maximum length lookahead string reaches one symbol farther into the input. The net effect of the combination in this case, is the implicit one symbol extension of the maximum lookahead capability of the conversion algorithm. 21 k.2 Marker Symbol Combination The second class of combinations is of two or more FPL pro- ductions formed by mapping rule 3»3«1 (when the next RHS symbols are nonterminals). This is referred to as a class of combinations because there are two different types of combinations which can be made in this circumstance. These two types of combination can be made in any order and., in fact, are made in a relatively random sequence in the actual implementation of the algorithm. It is, however, simpler to describe and easier to understand if it is assumed that all possible combinations of the first type are made before any combinations of the second type. The first type of combination is possible when the marker symbols (corresponding to the next RHS symbols) to be pushed into the marker stack are for different occurrences of the same nonterminal symbol. If, within a group, the productions X I ♦■ A k, n,) I * TH-A (*-,_> n x ) X i <- A (v V I * TH ~ A should appear, and any preceding attempt at differentiation by look- ahead has failed, then they can be combined into the single production X | «- A* (m) | * TH-A th where this is the m combination to have been made. The symbol A* (m) 22 will henceforth be referred to as a special marker symbol (as opposed to a simple marker symbol A (it, n)). The possibility that the symbol at the top of the marker stack is a special marker symbol requires an extension of the definition of the dynamic transfer DNT (3«3«^)« It now is, if the top marker symbol is B (jt, n) then k.2.1 if A = B , DNT-A = NT (it, n) else if A * B , DNT-A = NTH-B or if the top marker symbol is B* (m) then if A = B , DNT-A = CNT (m) else if A * B , DNT-A = NTH-B. As can be seen, the label of the group which must be built whenever a combination of this type is made, is CNT (m). The symbol A* (m) actually represents the set of marker symbols which could be pushed by the individual FPL productions which were combined. That is, A* (m) = {A (« 1 , n 1 ) | i = 1,2,..., k} The descriptor set D mTm/ \ is then defined by CNT(m) D CNT(m) = tU, n) | X (*', n') e X* (m) and 1+.2.2 (it, n) = (it', n ') or (it, n) describes a left recursive occurrence of X}. ; 23 As can be seen by comparing this definition of D„ Tm , A "with the CNT (m) definition of D / >, (3-2. 3); this type of group is simply the union of those groups which would have been needed had no combination been made. The second possible combination of FPL productions formed by mapping rule 3* 3*1 is of those productions which cause marker symbols (simple or special) of different nonterminal symbols to be pushed into the marker stack. This cannot be done under a certain condition, but it will be shown later in this chapter how this condition can be elim- inated. As was mentioned earlier, it is assumed that all combinations which create special marker symbols, have been made. If, within a group, the productions X | «- Q x J * TH-A 1 X i «- Q * TH-. should appear with Q. = A. (jr., n.) or Q,. = At (m.)> and any preceding attempt at differentiation by lookahead has failed, then they can be combined into the single production X | *- (m) | * CTH (m) where this is the m conbination to have been made. The symbol (m) is called a combined marker symbol and represents the set of marker 2k symbols which could have been pushed into the marker stack by the various productions which were combined. That is: (m) = {Q^ \, • • -> \)- The sets represented by the combined marker symbols must be available when the FPL production set is used to parse an input string because they are used in the dynamic transfer DNT. The definition of DNT is extended to read if the top marker symbol is B (it, n) then k.2.3 if A = B, DNT-A = NT (it, n) else if A £ B, DNT-A = NTH-B or if the top marker symbol is B* (m) then if A = B, DNT-A = CNT (m) else if A * B, DNT-A = NTH-B or if the top marker symbol is (m) then if A (it, n) € (in), DNT-A = NT (it, n) else if A* (m') e (in), DNT-A = CNT (m') else if A (it, n) jk (in) and A* (m ) jk (in), DNT-A = CNTH (m) . The production formed by a combination of this type indicates a transfer to a group labelled CTH (m) . This type of group is necessary whenever such a combination is made. Its descriptor set D„ mT / \ is CTH(m) indirectly defined by (m) as follows: 25 v ' A(it, n) e (m) A* (m')' e (in) The dynamic transfer definition indicates a potential transfer to a group labelled CNTH/ \. This type of group is also necessary whenever such a combination is made. Its descriptor set D_ Tmtr / > is indirectly CNTH(m) defined by (m) as follows: D CNTH(m) " Tf U , ,-, D NTH-A ^* 2 * 5 v ' A(it, nj e (m) A*(m') e (m) ^•3 Initial CF Grammar Revision Section U.1.2 made reference to a condition which prevented the combination of productions created by mapping rule 3«3«1 when the marker symbols (simple or special to be pushed are for difference non- terminal symbols. This occurs when the nonterminal symbols associated with the marker symbols to be pushed into the marker stack are such that one is a head symbol of another. That is, when A (it., n.) (or A* (m.)) and B (it . , n.) (or B* (m . ) ) would be elements of (m), A t B and A is a head symbol of B. When (m) was atop the marker stack, this would always cause DNT-A to be equal NT (it., n.) (or CNT (m.)). This ignores completely the possibility that the particular A just recognized is the beginning of a derivative of B (in which case, DNT-A should equal CNTH (m)). 26 This problem can be overcome by a relatively simple revision of the CF grammar before any CF to FPL conversion is done. This in- volves a search of all the CF productions for pairs of the form C -» a B . . . D -> a x p or C -» A' 7 B . . . D ■* a A 7 X p or C -> a A 7 B . . . D -> A' 7 X where C may or may not be the same as D, Oi is a nonempty string, p and 7 are possibly empty strings, A' is a left recursive occurrence of A and X e V is a head symbol of B. For each such pair, Xp is replaced by a new nonterminal symbol F, and the production F - X p is added to the set of CF productions. This revision of the CF grammar does not change the language being specified but does keep the pre- viously described condition from occurring and preventing the com- bination of productions when necessary. It can be noted at this point that any FPL productions formed by mapping rule 3«3»1 may be combined and that any FPL produc- 27 tions formed by mapping rule 3-3-2 may be combined. It is not possible, however, to combine two productions when one is formed by rule 3- 3*1 and the other by rule 3- 3-2. The possibility that such a pair of pro- ductions might appear within a group and require combination can be completely eliminated by one of two simple extensions of the CF gram- mar revision described above. The first case is when a differentiation of precluding FPL productions by lookahead is to be attempted before any conbination is attempted. In this case the CF grammar revision is the same except that it is also applied when X is a terminal symbol. When no attempt at lookahead differentiation is to precede possible combination attempts, the CF grammar revision must be further extended to include all those cases where X is a terminal symbol, even if it is not a head symbol of B. Thus, the only case where combination of FPL productions cannot be used to overcome preclusion is when one or more of the FPL productions involved has been generated by mapping rule 3- 3- 3 (i.e. when the end of RHS has been reached and a reduction is necessary) . Intuitively, this is explained by the fact that combination of FPL pro- ductions is simply a deferring of decision making which cannot be done when one of the possible decisions is to perform a specific parsing activity (i.e. reduction of a definition to its LHS). k.k Summary and Example The basic conversion algorithm (chapter 3) is now ex- tended to include an initial CF grammar revision. In addition, pre- clusions which have not been resolved by a preceding application of lookahead and do not involve an FPL production which is generated by 28 mapping rule 3« 3- 3 may "be eliminated "by combining the FPL productions involved. Each such combination creates the need for one or more new groups. These new groups are, in each case, exactly the union of appropriate groups which would have been necessary had no combination been made. These combination procedures use the new special marker (A* (m) ) and combined marker ((m)) symbols and the definition of the dynamic transfer, DNT, is extended (k.2.3) to include the cases where these new marker symbols are at the top of the. marker stack. The example which follows has been chosen to illustrate the ideas of this chapter. Some segments of the resulting FPL production set have been omitted since they do not illustrate material from this chapter. k.k.l Input Syntax CF Production Number LHS Po£ 1 ; it ion 2 3 2 -> # S # 1 s -» a A f 2 s -» a B c 3 s -» a d d k s -* a B d 5 s -> b c 6 s -> b d 7 B -> A b 8 B -> b 9 A ->■ a k.k.2 Grammar Revisions Production 1 becomes 29 1 S -> a D and 10 D, A f is added. Production 3 becomes 3 S 11 D r a D p and d d is added. k.k.3 FPL Production Subset TH-S: a b (2) * CTH (2) * CT (3) CTH (2) b a d B(0) A(0) DNT-B DNT-A T (11,2) CNTH (2) * CT (k) CNT (1): B I * CT (5) 30 CT (3) CT (k) CT (5) c - s(o) DNT-S d - S(0) DNT-S b -> B(0) DNT-B f -> D 1 (0) DNT-D c - S(l) DNT-S d -> 8(1) DNT-S U.^-.U Combinations B* (1) { B (2,2) , B (k,2)} (2) { D 1 (1,2) , B* (1) , D 2 (3,2)} { T (5,2) , T (6,2)) { T (7,2) , T (10,2)} { T (1,3) , T (3,3)} 31 5. LOOKAHEAD CONSTRUCTION As was pointed out in chapter k, there are two relatively- independent methods of preclusion elimination. The first of these, combination of FPL productions, was described in chapter k. The second is explicit use of the next symbols in the input stream to see if they correspond to one of a set of strings of symbols which could follow if a particular FPL production should apply. If the parser symbols of two or more FPL productions within a group are identical, then a preclusion exists and combination of pro- ductions or lookahead must be used to differentiate. It will be assumed throughout this chapter that all possible combinations have been made. When combinations have not already been made, lookahead construction is a simple case of the general method to be described. The method of lookahead construction is most easily described by dis- cussing explicitly the implementation used. The first production in the sequence of productions involved in the preclusion is paired, in turn, with each of those productions which it precludes. After the lookahead for this first production has been completely constructed (and assuming success in differentiation), it is, in effect, removed from the set of productions involved in the preclusion, thus reducing the set by one. This new smaller set is then processed by pairing its first production with all those which it precludes. This process continues until only one production remains. This last production precludes nothing and, therefore, needs no look- ahead. This approach guarantees that each production has only as much lookahead as it needs to avoid precluding those productions which fol- low it in the group. 32 Two identical lookahead construction buffers are used; one for each of the FPL production in a pair. Each buffer is initialized when it is first assigned to a production. For a given set of pre- cluding productions, the first buffer is thus initialized once from the first production, and the second buffer is reinitialized each time this first production is paired with another production from among those which it precludes. Assuming the productions involved in the preclu- sion are numbered in sequence from one to n, the flow chart of Figure 1 describes this pairing and initialization procedure. Each line of these two buffers has two words of control in- formation plus k additional words, where k is the length of the look- ahead presently being generated. Associated with each FPL production is a set of descriptors, one for each of the productions combined to form this resulting FPL production. One line, starting with line one, is initialized for each such descriptor, (it, n) , when a lookahead buffer is initialized from an FPL production. This is done as follows: word 1: the nonterminal which is the LHS of the CF production numbered jt word 2: zero words 3 thru x: RHS symbols n + 1 through n + y from CF production jt words x + 1 thru k + 2: zero 33 co B • ' « K P£ H 3qh§ Ph S3 o gHH o H « o <2 • x] M Q ?H woo <^ co q « ^ P ^ PL, pq 03 ■■d cvi a B c D 2 A -> b B D f 3 B ->• x c 1+ B ->• X 5 D -+ d D 6 D -> d h3 is not SLR (k) for any k since both the x in production three and the x in production four can he preceded by either a or h. Both of these x's can be followed by cd . Thus, using only lookahead or only look- back one cannot decide whether x, when found (and followed by cd ), should be reduced to B. This grammar is, however, LR (l, 2) since x in the context (a, cd) must be from production four and should be reduced to B while x in the context (b, cd) must from production three. The CF grammar 1 A -> a B 2 A -> b D 3 B -»• c B k B -> d 5 D -> c D 6 D -» d is not LR (m, n) for any finite m since the d in the input string a c d or b c d must be reduced to B (production four) or D (production six) depending on whether the first symbol of the string, which is m + 1 symbols back, is a or b. This grammar is, however, SLR(O) by definition. An intuitive feeling for what is and what is not included in the SLR (k) class of CF grammars is difficult to obtain. That SLR (k) grammars are not all LR (k) grammars is clear from the first example above. It is the feeling of this author that the class of SLR (k) grammars includes almost all LR (k) grammars which might reasonably be expected from human specification of meaningful languages. This is based on the difficulty of generating the somewhat contrived example kk above and, more significantly, on the fact that extensive use of the CF to FPL conversion algorithm, which will he shown to accept all SLR (k) grammars, has uncovered no example of an LR (k) grammar which is not SLR (k). SLR (k) grammars do include the "standard" syntax for expressions, block structures and other commonly occurring things of this nature. The remainder of the chapter is a proof of the fact that the CF to FPL conversion algorithm of Chapters Three through Five can successfully convert all SLR (k) grammars. It is clear from the definitions of this chapter and the description of the algorithm that what needs proof is the adequacy of the investigation of the preceding or following context. 6.2 Explicit Lookahead Adequacy This section will show that the lookahead construction method described in Chapter Five, generates all and only those terminal symbol strings which could possibly be next in the input stream if a particu- lar CF RHS occurrence is being recognized and a particular FPL produc- tion is therefore to be the next step in a parse. Some additional notation must be introduced at this point. For each descriptor (at. 9 n. ) there are the following: q. is the number of RHS symbols of CF production jr. j a. is the string of symbols with descriptors (jc ± , n i + 1), U ± , n. + 2),..., (x ± , c^) A. is the LHS symbol of CF production jr.. U5 In addition, an existing string y will, in some cases, be extended by concatenating it with some particular a.. The resulting string, 7 a. will be referred to as p.. It should be noted that a. and p. may, in general, be empty strings. The maximum length of the strings to be generated is assumed to be k. A single FPL production corresponds exactly to a set of RHS occurrences of a specific symbol in the set of CF productions. This set consists of only one RHS occurrence if the FPL production is not the result of combination. What must be done, then, is to show that the lookahead construction accurately generates those terminal strings which may follow the appropriate RHS occurrences. The first step in generating lookahead is to initialize an appropriate number of strings. Assuming the set of descriptors for the FPL production P in question is S = { (V, n.) I i = 1,2,... , m ) the strings which must obviously follow immediately are precisely en. for i = 1,2,..., m. A review of the buffer initialization as described in chapter 5 shows that these are precisely the strings initially generated. Two other steps must be taken. The first is to extend those strings which are not of sufficient length. This is done by the pro- cedure GETSIM. The second is to change the mixed strings (i.e., containing both terminal and nonterminal symbols) into purely terminal strings. This is done by the procedure NT2T. The sequence in which k6 these two procedures are applied, the comparison of generated terminal strings with those strings which might follow precluded FPL productions and the methods which minimize string length can vary greatly depending on implementation and will not be dwelt upon here. Those readers in- terested in understanding and verifying the specific implementation used are invited to study the flow charts of Figures 1 and 2 in chapter 5. Of the two steps yet to be described, the conversion to ter- minal strings is the simpler and will be discussed first. A specific occurrence of a nonterminal symbol in a lookahead string may be elim- inated by generating a new string for each definition of that nonterminal with the occurrence replaced by its definition. The original string is then eliminated from the set of lookahead strings. This process must be repeated for all remaining nonterminal symbols in the lookahead strings and for any new nonterminals introduced by the replacement. Unfortunately, left recursion in the CF grammar could cause this iterative replacement to be a never ending process. If, however, each new string thus created, is compared to all other lookahead strings and, more importantly, to those strings already processed and eliminated (but saved for this purpose) and ignored when found to be identical to one of these previously existing strings, the process must end. This is because the finiteness of k and of the alphabet (V = V^ Vj) guarantees that there can be only a finite number of strings. The method Just described guarantees the accurate conversion of mixed strings to terminal strings. A review of the procedure NT2T and its use as described in chapter 5 shows that this is precisely the method used in the lookahead construction of that chapter. ^7 The most important step in generating the proper set of terminal symbol lookahead strings is the proper extension of those initial (or partially extended) strings which are insufficient in length. In general, it is more efficient to do all possible nonterminal symbol elimination prior to any string extension since this process can and usually does cause some string extension. In extending strings it is important to recognize that each initial string which requires extension is the completion of some EHS. In particular, it comes from the CF production numbered it.. Thus what can follow at this point is exactly what can follow A. , the nonterminal which is defined by production jt.. Assuming that A. has RHS occurrences at (jt ., n .) for j = 1,2,..., p, the single string OL. must be replaced by the set of strings CC. a. = p.. This process must be repeated for each initial OL.. Each lookahead string is now p. for some j. J Some of these p. will still be of insufficient length. In J fact, in those cases where n. = q., the initial OL. =6. and the string has not been extended at all. Each 8 . is the end of the RHS of the D production numbered jt.. It can be extended by adding all those strings J which follow RHS occurrences of the nonterminal symbol A. in the same manner as was described above for adding those symbols which followed RHS occurrences of A. . This process is then repeated until all strings are of sufficient length. This process must eventually end for the same reason that the nonterminal to terminal conversion process must end. It must be noted, however, that two identical strings, p. and p. , are not considered the same unless j' = j p . This is true for both the conversion and the extension processes. U8 me method described above accurately extends lootahead strings which are of insufficient length, The procedure GETSTM of cha pter 5 does not, however, operate in precisely this manner. It h as two differences which increase the efficiency of extension without affecting the accuracy. The first of these differences minimizes the importance of j for each p . It must he remembered that each i corresponds to a par- ticular {L n.) which, in turn, corresponds to a particular A.. A re view of the procedure will show that once a string becomes p. through addition of a., the only importance of i is that it determines A.. Thus the procedure GET5M keeps trach of neither 3 nor (*., n.), but -.+»= ,Hth each string, the appropriate A . This neces- rather associates with eacn bi^ms, * j ~p in-fn-rmation since the descriptors (x., n ) are sitates less saving of information since 3 3 no longer necessary. It also increases the number of cases where newly created strings can be ignored since two identical strings P ^ and p. may he considered the same even if i ± t jg *en A^ = A^. 32 Tne second thing which GETSYM does differently is to use tail syffi bol relationships to eliminate those iterations of the extension procedure which fail to extend the string (i.e., when n. - q . for some 3). Assume that the following is a subset of the CE productions: a -> . . . A 2 A 2 ■* ••* ^3 a -> ... A n-1 n ^9 and A is the nonterminal whose definition (RHS) is ended by the string n to he extended. With the method described above, at least n iterations •would be required since one of the new strings generated by each attempt would not extend the string but simply change the nonterminal symbol associated with it (i.e.. A. to A. , for i = 2,..., n). Clearly the 1 l-l ultimate result is to extend the string with those symbols which follow A. for i = 1,..., n. It is also clear that A is a tail symbol of each A. for i = 1,..., n (see chapter 2). GETSYM makes use of this fact by extending the string on the first iteration with those symbols which follow any occurrence of any nonterminal symbol B which has A as a tail symbol (in this case all A. for i = 1,..., n) as opposed to any occurrence of A only. This allows GETSYM to ignore occurrences of B which end a RHS (i.e., do not in fact extend the string). This obviously accomplishes the same thing as the method originally described but in a more efficient manner. The only parsing decision which must be made when a set of FPL productions is used as a parsing algorithm is to decide which pro- duction applies. It has now been shown that when this decision can be made by lookahead, it is correctly made by the set of FPL productions generated by the CF to FPL conversion algorithm of chapters 3 through 5- 6.3 Implicit Lookback Adequacy It remains to be shown that all parsing decisions based on that part of the input already partially parsed will be correctly made by a set of FPL productions as generated by this conversion algorithm. 50 Tnis is a somewhat more complex situation since the lookback is not explicit, but is implicit in the grouping of the FPL productions. The approach to be used will be to show that all groups contain all and only those productions which could possibly apply upon transfer to the group. men this is complete, the adequacy of the lookback will have been proved because the exact accuracy of the groups implies that there is at least one particular partially parsed string which could precede the application of any and all the productions in the group. The two special groups, START and NT (0,0), are simple mechanical devices for beginning and ending the parse. A review of the contents of these groups (section 3-3) will verify that they contain all and only the appropriate productions. The remainder of the groups fall into eight distinct types (T, CT, TH, CTH, NTH, CMH, NT and CNT) and the proof, therefore, has the following eight distinct cases: case 1: T (*, n) groups The FPL production which generates the transfer to the T (*, n) group has descriptor («, n-l) (3-3.2) and applies only when symbol n-l on the RHS of the CF production numbered * has just been recognized. The transfer is to T («, n) if, and only if, symbol n of CF production * is a terminal symbol. Thus the T («, n) group must attempt to recognize exactly and only that terminal symbol. The + n is bv definition, the single descriptor (*, n) descriptor set D T ( n ) is, °y oexxu , (3.2.10 which maps into the FPL production which tests for the appro- priate terminal symbol and then takes action based on the symbol which ma y or may not appear at position n + 1 of CF production «. Thus, the 51 T (jt ; n) group contains exactly the single FPL production which could apply upon transfer to the group. case 2: CT (m) groups The FPL production which generates the transfer to the CT (m) group is a combination of FPL productions of the form which generate transfers to T (:rt, n) groups (section U.l). Thus, the CT (m) group must contain exactly those productions which would have made up the appro- priate T (jt, n) groups. The integer m has associated with it precisely this set of labels, L (Ij-.l.l). The descriptor set D„ m , ^ is defined ' m CT(mJ (1+.1.2) as the union of the descriptor sets of these appropriate T (it, n) groups. Thus, the CT (m) group contains almost by definition, exactly those productions which could apply upon transfer to the group. case 3* TH-A groups Transfer is made to a TH-A group of FPL productions upon recognition of the fact that some derivative of nonterminal A must follow if the input is correct (3* 3*1 a nd section k.2). Since the symbols which come next are from the input stream, they must be terminal symbols. The FPL productions of the TH-A group must test for all and only those occurrences of terminal symbols which may begin some deriv- ative of A. The definition of descriptor set D^ (3.2.1) explicitly ill -A specifies that all the elements of D^ must refer only to terminal In -A symbols which are first symbols on KHS's of CF productions. Thus, the FPL productions of TH-A cannot be testing for nonterminal symbols 52 or occurrences of terminal symbols which do not begin derivatives. For an occurrence of terminal symbol t to begin a derivative of A, , the following relationship must exist -in the set of CF productions: A 2 *1 - Ap A~ • • . A . -> A ... n-1 n A -» t . . . n A comparison of this relationship with the definitions of D^ . and of ill -A head symbols (chapter 2) makes it obvious that the descriptors of D,-, . refer to all and only those occurrence of terminal symbols which In -A may begin derivatives of A. The descriptor to FPL production mapping rules (section 3*3) then guarantee that group TH-A contains all and only the appropriate FPL productions. case k: CTH (m) groups As was the case in the CT (jt, n) type groups, the transfer to a CTH (m) group is made by a FPL production which is the result of combining two or more FPL productions of a single type, namely, those which would initiate recognition of nonterminal symbols by transfer to TH-A groups. Thus, a CTH (m) group must contain exactly those pro- ductions which would be in the various TH-A groups. Each uncombined FPL production which causes transfer to a TH-A group also creates and pushes into the marker stack the symbol 53 A (re, n) or A* (m). Thus, a set of such productions must create a set of such marker symbols, A. (), A^ (),♦.., A () and cause transfer to TH-A^ TH-A p , . .., TH-A . The symbol (m) is created by a combination of productions of this type and is defined to be this set of simple marker symbols (section ^.2). Thus, if A. (jt, n) e (m), then all the productions of TH-A. must be in the CTH (m) and no productions other than those "which are in some appropriate TH-A. group should be in CTH (m). It is clear from the definition of the descriptor set D. mT / s CTH (mj (k.2.k) that this is the case. case 5 : NTH-A groups An NTH-A group is reached by a dynamic transfer DNT-B when there has just been a reduction to the nonterminal B and the top symbol on the marker stack is A (jt, n) or A* (m). The marker for A was pushed when the search for an A began. If the A has been completely recognized, processing of the appropriate CF production EHS has continued. If there are other nonterminal symbols following the A, then the corresponding markers would be pushed onto the marker stack. When the end of the CF production has been recognized, then all marker symbols for nonterminal symbols on its RHS are popped from the marker stack. Thus, the existence of a marker symbol for A at the top of the marker stack means that the B just recognized must be part of some derivation of A. The B must be the first symbol of the RHS of some CF produc- tion or else the beginning of the search for it would have caused a marker for it to be pushed on top of the A marker. The LHS of the CF production, C, which begins with B must be a head symbol of A. That this must be true is seen by the definition of head symbol and the fact 5h that B must "be part of some derivative of A. If this were not true, the following relationship would have to exist: A ■* A 1 . . . A ■* A 2 ... A , •* • • • A n-1 n A -* A ^ .. n n+1 A J ■* C ... n+m C ■* B ... in which case a marker for A would be pushed onto the marker stack above the A marker which is, in fact, found at the top. Thus, when a nonterminal symbol has just been recognized which does not match the simple or special marker which is at the top of the marker stack, it must be the first RHS symbol of a CF production whose LHS is a head symbol of the nonterminal whose marker is atop the marker stack. By definition, this is precisely the set of symbols referred to by the descriptors of MH-A (3.2.2). 55 case 6: CNTH (m) groups The transfer to a CNTH (m) group is made when a nonterminal B has just been recognized and put in the parser symbol and the set (m) of marker symbols does not contain B (rt, n) or B* (n). Using the same arguments as those of case 5.? it is clear that the B just recognized must be the first RHS symbol in a CF production whose IHS is a head symbol of a nonterminal symbol, the search for which was initiated when (m) was pushed onto the marker stack. The (m) represents a set of simple and special marker symbols corresponding to set of nonterminal symbols, A,, A p ,..., A , (section k.2) which could come next at the time the (m) was pushed onto the stack. Thus, the set of FPL productions which could possibly apply is precisely the union of the FPL productions of the various NTH-A. . This is equivalent to saying that the descriptor set D^™,, \ must equal the union of descriptor sets D. TmrT . for all A. such that A. (it, n) € (m) NTH-A. i l ' ■ x _ _ l or A* (n) e (m) and this is precisely the definition of D / % (U.2.5). Thus, CNTH (m) contains all and only those FFL productions which could apply on transfer to the group. case 7: NT (it, n) groups Transfer is made to an NT (re, n) group when a nonterminal symbol A has just been recognized and the top symbol on the marker stack is A (re, n) or (m) where A (tc, n) e (m). If (m) is at the top and A (it, n) e (m), we must be at some stage in recognizing the A which th is the n symbol on the RHS of CF production tc. It cannot be part of the derivative of some other nonterminal symbol B where B (re. , n ) € (m) or B* (m ) € (m). This is due to the insertion of dummy symbols as % described in section U.3« As seen in case 5 > if the A just recognized were part of a derivation of B, it would have to he a head symbol of B. The case where A is a head symbol of Band the two FPL productions with descriptors (it, n-l) and (it , n_-l) could be combined is precisely the case where A and all symbols following it in CF production jt are replaced by a dummy nonterminal symbol D and A (7t,n) would not be an element of (m) . Thus, in either the A (ji,n) or (m) case, the A just found must be either the A corresponding to (it, n) or some part of a derivative of that particular A. th If the A just found is not the one corresponding to the n symbol of CF production jt, then it must by the reasoning of case 5 be a beginning of some derivative of A (i.e., a head symbol of A). Thus, by definition it must be a left recursive occurrence of A. The NT (it, n) group must, therefore, contain exactly the (it, n) occurrence of A and all left recursive occurrences of A. The descriptor set L\ Tm/ >, Con- NT(it, n) tains by definition (3*2.3) precisely the descriptors for these occur- rences. case 8: CNT (m) groups A CNT (m) group is transferred to, when a nonterminal symbol A has just been recognized and the top symbol on the marker stack is A* (m) or (k) where A* (m) € (k). All of the arguments of case 7 apply again with the single (jt, n) occurrence of A replaced by a set S = {(jr., n.) | A (i, n.) 6 A* (m)} of CF occurrences of A. The descriptor set D^, Tm/ >. must contain exactly all CNT(mJ (jr., n.) e S and all descriptors for left recursive occurrences of A. l i 57 Th is is precisely the 11311011 of all D. Tm / \ ((it., n.) e S). But this in turn is exactly the definition of D^- Tm/ \ (h.2.2). CNT(m) 6.k Conclusion It has "been shown that each group of FPL productions contains exactly those productions which could possibly apply based on what has preceded the transfer to the group. Therefore, no parsing decision can be directly affected by further investigation of the partially parsed preceding string. It is also clear that all possible use has been made of the next k input symbols through the explicit search (when necessary) for all k symbol strings which could follow if a particular FPL production could be applied. 58 7- RELATIVE EFFECTIVENESS The second step in evaluating the effectiveness of the conver- sion algorithm is to compare its results with other recently developed methods for generating parsing algorithms from context free grammars. This comparison will "be done on three levels. First will be the develop- ment of an expression for the upper bound, U, on the parsing time re- quired by each of the parsing algorithms generated. Second will be an evaluation of the effect, on these expressions of replacing the upper limit variables with variables representing averages. The third approach is a subjective analysis which attempts to point up the strengths and weaknesses of each algorithm relative to the others. Two of the methods to be compared are those of DeRemer [15 ] and Korenjak [l6] because they are relatively recent and the parsing algorithms which result from these methods operate by applying the same basic operations as the result of CF to FPL conversion. The classic LR (k) recognition algorithm of Knuth [lU] will also be compared since it is the most widely known one which also uses these same basic opera- tions. Lastly, the method of Earley [17 ] will be compared since it is the most widely known among recent attempts at automatic recognition generators. 7-1 Upper Bounds The algorithms of DeRemer, Korenjak and Knuth and the CF to FPL conversion result all operate by applying various sequences of what will be referred to as basic operations. These operations are: 59 1. test for a given symbol 2. put a symbol into stack or word 3. increment stack pointer k. decrement stack pointer (remove symbols from stack) 5. scan symbol from input 6. transfer control These operations correspond roughly to basic machine operations and will thus be treated as being of equal complexity. The variables used in the upper bound expressions are defined as follows for an arbitrary grammar G and input stream S: t = number of different terminal symbols in G n = number of different nonterminal symbols in G p = number of productions in G 1 = number of symbols on the RHS of the longest production in G k = lookahead length used m = number of symbols in S r = number of reductions in the longest possible sequence of reductions without an intervening scan from S (a function of G and S) q = maximum number of occurrences of a single nonterminal in the EHS's of productions of G. 60 The upper bound expressions for all except the algorithm of Ear ley are calculated by assuming that for each symbol scanned from S, the longest possible sequence of reductions must be made. It is also assumed that when comparisons of any sort are necessary, the maximum number of alternatives are tested sequentially with the last possibility being correctly matched. 7.1.1 FPL Production Set This section shows in some detail the calculation of the upper bound, U , on the number of basic operations necessary to parse input S with the parsing algorithm generated by the CF to FPL conver- sion applied to grammar G. For each symbol of S there must be a scan from the input stream, a placement of the symbol into the parser word, a transfer to a TH, CTH, T or CT group, and a sequence of reductions. The upper bound is then U = m (3 + Y + (r-2) X + Z) where Y represents the number of operations associated with the first reduction, Z the number of operations associated with the last reduction and X the maximum number of operations associated with each of the intervening reductions. The first set of operations contributing to Y is a test of the parser word for the appropriate terminal symbol of which there are t possibilities. Then all possible lookahead strings must be compared symbol by symbol which involves kt comparisons. This is followed by 61 placing the new nonterminal Into the parser word, popping the marker stack, comparing it to the top marker word and transferring to the appropriate NTH group (always the worst case). Thus, the maximum is Y = t + kt k + k In the NTH group all "but one of the nonterminals is a possibility (the symbol on the top of the marker stack is not), neces- sitating n - 1 comparisons. There are kt lookahead operations as before. The marker stack is not popped in this case since the reduction being made is of only one symbol (otherwise we would be in a NT as opposed to NTH group), but the other three concluding operations "apply. Thus, the maximum is X=n-l + kt k + 3 The calculation of Z begins with the same parser word and lookahead operations. This may be followed by adding to the marker stack and incrementing its pointer at which time the next scan would follow. Thus, the maximum is Z=n-l+kt k +2 Substituting for X, Y and Z and simplifying gives V U = m (rkt + rn + 2r + t - n + k) 62 which represents the maximum number of basic operations necessary for the result of the CF to FPL conversion of G to parse S. 7.1.2 Comparison The parsing algorithms generated by Korenjak have the same upperbounds as those generated by Knuth since Kornejak differs from Knuth only in ways which do not affect the number of operations neces- sary for parsing. The upper bound for DeRemer, tL, and the upper bound for Knuth and Korenjak, U , were calculated in the same manner as U was calculated. That is, a thorough step by step analysis of the maximum number of basic operations necessary between scans. The upper bound on algorithms generated by Earley, XL,, is not iii calculated in this manner since they do not operate by executing the same basic operations. That upper bound is taken directly from the work done by Earley himself. XL, does not represent basic operations and this author believes that it would be necessary to multiply the expres- sion by a constant factor of at least five to make it comparable. The other three upper bounds, XJ , XL. and U , are, however, attempts to at least approach the least upper bound in each case. This is not true of XL, which is calculated by Earley only to show a proportionality to m, Hi the number of symbols in the input stream S. The assumption is then, that these two factors tend to cancel and that XL, is roughly comparable to the other three. The upper bounds are : U = m (rkt + 2r + t + k) + m (rn - n) 63 U D = m (rkt k + 2r + t + k) + m (kt k + rq + hv + 2) U = m (r'kt + 2r + t + k) + m (kt + rn + rt +3r + n + t +2) 3 ,2 ,2k U E = m p J 1 t The major differences in these expressions will be explained in section 7«3j during the subjective analysis of the methods and their properties. 7*2 Average Times The average values of most of the variables involved are,, in fact, almost impossible to ascertain since they depend to such a great degree on specific grammars and input streams. This section will attempt to show the effect on U , LL and U T/r of considering the realistic cases c JJ iv (with estimates based on experience). The upper bound of Earley, U„, will not be considered here since it only vaguely represents the same thing as the other three and because it is calculated in a completely different manner. The first significant change is brought about by recognizing that each scan of a new symbol from the input stream does not result in r consecutive reductions. It is estimated rather that there are approximately two reductions for each input symbol. The next change stems from recognition of the fact that when a terminal (or nonterminal) is possible, it is not necessary to assume that the last of all t (n) possibilities will apply. It is estimated from experience that the average number of possibilities in such cases is no more than three or four. The average number of tests necessary is approximately two. 6k In the CF to FPL conversion method and the method of DeRemer, lookahead is used only when necessary and then only as far as necessary. Rarely is a lookahead length of more than one required. In fact, zero is the most common. Therefore, kt will be replaced by two in U and tL. The methods of Knuth and Korenjak require the use of lookahead to its maximum length in every case. The value of kt is, therefore, extremely dependent on the specific grammar and could, in fact, reach as much as one thousand or more. For this comparison we will be more conservative and use 50 as the number of symbols which must be compared when lookahead is applied. The final change stems from recognition that not all nonter- minal symbols appear in the grammar q times as RHS symbols. It is assumed that the average number of such appearances is four, of which only half must be tested. The upper bound expressions U , U and U become approximate average time expressions T , T^ and T by making the following sub- stitutions: kt replaced by 2 in U and U_, kt replaced by 50 in U T/r and r, t, q and n replaced by 2. The time expressions calculated in this manner are T = 16 m c T D = 30 m T v = 180 m ft 65 7-3 Subjective Evaluation The CF to FPL conversion algorithm described here and the systems of DeRemer, Knuth, Korenjak and Earley use quite different notations and terminology. The discussion of these systems which fol- lows will use, as much as possible, the terminology developed to this point for the conversion method when referring to roughly equivalent activities of the resulting parsing algorithms. Again the methods of Earley must be considered independently of the others. The main reason for this is that the algorithm which he describes, and from which his upper bound is calculated, generates recognizers as opposed to parsers. That is to say, his method determines whether or not a given string is in a particular language without neces- sarily determining the specific sequence of reductions which would convert the string to an objective symbol. He states that the algorithm can be converted to a form which generates parsers, but it is not clear exactly what form these parsers would take. It is thus not possible to compare them to parsers generated by the other methods. A second reason that comparison of Earley' s method to the others would be unreasonable is that his goal appears to be proof of the existence of an algorithm which generates recognizers with certain properties. This is as opposed to the others (with the possible exception of Knuth) who are attempting to create practical parsing algorithm gen- erators. The basic approach is to construct, at recognition time, the equivalent of the group of FPL productions (without transfers) which could apply next. This construction is based on the immediately pre- ceding group, other preceding groups which are identified by use of a 66 mechanism similar to the marker stack and the CF grammar. It seems to this author that a parsing algorithm based on similar methods would be relatively slow and inefficient. The other four methods are similar in many respects. They do, however, have differences which are significant enough to account for the variations in the time expressions of section 7*2. The most impor- tant of these is the unnecessary use of lookahead which has already been described. Another significant difference is the manner in which Knuth (and therefore, Korenjak) separates the parsing decisions from the decisions involved in transferring to the next step. The basic parsing decision is to reduce n symbols at the top of a stack to a single non- terminal symbol or to scan another terminal symbol from the input stream into the top of the stack. This decision is based solely on k symbols of lookahead. The transfer decision, however, is based solely on the symbol at the top of the stack (in this case a single stack holds both markers and parsing symbols). This leads to a significant amount of unnecessary testing. The most obvious example of this is when a one symbol lookahead recognizes the specific symbol t which causes t to be scanned into the stack. The next step is to test the symbol at the top of the stack against many possibilities (one of which is t) to determine the next transfer address. A basic difference between the method of Knuth and the CP to FPI conversion method was described in chapter 6 as the reason that the conversion method method cannot process all LR (k) grammars. Korenjak's approach is to take an arbitrarily selected subset of the nonterminals (presumably those which occur most frequently in the grammar) and use a single parsing procedure for each of these. When successful in the 67 selection process, the result is a parsing speed the same as Knuth, but a smaller set of tables. The conversion method and DeRemer' s method attempt to do this with all nonterminals which, when successful, minimizes, to a great degree, the table sizes. As noted in chapter 6, this does, unfortunately, have the added effect of restricting the class of grammars which can be successfully processed. DeRemer does intro- duce a technique in his method which duplicates the recognition of specific occurrences of nonterminals when required, but he admits that implementation is completely impractical. Unsuccessful attempts have been made to accomplish this in an implementable way in the CF to FPL conversion algorithm. The difference between T and T is explained by three factors. The first is in the dynamic transfer. The DeRemer method tests the marker stack to determine which, of all possible uses of the nonterminal just recognized is, in fact, the one which applies. It is believed that the conversion method is superior on the average because the number of possibilities in an NTH group is no greater than the number of occurrences of a nonterminal and the marker stack is used in such a manner as to force testing of the most likely possibility first. The second increase in the number of operations required by a DeRemer generated parser is caused by the fact that a marker symbol is pushed for every new group which is entered. This means more pushes and also that a pop accompanies every reduction. The conversion algo- rithm, on the other hand, pushes marker symbols only for transfers to TH and CTH groups and has some reductions which require no pop of the marker stack. This relative weakness of the DeRemer method also applies to the Knuth and Korenjak methods. 68 The third factor is that a reduction is separated from the following dynamic transfer in the DeRemer method. This requires the execution of an apparently unnecessary transfer. There are several other such relatively minor weaknesses in the methods of DeRemer, Knuth and Korenjak which presumably could and would he eliminated in any mean- ingful attempt at implementation. 7«^ Summary This chapter has attempted to show the differences between the parsing algorithms generated by the CF to FPL conversion method described in chapters 3 through 5 and the parsing algorithms generated from CF grammars by the methods of Earley, Knuth, Korenjak and DeRemer. The method of Ear ley must, in fact, be considered separately from the others since it generates recognizers as opposed to parsers. In addition, it appears to be a proof of the existence of a method for generating recognizers with certain theoretical properties as opposed to a method which might be implemented for practical purposes. The method of Knuth is also an existence proof. It, however, would be of practical value if it could be implemented efficiently enough. The basic problem is simply the size and quantity of the tables generated. The method of Korenjak is similar to that of Knuth except that it treats an arbitrarily selected subset of the nonterminal symbols of the CF grammar as special cases. If the appropriate nonterminals are selected, this method will generate parsing algorithms for all LR (k) grammars as accepted by Knuth. 69 The conversion method and the DeRemer method eliminate this trial and error nonterminal selection "by, in effect, assuming that all the nonterminal symbols fall in this special class. This, unfortunately, has the effect of reducing the class of grammars for which parsing algorithms may be generated (see chapter 6). Section 7*1 calculates expressions for the upper bounds on the parse times of algorithms generated by the marious methods. These expressions were arrived at in such a manner as to approach as nearly as possible, the least upper bounds. Section 7*2 calculates expressions for the average parse times of the algorithms generated. In each case the parsing algorithms generated by CF to FPL conversion appeared most efficient. 70 8. THE IMPLEMENTED TWS This chapter will he a brief and incomplete description of the translator writing system (TWS) which has been implemented using the CF to FPL conversion algorithm as the basis of the syntax preprocessing phase. The emphasis is on those factors which directly affect or are directly affected by the conversion. This chapter is included so the reader can evaluate the algorithm in a practical sense and also as a first step for potential users of either the conversion algorithm or the implemented TWS. i The TWS has been implemented on a Burroughs' B5500 and is written entirely in Burroughs' version of AIC0L for this machine. Certain j of the implementation methods were developed specifically to make use of j particular features of this machine and language. 8. 1 Syntax The syntax preprocessing part of the TWS consists of three phases. The first phase is a conversion of the input grammar, written in a metalanguage called TWINKLE [20], to an internally represented set of CF productions. This metalanguage uses notation very similar to Backus-Naur Form (BNF) but extends it to include productions whose PHS's resemble regular expressions. This allows grouping of alternatives with- in a single RHS and the specification of arbitrarily long lists of symbol, without the use of recursive productions. TWINKLE also allows the re- | placement of most of the special metasymbols with English language j special words. For example, the set of CF productions j 71 : := + ; : := - ; : := ; is equivalent to the TWINKLE production AN IS DEFINED AS A LIST 0F <3>S SEPARATED BY [ + 0R - ] ; It is felt that use of English forms of TWINKLE will simplify the language designer's task of documentation. Appendix A is a listing of the English- like TWINKLE specifi- cation of the syntax of a small subset of ALG0L called DEMALGL ( DEM on- stration ALG0 L). This is converted by the TWS built TWINKLE compiler into the set of CF productions of Appendix B. The translation from TWINKLE to CF productions also includes the following: 1. Elimination of all empty RHS's. 2. Elimination of all nonterminals with only a single definition. 3- Rearrangement of the CF production table to a group the definitions of each nonterminal symbol. k. Calculation of the head symbol tables. 5- Insertion of dummy nonterminal symbols (as described in section *+.3)- 72 6. Creation of a table of LHS and RHS occurrences of all nonterminal symbols. 7- Insertion of the special production defining E. The central processor time for the TWINKLE compilation of DEMALGL on the Burrough's B5500 was 6k seconds. Appendix B is PR0TAB (CF PRfa uction TAB le) as output from the TWINKLE compiler. PR0TAB is stored internally as a one-dimensional array with a single word corresponding to each RHS symbol or semantic call. This word contains the following bit fields: one bit specifying whether or not this is a left recursive occurrence of a nonterminal, one bit specifying whether or not this is a nonterminal which is the last nonsemantic RHS symbol, one bit specifying whether or not the next symbol is a semantic call, twelve bits for the number of the nonterminal LHS of this production, six bits for the type of this symbol (terminal, nonterminal, semantic test, etc.) twelve bits for the number of this symbol (symbols are numbered sequentially within each type). The PR0TAB of Appendix B differs from the CF production table described in chapters 3 through 5 in that the descriptors are 73 not of the (jt, n) form but are single numbers corresponding to addresses in PR0TAB. The (it, n) form was used to make the description of the algo- rithm more easily understood. The dummy nonterminal defined at location 310 is an example of the process described in section U.3« The potential bar to combination was the head symbol relationship between now at location 310 but originally at location 2^+3 and the occurrence of at location 276. The second phase of the syntax preprocessing is the conversion of the internally represented set of CF productions to an internally- represented set of FPL productions, using the algorithm described in chapters 3 through 5. As noted in Ghapter k, it is an imple- mentation feature which dictated the use of one symbol lookahead as the primary method of eliminating preclusion. This feature is the use of bit patterns to represent sets of terminal symbols. These bit patterns are pointed to by a third type of symbol called an "any" symbol. These sets can be specified directly in the TWINKLE metalanguage. They are also constructed internally to combine a set of FPL productions within a group which are identical except that the parser symbols are different terminal symbols. Their most important use is an internal combination of a set of one symbol lookaheads into a single "any" symbol which requires only a single test at parse time. The table of terminal head symbols created during the TWINKLE conversion is in the same bit pattern form and allows a simplification of the routine NT2T (chapter 5) when a nonterminal is being expanded in the k-th position during k symbol look- ahead construction. lk Appendix C is the set of FPL productions -which results from a conversion of the CF productions of Appendix B. The reference to "top stack symbol" is synonymous with the term parser symbol as used through- out this work. The phrase "is input" is followed by the set of lookahead strings associated with a production. A "bar" is a simple marker symbol, "sbr" is a special marker symbol and "cbr" is a combined marker symbol. The descriptor associated with a simple marker symbol, the number asso- ciated with a special marker symbol and the number of marker symbols to be popped at a reduction are not included in this listing. Points worthy of note in this example include the various types of combination. The combined groups begin at FPL production 325 and each identifies the group which contains the combined production. Also important in terms of the efficiency of the parse is the insertion of TPN groups (which by definition contain only one production) inline as opposed to being separate groups requiring a transfer. An example of this occurs in productions ^+-^6 in the TH-6 group. In this case, if the parser symbol is, in fact, <*I>, then production kk applies but transfers to production k-5 (instead of a separate TPN-191 group) upon completion. If production k5 does not apply, control is transferred to the error production at the end of the group. If the parser symbol is not initially <*I>, then production kk does not apply, but control skips TPN production k*~> and tries to apply production k6 next. This becomes important when it is realized that large languages require paging of a segmented pseudo instruction table. Thus l/0 is minimized when transfers are relatively local. Another advantage is that the parser symbol need not be checked in the TPN production if the preceding production uses lookahead. This fact would be almost impossible to ascertain if the 75 TPN production was a separate group which could, he transferred to from more than one place. The central processor time for this phase of syntax prepro- cessing was 76 seconds. The final phase of syntax preprocessing is a conversion of the internally represented set of FPL productions to a sequence of pseudo orders which may be used directly for interpretive parsing or may he further converted into compilable Burrough's B5500 ALG0L code. The choice between interpretive parsing and parsing with executable code is made by the language designer. Which choice is most efficient is a func- tion of the grammar (at least with the particular implementation now in use). This phase is the most machine dependent since it makes maximum use of the features of the B5500. It is also in this phase that a number of other effective optimizations are made to the set of FPL productions. These optimizations are significant enough to lower the parse time by 25 to 30 per cent. The DEMALGL example of Appendices A through C was con- verted only to pseudo instructions. The time for this phase of syntax preprocessing was 23 seconds. 8»2 Semantics The TWS works by calling a user specified semantic routine at the appropriate point in the parse. These calls may be placed at any point in a TWINKLE production except at the beginning. They are then associated with the preceding symbol in the resulting set of CF produc- tions. When that particular use of the symbol has been recognized, the semantic routine is called into execution. This has a significant effect on the CF to FPL conversion algorithm when the semantic call follows a 76 symbol other than the last one on an RHS. Under these circumstances the FPL production whose corresponding descriptor points to that symbol cannot be combined with any other FPL production which does not have the same semantic call. It is relatively obvious that the parsing decision must be made at this point, and in this sense it is equivalent to the parsing decision necessary upon completion of a RHS. Intuitively, in fact A -> a @SX B is, from the users point of view, equivalent in this sense to i A -> D B and D -* a @SX { ! where @SX is the specification of a call on semantic routine X. A second use of semantics has potentially an even greater effect on the CF to FPL conversion and resultant parser. This is a semantic test or condition and is specified @TX. A semantic test j routine may do anything that a normal semantic routine may do, but it j must set a global boolean variable to true or false. This type of routine is invoked after recognition of the appropriate symbol, but before it has been ascertained what specific use of the symbol applies. The use of this type routine is, in fact, to differentiate in a case where no j syntactic differentiation is possible using the CF to FPL conversion | method. The semantically conditioned terminal (or nonterminal) symbol is, for conversion purposes, considered to be a different symbol from 77 the same terminal (or nonterminal) symbol with no semantic test or with a different semantic test. The following is an example which might appear in the grammar for an ALG0L type language in the specification of boolean and arithmetic assignment statements: 1. : : = ; 2. : : = ; 3. : : = = = ; If. : : = = ■ ; 5- : : = <*I> ; 6. : : = <*!> ; where <*I> represents the terminal symbol "identifier" (recognized by a scanner). The TH- group would include <*I> <*!> (0) (0) DNT- and DNT-. This is an unresolvable preclusion. CF production five could be con- ditioned : : = <*!> @ TX where semantic test routine X checks some previously built symbol table and returns a value of true if and only if the particular identifier in question has been declared B00LEAN. In this case, the resulting FPL productions 78 <*I> @ Tx [ ... and <*I> | are considered to have two different parser symbols and are, therefore, not in conflict. The use of this semantic feature adds tremendously to the parsing power of the total system since semantic routines (both test and normal) have access to all parts of the total system. It is possible, for example, to go to the following extreme in building a compiler for ALG0L. Let : : - BEGIN §11; be the entire grammar and @ Tl be a handwritten ALG0L compiler which uses the TWS supplied scanner and returns true or false depending on whether or not the input stream is an acceptable ALG^L program. Clearly the language designer must take care in using this feature, since the existence of a semantic test causes the conversion algorithm to assume complete differentiation between the conditioned occurrence of a symbol and all other occurrences of the symbol which are not similarly condi- tioned. The semantic routines are written in Illinois Semantics Language (ISL) [21] which includes all of Burrough's B5500 ALG0L as a subset. In addition, there are a number of special constructs which simplify the declaration and use of a number of data structures such as stacks and tables which are frequently used in constructing compilers. 79 There are also special constructs -which allow the semantics "writer to output an intermediate language stream of operators (next pass semantic routine calls) and operands. The semantic preprocessor converts a set of routines written in ISL into pure B5500 ALG0L in a form appropriate for the final compiler or translator. An example of the use of semantic tests is the call @ T10 in line 1^ of the TWINKLE input of Appendix A. This shows up at word 37 (among others) of PR0TAB (Appendix B). This semantic test routine checks a symbol table and returns true if the identifier just scanned has "been declared integer and false otherwise. The value of this test is seen in FPL productions 37 and k-2 (Appendix C) which have identical parser sym- bols. An examination of PR0TAB shows that each can be followed by "<-" and an arithmetic expression of arbitrary length are thus are not dif- ferent iable by lookahead. A combination of these FPL productions would also fail ultimately because a (PR0TAB kQ) and an (PR0TAB 39) can be syntactically identical (e.g., a single identifier). The syntax of DEMALG-L as written here is, in fact, ambiguous without the use of a semantic test. 8-3 Other Considerations There are some other features of the total system which do not fall under the specific headings of either syntax or semantics. Probably the most important of these is the outstanding automatic error recovery available [22]. During the CF to FPL conversion numerous table entries (in the form of bit patterns) are generated specifying the sets 80 of valid symbols which may follow in certain situations. There are need in a variety of ways to do what amounts to error correction when no pro- duction in a group applies. The basic error recovery makes use of the information in the marker stack. The approach is to scan ahead until a symbol is found which can validly follow the nonterminal symbol in the process of being recog- nized. All intervening symbols are then arbitrarily reduced to that symbol. The process is actually slightly more complicated, but it is worthy of note that none of the systems considered in chapter 7 have enough information available at parse time to do anything along these lines. It should be noted that the final result of inputing syntax and semantics to the appropriate preprocessors is a compilable Burroughs B5500 ALG^L program. This is accomplished by merging the output of the semantics preprocessor with various prewritten pieces of ALG0L code (including a scanner which may be user modified) and possib^ the stream of ALG0L procedure calls output by the syntax preprocessor. The system has been used to generate several translators and parsers. Among these are the TWINKLE and ISL translators and a parser for a large subset of PL/1. Also, presently underway is the generation of a translator for an operating system language, 0SL, which is larger and more complex than ALG0L . In addition to its obvious use as a trans- lator writing system, the system can be used as an aid in writing any program whose operation depends on the syntactic structure of its input. The system has been used in this manner to generate a set of diagnostic programs which have aided in the development of the ILLIAC IV computer 81 and to generate a symbolic differential equation solver. Other projects of this nature might include a program for reformatting source codes for ALG0L (or other high level language). The program which does phase two of the syntax preprocessing (i.e., CF to FPL conversion) is a B5500 ALG0L program of approximately 2300 cards in length. As noted, execution time on the sample language DEMALGL was 76 seconds. This time varies greatly as a function of the size and complexity of the input grammar. Although no firm relationship has been established, it appears to increase somewhat more than linearly as the size of PR0TAB increases. Appendix D is a set of Flow Charts describing the structure of the conversion program with the necessary parts of the TWINKLE compilation included. All methods of internal data representation and many other details have been omitted since they are highly machine dependent. Q.k Summary This chapter has briefly described the implemented TWS which uses the CF to FPL conversion algorithm of chapters 3 through 5 as the essential part of its syntax processing phase. In particular, it describes several important features of the system which exist only be- cause the conversion algorithm was specifically designed and implemented to allow them. The most important of these features is the method of associat- ing semantics with syntax. Specifically mentioned by several users of the system as a significant advantage is the ability to put semantic calls in the grammar at positions other than the end of RHS's. Also 82 important in this area is the fact that no semantic routine will be executed erroneously. This is possible with many other parsing algorithms •which require backing up to a previous point in the parse. Other features are the use of semantic tests to greatly increase the class of convertible languages, the fact that ambiguous grammars are I recognized, the existence of marker symbol information for error recovery and the use of "any" symbols in the input grammar. In addition, the nature of the conversion algorithm and the form of the resulting parser I (i.e., FPL productions) make possible optimizations such as the inline use of T (jt, n) productions which increase parsing efficiency by 25 to 30 per cent. , 83 9- CONCLUSION The introduction (chapter l) of this work sets forth several goals for a translator "writing system. These included the construction of translators which are fast, occupy as little space as possible and are capable of generating efficient code when the translator is to be a compiler. Other criteria of importance were simplicity of use and successful processing of as large a class of grammars as possible. Let us first consider the speed and size of the translators which result from the implements TWS which uses the CF to FPL conver- sion algorithm of chapters 3 through 5 as the basis for its syntactic preprocessing. It must first be noted that the larger part of any translator or compiler is the semantics and code generation. The TWS's known to this author, leave this to the language designer (i.e., the TWS user). For this reasons we will consider only the speed and size of the parsing phase of translation. Chapter 7 shows that in most, if not all, cases the parsers generated by this method out-perform parsers of a similar nature generated automatically by the other methods known to this writer. In addition to the factors considered in chapter 7, there is the 25 to 30 per cent reduction in parse time mentioned in chapter 8 which results from optimizations most of which are possible only because of the form of the parsers generated. Of the methods which generate this type of parser, that of DeRemer [15] compares most favorably with the CF to FPL conversion method. Horning and Lalonde [23] have done comparison of parsers gen- erated by the DeRemer method as opposed to parsers based on precedence relationships generated by the methods of McKeeman [h] and Wirth and Qk and Weber [5]. This empirical study shows the DeRemer parsers, and therefore, by implication, the CF to FPL conversion generated parsers, to be significantly more efficient in both speed and size. Since the precedence methods mentioned above are, in general, considered practical as the bases for translator writing systems, we must conclude that we have more than adequately met our goals of speed and size. The next question to be considered is the class of languages which can be successfully processed by the conversion algorithm. For reasons given in chapter 1, the goal was to accept all LR (k) grammars. As pointed out in chapter 6, we have failed to meet this criteria. However, we have, as was also noted in chapter 6, approached LR (k) acceptance to such a great degree as to make this failing relatively insignificant. The important difference between LR (k) langauges and LR (k) grammars should be noted. For a language to be LR (k) means that there exists an LR (k) grammar for that language. It is possible, and some- times the case, that a non-LR (k) grammar is written for an LR (k) language. The most frequent reason that a grammar is found unacceptable, however, is that it specifies a language other than the one actually intended. Usually this is an ambiguous grammar. This brings us to the most serious problem of the implemented TWS. This is the difficulty in determining what changes are necessary in the input grammar when the conversion algorithm does not apply. This problem also occurs in all the other parsing algorithm generators previously discussed but is aggravated to some degree by the intermediate translation from TWINKLE to CF productions. Efforts are presently under 85 way in an attempt to alleviate this problem. The first such effort is an attempt to determine precisely what information should be made available to the language designer when an unresolvable preclusion is recognized. To date this includes the descriptors of the FPL productions involved and the common multisymbol lookahead string which may follow these conflicting productions. The second such effort is a search for (and hopefully documentation of) a semi -algorithmic approach which the language designer can use to determine what changes are necessary. This leads us to a consideration of the relative simolicity of using the TWS. With the exception just mentioned and one other, the system appears to be quite good in this area. Of particular value are the flexibility of access to semantic routines (both actions and tests), the automatic error recovery, the recognition of ambiguities and the TWINKLE metalanguage with its "any" symbol constructs and documentation- aiding English forms. The other exception referred to above is the failure to automate semantics and code generation. These tasks have been simplified through the introduction of the ISL semantics language but it is this author's opinion that no TWS will be entirely satisfactory until the semantic phase of compilation has been automated to a degree at least approaching that already available for the syntactic phase. This failure to automate semantics and, more importantly, code generation does, however, have one advantage: it allows the TWS a broader range of application by not restricting it to the generation of compilers. It also puts no inherent restrictions on the efficiency of the object codes generated by TWS built compilers. 86 In general, we believe that the TWS described in this work and particularly the CF to FPL conversion algorithm on which it is based are quite useful and practical contributions to the field of language pro- cessing. 87 APPEM5IX A TWINKLE SYNTAX OF DEMALGL OEHALGI /SYNTAX 88 oooooooi , .„,„««.. .. wn«. to pr . -too *. « i ^ , 4 CDNSISTS PF , ,INTFGFP M- / "OOLEAN PS5 / fl A*Fl M6 1 00000003 OOOOOOPA 00000005 00000006 00000007 00000008 00000009 ooooooio ,<„„„„'» I. KFINf. in ff . -L..ri» ,. rouo.ro py , , ,00 .,o or .oo op .00,0 J rouo.ro py . ^TIO («. M HP .♦ 1 PS1 ! HP I «, ,* PP *♦ 1 <"0,FAN FXPPF,SIPM> »S1» 0" ,. u ooooooi 6 fu „„„„,,„ ..„ , THf . ..T.««.T. «» ooocor(7 •S15 OP EMPTY 1 J . roasts oo ., op ,oo..r.T roiLo.ro ., . ppsmply r.,„ list or onytmimo put sr»ieP L o N >< rou o.rr by ., . A « L A E- F I > TS pFFTNFD TO PF ** «*t> • 00000016 00000019 000000?0 000000?1 oooooo?? 000000?3 rKMYiliT or cf ..op.- i roL,o.roRv» rOPMfTS 0, .» ...tTMMrTTC PP.P.«Y> FOU0.ro PV . POSSTPIY 8 9 000OOC?9 AN CONSISTS OF #( . f ) flR 00000030 AN <*I>PTlO #S>7 OR A <*N> #Sl* I 00000031 0000003? » CONSISTS OF » FOl I OwFD BY A 00000033 POSSIBLY FMPTY O0OO0C34 LIST OF I #0R FOLLOWFP BY A »S16 ]J 00000035 00000036 A IS DFFINFD TO PF A FPLLOwFD PY A 00000037 POSSIBLY FMPTY LIST OF r #AND PS16 ] » 00000038 00000039 A CONSISTS OF #( #) OP 000000*0 Ah FPLLOWFD BY t #«/##/ #> / #< / #< / #?] OOOOOCM FOLLOKFD PY An PR AN <*!> PS19 PR 000000*? «TRHF PS?0 PP fFALSE #S21 I P00000*3 POOOC'OM AN CONSISTS Pr / <*N> / <*S> / fPFGTM / 000000*5 tEKO / *f,P / fTP / * GOTO / #Cni«MFNT / IINTFGFP / ipPOl **AN / flAPFL / 000000*6 flf / ITHFW / fFLSF / IANP / #PR / fTRnE /ffALSF /t, / f t / ft / f? / 000000*7 *. J ooooocie 90 APPENDIX B PR0TAB OF DEMALGL 91 COMPLETE PRPTAP. IS I IOC. LETT HAND SlPF It* FPF MARK l|* Mb "I" > lie irPMMFK'T > Its lie lie feFGIM H-3 iic tr-r *TP TPEMTTFIFR P«-9 > Id I r. r TPFMTTFIFR P«-9 > u= f.pTn TPFNTTFITP Ps-9 > 11 = TPFNTIFITP PT-10 n I N f* — ft M= TrENTJFjrp PT-10 ll* TrENTTFIFP w I n w E « Mr TTNTTFIFP n + » Jit f TF / FMT PS-14 #FI SF ||* ||« TPENTIFIFP P<-9 > It* <*TATFMFNT DUMMY u #GPTO TPFNTTFirR P«:-9 > lie OtTATFMFNT OllMMY A TPFNTIFIFP PT-10 m | w It* <<;TATFMFMT PiiMMY A TPFMIFIFP PT-10 I|b < lie < Mr < i |b I 1b #TF <"POLFAWFvPRFSS!OK P * • 1 3 #tmf* PS-14 *F| SF t IB *TF lit lit lit f TF «-13 »THEK #n sf M-15 > ||* i 1= I I « f P F G I M P<-3 lie Mr Ms fPFGTN P«-3 > Ms Ms ! TSFMI TSFMI TSFMI TSFMI TSFMT TSFMI TSFMI TSFMI MMY 5 SSTPN SSIDM OM ON cninM cpi.pm CPLPN CpL*N CPLPN CPLPN CPLPN CTLPN CPl PN CPLP* Cpl pN CPLPN CPLPN CPL"N CPLPN CPLPN CPLPN Cpl r^ CPLPN CPLPN CPLPN CPLPN CPLPN CPLPN It* > III > ll» > I IB > I In <*PITh^FTIcEXPpFSSTPN PLiMMY C > III hmtfgfr TDENTTFir* M-7 fPPOlFAN PS-5 TPENTIFIFP #S-7 tlARFl M-6 TPENTIFIFP • W ICPLPM 95 II- ||« It. «•(« »«■ TrEKTTFlFP PT-10 M-17 <*PITmMFTIcPRTM*RY > H» WiiMRFR M-lfi He "*" PS-16 I In P«-16 It* P«-if jj« <*PlTMMFTTrpPT>'ARY > lie I!* UMMY P > tl* rnp II* #PP P«-16 Its "C" •« ^ n 11= <«PlTHMFTTrFyPPFSSJPI" > n T n lie nun lie < A P I THMF T T P F YPPFSS I P»' > ♦• y n in n < n 11= 96 2911 292l 293l M- «»RITHMFTTPC*PPFSS!P>' «>* <*PITMMFTTrEVPPFSS!OW > llr TPENTTF1FP M-19 > lla ITPUF M-20 > IIB tFSLSF M-21 > Ifp «»P01E*NTFPM MiMMY 9 f A*'0 ||" Mr <»PlTHMFTTCFyPPFSSTn*' Rltl 97 APPENDIX C FPL PRODUCTION SET FOR DEMALGL THE F0I10WJN6 ARE THE FlOYD PRODUCTIONS FOR PEMAIfiU 98 KTH NOKF fiPOUP Pt START KTPK TH TH Mt GROUP 1 I Ot EPPPP PPPDUCTTPNt SKIP SYHPPLS. It PUSH PAR kHPlF PRPGRAM TH I SCAN (FPF)J GP TP GROUP 2 1 60 TP P0KFWJTHFAS51 , 3l WHrtLE PRPGPAM . ?l fiPPUP 3l group A| 51 6 l 1 GROUP 7: 01 PROOUCTlPN ASSPCTATEO WITH PPPTAP MP.i TOP STACK SYMPPLi EPF PIKH "AP PRPGPAM SCANJ flO TO TM J , «l PRPGRAM 1. 17. PRODUCTION ASSPCIATFD WITH PPHTAB MP. 1 TOP STACK SYMPPU fBEGTN PS-3 IS IKPt'Tf ANY* 6 PUSH PAR pErLARATTPN PUMMV 3 SCANl fiO TP TH P , PROOUCTlPN ASSOClATFn WITH PRPTAp VP.i 160. TOP STACK SYMRPl 1 fRFGTN PS-3 IS JNP|lTi ANY- 7 PUSH OAR STATFHFNT SCAN? r-P TP TW f , PROOUCTlPN ASSPCIATFP WITH PRPTAB PP»i 1*7. TOP STAC* SYMPPtl *PFGTN PS-3 PfPUCF TO RLPfK DUMMY 9 r.n to pnt 7 . 5i PRPGRAM PPOPUfTlPW ASSPCIATFP KITH PRPTAB MP,« . TOP STACK SYMPpLl BLPCK Pl'MMY ? IS IK'Pl'Tl #FWP SCAN; TOP STACK SYMPPU »FNP PS-? P«-1 PFP|ifF TP rRPr.PAM rp TP pnt 1 . CTMPTMFD PRODUCTION ASSPCIATFP WITH PPPTAP NP.J 156# 16A» TOP STACK SYMPPU PLPfK fl\ PS-9 PFPIjCF TO STATFMENT r-P TP PNT 6 . PPPPUCTIPN ASSPCIATFD WITH PRPTAp NP,| TDP STACk «YMPPLt fGP SCANI TCP STACK SYMPPI I <*T> PS-Q PFr>liCF TO STATFMFNT rn tt pnt (■ . PPODUfTlPN ASSTCIATFP *TTH PPOTAp rn.» tpp stack SYMPru *r,PTr SCANI TPP STACK e YMPPLl <*T> (fS-0 PFniiCT TP STATFMFNT rp TO PNT f . PROPITTIPN ASSPCTATFP HTH PpnTAB NP,« TPP STACk SYMPri t <*T> »T-10 IS UPl'Tt "I" SCAN » TPP STACk SYMRrl I "I* SCAN: 15. 19. ?0. ?4. ??. 31. 101 TOP STACK SYMBOL! "■" PUSH PAR ARITHMFTTCFVPRFSSION SCAN! fin TO TH 11 , *7l PRODUCTION ASSOCTATFD WITH PPPTAR * PT-10 SCAN! TOP STACK SYMBOL: "«■" PUSH BAR ARITMMFTTCFyPRESSTOn SCANl fiO TO TH 11 . *?l PRODUCTION ASSOCIATED WITH PpnTAB NO.i Al. TOP STACK SYMPPL* <*T> IS INPUT! "I" "»* SCAHJ TOP STACK SYMPpLl " I " SCAM I TOP STACK SYMBPL » "«" PIKH PAR POOI FAnFyPPFSSIO*' SCAWl CO TO TH 1? . A?i PRODUCTION ASSOCTATFD fcTTH PRPTAB K'O.f AA. TOP STACK SYMRPL« <*T> IS INPl'TI "«■" SCAN! TOP STACK SYMBOL » "♦" PUSH PAR POPI FANEyPPF SCAN! rn TO Tu 1? . tit PRODUCTION ASSOCIATFD WTTH PPPTAp *p,i 190. TOP STACK SYHPPLl <*T> PS-P SCANJ TOP STACK SYMRPl I "l" PFPllCF TO STATFMFNT Pl'MMy b r-n TO pmt 10 , Afl CPHBTMFD pPOrUfTTOM A«SPClATFn WITH PPPTAP NO.i 50» 107, H*i. I'M* TOP STACK SYMPPLI *TF P(l«H cpp poOl FAKFvPRFSSIOh SCAMj cn TO TW 1? . A7j FPRPP PPPOHCTIPMi SKIP SYMPPI S. *'U A CPTUP l?t STATFMFWT API CPHPTMFD PPPPIIfTTHM A<$nCTATFP WITH PPPTAP MP.i 59# f)h, TOP STACK SYMPPlt STATEMTmT DUMMY A IS TMPt'Tt #GP SCAMj en TO CTPW ? . A9» PPODI'CTlPN ASSOCTATFP WITH ppnTAp N0.| A*. TOP STACK SYKPrit STAtEmFmT PPMMY A IS INPt'TI fGPTO 102 SCAN) TOP STAC* SVMBPU #60TP SCAN* top stac* symbpll <*t> bs-9 peoucf to statfment CO TO PNT 6 . *?t COMBlNFD PRODUCTION ASSOCIATED WlTM PPPTAR NO. i T?» 79* «5# 91# JP5, TOP STACK SYMBPLI STATEMENT DUMMY 4 IS INPUT! <*!> SCANf GO TO CTPN 3 . *3l CPMRlNFD PRODUCTION ASSOCIATFP WITH PPPTAP NO.i 96» 123# 132. 14A, TOP STACk SYMPPLl STAtEMFNT Dl'MMv 4 IS INPUT! fIF SCANi en TO CTPW A . ■?*! PPPDUCTlPN ASSPCIATFP WITH PRnTAB NP. | IP6. TOP STACK SYMPPLl STAtENFK'T P|immv 4 PFPUCF TO STATFMFNT fiP TP PNT A . *5l EPPPP PRpPUCTTpNi SKIP SVMPPI S. KTPN 54 r.PPUP !3« STATFMFNT **l PPOPUCTIPN ASSrcTATEP WITH PRPTAp, |vp M 55. TOP STACK SYMRPU STAtEmF|«T RS-1* SCAN! TOP STACK SYMPPLl »ELSF PUSH bAP STATFMENT SCANI fiP TP TM 6 , NTp* 57 r,Ppup 14 t STATFMFNT *Pt PPCDUCTlpN ASSPCIATFP WITH PPPTAB MP, 1 «58. TPP STACK SYMPPLl STATFMFNT t*-1* PFPUCF TO STATFMFNT r,0 TP PNT f , KTPK 1C! rPfMip J5, STATFMFNT *9| PPOPUCTlPN ASSnClATFD MTU PpnTAp ►•p., JC2. TPP STACK SYMPPLl STAtEMFnT RS-1* SCANi TOP STACK SYMPPLl #ELSF PIJ«H rAR STATFMFNT SCANf rn TP TM 6 , NTPI> 104 nPTHP l*j STATFMFNT Ml PPPDUCTlpN ASSPCIATFP MTH PpfUAP MP.. 1P5. TPP STACK SYMPPLl STATFMFNT PR"t? PFPllCr TP STATFMFNT pn TP PNT fr . 103 NTPN 113 GROUP 171 STATFMFNT 6?l PROOUCTlPN ASSOCIATED WITH PPPTAB MP.i 114, TOP STACK SYMBOL! STATEMFKT PS-15 BEPUCF TO STATFMFNT PO TP PNT 6 , NTPN 119 fiPPUP 18| STATEMENT 63l PPOPUCTlpN ASSOCIATFD WITH PPPTAB NP.t 120. TRP STACK SYMBOL! STATEMF&T PS-14 SCANl TOP STACK SYMBOL! 1FLSF PS-15 PEPIlCF TP STATFMfNT r.o TP PNT 6 . NTPN 130 rPPUP 19 i STATFMFNT 6*i PROOUCTlPN ASSOCIATED WITH PPPTAp *'P.| 1 3 1 . TPP STACK SYMBPl I STATFMFMT »S-1? PFPUCF TP STATFMFNT CO TP PNT 6 . KTFN 137 pPPUP 20| STATFMFNT 66| PPpnUCTlTN ASSTCTATFP WTTH PPPTAB K'p.t 13«. TOP STACK SYMPPL» STATEMF*T «S-14 SCAN I TOP STACK SYMPpLt fFLSF PS-15 PFPUCF TP STATFMFNT r.P TP PNT 6 . NTPN 1 bP r.PPUP ?li STATFMFNT 6P| PPOPllCTlPN ASSPCTATFD WITH PPPTAB MP.i 156. TPP STACK SYMBTLt STATEMFOT PFPIlCr TO PLPCK DUMMY 9 CO TP PNT 7 . NTFK 161 r.PPUP ?2t STATFMFNT fiQt PPOPUCTlTN ASSPCIATFP WTTM PPPTAP NP.i 161, TPP STACK SYMBPL! STATFMFC'T PFPUCF TP PLPCK DUMMY 9 r.P TP PNT 7 . NTFN 163 fiPFUP ?3l 5T»TFMFNT 7P| PROPIICTIPN ASSTCTATFP MTH PPPTAB MP.! 163. TCP STACK PS-7 PCPUCF TO PECLARATIPN PUMMY 3 60 TO PNT S . 731 PRODUCTION ASSOCIATFD KITH PPPTAB NO. I 1^8. TOP STACK SYMBPU *B0OlEAM »S-5 SCAN! TOP STACK SYMBPLl PS-7 PEPUCF TO PFfl ARATTPN PUMMY 3 r,n to pnt * . 75i PRODUCTION ASSPCTATFP WITH PPPTAB NO.t 1*2. TOP STACK SYMBPLI *LAPFL PS-6 SCANI TOP STACK SYMPPLI <*T> RS-7 PFPIICT TO pECLARATTPN PUMMY 3 TO TO PNT P . 77l FPPOP PPPOI'CTIPMl SKIP SYMBPI S, ktf-* la cPrup ?5j pfpiapattpw dummy 3 7?l PPOOUCTlPN ASSrCTATFD WITH PPPTAB NO.t 169. TOP STACK SYMPPLI DECLARATION PUMMY IS TNPUTI *»" SCANj TOP STACK SYMPPLI "*♦• SCAN: TOP STACK SYMPPI I <*T> PS-7 REDUCF TO rECt ARATTPN PUMMV 3 r.n tp pnt e . Ml PROPUCTlPN ASSPCIATFO VTTH ppnTAB wfl.i tA. TOP STACK SYMPPLI OFCIAPATlON PUMMY PUSH PAR rPMMFNT SCANi ftP TP TM , NTpK 1 <■» r.PDIlP ?6 1 PFPIAPATTPK' Dl'MMY 3 P?l PRPPIICTIPM ASSPCTATFP WITH PRPTAp k'O.i 1*9. TCP STACK SYMPPH PFClARATlPM PUMMY IS H'Pl'Ti m >" SCAN; TCP STACK SYMPri I "#" SCAN! TOP STACK SYMPPI I »S-7 PFnilCF TO PFCI ARATTPN PlIMMv 3 r.n TP PNT P . P*t PPOPUCTIPN ASSPCT»TFP VTTH PPPTAP *'P.« 1«. TPP STAC* SYMRPL » PFCIARATIPN PUMMY PLKM f^AR rOM^FNT SCAK't CP TP TH A. 7^ 11 r.PriiP ??l ^RITHMFTTCFXPRFSSIPN 105 MTU 11 Ml P7i PPl P9| GROUP 9P| 01 I 93i PRODUCTION ASSPCIATFO WITH PROTAB NOtl TOP STACK SYMRPLI V PUSH PAR PUMMV 23 SCANI PP TP TH 23 . ?*?. PRODUCTION ASSPCIATFO WITH PROTAB NO.i ?«6. TOP STACK SYHROLl <*T> PT-1P • S-U PFPUCF TO ARITHMFTICPRIMARY r.O TP PNT 17 . PRODUCTION ASSPCIATFO WITH PPDTAB NO.i ?Afl. TOP STACk SYHBPU <*N> PS-IP RFPIICF TO ARITHMFTTCPRIHAPY r,n tp pnt 17 . FRRPR PRPPUCTTPNi SkIP SYMBPI S. ?6| ARlTHMFTICFyPRFSSlPN PPPPUCTlPN ASSPCIATFP WTTH PPPTAB NO.j 192. TOP STACK SYMPPU TERM IS INPUT A*Y- 8 IS INPUT 1 #AK'P IS INPl'T i «0P IS TKiPliT i *fi sf IS INPUT t #TWFN IS I>'P|"T #FNP IS INPUT n s tt IS INPUT mfn IS INPl'T HW IS TNPUTI nyf RFPiiCr TP ARITHHFTTCEvPRFSSIPN r-n TP PNT 11 • PRODUCTION ASSPCIATFO WITH PPOTAB NO.i TOP STACK SYMPPl I TERM IS INPUTI "♦" SCAN! ?34. TOP STACK SYMPPLl SCANJ f,n TP PUSH OAR TH 15 , TFPH PROOUCTIPN ASSrCTATFP WTTH PPPTAp *P.i TOP STACK «YKPri '■ TFRf.' SCAN i ?3*. TOP STACK SYHRfLt "-•• PUSH BAR SCANJ CO TP TM 15 , TERM 9*1 PROOUCTIPN ASSPCIATFO WITH PPPTAp *'P.t TPP STACK SYHPPLI AP T THMfT I CF Y PRF«S T IS T*'P|'Tt Ar-Y- 8 IS INPUT! #AMD IS TNPl'Tt »P D IS TNPl'Tl fFLSF 193. 106 IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT r,o to fTMFN #FNP m m m w >w PEDUCF TCI ARITHMETTCFYPRFSSlON ONT 11 . 9*1 PRODUCTION ASSOCIATFD WITH PRnTAR NO.i ??6. TOP STACK SYMBOL! ARTTHMFTICF*PRFSST IS INPUTI "'*" SC»N» TOP STACK SYNRpLI "♦« PUSH PAR SCAN! ftO TO TH 15 , TERM 9*1 PRODUCTION ASSOCIATFD WITH PROTAf) NP,i ?30. TOP STACK SYMPOLt ARTTHMFTlCEVPRESST SCAN! TOP STACK SYMPOt • **" PUSH PAR TFPM SCANj f.n TO TH 15 . 10C| PRODUCTION ASSOCIATED WITH PpnTAR NO.t ??«. TOP STACK SYHPOL« ART THmFTICPRTM»PY IS INPUT t ANY- B IS INPUT I *AMP IS INPl'T #PR IS TNP|iT r f Fl SF IS INPUT l tTHFN IS INPl'T *F*'D IS INPUT t,« IS H'Pl'T 1 "4" IS INPl'T «£•» IS INPl'T » } w IS INPl'T R*ll IS INPUT W{W IS INPl'T t> + n IS INPl'TI t«>" IS INPUTI PEnncr TO TEP> ro TO HNT 15 . lPll PRODUCTION ASSOCIATFD WITH PRHTAP NO. I TOP STACK SYMRpl » AR T THMf T I CPPTM4P Y IS INPl'TI "a* SCAN! P57. TOP STACK SYMPTLI "** PUSH PAR ARITHMFTTCPPTMAPv SCAWi f.n TO TH 17 . lC3i PRODUCTION ASSPCIATFO WITH PROTAP MO.t ?M . TPP STACK SYNPPLl AR T THNFT I CPPT MA PV SCANI 10? TOP STACK SYMRpLI ■/" PUSH PAR ARITMMETICPRTMAPY SCANl CO TO TH 17 . 10?I PRODUCTION ASSOCIATED WITH PROTAB MOti ??5. TERM DUMMY 7 r o i n vf > IS INPUT I ANY- a IS INPUT 1 f AND IS INPt'T 1 OOP IS INPUT I #EISE IS INPUT 1 #THEN IS INPliT l fENp IS INPUT * s w IS INPUT 1 "rf" IS INPliT w <«• IS INPl'T 1 * J" IS INPliT «■«• ii fWPWf 8*8 IS IWPliT *>* IS TNPl'T "> w PFnilCF TO TERM GO TO PNT 15 . 10*1 PROOUCTlpN ASSOCIATED WITH PPOTAp wp.t TOP STACK SYMPPU TERM DUMMY 7 IS IKiPUTl "n" SCAN | ?«9. TOP STACK SYMPPU »K» PUSH DAP ARITMMETTCPRTMAPY SCANl f,n TO TH 17 , lOPt PRODUCTION ASSOCIATED WITH PROTAp KO.t TOP STACk SYHPPH TERM DUMMY 7 SCANl ?53. ^TP^ 3« MPk 39 1 1 t cRrnp im r, roup n?i TOP STACK SYMRPLI " /" PII*H DAP ARITMMFTICPPTMARV SCAM» P,0 TO TH 17 , FRPOP PRPPHCTTPNI SKIP SYMPPI S. ?9| ARTTHMf TTCFxPPFSSTPN PROOUCTlPN ASSOCIATED WITH PRPTAR NP.i TOP STACK SYMPflL» APT THMF T I CEY PRT **T PfPL'CF TO STATrMFNT rn TO PNT & . 30: ARTTHMFTTCyPPFSSTPN PPUnnCTlPN ASSOCIATED MTH PRPTAR K'P.t TOP STACK SYMPPLl AR T THMf T I CFypRF^S T RFnprr TO STATFMENT r.P TO PNT I . 35. PS-1 1 40. PS-11 KTM 77 fiPPUP 1 13 I 31 i ARTTHMFTTrrxPPFSSTPN PPPPUCTJPN ASSrCTATEO MTH PRPTAR n'P.j TOP STACK «YMRPLI AR T THMFT ICF YPRF SS T 78. PS-H 108 KTPN 63 fiPOUP KTPK 280 fiPOUP KTPN 263 fiPOUP HM NTPN 266 GPCMIP 1 1 r i MPK 269 fiPPUP 116, MPh 2V? r.prup 119, *TPN 295 r,prnp l?Oi th l? r,priip 1?1 1 i??i REPUCF TO STATEMENT PNT 6 » 32i CO TO ARlTHMETICEXPBESSIpN PRODUCTlpN ASSOCIATED WITH PRPTAB WO. I TOP STACK SYMBOL! APITHMFTICEXPRFSST PEOUCF TO STATFMENT f,n TO PNT 6 , M. RS-11 33i IRTTHMETICEXPRESSIPN PPOOUCTlPN ASSPCTATEO WITH PPPTAB MO. I 260. TOP STACK SYMBPLI ARTTHMFTlcEXPRESST REPUCr TO POOI.EANPPIMAPY CO TO PNT 21 . 3A| ARITHMFTTCFXPRESSTPN PRODUCTION ASSPCTATEO WITH PRPTAB NO.! TOP STACK SYMBPLI ARITHMFTlCEypRESST PFPUCF TO POOI FANPRIMAPY CP TP PNT 21 . 263. 35 I ARTTHMFTTCFXPPFSSIPN PRODUCTION ASSPCIATFO WITH PRPTAp NO.i TOP STACK SYMRPLI ARTTHMFTlCFYPRF^ST PEPIJCF TO POOI FANPPTMAPY fiO TO PNT 21 • 266. 36, ARTTHMETICFyPPFSSTON PPPPUCTlPN ASSPCIATFO WITH PRnTAR NP.i TPp STACK SYMBOL! ARTTHMFTI CFYpPFSST REPUCF TO POOI FAMPPIMAPY f,0 TO PNT ?1 . 37 1 ARTTHMFTICFyPPESSlPN PPPPUCTlPN ASSPCIATFP WITH PRPTAB NP, « TOP STACK SYMPPL' ARTTHMFTlCFVPRFSST PFfMlCF TO POPI FANPRIMAPY r.n TP ONT 7\ % ?89. 292. 36 i ARlTHMFTTCFyPPFSSlPN PPPPUCTlrN ASSPCIATFO WITH PPHTAR WP.i TPP STACK SYMBPLI ART THMFT lCFyPRF«ST PFPI.ICF TO POPI FANPPTMAPY r.n TO PNT 21 . 295. 39, pnr| .EAKFYPPFSMON CPMPU'FP pPPPUCTTOW A^PCIATFP WITH PPPTAP NP., 9A2» ?7S# TPP STAC* «Y*PPL« m t" PU«H CFR rOMoTNFP RPPUP *P. SCAN! rP TP PTH 5 . PPPDI'CTIPN ASSrCTATFP WITH PPPTAB MO., ?A6. TPP STACK «YMPPLl <*T> PT-1P PS-17 109 GO TO PFPUCF TO ARITHMFTICPRTMARY PNT 17 , l?3l l?5i 196 t l?7i M^ 1? pPPUP 1?P| 1?0| 1*1 » ?97. PRODUCTION ASSOCIATED WITH PROTAB NO. I TOP STACK SYMBOL! <*T> BS-19 REDUCE TO POOLFANPPIMARY r.O TO PNT ?1 . PRODUCTION ASSOCIATED WITH PRPTAB M0.» ?*B. TOP STACK SYMBPH <**'» PS-IB PFPUCr TO ARITMMFTICPPTMAPy ftO TO PNT 17 . PRODUCTION ASSOCIATFP KITH PPPTAB *0.t ?99. TOP STACK SYMPPLl fTRUF PS-?0 RFPUCF TO POOI FANPRIMARY PA TO DNT 5>1 . PRODUCTION ASSOCIATED HTH PPPTAB MP. i 301. TOP STACK SYMPPLl #FAl SF PS-?1 PFPLlCF TO POOLFANPRIMARY CO TO PNT ?1 , FRPOP PRPDUCTIpNi SKIP SYMpPlS. «0i pnPLEA^EyppFS«;inN PPOPUCTlPN ASSOCIATED WITH PPPTAB MP.j 19?. TOP STACK SYMPPLl TFRK" ANY- A #AWO «0» tELSE fTHFN #F*'P *<•» MM Kvll PFPUCr TO ARITWMFTTCEvPRFSMON PNT 1 1 . IS INPUT IS TNPt'T IS INPUT IS TNPliT IS INPUT IS INP|"T IS INPUT IS INPUT is tnPut IS INPUT !§ INPUT! IS INPUT! rn TO PPOOUCTlPN ASSOCIATED WITH PPPTAB NP.i TOP STACK SYMPPLl TERl/ IS INPt'Tt • , + '' SCAN! ?3«. TOP STACK SYMPPLl SCANl CO TO PUSH PAR TH 15 , TFPM PRPPDCTlrN ASSPCIATFD WTTH PPPTAp, NP.t TPP STAC* SYMPPl.t TFP»' SCAN) ?3B. TOP STACk «YMRp| I "-" PIUH PAR SCANJ r.P TO TH 15 , TFPM 110 IS INPliTI ANY- 8 IS INPlUI *AND IS INPUT I *0R IS INPUT! *FISE IS INPUT! 'THEN IS INPPTI *end IS INPPTI "■" IS INPj'TI m * m IS TNPl'TI m 4* IS INPUTl ">" IS INPliTI "«" IS INPl'Tl "fc" IS INPUT! *> m PFPUCF TO ARITHMETTfFvPRFSSlOK CO TO ONT 11 . MM PRODUCTION ASSOCIATED WITH PPPTAB NO.! ??*. TOP STACK SYMBOL! ARTTHMFTICFXPRFSSI IS TNPl'T« **" SCANl TOP STACK SYMPPU "♦" PUSH WAR TERM SCANl 60 TO TM 15 . MM PRODUCTION ASSOCIATED WITH PRPTAp, NO. t ?30. TOP STACK SYMPOLI ARTTHMFTlCE VPRFSST SCANl TOP STACK SYMRPU *-" PUSH PAR TERM SCANl m TO TM 15 . MP| PRODUCTION ASSOCTATFO WITH PRPTAp NO, I 194. TOP STACK SYMPPU ROOl EAWTERM IS INPUT I AK'Y- * IS INPUTl *Fl SF IS INPl'Tl fTMFN IS INPl'Tl 'END IS INPl'Tl *)" PFnilCr TO POPLFANFYPRFSSTP^ r.n TP pnt 1? . MOi PRODUCTION ASSPCTATFD wTTM PRPTAp NO.i 771. TOP STACK SYMPPI.I ROPlEAWTERM SCANl TPP STACK SYMRFll tPR PUSH PAR POPIFAMTFPM SCANj f.n TO TM 19 . 1fl1> PRPPUCTlPN ASPrCTATFD WlTM PRPTAp NO.. 195. TCP STACK SYMPPI I POPI EAWFXPPFSSTPN IS IKPl'T t AKY- « IS INPl'Tt ffr| * F IS INPUTl #TMEN IS INPl'Tl «EK'0 IS TNPl'TI "J" PFPUCT TP P0OLrANfVPRF«STrK r.n tp pmt i? . Ill l«?l PRODUCTION ASSOCTATFD WITH PRPTAB NO, I 267. TOP STACK SYMBOL! ROOl EANrXPRFSSTON SCAN! TOP STACK SYMBOL! #OR PUSH PAR POOLFANTFRM SCANl f,0 TO TH 19 , lilt PRODUCTION ASSOCIATED NITH PRPTAB NO.! ??«. TOP STACK SYMBOL! ARITHMFTICPRIMARY IS INPUTl ANY- B IS INPUT! fANO IS INPUT! #OR IS INPUTl *FLSF IS INPliTi #THFN IS INPUT! *FNO IS INPUT! "* m IS INPUT! «#" IS INPUT! "S" IS INPUT! ")" IS INPUT! »-« IS INPUT! "<" IS INPUT! »♦* IS INPUT! *>" IS INPUT! ">" PFPUCF TO TFP" r.n TO PNT 15 , 1«5| PROPUCTlPN ASSPCTATFD WITH PRDTAp wp.i ?57. TOP STACK SYMRPLl ARITHMFTICPPIMAP Y IS INPliT! "x" SCAN! TOP STACK SYMRPLl "K* PUSH oAP ARITHMFTTCPPTMUPY SCANl (?n TP TH 17 . l«7l PPOOUCTlPN ASSPCIATFD WITH PRnTAP, NP.i ?61, TPP STACk SYMPPLl ARITHNFTICPPIMAPY SCAWJ TPP STACK SYMPPl « "/" PUSH BAR ARITNHFTICPPTMAPV SCAN! BO TP TN 17 . U9| PPOPUCTIPN ASSPCTATFP HTM PRPTAR NP.i ??5. TPP STACK SYNPPLi TFPM pi'MMY 7 IS INPUTl ANY- 8 IS INPUTl 'AMP IS TNPi'Tl *PR IS INPUTl #Fl SF IS INPUT? tTWFN IS INPUT! 'FNP IS INPUT! "■" IS INPUT! **" IS INPUT! "<" IS INPUT! ••)" IS T wPl'T t "-" IS INPUT! »<» IS INPUT! "♦" 112 IS TNPliTi "*" IS INPUT t *>* PEPUCF TO TERM GO TO PNT 15 . HO| PRODUCTION ASSOdATFD WITH PRPTAB NOti ?«9. TOP STACK SYMBOL t TERM DUMMY 7 IS INPtiTI »»" SCAMf TOP STACK SYMBOLl "«" PUSH BAR ARITHMETTCRPIMARY SCAN) GO TO TH 17 . U?l PRODUCTION ASSOCIATED WITH PRPTAB NO.t ?53. TOP STACK SYMBOLl TERM DUMMY 7 SCANJ TOP STACK SYMBOLl ■'/• PUSH PAR ARITHMETICPPTMAPV SCANl f.fl TO TH 17 . l">Af PRODUCTION ASSOCIATED WITH PPPTAB NP.t ?65. TOP STACK SYMPPLt BOOl EANpRIMAPY IS INPUT! ANY- 8 IS INPUT! #PR IS INPl'TI *F! SE IS INPUT! fTMEN IS TNPnTl 'END IS TNP|iTl "}" PFPUCr TO POP! FANTFRM rP TO PNT 19 . |55| PPDPUCTIPN ASSrCTATED WITH PPPTAp NP.t 306. TOP STACK SYMPPLi BOOL EANPRI MAP y SCAN! TOP STACK SYMBPLl *ANP PUSH PAP POOLFANPPIMAPY SCAN! r,P TP TM ?1 , is7i pponurTirN asspctatep wttw ppptap mp.i ?*6. TOP STAC* T! ">" PIV[\CF TP rPOIFANTFPM TP TP PMT 19 . 1*P| PPOPUfTIfN ASSPCTATFP kITH PPPTAB NP., 30?. TPP STAC* SYMPTLl POP| EANTEPM PIJMMy SCAN! TPP STACK SYMPTLl »>AN0 PIICM PAR RPPI FANPRTMAPY SCANt rn TP TH ?1 . 1*0» PRPPUCTlPN ASSPCTATFP WITH PPPTAp K'P.t ?7fl. 113 TOP STACK SYMBPU APTTHMFTICFKPRFSST IS INPUTS "■" SCANI TOP STACK SYMBOL! "«• PUSH PAR ARITHMETICFyPRFSSlON SCAN« BO TO TH U , lA?l PRODUCTION ASSOCTATFD KITH PRPTAB NO.i ?«1. TOP STACK SYMPPLt ARITHMFTICFKPRFSST IS INPt'Tl "*" SCANI TOP STACK SYKBPU V PUSH PAR ARITHMFTTCFvPRFSSlON SCAN» pn TO TM 11 , 1M| PRODUCTION ASSOCIATFO WITH PPPTAp NP.t ?«4. TOP STACK SYMBOL" ARTTHMFTlCtYPR^SST IS TK'PliTI ">" SCANJ TOP STACK SYMPni » ">" PUSH PAR ARITHMFTTCFvPRFSSlON SCANJ fifl TO TH 11 . l*fl PRODUCTION ASSOCIATFO KITH PRPTAR NO.« ?B7. TOP STACK SYMBOL! ARI THMFT ICFYPRFSST IS INPPTI "<" SCAN! TOP STACK SYMPPLl "<* PUSH PAR ARITHMFTTCFvPRFSSlON SCANJ fiO TO TH 11 , lf*l PPOOUCTlPN ASSTCIATFO WITH PRPTAB HP,| ?00. TOP STACK SYMPPLl ART THMFTlCF *PRFSST IS TNPl'Tl "<" SCANi TOP STACK SYMPPLl "<" PUSH PAR ARITHMFTTfiFyPRFSSlON SCANi r.D TO TH 11 , 1 70 t PFPnilCTlPK' ASSPCTATFO HTM PPPTAp *P.j ?» PUSH PAR ARITHHETTCFYPRFSSlON SCANI CO TO TM 1 1 , 17?I FPPPP PRppiTTTTN! SKIP SYMPPIS. NTpN AA fiRTliP All PPPLFArFyPPFSSIPN 1 7 3 t PRODUCTION ASSOCTATFO KITH PRPTAB NP.i «5. TOP STACK SYMRPLl R00| F AMF XPPF SS T PN PS-1? PFDUCF TO STATFMFNT CO TO PH 6 . 114 MPN 46 fiPOUP IT* t NTPN 69 GROUP 1751 NTPN 94 bRPMP 1 76 t NTPN 276 fiPOUP 177| «2l POPlFANFXPRESSION Th 13 pPPHP 179 t lpr i 1P1 1 1*?» 1P3i PRODUCTION ASSOCIATED WITH PROTAB NO* I TOP STACK SYMROU BOOl EANFXPRFSSTON PEPUCF TO STATFMENT BO TO PNT 6 . «3l POPLFANFXPRESSION PRODUCTION ASSOCIATED WITH PROTAB NO. I TOP STACK SYMRPLI BOOl EANFXPBFSSION PEOUCF TO STATFMENT BO TO PNT 6 . 44, POPLFAKEXPRESSION PROOUCTIPN ASSOCIATED WITH PRPTAB NO. I TOP STACK SYMRPLI ROOLEAWFXPRFSSION PEPUCF TO STATFMENT BO TO PNT 6 . 45t PPPLFANEXPRFSSION PROOUCTIPN ASSPCIATFO WITH PRPTAB NO, t TOP STACK SYMPPI t ROOl EAWFXPPFSSTPN SCANl TOP STACK SYMRPLI "I" PFPIlCF TO POOI FANPRIMAPY p-n TO PNT 21 . • 9. BS-1? 90. • S-l? 95. RS-1? 276. A6| AMYTHIKGPUTSEmICOLPN 196. PPOntlCTlPN ASSPCIATFO WITH PPOTAfl NP.i TOP STACK SYMPpLi PEPUCF TO ANYTHlNGRI'TSFMTCPLON GO TO PNT 13 , PPOPUfTlPN ASSPCIATFO WITH PRPTAB MO. i 197. TPP STACK SYMRrLl <**'> PFPtlCF TO ANYTHlNCPtlTSFMICPl ON r.p TP PNT 13 . PppPl'CTlpN ASSrCTATFP HTH PpnTAP NP.t 198. TOP STACK SYfPPLt <*«> PFnuCF TP ANYTMlNGPUKFMTCriPM rn TP pnt 13 . PFOOUTTlPN ASSPCTATFP WITH PRPTAp NP, } 109. TPP STACK SYMPPI t *PFGTN RFHUCF TP ANYTMlNGPIITSFMICPI ON r.n TP PNT 13 . PPPPHCTIPN ASSTCIATFP WITH PPHTAP MP.f 200. TPP STACk SYMPPLi AFNP PFnilCF Tn ANYTHlMftPUTSFMTCPlON rP TP TNT 13 . 1pa» PPOOUCTIPN ASSPCIATFO WITH PPPTAR NP,, 201. TPP STACk SYMPPI I #GP PFIMiCF TO ANYTHlNGPUTSFMICPLOM BP TP PNT 13 . 115 1K5I PRODUCTION ASSOCIATED WITH PROTAB NOtl 202. TOP STACK SYMBOL* #TO REPUCF TO ANYTHlNGPUTSEMICPLON r.O TO DNT 13 . IBM PRODUCTION ASSOCIATED WITH PROTAB NO.i 203. TOP STACK SYMBOLl #GOTO REDUCE TO ANYTMlNGRUTSEMICOLON p,P TO PNT 13 • 1P7| PRODUCTION ASSOCIATED WITH PPPTAB NO.i 204. TOP STACK SYMBOL! KCOHVENT REDUCE TO ANYTHlNGPUTSEMICDLON RO TO PNT 13 . lMl PRODUCTION ASSOCIATED WITH PPPTAB NO.i 205. TOP STACK SYMBOLl fINTFGEP REPUCF TO ANYTHINGPUTSFMICPLON f.p TO PNT 13 , 1P9 i PRODUCTION ASSOCIATED WITH PPPTAB MO. I 206. TOP STACK SYMBOL* *R0Ol EAN REPUCF TO ANYTHlNGPUTSFMlCPLON r-n TO PNT 13 . 190| PRODUCTION ASSOCIATED WITH PROTAB NO.i 207. TOP STACK SYMBOL! #1 APFL PFPUCF TO ANYTHlNGPUTSFMICDLON rn TO PNT 13 . 101 I PRODUCTION ASSOCIATED WITH PRPTAR NO. t 208. TCP STACK SYMPOLl *TF PFOUCF TO ANYTHlNGPUTSFMlCPLON TO TO PNT 13 . 1$2, PRODUCTION ASSOCTATFD WITH PPPTAB NO.i 209. TOP STACK SYMPPl J #THF* PFOUCF TO ANVTHINGPUTSFMICDLON r.0 TO PNT 13 . 103 1 PRODUCTION ASSOCTATFD WITH PPOTAP. NO.i 210. TOF STAC* SYMPCI I #FLSF PFOUCF TO ANYTHlNGPUTSFMICOLON CO TO ONT 13 . l9M PRODUCTION ASSOCTATFD t'TTH ppnTAp NO.i 211. TOP STACK SYMPOLl #AND PFOUCF TO ANYTHINGPUTSFMKOLOM r.O TO ONT 1 3 . 1Q5» PRODUCTION ASSOCTATFD MTH PROTAP MO. I 212. TOP STACk SYMPOI I fOR PFOUCF TO ANYTMlNGPHTSFMICOl ON rr> TO ONT 1 3 . 1K 223 pPPUP «B| ANYTHHGPUTSEMlCOt ON 205l PRODUCTlPN ASSOCIATED WITH PRPTAp *'P,i 223. TOP STACK SYMPpl I ANYTHINpRtiTSFMTCOl PFPUCr TO TOMMFNT DUMMY S CP TO PNT 14 , Tk 15 CPPUP «9| TFPM f-PFIlp TH 15 TS THE SAME AS GROUP TH 11 M(^ 15 GROUP *9t TFPM 2Pfi PRODUCTION ASSOCIATFD WITH PpnTAB NO, | 22A. TOP STUw SYMPPL « AP T THMFT I CPPT MAPy IS IK'Pl'Tt A^Y- * IS INPUT* *A*'P IS HPl'Tl *P» IS TKPl'Tt *FI SF IS I K- P 1 1 T • ITHFN IS INP|'T» *F*'P IS IK'Pl'T! »* m 117 IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT PC! TP m + m If W IS INPUT REPiiCr r.n TP PNT 15 TO TERv 21?« PROOUCTlPN ASSPCTATEP HTH PPPTAp MP.! ?«9. TOP STACK SY^PPL« TFRN DUNMY 7 IS TNPUTI "* H SCANl TOP STACK SYMRpi I "*« POSH PAR ARITHMETTCPRTMARy SCANl TP TP TH 17 . 2^ht proouctlpn asspciatfp with ppptar no.t ?53. top stack symptll tfr>' ponmy 7 scan; 118 2l6t NTPN 226 fiPPUP 217| MPK 23? fiPOUP ?1P| KTFKi 236 cPPUP 219| NTPN ?40 GROUP ??0! TOP STACK SYMBPLI "/" PUSH PAR ARITHMETICPPIMARY SCAN) f,n TO TM 17 » ERROR PRPOUCTIpNi 50i TFRM SKIP SYMBOLS. 229. PRODUCTION ASSOCIATED WITH PRPTAR NO. I TOP STACK SYMPOLI TERN PS-16 RFOUCF TO ARITHMETTCEYPRFSSION OUNMY A CO TO PNT 16 . 51 i TFPN PRODUCTION ASSOCIATED WITH PRPTAB NO. I 233. TOP STACK SYNBOl t TFRN RS-16 PfOUCF TO ARITHMETTCFyPRFSSlON DUMMY A CO TO DNT 16 . 5?t TFPN PRODUCTION ASSPCIATFD KITH PRPTAp MP,, ?37, TOP STACK SYNPPLI TERN RS-16 RFPUCF TO ARITNMETICEyPRESSlON OUNMY * m TO PNT 16 . 53i TF»N ?M. PRODUCTION ASSOCIATFD WITH PPPTAp NP.j TOP STACK SYNRPll TFRM RS-16 PFPIlCF TO ARITHMFTTCEVPPFSSION PUNMY '* CO TO PNT 16 . Th 17 fiPPUP 5«i f.POUF TM KTPK 251 r.ppup 5tt 2?1 t PROD ARTTHMFTICPPIMARY 17 TS THF SAMF AS GPOU p ARTTHMFTTCPRImaRY TM 11 ppnoucTirN associated with prptab no.j TOP STAC* SYMPPLt AR T T HMTT I CPPT MA R Y RFPUCr TO TER^ PUMMv 7 TO TP PNT 18 , PS-16 N.Tp> 255 rRriiP 2??t MFf 25<5 r-PfHP 22? t 55t ARITMNFTTCPRI""ARY PPPPUfTlpN ASSprTATFP WlTM PPPTAFJ ^ , P.f TOP STACk SYNPPI t ARTTMM^ TICPPTNARY PFPUCF TO TFRM DUMNY 7 tp TP pnt 1P . 56t APTTM^FTICPPlMARY 2*6. RS-1 * PPfPUCTlPN AS^pTTATFO HTM PRPTAp NP.j ?60. TOP w IS INPUT •»>!» WITH PpPTAB ^P, > ARITHprTlCFyPRFSST 193. 2*1 t 120 REOUCF TO ARITNMETICEVPRESSTON GO TO DNT 11 , PRODUCTION ASSOCIATED WITH PRPTAB NO. I ??6. TOP STACK SYHBOLl ARITHMFTICFXPRFSST IS INPliTl "♦" SCAN; TOP 5TACK SYHBOLl •*» PUSH PAR SCAN) CO TO TH 15 , TERM 2J3| PRODUCTION ASSOCIATED WITH PPOTAB NO.i ?30. TOP STACK SVMRPLI ARTTHMFTICF*PRFSST SCANl TOP STACK SYHPPLt "•" PUSH PAR TERM SCAN? PO TO TH 15 . ?35i PPODUCTlPN ASSOCIATED WITH PRPTAB Nn, f TOP STACK SYNPPU ARITHMFTICPRIMAPV IS INP|iT» A*Y- 8 IS INPliT! #ANP IS INPnTl #0B IS INPliTl #FISE IS INPI'Tj *TMFN IS TNPl'Tl *FND IS INPUT! *•« IS TNP|iTI •»#" IS TNPpT t *«" IS INPliTl ")" IS INPliTl "-« IS TNPliTt »<» IS TNPl'Tl »*» IS INPUTI **" IS INPliTl ->" PEHIlCF TO TFRM r-0 TO ONT 15 . ??«. ?3»l PRODUCTION ASSPCIATFD WITH PPPTAp, NO, i ?57. TPP STACK SYMPPll AP T T HMTT I CPP T M APY IS TNPl'Tl "K" SCANl TPP STACK SYNBOLI "h* PU!» RFPUCF TO TFPn rn TP PNT 15 . 2«1| PRODUCTION ASSOCIATED WITH PRPTAB NP.i ?*9. TOP STACK SYMPOL* TFPN PUNMY 7 IS TNPUTl "x" SCAN! TOP STACK SYMPPLl "*" PUSH BAR ARITHMETTCPPTMAPY SCANi r,n TO TH 17 . ?«3i PRODUCTION ASSOCIATED WITH PRPTAP, MP, f ?53. TOP STACK SYMPPU TERM pi'NMY 7 SCAN! TOP STACK SYNPPl I V PUSH BAR ARITHMETTCPPTMAPY SCANI f-n TP TM 17 . 2h*t PROPUCTIPN ARSrCTATFP WITH PRPTAp NP.t ?65. TOP STACK SYNRPI. i ROPl EA^PRTMARY IS INPUT! A^Y- P IS INPUT! #OP IS TNPl'T» #FI SF IS TNPUTl fTMEN IS TNPUTl fFNP IS INPl'TI ")" pfpiicf to cnoi fanterm r.o TO PNT 19 . 2afl PROPUCTIPN ASSPCTATFP WITH PRPTAP MP. i 306. TOP STACK SYNPPI.I ROPI EA*'PRIM*RY SCANi TOP STACK SYNPPl I IANO Pll*H BAR POPI FANPPTMAPY SCAN; GO TP TM ?1 . ?Aflj PRODUCTION ASSOCTATFP WITH PRPTAB NP.s ?66. TOP STACK SYNRPLl ROP| EAWTFRM PlIMd'Y IS INPl'TI AKV- 8 IS INPl'T! #PR IS TNPUTl *FI SF IS INPllTl fTHEN IS TNPUTl *FNP IS INPl'TI ">? PFPUCP TP POOIFANTFPH 122 p.n TO DNT 19 . 2*9l PRODUCTION ASSOCIATED WITH PRPTAB NO. I 302. TOP STACK SYMSOLI BOOLEANTERM DUMMY SCAN) TOP STACK SYMBPLl *AND PUSH BAR POOLFANPRIMARY SCAN) GO TO TH ?J . 2M I PRODUCTION ASSOCIATED WITH PRPTAB NP.t ?78. TOP STACK SYMRpLl ARITHMFTICEYPRFSST IS INPUT I "■" SCANl TOP STACK SYMBOL! "■" PUSH PAR ARITHMFTTCFvPRfSSTOM SCANl GO TO TH 11 . ?s3i PRODUCTION ASSOCIATFD WITH PPPTAB MO.t ?M . TOP STACK SYMBPH ARITHMrTlCFYPRFSST IS INPUT * m t m SCAN! TOP STACK SYMPTLI T PUSH PAR ARITHMFTICFyPRFSSlON SCANl r-n TP TH ll . 2**l PRPOUCTlPN ASSPCIATFD WITH PRPTAB M0,| ?««. TOp STACK SYMPPLl ARlTHMFTlCFYPRFSST IS TNP|iT« *> m SCANl TOP STACK SYMPPLJ ">" PUSH PAR ARITHMFTTCEyPPFSS^ SCANl CO TP TH ll . ?S7l PPODUfTlPN ASSPCTATFP WITH PRPTAB NO, i ?87. TOP STACK SYMPPLl APTTHMFTlCFypRFSST IS INPUT! "«" SCANl TOP STACK SYMBOL* "«•" PUSH RAR ARITt-METTCFvPprssIO^ SCANt on TP TH 11 . ?SC| PPOPUCTlnN ASSPCTATFD WTTH ppPTAR MP.i ?90. TPP STAC* SYMPPLl ART THMFT I CF YPPF SST IS INPUT * "<" SCANl TOP STACK SYMPPLl "<" PU«tM PAR ARITHMFTTCFyPRFSSIPN SCANl en TP TH 1 1 , 2*1 I PRODUCTION ASSPCTATFD WITH PPPTAP NP.i ?93. TOP STACK SYMPPLl AR T THMFT ICFVPRFSST SCANl TPP STACK SYMPPLl ">" PUSH PAP ARITHMrTTCEYPRFSSlPN 123 2*31 NTPN 269 GROUP 264 I MP* 273 r,R0UP 2*5 1 T^ 21 MTH 21 SCAN* (SO TO TM 11 . FRROR PRPOUCTIPNt SKIP symbpls. 591 PflPLEANTFRM PRODUCTION ASSDCIATFD WITH PRPTAB NP.i 270. TOP STACK SYMRPU ROOlFANTERM RS-1* PFPUCF TO POPLFANFVPRFSSIPN DUMMY P r,n TO PNT 20 , 60s PPPLEANTFRM PROOUCTlPN ASSOCIATFP WITH PRPTAB NO.t 27t, TOP STACK SYMPPLl BOOLEANTERM »S-lf PFPUCF TO POOl FANFyPRFSSIPN DUMMY R CO TO PNT ?0 . fiROUP 61 i PPPLFANPPIMARv fiROUP TH 21 TS THF SAMF AS GROUP TH 12 GROUP Ml PPPIFANPRIMARV 2*6| PRODUCTIPN ASSPCIATFD WITH PRPTAB NP.« 192. TOP ! ?TACk sympti 1 TFPV IS INPl'Ti ANY- P IS HiPllTt #ANP IS INPl'Tt *0P IS INPl'T! #fi sf IS TMP|iT» #THFN IS TNPnTJ #FNP IS TNPliTI " = " IS INP(iT» f, #" IS TNPl'TI • , <" IS INPliTI ")" IS INPUTl *<* IS INPUTI «>* IS INPl'Tt ">" PFnjlCF TO ARITHMFTTCEvPRFSSlOM r.p TP pnt 11 . 2f7t PPOPITTlrN ASSPCIATFO WITH PRPTAB NO.i TOP STACk ^YMRTLi TERM IS I N' P 1 1 T I "♦ B SCANl 234. TOP STACK SYMBrit «♦« PU«H PAR scami r.n TO tm 15 , TFPM 2*9i ppnnurTiPN asstciatfo htm prptap mp.i TOP STAC* SVMPPL » TFRf S C A N J ?38. TPP STACK SYMPTLt "-•• PUSH PAP scan; OP TO TH 15 , TFP»' 271 i PPPOUCTlPN ASSrClATFD UTH PPPTAp NP.i 193. TTP STACK Y- P fAND #OP #FI SE fTHEM fE^D ??« IS IS IS IS IS IS IS IS IS IS IS IS IS IS INPUT-! INPUT! INPUT! INPUT! TNPl'T! INPUT! INPUT! TNPl'T! TNPUTI INPUT! INPUT! U'PUT'I INPUT!. TNPl'T! f-f) TO »<» l» + t» *£« PEPUCF TP DMT 15 . TFP" ?77i PRODUCTIPN ASSPCTATFP WTTH PPPTAB NP.» TPP STACK SYNRpt I AR T THMr T I CPRI MAP Y IS INPtiTI 'n' SCAN! ?S7. TPP STACK SYMPPL t "x« PUSH RAP ARITHMFTTCPPTMAPV SCAN! fP TP TH 17 , 125 2791 PR00UCT1PN ASSOCIATED WITH PRPTAB NO. I ?M . TOP STACK SYMBPLt ARITHMFTICPRIMAPY SCAN} TOP STACK SYMBPLt "/" PUSH PAR ARITHMfTTCPRTMAPV SCAN! ftp TO TH 17 . ?All PRODUCTION ASSOCIATED KITH PRPTAB NP.i 225. TOP STACK SYMBOL! TERM DUMMY 7 IS INPUT I ANY- 8 IS INPUTl 1 f ANO IS INPUT 1 *OR IS INPUT 1 fELSE IS INPUT I fTHEN IS INPUT I #end IS INPUT 1 tt B tt IS INPUT I •»#*> IS INPUT 1 **< m IS INPUT 1 " )*• IS INPUT! | tt-tt IS TNPUTl *ttt REPUCF TO TERM RP TO PNT 15 . 2fl?l PROOUCTlpN ASSPCIATED WITH PRPTAB NP.i TOP STACK SYMBPI.I TERM DUMMY 7 IS INPUTl •\K W SCANl ?«9. TOP STACK SYMBPLt *** PUSH BAR ARITNMETTCPPIMABv SCANl PP TO TH 17 . 2flAi PROOUCTlpN ASSPCIATED WTTH PRPTAB NP.i TOP STACK SY-MPPLl TERM DUMMY 7 SCANl ?53. TOP STACK SYMPPLI "/" PUSH PAP ARITHMETICPPTMAPY SCANj pP TO TH 17 . 2P6I PROPUCTlPN ASSOCIATED KITH PRPTAB NP.t ?78. TOP !5TACk SVMBPLi ARTTHMFTICEXPRFSST IS INPUTl *■" SCANj TOP STACK SYMBpLi t»_w SCANl PP TP PUSH BAR ARITHMETlCEyPRESSlON TH 11 . ?flPl PRODUCTION ASSOCIATED WITH PRPTAB NP.i ?01. TOP STACK SYMPPLt ARTTHMFTICEYPRFSST IS INPUTl "*" SCANl TOP STACK SYMPPLt tt^tt 126 PUSH MR ARITHMETlCEYPRFSSlON SCAN) 00 TO TH 11 . 200| PRODUCTION ASSOCIATED WITH PROTAB NO. I 2M. TOP STACK SYMBOL* ARITHMETlCEXPRFSSI IS INPliTl ">" SCANI top Stack symbol* "> w push par arithmetlceyprfsslon scan) go to th 11 . 2«2| production associated with protab no. i 287. top stack sympol* arithmfticeyprfssi IS INPUT* "<" SCANI TOP STACK SYMBDtt "<" PUSH PAR ARITHMETlCEYPRFSSlON SCAN| RO TO TH 11 . 29«l PRODUCTION ASSOCIATED WITH PROTAB NO.i 290, TOP STACK SYMBOL* ARITHMFT ICEVPRFSSI IS INPUT* "5" SCAN) TOP STACK SYHPOL* "<" PUSH PAR ARITHHETICEVPRESSION SCAN} 00 TO TH 1 1 . 296| PRODUCTION ASSOCIATED WTTH PROTAB NO.t 293, TOP STACK SYHPOL* ARI THMFT I CEYPRFSST SCAN) TOP STACK SYMBOL* ">* PUSH PAR ARITHMETJCEVPRESSION SCANf 00 TO TH 11 , 2©P« FRROR PRODUCTIONi SUP SYMPOlS. KTPK 304 OPOIIP 62t POOL FANPRIMARv 2Q9i PRODUCTION ASSOCIATED WITH PROTAB NO.i 305. TOP STACK SYMBOL* ROOl EAWPRIMARY *S-l" REDUCE TO ARITHMETICEYPRESSION f,0 TO ONT 11 . 30?t PRODUCTION ASSOCIATED WITH PRPTAB NO.! TOP STACK SYMBOL! TERM IS INPUT! "♦" SCAN) ?3A. TOP STACK SYMBOL! SCAN PUSH PAR TERM fin TO * TH 15 , 30A| PRODUCTION ASSOCIATED WITH PRPTAB NO. i TOP STACK SYMBOL! TERM SCAN! ?38, TOP STACK SYMBOL! "-" PUSH PAR TERM SCANl CO TO TH 15 , 306i PRODUCTION ASSOCIATED TOP STACK SYMPOLI WITH PRPTAB MO. t ARTTHMETICEXPRFSST 193. IS INPUT! ANY- « IS TNPUTl #Awn IS INPUT! #OR IS INPUT! #EISE IS INPUTt ITHEN IS TNPUTt #END IS INPUT! J**** IS TNPUTl Hf|f« IS INPl'TI *<»» IS INPUT! ♦» )♦» IS INPUT! »•<•• IS TNPUT! «>♦» IS INPUT! t*y* PEHUCF en to ONT 1 1 TO ARITHMETTCEVPRESSION 307| PRODUCTION ASSOCIATED WITH PRPTAB NO.t ??6. TOP STACK SYMPOLI ARITHMETlCEXPRFSST IS INPUT! *♦" SCANl TOP STACK SYMBPL! "4« PUSH RAR TERM SCANJ PP TO TH 15 . 128 SCANI TOP STACK SYMBOL! "-" PUSH BAR TERM SCANI 60 TO TH 15 . 3111 PRODUCTION ASSOCIATED WITH PROTAB NO. I 2?«. TOP STACK SYMBOL! ARITHMPTlCPRIMARY IS TNPUTI any- b IS INPUTl *AND IS INPUTl *op IS INPUT! #ELSE IS INPUTl 'THEN IS INPUTl 'END IS INPUT! "■" IS INPUTl m +* IS INPUT* "S" IS INPUT • ")" IS INPUTl "•* IS INPUTl »<* IS INPUT! "♦" IS INPUT « "*" IS INPUTl *>" PEPUCF TO TER" f,n^ TO ONT 15 . 31 ? | PRODUCTION ASSOCIATED WITH PRnTAR NO. I ?57. TOP STACK SYMBOL'! ARITHMETI CPRl MARY IS INPUTl "x" SCANI TOP STACK SYMBOL! "*" PUSH PAR ARITHMETICPPTMAPY SCANI fiD TO TH 17 . 31«I PPODUCTlPN ASSOCIATED WITH PRPTAB NO. I 2*1. TOP STACK SYMPPL'I ARI THMFTI CPRTMAPY SCANI TOP STACK SYMPPL» "/" PUSH PAR JRITHMETTCPPTMAPY scani r-n TO TH 17 . 3161 PPODUCTlPN ASSOCIATED WITH PRPTAB NO.! ??5. TOP STACK SYMBPLl TEPf DUMMY 7 IS INPUTl ANY- P IS INPUTl 'AND IS INPUTl *0P IS INPUTl *El SE IS INPUT* iTHEN IS TNPUTt fEND IS INPUT! »*»■ IS INPUTl ** m IS INPUT! m S m IS INPUTl •M" IS TNPUTI "■" IS TNPUTI m < m IS INPtiTl + + * IS INPUT! "*" 129 is inputi »»" reducf to term GO TO DNT 15 • 3lTl PRODUCTION ASSOCIATED WITH PROTAB NO.! 2*9. TOP STACK SYMBOLl TERM DUMMY 7 IS INPUT! «*«" SCAM) TOP STACK SYMBOL! "*" PUSH BAR ARITHMETTCPRlMAPv SCANl r,0 TO TH 17 , 3l9 t PRODUCTION ASSOCIATED WITH PROTAB MO.i ?53. TOP STACK SYMBOLl TERM DUMMY 7 SCANl TOP STACK SYMBOLl »/» PUSH PAR ARITMMETICPPTMARY SCANl fiO TO TM IT . 32l| PRODUCTION ASSOCIATED WITH PPPTAB NO. i 310. TOP STACK SYMBOLl ARI THMFTlCEXPRFSST SCANl TOP STACK SYMBPLI "l» REPUCF TO PUMMy 23 co TO ONT 23 . 323i ERROR PRODUCTIONi SKIP SYMRPlS. NTPN 243 GROUP 65t DUMMY 23 3?A| PRODUCTION ASSOCIATED WITH PPPTAB NO. i 2*3. TOP STACK SYMPPLI DUMMY ?3 REPUCT TO ARITHMFTTCPPTMAPv f,0 TO PNT 17 . CNTPN pROUP 66 1 SOURCE OF THIS COMPINATIPN ISi NTH 1 3?5l PRODUCTIPN ASSPCIATED WITH. PPPTAB NO. I 157 TOP STACK SYMppLl COMMENT IS TWPliTt ANY- 7 PUSH oAR STATFMENT SCAN| r,0 TO TM 6 . 3?6i PRODUCTIPN ASSPCIATFD WITH PROTAB NO. I 165 TOP STACK SYMPPLI COMMENT PFHUCF TO PLOCk DUMMY 2 CO TO PNT 7 . CNTPN 1 ($*GUP 67i POOLEANFYPPFSS ION SOURCE rr THIS COMPINATIPN TS t TH 6 3?7l COMBINED PRODUCTION ASSOCIATED WITH PROTAB NO. t 52* 109# 117* 1A3* TOP STACK SYMRPLi ROOLEANFXPRFSSTPN Bs-H SCANl GO TO CTPN 6 . 130 CTPN 2 GROUP 66l SOURCE OF THIS COMBINATION IS t NTH 6 3?8| PRODUCTION ASSOCIATED WITH PRPTAB NO.t 60, TOP STACK SYMBOL! *60 IS INPUTI #T0 SCANf TOP STACK SYMBOLl #T0 SCANf TOP STICK SYMBOLl <*T> PS»0 REDUCE TO STATFMFNT <*0 TO RNT 6 • 331 t PRODUCTION ASSOCIATED KITH PROTAB NO.i 63. TOP STACK SYMBOLl #60 SCANI TOP STACK SYMBOLl <*I> PS-o REOUCF TO STATEMENT fiO TO ONT 6 . CTPN 3 GROUP 69 1 SOURCE OF THIS COMBINATION ISl NTH 6 333i PRODUCTION, ASSOCIATED WITH PRPTAR NO.i 7«. TOP STACK SYMPPLl <*!> • T-IO IS INPUTI "I" SCANi TOP STACK SYMBOL I "l" SCANI TOP STACK SYMPPU •e" PUSH PAR ARITHMETICFvPRESSlON SCAN? nn TO TH 11 , 336| PPODUCTIPN ASSOCIATED WITH PRPTAB NO.i PI. TOP STACk SYMBOLl <*T> PT-10 SCANf TOP STACK SYMPPLl "♦■» PUSH PAR ARITMMETTCEVPRESSTON SCANI f,n TO TM 11 . 33Bl PRODUCTION ASSOCIATED WTTM PRPTAB NO.t P6. TOP STACK SYMPPt I <*T> IS INPUT t "I" "«" SCANt TOP STACK SYMPpI l »i« SCANf TOP STACK SYMPPLl "■" PUSH PAR POP» FANEyPRTSSlPK SCANf GO TO TH 1? . 131 341 I PRODUCTION ASSOCIATED WITH PROTAB NP.l 92. TOP STACK SYMBPLl <*T> IS INPUT! "♦" SCAN) TOP STACK SYMBOL! "♦" PUSH BAR POOLFANEyPRFSSTON SCANI GO TO TH 12 , 3*3t PRODUCTION ASSOCIATED WITH PROTAB NO.t 1*7. TOP STAC* SYMBOtl <*T> BS-B SCANI TOP STACK SYMBOL! "I" REDUCE TO STATEMENT DUMMY * GO TO DNT 10 , CTPN A CROUP 70| SOURCE OF THIS COMBINATION IS « NTH 6 3A5| COMBINED PRODUCTION ASSOCIATED WITH PPOTAB NO. 97# 12*. 133* 149» TOP STACK SYMPOLl #IF PUSH SBR POOl FANF*PRFSSIDN SCANI CO TO TH 12 . CTH 5 CROUP 71| SOURCE OF THIS COMBINATION ISl TH 12 CROUP CTH 5 IS THF SAME AS GROUP TH 1? CNTH 5 GROUP 71 | TFPM SOURCE PF THIS COMPlNATTPN ISl TH 12 3ji6 i PRODUCTION ASSCCIATFD WITH PRPTAB NO.i 192. TOP STACK SYMBPLl TERM IS INPUT! ANY- B IS INPUT! fAND IS TNPl'Tl tOP IS TNPiiTl fFLSE IS INPUTf fTHFN IS TNPUTi *FND IS INPUTI "e" IS TNPI'TI ••#«» IS INPPTI "«" IS INPUT! ♦»)" IS INPUT! "<" IS TNPliTl ">" IS INPMTI ">* PFPUCF TO ARITHMETTCEyPRESSION PP TO DNT 11 . 3fl7 1 PRODUCTION ASSOCIATED WITH PRPTAB NP.i ?34. TOP STACK SYMBPLl TERN IS INPUT! ■»♦" SCANI TOP STACK SYMBPLl m +» 132 SCAN) GO TO PUSH PAR TERM TH 15 . 3A9I PRODUCTION ASSOCIATED WITH PROTAB NO. I 238. TOP STACK SYMBOL! TERM SCANI TOP STACK SYMBOL! "•■ PUSH BAR TERM SCANI 60 TO TH 15 • 3S1 t PRODUCTION ASSOCIATED WITH PROTAB NO.! 193. TOP STACK SYMBOL! ARITHMFTlCEXPRFSST IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS INPUT IS TNPUT ANY- 8 f AND *0R fELSE fTHEN #END M JW REPUCF TO ARITHMETICEVPRFSSTON r.n TO pnt n , 3S?| PRODUCTION ASSOCIATED WITH PRPTAB NO.l TOP STACK SYMBOL! ART THMFTl CEXPRFSST IS INPUTI "♦" SCAN! 226, 3M| 3^6| 3«i7i TOP STACK SYMPPLI *j.* SCAN» GO TO PUSH PAR TERM TH 15 . PRODUCTION ASSOCIATED WITH PROTAB HO. I 230. TOP STACK SYMBOL! ART THMFTI CFypRFSST SCANI TOP STACK SYMBOL I "-" PUSH PAR TERM SCANI GO TO TH 15 . PRODUCTION ASSOCIATED WITH PPOTAB NO.l 194. TOP STACK SYMBOL! BOOl EAWTERM IS TfcPuT IS TNPUT IS INPUT IS INPUT IS TNPUT ANY- ft f El SE fTHFN IFNO PEPUCF TO POOLFANFvPRFSSTON GO TO ONT 12 . PRODUCTION ASSOCIATED WITH PROTAB NO.l TOP STACK SYMBOL! BOOl EAMTERM SCANI 271. TOP STACK SYMBpLl fOR 133 PUSH BAR POOLFANTFRM SCANJ fin TO TH 19 , 359l PRODUCTION ASSOCIATED WITH PRPTAB NO.l 195. TOP STACK SYMBPLl BOOLEANEXPRFSSf ON IS INPllTi ANT- 6 IS INPUTI #ELSE IS INPUT! #TMEN IS INPUTI *END IS INPUT« «)" REDUCE TO BOOLFANEyPRFSSION CO TO DNT 12 , 3aOi PRODUCTION ASSOCIATED WITH PRPTAB NO.l 267. TOP STACK SYNBnLI BOOLEANFXPRFSSIPN SCANI TOP STACK SYMBOL! #OR PUSH BAR POOLFANTFRM SCANI 60 TO TH 19 , 3ft2l PRODUCTlPN ASSOCIATED WITH PRPTAB NO.i 224. TOP STACK SYMBOL! ARITHMFTICPPIMARY IS INPUT! ANY- 8 IS INPUT!. *AND IS INPUT! iOP IS INPUT! fELSE IS INPUT! *THFN IS INPUT! #FND IS INPUT! "«" IS INPUT! »#» IS INPUT! «<" IS INPllTi ")" IS INPUT! » m " IS INPUT! "<" IS INPUT! •♦• IS TNPUT! "S" IS TNPUT! »>• PFPUCF TO TERM GO TO ONT 15 , 3*3| PRODUCTION ASSOCIATED WTTH PRPTAB NO.l 257. TOP STAC* SYMBOL: ART THHFTlCPRIMARY IS TNPUT! "k* SCANI TOP STACK SYMROLI "x" PUSH PAR ARITHMFTICPRTMAPY SCANI fiO TO TH 17 , 3ft5i PRODUCTION ASSOCIATED WITH PPOTAB NO.l 261, TOP STACK SYMBPLl ARITHHFTI CPRIM*RY SCAN! TOP STACK SYMBOL! "/" PUSH PAR ARITHMETICPRTMAPY SCANI GO TO TH 17 , 3A7i PRODUCTION ASSOCIATED WITH PRPTAB NOti 225, TOP STACK SYMBPLl TFRM DUMMY 7 IS INPliTl ANY- 8 IS INPUT! #AND IS INPUTI #OR IS INPUTI fCtSE IS INPyT! • THEN IS INPUT! «FNO IS INPUT! •»■« IS INPUT! #^» IS INPUT! HJW IS INPUT! •» )*» IS INPUT! «■• IS INPUT! ♦»<*» IS INPUT! n+m IS INPUT! WJH IS INPUT! w>» reduce to term ftO TO ONT 15 . 13^ 36B| PRODUCTION ASSOCIATED WITH PROTAB NO.i ?A9. TOP STAC* SYMBOL! TERM DUMMY 7 IS INPUT! "*" SCANI TOP STACK SYMBOL! "x* PUSH BAR ARITHMETICPPIMARV SCANI BO TO TH 17 . 370| PRODUCTION ASSOCIATED WITH PROTAB NO.i ?53. TOP STAC* SYMBOL! TERM DUMMY 7 SCANI TOP STACK SYMBOL! V" PUSH dAR ARITHMETICPRTMARV SCANi f,n TO TH 17 , 372 1 PRODUCTION ASSOCIATED WITH PRflTAB NO. i ?65, TOP STAC* SYMBOL! BOOl EANPRIMARY IS INPUTt ANY- B IS TNPUTl #OP IS INPUT! tn SE IS INPUTI #TMEN IS TNPUTl #END IS TNPllTl ")" REDUCF TO POOLEANTERM GO TO ONT 19 . 373i PRODUCTIPN ASSOCIATED WITH PROTAB NO.i 306. TOP STACK SYMBOL! BOOl EANPRIMARY SCAN) TOP STACK SYMBOL! fANO PUSH PAR ROO! EANPRIMARY SCANf BO TO TH 7\ . 375t PRODUCTION ASSOCIATED WITH PROTAB NO.! 266. TOP STACK SYMBOL! BOOl EANTERM DUMMY IS INPUT! AKY- 8 IS TNPKTt #0R IS INPUTI fELSE IS INPUTI *THEN IS TNPUTl fFND IS TNPllTl ")" REDUCF TO BOOLFANTFRM 60 TO ONT 19 , 135 376| PRODUCTION ASSOCIATED WITH PROTAB NO. 8 302. TOP STACK SYMBOL! BOOLEANTERM DUMMY SCANJ TOP STACK SYMBOL* *AND PUSH RAR ROOI FANPRIMARY SCAN| ftO TO TH 21 , 378i PRODUCTION ASSOCIATED WITH PROTAB NOti ?78. TOP STACK SYMBOLI ARITHMFTICEXPRFSST IS INPUT! »•• SCAN) TOP STACK SYMBOL! "■" PUSH PAR ARITHMETTCEVPRFSSION SCANI 60 TO TH 11 . 360| PRODUCTION ASSOCIATED WITH PROTAB NO.i 281. TOP STACK SYMBOLI ARTTHMFTICEXPRFSST IS INPUT! "#" SCANJ TOP STACK SYMBOL! *** PUSH PAR ARITHMETICEyPRESSlON SCAN! GO TO TH 11 , 3A?| PRODUCTION ASSOCIATED WITH PROTAB NO.t 284. TOP STACK SYMBOLI ARITHMFTICEYPRFSST IS INPUT! ">" SCAN) TOP STACK SYMPOLl ">* PUSH PAR ARITHMETTCEYPRESSTON SCAN! 60 TO TH 11 . 3flA| PRODUCTION ASSOCIATED WITH PPOTAB NO. i 287. TOP STACK SYMBOL! ARITHMFT ICEXPRFSSI IS INPUT l "<" SCANI TOP STACK SYMPOLl *<» PUSH PAR ARITHMETTCEVPRFSSION SCANI GO TO TH 11 , 3**1 PRODUCTION ASSOCIATED WITH PROTAB NO. I 290. TOP STACK SYMBOL! ARI THMFTlCFYPRFSST IS INPUT! "<" SCAN) TOP STACK SYMBOL! w <* PUSH PAR ARITHMETTCEVPRFSSION SCANf 60 TO TH 11 . 3A8i PRODUCTION ASSOCIATED WTTH PROTAB *'0.t ?93. TOP STACK SYMBOL! ARI THMFT ICEXPRFSST IS INPUT • ">" SCANI 136 TOP STACK SYMB0L J ush "bIr ARITHMETTCEVPRFSSION SCAN! 60 TO TH 11 • SCAN) 3931 39«l TOP STACK StMBOL. Eou «)- TO ^^ „ f,0 TO PNT P3 . 39?l ERROR PRODUCTION, SKIP SYMBOLS. CTPN 6 fiROUP 72 1 SOURCE OF THIS COMBINATION ISl CNTPN 1 COMBINED PRODUCTION ASSOCIATED WITH PPPTAR NO., 53, 118* TOP STACK SYMBOL! *THEN IS INPUT, A»% USH 7 spR 5TATF „ E NT SCAN, fiO TO TH 6 . COMBINED PRODUCTION ASSOCIATED WITH PPOTAB NO., Ill* 1*5' . e ,* TOP STACK SYMBOL, 'THEN M-l« SCANI GO TO CTPN 9 . CNTPN 7 fiROUP 73, ROOLEANEXPRESSION SOURCE OF THIS COMPlNATlPN IS, CTPN A 39 „ COMBINED PRODUCTION ASSOCIATED WITH PPOTAP NO., 09* 12^# 135* 151* TOP STACK SYMBOL' POO L E AGRESSION »S-1» SCANI fiO TO CTPN 10 , CNTPK 8 fiROUP 74, STATEMENT SOURCE Of THIS COMBINATION IS, CTPN 6 3PM COHBINED PPODUCTIO^ASSOCIATFO WITH PPHTAP NO., TOP STACK ^MBPU 2 STATEMENT PS-1* SCAN, fiO TO CTPN 11 . CTP* 9 fiROUP 75, SOURCE OF THIS COMPlNATlPN IS, CjPN 6 307, PRODUCTION ASSOCIATED WlTM PRPTAB NO., TOP STACK SYMBOL! Tl-SF 1* TNPl'T' ANY" 7 IS J PUSH PAR STATFMENT SCAN, fiO TO TH 6 . 306, PRODUCTION ASSOCIATED J IT* J MflT A B NO.. TOP STACk SYMBOL! *FLSE ill TOP 5TAL REOUCF TO STATEMENT 11?. 1 07. 137 GO TO ONT 6 . CTPN 10 fiROUP 761 SOURCE OF THIS COMBINATION ISl CNTPN 7 399i COMBINED PRODUCTION ASSOCIATED WITH PPPTAR NO.i 100* 136t TOP STACK SYMBOLl tTHEN IS INPllTi ANY- 7 PUSH SBR STATFMENT SCANl fiO TO TH 6 • AOOt COMBINED PRODUCTION ASSOCIATED WITH PPPTAB NO.i 128» 153* TOP STACK SYMBOLl #THEN PS-14 SCANl 60 TO CTPN 13 . CTPN 11 fiROUP 77| SOURCE OF THIS COMBINATION ISl CNTPN 8 40li PRODUCTION ASSOCIATED WITH PPOTAb NO.i 56. TOP STACK SYMBOLl #ELSF IS INPUT 1 ANY- 7 PUSH RAR STATFMFNT SCANl fiO TO TH 6 . 4o2l PRODUCTION ASSOCIATED WITH PRPITAB NO.i 122. TOP STACK SYMBOLl #FLSF BS-15 REPUCF TO STATFMENT fiO TO ONT 6 . CNTPN 12 fiROUP 76 i STATEMFNT SOURCE Pr THIS COMBINATION ISl CTPN 10 403l COMBINED PPOOUCTION A*SOCTATFO WITH PPOTAP WO.i 102# 13R# TOP STACK SYMBOLl STATEMENT • S-H SCANl fiO TO CTPN 14 , CTPN 13 fiPOUP 79i SOURCE OF THIS COMBlMATlPN I S I CTPN tO 4pA| PROOUCTIPN ASSOCIATED WITH PPPTAB NO.i 1*9, TOP STACk SYMBOLl #FISF IS INPUT I ANY- 7 PUSH PAR STATFMENT SCANl GO TO TH 6 . 4G5| PRODUCTION ASSOCIATED WITH PRPTAB K'P.i 1 55. TOP STACk SYMBOLl #FLSF PS-15 REPUCF TO STATFMENT fiO TO ONT 6 . CTPN 14 GROUP 80| SOIRCF PF THIS COMBINATIPN ISl CNTPN 12 138 «06l PRODUCTION ASSOCIATED WITH PROTAB NO. I 103. TOP STACK SYMBOLl IFLSF IS INPUTI ANY- 7 PUSH PAR STATFMENT SCANI GO TO TH 6 . 407t PRODUCTION ASSOCIATED WITH PROTAB NO. I 110, TOP STAC* SYMBOLl IELSF #S-15 REOUCF TO STATFMENT GO TO ONT 6 . 139 APPENDIX D IMPLEMENTATION FLOW CHARTS These flow charts, Figures D.l through D.IO, show the "basic structure of the implementation of the CF to FPL conversion algorithm. They are relatively complete. The only things of any importance which are omitted, are those which are highly dependent on the methods of storing intermediate data. Any box which includes a reference of the form (D.n) can be replaced by the entire flow chart of Figure D.n. lUo ADD PRODUCTION ZERO BUILD HEAD -SYMBOL TABLES INSERT DUMMY NON- TERMINALS* BUILD NON- COMBINED GROUPS (D. 2) BUILD COMBINED GROUPS (D. 3] CALCULATE D NTH-I FOR ALL I CALCULATE D TH-I FOR ALL I ^Update head- symbol tables as dummies are inserted D.l Outer Structure 1^1 BUILD NT(n) FOR I (D.5) BUILD NTH-I BUILD TH-I (DA) MS is the number of nonterminal symbols ** groups are necessary for NT -I if NT-I appears as a nonfirst symbol on some RHS *** NT(n) is built for each nonfirst occurrence of I at n D.2 Construction of Noncombined Groups ite ( START W- ■]►» it I <- I + 1 BUILD CNT(I) (D.6) BUILD CTH(I)& CNTH(I) (D.7) BUILD CT(I) (D.8) * CGPT points to last entry in table of combined groups ** for the same nonterminal symbol D.3 Construction of Combined Groups 1U3 f START V- BUILD FPL PROD FROM or CH* FIXGP (D.9) * use mapping rules of section 3*3 on appropriate descriptor set *•* this uses appropriate lookahead and/or combination to eliminate preclusions and generates appropriate T {-a, n) productions inline D.k Construction of TH, NTH,, CTH and CNTH Groups from Descriptor Sets lkk ( START V BUILD FPL PROD. WITH DESC. n £ BUILD FPL PROD. WITH DESC. J J <- J«- J + 1 YES/PROTAB •* C.T TS A REC>-*- YES FIXGP (D-9) * see D.2 ** FEND points to last entry in PRj^TAB *** see D.2 D.5 Construction of NT (n) Groups for Nonterminal I 1^5 NOTE BUILD FPL PROD. WITH DESC. J J «- YES END }-<- J «- J + 1 YES FIXGP (D.9) * Build.. FPL productions with descriptors attached to the markers in combined group I (see D.3) X is the nonterminal symbol whose markers are in combined group I D.6 Construction of CNT-I Groups lU6 \ > CREATE * CH DO ! > IF— D.k yf ^ DO D.k CREATE ** -^ CH *■* CH is the union of D for each nonterminal X which has a marker in combined group I (see D.3) ■*-* same as @ but using D NTH-X D.7 Construction of CTH-I and CNTH-I Groups Ik-T Hi Si H ^ q u "r -1 H O M fli pq Ph Q O + 1 o H o •H -P « •H ■§ o o p CD -P I 1-3 0) P CO •H hi H Ph o CO & o s H EH O C|H o o ■H P O £ P CO s o o CO Q lkS I SIART V- OUTFUT PROD I JBSL -0. BUILD M (PROD WITH DESC K) LK- X YES J - J + 1 • LK1 FROM »*# PROD I INIT. LK2 FROM PROD J CONSTRUCT LOOKAHEAD + DO CCMB (D.IO) tote PRECLUDES J LK is the lookahead length to be used FPEND is the last FPL production in a group *** see Chapter Five t see Figure 2 of Chapter Five MAXLK is the length of the multisymbol lookahead D.9 Preclusion Elimination and T (k) Production Generation ii+9 START I FAILURE \ I EXIT J YES •♦{REDUCTION] IS NO HEW COMB *** GROUP N 1 ' I PUSHES 'cttoB MARK N I PUSHES SP. MARK N 1 1 1 ' I GOES TO era (N) DELETE PROD J ' 1 YES YES NO NO YES NO NEW COMB GROUP M**' -^< GOES TO NO NO YES UPDATE COMB. N** UPDATE COMB. I PUSHES COMB MARK M YES NEW COMB GROUP N. GOES TO CT(N) DELETE PROD J GOES TO CTH (M) D.IO Combination of Productions (Part I) T)-^- / tttV s t \ „, / NOTE NSIL>^ NOTE \NO > ADD MARKER PUSHED B J TO M a YES YES CREATE NEW COMB GROUP P** ADD MARKER PUSHED BY J TO N ^ \ t REPLACE K OF M WITH SPECI MARKE AL R P 150 DELETE PROD J * The transfer labels T(K_) and T(K ) from productions I and J are the entries ** add the transfer label T(K ) from production J *** markers pushed by I and J are the entries t add the marker pushed by J tt I must push (M) ttt marker pushed by J is for the same nonterminal as the simple marker which is K-th entry of combination M same as ttt except K-th entry is special marker N tt K-th entry of M and bar pushed by J are the entries D.10 Combination of Productions (Part II) 151 LIST OF REFERENCES [l] Lewis, P. M. and Stearns, R. E. , "Syntax-Directed Transduction", J. ACM 15 , (July I968), pp. k&5-k93' [2] Feldman, J. A., "A Formal Semantics for Computer Oriented Languages", Carnegie -Mellon University , Pittsburgh, Pennsylvania, I96U, Summarized in Comm. ACM 9 , (Jan. 1966) , pp. 3-9* [3] Lucas, P. and Walk, K. , "On the Formal Description of PL/l", ACM Symposium on Programming Languages Definition , San Francisco, California, Aug. 1969* [h] McKeeman, W. M. , et al, "The XPL Compiler Generating System", AFIPS I968 Fall Joint Computer Conference, San Francisco, California, Aug. I969. [5] Wirth, N. and Weber, H. "EULER-A Generalization of ALG0L and its Formal Definition", Part I, Part II, Comm. ACM 9 , (Jan., Feb. I966), pp. 13-25, 89-99. [6] Ingerman, P. Z. , " A Syntax Oriented Translator ", Academic Press, Inc. , New York, 1966. [7] Brooker, R. A, and Morris, D. , "A General Translation Program for Phrase Structure Languages", J. ACM , £ (Jan. 1962), pp. 1-10. [8] Trout, R. G. , "A Compiler-Compiler System", Proc. ACM 22nd National Conference , I967, pp. 317-322. [9] Floyd, R. W. , "A Descriptive Language for Symbol Manipulation", J. ACM , 8 (Oct. I96I), pp. 579-58U. [10] Evans, A., "An ALG0L 60 Compiler", Annual Review in Automatic Programming , h (196^), pp. 87-12^. [ll] Naur, P., et al, "Revised Report on the Algorithmic Language ALGjSL 60", Comm. ACM , 6 (Jan. I963), pp. 1-17- [12] Earley, J. C. , "Generating a Recognizer for a BNF Grammar", Computation Center Report , Carnegie -Mellon University, Pittsburgh, Pennsylvania, I965. [13] DeRemer, F. L. , "On the Generation of Parsers for BNF Grammars: An Algorithm", ILLIAC IV Document No. 1^7 , Department of Computer Science, University of Illinois, Urbana, Illinois, I968. 152 [Ik] Knuth, D. E. , "On the Translation of Languages from Left to Right ", Information and Control , 8 (Oct. I965), pp. 607-639. [15] DeRemer, F. L. , Practical Translators for LR (k) Languages," Massachusetts Institute of Technology , Project MAC , Cambridge, Massachusetts, 1969* [16] Korenjak, A. J., "A Practical Method for Constructing LR (k) Processors", Comm. ACM , 12 (Nov. I969) pp. 613-623 . [17] Earley, J. C. , "An Efficient Context-Free Parsing Algorithm", Dept. of Computer Science , Carnegie -Mellon University, Pittsburgh, Pennsylvania, I968, Summarized in Comm. ACM , 13, (Feb. 1970), pp. 9U-IO2. [18] Hopcroft, J. E. and Ullman, J. D. , Formal Languages and Their Relation to Automata , Addison -Wesley, Inc., Reading, Massachusetts, I969. [19] Floyd, R. W. , "Bounded Context Syntactic Analysis", Comm. ACM , J_ (Feb. I96IO , pp. 62-67. [20] Mercer, R. L. , "TWINKLE-A Syntax Language for a Translator Writing System", ILLIAC IV Document No. 2l8 , Department of Computer Science, University of Illinois at Urb ana -Champaign, 1970. [21] Machado, N. E. , "ISL - A Semantics Language for a Translator Writing System", ILLIAC IV Document No. 203 , Department of Computer Science, University of Illinois at Urbana-Champaign, I969. [22] LaFrance, J. E., "Syntax-Directed Error Recovery for Syntax- Directed Compilers", Ph. D. Dissertation in Process, Department of Computer Science, University of Illinois at Urbana-Champaign. [23] Horning, J. J. and Lalonde, W. R. , "Empirical Comparison of LR (k) and Precedence Parsers", ACM , SIGPLAN Notices , 11, (Nov. 1970), pp. 10-2^. 153 VITA Alan James Beals was "born on May 2k, 19^-1 at Chicago, Illinois. He attended high school at Leyden Community High School in Franklin Park, Illinois, from which he was graduated in June, 1959* He attended under- graduate school for one year at Knox College in Galesburg, Illinois. After two years in the United States Army, Mr. Beals completed his undergraduate education at the University of Illinois where he grad- uated in February, 1967* with a B.S. In Mathematics. He then began his graduate work at the University of Illinois where he received his Master's Degree in February, 1969. During the time spent earning his Master's Degree, and while working on his Ph. D. , he was also a Research Program- mer on the ILLIAC IV Project in the Department of Computer Science of the University of Illinois. JNCLASSIFIED Security Classification DOCUMENT CONTROL DATA .R&D (Saeurlty claaaltleatlon ol tltto, body of abattmct and Indomlnd annotation mumt bo wiUfrt irlfw trie ovorall report la ctoaalllod) IGINATING ACTIVITY (Corporal* author) mter for Advanced Computation liversity of Illinois at Urbana-Champaign bana, Illinois 6~l80J - 3M. REPORT SECURITY C L A SSI FIC A TIOK' WLAgSITOP 2b. CROUP PORT TITLE EE AUTOMATIC GENERATION OF DETERMINISTRIC PARSING ALGORITHMS script! ve NOTES (Typo 0/ report and hteluolro datoa) 'search Report THOR(S) (Flrat naato, middto Initial, laat namo) .an James Beals PORT DATE iy 23, 1971 TO. TOTAL NO. OF PACES l60 76. NO. OF REFS £3 ONTRACT OR CRANT NO. 5AF 30(602)-4lW- ROJECT NO. £>A Order 788 ma. ORIGINATOR'S REPORT NUMBER(S) ILLIAC IV Document No. 2*r8 ab. OTHER REPORT NOU) (Any othor numborm that may be aaalgnad Shit roport) DCS Report No. k 58 ISTRIBUTION STATEMENT >pies may be requested from the address given in (l) above. JPPLEMENTARY NOTES me 12- SPONSORING MILITARY ACTIVITY Rome Air Development Center Griffiss Air Force Base Rome, New York I3H0 This paper describes an algorithm for thecconversion of a grammar 1 the form of a set of context free productions into a deterministic parsing Lgorithm described by a set of modified Floyd productions. The algorithm 3 extended in such a way that it may easily become a part of a complete translator writing system and make use of the information available in the smantic part of such a system. The paper also includes a determination of ie subclass of context free grammars which can be successfully converted and comparison of this algorithm with some other methods of generating parsing lgorithms from context free grammars . ) ^^,AA73 UNCLASSIFIED Security Classification UNCLASSIFIED Security Classification KEY WOMOI ROLE WT Compiler-Compilers UNCLASSIFIED Security Classification A \ r <; # '.;