LIBRARY OF THE 
 
 UNIVERSITY OF ILLINOIS 
 
 AT URBANA-CHAMPAIGN 
 
 510. 84 
 3JM- 
 
 Cop. Z 
 
The person charging this material is re- 
 sponsible for its return to the lihrary from 
 which it was withdrawn on or before the 
 Latest Date stamped below. 
 
 Theft, mutilation, and underlining of books 
 are reasons for disciplinary action and may 
 result in dismissal from the University. 
 
 UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN 
 
 1 3 R 
 
 :C0 
 
 L161 — O-1096 
 
Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/extendedsyntaxdi795leac 
 
'U ■ V J 
 
 /P- 1 1- 
 
 IP 
 
 UIUCDCS-R-T6-T95 
 
 AN EXTENDED SYNTAX-DIRECTED TRANSLATION SCHEME 
 
 Sandra A. Leach 
 Will D. Gillett 
 
 April 1976 
 
 DEPARTMENT OF COMPUTER SCIENCE 
 UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN 
 
 URBANA, ILLINOIS 
 
 I ME LIBRARY OF THE 
 
 MAY 5 1976 
 
 UNIVERSITY OF IluiMOIS 
 *r ••PBANA-GhAMPAIGN 
 
JUL 2 3 1QTR| 
 
 L16 l_O-1096 
 
uiucdcs-R-76-795 
 
 AN EXTENDED SYNTAX-DIRECTED TRANSLATION SCHEME 
 
 Sandra A. Leach.' 
 Will D. Gillett 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 6l801 
 
 April 1976 
 
 This work was supported by the Department of Computer Science, 
 
0. INTRODUCTION 
 
 Syntax-directed translation schema (SDTS) coupled with LR parsing 
 is a commonly used technique for recognizing and translating a large 
 class of LR languages. However, this scheme often leads to a prohib- 
 itively large LR parsing table. This paper presents an extended SDTS 
 (ESDTS) which can aid in reducing the size of the parsing table. 
 
 Part I presents several examples that are intended to familiarize 
 the reader with the notation and conventions used throughout the paper, 
 Part II discusses the motivation behind the ESDTS: how SDTS often re- 
 quires excessive stratification of the grammar in order to implement 
 language semantics. Part III develops the extended scheme itself and 
 suggests a possible implementation. Part IV analyzes implementation 
 aspects in order to compare ESDTS with SDTS. 
 
I. NOTATION 
 
 In order to familiarize the reader with the conventions and 
 notation used throughout this paper, several examples are presented 
 before beginning discussion of the translation scheme. For a more 
 detailed description of the theories of LR parsing, syntax directed 
 translation schema and related algorithms, see [AU72], 
 
 Grammar 1: ASSIGN 
 EXPR 
 
 ELEM 
 
 = id j^= EXPR 
 
 = ELEM 
 
 = EXPR o£ ELEM 
 
 = id. 
 
 = number 
 
 In this grammar the set of terminal symbols is {_id, numbe r , : = , 
 op_}; the set of non-terminals is {ASSIGN, EXPR, ELEM}. The terminal 
 symbol op_ is a token class representing + , -, *, and / in the same way 
 that id and number represent the token classes of identifiers and 
 constants, respectively. The mapping of token to terminal takes place 
 during the lexical analysis phase so that the parser never sees the 
 specific token; however, this information is available for use by 
 the semantic routines. The language generated by Grammar 1 is the 
 set of assignment statements where the operators have simple left-to- 
 right associativity; e.g. A:=B*C+12/D and Y: =Z-7*Y/U+X. The LR(l) 
 table for recognizing this language is given in Figure 1. 
 
The LR(l) parsing algorithm is as follows: 
 
 Algorithm 1 
 
 INPUT: LR(l) parsing table PARSEJTABLE and string to be parsed. 
 OUTPUT: YES, if string £ L(G); otherwise ERROR. 
 
 1. Initialize state stack to TO. 
 
 2. Get the next incoming token x; if the input string is now 
 empty, the token is e. 
 
 3. Let Tn be the state on the top of the state stack. 
 
 a. If PARSE_TABLE(Tn,x)=Sm then shift: push Tm onto the 
 state stack and return to step 2. 
 
 b. If PARSEJTABLE (Tn,x)=Ri then reduce: if production 
 
 rule i is A::=a, then pop |a| states off the state stack, 
 exposing a new state Tm on the top of the stack. If 
 PARSE_TABLE(Tm,A)=Sj then push Tj onto the state stack 
 and return to step 2; otherwise halt with an ERROR. 
 
 c. If PARSE_TABLE(Tn,x)=A then accept: halt and declare 
 the input string valid by outputting YES. 
 
 d. If PARSEJTABLE (Tn,x)=E then error: halt with output of 
 ERROR (or transfer to an error recovery routine). 
 
 A PL/1 implementation of Algorithm 1 is given in Appendix A. 
 
 A translation scheme, in general, is a means of mapping a given 
 input string to an output string. A syntax-directed translation scheme 
 (SDTS) performs this mapping by associating a partial translation func- 
 tion with each production rule in a grammar. Whenever a production rule 
 is used in the derivation of the input string, its respective translation 
 
TERMINALS 
 
 NONTERMINALS 
 
 STATES 
 
 TO 
 Tl 
 T2 
 T3 
 Tk 
 T5 
 t6 
 TT 
 T8 
 T9 
 
 id 
 
 number 
 
 1Z °£ 
 
 ASSIGN EXPR 
 
 ELEM 
 
 SI 
 
 
 
 
 
 S9 
 
 
 
 
 
 S2 
 
 
 
 
 
 
 S5 
 
 Sb 
 
 
 
 
 
 S3 
 
 SU 
 
 
 
 
 ST 
 
 Rl 
 
 
 
 
 
 
 
 R2 
 
 R2 
 
 
 
 
 
 
 
 RU 
 
 Rk 
 
 
 
 
 
 
 
 R5 
 
 R5 
 
 
 
 
 S5 
 
 s6 
 
 
 
 
 
 
 S8 
 
 
 
 
 R3 
 
 R3 
 
 
 
 
 
 
 
 
 A 
 
 
 
 
 Sn neans shift incoming token and goto state n. 
 
 Rn means reduce "by production rule n. 
 
 A means accept. 
 
 Blanks are error entries. 
 
 Grammar 1: 
 
 1 
 
 ASSIGN : 
 
 : = id \^_ EXPR 
 
 
 2 
 
 EXPR : 
 
 := ELEM 
 
 
 3 
 
 
 := EXPR og_ ELEM 
 
 
 h 
 
 ELEM : 
 
 := id 
 
 
 5 
 
 
 := number 
 
 Grammar 1 and its LR(l) table, 
 
 Figure 1. 
 
function is applied to compute a portion of the output string. A 
 trivial example of an SDTS would be outputting the number of a pro- 
 duction rule when it is used in a reduction; the output generated 
 is the right parse (the reverse sequence of productions used in the 
 rightmost derivation) of the input string. 
 
 To illustrate, define a configuration of the parser as a 
 triple of (state stack, remaining input string, output string) 
 corresponding to step 3 of Algorithm 1. Then given Algorithm 1 
 with input of the table in Figure 1 and string Z:=X/Y*5 and the 
 translation described above, the parser would take on the following 
 sequence of configurations: 
 
 (TO, Z:=X/Y*5, e) ■+ (T0T1, :=X/Y*5, e) 
 
 -* (T0T1T2, X/Y*5, e) 
 
 -> (T0T1T2T5, /Y*5, h) 
 
 + (T0T1T2TU, /Y*5, h) 
 
 ■*■ (T0T1T2T3, /Y*5, h2) 
 
 + (T0T1T2T3TT, Y*5, k2) 
 
 -*■ (T0T1T2T3TTT5, *5, h2) 
 
 -»■ (T0T1T2T3T7T8, *5, U2U ) 
 
 -> (T0T1T2T3, *5, U2U3) 
 
 -»- (T0T1T2T3T?, 5, ^2U3) 
 
 ■> (T0T1T2T3TTT6, e, U2U3) 
 
 -»■ (T0T1T2T3TTT8, e, U2U35) 
 
 ■»■ (T0T1T2T3, e, U2U353) 
 
 ■*■ (T0T9, e, 1+2U3531) 
 
 -*■ accept and halt 
 
 The translation defined above is not particularly useful in a 
 compiler; more often the translation generated is some type of 
 assembly code or intermediate text which indicates in detail the 
 
"meaning" of the input string. In future examples the code that 
 will be generated by the translation will he assembly code for a 
 (fictitious) stack machine. The instructions and their meanings are: 
 
 PUSH number 
 PUSH #id 
 FETCH 
 
 SADD 
 SSUB 
 SMUL 
 SDIV 
 
 POP 
 
 STORE 
 
 push value on run-time stack 
 
 push address of id_ on stack 
 
 replace the address on the top of the 
 stack with "che value at that location 
 
 stack operations: if the values of X 
 and Y are on the top of the stack, 
 executing one of these commands pops 
 both values off the stack, and 
 pushes the value of X op Y 
 
 pop a value and an address off the 
 top of the stack and store the value 
 at that address 
 
 same as POP except that the value is 
 returned to the top of the stack 
 
 For example, the translation of Z:=X/Y*5 would be: 
 
 PUSH ffZ 
 PUSH #X 
 FETCH 
 PUSH #Y 
 FETCH 
 SDIV 
 PUSH 5 
 SMUL 
 POP 
 
 The term "semantics" is generally used for associating a translation 
 
 or "meaning" with an input string. Throughout this paper the term 
 
 "semantic actions" will encompass not only the explicit outputting 
 
 of a translation, but will also include: 
 
 - any operations necessary to compute the translation 
 
 - bookkeeping tasks for the compiler 
 
 - error checking/correction 
 
II . MOTIVATION 
 
 Although LR parsing coupled with SDTS is a fairly elegant 
 technique for recognizing and translating a large class of LR 
 languages, a serious implementation problem remains: the LR 
 table for even a moderately complex language can become prohibi- 
 tively large. One factor contributing to large table size is that 
 an SDTS often leads to excessive stratification of the grammar in 
 order to accommodate semantic analysis. There are two types of 
 stratifications generated: those resulting from the linking of 
 semantic actions with production rules, and those resulting from 
 reliance on strictly local context to perform semantics. 
 
 If a particular semantic action is required at a given point in 
 the parsing of a string, it must be guaranteed that a reduction take 
 place at that point since a semantic action in an .SDTS may only be 
 performed when a reduction takes place. For instance, in Grammar 1 
 if it were necessary to perform some semantic action when the id_ of 
 ASSIGN ::= id_ j_j^ EXPR was recognized, a production would have to be 
 forced there using productions such as 
 
 ASSIGN ::= VAR j=_ EXPR 
 VAR : : = id 
 
 The semantic action could then be accomplished when the reduction 
 VAR ::= id. took place. Similarly, requiring an action when op_ is 
 
encountered might be realized by: 
 
 EXPR : : = EXPR OP ELEM 
 OP : := op 
 
 Grammar 1 extended in this manner, its LR(l) table, and sample 
 semantic actions are shown in Figure 2. 
 
 The productions added in this way are "useless" in that they 
 contribute no information to the parse, but are present only to 
 provide a means for semantic analysis. They do, unfortunately, 
 add to the size of the LR table: each supplementary rule requires 
 a new nonterminal (another column in the table) and adds an extra 
 state (another row in the table). The number of entries in the LR 
 table for Grammar IE represents an increase of 50% over the table 
 for Grammar 1. 
 
 The second type of stratification results from the dependency 
 of SDTS on strictly local context; that is, in performing a semantic 
 action in an SDTS the only context information available is which 
 production rule is currently being used in the reduction. In many 
 cases, however, the semantic action does not depend totally on the 
 local configuration, but also on the global context in which this 
 configuration is embedded. 
 
 For example, a common situation is the following: 
 
 Grammar 2: DECLARE 
 
 INPUT 
 
 IDLIST 
 
 = typet IDLIST 
 = INPUT IDLIST 
 = id 
 
 = IDLIST j_ id 
 
 t the terminal type is again a token class representing {INTEGER, 
 REAL , BOOLEAN , . . . } . 
 
 
id number 
 
 1Z. 2£L 
 
 ASSIGN EXPR ELEM 
 
 VAR 01' 
 
 TO 
 
 S10 
 
 
 
 
 
 S9 
 
 
 
 SI 
 
 
 Tl 
 
 
 
 S2 
 
 
 
 
 
 
 
 
 T2 
 
 S5 
 
 S6 
 
 
 
 
 
 S3 
 
 Sk 
 
 
 
 T3 
 
 
 
 
 Sll 
 
 Rl 
 
 
 
 
 
 ST 
 
 TU 
 
 
 
 
 R2 
 
 R2 
 
 
 
 
 
 
 T5 
 
 
 
 
 rU 
 
 RU 
 
 
 
 
 
 
 T6 
 
 
 
 
 R5 
 
 R5 
 
 
 
 
 
 
 TT 
 
 S5 
 
 s6 
 
 
 
 
 
 
 s8 
 
 
 
 T8 
 
 
 
 
 R3 
 
 R3 
 
 
 
 
 
 
 T9 
 
 
 
 
 
 A 
 
 
 
 
 
 
 T10 
 
 
 
 R6 
 
 
 
 
 
 
 
 
 Til 
 
 RT 
 
 RT 
 
 
 
 
 
 
 
 
 
 Grammar IE 
 
 
 
 1 ASSIGN : 
 
 : = 
 
 VAR ^ EXPR : 
 
 2 EXPR : 
 
 : = 
 
 ELEM 
 
 3 : 
 
 : = 
 
 EXPR OP ELEM : 
 
 k ELEM : 
 
 : = 
 
 id : 
 
 5 : 
 
 : = 
 
 number : 
 
 6 VAR : 
 
 : = 
 
 id : 
 
 7 OP : 
 
 : = 
 
 °E : 
 
 Semantic Actions: 
 Generate POP 
 
 Pop operator stack and generate SADD, 
 
 SSUB, SMUL, SDIV appropriately 
 
 Generate PUSH #id, FETCH 
 
 Generate PUSH number 
 
 Generate PUSH #id 
 
 Push op code onto operator stack 
 
 Grammar IE, its associated semantics and LR table. 
 
 Figure 2. 
 
10 
 
 This language contains strings such as INTEGER A,B,C or INPUT X,Y. 
 Clearly different semantics should he associated with the identifiers 
 appearing in a declaration list than with the identifiers in an 
 input list. In the former case the type of id_ might he entered in 
 the symbol table; in the latter case a READ id. might he generated, 
 and the id_ might be checked to determine if it had been declared. 
 Performing the appropriate semantic action does not depend on the 
 fact that the id_ is contained in an IDLIST hut rather depends on 
 whether the IDLIST is associated with a declaration or input statement. 
 In order to correctly perform the semantic actions using SDTS, the 
 grammar must be stratified in such a way that each production rule 
 indicates its global context. One way to accomplish this is: 
 
 Grammar 2E: DECLARE 
 
 INPUT 
 
 DECLIST 
 
 INLIST 
 
 = type DECLIST 
 = INPUT INLIST 
 
 = id 
 
 = DECLIST 
 
 = id 
 
 id 
 
 = INLIST x id 
 
 This type of stratification results in redundant production rules, 
 and is even more costly in terms of table entries than the first type, 
 Not only is a new nonterminal added, but more than one extra state is 
 required (in this example, 3 extra states were needed). 
 
 Another example of this situation is the problem of embedded 
 assignment statements. Grammar IE can be modified to include this 
 feature: 
 
11 
 
 Grammar 3 : 
 
 ASSIGN : 
 
 • = 
 
 VAR jj= EXPR 
 
 
 EXPR : 
 
 : = 
 
 ELEM 
 
 
 
 : = 
 
 EXPR OP ELEM 
 
 
 ELEM : 
 
 : = 
 
 id 
 
 
 
 : = 
 
 number 
 
 
 
 : = 
 
 i ASSIGN j_ 
 
 
 VAR : 
 
 : = 
 
 id 
 
 
 OP : 
 
 : = 
 
 2R 
 
 
 »+-' 
 
 
 $ 
 
 ^' 
 
 This language includes strings such as A:=B+(C:=D/2)-E and 
 W:=(Y:=Z*(X:=X+l))/3 . 
 
 When the reduction ASSIGN ::= VAR jf« EXPR takes place, a POP 
 is generated, which will pop both the expression value and the address 
 of the variable off the run-time stack. However, if this is an embed- 
 ded assignment it is necessary to retain the expression value on the 
 top of the stack since it will be required for further computation; 
 this is the action produced by a STORE instruction. Therefore, a 
 STORE should be generated when reducing an embedded assignment, and 
 a POP generated only for non-embedded assignments. A sample extension 
 of Grammar 3 to accommodate these semantics is: 
 
 Grammar 3E : 
 
 ASSIGN : 
 
 ; = 
 
 VAR ± = EXPR 
 
 
 EXPR : 
 
 : = 
 
 ELEM 
 
 
 
 : = 
 
 EXPR OP ELEM 
 
 
 ELEM : 
 
 : = 
 
 id 
 
 
 
 : = 
 
 number 
 
 
 
 : = 
 
 {_ EASSIGN ]_ 
 
 
 VAR : 
 
 : = 
 
 id 
 
 
 OP : 
 
 : = 
 
 22. 
 
 
 EASSIGN : 
 
 ; = 
 
 VAR := EXPR 
 
12 
 
 Figures 3 and h show the LR tahles for grammars 3 and 3E respectively, 
 The number of table entries in the table for Grammar 3E represents a 
 30% increase over the number of entries in the table for Grammar 3. 
 
13 
 
 id numbr 
 
 °£ 
 
 ASSGN EXFR ELEM VAR 
 
 OP 
 
 TO 
 
 Tl 
 T2 
 T3 
 TU 
 
 T5 
 
 T6 
 
 TT 
 
 T8 
 
 T9 
 
 TIO 
 
 Til 
 
 T12 
 
 T13 
 
 TlU 
 
 S2 
 
 
 
 
 
 
 
 Silt 
 
 
 
 SI 
 
 
 
 
 S3 
 
 
 
 
 
 
 
 
 
 
 
 
 RT 
 
 
 
 
 
 
 
 
 
 
 s6 
 
 ST 
 
 
 S8 
 
 
 
 
 
 sk 
 
 S5 
 
 
 
 
 
 
 
 Rl 
 
 SIO 
 
 Rl 
 
 
 
 
 
 S9 
 
 
 
 
 
 R2 
 
 R2 
 
 R2 
 
 
 
 
 
 
 
 
 
 
 Rh 
 
 nk 
 
 Rl+ 
 
 
 
 
 
 
 
 
 
 
 R5 
 
 R5 
 
 R5 
 
 
 
 
 
 
 S2 
 
 
 
 
 
 
 
 Sll 
 
 
 
 SI 
 
 
 s6 
 
 S7 
 
 
 S8 
 
 
 
 
 
 
 S12 
 
 
 
 R8 
 
 R8 
 
 
 R8 
 
 
 
 
 
 
 
 
 
 
 
 
 
 S13 
 
 
 
 
 
 
 
 
 
 
 
 
 R3 
 
 R3 
 
 R3 
 
 
 
 
 
 
 
 
 
 
 R6 
 
 R6 
 
 R6 
 
 
 
 
 
 
 
 
 
 
 
 
 A 
 
 
 
 
 
 
 Grammar 3: 
 
 1 
 
 ASSIGN : 
 
 : = 
 
 VAR y^ EXPR 
 
 
 2 
 
 EXPR : 
 
 : = 
 
 ELEM 
 
 
 3 
 
 
 : = 
 
 EXPR OP ELEM 
 
 
 h 
 
 ELEM : 
 
 : = 
 
 id 
 
 
 5 
 
 
 : = 
 
 number 
 
 
 6 
 
 
 : = 
 
 {_ ASSIGN ^ 
 
 
 7 
 
 VAR : 
 
 : = 
 
 id 
 
 
 8 
 
 OP : 
 
 : = 
 
 2£ 
 
 Grammar 3 and its LR table, 
 Figure 3. 
 
id numbr 
 
 22. 
 
 ASSGN EXPR ELEM VAR 
 
 11+ 
 
 OP EASGN 
 
 S2 
 
 
 
 
 
 
 
 siU 
 
 
 
 SI 
 
 
 
 
 
 S3 
 
 
 
 
 
 
 
 
 
 
 
 
 
 RT 
 
 
 
 
 
 
 
 
 
 
 
 s6 
 
 ST 
 
 
 S8 
 
 
 
 
 
 sU 
 
 S5 
 
 
 
 
 
 
 
 
 
 SIO 
 
 Rl 
 
 
 
 
 
 S9 
 
 
 
 
 
 
 R2 
 
 R2 
 
 R2 
 
 
 
 
 
 
 
 
 
 
 
 Rk 
 
 Rk 
 
 rU 
 
 
 
 
 
 
 
 
 
 
 
 R5 
 
 R5 
 
 R5 
 
 
 
 
 
 
 
 S2 
 
 
 
 
 
 
 
 
 
 
 SI 5 
 
 
 Sll 
 
 s6 
 
 S7 
 
 
 S8 
 
 
 
 
 
 
 S12 
 
 
 
 
 R8 
 
 R8 
 
 
 r8 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 S13 
 
 
 
 
 
 
 
 
 
 
 
 
 
 R3 
 
 R3 
 
 R3 
 
 
 
 
 
 
 
 
 
 
 
 R6 
 
 R6 
 
 R6 
 
 
 
 
 
 
 
 
 
 
 
 
 
 A 
 
 
 
 
 
 
 
 
 
 Sl6 
 
 
 
 
 
 
 
 
 
 
 
 s6 
 
 ST 
 
 
 s8 
 
 
 
 
 
 SIT 
 
 S5 
 
 
 
 
 
 
 
 
 R9 
 
 SIO 
 
 
 
 
 
 
 S9 
 
 
 Grammar 3E: 
 
 1 ASSIGN 
 
 2 EXPR 
 3 
 
 ELEM 
 
 VAR 
 OP 
 
 9 EASSIGN 
 
 = VAR _j = EXPR 
 
 = ELEM 
 
 = EXPR OP ELEM 
 
 = id 
 
 = number 
 
 = {_ EASSIGN )_ 
 
 = id_ 
 
 = ££ 
 
 = VAR := EXPR 
 
 Semantics : 
 Generate POP 
 
 Pop operator stack and generate SADD, 
 SSUB, SMIIL, or SDIV appropriately 
 Generate PUSH #id, FETCH 
 Generate PUSH number 
 
 Generate PUSH #id 
 
 Push op code onto operator stack 
 
 Generate STORE 
 
 Grammar 3E, its associated semantics and LR table. 
 
 Figure h. 
 
15 
 
 III. THE TRANSLATION SCHEME 
 
 As shown in the previous section, the use of SDTS to incorporate 
 semantic analysis into an LR parse is often inconvenient and costly. 
 The question now is how can semantic actions he performed as required 
 without stratifying the grammar. 
 
 Eliminating T ype 1 St rat if i cat ions 
 
 One way to extend the SDTS in order to eliminate the first type 
 of stratification is to relax the constraint that semantic actions he 
 associated only with reductions. In fact, why not allow semantic actions 
 to he performed at any point of the parse — not only on reduction, hut 
 also on shift, accept and error?t In order to see how this might he 
 done, it is helpful to clarify what is actually occurring in an SDTS. 
 
 To carry out the translation, the normal SDTS uses internal infor- 
 mation from the parse, the n in an Rn entry of the LR table. This 
 number actually communicates two pieces of information: what production 
 rule is to be applied (for the parse) and, since the semantic actions 
 have the same numbers as the production rules, what semantic action is 
 to be performed (for the translation). This duality is lacking in the 
 other types of table entries (S, E, and A), so packing two pieces of 
 information into each R entry causes stratification by forcing a 
 reduction whenever a semantic action is required. 
 
 t Professor T. R. Wilcox was instrumental in the choice of this 
 attack on the problem. 
 
16 
 
 With this clarification, a solution is apparent. First, split 
 this single item that serves dual purposes into two separate items, 
 and second, since reduce entries now explicitly contain semantic 
 information, modify the LR table so that all entries carry semantic 
 information. Therefore, instead of each table entry containing only 
 two pieces of information (the type of parsing action and its associ- 
 ated number), each entry now contains a third piece of information, 
 the number of the semantic action to be performed: 
 
 type of 
 
 parsing 
 
 action 
 
 parsing 
 action 
 # 
 
 semantic 
 
 action 
 
 ft 
 
 Shift 
 Reduce 
 Error 
 Accept 
 
 (goto) state ft 
 prod, rule ft 
 error ft 
 
 By thus disassociating the semantic actions from the production 
 rules (and numbering them independently) and embedding the appropriate 
 semantic action number directly in the LR table entry that corresponds 
 to the point in the parse where that action is to take place, semantic 
 actions can be performed at any point of the parse. Using this config- 
 uration, an entry notation R3/5 means reduce by rule 3 and then perform 
 semantic action 5; S10/8 means shift, perform semantic action 8, then 
 goto state 10. Figure 5 shows how the semantics of grammar IE can be 
 directly incorporated into the LR table of Grammar 1; the stratifica- 
 tions are no longer necessary to correctly include semantic analysis. 
 
 Eliminating Redundant Semantic Code 
 
 " ■ ^ ' i w « — ^ * ■ m m > — ^f^. m <m m ^ ■ - ■■»■■■■■■ — n *m m n 
 
 It is worth noting that in Figure 5 semantic action 3 is actually 
 composed of two more basic actions, generate PUSH ft id and generate 
 
IT 
 
 id number 
 
 °E. 
 
 e ASSIGN EXPR ELEM 
 
 TO 
 Tl 
 
 T2 
 T3 
 Th 
 T5 
 T6 
 TT 
 T8 
 T9 
 
 Sl/5 
 
 
 
 
 
 S9 
 
 
 
 
 
 S2 
 
 
 
 
 
 
 S5 
 
 s6 
 
 
 
 
 
 S3 
 
 Si+ 
 
 
 
 
 ST/6 
 
 Rl/1 
 
 
 
 
 
 
 
 R2 
 
 R2 
 
 
 
 
 
 
 
 RU/3 
 
 RU/3 
 
 
 
 
 
 
 
 R5A 
 
 R5A 
 
 
 
 
 S5 
 
 s6 
 
 
 
 
 
 
 s8 
 
 
 
 
 R3/2 
 
 R3/2 
 
 
 
 
 
 
 
 
 A 
 
 
 
 
 Grammar 1: 
 
 1 ASSIGN 
 
 2 EXPR 
 
 ELEM 
 
 = VAR _^_ EXPR 
 
 = ELEM 
 
 = EXPR ap_ ELEM 
 
 = id 
 
 = number 
 
 Semantic Actions: 
 
 1 Generate POP 
 
 2 Pop operator stack and generate 
 SADD, SSUB, SMUL, SDIV appropriately 
 
 3 Generate PUSH #id, FETCH 
 h Generate PUSH number 
 
 5 Generate PUSH #id 
 
 6 Push op code onto operator stack 
 
 Grammar 1 with the semantics of Grammar IE 
 embedded directly in the modified LR table. 
 
 Figure 5. 
 
18 
 
 FETCH; furthermore, the basic action generate PUSH #i d is repeated 
 in semantic action 5. A question that naturally arises is how to 
 indicate that several basic actions are to be performed togethei 
 without wasteful duplication of code. 
 
 One way to attack this problem is to embed in the LR table 
 entry not the actual semantic action number but a pointer to a 
 linked list of basic semantic action numbers. The result is a 
 configuration such as the one shown in Figure 6. 
 
 For instance, in Figure 5 if the semantic items in the LR table 
 are viewed as pointers into a semantic table, and again the semantic 
 actions are renumbered, the resulting semantic table would be as 
 given in Figure 7. If the parser is in state TO looking at incoming 
 token id, id_ is shifted, basic semantic action 2 is performed, and 
 the parser goes to state Tl. If the parser is in state T5 looking 
 at incoming token op_, reduction by rule h is done, and basic semantic 
 actions 2 and 3 are both performed. In this way the code for an action 
 that should be done at many different points in the parse need not be 
 duplicated o 
 
 Eliminating Type 2_ Stratifications 
 
 The second type of stratification is a result of semantic actions 
 at a reduction depending on more contextual information than is supplied 
 by the production rule number alone. To choose the appropriate semantic 
 action in such a case, it is necessary to also know part of the previous 
 sequence of parser actions that lead to the current configuration. An 
 important element in the solution to this problem is the fact that, 
 given a state of an LR table, it is known locally how much of which 
 
19 
 
 Parsing Table Entry 
 
 Semantic Action Table 
 
 type of 
 
 parsing 
 
 semantic 
 
 parsing 
 
 action 
 
 action 
 
 action 
 
 number 
 
 pointer 
 
 basic action 1 
 
 •— 
 
 basic action 2 
 
 •— 
 
 • 
 • 
 • 
 
 
 basic action n 
 
 A 
 
 
 The configuration of an LR table entry and its 
 associated linked list of basic semantic actions. 
 
 Figure 6. 
 
 PARSE_TABLE ( T 5 , op_) 
 
 3* 
 
 PARSE TABLE(TO,id) 
 
 5* 
 
 BASIC 
 ACTION 
 
 # LINK 
 
 1 
 2 
 
 «► 3 
 h 
 
 «► 5 
 6 
 7 
 
 1 
 
 A 
 
 6 
 
 A 
 
 2 
 
 T 
 
 h 
 
 A 
 
 2 
 
 A 
 
 5 
 
 A 
 
 3 
 
 A 
 
 BASIC ACTIONS: 
 
 1. generate POP 
 
 2. generate PUSH #id 
 
 3. generate FETCH 
 
 k. generate PUSH number 
 
 5. push op code onto the 
 operator stack 
 
 6. pop operator stack and 
 generate SADD, SSUB, 
 SMUL, SDIV appropriately 
 
 SEMANTIC 
 TABLE 
 
 The basic semantic actions and the semantic table configuration 
 when the semantic items of Figure 5 are viewed as pointers. 
 
 Figure 7. 
 
20 
 
 (TO, A:=B+(C:=D/2)-E ) ■+ 
 (T0T1, :=B+(C:=D/2)-E ) ■* 
 (T0T1T3, B+(C:=D/2)-E ) ■* 
 (TOT1T3T6, +(C:=D/2)-E ) ■* 
 (TOT1T3T5, +(C:=D/2)-E ) -*■ 
 (T0T1T3TU, +(C:=D/2)-E ) -> 
 (T0T1T3TUT10, (C:=D/2)-E ) -> 
 (T0T1T3TUT9, (C:=D/2)-E ) + 
 (T0T1T3TUT9T8, C:=D/2)-E ) ■+ 
 (T0T1T3TUT9T8T2, :=D/2)-E ) ■> 
 (T0T1T3TUT9T8T1, :=D/2)-E ) ■> 
 (T0T1T3THT9T8T1T3, D/2)-E ) ■> 
 (T0T1T3TUT9T8T1T3T5, /2)-E ) •> 
 (T0T1T3TUT9T8T1T3TH, /2)-E ) ■* 
 (T0T1T3TUT9T8T1T3TUT10, 2)-E ) - 
 (T0T1T3TUT9T8T1T3TUT9, 2)-E ) ->• 
 (T0T1T3TUT9T8T1T3TUT9TT s )-e ) - 
 (T0T1T3TUT9T8T1T3TUT9T12, )-e ) 
 (T0T1T3TUT9T8T1T3TU, )-E ) 
 
 The sequence of parser configurations vith LR table of Figure 3 
 and string A:=B+(C:=D/2)-E as input. 
 
 Figure 8. 
 
21 
 
 production rule(s) has already been scanned and exactly what possible 
 items are anticipated to complete the rule(s). For example, returning 
 to the table in Figure k, when the parser is in state T3, VAR j^= of 
 production rule 1 has already been recognized, and the parser is 
 anticipating an expression, which may begin with any of several 
 symbols. When in state T8, the parser has scanned the open parenthesis 
 of an embedded assignment and is expecting the assignment itself. The 
 algorithm for constructing an LR table for a grammar conveys the 
 essence of this concept; for more details see [AU72], 
 
 Since each state is essentially a "snap shot" of the local 
 environment at a certain point of the parse, knowing the sequence of 
 states that have been traversed to reach the present point is equivalent 
 to having a history of the scanned symbols that are pertinent to the 
 current environment (i.e. not yet resolved in a reduction). This 
 history contains the information necessary to resolve the conflicting 
 semantics which cause type 2 stratifications, and is exactly the 
 information on the state stack! 
 
 For instance, using Algorithm 1 and the LR table of Figure 3 to 
 parse the input string A:=B+(C :=D/2)-E, the parser takes on the 
 sequence of configurations given in Figure 8. The next parsing action 
 is to reduce by rule 1; but at first glance there appears to be in- 
 sufficient information to decide whether to generate a POP or a STORE. 
 However, utilizing the state stack data, it is clear that the only 
 sentential form (if used as an "input" sequence) that could have 
 produced this state sequence is: 
 
 TO ■*■ Tl ■* T3 -*■ TU -> T9 -> T8 -* Tl -> T3 -»■ TU 
 VAR := EXPR OP ( VAR := EXPR 
 
22 
 
 (This can easily be seen from the LR table: the only way to go from 
 TO to Tl is by shifting a VAR; the only way to go from Tl to T3 is 
 by shifting a := ; and so on.) This confirms that indeed the re- 
 duction will be to an embedded assignment; thus STORE is the proper 
 translation. In particular, merely knowing that state T8 is an element 
 of the state sequence is sufficient to verify that the assignment is 
 embedded (since T8 corresponds to having scanned the open parenthesis 
 of an embedded assignment), and to generate the correct translation 
 it is necessary only to see if T8 is contained in the state stack. 
 Resolving a choice of semantic actions, then, depends on some 
 condition of the state stack. Incorporating this concept into the 
 semantic table is a simple matter; some encoding of the desired 
 condition can be easily added to each element of the table: 
 
 encoding 
 of condition 
 
 basic 
 action # 
 
 link 
 
 When an element of the list is being considered, the action is 
 performed only if the condition is found to hold true, then the 
 link is followed to the next action as before. 
 
 Theoretically the state conditions might become extremely complex, 
 but in practice only a few basic conditions need be considered useful. 
 Some typical state stack conditions might be: 
 
 - Ti appears (or doesn't appear) 
 
 - Ti and/or Tj appear (or not) 
 
 - Ti and Tj appear in a given order 
 
 Following are sample encodings of each of these conditions. In general, 
 
23 
 
 the encoding scheme that should he used is the one which supplies 
 enough power (but no more! ) to correctly handle the semantics of the 
 language . 
 
 Encoding 1: single state appearance, 
 
 ± i 
 
 semantic 
 action # 
 
 link 
 
 + i 
 
 - i 
 
 
 
 Semantic action to be done: 
 
 only if Ti appears on the state stack 
 only if Ti does not appear on the stack 
 unconditionally 
 
 This encoding will be used in the solution to the embedded 
 assignment problem presented later (see Figure 9). 
 
 Encoding 2_ : 
 
 multiple state appearance with one logical operator 
 specified implicitly. 
 
 ± i 
 
 ± J 
 
 semantic 
 action § 
 
 link 
 
 + i + J 
 - i + J 
 
 Semantic action to be done: 
 
 Ti and (or) TJ appear on the state stack 
 Ti doesn't appear and (or) Tj does appear 
 and so on. 
 
 This encoding also encompasses Encoding 1; ±i from above is the 
 
 same as : 
 
 ±i and +0 since TO is always on the bottom of the stack 
 
 ±i or +n where n is greater than the largest possible 
 state number, since that number could never 
 appear on the stack. 
 
2h 
 
 If the AND operator is used in Encoding 2, an order/no order 
 bit can be added to indicate whether or not the states need be 
 found in the given order. 
 
 Encoding _3: multiple state appearance and two logical operators. 
 
 l=and 
 0=or 
 
 ± i 
 
 ± J 
 
 semantic 
 action # 
 
 link 
 
 Except for the and/or bit, this is the same as Encoding 2. 
 
 Integrating the mechanism of linked lists of conditional 
 semantic actions into the LR table provides a powerful tool for 
 performing complex semantics without stratifying the grammar. As 
 a final example, Figure 9 shows the LR table and semantic table 
 for unstratified Grammar U; this LR table represents a kk% savings 
 in the number of table entries over the table for the stratified 
 grammar for the same language in Figure h (or an 80% increase in 
 the other direction). Notice that encoding 1 for state-stack 
 conditionals is used since it is powerful enough to correctly 
 perform the given semantics. Appendix B shows a sample PL/l 
 implementation of an LR parser using the ESDTS. 
 
25 
 
 id numbr 
 
 0£ 
 
 ASSGN EXPR ELEM 
 
 TO 
 
 Tl 
 T2 
 T3 
 Th 
 
 T5 
 
 T6 
 
 TT 
 
 T8 
 
 T9 
 
 TIO 
 
 Til 
 
 T12 
 
 Sl/1 
 
 
 
 
 
 
 
 S12 
 
 
 
 
 
 S2 
 
 
 
 
 
 
 
 
 S5 
 
 s6 
 
 
 ST 
 
 
 
 
 
 S3 
 
 Sk 
 
 
 
 
 
 Rl/2 
 
 S8/U 
 
 Rl/2 
 
 
 
 
 
 
 
 
 R2 
 
 R2 
 
 R2 
 
 
 
 
 
 
 
 
 RV5 
 
 RU/5 
 
 RU/5 
 
 
 
 
 
 
 
 
 R5/7 
 
 R5/7 
 
 R5/T 
 
 
 
 
 Sl/1 
 
 
 
 
 
 
 
 S9 
 
 
 
 S5 
 
 s6 
 
 
 Sll 
 
 
 
 
 
 
 SIO 
 
 
 
 
 
 Sll 
 
 
 
 
 
 
 
 
 
 
 R3/8 
 
 R3/8 
 
 R3/8 
 
 
 
 
 
 
 
 
 R6 
 
 R6 
 
 R6 
 
 
 
 
 
 
 
 
 
 
 A 
 
 
 
 
 Grammar k 
 
 1 ASSIGN 
 
 2 EXPR 
 3 
 
 1* 
 5 
 6 
 
 ELEM 
 
 S 
 
 o "^ 
 
 (Y cff „<z> o y 
 
 = id i^ EXPR 
 
 = ELEM 
 
 = EXPR op_ ELM 
 
 = id 
 
 = number 
 
 = ( ASSIGN ) 
 
 Basic Semantic Actions: 
 
 1 Generate POP 
 
 2 Generate STORE 
 
 3 Generate PUSH #id 
 h Generate FETCH ~ 
 
 5 Generate PUSH number 
 
 6 Push op code onto operator stack 
 
 7 Pop operator stack and generate 
 SADD, SSUB, SMUL, SDIV appropriately 
 
 
 
 3 
 
 A 
 
 +7 
 
 2 
 
 3 
 
 -7 
 
 1 
 
 A 
 
 
 
 6 
 
 A 
 
 
 
 3 
 
 6 
 
 
 
 h 
 
 A 
 
 
 
 5 
 
 A 
 
 
 
 7 
 
 A 
 
 SEMANTIC TABLE 
 
 Complete example of ESDTS. 
 Figure 9. 
 
26 
 
 IV. IMPLEMENTATION CONSIDERATIONS 
 
 Two of the major factors to be considered in evaluating the 
 ESDTS as a useful translation tool are its use of resources and its 
 compatibility with parsing and optimization techniques. 
 
 Resources 
 
 The savings in resources obtained by using the ESDTS is 
 dependent on the complexity of both the grammar and the semantics 
 to be performed. In deciding whether this translation scheme would 
 be more advantageous to use than normal SDTS for a particular grammar, 
 several aspects of implementation must be considered: data and 
 program memory requirements, and execution time. 
 
 In the comparison of the resources required in the ESDTS and 
 the SDTS, there are several assumptions made. First, the analysis 
 will not include programming time or the logical complexity of the 
 different schemes. It is assumed that pointers are not implemented 
 as the general system concept of memory addresses (e.g. 2k bits on 
 the IBM 360/75) but are essentially integer pointers into an array, 
 as in the examples of the previous section. Also it is assumed that 
 there exists a "case selection" construct in the implementation 
 language, such as a CASE statement in PASCAL or a branch table (a 
 GOTO via an array of labels) in PL/l. 
 
27 
 
 Program Spac e 
 
 A driver is required for traversing the linked lists of basic 
 semantic actions and for decoding and checking the action conditions. 
 The size of the driver is dependent on the complexity of the state- 
 stack conditional encoding scheme used. However, the overhead needed 
 for this driver is balanced by the saving in program space in other 
 areas, depending on the type of semantic driver used in the old scheme. 
 For example, if the code for the basic semantics was repeated where- 
 ever it was needed (as in the implementation in Appendix A), then the 
 space saved by removing duplicated code compensates for the addition 
 of the driver routine. If a method for performing semantics had been 
 used where the basic action code wasn't repeated, as in setting 
 switches to control the selection of semantic actions or using a 
 series of calls to basic action procedures, then converting to this 
 new scheme may still represent an overall savings in program space 
 since the code that previously directed the performance of semantic 
 actions would be eliminated. 
 
 The code necessary to decode the parse table . entries is slightly 
 more complex in the ESDTS, but the basic mechanism is already present 
 in the old scheme so the incremental cost is small. 
 Data S pace 
 
 In general, to obtain a useful comparison between the data space 
 requirements for the ESDTS and the normal SDTS, an estimate of the 
 number of bits used in each scheme is needed. In particular, the 
 following discussion will be based on the IBM 360 and assumes that 
 the data space is utilized in "chunks" such as bytes, half-words or 
 words (32 bits). 
 
28 
 
 The semantic table containing the linked lists of semantic 
 actions is an added data structure in the ESDTS. To get an idea 
 of how much space this requires, consider a reasonably large 
 example where there are: 
 
 - 250 states (8 bits + sign bit) 
 
 - 125 basic semantic actions (7 bits) 
 
 - 250 list entries (8 bits) 
 
 If condition encoding scheme 1 is used, each list entry can 
 easily be packed into one word of memory, which is a small over- 
 head compared to the amount of memory required for the LR table. 
 On the other hand, the semantic driver of a normal SDTS could be 
 implemented using linked lists of basic actions, in which case 
 changing to the ESDTS means extra memory overhead only for the 
 state-stack conditional encoding. 
 
 The LR table is the largest data structure and is where the 
 basic motivation for saving memory space originated, so it is the 
 key factor in the analysis. A comparison is required between the 
 number of bits used by the parse table when the grammar has been 
 stratified but semantic pointers are not present and the number of 
 bits used when the grammar is not stratified and semantic pointers 
 are present. 
 
 Since a semantic pointer has essentially been embedded into 
 each entry of the parse table, it is desirable to minimize the 
 number of bits required for the encoding of this pointer. One way 
 to do this is to arrange the linked lists so that the headers of 
 lists (the pointers embedded in the parse table) always have low 
 
29 
 
 Semantic Table 
 
 Semantic 
 Pointers 
 from LR 
 table 
 
 
 
 
 O t? .o 
 
 <» ,<y o 
 
 
 ■■'-'■■'- ■ I ■ ■■■■ | ' ' | 
 
 A 
 
 — — — __■ _____ I 
 
 ■ ' ■ ■ ■— ■ '■ ■ » ■■ i ii i — -) 
 
 ■ .11--— ■ __-__w_____l 
 
 A 
 
 Arrangement to minimize LR table semantic pointers. 
 
 Figure 10. 
 
30 
 
 indices in the semantic table, and the remaining elements in the 
 lists have high indices. (See Figure 10.) For example, assuming 
 there are 127 basic semantic actions and 31 lists, by arranging the 
 linked lists in this manner the semantic pointer in the parse table 
 need only be 5 bits instead of 8 bits. 
 
 EXAMPLE 1 
 
 Assumptions: a) data space allocated in half-words (l6 bits) 
 
 b) 30% reduction in the number of parse table 
 entries due to non-stratification 
 
 c) parsing action (2 bits) 
 
 250 states (and <250 production rules) (8 bits) 
 30 semantic lists (5 bits) 
 
 Analysis : 
 
 Since 10 of the l6 bits per entry would have already been used 
 
 in the normal SDTS, adding 5 more bits for the semantic pointers does 
 
 not increase memory requirements. Therefore, an actual 30% reduction 
 
 in storage space for the parse table is obtained, and even considering 
 
 the space required for the semantic table this would be a strong 
 
 candidate for the ESDTS implementation. 
 
 EXAMPLE 2 
 
 Assumptions: a) data space allocated in bytes (8 bits) 
 
 b) 30% reduction in number of parse table 
 entries due to non-stratification 
 
 c) parsing action (2 bits) 
 
 510 states (and <510 production rules) (9 bits) 
 125 semantic lists (7 bits) 
 
 Analysis : 
 
 A parse table entry of the old scheme (ll bits) fits into 2 
 
 bytes, whereas an entry in the ESDTS table requires 3 bytes. Thus 
 
 105% of the previous storage requirement is needed. If this scheme 
 
 doesn't produce extra power or savings somewhere else, then it is 
 
31 
 
 probably not a good candidate for this translation scheme. However, 
 assuming b) is now a U0% savings, then the ESDTS requires 90% of the 
 old scheme's space and becomes a good choice. 
 
 Execution Time 
 
 The difference in execution time in the two schemes is centered 
 mainly in the performance of semantic actions. Depending on the 
 complexity of the condition field of the semantic table entries and 
 the number of times the state stack is searched, execution time will 
 generally be larger than in the normal SDTS. This may be offset by 
 using a case-type statement (which is more efficient than duplicating 
 code or procedure calls) embedded in a loop along with the code neces- 
 sary to follow the linked lists of the semantic table; a sample 
 implementation is given in Appendix B. As always, there is a trade- 
 off between power, space, time, and having a clean, understandable 
 implementation. 
 
 Comp at ib i 1 i ty 
 
 Since the ESDTS is a means of optimizing the. LR table, it is 
 appropriate to discuss this scheme in terms of available compaction 
 techniques . 
 
 The ESDTS is compatible with any table compaction method which 
 does not modify the internal values of the entries but simply manipu- 
 lates each entry as a whole. 
 
 EXAMPLE 1_: reordering table rows and columns to obtain a block of 
 
 "useless" entries (entries that cannot be reached unless 
 more than one error has already been committed, or, if 
 there is one large error-handling routine that is used 
 
32 
 
 for any error that occurs, entries that are all errors) 
 The table can now be divided into two partial tables, 
 and the "useless" block(s) can be eliminated. To 
 illustrate, the reordered LR table of Grammar h (from 
 Figure 6) is presented in Figure 11. 
 
 EXAMPLE 2 : implementing the table as a sparse matrix. See [KN68]. 
 
 EXAMPLE 3_: implementing the table in a sequential fashion. Again 
 the LR table of Grammar h is used to illustrate. See 
 Figure 12. 
 
 The ESDTS is also adaptable to other, more complex optimization 
 methods, such as state merging and elimination of non-essential re- 
 ductions. However, care must be exercised to insure that semantic 
 information ends up in the "right place" in the table. 
 
 One last comment on compatibility: even though the ESDTS has 
 been presented in terms of LR parsing, it is suitable (with perhaps 
 slight modifications) for use with many other table-driven, shift- 
 reduce parsing techniques such as LL or Floyd-Evans methods. 
 
33 
 
 OLD : NEW : _). 
 
 (TO) TO' 
 
 (Tl) Tl' 
 
 (T2) T2» » 
 I 
 
 (TT) T3' [ 
 
 (T8) TV , 
 
 °E 
 
 id numbr ( ASSGN ELEM 
 
 (T3) T5 1 
 (TU) T6' 
 (T5) TT' 
 (T6) T8 f 
 (TIO) T9 1 
 (Til) TIO' 
 (T12) Til' 
 (T9) T12' 
 
 Rl/2 
 
 Sk'/k 
 
 R2 
 
 RU/5 
 
 R5/T 
 
 R3/8 
 
 R6 
 
 SIO 1 
 
 R2 
 
 RU/5 
 
 R5/T 
 
 R3/8 
 
 R6 
 
 A 
 
 Rl/2 
 
 R2 
 
 RV5 
 
 R5/T 
 
 R3/8 
 
 R6 
 
 EXPB 
 
 Sl'/l 
 
 
 
 511 f 
 
 
 
 
 
 
 
 
 
 S2' 
 
 
 ST' 
 
 s8' 
 
 S3' 
 
 
 s6» 
 
 
 S5 1 
 
 Sl'/l 
 
 
 
 S12' 
 
 
 
 
 ST' 
 
 s8' 
 
 SIO' 
 
 
 S9' 
 
 
 
 
 
 
 
 
 
 
 I 
 I 
 J 
 
 The reordered LR ta"ble of Grammar h, 
 Fig\ire 11. 
 
3U 
 
 State 
 
 
 
 1 
 
 1 
 
 3 
 
 2 
 
 k 
 
 3 
 
 9 
 
 h 
 
 12 
 
 5 
 
 15 
 
 6 
 
 18 
 
 7 
 
 21 
 
 8 
 
 23 
 
 9 
 
 27 
 
 10 
 
 28 
 
 11 
 
 31 
 
 12 
 
 3i+ 
 
 13 
 
 35 
 
 Row information 
 (indicates where 
 each state starts) 
 
 3 
 
 h 
 
 5 
 
 6 
 
 7 
 
 8 
 
 9 
 
 10 
 
 11 
 
 12 
 
 13 
 
 Sl/1 
 
 id 
 
 S12 
 
 ASSIGN 
 
 S2 
 
 ~ 
 
 S5 
 
 id 
 
 s6 
 
 number 
 
 
 ST 
 
 i 
 
 S3 
 
 EXPR 
 
 S4 
 
 ELEM 
 
 Rl/2 
 
 1 
 
 S8/U 
 
 22. 
 
 Rl/2 
 
 e 
 
 R2 
 
 I 
 
 R2 
 
 22 
 
 
 
 30 
 31 
 32 
 33 
 3U 
 
 R3/8 
 
 e 
 
 R6 
 
 1 
 
 R6 
 
 °E 
 
 R6 
 
 e 
 
 A 
 
 e 
 
 "Useful" Column information 
 LR table (indicates in which 
 entries column the entry- 
 appears ) 
 
 The LR table of Grammar h implemented 
 in sequential fashion. 
 
 Figure 12. 
 
35 
 
 V. SUMMARY 
 
 The ESDTS presented here is designed to allieviate the problem 
 of extremely large LR tables. In particular, the ESDTS 
 
 - eliminates stratifications of the grammar, resulting in 
 a more natural, readable grammar, 
 
 - allows the semantic actions to be parameterized, thereby 
 making the translator easier to modify, 
 
 - allows semantic actions to be performed conditionally, 
 taking advantage of global context information, 
 
 - can produce a large savings in memory requirements in the 
 LR table, and 
 
 - is compatible with other optimization techniques. 
 
36 
 
 References 
 
 [AU72] Aho, A.V. and Ullman, J.D., The Theory of Parsing, Translation, 
 and. Compiling , Volume 1, Prentice-Hall, Inc., 1972. 
 
 [AU73] Aho, A.V. and Ullman, J.D., The Theory of Parsing, Translation, 
 and Compiling , Volume 2, Prentice-Hall, Inc., 1973. 
 
 [KN68] Knuth, D.E., The Art of Computer Programming , Volume 1, 
 Addi son-Wesley, 1968. 
 
37 
 APPENDIX A 
 LR PARSER USING SDTS. 
 
 PREDEFINEC CONSTANTS: 
 
 MAXSTATES, MAXRULES, I* A X V OC A6 ( = # T F R X I K AL S4 »NOK TE RM I K AL S ) . 
 AND SSTACK.SIZE 
 
 mcscs : 
 
 push_itate(n) : pushes n cn top of the state stack 
 
 HITH OVERFLOW CHECKING 
 POP STATES<M) : POPS M ITEMS OFF Tt-E STATE STACK 
 
 fclTH UNDERFLOW CHECKING 
 PT CECODE<X.Y.Z) : FRCM X=VALUE OF LR TABLE ENTRY 
 
 FINDS Y=TYPE OF FARSING ACTION 
 Z=PARSING ACTION KUMEEfi 
 
 fpccedures: 
 
 get_token : lexical analysis, returns encccing of 
 
 incoming token in global variable token 
 
 comments: 
 
 thclgh not shohn • arrays pt, leftside. anc ri c-htpartlen 
 
 are assumed to be properly initialized */ 
 
 sef: frcc; 
 
 CCL( PTC i :maxstates» 1 :maxvocab) • /* LB parse TAELE */ 
 
 PT_ENTRY,ACT_TYPE»ACT_NUM. /* INDIVIDUAL TABLE ENTRY 
 
 ANC ITS ELEMENTS */ 
 STATE STACK! 1 ISSTACK SIZE). /* STATE STACK */ 
 LEFTSTDEU :MAXRULES)7 /* ENCODING OF NONTERMINALS 
 
 CN LEFTHANOSICE OF RULES */ 
 RIGHTPARTLEM l'.MAXRULES )• /* R IGH THANCSIDE RULE LENGTHS */ 
 
 ) FIXED BIN(IS). 
 ( CASE(C:3) IMT(ERR. SHIFT. REDLCE. ACCEPT) • 
 SEMAC1 XMAXRULES) ) LABEL. 
 
 DONE INIT(*O a E) EIT(l); 
 STATE TOS=l ; 
 
 state!stack<state_tcs )«o; 
 
 CALL GET TOKEN; 
 
 cc »hile"~<-»cone) ; 
 
 FT_ENTRY=PT(STATE ST ACM ST ATE_TOS ) .TOKEN) ; 
 PT DECODECPT ENTR Y . ACT_TYFE . ACT NUM); 
 GOTO CASE(ACT TYPE); 
 ERR: /* ERROR HANDLING ROUTINE */ 
 
 GOTO endcase; 
 sfift: push_state(act_num) ; 
 call get token; 
 goto endcase; 
 reduce: pcp_states(RIGhtpartlencact_nlm) ) ; 
 gctc sema(act num); 
 sema(l): /* semantics associated *ith rule i *• 
 
 goto endsema; 
 sema(2): /* semantics associated *ith rule 2 ♦/ 
 
 GOTO endsema; 
 
 sema(maxrules ): /♦ semantics associated fcltf last rule */ 
 
 endsema: 
 
 pt entry*pt(state_stack(state tos ) .lefts ide (act_num )) ; 
 
 pt~decode(pt entry.act_type.act num); 
 
 if act_tvpe=t then push_st ate ( ac*t_num ) ; 
 
 else /♦ error */ ; 
 GCTO endcase; 
 accept: put skip list* "valid program accepted*); 
 
 return; 
 endcase: 
 i end; /* dc hhile */ 
 i parser; 
 
38 
 
 APPENDIX B 
 LR PARSER USING ESDTS, 
 
 /» PREDEFINED CONSTANTS: 
 
 MAXSTATES. MAXVOCAB(=*TERMINALS+#NCNTEFMINALS) • MAXRULES. 
 SSTACK_SI2E. MAXSEMA, MAXSEMACTS 
 
 macfos: 
 
 push_state(n) : pushes n cn top of state stack 
 
 *ith overflow checking 
 
 POP STATES(M) : POPS * ITEMS OFF TCF CF STJTE 
 
 STACK WITh UNOERFLC* CHECKING 
 PT CECODE (H.X.Y. 2) : FROM W=VALUE CF LP T*ELE ENTRY. 
 
 FINDS X=PARSING ACTICN TYPE 
 
 Y=PARSING ACTION NUMBER 
 Z=SEMANTIC FOINTER 
 SEM 0ECODE(W,X,Y.Z) : FROM *=VALUE CF SEMANTIC TABLE ENTRY 
 
 FINDS X=CONDITI0NAL STATE* 
 
 Y=EASIC SEMANTIC ACTICN * 
 2=LINK 
 
 frccedures: 
 
 get token z lexical analysis. retlrns encocinc of incoming 
 token in c-l08al variable token 
 
 comments: 
 
 arrays pt. leftside. f i gh tpartlen . anc sematae are assumed 
 tc ee previously initialized 
 
 state_stack( 1 :sstack_si2e ) is assumed to ee declarec 
 fixed bin<15) globally */ 
 
 FAFSEF: FRCCJ 
 
 CCL( PT( 1 IMAXSTATES.l IMAXVOCAB), /* LF FARSE TAELE ♦/ 
 
 PT_ENTRY )FIXED BIN(21). /* INDIVIDUAL TAELE ENTRY */ 
 ( ACT_TYPE.ACT NUM.SEM_PTR. /* TABLE FIELDS «• 
 
 LEFTSIDE( 1 IMAXRULES) • •* ENCODING OF NONTERMINALS 
 
 CN LEFTHANDSIDE OF RULES */ 
 RIGHTPARTLENI lCMAXRULES ). /* FI GHTHAN CS IDE RULE LENGTHS ♦/ 
 
 ) FI XED BIN (15), 
 ( CASEKK2) IMTIERR. SHIFT. REDUCE. ACCEPT) • 
 SEMAO :MAXRULES) ) LABEL. 
 
 DONE INIT(»C«B) EIT(l); 
 STATE TOS=i; 
 
 STAlE_STACK(STATE_TOS )=0; 
 CALL get_token; 
 co *hile <-«ccne); 
 
 pt_entry=pt(state_stack(state_tos).token) ; 
 pt_oecode<pt_entry,act_tyfe.act_num,sem_ptr); 
 call sema(sem_ptr ); 
 gctc case(act type); 
 err: /* error handling routine */ 
 gctc enccase; 
 shift: pcsh_state<act_num) ; 
 call~get_tcken; 
 gctc endcase; 
 reduce: pcp_states<rightpartlen<act num)); 
 
 pt entry=pt<state_stack(state tos) .leftside(act_num) ) ; 
 pt~oecgde<pt_entry.act_tyfe,act_num.sem_ptr); 
 if~act_type=1 then oc; 
 
 call sema<sem_ptr); 
 push_state ( act_num ) ; 
 end; 
 else /* error */ ; 
 
 goto endcase; 
 accept: put skip list<»valid program accepted*); 
 
 return; 
 endcase: 
 
 enc; /* DC WHILE ♦/ 
 
 end parser; 
 
39 
 
 APPENDIX B CONT'D 
 
 SEkAZ PPCC (LINK ) ; 
 
 CCL SEMATABC1 CMAXSEMA) FIXED BIN(31). /* SEMANTIC TABLE */ 
 (LINK. STATE* .SNUM) FIXED BIN(15). /* TABLE FIELDS */ 
 FCUND eiT< 1 ) • 
 
 sew_act<i :maxsemacts) label; 
 locf: if link=o then return; 
 
 if(link>maxsena | link<1) then /* error */ ; 
 sen_decode(sematab(link) .state*. s num. link); 
 
 FCLNDs'O »B; 
 
 OC I=STATE_TOS TO 1 BY -I ; 
 
 IF STATE_STACMI )=AES (ST AT E# ) 
 then oc; FCUND=M»E; 
 goto oone; 
 end; 
 enc; 
 
 dcne: IF (STATE#>0 & FOUND) \ (STATEA<0 & -FOUND) 
 
 then goto sem_act ( snum) ; 
 else gotc loop; 
 £ek act(l): /* easic semantic action 1 */ 
 
 goto loop; 
 se»_act(2): /* easic semantic action 2 */ 
 gotc loop; 
 
 • 
 £EM_ACT( MAXSEMACTS): /* LAST BASIC SEMANTIC ACTION */ 
 
 GOTC loop; 
 
 ENC SENA; 
 
IBLIOGRAPHIC DATA 
 MEET 
 
 1. Report No. 
 
 UIUCDCS-R-76-795 
 
 2. 
 
 3. Recipient's Accession No. 
 
 Title .uiJ Subt it tc 
 
 5- Report Date 
 
 APRIT, 1976 
 
 AN EXTENDED SYNTAX-DIRECTED TRANSLATION SCHEME 
 
 6. 
 
 Author(s) 
 
 Sandra A. Leach and Will D. Gillett 
 
 8. Performing Organization Rept. 
 No. 
 
 Petlorming Organization Name and Address 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 6l801 
 
 10. Project/Task/Work Unit No. 
 
 11. Contract/Grant No. 
 
 , Sponsoring Organization Name and Address 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 6l801 
 
 13. Type of Report & Period 
 Covered 
 
 14. 
 
 1 Supplementary Notes 
 
 1 Abstracts 
 
 In syntax-directed translation schema, semantic actions may be performed 
 only when a grammar -rule reduction occurs and must rely on strictly 
 local context. These two conditions often lead to excessive stratifica- 
 tion of the grammar, especially when the semantics must perform both 
 translation and error correction. In the extended syntax-directed 
 translation scheme, semantics may be associated with any entry in the 
 LR parsing tables (shift, reduce, accept, or error) via semantic 
 pointers to the semantic action(s) to be performed. Since the state 
 stack contains the parse history of all the contexts within which the 
 current construct is embedded, semantic actions may be performed 
 conditionally, depending on the status of the state stack. 
 
 !7Key Words and Document Analysis. 17a. Descriptors 
 
 syntax-directed translation schema 
 
 LR parsing 
 
 semantics 
 
 state stack 
 
 global context 
 
 't'ldentifiers/Open-Ended Terms 
 'choSATI Field/Group 
 
 Mailabihty Statement 
 
 19. Security Class (This 
 Report) 
 
 UNCLASSIFIED 
 
 21- No. of Pages 
 
 39 
 
 UNLIMITED 
 
 
 20. Security Class (This 
 Page 
 
 UNCLASSIFIED 
 
 22. Price 
 
 USCOMM-DC 40329-P7I 
 
tttff 7 
 
 m