'L I B RAHY 
 
 OF THE 
 
 UNIVLRSITY 
 
 OF ILLINOIS 
 
 510.84 
 IZQT 
 
 no. 237-242 
 cop. 2 
 
The person charging this material is re- 
 sponsible for its return on or before the 
 Latest Date stamped below. 
 
 Theft, mutilation, and underlining of books 
 are reasons for disciplinary action and may 
 result in dismissal from the University. 
 
 University of Illinois Library 
 
 MARJ8 
 
 mar*;* 
 
 DEC ? 
 
 ': 
 
 ft$ 
 
 L161— O-1096 
 
Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/translatorgenera237mcin 
 
^ a37 Report No. 237 
 
 July 12, 1967 
 
 C00-1018-1121 
 
 THE TRANSLATOR GENERATOR SYSTEM 
 by 
 A. W. Mclnnes 
 
 Z^ 
 
 I 
 
 ^ 
 
 DEPARTMENT OF COMPUTER SCIENCE • UNIVERSITY OF ILLINOIS ■ URBANA, ILLINOIS 
 
Report No. 237 
 
 THE TRANSLATOR GENERATOR SYSTEM 
 by 
 A. W. Mclnnes 
 
 July 12, 1967 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 61801 
 
 Supported in part by Contract AT (11-1) -1018 with the U.S* Atomic Energy Commission 
 and the Advanced Research Projects Agency. 
 
THE TRANSLATOR GENERATOR SYSTEM 
 A. W. Mclnnes 
 
 1 . INTRODUCTION 
 
 This paper describes a system designed to produce a scanner from 
 the Backus Naur Form definition of the syntax of a programming language. 
 The theoretical aspects of this problem are considered in the paper by 
 Eickel and Paul [1J. 
 
 This problem really involves two important problems: 
 
 (a) Parsing problem 
 
 To give a method which decides whether or not a given 
 string, according to some rules of a language, belongs to 
 it, and if so to give the corresponding rules, 
 
 (b) Ambiguity problem 
 
 To decide whether or not a language is unique, i.e., 
 whether or not for every string in the language the corre- 
 sponding set of rules is uniquely determined. 
 
 In this paper we restrict our attention to Chomsky two grammars 
 (i.e., having a production system having just one element of the alphabet 
 of the language as left hand side of each production) having just one 
 distinguished element, Z, of the alphabet to which every sentence in the 
 language may be reduced by repeated applications of the members of the 
 production system. 
 
 The main purpose of this system is to produce a purposive 
 parsing system i.e., if in a left to right scan of a string, we find any 
 production (replacement rule in the syntactical definition of tne language; 
 matched by a substring of the given string, we can make the indicated 
 replacement knowing that this is the only possible replacement. However, 
 due to the practical necessity of providing bounds on the algorithm at 
 various stages, it is not clear that such a goal is possible for every 
 
such language considered. The size of the subset which can be successfully 
 treated will be a matter of practical experimentation, and possible modi- 
 fication based on experience with the system, 
 
 1 . 1 Basic Conc epts 
 
 Let (?< = (A, B, C, .».., Z} be a fin ite set of distinguished 
 
 characters which we shall call the alphabet. The elements (a, b, c, „,.„.) 
 
 of the free semigroup J (Of) over Of under concatenation will be called 
 
 strings over 01 . The unit element (or empty string) of/ ('-J will be 
 
 denoted by e. If x = X,X„ . .. X (X, e 'Jf ) , then Ixj - n :'.s called the 
 J 1 2 n k 
 
 length of x. A finite set II of ordered pairs of strings, 
 
 n - { £L :: = b ; a , b R e fY#) "* (e } ; k = l s 2, , .. , , ra } 
 
 is called a production system in j (OJ) - A production a :: = b (read 
 
 b, may be replaced by, or becomes, a ' ) may be regarded as a replacement 
 rule in the defining syntax of the language, a is called the left side 
 and b, the right side of the production a :: = b , II generates relations 
 p and p in -/ (D'l) in the following manner; 
 
 Definition 1 
 
 u, v e -f(Oj) : u p Q v iff x, y e f(<%) , 
 
 3 (a : : = b) e II 9u™ xby, v = xay 
 
 i.e., for given strings u and v in the free semigroup, we define u in 
 relation to v (u p v) if and only if there exist strings x and y in the 
 free semigroup and a production a :: = b in the production system II, such 
 that the string u has the form xby and v has the form xay. 
 
 Definition 2. 
 
 u, v e f(a) : u p v if f 3 <u, e ^(at) ) k=0,l,2 ..., n 5 
 U o = U ' U n = V ' \ P o \+l (k = °> 1 ' 2 ' •••• n ' 1) 
 
 -2- 
 
i.e., we define the string u to be in relation to the string v (u p v) if and 
 only if there exists a finite sequence of strings u such that tL p u, . 
 according to definition 1, and such that we are led by repeated applica- 
 tions of the relation p from the string u to the string v. That is the 
 
 relation p is the transitive closure of the relation p . For u p v we 
 
 o 
 
 shall say that u is a word for v. 
 
 A special subset of (7( is distinguished: 
 
 01 ~ { X | X e 0( » X does not occur as a constituent of the 
 
 right side of a production of II } 
 
 We call the subset 0( the set of anti-terminal characters in the alphabet. 
 
 The system we wish to consider is a quadruple Z = ( (?{ , II, Z, y) 
 where 0[ is an alphabet, II is a production system in f (.01) , Z is the 
 
 -vJ. 
 
 distinguished element in U( , and 
 
 Y = ( x x e * (Ol) , x p Z } 
 
 i.e., the set of all strings in the free semigroup which are words for 
 the distinguished element Z, including the case x = Z. However note that 
 we have restricted our consideration to those systems having a production 
 system of which the left side of each production is just a single element 
 of the alphabet (X . The set y of such strings is called a language. 
 
 Definition 3 (critical production) : 
 
 A production A :: = B.B ... B is called critical if there 
 
 r 12m 
 
 exists a k e { 1,2, .... m} and a production A' :: = b' in II, not 
 
 id 
 
 b' 
 
 identical with the production given above such that B occurs the string 
 
 Using this concept we now define a relation p contained in 
 
 p o' 
 
 -3- 
 
D efinition _4 
 
 u, v e f(C'(). u p" o v iff x, y e /^, 
 
 3 (a :: - b) e il 3 a ; : = b is not critical, 
 
 u = xby , v = xay. 
 
 i.e., the relation p is just the relation p restricted to the non- 
 ' o o 
 
 critical productions. Corresponding to definition 2, we also define the 
 relation p as the transitive closure of p ; that is u p v if we can 
 pass from u to v by a finite sequence of applications of p using only 
 the non-critical productions „ Clearly p contains p. 
 
 The relation p has the following property: 
 
 (//) u, v, w e ■/([}/) '• upv,upw =>vgw 
 
 To demonstrate this property by an example let us suppose we have productions: 
 
 A l :: =B 1 
 
 A 2 :: -B 
 
 A :: = B 
 
 The first is the only non-critical production in this system= 
 Let us suppose u=aB bBcBd 
 Thus if v=aAbBcBd 
 
 We have u p v 
 
 Now if w = a A b A c A_ d 
 
 then u p w since for p we may use any of the productions and we have the 
 
 following sequence of relations p : 
 
 o 
 
 -4- 
 
u = aBbBcBdp a B, b B c A„ d p a A, b B c A„ d p a A, b A^ c A„ d 
 1 o 1 3 o 1 3 o 1 2 3 
 
 and note that these two relations satisfy the condition that p is just the 
 relation p restricted to the non-critical productions. 
 
 Thus we clearly have v p w since 
 v = a A bBcBdpaA b B c A d p a A b A„ c A. d 
 
 Note that this property # is not true for p since we could then have 
 u p v = a A b A„ c B d and 
 
 u p w - a A b A p c A- d 
 
 In this case it is not true that v p_ w since although 
 
 v p v = a A b A., c A d , 
 
 there is no way we can replace the element A„ between the substrings b and 
 c to become the element A required for the string w. Thus we have come 
 to a "dead end" and have been led outside the set y consisting of the 
 strings comprising our language. 
 
 Thus if the relation p (p) had this property (#), then the 
 parsing problem would become trivial for then 
 
 u g y, v g ffa) : u P v i>v g 7 , 
 
 and this would mean that the application of the syntactical rules ac- 
 cording to p never leads from inside to outside of 7. In general 
 however, p does not have this property. Therefore, the main goal to 
 be achieved is to restrict the production system II to a syster 
 by means of contextual conditions such that the orrespondingl^ 
 stricted relation p* (C p) has the above mentioned property (#) . 
 
 All ambiguities in the language are characterized by the fact 
 that they are words for Z according to more than one essentially different 
 ways. In other words we may obtain more than one distinct sequence of 
 
 -5- 
 
replacement rules (or productions) which may be applied to a given string 
 and reduce it to the distinguished element Z. Thus the given string 
 be regarded as a valid word in the language in more than one essentially 
 different ways - i.e., in the language of programming, it may be scanned in 
 more than one way and still turn up as an apparently legal construct. 
 This is essentially the problem as discussed above except that the 
 "wrong choice" of a production now leads again into the set y instead 
 of outside of y and to a "dead end" as before. 
 
 In order to rigorously explain "essentially different" as used 
 in the previous paragraph we give the following definitions: 
 
 i )efinition 5 (local „ambi £uit x)_i. 
 
 A string x is called lo^^lj{_ambij'uo_us_ if there exist in II two 
 different productions a : : » b and a 1 :: = b' with |x|< |bj + |b'| 
 where b and b' are substrings of x, which cover x completely. 
 
 Thus since the length of the string x is less than the sum of 
 
 the lengths of the strings b and !>' and since {b} (J {b'} covers x com- 
 pletely, it follows that {!>} f\ {b'} ^ <J> . Thus it is clear that the 
 
 strings b ana b' have at least one element in common and so since the 
 
 productions are different, if there exist two such productions they must 
 
 be critical by definition 3. From definition 5 it follows that for ev< \ 
 
 locally ambiguous string x, there exist u, v, u' , v' e -/^(Of) with 
 
 x=ubv=u' b'v' , 
 
 x p u a v ~ x, 
 
 o 1 
 
 x p u' a' v' = x~ 
 o 2 
 
 De finition 6 (ambiguity) : 
 
 A string m is called ambiguous iff there exist u, v e y 
 and a locally ambiguous string x (where x p x , x p x^) such that 
 
 m p_ u x v , u x v P Z and u x v P Z 
 
This rigorously defines what we mean by saying that the string 
 m is a word for Z in two essentially different ways, and, places in 
 terms of the concepts rigorously defined above, our intuitive idea of 
 an ambiguity as a construct having two (or more) apparently valid 
 interpretations within the structure of the language, 
 
 Thus it is clear, that in order to isolate those elements of 
 IT which are causing either ambiguities occuring in Z, or the relation 
 p not having the quoted property(#), we need to isolate the critical 
 productions of n. We then attempt to resolve these problems by finding 
 contextual conditions on the critical productions and obtain a unique 
 system which contains no ambiguities. 
 
 The remainder of this report gives details of how this goal 
 may be achieved. Chapter 2 deals with the reduction of our system to 
 a Simple Chomsky system; Chapter 3 considers the determination of the 
 contextual conditions for the critical productions and Chapter k 
 describes a purposive scanner obtained from the system. 
 
 1.2 Example 
 
 In order to better explain some of the concepts mentioned 
 above, we shall illustrate by means of a short example. 
 Suppose we have the production system: 
 F 
 
 = (E) 
 
 = F 
 
 = T * F 
 
 F 
 T 
 T 
 E 
 E 
 Z 
 
 = E + T 
 
 =■■ E 
 
 It will be seen that these are part of the productions for an arithmetic 
 expression in ALGOL, where V = variable, F = factor, T = term and E = 
 expression, which becomes the distinguished element in the system. 
 
 -7- 
 
Suppose we consider the string V + V * V. Note that all 
 productions except the first are critical so we may expect that some 
 methods of reducing this string will lead to dead ends. 
 
 There is a production F : : = V. So suppose we replace all the 
 elements V in the string by F. The string now becomes 
 
 F + F * F 
 
 There is a production T :: = F, so we can replace every F by a T. The 
 string becomes 
 
 T + T * t 
 
 Similarly, using the production E :: = T, the string becomes 
 
 E + E * E 
 and finally using the production Z :: = E, the string becomes 
 
 Z + Z * Z. 
 
 However, we have now reached a dead end since the string has 
 not been reduced to the single distinguished element Z and no further 
 reductions form the production system can be made. Let us consider 
 what went wrong. By an inspection of the production system and our 
 intuitive knowledge in this case, we find that the problems began in 
 replacing the F by T. If the final F in the string had not been replaced, 
 we would have obtained the string 
 
 T + T * F 
 
 and the last three elements of the string is a right side of a production, 
 T : : = T * F. Making this replacement we obtain the string 
 
 T + T 
 
 -8- 
 
However, if we again use the production E : : - T in both cases 
 here, we are again led to a dead end 
 
 Z + Z 
 
 But if we use this production to replace only the first T we obtain the 
 string 
 
 E + T 
 
 which is again a right side of a production E : : = E + T. 
 
 Making this replacement and then using Z : : - E, we have 
 finally reduced the string to the single element Z as required 
 
 Thus, our reduction of the original string V + V * V, would 
 have been satisfactory if we use the production T :: ^ F to repl 
 F by T, except when F is preceded by a *. Or expressing the same con- 
 dition constructively, we use the production T : : - F only when F is 
 preceded by a '(' or a '+' or 'e f (e is the null string)- Thu; we 
 replace the production T :: s F in the system by the three produ 
 
 eT : : - eF 
 (T ;: - (F 
 +T :: - +F 
 
 Then we can make purposive replacements of F, using any production whos< 
 right side matches a substring of the given string, and then being sure 
 that we will not be led to a dead end= 
 
 Similarly for the production E :: - T, we find that we m 
 
 replace T bv E except when T is eded by a '-»", Hem replace 
 this production in the system by the three productions 
 
 -9- 
 
eE :: = eT 
 
 (E :: ■ (T 
 
 *E :: = *T 
 
 We now have a purposive passing scheme using this modified 
 production system which avoids leading to dead ends by the addition of 
 contextual conditions to certain productions. The following system 
 attempts to implement in a systematic manner, our intuitive evaluation 
 of this simple system. 
 
 -10- 
 
2. SIMPLE CHOMSKY SYSTEMS 
 
 The first part of this system will transform the production 
 system of a CHOMSKY - language given in Backus Naur Form into a simple 
 production system for the same CHOMSKY - language. The essential property 
 of a simple production system is that its critical productions are easily 
 recognizable and are all of length one. The description of the implemen- 
 tation given in section 2.2 should be read in conjunction with the flow 
 charts for the first part given in the appendix* 
 
 2.1 Basic Concepts 
 
 We are given a CHOMSKY system, which includes a set of produc- 
 tions II. Initially we reduce the given CHOMSKY system to a simple 
 CHOMSKY system which implies that the set of productions II must be trans- 
 formed to a set II 1 where II 1 is defined in definition 7. 
 
 Definition 7 (Simple CHOMSKY System) 
 
 A CHOMSKY System E' = { & IT, Z, y 1 } is called Simple if 
 n' = ( k^ :: = b k ; k = 1, 2, ... n } 
 
 has the following properties: 
 
 1) ! l>, j ^ 2 i.e., the right sides of the productions have 
 length one or two. Thus all productions of n' have the 
 form: 
 
 A. :: = S. T. i - 1, 2, , £ 
 
 l li 
 
 A. , : : ■ U . j = 1, 2, , m 
 
 2) The sets {S.}, {T.}, {U.} are pairwise disioint. 
 
 i i j r J 
 
 3) S t S , T $ T for all p, q - 1, 2, . .., I and p # q. 
 
 pqpq r»-i >» r-i 
 
 Thus all the critical productions are of length one and the local ambi- 
 guities (i.e., having more than one replacement possible from a given 
 element) are exactly the right-hand sides of the critical productions. 
 
 -11- 
 
The simple CHOMSKY system is obtained from the given system by 
 carrying out the following operations: 
 
 i) If there is a production X :: ■ X.X....X where n > 2 then 
 
 1 2 n 
 
 a new character E is created and this production is replaced 
 by the pair of productions: 
 
 E :: =X X X 2 
 
 A * I ~~ Lj An ° O • A 
 
 3 n 
 
 The second of these new productions is of length n - 1. If 
 n - 1 > 2 then this procedure is repeated iteratively until 
 all the new productions have length 2. The new character 
 E is adjoined to the alphabet 01 . 
 
 ii) If there exist two productions of the form 
 
 A :: ■ XT; B :: ■ SX (or a production of the form A :: = XX), 
 then the symbols X^ , X are created and the productions are 
 replaced by 
 
 A :: -XjT 
 
 or 
 
 A : 
 
 : '\\ 
 
 B II -SI, 
 
 
 X R 
 
 : : = X 
 
 X, :: -I 
 
 
 \ 
 
 : : = X 
 
 \ ■■■■ =X 
 
 
 
 
 The new characters X_ , X^ are adjoined to the alphabet 01 . 
 
 iii) If there exist two productions of the form 
 
 A : : ■ XT (A : : - SX) and B : : = X, then the symbol E 
 is created and the first production is replaced by 
 
 A : : - ET (A : : = SE) 
 E : : - X 
 
 The new character E is adjoined to the alphabet vl . 
 
 -12- 
 
iv) If there exist two productions of the form 
 
 A : : = ST and B : : « ST 
 
 then a new symbol H is created and these productions are 
 replaced by 
 
 H : 
 
 : = S 
 
 A : 
 
 : - H 
 
 B : 
 
 : = H 
 
 The new character H is adjoined to the alphabet Ot • 
 
 v) If there exist two productions of the form 
 A : : = XT and B : : - XT where T f T 
 (or A : : = S.X and B : : - S 2 X where S i S 2 ) 
 
 then the symbols G, H are created and these productions 
 are replaced by 
 
 A : 
 
 : = GT 
 
 (A :: = S 1 G) 
 
 G : 
 
 : = X 
 
 
 B : 
 
 : =HT 2 
 
 (B :: - S 2 H) 
 
 H : 
 
 : = X 
 
 
 The new characters G, H are adjoined to the alphabet 6y . 
 This construction, if carried out in the given order, leads 
 after finitely many steps to a simple CHOMSKY system. The 
 proof of this assertion is given in the paper by Eickel and 
 Paul [1]. 
 
 -13- 
 
2 .2 Implementation 
 
 2„2,1 Functional Description 
 
 In order to accomplish this reduction and also in order to 
 facilitate the next stage in which the ambiguous productions are resolved 
 by means of context, the productions IT are collected in three stacks: 
 
 (1) Stack S (with stack pointer p_) which contains all 
 productions of length 2< 
 
 (2) Stack S 1 (with stack pointer p..) which contains all 
 unambiguous productions of length 1. 
 
 (3) Stack S (with stack pointer p ) which contains all 
 critical productions (of length 1 by hypothesis) . 
 
 As each production of the given system is read in, each new 
 alphabet element is entered in the symbol table u[ in the first avail- 
 able location o The location of this element in the table (ft gives its 
 internal number which is used in all subsequent manipulations • Each 
 element of the symbol table Qf is of one word in length and has the 
 following form: 
 
 Flag bit on first byte 
 
 i r 
 
 1 — J 1 — I — 1 1 — I i — I — 1 — ! ! — I 1 — I ! < — I 
 
 -\ — i — i — i — r 
 
 I Z 
 
 K. 
 
 ^T 
 
 ~J 
 
 Five alphabetic characters of external representation. 
 
 Ol 
 
 Pointer, p , to the next 
 
 available word in the table. 
 
 Thus p will become the 
 a 
 
 internal number of the next 
 symbol entered in (Jl . 
 
 -14- 
 
If the new element is an anti-terminal symbol (i.e., a member of Ol which 
 consists of all those symbols that do not appear on the right-hand side of 
 a production), then the flag bit on the first byte is set to one. The 
 first two bits of the word are used to signal certain conditions on the 
 symbol represented in the rest of the word. If both bits are zero then the 
 external symbol is not more than five alphabetic characters in length and is 
 represented in BCD code in the remaining 30 bits of the word. If both bits 
 are one then the external symbol is more than five alphabetic characters. 
 In this case only the first character is represented in BCD code in the 
 remaining six bits of the first byte and the last three bytes contain the 
 link address of the first byte of the sequence of consecutive bytes which 
 contain the remaining characters of the extended symbol. The flag bit is 
 set on the last byte of this sequence to indicate the length of the extended 
 symbol. If the first bit is one and the second bit is zero then a "created 
 symbol" is indicated. In this case the external representation is left blank, 
 but the internal number corresponding to this position is used in the pro- 
 ductions to represent a created symbol as outlined above. To summarize: 
 
 External symbol of length ^_ 5 alphabetic 
 characters 
 
 Created symbol 
 
 Extended external symbol ;,,e., lengtt 5 
 characters , 
 
 In order to conserve storage and yet not be limited by ; he num 
 of symbols that may be used in a given system, the internal number will be 
 represented in the following form: 
 
 Since in general the reduced system will not have more chan 255 
 symbols in its "alphabet", the internal number of each system will be 
 represented by one byte. If however there are more than 255 symbols in 
 the alphabet then one byte is added to allow another 255 elements to be 
 
 Bit 1 
 
 Bit 2 
 
 
 
 
 
 1 
 
 
 
 1 
 
 1 
 
 -15- 
 
represented. This process is continued for as large an internal number as 
 required i.e., if the first byte has the value 255 (all bits are units) then 
 to find the internal number the value of the next byte is added to 255. If 
 the second byte also has the value 255 then the next byte is considered and 
 its value is also added on. 
 
 e.g- 
 
 11111111 
 
 11111111 
 
 00000011 
 
 Internal number = 255 + 255 + 3 
 
 The forms of the stacks S_, S n , and S are 
 
 2 1 u 
 
 V 
 
 s r 
 
 a. S J T. 
 
 l i i 
 
 1 1 i 1 Po " * 
 
 b. 
 J 
 
 U. 
 J 
 
 
 C k 
 
 v k 
 
 1 1 J 1 P x - ! 
 
 1 < k < p - 1 
 — — r u 
 
 where the symbols denote the i-th, j-th, k-th elements of the respective 
 stacks, and where each symbol represents an internal number of one byte in 
 length in general, but extended as above when necessary. 
 
 -16- 
 
In the procedure outlined below for making the required reduction 
 to a simple Chomsky system, the operations described previously are combined 
 in a series of subroutines. As each production is read in, each symbol is 
 compared with the symbols already in the symbol table 0{ , and converted to 
 its internal number. The production, represented in internal number form is 
 thus built up in a special location called PR0DN, from which it is pushed 
 into the appropriate stack after any necessary operations have been carried 
 out. If these operations involve the creation of a new symbol, then the 
 'created symbol' bit is set on in the next available cell in the symbol table 
 
 Ql , the external symbol representation is left blank, and the corresponding 
 internal number is us ad in PR0DN as the created symbol. The pointer p is 
 of course increased by one. The symbols of PR0DN are denoted by a, S, T 
 (not subscripted) for the zeroth, first and second symbols respectively . 
 The procedure to accomplish the reduction and set up the system for the 
 resolution of ambiguities, is a relatively simple one which calls on three 
 subroutines; C0MPAR, E and RH0 which will be explained in detail below . 
 Note that the left hand side of a production is called the zeroth element 
 and the first, second etc, elements of the right hand side are termed the 
 first, second etc. elements respectively of the production. The parameter 
 n determines the position of the symbol of the production currently under 
 consideration and the parameter j is a measure of the number of elements of 
 the right-hand side of the productions which match previous right hand sides 
 (see Subroutine E(n)). 
 
 -17- 
 
2.2o2 Operation Description 
 
 The main procedure SIMPLE 
 
 A new production is read in, the location PR0DN (where the produc- 
 tion is to be assembled in its internal form) is cleared, the parameters n, 
 j are initialized and a call is made on the subroutine C0MPAR to find if 
 this symbol has already been used in a production. The next symbol is tested 
 and if this is empty and if n = (i.e., a left hand side was being considered) 
 then there is an error condition, ERR0R 1, since the production did not have 
 a right hand side. If n 4 then, since all symbols of this production have 
 been read in, a call is made to the subroutine E which will push the produc- 
 tion in the appropriate stack depending on the value of the parameter j* If, 
 however, the next symbol is not empty, then if it is not the third symbol, 
 the comparison procedure with the symbol table 0[ is again repeated with this 
 new symbol by means of the subroutine C0MPAR, as before. If the next symbol 
 is the third then the subroutine RH0 is called, which separates off a pro- 
 duction of length two as described in operation (i) above and third symbol 
 now becomes the second before repeating this process of comparison. 
 Eventually the "next symbol" will be empty (the productions are of finite 
 length) and the procedure loops back, as above, to consider the next pro- 
 duction. 
 
 Subroutine C0MPAR (n, j) 
 
 This procedure searches through the symbol table Cn (whose i-th 
 element is denoted by N.) to check if the n-th symbol of the production 
 matches any element of Q{ , i„e., the n-th symbol has been used in a previous 
 production. If no match is found then the new symbol is entered into the 
 symbol table Q( . Note that if the symbol is longer than five alphabetic 
 characters then only the first character is entered into Of and the remain- 
 ing bytes contain a link address to the extension as explained previously, 
 The pointer p is tne internal number of this new symbol and this is placed in 
 
 -18- 
 
the n-th position of PR0DN (where if the internal number is greater than 255 
 it is converted to the representation described above). If n ■ the symbol 
 is a left hand side and so the flag bit is initially set, to denote that this 
 symbol possibly is an anti-terminal symbol. 
 
 However if a match is found and we are not considering a left-hand 
 side (i.e., n # 0) then a check is made to determine if the matched symbol 
 was provisionally marked as an anti- terminal symbol (i.e., the flag bit is 
 set) . If this is the case then the flag bit is set to zero since we are 
 considering a right hand side and the symbol, therefore, is not anti-terminal. 
 If the flag is zero then frhis right-hand side symbol matches a right-hand side 
 symbol of a previously considered production and the parameter j is adjusted 
 accordingly. Whether the symbol is a right or left hand side symbol the loon 
 variable i (converted to the required representation if necessary) is placed 
 in the n-th position of PR0DN and a return is made to the calling point of 
 the procedure. Note if n = 1 and this first element of the right-hand side 
 was now matchedchen j is set to the value 1/2 before the return, in order 
 that j has the correct value consistent with the description in the subroutine 
 E(n) in the case that the second element is matched „ 
 
 Subroutine KH0 (n) 
 
 This procedure separates off a production of length two from a 
 production of length greater than two, as described in operation (i) . A 
 new symbol is created in the manner previously described. Since the zeroth 
 element (left hand side) of the production will be needed for further 
 processing of the over length production, this is stored temporarily before 
 placing the newly created symbol (in internal number form) in the zeroth 
 position. Then, since the subroutine E must be called to determine into 
 which stack this production of length two is to be pushed (since there may 
 
 -19- 
 
have been matches in the first two elements, the subroutine E will destroy 
 the contents of PR0DN) the contents of PR0DN are stored temporarily before 
 calling E. Upon return from E, processing of this production is continued 
 with the original left hand side as left hand side of the production and 
 the left-hand side of the split off production as the new first element. 
 The third symbol of the production being processed now becomes the second 
 symbol upon return. Note that in the main procedure, after calling RH0, 
 the parameter j is set to one half. This is because the first symbol is 
 now the created symbol and so cannot match any symbol of a previous 
 production. 
 
 Subroutine E(n) 
 
 This is the main subroutine which determines into which stack 
 the production is to be pushed. It combines all the operations (ii) , (iii) , 
 (iv) , (v) in case any of these conditions are met. This subroutine in turn 
 calls on three further subroutines: 
 
 a) REPL4 in case condition (iv) is met, 
 
 b) REPL2 in case either conditions (ii), (iii) or (v) are met, 
 
 c) SSI which searches the stacks of productions of length one 
 
 in case the matching element was not found among the elements 
 of S 2 . 
 
 The parameter n is the measure of the number of matching right-hand sides . 
 If the parameter n = then no matches were found and so if the second 
 element T of PR0DN is empty then the production is pushed into S else it 
 is pushed into S_ before making a return. 
 
 If the parameter n = 1 then the first element of PR0DN was matched 
 but not the second. If n = 2 then the second element of PR0DN was matched 
 but not the first. If n = 3 both elements of production were matched, Thus 
 if n ^ the first and second elements of S„ are searched from the top down 
 
 -20- 
 
to attempt lu find where the matching element occurs. If no match has been 
 
 found when the bottom of S 9 has been reached then before the subroutine 
 
 SSI is called to search S, and S for the match, we check whether n = 2, and 
 
 1 u 
 
 if so, whether the matched element in fact matches the first element of the 
 production under consideration. In this case we call the subroutine REPL2 
 to effect the replacements called for by condition (ii) . We set the second 
 parameter of REPL2 to be the T element of this new production which will be 
 pushed into S by the subroutine REPL3 which is called by REPL2 „ If n = 2 
 the second element of PR0DN was matched and so SSI is called with parameter 
 T (the second element of PR0DN) . If n = 1, the first element was matched and 
 so SSI is called with parameter S. However if n = 3, after calling SSI (S) 
 and determining the match for the first element, the match for the second 
 element has yet to be determined and so the production must be popped from 
 S_ and the same production evaluated again for the match of the second 
 element by calling SSI (T) . 
 
 On the other hand if a match is found in S ? , the resulting action 
 again depends on the value of the parameter n« (if n = 2 only T need be 
 checked against the element of S ). If T matches an element of S then 
 subroutine REPL2 with appropriate parameters, is called which essentially 
 makes the replacements outlined in operations (ii) and (v) . Then if n f 3 
 a return is made, otherwise, the match of the first element S has yet to be 
 found and so the production is popped from S for further consideratiun. 
 
 The parameter n is set to 1 (first element matched) and the search 
 of S_ is continued since this match may also be in S, but among the elements 
 as yet unsearched. 
 
 If S matches an element of S there is a more complicated situation. 
 If S matches a first element and n = 3 then a check is made to see whether T 
 matches the second element of the same production. If this is true subroutine 
 REPLA is called to carry out operation (iv) and then return „ If not then 
 
 -21- 
 
REPL2 is called as above, and again if n = 3 the top element of S must be 
 popped for recondiseration and the parameter set to 2 (second element matched 
 but not the first) . 
 
 Subroutine REPL 4{n) 
 
 This subroutine carries out operation (iv) when the first and 
 second elements of PR0DN match the first and second elements of a production 
 in S_» Since both productions of length one have the same right hand side, 
 these are pushed into the stack S of critical productions . When there is 
 a sequence of productions in S with the same right-hand sides then to signal 
 the end of the sequence, the flag bit is set on the first byte of the final 
 production of the sequence Finally the zero-th element of the matched 
 production in S~ is replaced by the created symbol (which of course is still 
 the topmost element in the symbol table (/( ) , Note that a push into a stack 
 from PR0DN is assumed to be a non-destructive store from PR0DN o 
 
 Subroutine REPL2 (Y,X) 
 
 This subroutine combines the replacements required in operations 
 (ii) (iii) and (iv) „ If the contents of PR0DN is a production of length 
 two (i.e., T is not empty) then the subroutine REPL3 (Y, X) is called to 
 make the appropriate replacement of the matched element in PR0DN s and set 
 up the contents of PR0DN for the remaining replacements which is the same 
 situation as if the contents of PR0DN was production of length one and a 
 match had been found in S , The contents of PR0DN are pushed into S , a 
 new symbol is created, the matched symbol in S is replaced by the created 
 symbol, the zero-th element of PR0DN is replaced by the created symbol (the 
 first element is still the matched element since the store from PR0DN has 
 been assumed non-destructive), and the contents of PR0DN is pushed into S 
 (with flag bit set on first byte since this is the last production of this 
 sequence of critical productions) before making a return. 
 
 -22- 
 
Subroutine KEPL3 (Y , X) 
 
 Tnis subroutine replaces the matched symbol (parameter Y) with a 
 created symbol, pushes the production in PR0DN into S,,, and tne PR0DN is 
 set up as a production of length one for the required replacements of pro- 
 ductions of length one. The created symbol is placed in the zero-th position 
 of PR0DN and the matched element (taken from the matched element in S tnis 
 ti~.c) is placed in the first position. 
 
 Subroutine SS1(Y) 
 
 Tnis subroutine searches the stacks of productions of length one 
 (namely S and S ) to find the element which matches the parameter Y. It 
 in turn calls three further subroutines: 
 
 a) REPL3 - to handle the production of length two as above, 
 
 b) ADJS1 - to adjust the stack S of the matching element is 
 
 found here, 
 
 c) ADJSU - to adjust the stack S if tne matching element is 
 
 found here. 
 
 The stack S n is searched from the top down. If a ma ten is found 
 1 
 
 and PR0DN contains a production of length two (i.e., T is not empty), then 
 
 subroutine REPL3 is called to push the modified production of length two 
 
 into S„ (according to operation (iii)), and then ADJS1 is called to modify 
 
 the stack S since an element has been matched in this stack, If PR0DN 
 i 
 
 is of length one then only ADJS1 is called to adjust the stack S . 
 
 If no match is found in S., , then the critical productions in S 
 
 1 u 
 
 are searched for a match. When the match is found then, if T is not empty 
 
 REPL3 is called to deal with the production of length two as before, and 
 
 then ADJSU is called to adjust the stack S since an element in this stack 
 
 u 
 
 has been matched, Note that if no match is found in this stack then all 
 
 the stacks S.-., S, and S have been searched without finding a matcning 
 2 1 u 
 
 element. This is necessarily an error condition (ERR0R 2) since these 
 subroutines are called only if the symbol of the production under consid- 
 eration is a right-hand side which matches a symbol in the symbol table 
 which is also a rignt hand side. Thus a match must occur in one of these 
 three stacks. 
 
 -23- 
 
Subroutine ADJSl(n) 
 
 This subroutine is called when the n-th element of S is matched. 
 Since this implies that there are two productions of length one with identical 
 right-hand sides, both of these productions must be pushed into the stack S 
 of critical productions. Thus the n-th production in S, is popped from S.. 
 and pushed into S . The productions of S.. which were on top of the n-th 
 and thus also popped in this process, are pushed back into S , The contents 
 of PR0DN are pushed into S and the flag bit set since this is the end of 
 this sequence of critical productions. 
 
 Subroutine ADJSU(n) 
 
 This subroutine is called when the n-th production of S is 
 
 r u 
 
 matched. Since the search of S was made from the top down, all the pro- 
 ductions of S above the n-th are popped. Since the top element of the 
 stack is now the final production of this sequence of critical productions, 
 the flag bit signifying this on the top production of the stack is set 
 to zero and is reset on the new top element of the stack after PR0DN has 
 been pushed into S . Finally the productions of S which were previously 
 popped are pushed back into S and a return is made. 
 
 -24- 
 
3. DETERMINATION OF CONTEXTUAL CONDITIONS 
 
 In this second part of the translator generator system, the 
 contextual conditions necessary to solve the parsing and ambiguity problems 
 are obtained. Within given bounds, M and N, of right and left context 
 respectively, we set up successively, context compatible with the local 
 ambiguities determined from the simple production system obtained in the 
 first part. Note that if the production system is not uniquely determined 
 within the given bounds, it may be necessary to run the program again with 
 one or both of the bounds increased. The notation of the first part is 
 carried over into this section. 
 
 3.1 Basic Concepts 
 
 Since every CHOMSKY system can be embedded into a simple CHOMSKY 
 system such that the set of words for Z in the simple system, over the 
 alphabet of the original system, is the same as the set of words for Z 
 in the original system (see [1]), it is sufficient to solve the problems 
 raised in the introduction, for simple CHOMSKY systems only. Therefore, 
 in the following we shall deal only with simple CHOMSKY systems as defined 
 in definition 7. Note that this implies that 
 
 
 is a disjoint partition for Q( , 
 
 Simple CHOMSKY systems have the valuable advantage, that all 
 critical productions of n belong to the set 
 
 {B. :: - U.; j - 1,2, . . . m} 
 3 3 
 
 -25- 
 
and that the local ambiguities are exactly the right sides of the critical 
 productions which are contained in this set. This means especially that 
 the local ambiguities of simple CHOMSKY systems are all of length one. 
 
 A necessary condition for a string u e "fwu to be in y is that 
 u does not contain a substring XY of one of the following forms: 
 
 (a) X e {S 1 } , Y e ft 
 
 (b) X e Q& , Y e {T ± } 
 
 (c) X e {S i } , Y e {T } and J k 
 
 such that XY = ST 
 
 i.e., X,Y are not corresponding elements in the sets 
 {S.} , {T } respectively. 
 For then it is apparent that u does not belong to y. 
 
 In order to accomplish the determination of the contextual 
 conditions we consider a five-tuple associated with each critical or 
 undetermined production. Suppose C :: - V is a critical production. 
 Then the five-tuple associated with this production is denoted by 
 
 (a, V, e,w, C) 
 
 where a is the string of left context elements, 
 
 3 is the string of right context elements, 
 and W is an element such that a V 3 is a word for W; i.e., there 
 exists a sequence of productions in the production system 
 n, such that the string a V 3 may be reduced to W by making the 
 indicated replacements. 
 
 An examination of W will determine what the next context 
 
 element is since if W = S. for some i in { 1,2 .... I } then the next 
 
 l 
 
 context element will be the corresponding T and thus T. is added to the 
 
 -26- 
 
 , 
 
right context strings . On the other hand if W = T. for some i in 
 
 { 1,2, ...., £} then the next context elemenc is the corresponding element 
 
 S. and S. is added to the left context string a. 
 11 
 
 In order to determine whether a given critical production has 
 
 been determined by the context found at a given stage we need to define 
 
 the concept of compatibility. 
 
 Definition 8 (Compatibility) 
 
 Let a = A | i ... A. .A. 
 or 2 1 
 
 i _ 
 
 A '| a .| - vv 
 
 6 = B 1 B 2 — B |S| 
 
 S 1 » B.'B ' ... B f . . be strings. 
 12 | b'| 
 
 then we call (a, V, 6) compatibl e with (a', V', B') if and only if 
 
 (1) V - V 
 
 (2) A - A ' I i = 1,2, ... , min (jaj , |a* | )] 
 
 (3) B. - B. r [ j - 1,2, ... , min (|s| , |3'| )] 
 
 i.e., the two strings a V 8 and o.' V B' are compatible if their common 
 substring exhausts the smaller of the substrings of right context and 
 also the smaller of the substrings of left context. 
 
 -27- 
 
3.2 Implementation 
 
 3.2.1 Functional Description 
 
 3.2.1.1 Program Structure 
 
 The main program consists of three basic procedures called 
 ZETA, MU and SIGMA which are utilized at each step until the set of 
 undetermined productions vanishes . The bounds M, N of left and right 
 context respectively are used as global parameters which stop the itera- 
 tion if the set of undetermined productions has not vanished after 
 (M + N) steps. 
 
 The procedure ZETA decides whether the context set up so far 
 is sufficient for the determination of the critical productions. The 
 procedure MU essentially makes all possible replacements in the produc- 
 tions by considering only those productions of length one. The procedure 
 SIGMA obtains the context elements of the productions. 
 
 The procedure ZETA gives a partition of the set, G, of unde- 
 termined productions from the previous step into a determined part and 
 an undetermined part. For a given production in this set, we determine 
 whether there is different production (*) in the set which is compatible 
 with the given production. If such a production (*) is found, then if 
 W = W* the production (*) is placed in a subset G of G. If W fi W* 
 
 A 
 
 then the production (*) is placed in the set, G , of undetermined pro- 
 ductions. The set of determined productions are those elements of G 
 which are not in G . 
 
 The procedure SIGMA determines the context elements at a given 
 stage, by checking whether the element W is a member of the set of the 
 right-hand sides of the productions in the stack S„ of productions of 
 
 -28- 
 
length 2. If W = T. for some i, then we replace the string a by the string 
 S a, and replace W by a. . If W = S. for some j, then we replace the 
 string 6 by B T. and replace W by a . 
 
 The procedure MU operates on the element W of the five- tuple, 
 and determines whether W is a member of the set of right-hand sides of the 
 productions in S_, or is a member of the set A of elements which do not 
 appear on the right-hand side of any production. If not then an iteration 
 process is entered. First all possible reductions are made from all the 
 productions of length one, and then the resulting elements are again 
 checked to determine whether they are right-hand sides in S or in A . 
 If not, we then repeat the process. A global parameter K is used to stop 
 the process if the procedure is not satisfied after K steps. 
 
 The object of the whole process is to reduce the set of unde- 
 termined productions to the empty set. However, if this is not achieved 
 then one can either run the program again with an increased value for 
 one or all of the global parameters M, N, K; or one can apply the pro- 
 duction system as it is and accept the disadvantage of parsing strings 
 in ways which do not lead to valid constructions in the language. 
 
 -29- 
 
3.2.1.2 Data Structure 
 
 It is assumed that the global parameters M,N,K are given to 
 the system, as are the results of the first part. The context is built 
 up one element at a time (either left or right context) within the given 
 bounds M,N and controlled by the variable P. 
 
 In order that the bounds may be considered finite but unbounded, 
 the main idea of the implementation is that the context elements at each 
 stage are stored in one stack which is linked to the elements of the 
 preceding stacks in the sequence, in a cascading manner. The stacks in 
 this sequence are denoted by S(0), S(l), S(2), . .. S(P), S(P + 1). The 
 pointer for the stack S(j) is denoted by p(j) and the k-th element of the 
 stack S(j) is denoted by y, . Each element of every stack in the sequence 
 after S(0) (i.e., for S(j), 1 <_ j £ P + 1) is to be at least two bytes in 
 length, although extended where necessary to accomodate a larger internal 
 number (see first part). The flag bit on the first byte of each element 
 is called "FLAG 1" and the flag bit on the second byte is called "FLAG 2" . 
 
 When the context parameter has the value P, we are finding the 
 (P - l)st contextual element, and we have P + 2 stacks. 
 
 S(0) 
 
 S(l) 
 
 FLAG 1 FLAG 2 
 
 f k 
 
 S(P-2) 
 
 Last 
 
 P-2 
 
 S(P-l) 
 
 New 
 
 P-l 
 
 r k 
 
 S(P) 
 
 Old W 
 
 P 
 Y k 
 
 S(P+1) 
 
 New W 
 
 P+l 
 
 are the typical elements in each stack 
 
 -30- 
 
The first two stacks, S and S(O) have elements which are only 
 one byte in length but extended when necessary as described in the first 
 part. The first stack contains all the different elements V which are 
 the right-hand sides of the critical productions found in the first part. 
 The second stack S(0) contains all the left-hand sides of the critical 
 productions. How these stacks are linked will be described below. 
 
 The first context element found for each production is stored 
 in the stack S(l). If it is a left context element (i.e., y° = A ) then 
 FLAG 2 is not set, but if it is a right context element (i.e., y e = B..) 
 then FLAG 2 is set. 
 
 When a given production becomes determined, it is popped from 
 this system and the context for the remaining undetermined productions is 
 built up by building a new stack at each stage. When the context parameter 
 has the value P we have the following situation: 
 
 The last context element found for each production is 
 stored in the stack S(P - 2). During this iterative pass the 
 new context elements found are stored in S(P - 1) . With each 
 production is associated a five-tuple containing an element W, 
 and these W's found during the previous iterative pass are 
 stored in S(P) while the elements W being determined during 
 this iterative pass are stored successively in S(P + 1). 
 
 Now in the stack S of critical productions found in the first 
 
 u 
 
 part, for each right-hand side V , there exists at least one other pro- 
 duction in S with the same right-hand side — by definition of a critical 
 
 production. Thus conversely, associated with each distinct V, in S , 
 
 k u 
 
 there is a set of C.'s which contains at least two elements. Thus we 
 
 J 
 first push an element V 1 into the stack S and then push the set of 
 
 C.'s associated with it, into the stack S(0), setting FLAG 1 on the first 
 
 element of this set pushed into S(0). Thus to determine all the C.'s 
 
 associated with the first element of S we must search the stack S(0) from 
 
 v 
 
 -31- 
 
the bottom until we come to the second flag — at which point we have 
 
 a left-hand side of V~ , the second element of the stack S . In calling 
 
 2 v ° 
 
 the procedure MU each such C. could in itself give rise to such a set of 
 left-hand sides, so we have 'cascaded' the set of productions corresponding 
 to the right-hand sides V, . Each of these can have its own first context 
 element and so we push into S(l) the set of first context elements (the 
 first element of each set having FLAG 1 set on) corresponding to each 
 C.. This cascading process is continued throughout the sequence of 
 stacks of context elements. However, there is a one-to-one correspondence 
 between the stack of elements W and the last stack of context elements 
 at each stage. This means that there exists a one-one correspondence 
 between the stacks S(P) and S(P-2) and also between the stacks S(P + 1) 
 and S(P - 1). Thus the flags 1 and 2 have no significance in the stacks 
 S(P) and S(P + 1) and hence FLAG 1 in these stacks has been utilized to 
 denote that the flagged production is an element of the subset G of the 
 set G of undetermined productions at the previous stage as mentioned in 
 the previous section of this report. 
 
 -32- 
 
s<o) 
 
 s(D 
 
 S(P-2) 
 
 " — "" v 
 
 FLAG 2 implies context element v 
 belongs to tae right context string ii ^_ 
 
 S(P-l) S(P) 5(P+D 
 J 
 
 _ v 
 
 FLAG 1 implies the first element of a new 
 group 
 
 FLAG 1 implies 
 • production is in G 
 
 -33- 
 
Note that unless specifically denoted otherwise a PUSH or assign- 
 ment operation is assumed to apply only to the bits of the byte itself and 
 not to the flag bits as well. A pop operation of course will clear the 
 entire byte (or bytes) including the flag bits. 
 
 The elements of the left context, a, and the right context, 3, 
 
 are stored in stacks S and S. respectively. These stacks are built only 
 
 a 3 
 
 for the production under consideration. However in order to determine 
 compatibility we need two of each of these stacks — one for the given 
 production and one for the production being compared. We denote these 
 
 respectively by S , , S . and by S n1 , S n „. The k-th element of S , is 
 
 r J J al a2 J 31 32 al 
 
 denoted by a , and its pointer is denoted by p The k-th element of 
 S is denoted by a and its pointer by p „ . Similar notation holds 
 for S and S^. 
 
 -34- 
 
3-2.2 Operation Description 
 
 The Main Procedure CONDETN (M,N,K) 
 
 The system is initialized by setting up the stacks S and S(0) 
 
 as described above. In addition the left-hand sides of the critical 
 
 productions (i.e., the C 's) are also pushed into the stack S(2) as these 
 
 J 
 
 are the first W elements of the five-tuple associated with each production. 
 The counter GA is initialized to zero. This counter counts the number of 
 elements in the ambiguous class G. . Then setting the context parameter 
 P to 1 we recursively call the procedure ZETA, which, in this implementation, 
 itself calls the MU and SIGMA routines mentioned above. After calling 
 ZETA, if the stack of undetermined productions, S , is not empty then if 
 the P - 1 context elements found by this stage is less than the upper 
 bound of M +'N context elements, the parameter P is increased by one and 
 ZETA is called again. 
 
 The initialization also includes the extraction of the anti- 
 terminal symbols from the symbol tabled/ in order to facilitate checking 
 for the property of not appearing on the right-hand side of a production. 
 These elements are stored in a stack A whose i-th element is designated 
 by L. and whose pointer is designated by P y . 
 
 Subroutine ZETA (P) 
 
 The running variable for the stack S(j) is denoted by i(j). 
 Since the stacks are searched from the bottom up, the variables i(j) 
 for j =s 0, ..., P are initialized to zero. In order to facilitate re- 
 turning to the next production in the sequence after comparison operations 
 later in the program, the current values of each i(j) is saved in the 
 location q(j). Then a call is made to the routine LP which builds up the 
 
 context strings a, 8 for each production in the stacks S . and S n re- 
 
 K ^ al 61 
 
 spectively. In order to determine compatibility, for each production in 
 the main pass through the set of undetermined productions, we must compare 
 all other productions in this set. The 'switch' f is used to denote these 
 different passes -- f - 1 denotes the main pass and f = 2 denotes the pass 
 required to find a compatible production. 
 
 -35- 
 
During consideration of the possible context strings for a gi 
 production, we may find that a given product J- omes "determJ 
 the sense described above. In this case it is necessary to ?0? it from 
 the system of stacks and this condition is noted by setting the ' swit 
 D to 1. Thus, on return from the consideration of all possible context 
 strings for a given production in the procedure LP, if D = I the current 
 element of the stack S(0) must be popped and the FLAG 1 denoting the 
 first element of a new group adjusted according. The next production is 
 now considered (if the last one was popped, the running variable is not 
 increased by 1 since the next production is now in the position of the 
 last) and the process is repeated. Upon exnausting the stack S(0) 
 return to the main program and consider whether it is necessary to 
 repeat the process and find another context element. Note that if FLAG 
 1 is set on the next element of S(0) then we must now consider the next 
 element of S and so its running variable, k, is increased by I . If this 
 stack becomes exhausted (by implication before S(0; is exhausted), then 
 an error must have occurred in the organization of the I's in S(0) 
 and the ERROR call is made. 
 
 Subroutine LP (e,f,P) 
 
 This subroutine builds and keeps track of the stacks S „, 
 
 ^ af ' 
 
 S containing the elements of the strings Gc, (3 respectively for each 
 pi 
 production. It has parameters e, the running variable; f, the value of 
 
 which determines whether the production under consideration is one in 1 1 
 main pass or in the comparison pass; and of course the parameter ? de- 
 termining how many context elements have been found at this stage = 
 
 During the first iteration the parameter P has the value 2. 
 Since all productions are undetermined by definition during the first 
 iteration, we omit the procedure ZETA as described in section 3-3-I-l> 
 and call the subroutine MU directly. Since the variables m, n denote the 
 number of left and right context-elements respectively for each produc- 
 tion, these are set to zero during this first iteration, 
 
 -36- 
 
During other iterations we proceed through the stacks S( j ) 
 
 successively, using the current value e(j) of the running variable for 
 
 each stack, determining whether the value stored in each stack is a member 
 
 of a or (3 and so building up the stacks S and S appropriately. If 
 
 Q!I p I 
 
 the last context stack S(P-2) is exhausted, control is returned to the 
 calling point of this subroutine, while if any other stack is exhausted 
 we have an error situation in the FLAG l's as described in the previous 
 subroutine . 
 
 When all the context elements have been found for this pro- 
 duction at this stage (i.e., we have reached S(P-2), then a call is made 
 to CCMP 1 if this is the main pass, or to CCMP 2 if this is the comparison 
 pass. CCMP 1 initiates the comparison pass and CCMP 2 determines whether 
 there is a compatible production. On return from these subroutines, the 
 
 stacks S „, S „ are adjusted to be ready to accept next context elements 
 af pf 
 
 of the production. Note that because of the cascading nature of the 
 stacks described above we may have to pop a sequence of elements from 
 these stacks if the FLAG l's denoting a new group are set. If this 
 cascade reaches right down to S(0) then we have to return and adjust the 
 first two stacks in the subroutine ZETA-. 
 
 If the production is "determined" (i.e., D = l), then it must 
 be popped as in the subroutine ZETA. However since the possible context 
 strings for a production may have many elements in common (in the preceding 
 stacks S(j)), we pop only those stacks containing elements which are 
 unique i.e., in cascading back to S(o) we pop no further stacks after 
 finding one in which the next element is not a member of a new group . 
 The preceding elements of each stack must be replaced after each pop. 
 After completing the pops, we return to the process where we left off 
 after setting the "switch" D back to zero. This final pop is affected 
 by the short subroutine PR. 
 
 Subroutine CCMP 1 
 
 This subroutine initiates the comparison pass for a given 
 production by setting the "switch" f to 2 . The comparison pass is 
 designed to find a production compatible to the given one. This sub- 
 
 -37- 
 
routine achieves essentially those decisions described in procedure ZETA 
 in section 3.2.1.1. Since the definition of compatibility requires the 
 right-hand side V be the same, we need check only those productions 'cas- 
 cading' from one element V of the stack S . Thus we initialize the 
 running variables g(h) for h ■ 0, . ..., P-2 to the values they had on 
 beginning new groups when we began consideration of the new V, . We com- 
 pare all possible productions up to the next FLAG 1 in S(0) when we would 
 be beginning a new group corresponding to the next V. 
 
 The number of elements in the a, 3 stacks corresponding to the 
 given production are saved in variables m and u, n and v respectively for 
 use later, and then we again make a call on the subroutine LP to build up 
 the a and 3 strings for the productions being compared. 
 
 We noted in section 3.2.1.1 that the production is determined 
 if it is not undertermined and so after checking all possible compatible 
 productions and it is still not undetermined, it must be determined and 
 we call the subroutine DET to place it in the stack S of determined pro- 
 ductions, f is then set back to 1 and we proceed with consideration of 
 the next production in the main pass. 
 
 Subroutine DET 
 
 If the production is found to be "determined", this subroutine 
 pushes it into a stack S of determined productions and then sets D to 1 
 so that the offending production is then popped from the system. 
 
 In order to conserve storage, the 5-tuple corresponding to the 
 determined production is stored "vertically" in the stack S whose 
 elements are one byte in length, extended where necessary. The method 
 of storage in S is depicted on the figure. 
 
 -38- 
 

 
 % 
 
 
 
 
 • 
 
 • *. % « 
 
 
 
 
 
 
 Wi 
 
 
 
 
 
 
 • 
 
 • • • • 
 
 
 
 
 
 VA 
 
 
 
 i 
 
 Count for cc 
 
 A 
 
 a 
 
 V 
 
 A, 
 
 Count for p 
 
 - B 
 
 IP 
 
 P "I 
 
 ^■a 
 
 v 
 
 >P 
 
 w 
 c 
 
 (FLAG bit on W 
 denotes bit G„) 
 
 t 
 
 FLAG 
 Bit 
 
 All productions in S are pushed in this fixed format and so unstacking 
 will immediately produce the productions in the desired form viz: 
 
 W = A 
 
 a 
 
 , A 2 A X C B X B 2 
 
 B 
 
 IP 
 
 = A 
 
 a 
 
 ,A 2 A 1 V B X B 2 
 
 . . B 
 
 IP 
 
 -v- 
 
 We first push in C from S(0) and then W from S(P), We check 
 whether this production is a member of G. (denoted by flag 1 set on the 
 element W in S(P)) and if so call the subroutine DETA before popping the 
 W element of this production from the stack S(P). The subroutine DETA 
 determines whether this production is a valid member of G. and if so 
 increases the counter GA of the number of members in the ambiguous 
 
 class G. 
 
 A 
 
 Next we push into S , in order, the elements of right context, 
 
 the last found first and the first found last, from the stack S , followed 
 
 p 
 
 by the count for p i.e., the number of elements of right context. The count 
 is flagged, firstly to distinguish it from the internal numbers representing 
 
 -39- 
 
symbols and secondly to denote the end of the arbitrary string 0. V is 
 
 pushed in next followed by the elements of the left context a, the first 
 
 found element first and the last found element last, from the stack S , 
 
 al 
 
 followed by the count for a flagged as above. 
 
 Subroutine COMP 2 
 
 This subroutine determines whether this production in the 
 comparison pass is compatible with the given production in the main pass, 
 Since we have required V = V , recall that this means that 
 
 A ± - A i [i ■ 1, 2, »..., min (ja| J a i ) ] 
 
 B, - B * [j - 1, 2 S ...., min (|b|,|B | ) ] 
 
 Since we know these conditions were satisfied for the previous iteration, 
 we need only check that the last found context element satisfies these 
 conditions and so we compare the top elements of the stacks S s S from 
 each pass. If it is not compatible, we return and consider the next 
 production of the comparison pass, while if it is compatible, we check 
 that the left-hand sides, C 9 (found in S(0))are different and then if the 
 W elements (in S(P))are the same we flag the W element of the given 
 production to denote a member of G and return to the comparison pass, 
 else if the W elements are different we have found an undetermined 
 production and so make a call to the subroutine MU before initializing 
 the a and 3 stacks of the comparison pass and returning to the next 
 production of the main pass, 
 
 Subroutine MU 
 
 The subroutine MU places the W of the production being considered 
 in a dummy location Q and then calls the main subroutine, OP, on the 
 undetermined productions, The subroutine OP includes the procedure SIGMA 
 described in the first section. The subroutine OP operates on the W of 
 the production and determines whether i: is 'open' i e e , whether it is a 
 member of tne set of elements which are on the right-hand side of the 
 
 -40- 
 
productions In S„ or it is in A,^ (i.e», is not on the right-hand side of 
 any production) . If W is not open, it is returned with the flag bit set 
 which means a call must be made to NUITER which is the subroutine which 
 makes all possible replacements from the productions of length 1. Before 
 returning from MU we set FLAG 1 on the next available cell in S(P-l) 
 since the next context element placed in there will be the first element 
 of a new group as we will then be considering the next W element in S(P). 
 
 Subroutine OP(x) 
 
 This subroutine determines whether the argument, X, is open 
 (i.e., an element of a right-hand side of some production in S„) or does 
 not appear on any right-hand side. This latter case was the reason for 
 putting all these anti-terminal elements in a separate stack A . If X 
 is not open, we set the flag bit and return., However if X is a member 
 of a right-hand side of S_, we take this opportunity to determine the 
 next context element . The number of left and right context elements 
 found to date for this production have been saved by means of the variables 
 m, n respectively and so after checking that the addition of another 
 context element will not violate the given bound, the context element is 
 pushed into the new context stack S(P-l) and FLAG 2 is set if it is a 
 right context element. The left-hand side of the matched element in S„ 
 becomes the new W and so is pushed into the new W stack, S(P + 1) . The 
 next available cell of S(P-l) is then cleared of everything including 
 flags so that a FLAG 1 set to denote a W in G will not cause confusion 
 here. Note that by construction in part 1, all the right-hand sides of 
 S are different from each other and all other right-hand sides and so 
 having found one match we need look no further. 
 
 If the addition of a further context element would violate the 
 given bounds M or N of left and right context, then we can find no more 
 context within the given bounds. In this case the production is pushed 
 into a stack S of the same form as S described previously, and a de- 
 cision about what to do about such productions must be made upon completion 
 of the program. Clearly this problem could be eliminated by running the 
 program again with increased bounds, 
 
 -41- 
 
On the other hand if a match is obtained with an element of A , 
 
 then no further reductions can be made since this element does not appear 
 
 as the right-hand side of any production in our system. In this case we 
 
 first check that this element is not the distinguished element Z of 
 
 our language, and if it is not we push this production into a stack S , of 
 
 R 
 
 the same form as S , If the element is the distinguished element, Z, then 
 after checking that the addition of another context element would not 
 violate the bounds on right context, we add the symbol // as the next right 
 context element. The symbol # is the "string terminator" symbol and in 
 this context means that replacing it by a right element at the end of a 
 string is not possible without leading outside of Y. The modifications 
 to the stack when a new context element is added are carried out as 
 described above and in this case the new element W in the stack S(P + 1} 
 remains as Z . 
 
 Subroutine NUITER(Q) 
 
 This subroutine is an iterative procedure making all possible 
 reductions from the productions of length one if the argument Q is not 
 open. The iteration is bounded by the given global parameter K, and if 
 all the new W elements so produced are not open after K iterations, they 
 are pushed into the new W stack S(P + 1) with the corresponding new context 
 elements left blank and the process continues as before. 
 
 The iteration is controlled by the variable & . We initially 
 check Q against the right-hand sides of all the non-critical productions. 
 If no match is found, it is compared to the right-hand sides of the critical 
 productions. If no match is found here either, then we must have an error 
 situation since Q was not open and by construction of part one 
 
 01 - {S.} U {T.l U (u.) u {v k } U (A^ 
 
 is a disjoint partition of the alphabet Of . 
 
 If a match is found with U. for some j then Q is replaced by 
 the left-hand side of the matched production, a call is made to the 
 subroutine OP with this new Q as argument and the iteration proceeds. 
 
 -42- 
 
If a match is found with V. for some j, the same procedure is 
 followed except that now all possible left-hand sides associated with the 
 V are now used as replacements successively. Note that if a given V. 
 is not matched then the running variable g of the stack S(0) of left-hand 
 sides must be adjusted appropriately according to the scheme outlined 
 previously. Note also that this running variable g must also be a function 
 of the iteration variable 2,, and hence having found an open Q, we must 
 check that all the possible "left-hand sides" have been dealt with before 
 returning from this subroutine. This is the reason for the series of 
 tests before the return on page 9B of the flow-charts. 
 
 Subroutine DETA(X) 
 
 This subroutine determines whether or not the given production 
 is a valid member of the ambiguous set G . This requires that the 
 element W associated with this production is a word for the distinguished 
 element Z. All those productions for which W is not a word for Z may 
 be dropped from the set G . 
 
 The element W of this production is stored in the i (P)th 
 position of the stack S(P) and it is placed in the cell W in order to 
 carry out these operations. If W is the element Z, then the flag 1 
 associated with the W element in the stack S is set in order to denote 
 that this production is a member of the class G and the counter GA 
 (which counts the number of elements in the class G ) is increased by 
 
 A 
 
 1 before returning to DET(X) to store the remaining elements of this 
 
 production in S . But if W is not the element Z, we attempt to reduce 
 x 
 
 it to Z by finding a sequence of replacements from the stack S of 
 productions of length 1, since W itself is just of length 1. If no 
 further reductions can be found (the top of the stack S is reached 
 without finding a matching element for W) , then this production may 
 be dropped from the set G and we return to DET(X) as before. 
 
 •43- 
 
4. PURPOSIVE PARSER 
 4.1 Implementation 
 
 4.1.1 Functional Description 
 
 The production system constructed in the first two parts allows 
 the purposive parsing of strings; for this one uses a modification of the 
 NU procedure used before. The production system consists of those 
 productions stored in the stacks S and S_, which contain productions of 
 length one and two respectively and also those in the stack S of 'de- 
 termined' productions in which the sets of critical productions have been 
 'determined' or differentiated by context. 
 
 Let Z be the distinguished element which is given and suppose 
 the program which is to be scanned is read into the stack S with elements 
 X. and with pointer p . The program string is terminated by reading the 
 string terminator symbol, //, into the top of the stack. Assume the be- 
 ginning of the program is at the bottom of the stack, and we assume a 
 left to right scan. Since we do not assume a 'look back' facility and 
 in order to simplify the procedure, we made only one replacement of a 
 string element at each consideration of an element, it is necessary to 
 iterate the scanning passes through the program string, the iteration 
 being bounded by an arbitrary given parameter Q. 
 
 4.1.2 Program Description 
 
 The Scanning Procedure 
 
 We first initialize the iteration parameter q and the variable 
 i which counts the elements in the program string. Note that if the 
 program string x is 'open' (i.e., if X- e {T, } or Xi i e (S, } ) then 
 
 J. 1C x J K 
 
 x i y, i.e., the program is not a sentence in the language. Thus we first 
 check whether X- e {T } and if so go to EXIT 1 which is an error condi- 
 tion since such an element needs an element of {S, } in front of it in 
 
 k 
 
 order to be reduced. 
 
 -44- 
 
We next note that if x = u U v and U is an anti-terminal 
 element (i.e., 6 C{ consisting of elements which do not appear on the 
 right-hand side of any production) then x j. 7, since no further re- 
 ductions could then be made on this element and it could not thus be 
 reduced to the distinguished element Z. This is of course assuming 
 that Z is not the only element in the program string, in which case 
 we have finished. If, upon reaching the top of the stack S< excluding 
 the string terminator symbol^), and the single element is not Z or 
 there is more than one element left in the stack, then we increase 
 the iteration counter q by one and, provided this counter is less than 
 the iteration bound Q, we return to the start and repeat the process. 
 If the iteration bound has been reached, then we exit by EXIT 3 and must 
 re-evaluate the situation. It may be necessary to increase Q. 
 
 Having checked these initial conditions on the element X. of 
 to 1 
 
 the program string, we proceed to finding the appropriate replacement 
 rules in the procedure NU. X. is first checked against the right-hand 
 sides of the distinct productions of length one found in S. . If a 
 match is found, the appropriate replacement is made in the stack S and 
 we return to consideration of the next element in the program string 
 since the alphabet has been divided into disjoint sets as described in 
 the previous parts. 
 
 If no match is found in S , we next seek a match in S . The 
 method of storing these productions in S was explained in part two, and 
 an examination of this structure will show why it is necessary to examine 
 the flag bits on each element of the stack in order to jump over the 
 context strings a and p for each production. Since the first flag is 
 set on the count for the right context p, if a match is found and the 
 count of p is zero, we need only compare the left context oc. The number 
 of left context elements is again found by checking for the flag bit, but 
 note that if the count backwards, k, on the program string reaches zero, 
 then there are not enough left context elements in the program string and 
 this replacement rule fails. Otherwise we check each context element 
 against the corresponding element in the program string. If a non- 
 matching element is found, we return to the search of S but otherwise 
 
 -U5- 
 
on reaching the end of the left context string (denoted by another flag), 
 
 a valid context has been found. Thus, since for this value of j we hh 
 
 V. = Icul, and the count for 8 is obtained from the element in 3 with 
 J N 
 
 index j -V. -2. With this new value of the index j we finally obtain 
 
 J 
 the replacement symbol for this production from the element of S with 
 
 index j - V. -2 [compare the storage scheme for S in previous part]. 
 
 If the symbol X. was not, matched by a right-hand side of a 
 production in S , then we must jump over the left context of this pro- 
 duction in S , before returning to look for the next right-hand side to 
 compare . 
 
 If the count for 8 was not zero after a match for a right-hand 
 side has been found, then we must check also the right context (see 
 label F). If the count, i, on the program string plus the count for 
 8, V., is not less than the stack pointer p , then there are not enough 
 right context elements in the string for this replacement to be made, 
 and we resume the search on S . Otherwise we compare each right 
 context element against the corresponding symbol in the program string 
 as was done above for left context. If this comparison is successful, 
 the left context is compared as above. 
 
 If no match is found in S^, then we search for a match amongst 
 
 
 
 the right-hand sides of productions of length two. If the entire stack 
 S is searched without finding a match, then we have an error exit at 
 EXIT 2 since we should have found a match in one of these stacks due to 
 the partition of the alphabet into disjoint classes. If the element X. 
 matches a member of the class (T ), then we return to the beginning and 
 consider the next element of the program string . This is because the 
 previous element may have been replaced to become an element of the 
 class [S }, and we are not assuming a 'look back' property. This re- 
 
 K. 
 
 placement would be made on the next scanning pass through the program 
 string. If, however, a match is found with a member of the class (S }, 
 then after checking that X. is not the last element of the program 
 string, we check that the following element of the program string is 
 the corresponding member of this production in S . Note if X. is the 
 last element of the string, then the string is open and so we go to error 
 
 -1+6- 
 
exit EXIT 1, since we cannot obtain the corresponding T that this final 
 S, needs in order to made a replacement „ Otherwise, if the corresponding 
 T, does not match the next element of the program string, then it must 
 be a member of one of the other disjoint classes of the alphabet and so 
 we return to the beginning and consider this next element of the program 
 string. However if it does match, then the two elements are replaced by 
 the left-hand side of this production in S ? and then remaining elements 
 of the program string in S are pushed down one space, in turn, before 
 a return is made to the beginning to consider the next element in S . 
 
 As mentioned previously, when the top of the stack S is 
 reached, if there is more than one element remaining in the stack or if 
 the single element is not the distinguished element Z, then the scanning 
 counter q is increased by one and another scanning pass is initiated 
 provided the scanning counter is less than the iteration bound Q. If 
 the single remaining element in S is the distinguished element Z, then 
 we have achieved our goal of reducing the program by successive appli- 
 cations of the productions of the system. 
 
 -47- 
 
5. CONCLUDING REMARKS 
 
 5.1 Consequences of the Parsing Algorithm 
 
 In order to determine to what extent we have solved the ambiguity 
 problem and to what extent this parsing algorithm is really purposive, we 
 have to discuss the following cases: 
 
 (a) S - and G - 
 
 (J x J 
 
 Recall from 3.2,1„1 that G was a subset of the set, G, of 
 
 A 
 
 undetermined productions at a given stage . An element of G has the 
 
 A 
 
 property that if (a, V, 8, W, C) is the 5-tuple associated with this 
 production, then a V 8 is ambiguous* in the following sense: 
 
 3 i, j such that 
 
 C, :: = V c. :: - V. 
 
 i i j J 
 
 are productions in the production system II such that 
 
 In this case, for any u e y (i.e„, u a word for Z) , the parsing 
 algorithm gives, successively, strings u„ (i = 1, 2, .=.,, n) such that 
 u = u, , u - Z, u. p u..,. While running, the main program does not 
 
 In 1 1+1 ° r o 
 
 consider tentatively other sequences v. which do not lead to Z. 
 
 The ambiguity problem is solved since E is unique. This is 
 because it does not contain ambiguities in the sense described above, 
 which are necessary for the existence of ambiguities (see definition 6) 
 
 -A8- 
 
(b) S - , G A 4 
 
 u A 
 
 In this case we have ambiguities in the sense described in 
 case (a) above . Such W have to be checked whether they are words for Z, 
 i,e., W p Z, All elements of G for which the corresponding W is not a 
 word for Z can be dropped from G , as is done in the procedure DET in 
 the second part of this system, 
 
 The production system is ambiguous if and only if G , thus 
 
 A 
 
 reduced, is not empty. 
 
 This means that the parsing algorithm should sometimes consider 
 more than one sequence {u.}, all of which lead to Z and therefore are of 
 interest. However at the present time, the algorithm has been set up 
 to consider just one sequence, namely that obtained by using the first 
 matched production found, and the notation is given that the production 
 system is ambiguous. This decision will be reviewed in the light of 
 practical experience with the system but there would seem to be some 
 doubt as to the practical utility of a system which is ambiguous in this 
 sense. 
 
 (c) S 4 
 
 U 
 
 In this case the ambiguity problem is not solved. It may 
 possibly be solved by increasing any or all of the bounds, M, N, K 
 (See 3.2.1.1) 
 
 For the parsing problem, the comments under (a) and (b) above 
 hold except that we generate terms u„ which contain a right side of a 
 production which is not in S , ie., a production in either S.. , S~ or S . 
 All the substrings u. which are words for Z and which do not contain a 
 substring which is a right side of a production in S , S ? or S but 
 which contain a substring which is a right side of a production in S , 
 must be of the following form: 
 
 -49- 
 
(i) u contains a substring v zy ((X^Ol ) with |x| >_ m, 
 where m is the length of the right side of the critical production assoc- 
 iated with the locally ambiguous string x, contained in the substring v. 
 (See definition 5) i.e., v contains a locally ambiguous substring x 
 which implies from the definition that the right side of a corresponding 
 critical production must be a substring of i.e., m <_ |x|. 
 
 (ii) u contains a substring v 1 such that there exists a sequence 
 {v.} (i = 1, 2, ... , t) , v. = v\ v f Z 
 
 i,e , there exist sequences which lead outside of y, the set of words for Z 
 By increasing K and/or M and/or N, both of these situations become rarer-. 
 
 If one of these situations occurs during the parsing of a 
 string one has two choices: 
 
 (i) One can run the main program again with one or more of 
 the bounds M, N, K increased, and thereby improve the resulting production 
 system, 
 
 (ii) One can tentatively apply all the corresponding productions 
 of S and accept the disadvantage of following sequences which lead 
 outside of y i.e., do not lead to the desired distinguished element Z„ 
 
 5.2 The Semantics 
 
 The next stage in the system is to adjoin the semantics to 
 actually compile the code for the constructs recognized by the parsing 
 algorithm. One method of doing this would be to attempt to match the 
 semantics associated with the definition of the language in BNF, to the 
 rather expanded set of productions produced by the main program. However 
 it has been suggested [2] that an easier method would be to run each right 
 side of the original definition of the language, through this system and 
 hence obtain a definitive context for each of these. Then we use these 
 original productions (with context) purposively in the scanner. In this 
 form the semantics may be adjoined in their original form and we have also 
 eliminated all the "created characters" used to "determine" the production 
 system, However this will have to be the object of further study. 
 
 •50- 
 
REFERENCES 
 
 [1] Eickel, J. and M. Paul, "The Parsing and Ambiguity Problem for 
 Chomsky-Languages," TH Munchen, Bericht Nr. 6409 , (August 1964) 
 
 [2] Paul, M. , Private Communication 
 
 -51- 
 
( START j_ 
 
 READ IN A 
 
 NEW 
 
 PRODUCTION 
 
 CLEAR 
 PR0DN' 
 
 n n+l 
 
 CALL 
 RHO (j) 
 
 MAIN PROCEDURE: SIMPLE 
 
 
 CREATE A 
 NEW SYMBOL 
 
 H 
 
 
 TEMPORARILY 
 STORE THE 
 ZERO ELEMENT 
 OF ' PRODN ' 
 
 
 a — H 
 
 
 STORE 
 
 ' PRODN ' 
 
 TEMPORARILY 
 
 
 
 
 
 
 
 
 
 
 
 
 A 
 
 i 
 
 
 
 
 
 
 
 RESTORE THE 
 
 CONTENTS OF 
 
 'PRODN' 
 
 
 S — a 
 
 
 RESTORE THE 
 
 ZEROTH 
 ELEMENT OF 
 
 'PR0DN' 
 
 
 CALL \_ 
 
 
 . E '"' J 
 
 
 
 
 
 
 
 
 
 
 
 1 
 
 
 * 
 
 
 
 THIRD SYMBOL OF 
 PRODUCTION BEING 
 
 A\ dctiidmU 
 
 
 CONSIDERED 
 BECOMES THE S 
 
 NOW 
 ECOND. 
 
 
 
 
 SUBROUTINE: RHO (n) 
 FIGURE IA 
 
 52 
 
ENTER n th SYMBOL 
 
 INTO SYMBOL TABLE 
 
 A IN THE NEXT 
 
 AVAILABLE LOCATION 
 
 i + l 
 
 I 
 
 PLACE p a (CONVERTED 
 IF>25U) IN THE 
 
 n th POSITION IN 
 THE CELL 'PRODN' 
 
 2j + l 
 
 SET FLAG 
 TO ZERO 
 
 SET FLAG BIT ON 
 
 FIRST BYTE OF TOP 
 
 ELEMENT IN THE 
 
 SYMBOL TABLE A 
 
 P a — 
 
 V 
 
 PLACE i (CONVERTED 
 IF > 25U) IN THE 
 
 n th POSITION IN 
 THE CELL 'PR0DN' 
 
 SUBROUTINE: C0MPAR (n,j) 
 FIGURE 2A 
 
 53 
 
PUSH 
 'PRODN' 
 INTO S, 
 
 OF S„ 
 
 CALL 
 REPL2 (S,T.) 
 
 f CALL 1 
 
 SUBROUTINE: E(n) [PAGE I] 
 FIGURE 3 A 
 
 ^ 
 
_E1 
 
 3A 
 
 CALL 
 REPL 2 (S, Sj) 
 
 CALL 
 REPL 2 (S, Tj 
 
 n 2 
 
 POP THE 
 
 TOP 
 
 PRODUCTION 
 
 OF S 2 
 
 CALL 
 REPL 4 ( i ) 
 
 SUBROUTINE: E (n) [PAGE 2] 
 FIGURE 4A 
 
 55 : 
 
CALL 
 REPL 3 (Y, X) 
 
 PUSH 
 ' PRODN ' 
 INTO S, 
 
 CREATE 
 
 NEW 
 
 SYMBOL 
 
 H 
 
 X — H 
 
 a — H 
 
 PUSH ' PR0DN ' 
 
 INTO S u AND 
 
 SET FLAG 
 
 BIT 
 
 SUBROUTINE: REPL 2 (Y,X) 
 
 REPL 3 \ 
 
 
 CREATE 
 
 NEW 
 
 SYMBOL 
 
 G 
 
 
 Y - 
 
 - G 
 
 
 PUSH 
 ' PRODN ' 
 INTO S 2 
 
 
 a — 
 
 G 
 
 (Y,X) J 
 
 
 
 
 
 S — x 
 CLEAR T 
 
 
 
 
 
 
 
 
 SUBROUTINE: REPL 3 (Y,X) 
 
 (REPL4(h) \ 
 
 CREATE 
 
 NEW 
 
 SYMBOL 
 
 H 
 
 
 S - H 
 CLEAR T 
 
 
 PUSH 
 ' PRODN ' 
 INTO Sy 
 
 
 a — a n 
 
 
 
 
 
 
 
 1 
 
 1 
 
 
 
 
 ^/[RETURN j\ 
 
 
 
 PUSH 'PRODN' 
 INTO S u AND 
 
 SET THE 
 FLAG BIT 
 
 a n _ H 
 
 
 
 
 
 SUBROUTINE: REPL 4 (n) 
 FIGURE 5A 
 
 56 
 
SUBROUTINE: SS I (Y) 
 
 ( ADJSU (n)j- 
 
 POP n th 
 PRODUCTION OF 
 S| AND PUSH 
 IT INTO S„ 
 
 PUSH PRECEDING 
 PRODUCTIONS 
 
 OF S| BACK 
 INTO Si 
 
 SUBROUTINE: ADJSI (n) 
 
 POP (n+l) st 
 
 PRODUCTION 
 
 OF S„ 
 
 SET FLAG BIT 
 
 TO ZERO ON 
 
 TOP ELEMENT 
 
 OF STACK S u 
 
 PUSH 'PR0DN' 
 
 INTO S u AND 
 
 SET THE 
 
 FLAG BIT 
 
 PUSH 'PR0DN' 
 
 INTO S u AND 
 
 SET THE 
 
 FLAG BIT 
 
 PUSH PRECEDING 
 PRODUCTIONS 
 OF S u BACK 
 INTO. S„ 
 
 SUBROUTINE: ADJSU (n) 
 FIGURE 6A 
 
 57 
 
START 
 
 k — k-l 4 
 
 PUSH C k 
 
 INTO S(O) 
 WITHOUT FLAG BIT 
 
 PUSH C k 
 
 INTO S(2) 
 WITHOUT FLAG BIT 
 
 i — u -I 
 
 GA — 
 
 P — 
 
 ■* P — P+l 
 
 CALL 
 ZETA (P) 
 
 MAIN PROCEDURE: CONDETN (M,N,K) 
 FIGURE IB 
 
 58 
 
k —I 
 
 -*T* 
 
 j — j+l 
 
 » j - 
 
 j — j*l 
 
 i(j) — I 
 
 CALL 
 
 REPLACE 
 
 PRECEDING 
 
 ELB4ENTS OF 
 
 POP v k 
 
 SUBROUTINE: ZETA (P) 
 FIGURE 2B 
 
 59 
 
j — jfl 
 
 ISABEL n 
 
 UB 
 
 POP TOP 
 ELEMENT 
 
 OF y 
 
 POP TOP 
 ELEMENT 
 OF Stf 
 
 'LABEL 3 
 
 SUBROUTINE: LP (e,f,p) [PAGE I] 
 FIGURE 3B 
 
 60 
 
'LABEL 4 
 
 3B 
 
 j _ j -I 
 
 REPLACE 
 
 PRECEDING 
 
 ELEMENTS 
 
 OF S(j) 
 
 POP r. 
 
 e(j) 
 
 SUBROUTINE: LP (e.f.p) [PAGE 2] 
 
 POP X i(j) 
 
 FROM THE 
 
 STACK S(j) 
 
 REPLACE 
 
 PRECEDING 
 
 ELEMENTS 
 
 OF S(j) 
 
 SUBROUTINE: PR ( X i ( j > > 
 FIGURE 4B 
 
 61 
 
u — -p - 1 
 
 1 
 
 •" -"Pol - 1 
 
 
 v — -p - 1 
 n -^ -l 
 
 
 h — 
 
 2 
 
 
 
 
 
 0(h) - q (h) 
 
 
 
 ii ' 
 
 
 
 
 
 
 
 
 
 h — h+l 
 
 CALL 
 LP (o.f.P) 
 
 u — u + 1 
 
 0(0) - o(o)+i 
 
 CALL 
 DET (D) 
 
 SUBROUTINE: COMP I (P) 
 
 PUSH X ',(ol 
 INTO S» 
 
 PUSH (U a| -I) 
 
 INTO S x AND 
 SET FLAG I 
 
 D - 1 
 
 PUSH r |(pJ 
 INTO S x 
 
 PUSH (ftg, -I 
 INTO S x AND 
 SET FLAG I 
 
 PUSH V K 
 INTO S x 
 
 PUSH 0„ 
 INTO S x 
 
 ' 
 
 ) 
 
 v — v - I 
 
 SUBROUTINE: DET (X) 
 FIGURE 5B 
 
 62 
 
a2 
 
 — I 
 
 *02 *" 
 
 » f 
 
 SUBROUTINE: COMP 2 
 FIGURE 6B 
 
 63 
 
on vp+d 
 
 PUSH a; 
 
 INTO S(P+I) 
 
 SET FLAG 2 
 P-l 
 
 ON r. 
 
 p(P+D 
 
 PUSH Z 
 INTO S(P+i; 
 
 CLEAR y p(p+|) 
 
 INCLUDING 
 
 FLAGS 
 
 SUBROUTINE OP (X) 
 FIGURE 7B 
 
 61+ 
 
•> £ — o 
 
 0(1) - v(o) 
 
 Jl — £+1 
 
 j - -p I - 1 
 
 I 
 
 j - j-l 
 
 SET ON >♦ 
 
 PUSH 
 INTO S(P+I) 
 
 P-l 
 CLEAR y P (Pt " 
 
 INCLUDING 
 FLAGS 
 
 — b, 
 
 CALL 
 OP (0) 
 
 J — 
 
 i(£) - I 
 
 j - j + i 
 
 »(( ERROR J] 
 
 0(£) _ g(j?M 
 
 SUBROUTINE: NUITER [PAGE 
 FIGURE 8B 
 
 - X 
 
 WITHOUT 
 FLAG BIT 
 
 <H& 
 
 65 
 
SUBROUTINE: NUITER [PAGE 2] 
 FIGURE 9B 
 
 66 
 
f DETA (X) V 
 
 I (P) 
 
 SET FLAG I 
 
 ON TOP 
 
 ELEMENT OF 
 
 S Y 
 
 r — 
 
 r — r+l 
 
 SA — GA+I 
 
 W — b r 
 
 SUBROUTINE: DETA (X) 
 FIGURE IOB 
 
 67 
 
I START h 
 
 q — 
 i — 
 
 j ■— j-l 
 
 J 
 
 q — q+l 
 
 i + l 
 
 j — -P2" 1 
 
 j —j-l 
 
 J ~- -0 | — I 
 
 PROCEDURE: SCANNER [PAGE I] 
 FIGURE IC 
 
 68 
 
LABEL B 
 
 IC 
 
 » j — o 
 
 J — J- 1 
 
 j - J+2 
 
 Y 
 
 
 
 
 J J + > 
 
 
 \ 
 
 ' 
 
 
 k — k-l 
 
 V fc 
 
 
 
 
 j —j »• 
 
 J 
 
 
 V 
 
 
 j — J-Vj-2 
 
 
 1 
 
 ' 
 
 
 X. — V. 
 
 1 J 
 
 PROCEDURE: SCANNER [PAGE 2] 
 FIGURE 2C 
 
 69 
 
( LABa E V_ 
 
 j — o 
 
 
 
 j — j + l 
 
 »<^ J=P2' 
 
 K*J 
 
 t\ " 
 
 
 
 
 4— 
 
 
 
 = "J 
 
 i <?* -2? 
 
 » k — i+l 
 
 x k — X k+1 
 
 k — k+l 
 
 PROCEDURE: SCANNER [PAGE 3] 
 FIGURE 3C 
 
 TO 
 
PROCEDURE: SCANNER [PAGE U] 
 FIGURE UC 
 
 71 
 

 
JU/f 
 
 *9 
 
 %