1 I E> R.AR.Y OF THE U N IVLR.5ITY OF ILLINOIS 510.84 IZ6r no. 237-242 cop. 2 The person charging this material is re- sponsible for its return on or before the Latest Date stamped below. Theft, mutilation, and underlining of books are reasons for disciplinary action and may result in dismissal from the University. University of Illinois Library * IB W71 4 2 mm DEC L161— O-1096 Digitized by the Internet Archive in 2013 http://archive.org/details/eolreport241luka 5/0. W Report No. 2*kL September X, ±9&1 )yv*sCtt EOL REPORT by Leon Lukaszewicz and Jurg Nievergelt UrtfVkHSlTY OF ILLINOIS '5*1968 LIBRARY Report No. 2^1 EOL REPORT* by Leon Lukaszewicz and Jurg Nievergelt September 1, 19^7 Department of Computer Science University of Illinois Urbana, Illinois 6l801 * This work was supported in part by the National Science Foundation under Grant No. GV-k6^G. ACKNOWLEDGEMENT EOL has been implemented on an IBM 709^- by Freda S. Fischer, M. Irwin-Zarecki, J. R. Sidlo, D. S. Edgar, and R. Sanders. Their active participation in the project is gratefully appreciated. 111 TABLE OF CONTENTS Page ACKNOWLEDGEMENT iii 1. INTRODUCTION 1 1.1. Purpose and history of EOL 1 1.2. The EOL Computer 2 1.2.1. Input and Output 2 1.2.2. Stacks 4 1.2.3. Instruction Address Stack 5 1.2.4. Logical Register H 5 1.2.5. Files ......... 6 1.3. General structure of EOL programs . . . 7 1.4. Formal description of syntax 8 1.5* Reference language and its representation .... 9 1.6. Notation for the values of variables 10 2. BASIC SYMBOLS 11 2.1. Characters 11 2.1.1. Letters 11 2.1.2. Digits 11 2.1.3. Delimiters 11 2.1.4. Precedence of characters 12 2.2. Markers 12 3- VARIABLES AND THEIR VALUES 13 3.1. Inputs. 13 3-2. Outputs 13 3-3- Stacks. l4 3.3.1. Words 14 3.3.1.1. Precedence of words 14 3.3.2. Numbers Ik 3.3.3. Address 15 3.3.3.1' Instruction address 15 3.3.3.2. Pointer position 15 3.3.4. Examples of stack values 15 3.3.5. Instruction address stack 15 3.4. Files 16 3.5. Logical values l6 3.6. Initial values l6 4. GENERAL FORMAT OF INSTRUCTIONS IT 4.1. Operators 17 4.2. Arguments IT 4.2.1. Literals 17 4.2.2. Integers 18 IV TABLE OF CONTENTS (Continued) Page 4.2.3. Indexed mode symbols 18 4.2.3.1. Intermediate list 18 4. 2. 3. 2. Mode symbols 19 4.2.3.2.1. Mode of reading . . 19 4.2.3.2.2. Mode of adding. . . 20 4.2.3.2.3. Mode of testing . . 20 4.2.3.3. Indexes 21 4.2.3.4. Examples 21 4.2.4. Tests 22 4.2.4.1. Number of steps 22 4.2.4.2. Constituent test. . . 22 4.2.4.2.1. Class ....... 22 4.2.4.2.2. Word test operator. 23 4.2.4.2.3. Number test operator. 24 4.2.4.2.4. Constituent compared 24 4.2.4.2.5. Address test. ... 25 4.2.4.3. Record test 25 4.2.4.4. Examples. 25 4.2.5. Labels. ...... 26 4.3. Termination of reading a list 26 4.3.1. Reading an input or a stack 26 4.3.2. Reading a file 27 4.4. Instruction conventions 28 4.4.1. Missing second argument 28 4.4.2. Testing with an empty stack 28 4.4.3. Undefined result. . . 29 5. INSTRUCTIONS 30 5.1. Stack instructions 30 5.1.1. Move 30 5.1.2. Exchange 31 5.1.3. Set into stack 32 5.1.4. Clear stack 32 5.1.5. Test stack 33 5-1.6. Arithmetic instructions 34 5.I.T. Convert to word 35 5.1.8. Convert to number 36 5.1.9. Split 37 5.1.10. Compress 38 5.1.11. Count 38 5.2. Input instructions 39 5.2.1. Read 39 5.2.2. Clear input . ..... 40 5.2.3. Test input 4l v TABLE OF CONTENTS (Continued) Page 5. 3- Output instructions 42 5.3.1. Write 42 5.3.2. Set output 1^2 5.3.3. Output format instructions 43 5.4. File instructions 44 5.4.1. Put .' 44 5.4.2. Get 45 5.4.3. Save h6 5.4.4. Restore 46 5.4.5. Set into file 47 5.4.6. Clear file 47 5.4.7. Test file 48 5.4.8. Shift 49 5.4.9. Shift and count 50 5.4.ia Copy 51 5.5. Jump instructions 51 5-5.1. Simple jump 52 5.5.2. Conditional jump 52 5.5.3. Switch jump 53 5.5-4. Return 54 5.5.5. Stop 55 5.6. Table instructions 55 5.6.1. Search 55 5.6.2. Search and count 56 5.6.3. Extract 57 5.7. Macro instructions 58 5.8. Code 59 6. DECLARATIONS 60 6.1. Switches 60 6.1.1. Name switch 60 6.1.2. Index switch . 60 6.2. Tables 6l 6.2.1. Word table 6l 6.2.2. Number table 6l 6.3. Entry declaration 6l 6.4. External declaration 6l 6.5. Index declaration 62 7- PROGRAMS, PROCEDURES, MACROS 63 7-1. Programs 63 7.2. Procedure and statement 63 7.2.1. Comments 64 7.3. Macro definition 64 VI TABLE OF CONTENTS (Continued) Page 7-4. Block structure and the scope of declarations . . 66 7.4.1. Procedures: external and internal. ... 66 7.4.2. Declaration of labels 67 7.4.3- Scope of a declaration of a label .... 67 7.4.4. Scope of an index declaration 68 7.4.5. Use of labels and formal indexes 68 7.4.6. Example 69 7«5« Flow of control in a program 69 BIBLIOGRAPHY 71 APPENDIX I. SUMMARY OF EOL SYNTAX 72 APPENDIX II. INDEX OF TERMS 78 Vll 1 . INTRODUCTION 1.1. Purpose and history of EOL EOL is a low-level language for manipulating strings of characters. The following considerations guided the design of the EOL language. (a) It should be relatively easy to learn. (b) It should be easily implemented on most computers. (c) EOL programs should run efficiently. (d) It should be well suited for two of the most important applications of string manipulation languages, namely: (i) compiler writing (ii) symbolic and algebraic manipulation Concerning (a), EOL can be considered "to be the assembly language of the fictitious EOL computer (to be described in Section 1.2.), which has some fifty basic instructions and a very simple general structure. Concerning (b), the EOL system consists of a compiler (see [8]), written in EOL, and an interpreter (see [9])> documented in a language XI (see [10]) which is close to the assembly language of many modern computers . Concerning (c), the data structure of EOL is such that on a binary, word oriented computer (for which EOL is mainly intended), operations on strings are done word by word, not character by character. A second feature which enhances the efficiency of EOL is the possibility to call procedures written in machine language from EOL programs. Concerning (d), reference [7] gives some examples of EOL programs which are discussed in detail. These examples are also intended to help the reader learn EOL. Another example of an EOL application is the EOL compiler itself (see [8]). Many concepts used in EOL were taken from other symbol manipulation languages, in particular IPL-V [l] and COMIT [2]. The basic features of EOL first appeared in [5]. Two previous versions, EOL-1 and EOL-2 have been developed and implemented at the Institute for Mathematical Machines in Warsaw, Poland, beginning in I965. On this basis, an improved EOL-3 has been developed and implemented on an IBM 709^ computer at the University of Illinois. 1.2. The EOL Computer The EOL language can be described as the programming language of the hypothetical EOL computer, shown in Figure 1. The EOL computer is intended to be simulated on a conventional computer and is therefore a pure software system. Binary fixed length word computers are best suited for simulation of the EOL computer. 1.2.1. Input and Output The EOL computer is equipped with a number of inputs, identified by the variables 10, II, ..., la and with outputs identified by variables Q0, Ql, ..., Qp In practice, inputs and outputs correspond to such devices as card readers, line printers, or magnetic tapes. Therefore, the numbers (X and 3 are usually small integers. Input and output data is represented in the form of strings of arbitrary characters, like letters, digits, arithmetic operators, etc. (see sections 3«1« and 3.2.). EOL COMPUTER PO PI P6 10 la Files QO QP Inputs Outputs Figure 1 The input data can be read under EOL program control, character by character, and each character read is deleted from the input string. Strings of consecutive characters read from an input string can be assembled into larger units called words, which may consist of an arbitrary number of characters. Input strings can be tested under program control for occurrence of an arbitrarily specified character pattern. Due to this, no limitations are imposed upon the form of the input data so that the user may choose the format most convenient for his purpose. Similarly, the output strings are constructed under program control and presented to an output by adding new characters at the end of the indicated output string. 1.2.2. Stacks Internal processing in the EOL computer is done mostly on a number of stacks, identifed by the variables SO, SI, ..., Sy The number 7 depends on the particular implementation, and typically is equal to 31 • Stacks are linear lists of units called constituents, which are strings of characters preceded by one of the marks or or /N . These marks are not part of the EOL character set, but serve to indicate the structure of a list and the types of its constituents. A constituent may be of the type word, number, or address. A word consists of a string of arbitrary characters, preceded by the mark A number is a freely chosen string of digits, possibly with an algebraic sign and preceded by the mark . Numbers are usually identified with the fixed point numbers available on the particular computer on which EOL is implemented. They permit one to perform the usual arithmetic operations on integers. An address is represented by any string of characters preceded by the mark /s . Addresses are usually identified with addresses available on the particular computer on which EOL is implemented. They are used to identify (a) a record in a file (b) a position of the pointer in a file (c) an instruction in an EOL program The syntax of stack values is described in section 3- 3 = In the EOL language there are many instructions which permit flexible processing of stacks. For example, (a) Moving a determined number of constituents from the beginning of one stack to the beginning or to the end of another stack, the original order of constituents being kept or reversed. (b) Compressing several words into one word or splitting one word into separate words of one character each. (c) Testing whether the initial or the final constituent of a stack is equal to a given word or number. In an EOL implementation, stacks are held in the main memory as lists composed of pairs of computer words. Storage allocation for stacks including retrieval of abandoned pairs for reuse proceeds dynamically and automatically, using a "garbage collection" technique. 1.2. 3- Instruction Address Stack A special instruction address stack IAS is used to hold return addresses of subroutine calls. 1.2.^. Logical Register H A one bit register H is used to hold the outcome (success or failure) of tests. H can be tested by conditional transfer instruc- tions. The two possible values of the variable H are denoted by "plus" and "minus". A successful test leaves the value of H unchanged, an unsuccessful one sets it to "minus". Thus the effect of several consecutive test instructions is that H will be "minus" if at least one of the tests failed (logical OR). All conditional transfer instructions automatically reset H to "plus", whether or not the condition is satisfied. 1.2.5. Files In the EOL computer mass storage is provided by files, identified by the variables. PO, PI, ..., Po. Usually, the files PO and PI are held in core storage, and the others correspond to such back-up storage devices as discs, drums, or magnetic tapes . Each of the files represents a linear list composed of any number of records. A normal record consists of a string of constituents, preceded by the mark v. The entire content of a stack or part of it may be stored as a normal record in a file. A special record consists of a mark v followed by the symbol a. It is used to denote the boundary between different sections of a file. In every file there is exactly one pointer t which indicates some place or some record in the file. Reading from or writing into a file always concerns the record which follows directly after the pointer. The syntax of file values is described in section 3«^« In the EOL language there are many instructions which permit one to process files, for example: (a) Shift the pointer forward until it is positioned immediately before the first-record that satis if ies a specified condition. (b) Store the address of the record which is placed directly after the pointer as the initial constitutent of an indicated stack. (c) Place the pointer of a file directly in front of the record whose address is the initial constituent of an indicated stack. The last two operations permit one to manipulate addresses of records by means of instructions designed to process stacks. This facility is very important in many applications where an effective search for information depends on the data structure in a file. Files are designed to store a large amount of information. For example., in program translation they may serve for storing tables of identifiers or processed programs. 1.3» General structure of EOL programs An EOL program consists of a sequence of macro definitions followed by a sequence of external procedures. Macro definitions can be included in external procedures at compile time. A macro definition alone or an external procedure along with all macro definitions which are to be included in this external procedure are units which can be compiled separately (independent of any other macro definition or external procedure). Both a macro definition and a procedure consists of a heading (which includes; among other things ; the symbols MACRO or PROC); a procedure body, and a termination symbol END. A procedure body is a sequence of statements; and a statement in turn may be : (a) An instruction; which usually causes the execution of a simple action such as a change of the value of a variable, However; a macro instruction; which causes the insertion at compile time of a copy (possibly modified by the replacement of actual for formal parameters) of a macro 7 definition in place of this macro instruction, may be arbitrarily complex. (b) A procedure. The fact that a procedure may contain statements which are again procedures gives rise to the nested block structure of EOL. (c) Declarations, used to declare switches, tables of constants (numbers or literals), formal indexes, or labels with a special function (external labels or entry points into a procedure). (d) Comments, which have no effect on the execution of programs . Execution of an EOL program begins with the first instruction of the procedure declared to be MAIN (see 7-2. ), and proceeds to the instruction written next, unless the present instruction (a jump) or the statement written next (a non executable statement) explicitly state otherwise. In the EOL language programs as well as input and output data have the form of strings of characters. In particular any EOL program may be the input to or the result of another program written in this language. l'^« Formal description of syntax This report uses the Backus notation (which was designed for the description of ALGOL [3]) to describe the syntax of EOL programs and data. This notation has been extended with the following conventions : (a|b. .. | z ) denotes a,b . . . z [a|b... Iz] denotes precisely one of the symbols none or one of the symbols a,b. . . z' ' a J*' denotes "an arbitrary non-empty sequence of symbols a" [a].. denotes "an arbitrary 3 possibly empty sequence of symbols a" Examples : {A|b} may denote A {A|b}.. may denote ABA [A|b].. may denote an empty sequence 1.5» Reference language and its representatio n The EOL language as defined in this report will be called the reference language. Both the program and the data are defined in this language as sequences of basic symbols satisfying some specified syntax rules. Typographical features such as a space or line feed are not part of the reference language. In order to allow compatibility with available equipment and clarity for a user, the EOL reference language admits various representations. In such a representation each symbol of the reference language has a character or typographical feature associated with it. In the examples of programs contained in this report, the following representation of the reference language is used. (1) The symbol j j is replaced by an arbitrary non-empty string of blanks . (2) Each can be surrounded from both sides by any number of blanks . (3) The symbol \ denotes carriage return and line feed. (k) If the last nonblank character in a line is the character "-", then the next line is treated as a continuation of the current line and not as a new line. (5) The remaining characters of the representation are the same as in the reference language. The above convention is not used for auxiliary notations for the values of variables as described in 1.6., where the symbol j | preserves its 9 form, while blanks and line feeds have no meaning. 1.6. Notation for the values of variables In the examples given in this report the following notation is used to illustrate the value of a variable : : [— f ||- ] The symbol H means that the elements (e.g. characters, constituents, records) occur in the sequence in the order in which they are written from left to right. The symbol \- means that these elements occur in the sequence in the order in which they are written from right to left. If none of these symbols occur the effect is the same as if the sequence were terminated by -\ . This latter symbol is often used to emphasize the end of a sequence. Examples : II : ABCD 13 : H Si+ : ~A1~=~3 H S l+ : ~3~=~A1 h The latter two notations are equivalent. It should be stressed here that this type of notation is of auxiliary significance only and has no influence on program execution or the data structure. 10 2. BASIC SYMBOLS The EOL reference language is built up of the following "basic symbols : :={|] 2.1. Characters : : = { | j } 2.1.1. Letters : :=(A|B. . . |Z) The above set can be extended arbitrarily with other letters, e.g. lower case letters . 2.1.2. Digits : : = {0|l|2 ....... |9) 2.1.3- Delimiters : : = { u | , | : | ' MM/I/} The above list can be extended arbitrarily with any other delimiters. In the reference language the character X serves the purpose of separating consecutive statements in an EOL program, while 7 is assigned no function. In particular implementations these delimiters may denote typographical features, e.g. X may denote "carriage return line feed" for output on a printer and "end of card" for input from a card reader. 7 may denote "start new page" on output or the "escape character" used in several character codes. 11 2 . 1 . h . Precedence of characters The order of precedence is defined for the EOL character set by the following list: L-J A,B, Z 0,1, 9 We say that a character which on the above list occurs earlier precedes any character which occurs later and that the later character follows the earlier one. 2.2. Markers : : = ( I I \a\ It) Markers,, except the pointer t, serve to define the structure of stacks and files as described later. The pointer serves to distinguish a certain record in a file. The precedence of the characters which do not occur in this list is undefined. It may be defined in a particular implementation. 12 3. VARIABLES AND THEIR VALUES EOL programs operate on a limited number of variables selected from the following set : inputs 10, II,... ICC outputs QO, Ql, ...QP stacks SO, SI, . . 0S7 files PO, P1,...P5 logical variable H instruction address stack IAS OL, p, 7, 6 denote same fixed poitive integers which depend on the particular implementation. For instance, OL = p = 1, 7 = 31; 6 = 3. 3-1. Inputs : := : :=[] . . Examples : II : JOHN l_j SMITH 13 : A = B * (C + D) l_j X -\ 3-2. Outputs : := Example : Q2 : X = 3-1^159; l_i END 13 3.3- Stacks : :=[] . . : : = { | | } 3.3.I. Words : := . {) . . Examples : "ABC ~X=A+B ^ X ; "38 3.3-1-1' Precedence of words We say that the word X precedes the word Y if in the process of matching them character by character from left to right, in the first pair of different characters the character of the word X precedes the corresponding character of the word Y.(see 2.I.U.). We assume that absence of a character (at the end of a word) precedes any character. We say that the word X follows the word Y if the word Y precedes the word X. 3. 3-2. Numbers : := : :=[+ I -] () Examples o^r o 36 u -48 0006 Ik 3-3-3- Address
: :={|} 3.3-3-1- Instruction address : := {} . . An instruction address is a symbol which denotes the location of an instruction in a program. Instruction addresses are used automatically by procedure calls and returns (see section 5-5-)- They may be operated upon under program control by exchanging the values of the instruction address stack IAS and of another stack (see 5.1.2.). 3.3.3.2. Pointer position ::= () . . A pointer position is a symbol which denotes the beginning or the end of a file or the boundary between two records in a file. It may be used to save the current position of a file pointer and to "restore the pointer to that place later. 3.3.^-. Examples of stack values SI ST Sl6 ~Xl"="A~*~ ( ~B~+~GAMMA~ ) "z"+"y~="x \- °13^P08"X 3.3.5. Instruction address stack The instruction address stack IAS is used mainly for storing addresses of instructions of the type CALL. These addresses can be used later by the RETURN instruction (see 5-5«)- 15 Example : IAS : "38' A63' 18 -\ S\ /\ , /\ 3.4. Files : : = [] . .t [] : :=l 0} The symbol a makes possible division of the file into sections. Therefore, each file can consist of an arbitrary number of sequentially ordered sections. Examples : PI : t P3 : V "A _ BC°38t P6 : a b c crt d e f where a,b, . . . denote stack values 3. 5. Logical value : : = {+ I -} Examples : H 3-6. Initial values By convention at the beginning of each program run the variable H has the value "+" while all stacks (including IAS) are empty. The initial values for inputs, outputs and files must be defined for each particular run. 16 4. GENERAL FORMAT OF INSTRUCTIONS The form of an EOL instruction is either ^ [[ ,[ , ] ] ] or \j ., , The latter is a special case of a three -argument instruction where the second argument need not be written because it is assigned a standard value (see 4.4.1.). 4.1. Operators An operator is always a string of capital letters, e.g. move, set, search The operator determines an instruction or a class of instructions which are differentiated more exactly by the form of their arguments. 4.2 . Arguments : :={||| |