IINBffiHSIffilfl iffiffl iiSreify Ufilil 'IP lam IffiSSfi IHRHI mm mm M mm BWOTIfttil JHI mm I JBuU iH8H ■ Imam MHfflnuOG nwRuiiii iMlllMnffl IL HHII Illil LIBRARY OF THE UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN 510.8+ lift- Ho. 661-666 cop. Z CENTRAL CIRCULATION BOOKSTACKS The person charging this catena lis re- iSore the Latest Date stamped Sow. £n?b. tEteS*— fee of $75.00 for each lost book. for disciplinary action and may r«»uii « £ ."^EW^I. TELEPHONE CENTEB, M3-S400 AUG 1 4 1996 AMR 8 19S6 When renewing by phone, write new due date below previous due date. Report No. UIUCDCS-R-7 1 +-665 A TEXT EDITOR DESIGN by Joyce Moore Kai July, 197^ Report No. UIUCDCS-R-7U-665 A TEXT EDITOR DESIGN by Joyce Moore Kai July, 197^ Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 Digitized by the Internet Archive in 2013 http://archive.org/details/texteditordesign665kaij Ill PREFACE The purpose of this paper is to design a text editor to the stage that someone else could easily program and debug it, by using the algorithms presented herein. Thus, the details of the text editor and the algorithms used to implement it are all logically developed and thoroughly explained, so that the reason for each action taken in the algorithms can be easily understood. Although this text editor was designed to be implemented on a specific machine in a specific language, the algorithms are machine and language independent. Thus, advanced language features should not be needed to implement the editor. Certain features which could be considered as part of the duties of an editor are purposely left to the domain of the operating system under which the editor is run. These features include the management and maintenance of files and directories These deliberate omissions are clearly noted in the algorithms in which they occur, as requests to the system. The algorithms are written in a go-to-less form, with the scopes of IF statements and loops being clearly indicated by indentation. The left arrow is used as an assignment operator, and the equal sign is used only as a relational operator. Chapter 1 describes the features of the editor, Chapter 2 describes the data and data structures used, and Chapter 3 describes the most basic algorithms. The remaining chapters describe the bulk of the editor, using a top-down approach, that is, the most basic algorithms are described last. IV TABLE OF CONTENTS Page PREFACE ■ . iii 1. THE FEATURES OF THE EDITOR 1 1.1. The Command Language 2 1.2. Special Characters 8 1-3 • Assignments and Pointer Manipulation 10 lA. Expressions ll+ 2. DATA AND DATA STRUCTURES USED 15 2.1. The Text 15 2.2. Other Structures Containing Text. ........ 17 2.3. Macros, Tabs, and Correction Sets 18 2.1+. The User Pointers 19 2.5. Buffers 19 2 .6 . Other Data Items 20 3- THE BASIC ROUTINES 23 3-1 3-2 3-3 3-5 3.6 3-7 3-8 3-9 INC_CHAR, DEC_CHAR 23 INCJUNE, DEC_LINE 25 GT 27 ALPH_TEST, NUM_TEST 30 GET_POINTER (CHAR ) 30 GET_LINE_NUMBER 31 GET_NUMBER 32 GET_COMMAND 3I+ REPLACE TEXT 35 k. THE OVERALL EDITOR 36 5 • READ_A_LINE 39 5.1. CR kl 5.2. Escape 1+1 5-3- Backspace Erase 1+2 5.^. Backspace No Erase 1+3 5-5- Tab ' 1+3 5-6. Quote 1+1+ 5-7- Equal 1+5 5«8. Special Characters Controlling the Display. . . 1+5 5 '9* Special Characters Controlling the Cursor ... 1+7 TABLE OF CONTENTS (Continued) v DECODE ASSIGNMENT 6.1. The Overall Decoder 6.2. TYPE_CHECK . . . . 6.3- MATCH , 6.4. PATCH ■ 7. EXECUTE MACRO 7.1. The Overall Macro Execution 7-2. COPY_TO_FOEMAL_PARAMETER . , 7-3- COPY ACTUAL PARAMETER. . . , 8. EXECUTE COMMAND. 8.1. 8.2 8.3 8.4, 8.5 8.6 8.7 8.8 8.9. 8.10. 8.11. 8.12. 8.13. Q.Ik. 8.15- 8.16. 8.17. 8.18. 8.19. OPEN Close ESCAPE DESTROY COPY TABSET VERIFY, NOJ/ERIFY ERROR, N0_ERR0R. MARGIN CATALOGUE. . . . LIST , SAVE , REPEAT ...... End APPEND , MACRO RUN GT EQ THE CRT DISPLAY. 9.1. The Overall Display. . . . , 9-2. DISPLAY . . 9.3- FORWARD_ROLL 9-4. BACKWARD_ROLL , 9-5- INSERT TEXT IN TEXT DISPLAY. Page 49 50 54 63 74 89 89 91 92 95 95 101 109 110 111 112 113 113 111+ 114 115 116 116 119 119 120 122 124 124 126 126 127 128 129 130 LIST OF REFERENCES. 132 1. THE FEATURES OF THE EDITOR The first task in the design of a text editor is to determine the capabilities which are desired. The editor described in this paper has been designed to contain the following capabilities: 1. The commands repeated most often (such as insertion, deletion, and replacement) should be easy for the user to specify. 2. High powered means of text manipulation (such as matching capabilities which resemble those available in SNOBOLU) should be available 3- The user should have some means to repeat sequences of commands, or save them for use at any time. k. Changes in the text should involve as little moving of text by the editor as possible. The editor described in this paper has the following features, which satisfy the above criteria: 1.. A command language is used to create, name, run, and destroy files, to save macros between editing sessions, and to set margins and tabs. 2. There are pointers which may be used to denote positions in text. Pairs of these pointers are used to denote segments of text, and segments of text may be manipulated by manipulating the pointer pairs. 3- Recursive macros may be defined, and the repeat command causes repetition of the commands between the repeat and its matching end. h . In addition to the pointers that the user is given, any two sections of text may he doubly linked hy pointers placed in the text. Thus, it is not necessary to move large segments of text to make room for material to be inserted, nor is it necessary to fill up space where text has been deleted because the text does not remain contiguous. 5. The editor is CRT oriented, so that text may be quickly displayed; it is easy to manipulate the display so that a desired portion of text is on the screen. The cursor may easily he moved, and one of the pointers always denotes the position of the cursor in text, so that the commands and macros can interact easily with the display. In addition to the above capabilities, each time an editing session is completed, a copy of all changes made since the beginning of the session is made. If the user wishes, he or she may obtain a copy of any earlier cycle of text when the text is opened. 1.1. The Command Language Open Name Number This names the file that will be edited by the user. If the named file did not exist before it was opened then the user must place the initial text in the file. Otherwise, the text that was already in the file may be accessed by the user. When the file is opened, each line of text in the file has a line number, starting with one and increasing by one for each succeeding line. The number given in the command is optional; it specifies the cycle of the file to be edited (see Figure l.l). If the Cycle n Cycle n-1 Cycle 1 Cycle I text after most recent editing session I text after 2Bl editing session I text after l£i editing session Examples: if the file name of the above text is mp then open mp causes cycle n to be edited open mp - 1 causes cycle n-1 to be edited open mp causes cycle to be edited Note that a limit may be placed on the number of cycles retained, or the user may destroy some cycles. Thus, cycle k (where n > k > 0) may be the oldest accessible cycle. Examples: if text changes are made after open mp - 1 then close causes cycle n to be destroyed and replaced by the new cycle n; if text changes are made after open mp then closes causes cycles 1 to n to be destroyed and a new cycle 1 to be created; if text changes are made after open mp then close causes cycle n + 1 to be created. Figure 1.1. Cycles of Text cycle number is positive then it specifies the cycle number of the text to be edited. If the cycle number is negative, then it is subtracted from the cycle number of the most recent cycle to obtain the cycle to be edited. Close A file must be open when this command is given. All changes that have been made since the file was last opened are recorded, and the file is linearized (that is all internal pointers are removed) and given proper line numbers in preparation for reopening. If the currently opened file is not the most recent cycle and changes were made in the text then the cycles more recent than the opened cycle are destroyed. Escape This causes any editing done on the currently opened file to be discarded. No cycles are created or destroyed. This is to be used when a large error has been made . Destroy The currently opened cycle and all previous cycles are destroyed. Copy Name Number The indicated cycle of the file named is copied into the currently opened file. Any text in the currently opened file is destroyed. This is to be used to edit a cycle of a file when it is desirable to retain later cycles of the file. Tab set n_, n ... n. 12 i This is used to initialize i tabsettings -which may later be accessed by using the tab key or the tab expression. Verify No Verify These control the status of a logical variable. In verify mode, all changes made in the text are displayed or printed. In no verify mode, displays are caused only by the list command. Initially the editor is in verify mode . Error No Error These control the status of a logical variable. In no error mode, the editor -will continue editing after many serious errors . Initially the editor is in error mode. Margin Number This allows the width of the text to be specified. Warning messages are displayed if any change in text causes a line longer than the width to be created, or if a tab is set outside the margin. Catalogue A list of the names of all files available to the user is displayed. List n , n Lines n, to n ? of text are displayed. If n~ is not specified, then lines n to n.. plus the length of the display buffer are displayed. N, and n p may be in the form of absolute line numbers or pointers to text. In the latter case, the entire line that each points to will be displayed. Save All macros defined during the editing session or saved from earlier sessions will be available when the file is reopened. Repeat End All commands between the repeat and its corresponding end will be repeated until pointer A denotes a position in text that is beyond the position denoted by pointer Z. Repeat commands may be nested. Append P • Term (where term may be a special character) This is to be used for inserting large amounts of text into the file. P may be a pointer or an absolute line number. Text between the append and term is inserted into the file at point P. Macro name formal -parameters End Defines a macro which may be invoked at any time after the macro definition as follows: Name actual-parameters A warning is given if, while the editor is in error mode, the user attempts to redefine a previously defined macro. A macro may invoke 7 itself. For completeness, the acceptable types of actual and formal parameters are given in Table 1.1, although they will not be fully explained until section 1.3* Assignments and Pointer Manipulation. Formal Parameter Type Pointer Pointer pair Corresponding Actual Parameter Type Nothing (the cursor pointer, C, is used) Pointer Pointer expression Pointer pair Pointer pair expression String String expression Table 1.1. Acceptable Parameter Types In the macro definition, each time a formal parameter occurs, it must be preceded by a parameter marker. Thus, a macro with a parameter A may also use the pointer A with no confusion. Upon invocation of the macro, the entire macro definition is recopied with the actual parameters in place of the formal parameters. The resulting commands are executed by the editor and then discarded. 8 Run The open file is closed and run. From the user's point of view, each line of text may be considered a card, and thus the file must have the same format as a card deck to be run. Line numbers are placed in columns 72-80. gt PI , P2 eq PI , P2 In both cases PI and P2 may be pointers or pointer expressions, which must be evaluated. Gt succeeds if PI denotes a position in text that is further from the beginning than the position denoted by P2 . Eq succeeds if PI and P2 denote the same position in text. Upon a success of gt or eq, the next sequential command is executed. Upon failure, the next sequential command is skipped. These two commands were included to allow a criterion for terminating recursive macro invocations. 1.2 . Special Characters In addition to the commands above, the following special characters may be used. CR (carriage return) Note that this may be included within the text. Escape Causes all characters typed since the previous CR to be ignored and erased. Backspace Erase Causes the previous character to be ignored and erased. Backspace No Erase Used for overprinting. This will not "back up through a CR. Tab Advances to the next tab setting. Quote Encloses text. To insert a quote in text, type two quotes. The following special characters may be typed at any time . They affect the display only. In the commands, a page is the number of lines which can be displayed on the screen at one time. Next Page The display is rolled up a page length. Next Half Page The display is rolled up one-half page length. Next Line The display is rolled up one line. Prev Page The display is rolled down a page length. Prev Half Page The display is rolled down one -half page length. Prev Line The display is rolled down one line. In the above commands, the cursor remains at the bottom right-hand corner of the display. 10 Begin Forward Roll Begin Backward Roll These cause the display to roll in the specified direction until End Roll is typed. Cursor Down Cursor Up Cursor Left Cursor Right These cause the cursor to move in the specified direction on the display. The pointer C always points to the position in text which corresponds to the position of the cursor on the screen. 1-3 ■ Assignments and Pointer Manipulation There are 26 pointers which may be used to denote positions in text. These pointers are denoted by the letters of the alphabet. Three of the pointers have special uses: when a file is opened, A denotes the first character in the file, Z the last character. As stated above, C always points to the text position which corresponds to the position of the cursor on the screen. A and Z are used as the halting condition for the repeat command, and A is altered whenever a portion of text is matched. Pointers may be changed in two ways: by assigning a different value to the pointer; or, by assigning the pointer to point to one end of a string which is to be matched. Thus, 11 (1) pointer = pointer expression (2) pointerl = string expression = pointer2 where pointer expression := ( pointer | number | pointer + number [pointer - number} pointer := (a|b| . . . |z) number := ( integer | integer ( integer ) ) th The non-parenthesized integer n denotes the n — line of text. the parenthesized integer n(m) represents the m — character of the th .. n — line . string expression := ( string [ expression | string expression + string expression} string := ( ' characters ' } Expression will be defined in the next section. Until then, there is no loss of generality in assuming that a string expression is simply a string. In (l), a pointer may be assigned the position of another pointer, or it may be given a line and character offset into the line; also, the pointer may be assigned a positive or negative offset from another pointer. In (2), either pointerl or pointer2 (but not both) may be omitted. The first instance of the string expression in text is located; the scan for the string begins at A. Pointerl is made to denote the first character in the expression, pointer2 the last. The pointer A is changed to denote pointerl + 0(l), i.e. one character past pointerl. Two pointers can be combined to form a pointer pair, which is used to denote the segment of text between the two pointers. The first pointer of the pair must denote a position in text before the position 12 denoted by the second. By manipulating pointer pairs, text may be manipulated: by insertion of a string; or by "moving" sections of text denoted by pointer pairs. (3) pointer pair = string (U) pointer pair = pointer pair expression where pointer pair := { pointer pointer ] pointer pair expression := ( pointer pair | string | pointer pair expression + pointer pair expression ) Thus, (3) is a subset of (k) . In (3) and (k) , string expression maybe substituted for pointer pair ; the string will be matched and its pointer pair generated, as in (2). The null string may be used in (3) and may be denoted by blanks (i.e. pointer pair = ). Assignments (3) and (h) result in the replacement of the text denoted by the pointer pair, by the string or segment of text on the right side of the statement. If a pointer pair is used on the right side, it must denote text that is between the pointers on the left, or that has been otherwise deleted from text. (The system has not been designed to protect the user from doing this to himself.) The following example should clarify the usage of pointers and assignments. Suppose the text consists of • • • • i i Scix& • • • • and we wish to change it to ... 'the president of the university said' . 13 The statement D = ' jj said' causes the pointer D to denote the blank before the 's' of 'said' as shown: .... ' (_j said' .... \ DD = 'the president of the university fj ' causes the string to be inserted (the character at the insertion point always remains before the inserted character) . Thus, the text becomes ....' the president of the university said' .... t D Now suppose we wished to change the text to .... 'the university president said' .... One way is the statement 'president of the university' = 'university president' Another way is the following statements: D = 'president ,_, ' = E F = 'university u ' = G These cause pointers to be placed as follows: .... 'the president of the university said' .... T T T T D E F G The statement DG = FG + DE results in the following: .... 'the university president said' .... tut F GD E Ik l.U. Expressions When attempting to do the matching operation (2), if a string is used, each character in the string must "be identical to a character in the text for the match to succeed. Expressions can be concatenated to strings or each other to give the user the added capability of matching such things as any letter, any digit, a certain number of characters, and other things as follows: arb - matches one of any character alph - matches one letter num - matches one digit tab - matches all characters up to the next tab stop. Note that this tab cannot be used to insert blanks in text . span arg - matches any number of contiguous args not arg - matches one character that is not an arg Arg has the following form: one blank followed by any character or alph or num followed by one blank. In addition to alph and num , span's argument may include not arg. Example : Suppose we wish to locate each occurrence in text of a number followed by a letter and separate them from each other by blanks . repeat P = num + alph PP = ' ' end 15 2. DATA AM) DATA STRUCTURES USED 2 .1. The Text TEXT, and other structures used by the editor to store text material, is considered by the editor as an array (or string) of char- acters, whose lower bound is one, and whose upper bound is arbitrarily large. The editor fills TEXT in a sequential manner, starting at the lower bound and progressing toward the upper bound. Each reference to a character of the TEXT (denoted by TEXT (POSITION)) is considered as a request to the operating system, which provides the requested information by use of some paging algorithm. Each line of TEXT has the following structure : TEXT array cr line the text of the line cr number one line Figure 2.1. The Structure of a Line of TEXT 16 The number of character positions used by the line number is denoted by LINE_NO LENGTH. The line numbers are given to the lines when the file is closed (as stated in the explanation of open) and are replaced by blanks if the line is changed in any way after it is reopened. Line numbers are used in constructing correction sets (the information needed to reconstruct a cycle) and to determine the absolute position of a pointer in text for the command gt . As previously stated, much text manipulation is done by inserting pointers in TEXT. The structure of one of these pointers is as follows: POINTER DELIMETER position in text (pointer) POINTER DELIMETER Figure 2 .2 . Structure of a TEXT Pointer The number of character positions needed to denote a position in text is denoted by POINTER_LENGTH . POINTER_DELIMETER is a special internal char- acter which cannot be confused with other characters. 17 When a file is opened, the editor makes a request to the operating system to obtain the file. After this, the following assumptions are made: the requested file may he referenced by the editor in TEXT; BEGINJTEXT denotes the position of the first character of text, which is in TEXT(l); END TEXT denotes the position of the last character of text; FREEJFEXT denotes the first free position in the text array, which is ENDJTEXT + 1 immediately after opening. These variables will be maintained throughout editing to retain their proper values. 2 .2 . Other Structures Containing Text OLDJTEXT, AUX_TEXT, MACRO_TEXT, and CS_TEXT are structures which are like TEXT in that they are considered as open ended arrays, but have different uses . OLDJTEXT: When a file is closed, a copy of the text at the time of the previous open is needed in order to create the correction set. OLDJTEXT is the name given this. Conceivably, OLD TEXT and TEXT could occupy common storage in case the text had not been changed. AUXJTEXT: When a file is being closed, the text is linearized as the correction set is constructed, that is, all the internal pointers are removed and the text is stored sequentially with line numbers, so that no changes need be made when it is reopened. The linearized text is stored in AUXJTEXT. AUXJTEXT is used also in executing repeat and macros. FREE_AUXJTEXT is used to denote the first free position of AUXJTEXT. MACRO TEXT: Used to store all macro definitions. 18 CS TEXT: Used to store all correction sets. The correction sets are formatted as follows: (l) a line number, (2) a symbol meaning insertion or deletion, and (3) if (2) was insertion, the string to be inserted FEEE CS denotes the first unused character in CS_TEXT. 2.3. Macros, Tabs, and Correction Sets The names of all macros that have been defined are placed in a binary search tree. Each node of the tree contains the following fields: INFO - The name of a macro LLINK, RLINK - Pointers to other nodes MACRO_POS - The starting position in MACRO_TEXT, of the macro definition which corresponds to the macro name in INFO. MACROS points to the root of the tree of macro names. MACROS - Points to the beginning of the list When tab settings are made, the settings are placed in a linked list. Each node of the list contains the following: IKFO - Contains the tab setting, which is in the form of a character displacement from the beginning of the line. LINK - Pointer to the next node TABS - Points to the beginning of the list Correction sets are also placed in a linked list, with each node containing the following: CNUMBER - The number of the cycle which the correction set will produce LINK - Pointer to the next node CS_POS - The starting position in CS_TEXT of the correction set CORRECTIONS - Points to the beginning of the list 19 2 .k. The User Pointers Each pointer is represented as an index into an array which contains the actual values of the pointers. Pointers A to Z are represented by indices 1 to 26 . Indices 27 to, say Uo, are available as temporary pointers, and are linked together into a list of pointers not being used. Thus, for example, if we wish the editor to represent the pointer C by the variable C, then C = 3« The array to be indexed is called POINTER. It contains indices to positions in TEXT. Note that the value of each pointer remains invarient unless it is changed in one of two ways: (l) the user requests the change through an assignment; (2) insertion of a pointer in text causes the text that the user pointer denotes to be moved. Whenever a character of text is moved physically, the user pointers are searched and any necessary changes are made. 2.5. Buffers Several buffers are used by the editor. Each command or state- ment is placed in BUFFER before it is processed. When an assignment is pro- cessed, intermediate results are placed in EDIT_BUFFER. If a string expression needs to be matched, or text is to be manipulated according to a pointer expression, the expression is placed in EDIT_BUFFER, with superfluous blanks removed. 20 2.6. Other Data Items The following other data items are used by the editor : OPEN - Has the value true if a file is currently open, false otherwise VERIFY - Has the value true if the editor is in verify mode, false otherwise ERROR - Has the value true if the editor is in error mode, false otherwise MARGIN - Contains the value 80 unless reset by the margin command TEXT MODE - True when material being processed is between quotes, false otherwise APPEND_MODE - True when an append command is being executed, false otherwise ASSIGNMENT - Is set to true when the character equal (=) is placed in the buffer during the reading of a line; indicates that an assignment was placed in the buffer PP - The first line in the CRT display buffer to be displayed PL - The number of lines that can be displayed on the CRT LINE_LENGTH - The number of characters that can be displayed in a CRT line 21 MACROS MACRO TEXT A 'DEL' A FREE MACRO TEXT Figure 2.3. Example of the Tree of Macro Names TABS ^ 5 10 25 60 A — — Figure 2 .k . Example of the Linked List of Tabs 22 CS TEXT Figure 2.5. Example of the Linked List of Correction Sets 23 3- THE BASIC ROUTINES The routines described here are used by many procedures of the editor, but do not naturally fit in with the discussion of any one procedure. These routines fall into two categories: (l) the manipulation or collection of information about pointers to the text array; (2) the manipulation or collection of information about characters from any array in the program . 3-1. INC_CHAR, DEC_CHAR In order to do any sort of manipulation of text pointers, it must be possible to move a pointer one character forward or backward in TEXT. This is made difficult by the fact that pointers can reside in TEXT, and thus the text is not contiguous. As explained in Section 2.1, an in-text pointer is set off from text by a POINTER_DELIMETER on each side. The method for placing a pointer in TEXT is explained in Chapter 6. All in-text pointers are doubly linked; that is, if one pointer denotes a spot in TEXT, there is another pointer there, which denotes the position of the first pointer. Therefore, it is easy to detect if an in-text pointer denotes a spot where a new pointer is to be placed, and correct the condition; therefore, in-text pointers always will denote text, and not other pointers. Similarly, it is easy to detect when a pointer to TEXT denotes a position that is to be occupied by an in-text pointer, and this also is corrected. Thus, it is assumed that the pointer to be incre- mented or decremented always denotes a text character, and not a character within a pointer or a character that has been deleted from text by placing an in-text pointer . 2k To summarize the actions of INC_CHAR, first, the number of character positions that the pointer is to be incremented is checked to make sure it is greater than zero. If not, DEC_CHAE should have been called with the positive value, and this is done. If the increment is positive, the pointer is incremented the desired number of times. If the end of TEXT is ever reached, no further action is taken, so that the pointer will stay in TEXT, and its position may be detected upon return from INC_CHAR. To increment the pointer, if the pointer denotes a POINTER_DELIMETER, an in-text pointer is traversed; otherwise, one is simply added to the pointer . Algorithm 3-1 - INC_CHAR (PTE, # CHARS) if # CHARS < then DEC_CHAR (PER, - # CHARS) else do I = 1 to # CHARS while PTR ^ END_TEXT if TEXT (PER) = POINTER_DELIMETER then while TEXT (PER) = POINTER_DELIMETER and PTR ^ ENDJEEXT do PTR <- TEXT (PTR + 1 to PTR + POINTER_LENGTH) else PTR «- PTR + 1 end DEC_CHAR is nearly identical to INC_CHAR, except that the pointer is decremented instead of incremented, and the pointer must be checked to ensure that it does not pass BEGIN TEXT. 25 Algorithm 3-2 - DEC_CHAR (FIR, # CHARS) if # CHARS < then INC_CHAR (PTR, - # CHARS) else do I = 1 to # CHARS while PTR ^ EEGINTEXT if TEXT (PTR) = POINTER_DELIMETER then while TEXT (PTR) = POINTER_DELIMETER and PTR ^ BEGINJIEXT do PTR ^TEXT (PTR - 1 to PTR - POINTERJLENGTH ) else PTR *- PTR - 1 end 3.2. INC_LINE, DEC_LINE INC_LINE and DEC_LINE are used to increment or decrement a pointer a specified number of lines, with the character offset of the pointer from the beginning of the line remaining constant. INC_CHAR and DEC_CHAR are used to greatly simplify the routines. INC_LINE only is described here, because DEC_LINE is so similar to it. As in INC_CHAR, if the increment is negative, DEC_LINE is invoked with the positive value of the increment. If the increment is positive, the offset of the pointer is obtained by counting the number of times the pointer must be decremented before a CR is reached. Next, the correct number of lines is passed by incrementing the pointer past the correct number of CRs. Finally, the pointer is incremented until the correct offset is reached. If the end of TEXT or the end of the line is encountered before the pointer reaches its final value, then an error is signaled. 26 Algorithm 3-3 - INC_LINE (PER, # LINES) if # LINES < then DEC_LINE (PER, - # LINES) else begin OFFSET < — 1 while TEXT (PER) ^ CR and PER ^ BEGIN_TEXT do begin DEC_CHAR (PER, l) OFFSET «- OFFSET + 1 end INC_CHAR (PER, l) do I = 1 to # LINES while PER ^ END_TEXT while TEXT (PER) ^ CR and PTR ^ END_TEXT do INC_CHAR (PER, l) INC_CHAR (PER, l) end do I = 1 to OFFSET if PTR = END_TEXT or TEXT (PER) = CR then error INC_CHAR (PER, l) end end 27 Algorithm 3-k - DEC_LINE (PTR, # LINES) if # LINES < then INC_LINE (PER, - # LINES) else begin OFFSET «- - 1 while TEXT (PTR) J- CR and PTR ^ BEGIN_TEXT do begin DEC_CHAR (PTR, l) OFFSET *- OFFSET + 1 end DEC_CHAR (PTR, l) do I = 1 to # LINES while PTR ^ BEGIN_TEXT while TEXT (PTR) ^ CR and PTR ^ BEGIN_TEXT do DEC_CHAR (PTR, l) DEC_CHAR (PTR, l) end if I < # LINES then error if PTR ^ BEGIN_TEXT then INC_CHAR (PER, 2) do I = 1 to OFFSET if TEXT (PTR) = CR then error INC_CHAR (PTR, l) end end 3-3. GT GT receives as parameters two pointers to TEXT. The value true is to be returned if and only if the second pointer denotes a position in TEXT that is beyond the position denoted by the first pointer. The 28 value to be returned can be determined in either of two ways: (l) by- traversing the text between the two pointers; (2) by finding the nearest line which has a line number and seeing which line number is smallest. Both methods may involve a great number of calls to INC_CHAP or DEC_CHAR: (l) if the pointers are far apart or in the incorrect order; (2) if many changes have been made in the text (if a line is changed, its number is removed) . The procedure outlined here attempts to minimize the calls to DEC_CHAR ( it uses only DEC_CHAE) by combining the two methods as out- lined below. First of all, it is ensured that the pointers are not equal in the first place. If this were not done, a line with a number would have to be found for each pointer. The two pointers are assigned to TEMPI and TEMP2 so that the pointers will not be changed by the procedure. Next, TEMP2 is decremented until a line with a number is encountered, or the first pointer or the beginning of TEXT is reached. If the first pointer is reached by decrementing TEMP2, then true is returned. If the first pointer is not between TEMP2 and the beginning of TEXT, then false is returned. If a line with a number is encountered, then information about the first pointer must be obtained as follows. TEMPI is decremented until a line with a number is encountered, or the second pointer or the beginning of TEXT is reached. If the second pointer is reached by decrementing TEMPI, then false is returned. If the second pointer is not between TEMPI and the beginning of TEXT, then true is returned. If a line with a number is encountered, then the line numbers are compared; they must be different, since the pointers were not equal, and neither pointer was between the other pointer and its line number. The procedure GET LINE_NUMBER is described in Section 3.6. 29 Algorithm 3-5 - GT (PTR1, PTR2) if PTR1 = PTR2 then return (false) FINISHED «- false TEMPI «- PTR1 TEMP2 <- PTR2 while not FINISHED do begin if TEXT (TEMP2) = CR then LINE2 «- GET_LINE_NUMBER (TEXT, TEMP2, true) if LINE2 > then FINISHED <- true else begin DEC_CHAR (TEMP2, l) if TEMP2 = PTR1 then return (true) if TEMP2 = BEGINJTEXT then return (false) end end FINISHED <- false ■while not FINISHED do begin if TEXT (TEMPI) = CR then LINE1 «- GET_LINE_NUMBER (TEXT, TEMPI, true) if LINE1 > then FINISHED <- true else begin DEC_CHAR (TEMPI, l) if TEMPI = PTR2 then return (false) if TEMPI = BEGINJTEXT then return (true) end end RET <- (IINE2 > LINE1) return (RET) 30 3.U. ALPH_TEST, NUM_TEST Each procedure is to return true if the character passed as a parameter is an alphabetic (numeric) character. The value to be returned is determined by creating a string of all the characters which should return true, and determining if the parameter is a member of that string . Algorithm 3-6 - ALRH_TEST (CHAR) STRING «- 'ABCDE. . .Z 1 I «-l while I < 27 and STRING (i) ^ CHAR do I «- I + 1 return (i < 27) Algorithm 3-7 - NUM_TEST (CHAR) STRING «- '012. . .9' I <-l while I < 11 and STRING (i) ^ CHAR do I <- I + 1 return (i < 11) 3.5. GET_P0 INTER (CHAR) This procedure is very similar to ALPH_TEST, except that the parameter is known to be alphabetic, and its index in the alphabet is the desired value. This index is the index of the pointer whose name is CHAR, into the pointer array. 31 Algorithm 3-8 - GET_POINTER (CHAR) STRING = 'ABCDE...Z' I <-l while I < 27 and STRING (i) ^ CHAR do I <- I + 1 if I = 27 then error . return (i) 3.6. GET_LINE_NUMBER GET LINE NUMBER receives three parameters: the type of text, which can be one of TEXT, OLD_TEXT, AUXJTEXT, or CS_TEXT; the position in text; and a logical variable which states whether or not the text type is TEXT; if the text type is TEXT, the INC_CHAR must be used. GET_LINE_NUMBER is to return the integer value of the line number (which is LINE_NO_LENGTH characters long) . If the line number is all blanks, a zero is returned. If the text position initially denotes a CR, it is incremented before the attempt to evaluate the line number. Algorithm 3-9 - GET_LINE_NUMBER (TEXT_TYPE, TEXT_POS, TEXT_TYPE_IS_TEXT) if TEXT_TYPE (TEXT_POS) .= CR then if TEXT_TYPE_IS_TEXT then INC_CHAR (TEXT_POS, l) else TEXT_POS «- TEXT_POS + 1 LINE_NUMBER f- STRING *- '123^56789' do I = 1 to LINE_NO_LENGTH if TEXT_TYPE (TEXT_POS) = blank then if LINE NUMBER > then error 32 else NUM «- else if TEXT (TEXT_TYPE) = '0' then NUM *- else begin NUM <- 1 while STRING (NUM) ^ TEXT (TEXT_TYPE) and NUM < 10 do NUM «- NUM + 1 if NUM = 10 then error end LINE NUMBER *- LINE_NUMBER * 10 + NUM if TEXT_TYPE_IS_TEXT then INC_CHAR (TEXT POS, l) else TEXT_P0S «- TEXT_P0S + 1 end return (LINE_NUMBER) 3-7- GET_NUMBER GET_NUMBER is to place a string of digits, of any length, found in the BUFFER at position BIN, preceded by blanks and/or a plus or minus, in the integer variable NUMBER, and return a value of true if a number was encountered, false otherwise. 33 Algorithm 3-10 - GET_ NUMBER STRING <- '012 3^56789' NUMBER «- SIGN <- 1 FINISHED *-- RET <- false while BUFFER (BIN) = blank do BIN *- BIN + 1 if BUFFER (BIN) = minus then begin SIGN «- - 1 BIN <- BIN + 1 end else if BUFFER (BIN) = plus then BIN <- BIN + 1 while BUFFER (BIN) = blank do BIN <- BIN + 1 while not FINISHED do begin NUM «- 1 while NUM < 11 and BUFFER (BIN) ^ STRING (NUM) do NUM <- NUM + 1 if NUM = 11 then FINISHED «- true else begin RET <- true NUMBER *- NUMBER * 10 + (NUM - l) BIN «- BIN + 1 end end NUMBER LENGTH (COMMAND) then error end end 35 3-9- REPLACE_TEXT This procedure is used to place characters from the source text into the DESTJTEXT, starting at the indicated positions, and up to and including the first CR. The position markers are to be updated. The texts may be any arrays used by the editor, as long as it is certain that no in-text pointers "will be encountered. Algorithm 3-12 - RE PLACE JTEXT (DEST_TEXT, SOURCEJTEXT, DEST_POS, SOURCE_POS) while SOURCEJTEXT (SOURCE_POST) ^ CR do begin DESTJTEXT (DEST_POS) <- SOURCEJTEXT (SOURCE_POS) DEST_POS <-DEST_POS + 1 SOURCE_POS <- SOURCE_POS + 1 end DEST_TEXT (DEST_POS) «- CR DEST_POS ^-DEST_POS + 1 SOURCE POS <- SOURCE POS + 1 36 h. THE OVERALL EDITOR The main procedure which controls the sequence of operations done by the editor has the following structure. First, initializations are done. All data items to be initialized have been described in Chapter 2. Next an endless loop is entered, in which the editor reads a line and edits it. As it is presently designed, the editor must be terminated by the operating system. However, the editor can easily be modified to include a command to terminate itself and return control to the operating system. The operations done inside the loop are contained in a procedure since they are performed several times throughout the editor. First, a line is read from the input. In Chapter 5, you will notice that the input is considered to be just another character array, since the formal para- meter in READ_A_LINE is referenced in that manner. After the line is read, ASSIGNMENT is checked (remember, it is true if an equals was encountered in the line) and if necessary, the assignment is decoded. Otherwise, EXECUTE_MACRO is executed. This gets the first command in the line to be executed and tries to match it with a name in the tree of macro names. If the command is matched, then the macro is executed and true is returned. Otherwise, false is returned. If false is returned, the editor attempts to execute the command as one of the commands to the editor. If this fails, then an error has occurred. 37 Throughout the editor there are many places where errors can occur. No attempt has been made to state what actions should be taken upon occurrence of these errors, but the condition is noted by placing the word error in place of a statement. The subsequent chapters deal with the four procedures used in the loop of the editor, and other procedures which govern the output to the CRT. Algorithm k.l - THE OVERALL EDITOR MARGIN *- 80 VERIFY <- ERROR <- true OPEN <- APPEND_M0DE «- false PL <- (the length of the display page) LINE_LENGTH «- (the length of the display line) TABS *- A FREE_AUX_TEXT *- 1 do I = 1 to HO POINTER (I) <- 1 end initialize INPUT, INPOS as necessary while true do EDIT LINE (INPUT, INPOS ) 38 Algorithm k.2 - EDIT_LINE (TEXT_TYPE, TEXT_POS) READ_A_LINE (TEXTJTYPE, TEXT_POS) if ASSIGNMENT then DECODE_ASSIGNMENT else if not EXECUTE_MACRO then if not EXECUTE_COMMAND then error 39 5 • READ_A_LINE This procedure places characters from the input into the buffer in preparation for editing, taking special actions on the special characters mentioned in Section 1.2. This chapter will describe the actions on special characters separately from the overall execution of the procedure . The following initializations are made. FINISHED is set to false and will become true when the line is completely processed. Normally, this will occur when a CR is reached, but only if the CR is not enclosed in quotes. ASSIGNMENT is set to false and will become true if an equal is processed that is not enclosed in quotes, indicating that an assignment is being processed. TEXT_MODE is set to false indicating that the first character processed is not enclosed in quotes. This is true because, as noted above, FINISHED may not be set to true when TEXT_MODE is true. CHAR_NUMBER is set to zero; this refers to the position in the line of the current character, and may be less than the position in the buffer if backspace no erase or CR has been used. Finally, the buffer is initialized to all blanks, and BIN (Buffer Index) to zero. A variable, APPEND is also used although it is not initialized in this section. APPEND is true when the append command is being executed, and is used to indicate that the editor is in a text mode, but only one line is to be read into the BUFFER at a time. This is so that the number of lines that may be inserted by the append command is not dependent on the length of the BUFFER, and so that the term command or character may be easily detected. ko Next a loop is entered -which is executed until FINISHED becomes true. If the character is a special character, then the correct action is taken. Otherwise, CHAR_NUMBER is checked to make sure it is not greater than MARGIN and the character is inserted in the BUFFER. Algorithm 5-1 - READ_A_LINE (TEXT_TYFE, TEXT_POS) FINISHED <- ASSIGNMENT <- TEXT_MODE <- false BUFFER «- all blanks BIN «- CHAR_NUMBER «- while not FINISHED do begin CHAR <- TEXT_TYFE (TEXT_POS) TEXT_POS «~ TEXT_POS + 1 CHAR case of: special characters: (described later) other: if CHAR_NUMBER = MARGIN then error INSERT_IN_BUFFER CHAR_NUMBER <- CHAR_NUMBER + 1 end Algorithm 5-2 - INSERT_IN_BUFFER BIN <- BIN + 1 if BIN > BUFFER_LENGTH then error BUFFER (BIN) <- CHAR Note that all checks for the text exceeding the BUFFER_LENGTH are done in INSERT_IN_BUFFER, whereas, all checks for the text exceeding the LINE LENGTH are done in REAL A LINE . kl 5.1. CR If the editor is in TEXT_MODE or APPEND_MODE (remember that APPEND MODE is true only when the append command is "being executed), then space for the line number of the next line must "be left after the CR This is done by inserting the proper number of blanks. If not TEXT_MODE then FINISHED is set to true. If not TEXT_MODE and not APPEND_MODE, and no characters are in the BUFFER, READ_A_LINE is continued. Algorithm 5«3 - Special Character CR INSERT_IN_BUFFER if TEXT_MODE or APPEND_MODE then begin CHAR_NUMBER <- CHAR <- blank while BIN < = BIN + LINE_NO_LENGTH do INSERT_IN_BUFFER CHAR *-CR if APPEND_MODE then FINISHED <- true end else if BIN > 1 then FINISHED <- true 5 .2 . Escap e All characters in BUFFER are erased and BIN and CHAR_NUMBER are set back to zero . TEXT_MODE and ASSIGNMENT are returned to false . Thus, the user may start all over again on the statement. k2 Algorithm 5-*+ - Special Character Escape BUFFER ♦- all blanks BIN «- CHAR NUMBER «- TEXT MODE f- ASSIGNMENT <- false 5.3. Backspace Erase The editor is returned to the configuration before the character at BIN was added to the buffer, with the exception of any special character except quote and equal. If the character to be erased is a quote, then TEXT MODE is negated. If it is equal, then the entire buffer must be checked to see if APPEND_MODE should remain on. Algorithm 5«5 - Special Character Backspace Erase if CHAR_NUMBER = then error else begin if BUFFER (BIN) = quote then TEXT_MODE <- not TEXT_MODE if BUFFER (BIN) = equal and ASSIGNMENT then begin ASSIGNMENT <- false TEMP *- TEXT_MODE do I = BIN - 1 to 1 by - 1 while not ASSIGNMENT if BUFFER (I) = quote then TEMP «- not TEMP if BUFFER (i) = equal and not TEMP then ASSIGNMENT <- true end end CHAR_NUMBER <- CHAR_NUMBER - 1 BUFFER (BIN) <- blank BIN <- BIN - 1 end ^3 5.U. Backspace No Erase This is almost the same as inserting any other character into the BUFFER, except that CHAR_NIMBER must be decremented instead of incremented, to indicate that overprinting has occurred. Backspace no erase is considered legal only in text, when there are characters in the BUFFER. Algorithm 5-6 - Special Character Backspace No Erase if CHAR_NUMBER = or (not TEXT_MODE and not APPEND_MODE) then error else begin CHAR_NUMBER <- CHAR_NUMBER - 1 INSERT_IN_BUFFER end 5-5. Tab The TAB list is searched until the first tab which is beyond the current position in the line (which is specified by CRAR_NUMBER) is found. The new position of BIN is then calculated as follows: CHAR_NUMBER must be incremented until it equals the position of the tab; BIN must be incremented by this amount as well. kk Algorithm 5.7 - Special Character Tab if TEXT_MODE or APPEND_MODE then begin P <-TABS while P ^ A and INFO(P) < CHAR_NUMBER do P <-RLINK(P) if P = A then error else begin BIN <-BIN + INFO(P) - CHARJHJMBER CHAR_NUMBER «- INFO(P) end end else error 5 .6. Quote TEXT_MODE is negated only if the editor is not in APPEND_MODE . The line is checked to see if it exceeds the MARGIN. If so, a warning is given, but additional characters may be inserted in the line if the user so desires. Algorithm 5*8 - Special Character Quote if CHAR_NUMBER = MARGIN then error INSERT_IN_BUFFER CHAR_NUMBER <- CHAR_NUMBER + 1 if not APPEND MODE then TEXT MODE <- not TEXT MODE >+5 5-7- Equal If the editor is not in a text mode, ASSIGNMENT must be set to true. In addition, the MARGIN check is made. Algorithm 5»9 - Special Character Equal if CHAR_NUMBER = MARGIN then error INSERT_IN_BIJFFER if not APEEND_MODE and not TEXT_MODE then ASSIGNMENT <- true 5.8. Special Characters Controlling the Display These short routines simply call CRT display routines which will be described in Chapter 9. These routines, FORWARD_ROLL (# LINES) and BACKWARD ROLL (# LINES) roll the text on the screen, upward, revealing lines later in the text, or downward, revealing lines earlier in the text. The variable PL (page length) refers to the number of lines of text that can be displayed at one time. After each roll, the cursor pointer, C, is given the position of the last character displayed. This is done in the FORWARD_ROLL and BACKWARD_ROLL routines, since this information is available there anyway. In the begin roll commands, an indefinite number of single forward or backward rolls are done. Before each roll the input is checked to see if an end roll has been typed to end the roll. (As presently designed, if another character is accidentally typed, the loop will never be exited.) The WAIT in the begin roll commands is to allow enough time for the user to view the display before it changes. 46 Algorithm 5.10 - Special Characters Controlling the Display (a) next page: F0RWARD_R0LL (PL) (b) next 1/2 page: F0RWARD_R0LL (PL/2) (c) next line: F0RWARD_R0LL (l) (d) prev page: BACKWARD_R0LL (PL) ( e ) prev l/2 page : BACKWARD_R0LL (PL/2) (f) prev line: BACKWARD_R0LL (l) (g) begin forward roll: •while TEXT_TYPE (TEXT_P0S) ^ end roll do begin F0RWARD_R0LL (l) WAIT end TEXT_P0S -p c c c c o o • o C CD •H •H •H O a CO CO CO •H bb co CO CO co CD *■ — ^ CD CD CD co CO co u h *H fH *H Sn CD M B 1 ft ft ■H •H S3 •H C U -p a X X CtS OJ O oj o £ c cc3 ft ft ft ft ft -H ft -H CD H CO to ft s bD P k ^ h fn h Sh co £h CO czj C CD CD CD CD CD CD CD CD bD bD bO •H H -P -p -P -P -P -P U -P U C S3 •H 45 H C c S3 a S3 ■H & •ri ?S •H •H to -P ctJ •H ■H •H •H •H ^ 5m CO O ■>—' o o O O o O H O ft -P -P < S ft ft ft ft ft ft ft CO CO 51 CD XI -P ft O CO Sn O •P cc? U cc5 ft CD W «5 C O -P CJ * oj M o >> CD ctJ 43 s O CD CD CO e CD 43 H -P ft •N O CD H CO 42 •P ct3 3 +3 ft co CD CD O ft O cti . CD H U • oj VD CO CD 3 H 53 42 •H 5 F! FH ?h O co a H ft • CO 43 CD CJ tH •H -P 43 -H £ -P C £3 CD •H CD CO H -P P S3 ctf CD -P S ft bD CD CD CJ CO CJ o3 CD 43 Jn -P CD 43 £3 -p M O 52 If the first segment is a pointer pair , then the next segment must be a string or a pointer pair expression , that is, type 5 or 0, or type 3 or k. In all of these cases, PATCH is simply called to perform the indicated changes in text. Recall that the pointer pair of the first segment may be replaced by a string expression , that is, type 5 or 6. In this case, MATCH is called to find the pointer pair , and then the above procedure is used. Algorithm 6.1 - DECODE ASSIGNMENT BIN <- 1 Tl <- TYPE_CHECK (BUFFER, BIN) if Tl = 1 then begin TEMPI and return true if the character tested is alphabetic or numeric, false otherwise. GET_POINTER and GET_NUMBER are also described in Chapter 3- GET_POINTER returns the index into the POINTER array of its argument, which should be a single alphabetic character. GET_NUMBER returns the numeric value of any string of digits which is ended by a non-numeric character. The digits may be preceded by blanks and/ or plus or minus. TEXT_TYFE and TEXT_POSition must be stated. NEXT_CHAR places the first non-blank character in the BUFFER at or after BIN into the variable CHAR. INS_CHAR places the current character of the BUFFER (that is, BUFFER (BIN)) into the current position of the EDITJBUFFER, and places the next character of the BUFFER into CHAR. ARG_CHECK checks the argument of span or not , which must be in the form of blank arg blank , and places the argument in the EDIT BUFFER beginning at BIN. ARG CHECK 57 may be called with an actual parameter of 1 or 2: (l) indicates that the argument of span is "being checked, and thus not and its valid arguments will be accepted; (2) means that the argument of not is being checked, and not is not a valid argument. Algorithm 6.2 - TYFE_CHECK (BUFFER, BIN) EBIN <- 1 EDIT BUFFER <- all blanks TEMP_SUM <- P *- PI <- P2 <- LINE_OFFSET «- CHAR_OFFSET *- FINISHED <~ NUM «- PTR <- PTR_PR <- SIR <- EXP *- PL <- MIN «- false while not FINISHED do begin NEXT_CHAR if CHAR = quote then begin STR «- true while CHAR = quote do begin INS_CHAR while CHAR ^ quote do INS_CHAR INS_CHAR end end else if (NUM_TEST (CHAR) or CHAR = left paren) then begin NUM «- true LINE_OFFSET «- LINE_OFFSET + GET_NUMBER (BUFFER, BIN) while NUM_TEST (CHAR) do INS_CHAR NEXT CHAR 58 if CHAR = left paren then begin INS_CHAR I «- GET_NIMBER (BUFFER, BIN) if I = then error CHAR_OFFSET «- CHAR_OFFSET + I NEXT_CHAR if CHAR = plus or CHAR = minus then INS_CHAR NEXT_CHAR while NUM_TEST (CHAR) do INS_CHAR HEXT_CHAR if CHAR ^ right paren then error INS_CHAR end end else if ALPH_TEST (CHAR) then begin I <-0 while ALPH_TEST (CHAR) do begin INS_CHAR I <- I + 1 end if I = 1 then begin P «~ GET_POINTER (EDIT_BUFFER (EBIN - l)) PTR <- true end 59 if I = 2 then "begin PI ^GET_POINTER (EDIT_BUFFER (EBIN - 2)) P2 <- GET_POINTER (EDIT_BUFFER (EBIN - l)) if GT (POINTER (Pi), POINTER (P2)) then error PTR_PR «- true end if I = 3 then if EDITJBUFFER (EBIN - 3 to EBIN - l) = 'art', 'num 1 , or 'tab' then EXP *- true else if EDIT_BUFFER (EBIN - 3 to EBIN - l) = 'not' then ARG_CRECK (2) else error if I = k then if EDIT_BUFFER (EBIN - 3 to EBIN - l) = 'alph' then EXP «- true else if EDITJBUFFER (EBIN - 3 to EBIN - l) = 'span' then ARG_CHECK (l) if I > h then error end else if (CHAR = CR and not PL or MIN) then FINISHED <- true else error if not FINISHED then begin NEXT CHAR 6o if (CHAR = equal, CR, or comma) then begin if CHAR ■ equal then BIN «- BIN + 1 FINISHED «- true EDIT_BUFFER (EBIN) «- CR end else if ((CHAR = plus or minus) or BUFFER (BIN then begin if CHAR = plus then PL «- true if CHAR = minus then MIN «- true if not (PL or MIN) then begin BIN <- BIN - 1 BUFFER (BIN) <- plus end INS_CHAR end else error end end DECODE <- RET <- - 1 if EXP then DECODE <- DECODE + 1 if STR then DECODE ^DECODE + 2 if PTR_PR then DECODE ^-DECODE + k if PL then DECODE <- DECODE + 8 if MIN then DECODE ^DECODE + 16 if NUM then DECODE *- DECODE + 32 1) = blank) 61 if PTR then DECODE <- DECODE + 6k if DECODE = then RET «- if DECODE = 6k then RET *- 1 if DECODE = 32, 112, or 10U then RET if DECODE = i| then RET <- 3 if DECODE = 12 or 14 then RET <- U if DECODE = 2 then RET <- 5 if DECODE = 11 then RET <- 6 if RET = - 1 then error if PTR then TEMP_SUM «- POINTER (P) else TEMP_SUM <- BEGIN_TEXT if NUM then begin INC_LINE (TEMP_SUM, LINE_OFFSET) INC_CHAR (TEMP_SUM, CHAR_OFFSET) end return (RET) Algorithm 6-3 - NEXT_CHAR CHAR <- BUFFER (BIN) while CHAR = blank do begin BIN <- BIN + 1 CHAR ^-BUFFER (BIN) end 62 Algorithm 6.U - INS_CHAR EDITJBUFFER (EBIN) *- BUFFER (BIN) EBIN <-EBIN + 1 BIN <- BIN + 1 CHAR «- BUFFER (BIN) Algorithm 6-5 - ARG_CHECK (MODE) J <-- if CHAR = "blank then INS_CHAR else error if ALPH_TEST (CHAR) then while ALFH_TEST (CHAR) do begin INS_CHAR J «- J + 1 end else INS_CHAR if (J = 3 and EDIT_BUFFER (EBIN - 3 to EBIN - l) = 'not' and MODE = l) then ARG_CHECK (2) else begin if ((J = 3 and EDIT_BUFFER (EBIN - 3 to EBIN - l) = 'num.') or (J = k and EDIT_BUFFER (EBIN - k to EBIN - l) = 'alph') or (J < 2)) then EXP <- true else error if CHAR = blank then INS_CHAR else error end 63 6.3- MATCH As previously explained, MATCH (PTR1, PTR2) finds the first occurrence in TEXT after pointer A, of a string which matches the string or string expression in the EDITJBUFFER . PTR1 and PTR2 are then given the text positions of the beginning and end of the text string. Most of the work of MATCH is done "by the procedure MATCH_CHAR . This compares the TEXT character at TEXT_POS with whatever is in the EDIT_BUFFER at EBIN, and returns true if the character matches, false otherwise. In addition, the variables EBIN, TEXT_POS, TEXT_MODE, and LINE_POS are updated so that MATCH_CHAR may be immediately recalled. The logical variable INBOUND S is also maintained so that it is true if the character denoted by TEXT_POS is not beyond Z in the text, false otherwise. MATCH functions basically as follows: it repeatedly uses MATCH_CHAR to find a possible first character of the string. When a match is found, its position in TEXT is saved and MATCH_CHAR is used to try to match the subsequent TEXT characters to the expression in the EDIT BUFFER. If the entire expression can be matached, then STRING_MATCHED is set to true in MATCH_CHAR; otherwise, a new first character is searched for, starting with the position after that of the present first character. First some initializations must be done. Since the match must occur in the text between pointers A and Z, we must ensure that A actually denotes a place in TEXT before Z; this is also necessary because the check used to update INBOUND S is an equality , and if the text match were started after pointer Z it might never terminate . MATCH_START always denotes the position in TEXT of the first character of the match attempt; 64 TEXT POS denotes the position in TEXT of the character which is currently- being matched. MATCH_START is initialized to the position before the start of the first match attempt to obey the conventions of initializing TEXT POS later on. STRING_MATCHED becomes true when the entire expression in the EDIT_BUFFER is matched. INBOUNDS becomes false when MATCH_CRAR attempts to match a text character located beyond Z. LINE_POS and LINE POS SAVE denote offsets of characters from the beginning of a line; and are maintained for use with the tab expression. LINE_POS_SAVE denotes the offset from the beginning of the line of the character denoted by MATCH_START. LINE_POS denotes the offset of the character denoted by TEXT_POS. LINE_POS_SAVE is initialized by decrementing TEXT_POS until a CR is reached and counting the number of decrements. Since the characters of the line number were included in the count, and should not have been, LINE_NO_LENGTH is subtracted. Now LINE_POS_SAVE may be used to determine if MATCH_START is positioned at a line number. If so, (hopefully this will not be possible) MATCH_START is moved forward to the beginning of the actual text of the line . Now the main loop of the program is entered, which will not be exitted until the expression in the EDITJBUFFER is matched or the starting position of the match attempt is out of bounds. First, TEXT_POS and LINE_POS are initialized or restored to the state of the first character of the previous match attempt. Next, MATCH_CRAR is called until one character is matched. After each unsuccessful call to MATCH_CEAR, the EDIT_BUFFER must be returned to its initial state. After a successful match, MATCH_START and LINE_POS_SAVE must be updated to contain the correct information about the beginning of the match. Now, MATCH CHAR is called 65 repeatedly until the string is matched, or the match attempts are out of bounds, or a match fails. Last, INBOUNDS is set to true unless MATCH START was out of bounds. Although this is technically part of the initialization at the beginning of the main loop, it must be done at the end because INBOUNDS controls the loop. Before returning PTR1, PTR2, and POINTER (A) must be assigned their new values if the MATCH was successful Algorithm 6.6 - MATCH (PTR1, PTR2) if GT (POINTER (A), POINTER (z)) then error MATCH_START <- TEXT_POS <- PTR (A) DEC_CHAR (MATCH_START, l) STRING_MATCHED <- false INBOUND S <- true LINE_POS_SAVE <- while TEXT_POS ^ START_TEXT and TEXT (TEXT_POS) ^ CR do begin DEC_CHAR (TEXT_POS, l) LINE_POS_SAVE <- LINE_POS_SAVE + 1 end LINE_POS_SAVE <- LINE_POS_SAVE - LINE_NO_LENGTH - 1 if LINE_POS_SAvE < then begin INC_CHAR (MATCH_START, - LINE_POS_SAVE ) LINE_POS_SAVE *- end 66 while not STRTNG_MATCHED and INBOIMDS do begin TEXT_POS <- MATCH_START LINE_POS «- LINE_POS_SAVE TEXT_MODE «- false while not MATCH_CHAE and INBOUND S do begin EBIN <- 1 TEXT_MODE «- false end MATCH_START <- TEXT_POS LINE_POS_SAVE «- LINE_POS while not STRING_MATCHED and INBOUNDS and MA.TCH_CHAR do (null statement) if TEXT_POS rf. MATCH_START then IKB01M)S <- true end if STRING_MATCHED then begin POINTER (A) <- PTR1 «- MATCH_START INC_CHAR (POINTER (A), l) PTR2 ^TEXT_POS end return (STRING_MATCHED) Most of the work of MATCH_CHAR is done by TEXTMODE_MATCH and UNTEXTMODE_MATCH ; which one is called depends on the state of TEXTMODE . TEXTMODE may change if EDITJBUFFER (EBIN) = quote, and if this is the case, it is quite easy for one routine to call the other in order to complete the match in the correct mode. MATCH_CHAR itself, then, in addition to calling the correct matching routine, updates INBOUNDS, 67 TEXT_POS and LINE_POS. EBIN, TEXTMODE, and STRING_MATCHED are updated in the text mode match routines. First, the INBOUNDS check is made. Since TEXT_POS is incre- mented before the matching is attempted, any character that is matched will be out of bounds if TEXT_POS is already the same as POINTER (z) . However, if all characters have already been matched, then STRING MATCHED will be set to true on this pass, and the match will be in bounds. Next, LINE POS is updated. If the beginning of a line has been reached then LINE POS is returned to 1 and TEXT_POS is made to skip over the line number. Otherwise, LINE_POS is simply incremented. Next, TEXT_POS is updated, and finally, the correct matching routine is called. Algorithm 6.7 - MATCH_CHAR if TEXT_POS = POINTER (z) then INBOUNDS *- false if TEXT (TEXT_POS) = CR then begin LINE_POS <- 1 INC_CHAR (TEXT_POS, LINE_NO_LENGTH) end else LINE_POS <- LINE_POS + 1 INC_CHAR (TEXT_POS, l) if TEXTMODE then RET «- TEXTMODE_MATCH else RET <~ UNTEXTMODE_MATCH return (RET) TEXTMODE_MATCH is a relatively simple routine. First, it checks for a return to not TEXTMODE ~oy testing if the current character of the EDIT BUFFER is a quote. If this is the case, the next character of the 68 EDIT BUFFER must be checked as well, since one quote is entered in text by typing two quotes. If the second quote is found, then the match depends on whether TEXT (TEXT_POS) is a quote. Otherwise, TEXTMODE becomes false, and the match depends on the results of UNTEXTMODE_MATCH . If no quote was found in the first place, then the match depends on whether EDIT BUFFER (EBIN) is the same as TEXT (TEXT_POS) . Algorithm 6.8 - TEXTMODE_MATCH if EDITJBUFFER (EBIN) = quote then begin EBIN <- EBIN + 1 if EDITJBUFFER (EBIN) = quote then begin RET <- (TEXT (TEXT_POS) = quote) EBIN ^-EBIN + 1 end else begin TEXTMODE <- false RET <- UNTEXTMODE_MATCH end end else begin RET <- (TEXT (TEXT_POS) = EDIT_BUFFER (EBIN) ) EBIN ^EBIN + 1 end return (RET) 69 UNTEXTMODE_MATCH is slightly more complicated than TEXTMODE_MATCH . No error checking is done because it is assumed that all errors were discovered in TYFE_CHECK. Recall also that all blanks have been removed in TYPE CHECK except the blanks on each side of the argument of span and not . UNTEXTMODE_MATCH uses several procedures: ALPHJTEST and NUM TEST were described in Chapter 3 and return a logical value which depends on whether the single character tested is alphabetic or numeric, respectively; ARG MATCH attempts to match an argument of span or not , and is similar to the ARG_CHECK procedure of TYPE_CHECK. When UNTEXTMODE_MATCH is called, there are three possibilities for the part of the expression at position EBIN in the EDIT_BUFFER: (l) EBIN may be positioned past the entire expression in the buffer; in this case the entire expression must have been matched, and thus STRING_ MATCHED is set to true. (2) EBIN denotes the beginning of the buffer (note that EBIN can equal only when the first expression in the EDIT_ BUFFER was tab , or a span whose argument was matched), or (3) The char- acter denoted by EBIN is a plus which concatenates the preceding part of the expression with whatever is to come. In either case, EBIN is updated so that it denotes the first character of an expression which must be matched. This expression may take one of two possible forms: (1) it may be a string, in which case TEXTMODE is set to true, EBIN is updated, and the results of the match depend on the results of TEXTMODE_ MATCH; (2) it may be one of the six expressions described in section l.k. For all expressions except span and tab, EBIN is incremented to denote the first character past the expression, which should be a plus, or a CR denoting that the expression is terminated. In addition, the 70 following actions are taken: arb - the match always succeeds; alph - the match depends on whether TEXT (TEXT_P0S) is alphabetic; num - the match depends on whether TEXT (TEXTJPOS) is numeric; tab - the tab list is searched until a tab setting which is not less than LINE_P0S is found. If none is found, then the match fails. Otherwise, the match depends on whether the end of the line has not been reached, since all tab stops are considered on the same line, and blank characters may not be inserted by the expression tab . Like all of the expressions matched with MATCH_CHAR, tab matches only one character of text. Therefore, when LINE_P0S reaches a tab stop, the MATCH is completed and EBIN is incremented to the character beyond 'tab'. Otherwise, the match must be continued in the next call to MATCH CHAR, and so EBIN is decremented to the character preceding 'tab'. Span and not both call ARG_MATCH to perform the match. Algorithm 6.9 - UNTEXTMODE_MATCH if EDIT_BUFFER (EBIN) = blank or CR then STRING_MATCHED «- RET *- true else if EDIT_BUFFER (EBIN) = plus or EBIN = or 1 then begin if EBIN 7/ 1 then EBIN <- EBIN + 1 if EDIT_BUFFER (EBIN) = quote then begin TEXTMODE <- true EBIN <- EBIN + 1 RET <- TEXTMODE_MATCH end else EDIT BUFFER (EBIN to EBIN + ) case of 71 'arb' : RET «- true EBIN <- EBIN + 3 'alph': RET *- ALPHJTEST (TEXT (TEXT_POS)) EBIN «- EBIN + k 'nura': RET <- NUM_TEST (TEXT(TEXT_POS) ) EBIN <- EBIN + 3 'tab' : P <- TABS while P/A and INFO (P) < LINE_POS do P *-RLINK (P) if P = A then RET «- false else begin RET <- (TEXT (TEXT_POS) ^ CR) if LINE_POS = INFO (P) then EBIN «- EBIN + 3 else EBIN «- EBIN - 1 end 'span': RET «- ARG_MATCH (l) 'not': RET <- ARG_MATCH (2) end return (RET) ARG_MATCH is used to match the arguments of span and not , since they are allowed to have the same arguments, and ARG_MATCH can be re-called if span has the argument not . ARG_MATCH has one parameter, MODE, which has the value one (l) if ARG MATCH was called for span, and two (2) if for not . 72 If span's argument is matched, then the match of span will be continued for the next character of TEXT in the next call to MATCH_CHAR. SAVE records the position to which EBIN is to be returned if the span is continued. In addition, EBIN is incremented for span since "span" is one letter longer than "not". Alph and num are checked for, next, and their actions are the same as those taken in UNTEXTMODE_MATCH . Next, not is checked for. In this case, ARG_CHECK must be re-called, but first EBIN must be updated to denote the 'n' in 'not' as expected in ARG_CHECK. If none of the above were encountered, then the argument must be a single character, and the match depends on whether TEXT (TEXT_POS) contains this character. At this point, the return value has been determined, but it must undergo changes which depend on the MODE. If ARG_CHECK was called for not then the match should succeed only if the argument was not matched; therefore, RET is negated. On the other hand, span should always match, although it is allowed to match no (zero) characters. If a character is matched, then EBIN should be updated so that the span will be continued; if no character is matched, then EBIN is already in position to begin matching the next part of the expression. However, the character denoted by TEXT_POS must be either matched or not matched before a return is made; UNTEXTMODE_MATCH is called to accomplish this. Algorithm 6.10 - ARG_MATCH (MODE) if MODE = 1 then begin SAVE ^-EBIN - 1 EBIN ^EBIN + 1 end 73 if EDIT_BUFFER (EBIN + k to EBIN + 7) = 'alph' then begin RET ^ALPH_TEST (TEXT (TEXT_POS)) EBIN «- EBIN + 9 end else if EDIT_BUFFER (EBIN + k to EBIN + 6) = 'num' then begin RET <-NUM_TEST (TEXT (TEXT_POS)) EBIN <- EBIN + 8 end else if EDIT_BUFFER (EBIN + k to EBIN + 6) = 'not' then begin EBIN <--EBIN + k RET ^-ARG_CHECK (2) end else begin RET *- (EDIT_BIFFER (EBIN 4- k) = TEXT (TEXT_POS)) EBIN «- EBIN + 6 end if MODE = 1 then if RET then EBIN <- SAVE else RET *- UNTEXTMODE_MATCH else RET «- not RET return (RET) 7^ 6.U. PATCH PATCH and the append command are the only means used by the editor to actually change text. The append command is handled in a manner very similar to PATCH, but append is very restricted, and thus much simpler. PATCH changes the text by two methods: (l) internal pointers are inserted in TEXT, which allows the text to be traversed in a non-linear fashion; (2) new text may be inserted in the part of the TEXT array that does not presently contain text. This new text can be connected to any point of the present text by use of the pointers in (l). The precise information of how to change text is given to PATCH from two sources: (l) the beginning point and ending point in text, of the patch, are passed to the routine in PTR1 and PTR2; (2) the rest of the patch is specified in the EDITJBUFFER in the form of a pointer pair expression, a pointer pair, a string, or nothing. Most of the work of PATCH is done by calling lower level routines. DOUBLE_LINK (PTR1, PTR2) places two in-text links: one at PTR1 which denotes position PTR2, and one at PTR2 which denotes position PTR1. APPEKD_STRING places a string from the EDIT_BUFFER into the free part of TEXT. Upon invocation of APPEND_STRING, EBIN should denote the first character past the quote which delimits the string. Upon return, EBIN will denote the character after the final quote of the string. If two quotes together are encountered in the string, one quote is inserted in TEXT. REMOVE_LINE_ NUMBER3_FR0M (PTR1, PTR2) erases the in-text line numbers of the lines which contain PTR1 and PTR2 and all lines between these lines. This information must be updated to be used in the creation of the correction set when the file is closed. 75 PATCH "works basically as follows: it is given an expression of the form P n P = P_P. + P C P^ + ... + P ,P 12 3 4 56 n-1 n where P and P are passed in PTR1 and PTR2, and any of the other pointer pairs may be considered a string; also there are no blanks except inside strings. The object of PATCH is to link points in text: P to P , P, to P , ¥s to P , and so on until the patch is completed by linking P to P . If P-P^ were actually a string, then P, would be linked to FREE_TEXT, the string would be placed in the TEXT at FREE_TEXT, FREE_TEXT would be moved to the character position past the last character inserted, and then FREE_TEXT would be linked to P . If the assignment had no right side (that is, the text between P and P was to be deleted), then P would simply be linked to P . This linking is done iteratively in a simple loop, in which TEMPI always denotes the first position in text to be double linked, and TEMP2 the second. Thus TEMPI is initialized to PTR1, and then is assigned the value of the second pointer in a pointer pair, or the position in TEXT of the last character of a string that was inserted into the free part of TEXT. TEMP2 is always the value of the first pointer of a pointer pair, or when a string is to be inserted, FREE_TEXT . Inside the loop, two cases are considered: if the part of the expression currently being processed is a string, then a double link is placed between TEMPI and FREE_TEXT, the string is placed in TEXT, and TEMPI is updated to denote the last character of the string inserted; otherwise, a pointer pair is being processed. In this case, TEMP2 is assigned the 76 value of the first pointer, a double link is placed between TEMPI and TEMP2, and TEMPI is updated to contain the value of the second pointer. The line numbers are then removed from the text denoted by the pointer pair TEMP2 - TEMPI- The only remaining problem occurs when a patch of the form RjP^ = P P^ + P 1 P 2 or P 1 P 1 = string is to be made where POINTER (P ) = BEGINJEEXT, or POINTER (PU) = ENDJTEXT (or POINTER (Pi) = ENDJTEXT in the second case). In the first case, we wish P , or the first character of the string, to become the beginning of TEXT. Thus, BEGIN_TEXT is assigned TEMP2 and the double link is omitted. In the second case, we wish P p ,or the last character of the string, to become the end of text. Thus, ENDJTEXT is assigned TEMPI, and the double link is omitted. The loop is continued is a concatenating plus is found, indicating that there is more to be processed in the EDIT BUFFER. To terminate, the final link is placed, and the result of the patch is displayed if the editor is in verify mode. Algorithm 6.11 - PATCH (PTR1, PTR2) FINISHED <- false TEMPI ^ PTR1 EBIN ±- 1 while not FINISHED do begin 77 if EDIT_BUFFER (EBIN) = quote then begin if TEMPI = BEGIN_TEXT then BEGrN_TEXT <- FREEJTEXT else DOUBLE_LINK (TEMPI, FREEJTEXT) EBIN ^-EBIN + 1 APPEND_STRING TEMPI ^-FREE_TEXT - 1 end else "begin TEMP2 f- POINTER (GET_POINTER (EDIT_BUFFER (EBIN) ) ) if TEMPI = BEGIN_TEXT then BEGIN_TEXT *- TEMP2 else DOUBLE_LINK (TEMPI, TEMP2) EBIN <- EBIN + 1 TEMPI «- POINTER (GET_POINTER (EDIT_BUFFER (EBIN) ) ) end FINISHED <- (EDIT_BUFFER (EBIN + l) ^ plus) EBIN ^ EBIN + 2 end if PTR2 = END_TEXT then END_TEXT <- TEMPI else DOUBLE_LINK (TEMPI, PTR2) if VERIFY then DISPLAY (PTR1, PTR2) REMOVE_LINE_NUMBERS_FROM (PTR1, PTR2) Up to this point in the discussion, the problem that pointers take up space in text has not been considered. The structure of the in-text pointer was described in section 2.1; note that POINTER LENGTH + 2 characters 78 will be displaced each time a pointer is placed in TEXT. However, we wish to retain all displaced text, because it is not certain that it will be deleted. Therefore, the following scheme is used. Suppose we wish to insert a pointer from PTR1 to PTR2 as shown in Figure 6 -1(a). i- ■ ' ! lit' ttti LL I i— /////////// 1 1 1 1 1 1 i i » i i •{ r i I i i i i if r i it T r PTR1 PTR2 r FREE TEXT Figure 6.1(a). Placing an In-Text Link That is, we wish the text to appear to the user as if the character denoted by PTR2 is the first character following that denoted by PTR1; FREE_TEXT denotes the first unused position of TEXT. Wow ( POINTER_LENGTH + 2) characters will be displaced; however, they can be placed in the free part of TEXT, and linked to whatever text was on either side. To do this, place a link in TEXT, (displacing the character from PTR1 - POINTER_LENGTH - 1 to PTRl) which denotes (FREEJTEXT + POIWTER_LENGTH +2). Then place the displaced text in TEXT (FREE_TEXT + POIWTER_LENGTH + 2 to FREEJTEXT + 2 * POINTER_LEWGTH + 3). Finally, place a link starting at FREEJTEXT which denotes (PTRl - POINTER_LEWGTH - 2) and update PTRl and FREEJTEXT . At this point, the first half of the link has been placed, and the configuration 79 of the text is shown in Figure 6.1(h). Note that the new PTR1 denotes the same character as the old PTR1. Now a link must still be placed between PTR1 and PTR2 . A link is placed in TEXT (PTR2 to PTR2 + POINTER_LENGTH + l) which denotes (FREEJTEXT + POINTER_LENGTH + l) . Again, the second link is placed in TEXT (FREEJTEXT + POINTER LENGTH + 2 to FREEJTEXT + 2 * POINTER_LENGTH + 3) (remember that we are referring to the new FREEJTEXT), which denotes PTR2 + POINTER_LENGTH +2. The displaced text is placed contiguous to the text displaced by the first set of links in FREE TEXT. The result is shown in Figure 6.1(c). (a) FREE TEXT (b) FREE TEXT POINTER_LENGTH + 2 characters displaced by in-text pointer POINTER_LENGTH + 2 each (1) in-text pointer (2) displaced characters Figure 6.1(b). The First Half of the Double Link 80 (a) PTR1 (b) PTR2 (a) FREE TEXT (b) (c) PTR2 (b) FREE TEXT (c) FREE TEXT Figure 6.1(c). The Second Half of the Double Link 81 PTRl PTR2 FREE TEXT Figure 6.1(d). The Final Result Without Intermediate Stages In the example, note that TEXT is shown as going all the way to FREE_TEXT (a), that is, FREE_TEXT at the beginning of the example. When this is the case, END_TEXT must point to the last character before FREE_TEXT. Usually, the text just before FREEJFEXT will be linked back to some interior part of TEXT, as at the end of this example. 82 There are several special cases which may be encountered in double linking which were not mentioned in the discussion above: (l) Part of the area in which characters are to be displaced to make room for a link already contains a link. If this is true of the link to be placed at PTR1, then the intruding pointer must go backwards in text, because otherwise PTR1 would be in an area that has previously been deleted from the text that can be traversed, and it is assumed that the user will protect himself from this. Therefore, the intruding link is traversed, and its counterpart that points forward is changed to denote the part of text to which we were originally attempting to place a link. The characters displaced before the intruding link was found are left in their positions after FREE_TEXT. This process is illustrated in Figure 6.2. The case of an intruding link at PTR2 is handled in a similar manner. (2) PTR1 or PTR2 is the same as FREE_TEXT. This is the case when text is being inserted. If PTR1 is the same as FREEJTEXT, then it need not be linked to FREE_TEXT; therefore, only the second half of the linking, that is the link between FREEJTEXT and PTR2, need be done. When PTR2 is the same as FREEJTEXT, only the first part of the linking is done. (3) PTR1 points to FREEJTEXT - 1. This case appears only when ENDJTEXT points to FREEJTEXT - 1 and text is being inserted at the end of the file. This is handled the same as case (2). 83 i i ft i i i i . i i i i i i i i i i i i i i t . ; /////// #; FTRl intruding link displaced characters (before linking} \ FREE TEXT (after linking) less than LINE_N0_LENG1 displaced characters FREE TEXT PTR1 Figure 6.2. Special Case of the Intruding Link 8U Algorithm 6.12 - DOUBLE_LINK (PPR1, PTR2) if PTR1 = PTR2 - 1 then return if PTR1 = PTR2 then error SAVE <-FREE_TEXT if PTR1 ^ FREE_TEXT - 1 and PTR1 ^ FREE_TEXT then begin LINK *- false TEMP «- PTR1 LIM <- PTR1 - (POINTER_LENGTH + 2) SAVE *- FREE_TEXT + 2 * POINTER_LENGTH + 3 while TEMP > LIM and not LINK do "begin if TEXT (TEMP) = POINTER_DELLMETER then LINK «- true else "begin CHANGE_PO INTERS (TEMP, SAVE) TEXT (SAVE) «-TEXT (TEMP) SAVE *- SAVE - 1 TEMP <- TEMP - 1 end end if LINK then TEMPI <- TEXT (TEMP - k to TEMP - l) + 1 else TEMPI <- LIM + 1 INSERT_POINTER_IN_TEXT (TEMPI, SAVE) TEMP2 <- SAVE - ( POINTER_LENGTH + 2) TEMPI <- TEMPI - 1 INSERT_POINTER_IN_TEXT (TEMP2, TEMPI) SAVE <-FREE_TEXT FREE_TEXT <- FREE_TEXT + 2 * POINTER_LENGTH + k end 85 if PTR2 jl SAVE then begin LINK <- false TEMP <- PTR2 LIM <- PTR2 + POINTER_LENGTH + 2 TEMPI «- TEMP - 1 SAVE «-FREE_TEXT while TEMP < LIM and not LINK do begin if TEXT (TEMP) = POINTER_DELTMETER then LINK <- true else begin CHANGE_POINTERS (TEMP, SAVE) TEXT (SAVE) <- TEXT (TEMP) SAVE <- SAVE + 1 TEMP <- TEMP + 1 end end if LINK then TEMPI «~ TEXT (TEMP + 1 to TEMP + POINTER_LENGTH) INSERT_POINTER_IN_TEXT (SAVE, TEMPI) INSERT_POINTER_IN_TEXT (TEMPI + 1, SAVE - l) FREE_TEXT <- SAVE + POINTER_LENGTH + 2 end The procedure INSERT_POINTER_IN_TEXT simply inserts a pointer at the indicated position in TEXT, which denotes the destination. POINTER DELIMETER characters are placed on either side of the actual pointer. 86 Algorithm 6.13 - INSERT_POINTER_IN_TEXT (POS, DEST) TEXT (POS) «- POrNTER_DELIMETER TEXT (POS + 1 to POS + POINTER_LENGTH) <-DEST TEXT (POS + POINTER_LENGTH + l) <- POINTER_DELIMETER The procedure to change pointers from one position to another when a character has been displaced simply searches through the array of pointers, and changes the ones which denote the displaced character. Algorithm 6.lU - CHANGE_POINTERS (INIT, DEST) for 1=1 until (the number of pointers) do if POINTER (I) = INIT then POINTER (i) ^DEST APPEND_STRING is the procedure which inserts new text into the free part of the TEXT. The only problem for this procedure is in deter- mining when to insert a quote in TEXT. The characters of the string are inserted in TEXT until a quote is encountered. If the next character is quote, then a quote is inserted, and the procedure loops back to resume insertion of the string in text. Otherwise, the procedure is terminated. Algorithm 6.15 - APPEND_STRING FINISHED *- false while not FINISHED do begin while EDIT_BUFFER (EBIN) ^ quote do begin TEXT (FREE_TEXT) <- EDIT BUFFER (EBIN) FREEJEEXT «- FREE_TEXT + 1 EBIN <-EBIN + 1 end 87 EBIN ^EBIN + 1 if EDIT_BUFFER (EBIN) = quote then begin TEXT (FREE_TEXT) *- quote FREEJTEXT <- FREE_TEXT + 1 EBIN ^-EBIN + 1 end else FINISHED ^- true end The procedure to remove line numbers from a segment of TEXT denoted by two pointers first goes backwards in TEXT to remove the line number of the line in which the first pointer denotes a position. The text between the two pointers is then traversed, and all line numbers encountered are replaced by blanks. Since TEXT must be traversed character by character, it is easy to maintain a counter and do a margin check in case any line has been caused to exceed the desired limit by the patch. Algorithm 6.16 - REMOVE_LINE_NUMBERS_FROM (PTR1, PTR2) LENGTH «- TEMP ^- PTR1 while TEXT (TEMP) ^ CR do begin DEC_CHAR (TEMP, l) LENGTH ^- LENGTH + 1 end LENGTH *- LENGTH - LINE_NO_LENGTH do I = 1 to LINE_NO_LENGTH INC_CHAR (TEMP, l) TEXT (TEMP) ^ blank end 88 TEMP «- PTR1 while TEMP ± PTR2 do begin while TEMP ± PTR2 and TEXT (TEMP) ^ CR do begin INC_CHAR (TEMP, l) LENGTH «- LENGTH + 1 end if TEXT (TEMP) = CR then begin if LENGTH > MARGIN then error LENGTH <- do I = 1 to LINE_NO_LENGTH INC_CHAR (TEMP, l) TEXT (TEMP) «- blank end end end while TEXT (TEMP) ^ CR do begin INC_CHAR (TEMP, l) LENGTH <- LENGTH + 1 end if LENGTH > MARGIN then error 89 7- EXECUTE MACRO 7.1. The Overall Macro Execution The procedure to execute a macro consists of four parts. After using GET COMMAND (described in Chapter 3) to get the name of the macro, the tree containing the macro names is searched. If the name is not found in the tree, then the procedure immediately returns a value of false, indicating that a macro was not executed. Second, upon finding the macro name, the procedure uses TYEE_CHECK to make sure the types of the parameters agree as stated in Table 1.1, as follows. The formal parameters are at the beginning of the macro, which starts at a position in MACRO_TEXT that is specified by the MACRO_POS field of the node containing the name of the macro. Therefore, this first part of the text is scanned until a MACRO_ PARAM MARKER is encountered. The formal parameter than follows, and so the types are checked. This process is repeated until a CR is encountered in the MACRO_TEXT; this signifies the end of the formal parameters. In the third section of the procedure, the macro is copied into the free part of AUXJIEXT, substituting the actual parameters for the formal ones. This is done by the procedures COPY_TO_FORMAL_PARAMETER and COPY ACTUAL_PARAMETER . The process is completed when a MACRO_END_MARKER is encountered in COPY_TO_FORMAL_PARAMETER. In the fourth section, the macro is executed by calling EDIT_LINE until the MACRO_END_MARKER is again encountered. Before the fourth section, AUX_POS must be reinitialized to the beginning of the macro, and FREE_AUX_TEXT must be made to denote the first character position after the macro, since the execution of the macro may cause more information to be placed in AUX TEXT. After the 90 execution of the macro, FREE_AUX_TEXT is returned to its original position, thus effectively discarding the copy of the macro. Since the original macro in MACRO_TEXT was not changed, we need not save any information from the copy. Algorithm 7.1 - EXECUTE_MACRO BIN «- 1 GET_COMMAND P <- MACROS while P/A and INFO (P) ^ COMMAND do if INFO (P) > COMMAND then P «- LLINK (P) else P <-RLINK (P) if P = A then return (false) FINISHED <- false POS *-MACR0_P0S (P) while not FINISHED do begin while MACRO_TEXT (POS) ^ MACRO_PARAM_MARKER or CR do POS <- POS + 1 POS «- POS + 1 if MACRO_TEXT (POS - l) = CR then FINISHED «- true else begin Tl <- TYPE_CHECK (MACRO_TEXT, POS) 12 ^TYPE_CHECK (BUFFER, BIN) if Tl = 0, 2, k, 5, or 6 then error else if Tl = 1 then begin 91 if T2 = 3, ^, 5> or 6 then error end- else if Tl = 3 then if T2 = 0, 1, or 2 then error end end FINISHED <- false AUX_P0S <- FREE_AUX_TEXT while not FINISHED do begin C FY_T0_F 0EMAL_PARAME TER COPY_ACTUAL_PARAMETER end MACR0_START <- FREE_AUX_TEXT FREE_AUX_TEXT «- AUX_P0S AUX_P0S <- MACR0_START while AUX_TEXT (AUX_P0S) ^ MACRO_END_MARKER do EDIT_LINE (AUX_TEXT, AUX_P0S) FREE_AUX_TEXT <- MACR0_START return (true) 7-2. C 0PY_T0_F0RMAL_PARAMETER This is a simple routine which places all characters of MACR0_TEXT up to the next MACRO_PARAM_MARKER into AUX_TEXT . If a MACRO_END_MARKER is encountered, FINISHED is set to true and the procedure is terminated. 92 Algorithm 7-2 - COPY_TO_FORMAL_PARAMETER while MACRO_TEXT (POS) / MACRO_PARAM_MARKER and not FINISHED do begin AUXJTEXT (AUX_POS) «- MACRO_TEXT (POS) if MACRO_TEXT (POS) = MACRO_END_MARKER then FINISHED '«- true AUX_POS <-AUX_POS + 1 POS <- POS + 1 end 7.3. C OPY_AC TUAL_ PARAMETER This procedure consists of three sections: first, the formal parameter is located in the formal parameter list; second, the corresponding actual parameter is located; and third, the actual parameter is placed in AUX_TEXT . Before the parameter is located in the formal parameter list, it is checked to make sure that it consists of either one or two alphabetic characters, that is, it is a pointer or pointer pair. In this process, POS is set to denote the first character after the parameter, which is the value that it must have upon return. PARAM_POS is thereafter used to denote the position of the parameter in MACRO_TEXT. SEARCH is used to denote the position currently being searched in the formal parameter list . COUNT denotes the number of formal parameters that have already been found not to match. Note that the parameters are searched in a linear fashion; this is not inefficient because a list must contain between ten and fifteen elements before more powerful searching methods, such as binary search, begin to save time . 93 The parameter is located in the actual parameter list by finding st the (count - 1) comma which is not enclosed in quotes. To insert the actual parameter in AUX_TEXT, a check is first made to see is the actual parameter is type 0. If so, then ' (_/ C u ', a reference to the cursor pointer, is placed in AUX_TEXT. Otherwise, the characters up to the next comma which is not enclosed in quotes are inserted. Algorithm 7-3 - COPYACTUAL_PARAMETER if not FINISHED then begin POS «- POS + 1 PARAM_P0S <- POS while ALPH_TEST (MACR0_TEXT (POS)) do POS <- POS + 1 if POS = PARAM_P0S or POS > PARAM_P0S + 2 then error COUNT *- SEARCH <- MACRO POS (P) while not FINISHED do begin while MACRO_TEXT (SEARCH) ^ MACRO_PARAM_MARKER or CR do SEARCH <- SEARCH + 1 if MACRO_TEXT (SEARCH) = CR then error SEARCH «- SEARCH + 1 while MACRO_TEXT (SEARCH) = blank do SEARCH <- SEARCH + 1 if MACRO_TEXT (SEARCH) = MACRO_TEXT (PARAM_P0S) then if PARAM_P0S = POS - 2 then begin if MACRO_TEXT (SEARCH + l) = MACRO_TEXT (PARAM_P0S + l) then FINISHED *- true end 9h else FINISHED «- true COUNT *- COUNT + 1 end BIN «- 1 FINISHED «- TEXTMODE <- false do I = 1 to COUNT - 1 while BUFFER (BIN) ^ comma or TEXTMODE do begin if BUFFER (BIN) = quote then TEXTMODE «- not TEXTMODE BIN <- BIN + 1 end end BIN <- BIN + 1 BIN_SAVE «-BIN T2 ^-TYFE_CHECK (BUFFER, BIN_SAVE) if T2 = then begin AUX_TEXT (AUX_POS) «- ' u C u ' AUX_POS ^AUX_POS + 1 end else while BUFFER (BIN) ^ comma or CR or TEXTMODE do begin AUX_TEXT (AUX_POS) *- BUFFER (BIN) AUX_POS ^-AUX_POS + 1 if BUFFER (BIN) = quote then TEXTMODE <- not TEXTMODE BIN <- BIN + 1 end end 95 8 . EXECUTE_COMMAND The procedure to execute a command consists of a few initializations, and then a specific set of actions taken for each command. These actions -will be described later in this chapter. The small set of actions taken for every command is described below. The return value is set to false. It will be set to true if a set of actions for a command is successfully completed. BIN is set to one so that the procedure will begin processing at the beginning of the BUFFER. GET_COMMAND (described in Chapter 3) is called to isolate the command, and the correct action for the command is taken. Algorithm 8.1 - EXECUTE_COMMAND RET «- false BIN <- 1 GET_COMMAND {the correct action is taken here} return (RET) 8.1. OPEN If a file is presently open, then an error is signaled. GET_COMMAND is called to isolate the name of the file, which is placed in FILE_NAME, so that the file may be identified when it is closed. Then a request to the system is made, to allow the file to be accessed through TEXT. This request is stated here as the command GET FILE (FILE NAME). BEGIN TEXT and 96 FREE TEXT are initialized. If a cycle number is encountered (GET_NUMBER is described in Chapter 3), then GET_COERECTION_SET_TEXT is invoked to create the desired cycle. TEMP_CORRECTIONS is initialized to denote the node in the list of corrections, of the cycle of the file being used. This is to be used when the correction set for the file being opened is created and linked to the rest of the list of corrections. Finally, OPEN and RET are set to true to indicate that a file has been successfully opened. Algorithm 8.2 - OPEN if COMMAND = 'open' then begin if OPEN then error GET_COMMAND FILENAME <- COMMAND GET_FILE (FILENAME) BEGIN_TEXT <- LINE_NO_LENGTH + 1 if GET_NUMBER then GET_CORRECTION_SET_TEXT else TEMP_CORRECTIONS <- CORRECTIONS FREE_TEXT <- END_TEXT + 1 OPEN <- RET <- true end GET_CORRECTION_SET_TEXT first calculates the number of the cycle which it is to construct from the number which was obtained in open . If the number is less than zero, then it is to be subtracted from the number 97 of the latest cycle; otherwise, it is the desired cycle number. Next, the linked list of correction sets is traversed from most recent to the desired cycle number, applying each correction set to the TEXT. After each correction set is applied, the new cycle may be accessed in AUX TEXT. Since we wish the new cycle to be in TEXT, so that APPLY_CORKECTIONS may be executed again, or so that the TEXT will be ready for editing, the statement TEXT^ AUXJTEXT is used to mean that the AUX_TEXT array should be renamed TEXT (and AUXJTEXT should access some other memory space, perhaps that of TEXT). This is intended to be implemented as a request to the system or an assembler language routine. If this is not possible, it can be circumvented by including TEXT and AUXJTEXT as parameters to APPLY_ CORRECTIONS. If the linked list of corrections is exhausted before the desired cycle is created, then an error is signaled. This is possible because the user may destroy some cycles. Finally, TEMP_CORRECTIONS is given its correct value . Algorithm 8-3 - GET_CORRECTION_SET_TEXT P «- CORRECTIONS if NUMBER < then CNUM <- CNIMBER (P) + NUMBER else CNUM «- NUMBER while CNUM < CNUMBER (P) and P / A do begin APPLY_CORRECTIONS (P) if CNUM J- CNUMBER (P) then P ^ LINK (P) TEXT «= AUXJTEXT end if P = A then error TEMP CORRECTIONS <- P 98 In order to understand the APPLY_CORRECTIONS procedure, recall the structure of a correction set: it consists of (l) a line number, (2) an 'I' indicating insertion or a 'D ' indicating deletion, and (3) if (2) was an insertion, then the text to be inserted follows. The material to be inserted is followed by an INSERTION J2ND_MARKER . TEXT POS and AUX_TEXT_POS are initialized to start processing at the beginning of TEXT and AUX_TEXT. AUX_TEXT_LINE is a counter which is used to place the line numbers correctly. CPOS is used to record the position in the correction set currently being applied. The corrections are applied line by line until the end of TEXT is reached. If the current line number of TEXT matches the line number of the current correction, then the correction is applied; otherwise, the line is placed in AUX_TEXT. To apply the correction, we first must check for insertion or deletion; if a deletion is indicated then the current line of TEXT is skipped by invoking GET_NEXT_LINE_OF_TEXT . If an insertion is indicated, this means that the new text is to be inserted before the current line. Therefore, the insertion is made by a transferral from CS_TEXT to AUXJTEXT, and then the current line is transferred from TEXT to AUXJTEXT. In either insertion or deletion, CPOS is updated to the next correction. After all of the corrections have been made, END__TEXT is given the new position of the last character of TEXT. GET_LINE_NUMBER, which is described in Chapter 3, returns a zero if only blanks are found; otherwise, it returns the line number. 99 Algorithm Q.k - APPLY_CORRECTIONS (P) TEXT_POS <-AUX_TEST_POS «- AUX_TEXT_LINE <- 1 CPOS <- CSPOS (P) while TEXT_POS < ENDJEEXT + 1 do begin if GET_LINE_MJMBER (TEXT, TEXT_POS, true) = 0, or = GET_LINE_NUMBER (CS_TEXT, CSPOS, false) then begin CPOS <- CPOS + LINE_NO_LENGTH + 1 if CSJTEXT (CPOS - l) = *D ' then GET_NEXT_LINE_OF_TEXT else begin INSERT_ C S_TEXT_IN_AUX_TEXT INSERT_TEXT_IN_AUX_TEXT end end else INSERT_TEXT_IN_AUX_TEXT end ENDJTEXT <- AUX_TEXT_POS - 1 GET_WEXT_LINE_OF_TEXT simply increments TEXT_P0S until it denotes a CR, or the end of text is reached. TEXT P03 is then incremented once more to get past the CR . Algorithm 8-5 - GET_EEXT_LINE_OF_TEXT while TEXT (TEXT_P0S) <£ CR and TEXT_P0S < END_TEXT do TEXT_P0S <-TEXT_P0S + 1 TEXT POS *- TEXT POS + 1 100 INSERT TEXT IN_AUX_TEXT first places the correct line number in AUX TEXT, and then uses REPLACEJTEXT (described in Chapter 3) to place the TEXT up to and including the next CR in AUX_TEXT . ENCODE converts an integer into the set of characters which correspond to the integer. Algorithm 8.6 - INSERT_TEXT_IN_AUX_TEXT TEXT_P0S «- TEXT_P0S + LINE_N0_LENGTH if TEXT_P0S < END_TEXT then begin AUX_TEXT (AUX_TEXT_P0S to AUX_TEXT_P0S + LINE_N0_LENG-TH - l) .<- ENCODE (AUX_TEXT_LINE ) AUX_TEXT POS <- AUX_TEXT_POS + LINE_NO_LENGTH REPLACE_TEXT (AUXJEEXT, TEXT, AUX_TEXT_POS, TEXT_P0S) AUX_TEXT_LINE <- AUX_TEXT_LINE + 1 end INSERT_CS_TEXT_IN_AUX_TEXT repeatedly places a line number in AUX_TEXT, and then places CS_TEXT, up to and including the next CR, in AUX_TEXT. After each line is inserted, the character after the CR is checked to see if it is an INSERTION_END_MARKER : if so, the procedure is terminated. Algorithm 8-7 - INSERT_CS_TEXT IN "AUX TEXT FINISHED <- false while not FINISHED do begin AUX_TEXT (AUX_TEXT_P0S to AUX_TEXT_P0S + LINE_N0_LENGTH - l) ^ENCODE (AUX TEXT LINE) 101 AUX_TEXT_P0S «- AUX_TEXT_P0S + LINE_N0_LENGTH AUX_TEXT_LINE <- AUX_TEXT_LINE + 1 REPLACE_TEXT (AUX_TEXT, CS_TEXT, AUX_P0S, CPOS) ' if CS_TEXT (CPOS) = INSERTION_END_MARKER then begin FINISHED «- true CPOS *- CPOS + 1 end end 8.2. Close If no file is open, then an error is signaled. LINEARIZE AND CORRECT is invoked, which returns a value of true if the file was changed since the last cycle, false otherwise. If the file was changed, then a request is made to the system to store the file. The correction set is then linked into the list of correction sets, and is given a cycle number. FREE_AUX_TEXT is returned to one (in case it was not already one), OPEN is set to false, and RET to true. Algorithm 8.8 - Close if COMMAND = 'close' then begin if not OPEN then error if LINEAR IZE_AND_CORRECT then begin PUT_FILE (FILENAME) CORRECTIONS <- CS LINK (CS) ^-TEMP CORRECTIONS 102 if TEMP_C0RRECTI0NS = A then CNUMBER (CS) «- 1 else CNUMBER (CS) «- CNUMBER (TEMP_C0RRECTI0NS) + 1 end FREE_AUX_TEXT *- 1 OPEN «- false RET «- true end The first action of LINEARIZE_AED_CORRECT should be to determine if a correction set should be created. If the text was the first cycle of the file, then no correction set should be created, because the correction set would consist of deleting every line in the file. The correction set is created by comparing 0LD_TEXT (which is a copy of the TEXT at the time the file was opened) to TEXT (which is the current copy of the text), and placing information about the differences between these into CS_TEXT. At the same time, TEXT is linearized, and placed in the AUXJTEXT, with line numbers given to each line . The correction set should contain the information needed to recreate the 0LD__TEXT from a copy of text that is exactly like the copy placed in AUX TEXT. To explain the actions taken in creating a correction set, consider the example in Figure 8.1. To create the TEXT, lines 1, h, 6, and 7 of 0LD_TEXT were deleted and four new lines were inserted (or four lines were changed) . When the TEXT is linearized and placed in AUXJTEXT, line numbers are placed so that line 1' is the same as line 2, line k 1 is the same as line 3, and line 5' is the same as line 5. To get a copy of 103 0LD_TEXT from AUX_TEXT, lines 2', 3', 6' and 7 * must be deleted, and lines 1, k, 6, and 7 must be inserted in their proper places. To do these insertions and deletions, the following notation will be used in CS TEXT: (l) line number I string - means insert string before the AUX TEXT line whose line number is specified; (2) line number D - means leave out the AUX_TEXT line whose line number is specified. OLD TEXT TEXT AUX_TEXT CS_TEXT 1 2 2 3 k 3 1' I'll 2' 2'D 3' 3'D 5 5' 6'D 6' 7'D 7' 8'l6-7 Figure 8.1. Creation of a Correction Set To create the correct configuration in CS TEXT for a specific file, pointers to corresponding lines in TEXT and AUX_TEXT are maintained in order to linearize TEXT, as well as to keep track of the positions in AUX TEXT as to where, in TEXT, insertions and deletions must be made in order to obtain 0LD_TEXT again. A pointer to 0LD_TEXT is maintained, which always contains the position of the line number of the first line 104 which has not been accounted for in the correction set. Thus, whenever a blank line number is encountered in TEXT, the corresponding line in AUX TEXT must be deleted, so aux text line D is placed in the correction set, the text line is placed in AUXJTEXT, and the TEXT and AUXJTEXT pointers are updated to process the next line. The insertion case is a little more complicated. First of all, note that the 0LD_TEXT pointer cannot denote a line that is after the line denoted by the TEXT pointer (if the TEXT line has a number). This can be shown by several observations: (l) the OLD TEXT pointer is updated only when the TEXT pointer denotes a line with a line number; (2) when it is updated, the 0LD_TEXT pointer is made to denote the line after the line corresponding to the one denoted by the TEXT pointer; the TEXT pointer is then immediately updated; (3) in both TEXT and 0LD_TEXT, line numbers appear in ascending order in the TEXT, and no two lines may have the same number. Thus, we need consider only two cases: when the 0LD_TEXT line is before the TEXT line, and when the two lines are the same. When the two lines are the same, we know that the 0LD_TEXT line appears in TEXT, and thus, need not be included in the correction set. Thus, the 0LD_TEXT pointer is updated to the next line. When the 0LD_TEXT line is before the TEXT line, some OLLJTEXT lines have been left out of TEXT, and thus need to be included in the correction set. Thus all 0LD_TEXT lines after and "including the one denoted by the 0LD_TEXT pointer, and before the one denoted by the TEXT pointer, must be placed in the correction set. The only remaining problem occurs when all lines of TEXT 105 have been processed, but there are unprocessed lines 0LD_TEXT remaining. In this case line number I (the remainder of old text] is placed in the correction set, where the line number is the first line number unused by AUX_TEXT. LINEARIZE_AND_CORRECT uses several procedures. GET_LINE_NUMBER is described in Chapter 3- It scans LINE_N0_LENGTH characters and translates the digits found there into an integer. If only blanks are found, a zero is returned. ENCODE is not described in detail, but its purpose here is to change an integer back into a character representation. GET_NEXT_LINE_0F_0LD_TEXT simply causes 0LD_TEXT_P0S to be positioned at the beginning of the next line of 0LD_TEXT; 0LD_TEXT_LINE is also updated. C0PY_TEXT_INT0_AUX_TEXT copies one line of TEXT into AUX_TEXT, excluding the line number, and including the terminating CR. If the end of text is encountered in this operation, FINISHED is set to true. ALL0CATE_CS_N0DE allocates a node which can be used in the linked list of correction sets. The functioning of LINEAR I ZE_AND_C0RRECT can be summarized as follows. First, the necessary initializations are done. Then a big loop is entered in which the procedure remains as long as there is 0LD_TEXT that has not been processed (signified by OLD TEXT_LINE > since every line of 0LD_TEXT has a number) or TEXT that has not been processed (signified by not FINISHED). The loop is traversed once for each TEXT line, and one more time if there are OLD TEXT lines remaining to be processed after all text lines are processed. Corrections are processed only if a correction set is being created (indicated by CPOS > 0) . For each TEXT line, if the line io6 has no number then a deletion must be placed in the correction set. If the line has a number, then an insertion may be placed in the correction set, if there is old text waiting to be processed, and if the line of 0LD_TEXT is before the line of TEXT, in 0LD_TEXT. After the corrections are processed the linearization is done for one line of TEXT, if any TEXT remains to be processed. After all processing has been completed, END TEXT is updated to its new position, and, if no changes were made the CS node is freed, since there is no use for it. Algorithm 8-9 - LINEARIZE_AND_CORRECT 0LD_TEXT_LINE <- GET_LINE_NUMBER (0LD_TEXT, 1, false) if 0LD_TEXT_LINE = then begin RET <- true CPOS <- end else begin RET «- false ALL0CATE_CS_N0DE (P) CPOS <- CSPOS (P) <-FREE_CS_TEXT end 0LD_TEXT_P0S <- AUX_TEXT_P0S <- AUX_TEXT_LINE <- 1 TEXT_P0S ^- BEGIN_TEXT FINISHED «- false 107 while 0LD_TEXT_LINE > or not FINISHED do begin TEXT_LINE «- GET_LINE_NUMBER (TEXT, TEXT_P0S, true) if CPOS > then if TEXT_LINE = and not FINISHED then begin CS_TEXT (CPOS to CPOS + LINE_NO_LENGTH - l) <- ENCODE (AUX_TEXT_LINE) CPOS <-CP0S + LINE_NO_LENGTH + 1 CS_TEXT (CPOS - 1) <- 'D' RET <- true end else if OLD_TEXT_LINE > then if TEXT_LINE > OLD_TEXT_LINE or FINISHED then begin CSJTEXT (CPOS to CPOS + LINE_NO_LENGTH - l) <- ENCODE (AUX_TEXT_LINE ) CPOS <- CPOS + LINE_NO_LENGTH + 1 CS_TEXT (CPOS - 1) «- 'I' RET «- true while (TEXT_LINE > OLD_TEXT_LINE or FINISHED) and TEXT_LINE > do begin 0ID_TEXT_P0S <- 0LD_TEXT_P0S + LINE_NO_LENGTH REPLACE_TEXT (CS_TEXT, 0LD_TEXT, CPOS, 0LD_TEXT_P0S) OLD_TEXT_LINE <- GET_LINE_NUMBER (0LD_TEXT, 0LD_TEXT_P0S, false) end 108 GET_NEXT_LINE_0F_0LD_TEXT end if not FINISHED then "begin AUX_TEXT (AUX_TEXT_P0S to AUX_TEXT_P0S + LINE_N0_LENGTH - l) ^-ENCODE (AUX_TEXT_LINE) AUX_TEXT_LINE «- AUX_TEXT_LINE + 1 AUX_TEXT_POS <- AUX_TEXT_POS + LINE_NO_LENGTH C OPY_TEXT_INTO_AUX_TEXT end ENDJTEXT «- AUX_TEXT_POS - 1 FREE_CS_TEXT «- CPOS if not RET then FREE_CS_NODE (P) return (RET) Algorithm 8.10 - GET_NEXT_LINE_OF_OLD_TEXT if OLD_TEXT_LINE > then begin while 0ID_TEXT (0LD_TEXT_P0S ^ CR do 0LD_TEXT_P0S <- 0LD_TEXT_P0S + 1 0LD_TEXT_P0S «- 0LD_TEXT_P0S + 1 0LD_TEXT_LINE <- GET_LINE_NUMBER (0LD_TEXT, OLD_TEXT_LINE, false) end Algorithm 8.11 - COPY_TEXT_INTO_AUX_TEXT INC_CHAR (TEXT_P0S, LINE_NO_LENGTH ) while TEXT (TEXT_P0S) ^ CR and not FINISHED do begin 109 AUX_TEXT (AUX_TEXT_P0S) «- TEXT (TEXT_P0S) AUX_TEXT_P0S <- AUX_TEXT_P0S + 1 if TEXT_P0S = END_TEXT then FINISHED <- true INC_CHAR (TEXT_P0S, l) end AUX_TEXT (AUX_TEXT_P0S) <- CR AUX_TEXT_P0S «- AUX_TEXT_P0S + 1 if TEXT_P0S = END_TEXT then FINISHED <- true INC_CHAR (TEXT_P0S, l) if TEXT_P0S = END_TEXT then FINISHED <- true 8.3- ESCAPE This command is useful when the user wishes to discard all changes made in the file since it was opened, and start over by opening a file. Since the copy of the file when it was opened is still available and will not be changed (to add a correction set or replace the text by a new set of text) until a close is executed, nothing need be done to allow the user to open a file . Algorithm 8.12 - ESCAPE if COMMAND = 'close' the begin OPEN «- false FREE_AUX_TEXT <- 1 RET «- true end 110 8.U. DESTROY The user wishes to destroy the currently opened cycle, and all cycles older than the opened cycle of a file. When a file is opened, TEMP CORRECTIONS denotes the CS node to -which the correction set of the opened file will be linked when the file is closed. If this is the head of the linked list of correction sets, then the copy of the TEXT should he destroyed. Otherwise, the CS node whose link denotes TEMP_CORRECTIONS should be unlinked, and it and all nodes linked to it should be freed. This is done by FREE_C S_N0DE . Note that this algorithm does not free parts of CS_TEXT denoted by the CS_P0S fields of the freed CS nodes. This should be done by manipulating the paging scheme used to access CS_TEXT in the FREE_CS_NODE procedure. For instance, the freed CS_TEXT positions could be designated to have no location in secondary memory, and the secondary memory locations that had been used for this could be transferred to some other array. Algorithm 8.13 - DESTROY if COMMAND = 'destroy' then begin if not OPEN then error if P/ TEMP_CORRECTIONS then if LINK (P) ^ TEMP_CORRECTIONS then begin Q «- P P ^-LINK (LINK (P)) while P ^ TEMP_CORRECTIONS do begin Q «-LINK (Q) P <- LINK (P) end Ill LINK (Q) ^A end else CORRECTIONS ^A else begin CORRECTIONS <- A DESTROYJTEXT (FILE_NAME) end while P / A do begin Q <- LINK (P) FREE_CS_NODE (P) P ^Q end RET «- true OPEN «- false end 8.5- COPY The text of the file named in the command is to be placed in the currently opened file. Thus, the file is obtained, and corrections are made if necessary. The file name is not changed, and CORRECTIONS and TEMP_CORRECTIONS are set to null, since this will be the first cycle of a newly created file. Some convention must be included to make sure that OLD_TEXT is empty so that the file may be properly closed. Other variables are initialized as described in section 8.1. 112 Algorithm Q.lk - COPY if COMMAND = 'copy' then begin if not OPEN then error GET_COMMAND GET_FILE (COMMAND) if GET_NUMBER then GET_CORRECTION_SET_TEXT CORRECTIONS «- TEMP_CORRECTIONS <- A BEGIN_TEXT <- LINE_NO_LENGTH + 1 FREE_TEXT <- END_TEXT + 1 FREE_AUX_TEXT <- 1 COMMAND <- 'copy' RET <- true end 8.6. TAB SET A new linked list of tab settings is to be created. Each tab setting is checked to make sure that it doesn't exceed the margin, and that it is greater than the previous tab (or in the case of the first setting) , A tab node is allocated for each setting. TABS is made to denote the first node, and the link of each node is made to denote the next node. The algorithm is terminated when no more numbers are located by GET_NUMBER. Algorithm 8.15 - TAB SET if COMMAND = 'tabset' then begin ALLOC ATE_TAB_NODE (P) TABS <- P 113 if not GET_NUMBER then error if NUMBER < or NUMBER > MARGIN then error INFO (P) <- NUMBER while GET_NUMBER do begin if NUMBER < INFO (P) or NUMBER > MARGIN then error ALLOC ATE_TAB_NODE (q) LINK (P) <-Q INFO (Q,) «- number P ^Q end LINK (P) «-A RET <- true end 8-7. VERIFY, NO_VERIFY The logical variable VERIFY must be set to true or false Algorithm 8.16 - VERIFY, NO_VERLFY if COMMAND = 'verify' then VERIFY «- RET «- true if COMMAND = 'no verify' then begin VERIFY «- false RET «- true end 8.8. ERROR, NO_ERROR The logical variable ERROR must be set to true or false . llU Algorithm 8.17 - ERROR, NO ERROR if COMMAND = 'error' then ERROR «- RET «- true if COMMAND = 'no error' then begin ERROR «- false RET «- true end 8.9. MARGIN The margin is checked against the linked list of tabs and if any tab is greater than the margin, an error is signaled. Algorithm 8.18 - MARGIN if COMMAND = 'margin ' then begin if not GET_NUMBER then error P <- TABS while P/A and INFO (P) < NUMBER do P «- LINK (P) if P / A then error MARGIN «- NUMBER RET «- true end 8.10. CATALOGUE A list of all existing files should be displayed. Since all access to the files is effected through the system, this must also be done through a system request. Presumably some list of file names exist 115 to be used to locate files "by name for GET_FILE . This list should simply be traversed, and all of its file names displayed. Algorithm 8.19 - CATALOGUE if COMMAND = 'catalogue' then begin catalogue request to system RET <- true end 8.11. LIST TYFE_CHECK is invoked as an easy and efficient means of evaluating the two pointers which define' the scope of the display. If no value is found for the first pointer, then it is given the value of the pointer C; if no value is found for the second pointer, then it is set so that exactly a display page of TEXT is displayed. The CRT display routine, which is described in Chapter 9> is then invoked. Algorithm 8.20 - LIST if COMMAND = 'list' then begin if not OPEN then error T <- TYPE_CHECK (BUFFER, BIN) if T = 3, ^, 5, or 6 then error if T = then PTR1 <- POINTER (C ) else PTR1 *- TEMP_SUM T <- TYPE_CRFCK (BUFFER, BIN) if T = 3, ^, 5, or 6 then error if T = then begin i [£ PTR2 do begin READ_A_LINE (TEXTJTYEE, TEXT_POS) if not ASSIGNMENT then begin GET_COMMAND if COMMAND = 'repeat' or 'macro' then NUM_ENDS <- NUM_ENDS + 1 if COMMAND = 'end' then NUM_ENDS «- NUM_ENDS - 1 end NUM_LINES ^NUM_LINES + 1 REPLACE_TEXT (AUX_TEXT, BUFFER, AUX_POS, l) end AUXJTEXT (AUX_POS) <- POINTER_DELTMETER AUXJTEXT (AUX_POS + 1 to AUX_POS + POINTER_LENGTH ) <- FREE_AUX_TEXT FREE_AUX_TEXT <- AUX_POS + POINTER_LENGTH + 1 while GT (POINTER (z), POINTER (A)) do begin AUX_POS «- AUXJTEXT (AUX_POS + 1 to AUX_POS + POINTER_LENGTH ) for I = 1 to NUMJLINES do EDIT_LINE (AUXJTEXT, AUX_POS) while AUX_TEXT (AUX_POS) ^ POINTER_DELIMETER do AUX_POS <-AUX_POS + 1 end 119 FREE_AUX_TEXT «- AUXJTEXT (AUX_POS + 1 to AUX_POS + POINTER LENGTH) COMMAND ^, 5, or 6 then error if T = then PTR1 <- PTR2 «- POINTER (C ) else PTR1 «- PTR2 <- TEMP SUM 120 TNC_CHAR (PTR2, l) D0UBLE_LINK (PTR1, FREE_TEXT) TEXT_P0S *- FREE_TEXT while APPEND_M0DE do begin READ_A_LINE (TEXT_TYFE, TEXT_P0S) GET_C0MMAND if COMMAND = term then APPEND_MODE «- false else REPLACE_TEXT (TEXT, BUFFER, TEXT_P0S, l) end FREE_TEXT <- TEXT_P0S DOUBLE_LINK (FREE_TEXT, PTR2) COMMAND <- 'append' RET <- true end 8.16. MACRO The macro name and its position in MACRO_TEXT is placed in the proper place in the binary tree of macros by SEARCH_MACRO_TREE, which returns true if the macro name Was already in the tree. The macro body is then placed in MACRO_TEXT in a manner very similar to the first section of the repeat algorithm. A MACRO_END_MARKER is placed at the end of the macro. Algorithm 8.25 - MACRO if COMMAND = 'macro' then begin if not OPEN then error GET COMMAND 121 if SEARCH_MACRO_TREE (COMMAND) then error REPLACE TEXT (MACRO_TEXT, BUFFER, FREE_MACRO_TEXT, BIN) NUM_ENDS *- 1 while NUM_ENDS > do begin READ_A_LINE (TEXTURE, TEXT_POS) if not ASSIGNMENT then begin GET_COMMAND if COMMAND = 'repeat' or 'macro' then NUM_ENDS <- NUM_ENDS + 1 if COMMAND = 'end' then NUM_ENDS <- NUM_ENDS - 1 end REPLACE_TEXT (MACRO_TEXT, BUFFER, FREE_MACRO_TEXT, l) end MACRO_TEXT (FREE_MACRO_TEXT) <- MACRO_END_MARKER COMMAND <- 'macro' RET *- true end SEARCH_MACRO_TREE allocates a node, Q, and gives it the proper INFO and MACRO_POS values. A normal binary search is then done on the tree of macros, and Q is attached at its proper spot. If a node that contains the same INFO as Q is found, Q is not attached, and is freed. Algorithm 8.26 - SEARCH_MACRO_TREE (NAME) P ^MACROS RET *~ FINISHED ^ false ALLOCATE MACRO NODE (q) 122 INFO (Q) <-NAME MACRO_POS (Q) «- FREE_MACRO_TEXT while not FINISHED do begin if INFO (P) = NAME then RET «- FINISHED «- true else if INFO (P) > NAME then if LLINK (P) = A then begin LLINK (P) <- q FINISHED «- true end else P «- LLINK (P) else if RLINK (P) = A then begin RLINK (P) <- Q FINISHED <- true end else P «- RLINK (P) end if RET then FREE_MACRO_NODE (q) return (RET) 8.17- RUN Each line of the currently opened file is to be presented to the system in the following form: characters 1 to 72 contain a line as typed; characters 73-80 contain the line number generated by the editor. The TEXT file as a whole should have the same format as a card deck. If a TEXT line contains more than 72 characters, it is truncated. It is assumed that the editor generated line number is not longer than 8 characters. 123 The algorithm outlined below collects a line of TEXT in an 81 character string, and then uses the command PRE SENT_LIKE_TO_SY STEM to perform the necessary actions on the line. Algorithm 8.27 - RUN if COMMAND = 'run' then begin if not OPEN then error LINE_NO <- 1 TEXT_POS *-BEGIN_TEXT FINISHED <- false while not FINISHED do begin INC_CHAR (TEXT_POS, LINE_NO_LENGTH ) do I = 1 to 72 while TEXT (TEXT_POS) ^ CR and not FINISHED LINE (I) <- TEXT (TEXT_POS) INC_CHAR (TEXT_POS, l) if TEXT_POS = END_TEXT then FINISHED <- true end do J = I to 72 LINE (I) <- blank end LINE (73 to 80) ^ENCODE (LINE_N0) LINE (81) <- CR INC_CHAR (TEXT_P0S, l) if TEXT_P0S = END_TEXT then FINISHED <- true PRE SENT_LINE_TO_SYSTEM end end 12 k 8.18. GT TYPE CHECK is invoked to evaluate the two arguments, which are assigned to PTR1 and PTR2 . If PTR1 does not denote a point in TEXT which precedes PTR2, then READ_A_LINE is invoked so that the next (and only the next) line will not be executed on the next iteration of the editor. Algorithm 8-28 - GT if COMMAND = 'gt' then begin if not OPEN then error T «- TYPE_CHECK (BUFFER, BIN) if T = 1 or 2 then PIR1 <- TEMP_SUM else error T «- TYPE_CHECK (BUFFER, BIN) if T = 1 or 2 then PTR2 <- TEMP_SUM else error if not GT (PTR2, PTRl) then READ_A_LINE (TEXT_TYPE, TEXT_POS) RET <- true end 8.19- EQ TYPE_CHECK is invoked to evaluate the two arguments. If they do not denote the same position of TEXT, then READ A LINE is invoked. 125 Algorithm 8-29 - EQ if COMMAND = 'eq' then begin if not OPEN then error T *- TYPE CHECK (BUFFER, BIN) if T = 1 or 2 then PTR1 «- TEMP_SUM else error T <- TYFE_CHECK (BUFFER, BIN) if T = 1 or 2 then PTR2 *- TEMP_SUM else error if PTR1 ^ PTR2 then READ_A_LINE (TEXTJTYFE, TEXT_POS) RET «- true end 126 9. THE CRT DISPLAY 9.1. The Overall Display The exact format for the CRT display has not been decided. This is not critical, since the display routines can "be completely isolated from the rest of the editor. The screen will probably be divided into two parts: on one part, the user commands will be echoed, and messages from the editor will be displayed; on the other part, segments of the TEXT will be displayed by means of the user command list , and the special characters that can be used to manipulate the display. These two parts could be displayed simultaneously, one at the top of the screen and one at the bottom, or in an alternating fashion, according to some algorithm, or by adding another special character to allow the user to pick the display he wants. A simultaneous display would probably be more convenient because it avoids problems of switching back and forth . This chapter is primarily concerned with the part of the display that displays the TEXT. This part of the display, and only this part, will be contained in an array called TEXT_DISPLAY . TEXTJDISPLAY can contain PL lines of the fixed length LINE_LENGTH. It is assumed that whatever is in TEXT_DISPLAY will automatically be displayed, starting at line PP and going, in a circular manner, to line (PP-l) . The contents of the TEXT_DISPLAY may be altered by the editor in three ways: DISPLAY wipes out the present contents and totally renews the text display; FORWARD_ROLL and BACK¥ARD_ROLL wipe out and renew a number of lines of the TEXT_DISPLAY which does not exceed PL; the rest of the lines remain the same. In addition 127 to the TEXTJDISPLAY array, two arrays of length PL, BEGIN_POS and END POS, are used to denote the position in TEXT of the beginning and end of each line in TEXT_DISPLAY. 9.2. DISPLAY The procedure which renews the display is given two parameters which denote positions in TEXT; it is known that the position denoted by the first parameter is not after the position denoted "by the second character. The procedure is to fill the TEXT_DISPLAY, starting with the first character of the line in which the first parameter denotes a position, and stopping with the last character of the line in which the second parameter denotes a position, or when PL lines have been placed in the TEXT_DISPLAY. TEMP moves through TEXT, denoting characters which are to be placed in TEXTJDISPLAY. PP denotes the line that is currently being processed, I the character of the line. PL is the number of lines that can be put in the display, LINE_LENGTH the number of characters that can be placed in a line . If the two pointers denote the same position in TEXT, then nothing is displayed. PP and PP_TEMP are initialized to the first line; TEMP is initialized to the first parameter, and then decremented until the beginning of the line is reached. Now the loop is entered in which lines of TEXT are placed in TEXT_DI SPLAY as long as TEMP is not greater than the second parameter, and the end of TEXT has not been reached. Lines of TEXT are placed in the TEXT DISPLAY by the procedure INSERT TEXT IN TEXT DISPLAY. To 128 terminate the algorithm, blanks are inserted in any lines of the TEXT_DI SPLAY which were not filled in the loop, and the pointer C is assigned the position of the last character placed in the TEXT_DISPLAY. Algorithm 9-1 - DISPLAY (PTR1, PTR2) TEMP «- PTR1 while TEXT (TEMP) ^ CR do DEC_CHAR (TEMP) do PP = 1 to PL while TEMP ^ END_TEXT and GT (PTR2, TEMP) INSERT_TEXT_IN_TEXT_DISPLAY end while PP < PL do begin I «-l while I < LINE_LENGTH do begin TEXT_DISPLAY ((PP-l) * LINE_LENGTH + i) <- blank I «-I + 1 end PP <- PP + 1 end POINTER (C ) <- TEMP 9-3- FORWARD_ROLL If the number of lines is less than zero, then backward roll is called. If the number of lines equals zero, then nothing is done. If the number of lines is greater than the number of lines that can be displayed, then an error is signaled. TEMP is set to the ending position of the line 129 before the line denoted by PP; since the display is circular and PP denotes the first line in the display, (PP-l) mod PL should denote the last line in the display, and thus TEMP denotes the TEXT position of the CR of the last line in the display. A loop is then entered to insert TEXT into the TEXT_DISPLAY the correct number of times. Pointer C is made to denote the position in TEXT of the last character of the display. Algorithm 9-2 - FORWARD_ROLL (# LINES) if # LINES < then BACKWARD_ROLL (- # LINES)' if # LINES = then return TEMP ^END_POS ( (PP-l) mod PL) do I = 1 to # LINES while TEMP ^ ENLJTEXT INSERT_TEXT_IN_TEXT_DI SPLAY PP <- PP + 1 if PP > PL then PP <- PP - PL end POINTER (C ) <- TEMP 9.k. BACKWARD, ROLL We wish to place the line before the beginning line of the TEXT_DISPLAY into the TEXT_DISPLAY. TEMP is thus initialized to the first character of the TEXT_DISPLAY, which is the first character of line PP. TEMP is then decremented past the CR of the previous line. A loop is then entered in which, for the proper number of times, the following actions 130 are taken: TEMP is decremented until the beginning of the line is reached PP is also decremented. The TEXT is inserted in the TEXT_DISPLAY. TEMP is then reinitialized to the end of the previous line. To terminate the algorithm, pointer C is set to the TEXT position of the last character in TEXT_DISPLAY, which is the CR of line (PP-l). Algorithm 9-3 - BACKWARD ROLL (# LINES) if # LINES < then F0RWARD_R0LL (- # LINES) if # LINES = then return TEMP «- BEGIN POS (PP) DEC_CRAR (TEMP, POINTER_LENGTH + 2) do I = 1 to # LINES while TEMP ± BEGIN_TEXT while TEXT (TEMP) ^ CR and TEMP ^ BEGIN_TEXT do DEC_CHAR (TEMP, l) PP ^ PP - 1 if PP < 1 then PP <- PP + PL INSER T_TEXT_I N_TEXT_D I SPLAY TEMP ^-BEGIN_P0S (PP) DEC_CHAR (TEMP, POINTER_LENGTH + 2) end POINTER (C) ^END_P0S ((PP-l) mod PL) 9-5- INSERT_TEXT_IN_TEXT_DISPLAY This procedure inserts one line of TEXT into the TEXTJDISPLAY. The CR is not inserted since the TEXT_DISPLAY lines are of fixed length. When this procedure is invoked, TEMP should denote the position in TEXT of the CR which precedes the line that is to be inserted; PP should denote 131 the line of the TEXT_DISPLAY into which the TEXT is to be inserted. First, TEMP is incremented past the line number, and the beginning position of the line in TEXT is recorded. Then the line of TEXT is inserted into the TEXT_DISPLAY. If the TEXT line is longer than the display line, then the characters which go beyond the display line are not displayed, and TEMP is incremented to the end of the line. If the TEXT line is shorter than the display line, then blanks are inserted in the TEXT_DISPLAY line to fill it out. Finally, the ending position of the line in TEXT is recorded. Algorithm 9 .k - INSERT_TEXT_IN_TEXT_DISPLAY INC_CHAR (TEMP, LINE_NO_LENGTH + 1 BEGIN_POS (PP) 9,< IN I H ■I