UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN The person charging this material is re- sponsible for its return to the library from which it was withdrawn on or before the Latest Date stamped below. Theft, mutilation, and underlining of books are reasons for disciplinary action and may result in dismissal from the University. To renew call Telephone Center, 333-8400 UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN L161 — O-1096 Digitized by the Internet Archive in 2013 http://archive.org/details/experimentalform811embl Ma 7 REPORT NO. UIUCDCS-R-76-811 EXPERIMENTAL AND FORMAL LANGUAGE DESIGN APPLIED TO CONTROL CONSTRUCTS FOR INTERACTIVE COMPUTING by David Wayne Embley July 1976 DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN ■ URBANA, ILLINOIS The library of the NOV 8 1976 University ot Illinois at Urbana-Cham ign Report No. UIUCDCS-R-76-811 EXPERIMENTAL AND FORMAL LANGUAGE DESIGN APPLIED TO CONTROL CONSTRUCTS FOR INTERACTIVE COMPUTING by David Wayne Embley July 1976 Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 This work was supported in part by the Department of Computer Science and the National Science Foundation under Grant No. US-NSF-EC-41511, and was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, 1976. & EXPERIMENTAL AND FORMAL LANGUAGE DESIGN APPLIED TO CONTROL CONSTRUCTS FOR INTERACTIVE COMPUTING David Wayne Embley, Ph.D. Department of Computer Science University of Illinois at Urbana-Champaign, 1976 This thesis explores two objective approaches to language design. An experimental approach to language design recognizes the human element in programming and attempts to achieve an optimal design by an empirical investigation of language constructs and design prin- ciples. A formal approach to language design recognizes the theoretical foundations of programming languages and attempts to achieve an optimal design by a specification of the properties of language constructs in order to expose weaknesses, inconsistencies, and design flaws. These approaches to language design are applied to an explora- tion of control constructs for interactive computing, particularly in computer-aided instruction (CAI). A control construct is examined that handles CAI answer judging and unifies selection and iteration. A static exception processing scheme is also examined. Two experiments tested these constructs and at the same time illustrated a methodology for conducting experiments on-line in a CAI environment. In the investiga- tion, several design principles evolved. These are proposed as a partial basis for reasoning about language features in general. Ill ACKNOWLEDGMENTS Professor Wilfred J. Hansen deserves a special thanks for guiding me through this study. He spent numerous hours discussing various aspects of this research with me and much of the work was ac- complished because of his continuous encouragement. I sincerely appreciate the other members of my doctoral com- mittee, Professors Richard G. Montanelli, Thomas R. Wilcox, H. George Friedman, and Bruce A. Sherwood, for their interest in this project. Professor Montanelli deserves a special note of thanks for his assistance on the statistical aspects of this study. Thanks is also due to the Department of Computer Science and the National Science Foundation (Grant EC-41511) for their financial support. I extend my thanks to Dave Eland, Ron Daniel son, and Prabhaker Mateti for their many stimulating discussions and helpful suggestions. I also extend my thanks to the subjects who gave their time to take the experiments. Finally, I am deeply indebted to my wife, Ann, for her kind understanding and support during all phases of this work. IV TABLE OF CONTENTS Chapter Page 1. INTRODUCTION 1 1.1 Language Design 1 1.2 KAIL 2 1.3 Control Constructs for Interactive Computing 3 1.4 Thesis Overview 3 2. OBJECTIVE APPROACHES TO LANGUAGE DESIGN 5 2.1 The Experimental Approach 5 2.2 The Formal Approach 13 2.3 Design Principles 16 3. STRUCTURED FLOW IN INTERACTIVE DIALOGUES 35 3.1 The KAIL Selector 35 3.2 Psychological Soundness 43 3.3 Selector Trouble Spots 48 3.4 Possible Extensions 50 3.5 Formal Semantics 51 4. FRAME SEQUENCING IN INTERACTIVE DIALOGUES 65 4.1 A Model for Frame Sequencing 65 4.2 Implementation Possibilities 65 4.3 TUTOR 68 5. EXCEPTION PROCESSING 73 5.1 Basic Definitions 73 5.2 Implementation Possibilities 74 5.3 KAIL Exception Processing 79 5.4 Examples 83 5.5 Extensions 90 5.6 Formal Semantics 90 6. SEQUENCING EXPERIMENT 108 6.1 The T Language 108 6.2 The K Language 110 6.3 Hypotheses Ill 6.4 The Experiment 112 6.5 Results 114 TABLE OF CONTENTS (continued) Chapter Page 7. SUMMARY AND CONCLUSIONS 122 7.1 Summary 122 7.2 Conclusions 123 7.3 Suggestions for Further Research 127 LIST OF REFERENCES 129 VITA 131 VI LIST OF TABLES Table Page 1 Performance of Subjects in Selector Experiment 2 Comparison of Complexity of Control Types . 3 Error Contingency Table for Bug 1 4 Error Contingency Table for Bug 3 47 58 115 118 Vll LIST OF FIGURES Figure Page 1. The PLATO Terminal 9 2. The PLATO Keyboard 10 3. An ML Block in M 28 4. Locality Lattice 30 5. Typical Flowchart for CAI Answer Judging .... 36 6. ALGOL-Like Encoding of the Flowchart in Figure 5. . 37 7. KAIL Selector Encoding of the Flowchart in Figure 5 40 8. Groups in the Selector Experiment 45 9. State Diagram for a Typical CAI Lesson 66 10. Lesson and Help Sequences in TUTOR 71 11. Exception Declarations in Nested Blocks 81 12. State Diagram of a Typical CAI Lesson 84 13. Goto Version of Exception Processing for the CAI Lesson of Figure 12 85 14. Goto- Less Version of Exception Processing for the CAI Lesson of Figure 12 87 15. Transition Diagram of an Interactive Dialogue. . . 88 16. KAIL Code to Implement the Transition Diagram in Figure 15 89 17. PL/1 -Like Interrupts Coded in KAIL 91 18. The Internal Representation of a Program Block . . 94 19. Handling back in K 116 20. Handling back in T 116 Vlll Figure Page 21. Section of T Code 120 22. Section of K Code 120 23. Double pause Problem in T . . 121 1 . INTRODUCTION 1 .1 Language Design This thesis explores two objective approaches to language design and applies them to an investigation of control constructs for interactive computing. The purpose of language design is to propose new programming languages and improvements to existing languages in order to ease a programmer's task and increase his productivity. Basically, there are three approaches to language design. 1. Study languages and reason informally about language features. 2. Experimentally test the effect of language features on programmer productivity. 3. Formally define the syntax and semantics in order to in- vestigate the properties of language features. Generally, language designers have solely relied on the first approach. They have introduced and propogated language features based on their own feelings, thoughts, and experiences, but have made little or no effort to scientifically present evidence for proposed language features. The other two approaches to language design are more objective. In an experimental approach to language design , a designer recognizes the human element in programming and attempts to achieve an optimal design by an empirical investigation of language constructs. Through carefully documented, thorough, and replicable experiments, designers can present scientific evidence to support claims about language features and styl- istic considerations. In a formal approach to language design , a designer recognizes the theoretical foundations of programming languages and attempts to achieve an optimal design by a specification of the properties of language constructs in order to expose weaknesses, inconsistencies, and design flaws. A better understanding of the syntax and semantics of language constructs can make it easier for a language designer to see what features are really desirable. Not only are these approaches to language design directly applicable in an investigation of proposed language constructs but also in an investigation of language design principles. Many lists of design principles exist [Hoare 73, Wirth 75, Horning 75], but notions such as "simplicity" or "uniformity" that are generally included in these lists are insufficiently defined. A formal definition of these principles could facilitate identification and consideration of language features that violate basic design principles. Moreover, experiments can be applied in an attempt to objectively validate design principles so that language designers can confidently apply them. 1.2 KAIL As a means to explore experimental and formal language design, the author is designing new programming language called KAIL [Embley 75a]. The KAIL control constructs are intended to be particularly suited for programming interactive dialogues. KAIL itself is a CAI author language. 1 .3 Control Constructs for Interactive Computing Computer-aided instruction is one of the interactive comput- ing application areas. In interactive computing , a user communicates directly with a computer via an on-line terminal to obtain information and solve problems. For the interactive dialogue to be successful, the system must present options which are pertinent to the problem solution, and the user must choose, from among the options, those which are applicable. From the system's viewpoint, interactive computing may be thought of as a two-step, iterative process: (1) present options to a user, and (2) process the selected option. The system presents and processes a particular set of related options in a frame and sequences a user from frame to frame as the dialogue progresses. One of the options presented in a frame may allow a user to decide for himself which frame should come next. Generally a user has two options: (1) continue within the structured flow of the dialogue, or (2) interrupt the dialogue and continue elsewhere. The KAIL control constructs discussed in this thesis are designed to handle these two basic options generally available in interactive computing. 1.4 Thesis Overview Chapter 2 expounds further on the experimental and formal approaches to language design and presents three language design principles, and Chapters 3 through 6 show the application of these methods to control constructs for interactive computing. More specifically, Chapter 2 presents a methodology for conducting experiments on-line in a CAI environment, reviews briefly formal methods for defining the syntax and semantics of programming languages, and formally defines three basic design principles. Chapter 3 discusses the unified KAIL control construct for structured flow of control, describes and gives the results of an experiment to test this construct, and discusses and formally defines its semantic properties. Chapter 4 defines exception processing in prepara- tion for a discussion of dialogue interruption. Chapter 5 discusses and formally defines the KAIL constructs to handle dialogue interruption, and Chapter 6 describes and gives the results of an experiment to test these constructs. 2. OBJECTIVE APPROACHES TO LANGUAGE DESIGN Section 2.1 further explains the experimental approach to language design, cites related work, and presents an on-line methodology in terms of its advantages and disadvantages. Section 2.2 further ex- plains the formal approach to language design and briefly reviews formalisms for defining syntax and semantics. Section 2.3 defines three design principles, uniformity, separability, and locality. 2.1 The Experimental Approach Language features are psychologically sound if they help in- crease a programmer's productivity. For example, a psychologically sound language feature helps a programmer reduce the number of bugs he might introduce into a program. An experimental approach to language design attempts to objectively determine if proposed language constructs are psychologically sound through carefully documented, thorough, and repli cable experiments [Shneiderman 74]. 2.1.1 Related Work In 1971, G. M. Weinberg wrote in The Psychology of Computer Programming [Weinberg 71] that he wished to "trigger" the study of pro- gramming as a human activity. Although most of his book concentrates on social and individual activity in programming, Weinberg also hoped to encourage language designers to include the psychology of computer pro- gramming as a new dimension in their design philosophy. He advocated the use of experimentation as a method to study programming and pro- gramming language features. Since the appearance of Weinberg's book an increasing number of researchers have conducted psychological experi- ments on programming language features. In one of these experiments, three psychologists concluded that nested if programs are easier for inexperienced programmers to understand than corresponding goto programs [Sime, Green, and Guest 73]. Their purpose was not mainly to compare these two language features, but to demonstrate that "by devising micro-languages exemplifying par- ticular features of interest, knowledge can be gained that allows more informed decisions to be made when designing new languages." In another study, L. Weissman listed and categorized factors af- fecting the complexity of programs, began development of a methodology for studying the psychological complexity of computer programs, and con- ducted experiments attempting to determine which factors or combination of factors reduce complexity [Weissman 74];. Although he studied the ef- fects of comments, paragraphing, and mnemonic variable names, his main objective was to search for an appropriate experimental methodology and for a means to measure a subject's ability to understand a given program. Weissman's methods consisted of measuring a subject's ability to hand- simulate program execution, and to fill in the blanks of an English des- cription of the program. He also asked subjects to subjectively evaluate their own understanding. Weissman concluded that variations on these methods resulted in valid measures of understanding. At the University of Indiana, B. Shneiderman and M. Ho used memorization tests to measure the effect of program structure on under- standing [Shneiderman and Ho 74]. One experiment measured a subject's ability to organize information by structural patterns. As might be ex- pected, the results implied that a well -structured program is easier to comprehend. A second experiment tested the logical and arithmetic IF statements in FORTRAN. The results indicated that novice programmers understand the logical IF better than the arithmetic IF, but experienced programmers can deal with both forms equally well, at least on short pro- grams. For his Ph.D. research, J. D. Gannon gathered empirical evidence to support or discredit specific language design issues [Gannon 75]. He reported the results of an experiment to measure the effect on reliability of several different design decisions. In the experiment, he compared two languages and analyzed the frequency and persistence of errors to show that several design decisions influenced error-proneness and error- persistence. These researchers considered existing language features and stylistic issues and attempted to resolve controversies, but they made few attempts to experimentally examine new programming constructs. The experiments reported below investigated the psychological soundness of new control constructs for programming interactive dialogues. They also ex- emplify a methodology for conducting experiments in an interactive, CAI environment. 2.1.2 An On-line Methodology The particular interactive environment was the PLATO CAI sys- tem [Alpert and Bitzer 70]. At the time of the experiments, PLATO ran on a CDC Cyber 73 dual processor with two million 60- bit words of extended core storage. Attached to the processor were over nine hundred plasma display terminals, which had been invented by the PLATO group for this project (Figures 1 and 2). The peak comfortable system load was about four hundred simultaneous users. Under PLATO, all programs are written in a special system language called TUTOR [Sherwood 74]. TUTOR is a CAI author language and contains features for programming frame sequences and sophisticated answer judging. The experiments reported in this thesis were programmed in TUTOR and conducted on PLATO. Several advantages can be gained by con- ducting experiments on-line in an interactive, CAI environment. There can also be several disadvantages. 2.1.2.1 Advantages 1. Controlled Teaching Environment . Since PLATO is designed for CAI, it can be programmed to teach needed concepts. Subjects can learn in a controlled and closely monitored setting. This is particular- ly valuable when running experiments on new, yet-unlearned language features because these features can be taught as part of the experiment. Moreover, the system can gather data on the learning process itself and test sub- jects to insure an adequate level of understanding before allowing them to proceed with the remainder of the experiment. >> ^- ■M C 01 c > tO to Q. 4-> lO -t-> o •r- l— 1 ra "O s- fO • .e E CD o lO E na -r- CM i— -E LD Q. 10 CM O) > *r— E o -E n» -t-> o s- c aim •r- e Q- •r" a> CM S- S- "*s^ *r- i — 3 s- O" 0) 00 (U -p S- o c (0 (T3 -M s_ 3 03 w o .E (0 -E O .e +j •r" «a • i— s E fO T3 fO E E E S_ •r- CD fO ai E CD o S- X- to j_ ai u s_ Q. 4-> to o +-> S_ a> a) (8 I— -■-> 0) =3 ^_^ E a. c O) E o CT) O o i— E a> (O O E CD E •!- •^ -E •r- 4-> r— 4-> E ro s- E -a fO CO $- E •r- I— o fO > 4- O E S- rpr ^~ 1— to o > a. a> to -^ E O O -r- 10 +-> a >> c .a 3 M- -o cu -o 4-» cu c 4-> o -o -Q C >> n3 a> -* Q. S- cu ■r-TJr- s- c cu S fO -Q cu Q. * +-> o fO i— _Q fO o I — to CO - O •i- +-> to Q. X >> CU -M C (U fO to -C to cu •i- -* c •r- -o c s- o -a to .003 >> «= CU 3 cu -^ 4- s- (U iT3 ro CU C >> c • CU O i— i^ •!- J^ -t-> o 0(0 H CiD < =5 I M— •> Q_ i— r- -4-> CU fO X x: •■- cu h- o c C\J CU s- 3 en 11 2. Interactive Dialogue . In addition to providing a learning environment, PLATO can interact meaningfully with subjects during an experiment. When a subject incorrectly answers a question, for example, he can receive immediate, individualized feedback and can be given an opportunity to try again. Subjects can also request and receive help. Because the system individually monitors each subject's progress, it can obtain additional data such as number of questions answered correctly on the first try or number of times help was requested. Moreover, the system can provide solutions to a problem or question before allowing a subject to proceed and thereby control the experiment more uniformly. 3. Individualized Sequencing . PLATO can individually sequence subjects through an experiment. It can force rigidity or allow great flexibility and can record the possibly unique path of every individual through an experiment. The system is also able to insure appropriate responses at each juncture. Thus, for instance, a subject can be forced to complete a self-evaluation form with reasonable entries before con- tinuing. 4. Precise Data . PLATO can gather highly precise data; it can record every single keypress and the exact time it was pressed. Thus, it is trivial to obtain information on the time spent looking at a pro- gram, reading a problem specification, or answering a question. 5. Precise Timing Constraints . PLATO can impose precise timing constraints. If a subject is given 100 seconds to read and try to understand some section of a program, for example, the system can stop him after exactly 100 seconds. It can also conveniently display 12 the elapsed or remaining time on the screen to keep a subject informed. Since timing constraints are imposed individually, each subject may pro- ceed at his own pace; and tight controls can be maintained even when subjects take the experiment on different occasions. 6. Assistance in Grading . PLATO can help evaluate most sub- ject responses. It can trivially grade multiple choice or true/false questions and can also handle fill-in-the-blank and simple question- answer situations. The difficulty of anticipating all correct responses, however, prevents the experimenter from relying entirely on the system. Inherent in this advantage are disadvantages, but the experimenter is not left entirely without assistance. 7. On-Line Editing . PLATO can be programmed to provide on- line editing capabilities and on-line aids such as instructions on the use of the editor or syntax diagrams for language constructs. This gives researchers the opportunity to design experiments which include exten- sive coding, debugging, and modifying. 8. On-Line Execution . PLATO can be programmed to execute (or simulate execution of) programs. Coupled with on-line editing capa- bilities, this allows experimenters to carefully monitor subjects as they create, debug, and modify programs. An experiment on debugging, for instance, can be designed in which subjects dynamically discover logic errors seeded into a program and fix them using an on-line editor. 2.1.2.2 Disadvantages 1. Cost . On-line experiments are expensive. It requires much time and effort to program an experiment which includes computer- 13 aided instruction, on-line execution and editing capabilities, and ex- tensive data gathering routines. Computer resources are also expensive. 2. System Failures . Hardware and software failures are an annoying disadvantage. The system may crash, an individual terminal may fail, or some software error such as buffer overflow may occur. These are constant threats to an experiment. 3. Unfamiliarity with the System . Subjects may feel uncom- fortable working with an on-line system. A machine is unforgiving and "stupid mistakes" that "nobody would make" can be costly. The danger is that a subject's unfamiliarity with the system may appear as lack of understanding or inability to answer a question. Researchers must take special care to insure familiarity so that results are unaffected. 2.2 The Formal Approach A formal approach to language design specifies clearly and con- cisely the capabilities and limitations of proposed constructs in order to expose weaknesses, inconsistencies, and design flaws. It has been said, One might suspect that the language would not improve by having to conform to a restrictive defining tool. But experience shows that it does. In some sense there is no art unless there is a restriction of the medium. In some perverse way, the human mind, in coping with restrictions, produces its best results. [McKeeman 74, p. 518] A language designer can apply formal syntax descriptions and particularly formal semantic definitions to improve his design. 14 2.2.1 Syntax A syntax definition of a programming language makes it pos- sible to determine if a program is well-formed. C.A.R. Hoare asserts that "the regularity, clarity, and ease of implementation of the ALGOL 60 syntax may at least in part be due to the use of an elegant formal technique for its definition" [Hoare 69, p. 580]. The success of this technique has encouraged language designers and implementers to con- tinue to define the context-free aspects of programming languages by some variation of BNF. Unfortunately, BNF is unable to formally describe context- dependent rules, such as, a variable may be declared only once and must be declared before it is used. Context dependencies can be formally de- fined with two-level grammars [van Wijngaarden 69] or other formalisms, but these formal techniques are more difficult to use and understand and may even obscure the simple properties they are intended to clarify. This is a general problem with formal descriptions, and a simpler for- malism, even if it is unable to entirely define an object, is often preferable. 2.2.2 Semantics A formal semantic definition of a programming language makes it possible to determine the meaning of a well-formed program. Research- ers have applied semantic formalisms to define PL/1 [CBEMA 75], PASCAL [Hoare and Wirth 73], and other languages, but few efforts have been made to apply formal semantics to design new language features. 15 The many approaches to formal semantics can be classified into three categories: 1. Mathematical definition, 2. Behavioral definition, and 3. Translation. One mathematical approach to formal semantics is the axio- matic approach of R. W. Floyd and C.A.R. Hoare [Floyd 67, Hoare 69]. The axiomatic approach "involves the elucidation of sets of axioms and rules of inference which can be used in proofs of the properties of computer programming" [Hoare 69, p. 576]. Hoare introduces a notation to state the required connection between a precondition P, a program segment s, and a description of the result of its execution R. The notation P s R means that if assertion P is true before the initiation of a program segment s, then assertion R will be true on its completion. In this thesis P, P Q , P,, . . . , Q, Q , Q.|, . . . , R, R Q , R-., . . ., and state- ments enclosed in braces {. . .} will denote assertions while small let- ters will denote program segments. In a behavioral definition of a programming language, the se- mantics are defined in terms of its dynamic execution on a hypothetical abstract machine. The typical approach is to define the machine in terms of a state vector and then define the meaning of a program by its effect on the state vector [McCarthy 62]. In this thesis, a special purpose abstract machine similar to J. Johnston's contour model of block 16 structured processes [Johnston 71] is introduced to give an abstraction of the run-time behavior of the language features under condiseration. The third approach to formal semantics is to give an algorithm that translates a given program into an equivalent program written in a language defined by one of the other techniques. This approach to formal semantics will not be employed in this thesis. 2.3 Design Principles The classical approach to language design is to propose lan- guage features based on informal reasoning and experience. The advan- tage to this approach is that it requires less work; the disadvantage is that it is subjective. If language designers had a reliable set of basic design principles, however, this disadvantage might be largely overcome. In exploring the KAIL and TUTOR control constructs, several design principles evolved. These are formulated in terms of three basic design principles: 1. Uniformity, 2. Separability, and 3. Locality. 17 In order to describe these basic principles, we introduce a mini-language ML. A PLATO terminal (Figure 1) and a PLATO keyboard (Figure 2) are assumed to be available. program frame statement (frame ;) ? + identifier' [ (statement ;) wri te string ■*■ size number ■* erase ■+ pause -*■ goto identifier -*■ jump identifier Program execution begins with the first statement in the first frame and proceeds from frame to frame until the last statement of the last frame is executed. This flow of control may be altered by goto or jump ; both statements transfer control to the designated frame. The wri te statement displays a string of characters on the screen in the size most recently specified by the size statement. The size is initially size 1. Erase clears the screen. Pause causes processing to halt until any key is pressed. The syntax is described in a variation of BNF. Symbols and under- lined words are terminals, other words are nonterminals. The symbols of the metalanguage are -*-, (, ),*,.*, and H grouping, • means "zero or one," Parentheses are used for means "zero or more," "" means "one or more," and ■+■ means "is (or may be) composed of." Note that brackets, [], and alternation, |, are not part of the matalanguage. 18 For the discussion of the design principles to follow, we need to introduce the idea of a language construct and a semantic notion. A language construct refers to a textual program segment, usually an expres- sion, a statement (e.g., write or erase ) , or part of a statement. Often, it corresponds to the right-hand side of a production or part of a right- hand side (e.g., the opening or closing bracket of a frame). The extent of the construct will be clear in the context of the discussion. Some- times, we will refer to a language construct as a syntactic form, par- ticularly when we are concerned with its appearance. A semantic notion refers to a single action, usually a machine operation (e.g., add, multi- ply, or update) or some observable happening (e.g., erase screen). It may also refer to the setting of some internal status information. Often, a semantic notion is expressed by a particular syntactic form (e.g., erase for screen erase), but a language construct may subsume several semantic notions (e.g., a frame includes frame entrance, the semantics of each statement in the frame, and frame exit). The meaning we attach to the idea of a semantic notion will be clear from the context. 2.3.1 The Uniformity Principle "Uniform" means always having the same form, not varying or variable; it also means conforming to one rule. In a uniform language, a semantic notion always has the same syntactic form; it is consistent and does not vary within a program or from program to program. Moreover, a single rule applies to each language construct independent of its 19 context. Thus, a programmer can easily select the proper language con- struct to implement a semantic notion, and he can know what a language construct means without having to consider the surrounding text. Formally, a programming language is uniform when there exists a one-to-one relationship between semantic notions and syntactic forms. That is, a programming language is uniform if and only if both of the following conditions hold: 1. There are no language constructs x, and x 2 , x-, f x ? , such that (P X] Q) e (P x 2 Q) is true for all assertions P and Q. 2. There is no language construct x such that (R.j A P x R 1 A Q ] ) a (R a P x R 2 a Q ) is true where R, and R ? are assertions that contain only the state information necessary to distinguish the mean- ings of x, P contains the remainder of the state informa- tion, and Q, and Q 2 represent the state information con- tained in P after the execution of x, but Q, t Q 2 . If the first condition holds, there are no two language constructs with identical meanings; and if the second condition holds, each language construct has the same meaning in eyery context. The reason nonuniformity hinders programmers is because it introduces extra constructs, rules, and exceptions to rules into a language. If there are several ways to express a semantic notion, a programmer has more constructs to consider. If a single language con- struct may have various meanings in a program depending on the execution 20 history, a programmer must check all possible contexts in which the con- struct may be executed. In order to verify that the construct will have an intended meaning, a programmer may need to consider a large amount of program text. Nonuniformity leads to more decisions and thus more probability of error. With regard to the first condition for uniformity, some vio- lations may hinder programmers more than others, and some violations may actually benefit programmers. ML is not uniform because goto and jump have identical meanings, that is (P goto identifier Q) = (P jump identifier Q) is true for all P and Q. We could eliminate either goto or jump from the language; to retain both would only confuse programmers. (Later, we will alter the meanings of jump and goto and retain both syntactic forms.) As another example, consider the PL/1 iterative DO. Various syntactic forms can be used to express the semantic notion of looping 5 times through a statement sequence while the index variable I varies sequentially from 1 to 5 (P DO I = 1 TO 5; . . . END; Q) (1) e (P DO I = 1 TO 5 BY 1; ... END: Q) (2) e (P DO I = 1 BY 1 TO 5; ... END; Q) (3) The first version (1) is a shorthand for versions (2) and (3), and the default, "BY 1", relieves the programmer from the burden of inserting this oft-encountered phrase. Since both (2) and (3) are permitted, the 21 programmer need not remember an order rule such as "TO" before "BY". These versions benefit the programmer, but another equivalent version, DO I = 1 TO 10 WHILE I < 6; ... END; obscures the programmer's intent. The freedom to use various forms can be helpful, but it can also be harmful and obscure the program. With regard to the second condition for uniformity, the entire state may have to be considered; but in most instances, a programmer is unable to keep all of this information in mind. The subset of asser- tions perceived by a programmer at any instance of time constitutes a perceived predicate . This perceived predicate depends largely on the program text and the effects a programmer currently hopes to achieve. Given a programming language that is nonuniform because of a context dependent syntactic entity x, if the predicate R that disambiguates x is not included in the perceived predicate, the programmer is likely to be surprised when x behaves differently from his expectations. If, on the other hand, R is included in the perceived predicate, the programmer must check all possible contexts; the danger is that he might not check all possibilities. The following example is introduced to exemplify nonuniformity and to anticipate later discussion in the thesis. Suppose we alter the meaning of goto and jump to solve the following problem. We wish to transfer control to a particular frame and at the same time possibly erase the screen. To provide this facility, let jump and goto respectively set or 22 reset a status flag, called erase-flag. When set, the system will erase the screen on entrance into a frame. This change causes ML to violate the second condition of the uniformity principle, however, because a single syntactic entity depends on context for its meaning. When control enters frame f via goto f, the screen remains unaltered, but when control enters via jump f, the screen is erased. Hence, (R 1 a P f[ R 1 aQ 1 )a(R 2 aP f[ R 2 a Q 2 ) is true where f[ is open frame f, R, = (erase-flag ijs reset), Rp = {erase-flag j_s set}, P = {(screen status is S) a . . . }, Q, = {(screen status js_ S)a ... }, Qp = {(screen status j[s erased) a ... }. If a programmer's perceived predicate does not include the status of erase-flag, he is likely to be surprised when the screen is unexpectedly erased via jump f. If the perceived predicate includes erase-flag, a programmer will need to check the status of erase-flag on all possible paths leading to frame f in order to verify his assumptions. As a fur- ther example, consider the interaction between size and write in ML. The write statement is context dependent because the size of the text 23 displayed on the screen is determined by the last size statement exe- cuted. To avoid this violation of uniformity, the language would need to include some indication of the size with eyery write statement, but this would likely be burdensome for the programmer. Moreover, the size is likely to be included in the perceived predicate associated with write because of its unique association with the write statement. Nevertheless, a programmer must still check size on all paths leading to a write statement. In a more uniform language, fewer rules and exceptions to rules are necessary. There are fewer constructs and special cases, and fewer relationships that a programmer must check to verify the accuracy of his program. With fewer cases to consider, a uniform language is simpler and an in-depth understanding of the language is more easily at- tained. 2.3.2 The Separability Principle Several semantic notions may be invoked by one language con- struct, but a programmer may not wish to invoke all of them. If by some means a programmer can separately invoke each of the semantic notions subsumed by a language construct, the construct is "separable"; otherwise it is "inseparable." A separable language construct is "directly separable" if a programmer can write down the actions he wants by using other syntactic forms of the language; otherwise it is "indirect- ly separable." To indirectly separate a language construct, for instance, a programmer may be required to inhibit unwanted actions and then invoke 24 the construct. Direct separability is preferable. Inseparability either makes it necessary for a language designer to alter his language or forces a programmer to choose a different language, and indirect separability often results in awkward code. Formally, let x, , x ? , . . . , x , n > 2, be distinct semantic notions invoked by a language construct z; (z is called a composite language construct ). If some mechanism exists that permits a programmer to separately invoke each x., 1 < i < n, z is separable ; otherwise z is inseparable . Furthermore, if z is separable, it is directly separable if each x., 1 ^ i ^ n, can be expressed by some other syntactic form of the language, and indirectly separable otherwise. The advantage of a composite construct lies in its power to produce a desired effect with a minimal amount of code. Most high level language structures are composite constructs; and so long as no program- mer needs an unavailable component of a composite construct, all is well. In the unfortunate situation where a needed component is unavailable, the language designer may be willing to extend the language. If not, a programmer would have to obtain the component indirectly by "programming around" the problem if this is possible; otherwise he would have to use a different language. To supply syntactic forms for many composite structures would cause language constructs to proliferate. To force the programmer to indirectly separate composite structures, on the other hand, would result in code that is more difficult to understand and maintain. 25 In ML, a common sequence of frames is [ erase ; . . . pause ; ] [ erase ; . . . pause ; ] [ erase ; . . . pause ; ] where the first action in a frame is clear the screen and the last action is pause. To ease the programmer's task, erase can be combined with frame entrance and pause with frame exit. Consider the inseparability problems that might arise. Let begin be the composite feature open frame and erase screen, and let end be the composite feature pause and close frame. Assume begin , end , erase , and pause exist in the language, but "[" and "]" do not. With these changes, we wish to open a frame without a screen erase, and we wish to close a frame without a pause. For frame exit, we can obtain end without pause by the code . . . jump nxt; end ; nxt begin . . . where jump closes the frame, but avoids the pause . For frame entrance, we must introduce a new language feature, inhibit erase . To obtain begin without erase , a programmer may now write . . . inhibit erase ; end ; begin . . . where inhibit erase suppresses the normal erase on frame entrance. Begin and end are not directly separable, but they are both indirectly separable. One bothersome feature of indirect separability is that a programmer can- not directly say what he wants, he must, instead say what he does not want. 26 Indirect separability often leads to hidden actions . If two semantic notions x and y are bound together in a composite construct z whose name implies x, then y is a hidden action . In the above example, erase and pause are hidden actions. Hidden actions are particularly difficult to use and understand if they are also nonuniform. For ex- ample, if begin opens a frame and either erases the screen or leaves it unaltered depending on the context, the language is nonuniform and a programmer must check all possible contexts for begin to verify the ac- curacy of his program. As mentioned earlier, inseparability often creates a need for further extensions to a programming language. Consider, for example, a language containing a while loop. The exit condition is part of the head of the loop. For a programmer who needs the exit condition at the end of a loop, the repeat-until construct can also be included. This is insuf- ficient, however, because as long as the exit condition is part of either the head or tail of a loop, a programmer still cannot easily solve the "n and a half executions of a loop" problem [Knuth 74] without using a goto . The goto is a mechanism for "programming around" the problem and can be used to indirectly separate the exit condition from the head or tail of a loop. Separability suggests that a few general language constructs are better than many constructs that have only specific utility. Sepa- rability does not imply, however, that fundamental machine operations should not be combined into high level language constructs; it does sug- gest that caution should be exercised in the creation of special purpose constructs. 27 2.3.3 The Locality Principle "Local" means involving or affecting only a restricted area or portion. If a programmer needs to consider only a restricted portion of his program to obtain some desired information such as the set of pos- sible values of a particular variable, the information is localized. The locality principle suggests that local information is preferable to global information. Furthermore, it suggests that permanent information is preferable to alterable information and that static information is preferable to information that can only be obtained by considering the dynamic flow of a program. When information is restricted to some por- tion of a program, it is easier for a programmer to check the accuracy of his program. In order to describe locality, we extend ML to include declara- tions, procedures, nested frames, and a few more statements in addition to write , size , erase , pause , goto , and j ump . program ■* block block ■> [ (declaration ;)' (statement ;) declaration -> declare identifier ■*■ procedure identifier; block statement ■* call identifier ■*■ frame frame identifier' block 28 Locality is defined in terms of an abstract machine M. M con- tains an infinite array of storage cells S. At execution time, M repre- sents every ML program block in S as depicted in Figure 3. The static block pointer for a block b points to the base location in S of the statically enclosing block , the ML block which textually surrounds b, and the dynamic block pointer points to the base location in S of the dynamically enclosing block , the ML block which transferred control to b. ML transfers control to a new block when it encounters a frame or executes a call statement. identifier. identifier static block pointer dynamic block pointer Figure 3. An ML Block in M When control enters an ML block, M sets the static and dynamic block pointers and establishes the information declared in the block. In the case of the outermost block, both the static and dynamic block point- ers are null . When control leaves an ML block, the storage in which it is represented is released. 29 During execution of an f1L program, control can always be as- sociated with some ML program block. The block in which control resides is the cur rent block . Information is local if it can be accessed in the current block. Information is statically accessible if it Can uc accessed by following one or more pointers along the static block chain and dynami cally accessible if it can be accessed by following one or more pointers along the dynamic block chain. If information is statically or dynamically accessible, it is global . Accessible information is either permanent, can only be read, or alterable, can be both written and read. The scope of an object of information is the portion of a pro- gram where the object has the same meaning and can be referenced. A programming language adheres to the local ity pri nciple if its constructs are designed as close as possible to the top of the partially ordered locality lattice shown in Figure 4. Locality suggests that language constructs trial locally access information are preferable to constructs that statically access information, and these, in turn, are preferable to constructs that dynamically access information. Further- more, locality suggests that permanent information is preferable to alterable information. Locality aids programmers because it structures information. When a programmer has a large amount of information to consider, any mechanism that structures this information or restricts it so only a small subset needs consideration is most helpful. When a programming 1 tnguage enco 1 "lity, it reduces the amount of tert a programmer must consider in order to determine the effect of a language construct. 30 Furthermore, it imposes a structure on information accessing methods and restricts the variability of language features whenever possible. Lo- cality also facilitates modularity and is particularly valuable when a program must be modified. permanent, local alterable, local permanent, statically accessible alterable, statically accessible permanent, dynamically accessible alterable, dynamically accessible Figure 4. Locality Lattice (The lattice is a partially ordered set of information accessing methods.) In ML, the size feature is alterable and global to all blocks. When the size statement is executed, global information is established and remains unaltered until the next size statement is executed. The write statement accesses this information to determine how to display text. The size feature, however, can be redefined to be permanent and local. The write statement would then access the unalterable local size 31 information to determine how to display text, for example, it is d - mon to display a title in a larger size, size 3 isplay the rest of the page in the normal size, size 1. with th» permanent, local scheme, this can be coded as [ size 1 : write page 1 ; [ size 3; write TITLE; ]; write Rest of page . . . ; ]; Here, "TITLE" is written in size 3, and "page 1" and "Rest of paqe . . , are written in size i. Under the alterable, global scheme, this can be written as [ size 1 ; write page 1 ; size 3; write TITLE; size 1; ..■ : Rest of page . . . ; ]; The effect is the same as before; but if the programmer forgets to reset size after writing "TITLE", "Rest of page . . ." will be written in size 3, Use of alterable, global information can leaa to '■ > I de effects . When an operation alters global information unrelated to its main func- tion, this alteration is a side effect. For instance, a global variable may be altered in a called procedure and later accessed to direct the flow of control in the calling routine. The change in control flow caused by the altered global variable is likely to be unclear if the 32 setting of this global variable is not the main purpose of the called procedure. Side effects that result from information accessing methods lower in the locality lattice are generally more difficult to detect than those higher in the lattice. Many high level languages include features that increase locality. Procedure parameters are one example. In most programming languages, they not only serve to pass information between calling routines and subroutines but also to limit the scope of information. In the case of a cal 1-by-val ue parameter, information that would have only been dynamically accessible in the absence of parameters becomes local information. In the case of a call -by-reference parameter, even though the information remains dynamically accessible, at least the name of the parameter becomes local. Call -by-reference parameters, however, can lead to extremely bad side effects because global information is altered through the dynamic accessing method which is at the lowest point in the lattice. When a programming language encourages locality, it structures information and thus reduces the amount of text a programmer must con- sider in order to determine the effect of a language construct. Several language features can be introduced into programming languages in order to increase locality. 2.3.4 Summary The thrust of the three design principles, uniformity, separa- bility, and locality, is toward "simplicity." Adherence to uniformity 33 results in fewer rules, exceptions to rules, context dependencies, and language constructs; a programmer has less to remember. Adherence to separability reduces awkward, indirect programming; a programmer can say what he wants rather than what he does not want. Adherence to locality imposes structure, restricts variability, and reduces the amount of program text that must be considered; a programmer can more easily determine the effect of his program. There are tradeoffs, however, and sometimes a violation of one of these principles may actually benefit programmers. Although de- fault values and the freedom to use various syntactic forms to express the same meaning cause violations of the uniformity principle, they can be helpful. Used in excess, however, multiple syntactic forms cause confusion. With regard to separability, special-purpose, composite constructs allow a programmer to produce desired results efficiently, but only if these constructs fit naturally into his program. The dan- ger is that a programmer may have to "program around" various problems when composite constructs do not quite fit his needs. Strict locality is not always best or even always possible. Some information must be variable, and the use of global data and the ability to dynamically update variables through procedure parameters are useful and pragmatic programming techniques. Nevertheless, the more structured the informa- tion flow in a program, the easier it is for a programmer to debug and maintain his program. Besides the tradeoffs, there are also interactions among the three design principles. When a language construct violates two or more 34 of these design principles, the difficulties for a programmer are usu- ally compounded. For example, if a particular symbol in a language is used to denote frame entrance and if frame entrance either does or does not include erase depending on the context, then the language violates both uniformity and separability. Furthermore, if the behavior of inhibit erase , a mechanism to indirectly separate frame entrance and erase , does not adhere to the locality principle, the language violates all three design principles; and a program written in this language may become unusually difficult to understand. In the chapters that follow and particularly in Chapters 5 and 6, the three design principles, uniformity, separability, and locality, will be helpful in our analysis of language constructs. With these design principles in mind and with experimental and formal tools for language design, we proceed to investigate control constructs for inter- active computing. 35 3. STRUCTURED FLOW IN INTERACTIVE DIALOGUES Section 3.1 introduces the KAIL selector, and Section 3.2 ex- plains and gives the results of an experiment to test the selector for psychological soundness. In Sections 3.3 and 3.4, trouble spots and possible extensions are discussed. Finally, Section 3.5 formally de- fines the semantics of the KAIL selector. 3.1 The KAIL Selector Perhaps the most complex flow pattern in interactive dialogues is found in CAI answer judging. A control construct for answer judging must include not only a facility to select from a number of paths based on sophisticated judging machinery but also a facility to recycle a question and a facility to leave the construct when a question is satis- factorily answered. A typical flowchart for answer judging is given in Figure 5. In a question-answer dialogue, a CAI system presents a ques- tion, accepts a reply, compares it with a set of anticipated answers, responds appropriately, and either recycles or terminates the process. Using typical one-in/one-out control structures similar to those in ALGOL or PL/1 encourages a lesson author to code the flowchart (Figure 5) as in Figure 6. This code has several disadvantages: (1) an extra control variable, answer_ok, is necessary, (2) the reply is re- peatedly written in each test, and (3) long sequences of " else if " can become tedious to write. Moreover, question-answer dialogues are seldom thought of as looping processes. 36 U) CD -o S- s s- rc> -C O o o Q. 37 present_question; answer_ok *■ fal se ; while not (answer ok) do begin accept reply; i_f match(reply, answerj) then begin response_l ; answer_ok *■ true ; end else if match(reply, answer_2) then begin response_2; answer_ok +■ true ; end ej_se rf match (reply, answer_n) then begin response_n; end el se response_for_unanticipated_answer; end; Figure 6. ALGOL-Like Encoding of the Flowchart in Figure 5 38 These disadvantages are avoided in the KAIL selector [Embley and Hansen 76], a control construct proposed for structured flow in interactive dialogues. The selector syntax will be introduced as needed in the discussion; the essential syntax can be sketched as selector ■*■ [ statement-sequence control choices ] control + rf expression * choices -*■ (| relation expression : statement-sequence) where expression, relation, and statement-sequence are defined in the base language. (The two expressions are referred to as the control- expression and the choice-expression.) One of the statements that can appear in a statement-sequence is again . statement -»■ again The semantics of the selector are formally defined in Section 3.5; in- formally, the selector is executed in five steps: 1. Execute the initial statement-sequence. 2. Evaluate the control-expression and save its value in a temporary, t. 3. Test each choice in turn until a "selected choice" is found such that (t relation choice-expression) yields true . 4. Execute the statement-sequence in the selected choice. 39 5. If again is encountered in the statement-sequence of the selected choice, immediately return to step one. If no selected choice is found, steps four and five are omitted. One additional choice is choices ■*■ else : statement-sequence If evaluation reaches this choice, which must occur last, it is the selected choice and its statement-sequence is executed. The flowchart for CAI answer judging (Figure 4) can now be coded as in Figure 7. A comparison of the code in Figures 6 and 7 re- veals the following observations. (1) There are two levels of nesting in the ALGOL-! ike code compared to one level of nesting in the KAIL code. (2) The KAIL code is ten lines shorter than the ALGOL-like code. (3) The else if is replaced by the shorter "|", a savings of six keystrokes. (4) An extra control variable, answer_ok, is necessary in the ALGOL-like code. (5) The variable "reply" only appears twice in the KAIL code versis n + 1 times in the ALGOL-like code; n is typically between two and ten. (6) Depending on the number of correct or incorrect replies, the ALGOL-like code will have more answer_ok «- true statememts or the KAIL code will have more again statements. Usually more incorrect responses are anticipated than cor- rect responses, so again is likely to appear more often. 40 present_question; [ accept reply; i_f reply matches answer_l matches answer 2 response_l ; response 2; ]; matches answer_n : responseji; again ; else : response_for_unanticipated_answer; again ; Figure 7. KAIL Selector Encoding of the Flowchart in Figure 5 41 The following is a specific example which tests a student's arithmetic abilities. x +■ ran_d(25); y «- rand (25) ; t set x and y to random integers in [1, 25] t write What is + ? ; [ accept reply; ijF reply = x + y : write Very good; right_count «- right_count + 1; = x * y : write Add, don't multiply; again ; |>x + y + 20: tx^x div_ 10; ty *■ y di_v 10; write But <{tx+l>0 + 0 is only <(tx+ty+2> 0! ; again ; I Due to a set of defaults, the selector supplants many other con- trol structures ( if-then-els e, case , labelled case , . . .). The expres- sions and relations may be omitted with the following result: control-expression -1 ( true ) relation choice-expression (1) -1 ( true ) choice-expression (2) ( false ) choice-expression (i) i-2 (i > 2) When the choice-expression is omitted, the relation and ":" must also be left out. (These defaults were motivated by TUTOR conventions. Their ad- vantages and disadvantages are discussed in Section 3.3.) 42 The selector unifies iteration and selection in a simple manner. An alternate syntactic form for the selector is selector ■+ *[ statement-sequence loop-control choices ]* loop-control -*• while expression ■*■ until expression Note that the symbol "*" is part of the language and is not the meta- symbol " ". For iteration a sixth step is added to the selector semantics: 6. For while , if no selected choice is found, exit; other- wise, return to step one. For until , if no selected choice is found, return to step one; otherwise, exit. In the selector, the loop exit is neither part of the head or tail of a loop. Adherence to the separability principle permits the selector to subsume both the wh i 1 e and the repeat-until and solve the "n and a half loops" problem. As an example of selector iteration, consider the inner loop of C.A.R. Hoare's "quicksort" algorighm [Dijkstra 71]. t m < n and A(m-l) exists and A(m-l) < A(i) for m < i < n t i ■*- m; j -*- n - 1 ; v <- A(n); *[ *[ while A(i) < v : i •*- i+1 ; ]*; t increment i t *[ while | A(j) > v : j «- j-1; ]*; t decrement j t while i < j A(i) «-> A(j); t swap A(i) and A(j) t ]*; 43 Selectors can become deeply nested. To help the reader and compiler match brackets, the programmer may tag them with matching sel- tags selector ■> sel-tag selector sel-tag where the two sel-tags are both the same unique identifier. Moreover, sel-tags allow exit and again from within nested selectors. statement -* again sel-tag" ? ■* exit sel-tag' Without the sel-tags, these statements reference the innermost selector in which they are found. A selector may also serve as a blocko The control, control- expression, and choices may be omitted leaving a simple statement-sequence enclosed in brackets. Declarations may also appear immediately follow- ing the opening bracket of a selector, and items declared in a selector are known only within the selector's scope. 3.2 Psychological Soundness Observations on the characteristics of the KAIL selector and its possible use in programming suggest the following HYPOTHESIS: the selector is easier to understand than an equivalent set of traditional constructs . The selector should enhance understanding because programmers would only need to know one control construct, not several. In the selec- tor syntax, symbols replace keywords in some instances. This should also 44 enhance programmer understanding because it tightens the code and pdcks more information into a program listing. For instance, the closure it- self indicates whether or not the statement is to be repeated. The selector also allows multiple selection within a loop without requiring a second level of nesting. Whenever this construct is applicable, the reduced nesting should aid programmer understanding. An experiment to test this hypothesis is reported in a separate document [Embley 75b] and summarized below. 3.2.1 The Experiment Two versions of two programs were written. One version, S, used the selector while the other version, A, used ALG0L68-like if- then- else , case , and while constructs. One of the programs (A-favored) read naturally in the A language because it made common use of if-then-else , case , and wh i 1 e . The other program (S-favored) took advantage of the selector to eliminate a level of syntactic structure. As subjects began the on-line experiment, the system welcomed them and told them what to expect. They then completed a background in- formation section; and based on this background and on the order of ar- rival, the system assigned them to one of the four groups shown in Figure 8. Taking advantage of the controlled teaching environment, the CAI system taught the syntax and semantics of each subject's assigned language. To establish some minimum level of understanding, the system asked each subject six questions as it taught the language. Each question 45 language A- favored, a S-favored s- o e S-favored, re S_ o A-favored group #1 group #2 group #3 group #4 Figure 8. Groups in the Selector Experiment tested some aspect of the language; moreover, the questions served to familiarize subjects with the type of question-answer interaction re- quired later in the experiment. After teaching the languages, the system presented the problems one at a time in the predetermined order. A subject first read an English description of the problem until he felt he understood its purpose. The system then gave the subject precisely 100 seconds to study the program before having to make a subjective self-evaluation of his understanding on a - 9 scale. (The actual programs were short, about fifteen lines in length.) After placing the program back on the screen, the system in- dividually sequenced each subject through 10 questions. Each subject either answered a question correctly or gave up after 2 minutes, or 3, or 3-1/2, or 3-3/4, etc., up to 4 minutes. The system noted elapsed time for each input and recorded all wrong answers. After completing question 10, a subject made a final self-evaluation of his understanding 46 on a - 9 scale as before. The second problem was then presented in the same fashion as the first, and the experiment was complete. 3.2.2 Results A subject's understanding of the two programs was measured in several ways: 1. Number of questions answered correctly, 2. Number of questions answered correctly on the first try, 3. Time taken to answer a question correctly, 4. Initial self-evaluation, 5. Final self-evaluation, and 6. Difference between final and initial self-evaluation. Subjects using the S language answered on the average 16.7 of the 20 questions correctly; subjects using the A language answered only 15.1 correctly. This result is significant (df = 1, F = 5.34, p £ .025). The selector construct indeed enabled subjects to answer more questions correctly. Table 1 shows how each of the four groups performed in each measure of understanding. Group #2, using the S language and taking the S-favored problem last, performed best in all categories except the sub- jective self-evaluations. This is probably because they had the advan- tage of dealing with the S-favored program last, after having become more familiar with the S langauge. They answered about two more questions than any other group both on the first attempt and in the end. To analyze 47 Table 1 Performance of Subjects in Selector Experiment Group #1 #2 #3 #4 Problem order A-favored first S-favored first Language A S A S Number in group 19 13 15 11 Number answered correctly (20 possible) 15.2 17.5 15.0 15.8 Number answered correctly on the first try (20 possible) 11.7 13.3 11.8 11.4 Standardized mean time to answer correctly .055 -.12 .14 .057 Initial self-evaluation (A-favored program) 2.9 2.3 2.0 2.3 Final self-evaluation (A-favored program) 4.1 3.6 3.6 3.2 Initial self-evaluation (S-favored program) 2.9 3.4 2.5 4.2 Final self-evaluation (S-favored program) 5.4 5.7 4.5 5.7 Note: For the self-evaluations, zero represents no understanding and nine represents perfect understanding. the average time taken to answer a question correctly, the raw data was first standardized; and an average for each subject on each problem was computed. This removed the effects introduced by differing degrees of difficulty in the questions. Again Group #2 performed best, having 48 taken less time. Although these results were not significant, they are in the direction to support the hypothesis. In the subjective self-evaluation, one result was significant. Disregarding order, the S language subjects thought they initially under- stood the S-favored program at level 3.1, whereas the A language sub- jects only thought they initially understood at level 2.7 (df = 1, F = 5.54, p < .022). The experiment shows that empirical data can be obtained to support claims about language constructs. Significant results were observed which support the hypothesis that the selector can be read and understood at least as well or perhaps better than an equivalent set of traditional constructs. 3.3 Selector Trouble Spots 3.3.1 Multiple Syntactic Forms The selector violates the uniformity principle because the de- faults, split in relational operands, and else allow many alternate forms for any one semantic. A typical if- then-else vf a < b then min «- a else min +■ b can be expressed in six ways without explicitly introducing any other operator or the defaults true and false. [ rf a < b min «- a; min «- b; ] [ if a < b min •«- a; | else : min ■*■ b; ] 49 [ rf a < b min ■<- a; | a < b : min «- b; ] [ i_f a | < b : min «- a; else : min «- b; ] [ j_f a < b : min ■*- a; a : min «- b; ] [ vf | a < b : mi n «- a ; else : mi n ■*■ b ; ] Many other versions can be created if other operators and true and false are used. There seems to be no simple solution to this violation of the uniformity principle if the defaults, else , and the split in relational operands remain to accomodate CAI answer judging. One possible alterna- tive is to impose appropriate stylistic rules and let the compiler give a warning or error message if these rules are violated. Another possible alternative is to separate constructs for answer judging from loop and selection constructs, but then the advantages of a single unified control construct would be lost. 3.3.2 Minus One Using minus one as the initial value of the default sequence is awkward. One solution might be to start the default sequence at zero and let true be zero and false be one, but this seems to contradict the intuitive representations of true and false . Another solution might be to introduce booleans into the language. With this scheme, if the control expression yields a boolean result, the default values would be respectively true and false for the first and second choice; but other default values would be undefined. When the result of this control ex- pression is nonboolean, the default sequences would start with one. 50 3.4 Possible Extensions 3.4.1 Multiple Comparands One reasonable extension is to let more than one alternative select a single choice. * choices -* (| comparands statement-sequence) -> else : statement-sequence / ? • x* comparands ■> (relation' expression :) When a choice contains several comparands, the comparands are evaluated from left to right until the first match is found. The comparands be- yond the first match are not evaluated. This can help solve the problem of how many terms in a disjunction are evaluated. In rf A or B or C then . . . if A is true, the system may or may not evaluate B and C; but in [ vf_ | A:B:C:... the system only evaluates A. 3.4.2 Case and All Additions to rf, while , and until might ease the programming task. Likely candidates for inclusion are case and all . Case would shift the default sequence from -1 , 0, 1 , . . . to 1 , 2, . . . , and is another possible solution to the minus one trouble spot. All would select ewery choice that matches. 51 3.4.3 Selection of Other Constructs Perhaps the selector should select expressions, labels, screen coordinates, write text, etc., as well as statements. The selector could be used in these and similar contexts by replacing the statement- sequence in every choice with the appropriate construct. For example, in an expression context [ if a < b | a | b ] would be permissible. Note that this construct permits statements inside expressions, write text, etc. This may be undesirable, but the initial statement-sequence could easily be eliminated at all levels except the statement level. 3.4.4 Generalized Exit and Again Permit exit and again to operate at the procedure level. Exit could replace the typical return found in many languages and again could add additional iterative capabilities to procedures. 3.5 Formal Semantics Based on C.A.R. Hoare's axiomatic approach to formal semantics, the semantics of the KAIL selector are now formally defined. This ap- proach is chosen because it is relatively simple and because it reveals the similarities and underlying differences in the control types. The necessary hypotheses and the resultant conclusions characterize the 52 properties of each control type. After the selector is defined, impli- cations are discussed. 3.5.1 Definition For notational purposes, we assume the selector is written as P sel-head P~ s~ Q fi control -type e fi i r i e i : p i s i Q i y y ' / y 9 re : P s Q 1 n n n n n sel-tail Q where 1. Capital letters are predicates. (The predicate R is also used as needed below, to indicate the precondition of a statement in one of the statement-sequences.) 2. Sel-head is either "[" or "*[", and sel-tail is the cor- responding "]" or "]*". 3. Control -type is one of rf, while , until , case , or all . 4. The s. are statement-sequences. 5. The e- are expressions. 6. The r. are relational operators. 7. All defaults are expressed explicitly (i.e., the r. and e. appear in every choice). 53 8. Else does not explicitly appear. (It may be represented by letting r be "=" and e be e n if e~ is unaffected 3 n n U by side effects.) Until later in the discussion, we assume that e n is unaffected by undesirable side effects, Since we seek an inference rule of the form H T H 2' ; * " H m P selector Q we must first present the necessary hypotheses, H. As a trivial hypothesis, we need Q, n A i=l (R 2 (0,i) ^> ■V. Q ^ Q P selector Q 55 For while (without exit), ^(O.n) > Q, A (M0,1)3 PJ, i=l * ] Q ^ P n X P selector R^O.n) For while (with possible exit ) , H, ^(O.n) 3Q, A(R ? (0,i)3 P ), 1=1 * n W C P selector Q For until , H, R^O.n) => P Q , A (R ? (0,i) 5P), i = l * n Q > Q c P selector Q 56 For case, see if. For all , H, n i-1 A ((V Mj.i-D) a (e n r. e.) DP.), i=l .1=0 ' u i i i n V Mi.n) ^Q i-0 ' P selector Q It is possible to give a general inference rule for the selector, but it is preferable, to leave each control type separate in the form given above. We now consider how side effects interfere with the selector. If the control expression, e Q , is evaluated with each comparison, it is vulnerable to side effects. To avoid this, we map P sel-head P Q s Q Q Q control -type e r. e. : P. s. Q. li l i y i sel-tail Q 57 into P sel-head P Q s Q Q Q t «- e n Q Q control -type true = (t r. e.) : P. s. Ql sel-tail Q where t is local to the selector and is initialized only once on each pass. This change prevents the alteration of the control expression during the execution of each pass through the selector. Furthermore, else can now be represented by = (t = t) and is thus no longer vulnerable to side effects. Undesirable side ef- fects, however, may still affect the choice expressions, (t r, e,) 3.5.2 Discussion As an immediate consequence of the formal definition, we ob- serve that if and case are semantical ly identical. The only advantage that can be gained by retaining both control types is the shift in the default sequence. The disadvantage is that this allows even more syn- tactic forms for a single semantic notion (Section 3.3). The formal definition shows that all is more complex than the other control types (Table 2). Although the number of hypotheses is 58 Table 2 Comparison of Complexity of Control Types Control Types Measures of Complexity rf, while , and until all Number of hypotheses n + 5 n + 5 Number of NOTs in hypotheses n + 3n n 3 + 3n + 2n Number of ANDs in hypotheses « Number of ORs in hypotheses n 2 + 3n n 3 + 3n 2 + 8n 2 n + n Maximum number of statement sequences s., < i < n, executed on each pass 2 n + 1 Number of possible paths on each pass n + 1 2 n 59 the same for IMF, while , until , and all , the hypotheses for all are more complex as measured by the number of operators NOT, AND, and OR. The 3 2 hypotheses for all contain 0(n ) NOTs and ANDs and 0(n ) ORs, whereas 2 the hypotheses for the other control types contain only 0(n ) NOTs and ANDs and contain no ORs. The number of operators relates to the number of possible configurations; fewer operators implies fewer cases for the programmer to consider. Other analyses show that the maximum number of statement sequences executed on a single pass through the selector is linear for all but constant for the other control types, and that the number of possible paths through the selector on a single pass is ex- ponential for all but only linear for the other control types. A program- mer has less to consider with vf, while , and until than will all . More- over, the essential properties of all can easily be simulated. [ if e Q r ] e ] j s ] ]; [ 11 e o r 2 e 2 I S 2 ^ • • • [11 e r n e n | s n ]; ]; The complexity of all as compared to the other control types and its easy simulation in terms of simpler constructs are indications that all should not be retained as one of the control types. Through the formal definition of the selector, we can also gain additional insight into the multiple comparands extension. If more than 60 one alternative selects the n choice, re * r e * 're *s m,l m,l m,2 m,2 • • • • m ^q m ^q • m 3; the formal semantics can easily be augmented to accommodate this change, We map this into [ r m,l e m,l : s m r m,2 e m,2 : s m • • • re : s m,q m,q m ]; and apply the inference rules as stated. Note this clearly expresses the idea that only the comparands that precede a matching comparand are evaluated. This clarifies any ambiguity that might arise during imple- mentation. Thus, the precise definition helps promote program trans- portability. In connection with this analysis of the multiple comparands extension, consider all . The mapping given above is insufficient if 61 al 1 is retained as one of the control types because this means that the statement sequence s could possibly be executed as many as q times. To formally express the proper meaning, it is necessary to introduce another level of formal structure. We map multiple comparands that select the m choice into [ ... all ■ Q ( I r i e , : s x m I m,l m,l m r e : s 1 m,2 m,2 m r ~ e ~ : s 1 m,q m,q m 1 1 >«* ]; where the parentheses are introduced for grouping purposes only. We must now include the following hypotheses for each multiple comparand. m-1 i V Mi, m-1) 2> Q i=0 ' m q , i-1 A ( (Q A A -> (e n r ■ e . ) ) \ (e n r . e , ) ^ P ) ■ _V m • -j m,j m,j v m,i m,v m' Q > q" ^m m Although these hypotheses do not increase the order of magnitude of the 62 complexity of all , they do indicate that a programmer must consider a multiple comparands choice associated with all differently from a mul- tiple compa rands choice associated with the other control types. This is a further indication that all should not be retained as one of the control types. A consideration of the differences between the hypotheses of the control types i_f, while , and until leads to an evaluation of another possible control type. If. includes Q C ^Q, R^O.n) ^Q . While includes ^V ^(O.n) ^ Q Until includes Q C 3Q, R^O.n) z> P Q . Notice that there is one more combination, namely "c 3 V R^O.n) 3 P Q . 63 This last set of hypotheses implies a loop structure with no exit mech- anism provided by the structure itself. This structure can be a valu- able language construct, particularly for control loops in monitoring programs. Moreover, this analysis shows that formal language design can lead to discoveries. From the formal definition, we can determine how well the selector adheres to the three design principles. If there are no side effects, the control expression, e n , can become part of ewery choice expression or it can remain as the control expression and no change in the predicates is necessary. This is a clear violation of the uniform- ity principle. The other violations of uniformity (Section 3.3.1) are not exposed by this formalism because the defaults are filled in and else is mapped into a choice expression. With regard to separability, the separation of loop exit from the head or tail of the loop is evi- dent because it is not necessary that P = Q n or that P = Q. This means J ( n ( that a statement sequence may appear both before and after any test for loop termination (Section 3.1). The introduction of the local temporary t (Section 3.5.1) that reduces the selector's vulnerabil ity to side effects moves the selector construct higher in the locality lattice. Since t is local and initialized only once on each pass, it is perman- ent and local for a single pass through the selector. Hoare's axiomatic approach to formal semantics is a valuable tool. (1) It clearly exposes similarities (if and case ) and differences (if, wh i 1 e , and until ). (2) It reveals underlying reasons for simplicity 64 and complexity and methods for measuring complexity ( all ). (3) It helps evaluate extensions (multiple comparands) and the interaction of exten- sions with other features (multiple comparands and all ). (4) It clears up possible ambiguities (is every comparand in a multiple comparand evaluated?) and thus increases program portability. (5) It uncovers possible extensions (infinite loop) for further evaluation. (6) It aids in an evaluation of language constructs with regard to the three design principles. 65 4. FRAME SEQUENCING IN INTERACTIVE DIALOGUES 4.1 A Model for Frame Sequencing An interactive dialogue may be visualized as a finite state machine where each state corresponds to a frame in a dialogue. Figure 9 shows a state diagram for the flow of a typical CAI lesson. The labels on the arcs of the state diagram correspond to function key names found on the PLATO Keyboard (Figure 2), and a student moves from one state to another when he presses the appropriate function key. To traverse the arc from state A to state B, for example, a student presses the next key. Many interactive systems have custom keyboards with special function keys similar to those on the PLATO keyboard. These function keys are often used, as indicated, to allow a user to direct the trans- fer of control to some other state. When a user presses a function key, an externally initiated event occurs. 4.2 Implementation Possibilities When an externally initiated event occurs, the system, through a language, must provide a means for the dialogue programmer to obtain this information and act on it. The language must somehow allow the pro- grammer to consider eyery input. Some languages, for example DIAL [Newman 70], include special constructs to simulate a finite state machine as a means of handling this information. In general, a consideration of eyery input can be implemented either explicitly or implicitly depending 66 I II, III Introduction Discussion A. Topic 1 B. Topic 2 C. Topic 3 Quiz expected path alternate path Figure 9. State Diagram for a Typical CAI Lesson (This lesson intro- duces some concept, discusses it, and then gives a quiz. At any time during the lesson, even before the discussion, a student may decide to take the quiz. The lesson allows a student to freely review during the discussion; but once the quiz begins, there is no review.) 67 on the available language features. An explicit implementation requires that appropriate tests be associated with every input statement; whereas an implicit implementation only requires appropriate tests to be as- sociated with each state in the dialogue. Consider state II in Figure 9: next directs control to state A, back directs control to state I, and next! directs control to state III. If a programmer implements the flow of control for state II ex- plicitly, he codes the following tests with every input statement in state II. if key = next then goto state_A; if key = back then goto state_I ; if key = next! then goto state_III; If several input statements are scattered throughout the state, code proliferates and becomes burdensome to write. If a programmer implements this control implicitly, however, he only needs to include the tests once as part of the state information. The system then internally generates all necessary tests. If explicit implementation must be used because the program- mer's language does not support implicit implementation, two techniques can be utilized to avoid code proliferation. 1. Confine input to subroutines. 2. Place input in one main routine which monitors all other modules of a program. 68 The first technique usually obscures the flow of control because it either textual ly separates control transfers from the main routines or encourages the use of side effects to control the program flow. The second technique avoids these disadvantages; unfortunately, not eyery program fits into this pattern. In particular, many interactive dia- logues do not fit this pattern because the behavior of most states de- pends heavily on user input. Placing all input in one main routine would therefore textual ly separate an important part of the dialogue from its usual state. 4.3 TUTOR These problems indicate that a language for programming inter- active dialogues should include features for programming these tests implicitly. The TUTOR programming language [Sherwood 74], developed in connection with the PLATO CAI system, provides lesson authors with this ability. TUTOR is introduced at this point in the discussion to exemplify this technique and to provide a basis for further discussion of frame sequencing in interactive dialogues. TUTORs approach to computer-aided instruction, in its simplest form, is based on a sequence of "display-response" frames. A TUTOR pro- gram repeatedly displays information on the screen and then accepts and processes student responses. A lesson author codes each "display-response" frame as a unit and successive units constitute a lesson. A variety of inter-unit transfers provide for nonsequential flow of control. 69 Each unit consists of a header and a sequence of statements, unit ■* unit unit-name statement Unit nesting is not permitted. Execution begins in the first unit and flows sequentially into subsequent units unless control is altered via internal control transfers or externally initiated events. The system initiates an internal transfer of control to another unit when it encounters one of the branch statements. branch-statement -> transfer-type unit-name Three of the transfer types are j ump , goto , and do. If control enters a unit via one of j ump , lesson entry, sequential flow, or an externally initiated event, it is a main unit . If control enters a unit via goto or do , it is an attached unit . An attached unit always returns control to a unit in the calling sequence. For do, control returns to the statement textually following the call; for goto , control returns to the end of the main unit in the calling sequence. To handle externally initiated events, the system maintains an internal pointer for each function key and transfers control to the designated unit when a student presses a function key. When control enters a main unit, all pointers are set to null except the next pointer which is set by default to the textually next unit. When execution reaches the end of a main unit, the system automatically pauses to wait for student input. If the student presses next , the screen is erased and control passes to the unit designated by the next pointer. By 70 executing a function-key-statement a lesson author can set any pointer, including the next pointer. function-key-statement ■* function-key-name unit-name For example, in state II (Figure 9), unit state_II back state_I nextl state_III • • • unit state_A causes the back pointer to reference unit state_I and the nextl pointer to reference unit state_III. If unit state_A textual ly follows unit state_II, the next pointer is automatically set to state_A on entrance into state_II. In TUTOR, next , next! , back , and back! invoke lesson sequences , and help and help! invoke help sequences . The purpose of a help se- quence is to allow a student to digress from the main dialogue and ob- tain auxiliary information. Lesson sequences are completely independent of any execution history; help sequences are not. When a student in a lesson sequence presses one of the help keys, the system designates the current main unit as the base unit and sets an internal pointer to it. Following the help sequence, control returns to the beginning of the base unit. As shown in Figure 10, a help sequence consists of one or more main units. In a help sequence, the system not only sets the next 71 0) a c O) c a. 3 •r— r— CT (TJ ai a> E .c in c ■r- c c 3 "O CD O o i- Cl a» CO (TJ a. 'cd r^ a» c o ■r- c c — oo ai c CD o -C CD +J E -c O 4-> CT) s- C 4- C •^ •r— S T3 o a> +-> 1— _*: t- ^— o c o > 3 4- c •i- c co •r— c CO fO 1- •r- E 3 a> a» o o- cu s_ co -c .c co (J cu c -C x o 2 +J CO co o 4- a> +■> t— 1 r— -4-> CO •r - .C C o: 4-> 3 o 1— c CD rs -i- CO l— re CO -Q c +-> •r- »r" a) C -C CO 3 4-> ai o -a CO c co CD <1> -C E 3 O o O" f0 o CD +-> ai CO -»-> -O (T3 Q. CD r— S- c • CU O •r™ CD in •«-> U c 3 c -a •<- o CD C f0 $_ 3 rO E O" CT> CD C 0) C 00 O -C •r- CO +-> t— Q. co r— CD 4- fl3 CD — 1 O O -C s- 3 v_y 72 pointer as usual, it sets the back and back! pointers as well and as- sociates them with the base unit. When execution encounters an end , the next pointer is set to the base unit, and the help sequence terminates. 73 5. EXCEPTION PROCESSING In this chapter, exception processing is defined, and the necessary language features are specified. PL/1, TUTOR, and KAIL excep- tion processing features are then explained in terms of the definition. Finally, the KAIL and TUTOR features are formally defined and the re- sults of the formal definitions discussed. 5.1 Basic Definitions In order to define exception processing, we introduce the idea of a condition. Among the conditions are (1) hardware conditions (e.g., exponent overflow), (2) software conditions (e.g., subscript out of range), and (3) external conditions (e.g., an externally initiated event described in Chapter 4). When one of these conditions occurs, it is an exception , and the exception is said to be raised . Exceptions may either be enabled or disabled . When an exception is raised an interrupt occurs if the excep- tion is enabled and does not occur if the exception is disabled . An interrupt transfers control to a section of code which processes the exception. This section of code is the exception handler and may either be programmer or system provided. In order to process an exception , a system must first determine if the exception is enabled. If the exception is disabled, the system ignores it and continues processing. Second, the system must associate the exception with a proper handler. Third, the system must activate and process the handler. If no explicit handler is available, the system 74 activates and processes a default handler (e.g., print an error message and terminate the job). Fourth, if the handler does not terminate the job, the system must resume processing. 5.2 Implementation Possibilities If a language permits programmers to handle exceptions, it must provide constructs for these four aspects of exception processing. 5.2.1 Enabled/Disabled Status Depending on the programming language, a programmer enables (disables) an exception dynamically or statically . An exception is dynamically enabled ( disabled ) if its status depends on the execution of some statement. An exception is statically enabled ( disabled ) if its status depends on entering (leaving) some scope. A statically enabled (disabled) exception is permanently enabled (disabled) over some speci- fied scope. A dynamically enabled/disabled exception permits the ex- ception to be variously enabled or disabled over the same scope. PL/1 exceptions are statically enabled and disabled. Some PL/1 exceptions are permanently enabled over the entire scope of all PL/1 programs; others are either initially enabled or initially dis- abled and may be altered by programming. A programmer enables or dis- ables an alterable exception condition by attaching an appropriate con- dition prefix to a statement, block, or procedure. This enables or disables the specified exception condition within the scope of the 75 prefixed structure. The prefix does not, however, affect any structure lying outside the scope including a procedure which may be invoked during execution of the prefixed structure. In TUTOR, all exceptions may be enabled dynamically through execution of a function-key statement. Next is also enabled statically but as a hidden action associated with entrance into a main unit. In help sequences, back and back! are also statically enabled as hidden actions on entrance into main units. All exceptions are statically dis- abled when a main unit terminates, but they may also be dynamically disabled through execution of a function-key statement with no unit name. TUTOR constructs that enable and disable exceptions violate all three basic design principles (Section 2.3). Tutor violates uni- formity because next , back and back! may be enabled by two different syntactic mechanisms, unit entrance and a function-key statement. More- over, back and back! are automatically enabled only in help sequences. Depending on the flow of control, it is possible for back to be enabled in a particular unit and later be disabled in the same unit. TUTOR violates separability because back and back! for instance, are enabled in connection with entrance into each main unit of a help sequence. To separate this enabling from entrance into a main unit, a function-key statement with a null tag must be executed in each main unit of the help sequence. TUTOR violates locality because an exception may be enabled i in an attached unit far removed from the calling unit where the exception may be raised. 76 5.2.2 Association of a Raised Exception with a Handler When an exception is raised, the beginning of the statement in which control resides is the activation point . A handler is dynamically associated with an activation point if the association depends on the execution of some statement, and statically associated if the associa- tion depends on entrance into some scope. Static association permanently binds an activation point to a handler for the duration of a specified scope. Dynamic association allows different handlers to be associated with the same exception raised at a given activation point. PL/1 exceptions are dynamically associated with activation points. In PL/1, an ON statement consists of an exception condition and a statement, called an on-unit , which defines the handler for that exception. Only after an ON statement is executed is it available for association with an activation point. It becomes unavailable for assoc- iation when another ON statement for the same exception condition is executed, when a REVERT statement in the same block and for the same ex- ception condition is executed, or when the block containing the ON ■ >'; statement is terminated. Note that this last method is static while the other methods are dynamic. It is questionable whether dynamic association is necessary [Goodenough 74]. The possible advantage is that different handlers for the same exception can be associated with the same activation point in the program, but this violates the locality principle because the 77 association is alterable. It is possible to provide this same feature with permanent association by placing a control switch as the first statement in a single handler. It seems unnecessary to violate the locality principle in order to provide this feature. In TUTOR, the pointer mechanism associates a handler with a function key when the exception is enabled and breaks the association when the exception is disabled. Thus, like enabling and disabling, exception association in TUTOR is both dynamic and static and violates the basic design principles as discussed above. 5.2.3 Execution of the Handler An exception may either be handled by the system alone or by programmer provided code. If a programmer provides code, the language may restrict its form; and the system may execute it in a special mode disallowing some actions and automatically providing others. PL/1 restricts the types of statements that may appear in on- units. A RETURN statement may not be an on-unit nor may it appear as one of the statements of a begin block used as an on-unit. PL/1 also monitors the handling of interrupts and provides special actions in some cases. When the CONVERSION condition is raised twice in succession by the same operation, for example, the system automatically raises the ERROR condition. TUTOR help sequences are executed in a special mode. On entrance into each main unit, the back and back! pointers are set in addition to the next pointer, and an end command terminates the help sequence. This violates the uniformity principle because in the context 78 of a lesson sequence only next is set on entrance into a main unit and end has no special meaning. 5.2.4 Resumption after Execution of a Handler After execution of a handler, the system may continue proces- sing at one of four different places. 1. It may resume at the activation point. 2. It may resume at the statement following the activation point. 3. It may continue with the statement to which control is explicitly transferred by one of the language defined branch statements. 4. It may resume at some special default point. After a PL/1 on-unit is executed, control usually returns to the statement following the activation point. It is always possible, however, for a programmer to branch out of an on-unit and direct the flow of control by means of a GOTO. Since the default in PL/1 is to resume processing near the activation point, PL/1 does not make use of a sepa- rate default point at which control may resume. Handlers in TUTOR may only initiate lesson sequences, which never return, or help sequences. After execution of a help sequence, control resumes at the beginning of the base unit, the designated default point. Options 1 and 2 are not used. 79 5.3 KAIL Exception Processing The KAIL exception processing scheme is now presented and contrasted with the schemes for PL/1 and TUTOR. The next chapter re- ports the results of an experiment which compared a KAIL-like scheme with a TUTOR- like scheme. The following is the KAIL syntax for exception processing constructs. enable-declaration ■*■ enable exception-name-list exception-name-list ■* exception-name (, exception-name) disable-declaration ■+ disable exception-name-list exception-declaration •*■ on exception-name-list handler' handler •*■ statement All KAIL statements are acceptable in an exception declaration, but only those pertinent to the discussion are explicitly given. statement ■*■ selector -*■ resume ■> exit identifier' ■* again identifier' -»■ goto label KAIL exceptions are enabled and disabled statically and only affect the scope in which they are defined. Thus, they are high on the locality lattice. • 80 Unlike PL/1, KAIL exceptions are statically associated with a handler. This static association method allows exception declara- tions, like enable and disable declarations, to adhere to the locality principle. When a KAIL exception declaration appears in a block, it also enables the exceptions in the exception name list and associates them permanently with the handler until control leaves the block. This renders an enable declaration in the same block superfluous. In nested blocks, however, the same exception may be disabled and then enabled; it may also be redefined. In Figure 11, if exception, is raised in any block, the system activates handler,. If exception^ is raised in block C, it is ignored; if it is raised in block D, the system activates handler.; elsewhere in block B, the system activates handler^. Excep- tion^ is only enabled in block C; if exception^ is raised in any other block, it is ignored. In KAIL, when an exception is raised and associated with an exception declaration, the handler is executed. The handler is uniform because no special restrictions or semantics dictate the form or behav- ior of the statement, and no separate construct such as a PL/1 on-unit is necessary. There are four KAIL statements which a programmer can use to designate the point where control should resume after a handler is executed. 1 . resume 2. goto 3. exit 4. again 81 A_[ on exception, handler,; • • • B_[ on exception^ handler^; • • • C_[ on exception- handler-; disable exception^; • • • LP! D_[ on exception^ handler,; . . . ]_D; • • • ]_B; • • • ]_A; Figure 11. Exception Declarations in Nested Blocks (Exceptions declarations obey ALGOL-like scope rules. ) If a resume statement in the static scope of a handler is executed, con- trol transfers to the activation point. (The resume statement violates the uniformity principle, however, because in any other context it is ignored.) Exit , again , and goto are branch statements which a programmer may use as usual to explicitly transfer control to some point in the program. If none of these explicit statements is used to designate where control should resume, control continues by default with the statement textual ly following the exception declaration, the first 82 statement of the block in which the exception declaration appears. This default is unusual but useful. In CAI, for example, it facilitates re- generation of screen text after return from a help sequence. There is one other point, namely the statement following the activation point, where a programmer may wish to resume after a handler is executed. This extension is easily accommodated by including another statement that has the desired meaning. Since exception processing is statically defined, the question of how to extend the scope of an enabled exception into a called proce- dure remains. By analogy, the scope of a local variable may be extended into a procedure by explicitly passing it in a parameter list. It is somewhat awkward, however, to immediately adopt this for exceptions. By their very nature, exceptions are raised in exceptional cases, and two calling routines may very well have a different set of declared excep- tions. For example, a display used frequently in an interactive dialogue may be called under many different circumstances. It is unreasonable for the display procedure to constrain a calling routine to a particular exception processing environment by forcing it to supply an exception name for every formal exception name in the parameter list. Since a programmer, on the other hand, should be able to locally consider a procedure and be able to tell how it will behave, the names of possible exception conditions which may be extended into the scope of a procedure should somehow be provided. The KAIL syntax requires all possible exceptions that may interrupt a procedure to be listed in the procedure heading. 83 ? procedure-heading -*■ procedure identifier parameters' * (, exception-name) Unfortunately, this nay mislead unwary programmers because not every exception in this list is available on each call. The absence of an exception name, however, insures its unavailability. Although not ideal, the list is a valuable check for both the compiler and the pro- grammer. 5.4 Examples As an example of exception processing, consider how a program- mer might use the goto to provide a user with the ability to back up from one state to another. state : [ ... ]; state : [ on back goto state : ... ]; m J n When the back exception is raised in state control transfers to state . m n Although C. Boehm and G. Jacopini [66] have proved the goto unnecessary and the structured programming movement discourages its use, inclusion of the goto as a construct for programming dialogue inter- ruption is justifiable. Consider the problem of implementing the flow diagram of Figure 9 according to the given specifications. (For conven- ience the state diagram in Figure 9 is reporduced in Figure 12.) Figure 13 is a possible goto implementation, and Figure 14 is a possible goto- less implementation. 84 expected path alternate path Figure 12. State Diagram of a Typical CAI Lesson 85 I II i nt ro_[ on_ nextl goto III; disc_[ oil back goto I; on back! goto II; on nextl goto III; . ]_intro; III: A: topi £ on back goto II; B: top2 [ on back goto A; C: top3_[ ojx back goto B; ]_disc; quiz_[ . . . ]_quiz; ]_topl ; ]_top2; ] top 3; Figure 13. Goto Version of Exception Processing for the CAI Lesson of Figure 12 86 In Figure 13, each state stands out clearly and the expected path is clearly indicated by the sequential ordering. The unobtrusive exception declarations clearly indicate alternate paths and the scope of active function keys. Unlike Figure 13, the flow in Figure 14 is unclear. The extra control variables, t and tt, which designate the next state are detri- mental to the readability of the probram. An extra level of the selector is necessary in two places in order to serve as a pseudo_label for again . The again , of course, could be replaced by yet another control variable and a looping selector, but this would further complicate the problem. Basically, the reason goto is acceptable in the context of dialogue in- terruption is because it mirrors our natural thought process (e.g., "Quit doing this and c[0_ do something else."). Consider one more example (Figure 15). Assume the user is expected to enter A and then proceed to B where information is displayed and some interaction takes place. Assume further that two help displays, H, and hL, are available. If the user accesses the help displays from A, the screen is to be erased, but if entered from B no erase is to occur. Once the user has entered B, next! always refreshes the screen and re- starts the process at the beginning of B. This occurs even when the user is in H, or H ? . The code in Figure 16 indicates how this problem might be solved. This example shows the use of default resumption in block B and C and the procedure heading list in procedure H. Note that next! is active in H only when called from within block C. 87 t «■ 1 ; tt «- 1 j lesson_[ intro_[ if t = 1 [ cm nextl [ t «- 3; exit intro; ]; ... t «- 2; ]; ]_intro; disc_[ If t = 2 | [ on back [ t «- 1; again lesson; ]; on back! [ tt «- 1 ; again di sc ; ] ; on nextl [ t •*■ 3 ; exit di sc ; ] ; • • • n_[ topi_[ if tt = i | [ on back again disc; ... tt ■*■ 2; ]; ]_topl; top2_[ if. tt = 2 | [ oil back [ tt +■ 1 ; again II;]; ... tt «- 3 ; ] ; ]_top2; top3_[ IJF tt - 3 | [ on back [ tt «- 2; again II; ]; . . . ]; ]_top3; t + 3; ]; ]_disc; quiz_[ if t = 3 1 ... ]_quiz; ]_lesson; Figure 14. Goto- Less Version of Exception Processing for the CAI Lesson of Figure 12 38 -*■» expected path alternate path any key but back and nextl any key Figure 15. Transition Diagram of an Interactive Dialogue 89 A_[ cm help [ erase ; H ; exit A ; ] ; • • • t create display A t LA J B_[ cm nextl ; erase; • • • t create display B t C_[ on help H ; • • • t interact with the user t ]_c; LB 5 procedure H, nextl ; Hl_[ on_ back exit H; • * • t create display H, t ]_Hl; H2_[ • • • t create display H 2 t ]_H2; end H; Figure 16. KAIL Code to Implement the Transition Diagram in Figure 15 90 5.5 Extensions The KAIL exception processing scheme discussed thus far can immed- iately be extended to handle arbitrary exception conditions such as overflow, end-of-file, etc. Figure 17 illustrates the use of PL/1-1 i ke interrupts. After the program initializes the array A, it reads input until the endfile condition is raised. If the conversion ( conv ) or subrange (s ubrg ) error occurs during the in^ process, a message is written and i_n continues. The out process then prints the nonzero elements of A and writes "*" if conversion ( conv ) errors occur. Another extension allows users to define and raise their own excep- tions. To define exceptions, a programmer only needs to include a uni- que identifier in an exception name list. exception-name ■> function-key-name -> identifier A construct of the form signal exception-name dynamically raises an exception. Although this feature may be advanta- gous, it may also be disadvantagous becuase of its possible use in place of a goto. Such a use would be a violation of the uniformity principle. 5.6 Formal Semantics Because the locality principle is defined in terms of an abstract 91 t read and print sparse m by n matrix t prog_[ integer i , j ; integer array (m,n) A +■ 0; t declare and initialize matrix A t in_*[ while true [ on_ endfile exit in ; on conv write conversion error; on subrg write ( <(i)> , <(j)> ) out of range; accept i, j, A(i,j); write ; ]_*_>; i - 1; out_*[ while i < m j *■ 0; *[ cm conv write *; j +■ j + 1 ; while j s n C ±fA(1.j) ^ write A( , ) = ;]; ]*; i «- i + 1; ]*_out; ] prog; Figure 17. PL/1 -Like Interrupts Coded in KAIL 92 machine, and because of the particular importance of locality in excep- tion processing, the formal semantics of the KAIL exception processing scheme are defined in terms of a special -purpose abstract machine, SPAM* For comparison, TUTOR exception processing is also defined in terms of SPAM. Since the only objective is to clearly expose the semantics of exception handling, efficiency is ignored. 5.6.1 Definition There are four language constructs in SPAM: (1) assignment statements, (2) the KAIL selector, (3) subroutine calls, and (4) transfer statements. An assignment statement is formally defined by P x x +- y P y where x is a variable identifier, y is an expression, and P is obtained from P by systematically substituting y for all free occurrences of x in P. The KAIL selector is formally defined in Section 3.5. SPAM subroutines can be expanded in line preceded by any necessary assignment of actual parameters to formal parameters. The transfer statement is defined below. SPAM is an 8-tuple SPAM = (S, L, I, V, C, T, F, R) 93 where S is an infinite array of storage cells, Lisa sequential list of instructions, I is static information known about the program before execution begins, V is a set of variables, C is a set of constants, T is a set of temporaries, F is a set of functions, and R is a set of subroutines. S is an infinite array of storage cells, and S(i) denotes the J. L. i cell. At runtime, SPAM represents every invocation of a program block in consecutive cells as depicted in Figure 18. L is a sequential list of instructions. Execution begins with the first instruction in the list and proceeds sequentially to the end of the list. If the instruction transfer-control -to i is executed, control transfers to the i instruction in the list and execution continues. I is static information known about the program before execu- tion begins and consists of a block number and a statically enclosing block number for every program block. It also consists of Storage Exception Cell # Description # 94 13 12 11 10 9 8 7 6 5 4 3 2 1 enabled/disabled status of exception 2 address of exception 2 name of exception 2 enabled/disabled status of exception 1 address of exception 1 name of exception 1 number of exceptions pointer to previous static block pointer to previous dynamic block block number location for possible return address Figure 18. The Internal Representation of a Program Block (Each cell has sufficient storage capacity to hold the largest "piece" of information indicated in the formal definition.) 95 J.L. x_LIST(i,b): the i exception of the exception name list x declared in block b, where x is one of EX: exception declaration, EN: enable declaration, DIS: disable declaration, PEX: exceptions listed in procedure heading. Neither b nor i need appear. If b is omitted, the current block is referenced; and if i is omitted, the entire list is referenced. Visa set of variables used during the execution of SPAM and consists of BP: pointer to the base location in S of the current block. J. L. ADDR_x(j(i)): address of the handler of the i excep- tion in the list j where x is either LOC: location in L, or BNR: block number. If x is omitted, LOC and BNR are both referred to together. x(b): relative location of x in the internal representation of block b (see Figure 18), where x is one of BNR: block number, DYN: dynamic chain pointer, 96 STAT: static chain pointer, NREX: number of exceptions, EXB: base for the exceptions, NREX = EXB. As before, b may be omitted. RUNEX_LIST(i ,b,): the i exception of the runtime excep- tion list in S for block b. Either i or b or both may be omitted as before. RA_x: return address, where x is either LOC: location in L, or BNR: block number. As in ADDR, x is optional. RAISED_EX: name of raised exception. ACT_P: address of activation point. C is a set of constants and consists of ENABLED: code for enabled status. DISABLED: code for disabled status. T is a set of temporaries and consists of i,j,k: integer temporaries. list(i): i exception of temporary list. If i is omitted the entire list is referenced, flag: enabled/disabled temporary, loc: location temporary. bnr: block number temporary. 97 stat_bnr: static block number temporary. bp: block pointer temporary. F is a set of functions and consists of LENGTH(x): returns the length of the exception list x. ELEMENT(x,y) : returns true if the exception x is an element of the exception list y; and returns false otherwise. INDX_EX(x,b): index of exception x in RUNEX_LIST(b). R is a set of subroutines and consists of ALLOCATE: allocates sufficient storage from S to hold the information for the new current block (see Figure 18), sets BP to point at its first location, and sets the block number, bnr, and static block number, stat_bnr, of the new cur- rent block; COPY: subroutine C0PY(i, bnr); t copies the i exception from block bnr t j +■ EXB + 3*(LENGTH(RUNEX_LIST)); k «- EXB(bnr) + 3*(i-l); S(j+1) *■ S(k+1); S(j+2) *■ S(k+2); S(j+3) «- S(k+3); S(NREX) *■ S(NREX)+1; end COPY; 98 MERGE: subroutine MERGE Mist, flag); t merges the exceptions in list with the exceptions al- ready in the current block and sets the enabled/disabled status as specified in flag t 1*1; *[ while i < LENGTH(list) | [If ELEMENT(list(i), RUNEXJ.IST) | j <- EXB + 3*(INDX__EX(list(i))-l); | j +■ EXB + 3*LENGTH(RUNEX_LIST); S(j+1) <- list(i); S(NREX) <- S(NREX)+1; ]; S(j+2) + ADDR(list(i)); S(j+3) <- flag; i «- i+1; ]*; end MERGE; TRANSFER: subroutine TRANSFERfloc. bnr) ; t searches the static block chain releasing blocks until bnr is found and transfers control to loc t *[ if bnr = S(BNR) I transfer-control -to loc; | BP +■ S(STAT); ]*; end TRANSFER; 99 It is necessary to show the details involved to 1. Establish the representation of a program block in S, 2. Handle a raised exception, and 3. Resume processing after servicing an exception. Programs, procedures, selectors, and handlers are considered to be blocks. When control enters a block, its representation is estab- lished in S; and when control leaves a block, the storage cells for its representation are released. The following code shows how the representation of the initial block of a program is placed in S as execution begins. ALLOCATE; S(BP) <- 0; S(BNR) *■ bnr; S(DYN) «■ 0; S(STAT) *■ 0; S(NREX) *■ 0; t establish exceptions t MERGE(EN_LIST, ENABLED); MERGE(DIS_LIST, DISABLED); MERGE (EX_LIST, ENABLED); On entry into a handler block, the following actions occur. bp <- BP; ALLOCATE; S(BP) *• RA; S(BNR) <- bnr; S(DYN) <- bp; S(STAT) +■ stat_bnr; S(NREX) *■ 0; t copy exceptions from the statically enclosing block t i - 1; *[ while i < LENGTH(RUNEX_LIST(S(STAT))) | C0PY(i, S(STAT)); ]*; MERGE (EN LIST, ENABLED); 100 MERGE (DISJ.IST, DISABLED); MERGE(EX_LIST, ENABLED); Entrance into a selector block is identical to entrance into a handler block except that "S(BP) + 0" replaces "S(BP) +■ RA". On entry into a procedure, the following actions occur. bp *• BP; ALLOCATE; S(BP) <- RA; S(BNR) *■ bnr; S(DYN) *■ bp; S(STAT) <- stat_bnr; S(NREX) «- 0; t copy exceptions from the dynamically enclosing block if listed in the procedure heading t i - 1; *[ while i < LENGTH(RUNEX_LIST(S(DYN))) | [if ELEMENT(RUNEX_LIST(i), PEXJ.IST) | C0PY(i, S(DYN)); ]; i +■ i+1; ]*; MERGE(EN_LIST, ENABLED); MERGE(DIS_LIST, DISABLED); MERGE (EX_LIST, ENABLED); The following code gives the action of SPAM when an exception is raised. RA <- ACT_P; [ if ELEMENT(RAISED_EX, RUNEX_LIST) | j *■ INDX_EX(RAISED_EX); [ if S(EXB + 3*j) = ENABLED | TRANSFER(ADDR_LOC(RUNEX_LIST(j)), ADDR_BNR(RUNEX_LIST(j)) ) ; ]; 101 ]; TRANS FER(RA_L0C, RA_BNR); For resumption, there are three cases: 1. A goto , exi t , or again forces control out of a handler. 2. A resume , statically enclosed in the handler, is executed. 3. The handler terminates naturally. SPAM processes a goto , exit , or again by searching the static block chain releasing blocks until the block is found that contains the destina- tion of the goto , exit , or again . Processing then continues at the indi- cated location. For resume , the destination address is in location zero of the handler block; resume is simply a special goto where the address is unknown until execution time. RA «• S(BP); TRANSFER(RA_LOC, RA_BNR) ; When the handler terminates naturally, SPAM executes the statement goto label where label, determined before execution begins, prefixes the first state- ment in the block where the exception is declared. 5.6.2 Discussion The formalism helps clarify some possible ambiguities. (1) In a handler block, it is clear that the exceptions in effect are the excep- tions copied from the statically enclosing block plus any which may be 102 declared in the handler block itself. (2) Because the merge order is 1. EN_LIST, 2. DISJ.IST, 3. EX_LIST it is clear that exception declarations take precedence over enable/ disable declarations and that disable declarations take precedence over enable declarations. The exception name lists in these declarations should be disjoint, however, and a compile-time warning can easily be given if they are not. (3) When resume is executed, control always re- turns to the activation point. System defined exceptions are not mentioned. These are easily handled in SPAM, however, by assuming the existence of an outer block surrounding the program in which the exceptions are declared. For comparison, TUTOR exception processing is now formally defined. We first augment some of the components of SPAM. For every unit, the static information I must include the location in L of the textually next unit; we let the subroutine ALLOCATE set this variable, next loc. To the set of variables V, we add PC: SEQ_M0DE MAIN_PTR MAIN_L0C BASE_PTR BASE LOC program counter, gives the current location in L; sequence mode, lesson or help; pointer in S to the main unit; location in L of the main unit; pointer in S to the base unit; location in L of the base unit. 103 To the set of constants C, we add LESSON: code for lesson status; HELP: code for help status; NEXT: exception name for next ; NEXT1 : exception name for next! ; BACK: exception name for back ; BACK1 : exception name for backl . To the temporaries T, we add keyname: name of function key. To the subroutines R, we add SET: subroutine SET( keyname, loc, flag); t merges the exception keyname with the exceptions in the main unit and sets the enabled/disabled status as specified in flag t [ If ELEMENT( keyname, RUNEX_LIST(MAIN_PTR)) | j <: EXB(MAIN_PTR) + 3*( IN DX_EX( keyname, MAIN_PTR)-1) ; | j <- EXB(MAIN_PTR) + 3*LENGTH(RUNEX_LIST(MAIN_PTR)) ; S(j+1 ) «- keyname; S(NREX(f1AIN_PTR)) <- S(NREX(MAIN_PTR))+1 ; ]; S(j+2) «- loc; S(j+3) «- flag; end SET: 104 As execution begins, the system sets the sequence mode to LESSON, the base pointer to null , and the base location to zero. SEQ_M0DE +■ LESSON; BASE_PTR + null ; BASE_L0C +■ 0; On entry into a main unit, the following actions occur. ALLOCATE; MAIN_PTR «- BP; MAIN_L0C «- PC; i check for possible termination of a help sequence t [ if MAIN_PTR = BASE_PTR | BASE_PTR <- null ; BASE_L0C +- 0; SEQ_M0DE +■ LESSON: ]; t establish necessary pointers t S(NREX) *■ 0; SET(NEXT, NEXTJ.0C, ENABLED); [ if SEQJIODE = HELP | SET(BACK, BASE_PTR, ENABLED); SET(BACK1, BASE_PTR, ENABLED); ]; On entry into an attached unit, the following action occurs. ALLOCATE; When a function-key statement is executed, the following actions occur. Let key be the function-key name and ptr be the location pointer 105 in L to the designated unit. If the tag in the function key statement is empty, ptr = 0. [ rf ptr = I [if key = NEXT | SET (key, NEXT_L0C, ENABLED); | SET(key, ptr, DISABLED); ]; | SET(key_, £tr, ENABLED); ]; The following code defines the action of SPAM when an excep- tion is raised. (In TUTOR, RAISED_EX can only be a function-key name.) RA <- ACT_P; [ if ELEMENT(RAISED_EX, RUNEX_LIST(MAIN_PTR)) | j *- INDX_EX(RAISED_EX, MAIN_PTR); [if S(EXB(MAIN_PTR) + 3*j) =- ENABLED | [ if RAISED_EX | NEXT: NEXT1: BACK: BACK!: else : [ if SEQJ10DE = LESSON | SEQ_M0DE *- HELP; BASE_PTR +■ MAIN_PTR; BASE_L0C *- MAINJ-OC; ]; ]; 106 transfer-control -to ADDR_LOC(RUNEX_LIST( j , MAIN_PTR) ) ; ]; ]; transfer-control -to RAJLOC; When an end is encountered, the following code is executed. [ if SEQ_M0DE = HELP | j «- EXB(MAIN_PTR) + 3*( INDX_EX(NEXT, MAIN_PTR)-1 ) ; S(j+2) «- BASE_L0C; S(j+3) <- ENABLED; The formal definition shows that exception processing in TUTOR is simpler than in KAIL in several respects. In TUTOR, the control trans- fers are simpler, and there is no need to store block numbers or pointers to the previous dynamic and static blocks. Since no environment except SEQ_M0DE is inherited in TUTOR, the COPY procedure is unnecessary. Ex- cept to properly ignore a disabled exception, RA is unnecessary in TUTOR; and the resumption process is much simpler. On the other hand, the formal definition shows how TUTOR vio- lates the basic design principles (Section 2.3). On entry into a main unit, SEQ_M0DE is checked, so the behavior of a main unit depends on context and thus violates the uniformity principle. Likewise, when an end is encountered, SEQJ10DE is checked and uniformity is violated. An attached unit appears syntactically the same as a main unit, but there are essential differences in behavior as exhibited by the formalism. 107 Also, main units in help sequences behave differently from main units in lesson sequences. A violation of the separability principle occurs when an exception is raised because the setting of SEQ_M0DE is bound to function-key names. A violation also occurs on entrance into a main unit because NEXT and possibly BACK and BACK! are automatically set. All of these composite features are indirectly separable in TUTOR, but the code is awkward. The locality principle is violated when a function-key statement or an end statement is encountered and when an exception is raised. In these three cases, the main unit is referenced, but control may be in some entirely different unit. In contrast to TUTOR'S low position in the locality lattice (Figure 4), the formalism shows how the KAIL exception processing scheme is quite high. When a KAIL exception is enabled, disabled, or raised, all necessary information is local. Some of this information may have been locally established by the COPY procedure, but the restrictions on COPY are such that the KAIL exception processing scheme is still high in the locality lattice. 108 6. SEQUENCING EXPERIMENT An experiment was performed to test the psychological sound- ness of the KAIL exception processing constructs and to investigate sequencing constructs for programming CAI lessons. Two languages were compared. One language, T, incorporated a subset of the sequencing constructs in TUTOR; the other language, K, included an earlier version of the KAIL constructs for exception processing proposed above. The languages designed for the experiment emphasized differences in sequencing constructs and minimized other differences as much as pos- sible. The common features included the KAIL selector (Chapter 3), input-output statements similar to those found in TUTOR, and declarations and expressions typical of high level programming languages. This ex- periment is reported in a separate document [Embley 76] and summarized below. 6.1 The T Language A T program consists of a sequence of procedures optionally preceded by global variable declarations. T-program -*■ lesson identifier ; (variable-declaration ;) (procedure ;) end procedure -> procedure procedure-name ; ( statement ;) end 109 Procedures behave exactly like TUTOR units. Thus, in T there are main procedures, attached procedures, and base procedures. The TUTOR transfer statements j ump , goto , and dp_ are included in T. branch-statement ■* jump procedure-name -> goto procedure-name -*■ procedure-name Note that do is replaced by a procedure call denoted by the appearance of the procedure name. Since all information is global, no parameter passing is necessary. The T function-key statement is function-key-statement -> function-key-name procedure-name Lesson and help sequences are invoked and behave as in TUTOR. T modification statements are defined as modification -*■ rewrite -*■ echo -> erasure -> size expression ■* rotate expression "** long expression These commands modify input-output statements. For the write statement, rewrite , echo , and erasure set the mode, and size and rotate designate no the character size and angle of writing respectively. The long statement limits the length of an input string. Modifications are set and reset dynamically, and their effect extends beyond procedure boundaries. Hidden actions in T correspond to TUTOR hidden actions. For each main procedure, the system automatically generates a screen erase at the beginning and a pause at the end. If a lesson author wishes to prevent these defaults, an inhibit erase statement nullifies automatic screen erase; and a jump statement transfers control without pausing. 6.2 The K Language A K program is defined as K-program ■+ l esson identifier ; (construct ;) end construct ■* statement -> variable-declaration •*■ procedure -> exception-declaration procedure -* procedure procedure-name parameters ; (proc-construct ;) end proc-construct ■* statement ■*■ variable-declaration -*■ exception -declaration Procedures contain local variable and exception declarations but not local procedure declarations. Exception processing in K is as in KAIL except that variable declarations can appear anywhere in a block, and the default resumption Ill point is the following statement. Also, the scope of an exception is dynamically extended into a procedure through a parameter rather than through an exception list in a procedure heading. K modifications are defined as modifications ■+ { modification ( s modification) ) where the modifications are the same as those defined in T. K modifica- tions only affect the statement to which they are attached; thus, { size 0.75 } [ a^ 429; { size 3 } write TITLE; at 605; write sentence; ]; writes "TITLE" in size 3, but writes "sentence" in size 0.75. K contains no hidden actions. The language requires e\/ery action to be explicit. 6.3 Hypotheses There are at least three problems with T sequencing constructs. Each led to a hypothesis tested in this experiment. 1. T violates the uniformity principle. One example is the nonuniform behavior of procedures. A common error is to insert an at- tached procedure between two main procedures and then invoke it by falling through from the textual ly preceding main procedure. UNIFORMITY HYPOTHESIS: programs are easier to understand and use if written in a language where syntax and semantics are in a one-to-one relationship . 112 2. T violates the separability principle. Hidden actions resulting from composite features permeate T sequencing constructs. Some common mistakes are to forget to inhibit unwanted hidden actions and to use additional explicit statements where desired effects are already included in composite feature. SEPARABILITY HYPOTHESIS: violations of the separability principle, particularly with regard to hidden actions, hinder a programmer's ability to debug and modify programs . 3. T violates the locality principle. Exceptions are dynamic- ally enabled and associated with handlers. This allows an author to ac- tivate an exception in an attached procedure textual ly removed from the calling routine where the exception may be raised. T modifications are alterable. A common mistake is to set size , for instance, and then forget to reset it. LOCALITY HYPOTHESIS: language constructs that appear higher in the locality lattice (Figure 4) are less error prone than those that appear lower in the hierarchy . 6.4 The Experiment To test the hypotheses, subjects were asked to debug and modify a reasonably large CAI lesson consisting of about 500 lines of code. Subjects performed the task on-line while the system gathered data. A subject's ability to perform was measured in several ways: 1. Accuracy in coding, 2. Speed at finding and correcting program bugs, and 3. Facility at finding and correcting bugs and at making modifications. 113 Subjects were students in a class on computer-aided instruc- tion and learned the language concepts of both TUTOR and KAIL as part of the course. Based on background information gathered for the experiment, subjects were divided into two groups. They then learned the particulars of T and K in a one hour lecture given by the author. In an on-line session on PLATO, each subject was asked to debug and modify lesson "intrep" on integer representations. Several on-line aids were available during the experiment: 1. A complete program listing, 2. An editor, 3. A reference manual , 4. Program execution, and 5. A flow diagram of "intrep." In the debugging section, subjects had 25 minutes to find and fix as many bugs as possible. The system seeded the program with bugs one at a time and in a specific order. Thus, at any given moment, the program contained only one bug. Their task was to find it and fix it. All bugs were logic errors and typified those which might ac- tually arise in programming "intrep." Naturally, bugs were selected which related to the hypotheses, but care was taken to keep them typical and thus, hopefully fair. To help subjects identify a bug, the system directed them through a path in the execution of the lesson leading to the problem and then gave a one line explanation. When a subject pressed the term key and entered "editor," the program listing became available for editing. While editing, subjects could view again the execution of 114 "intrep" with the bug. Any changes to the code, however, were not re- flected in the execution of the program. When subjects fixed a bug (or gave up), the system presented a solution to the problem. After completing the debugging section, subjects were given an additional 25 minutes to modify a bug-free listing of "intrep" ac- cording to two specifications. Both modifications involved adding a help sequence but under different circumstances and for different rea- sons. They were not allowed to execute "intrep" as they modified it, but the system imposed no other constraints. 6.5 Results The first bug tested the locality hypothesis. A title in "intrep" was supposed to be written in size 1.5, but the statement write C. FIBONACCI REPRESENTATION; wrote it in size 1. K language subjects should have fixed this bug with { size 1.5 } write C. FIBONACCI REPRESENTATION; and the T language subjects should have fixed it with size 1.5; write C. FIBONACCI REPRESENTATION; size 1; Only 1 of 14 T subjects who attempted the problem solved it correctly, whereas 5 of 11 K subjects solved it correctly (Table 3). An exact test showed this was significant (p £ .039). This data supports the locality hypothesis because subjects committed fewer coding errors 115 when using modification constructs that are higher in the locality lattice. Table 3 Error Contingency Table for Bug 1 number of errors 1 > 2 5 4 2 1 9 4 The second bug tested the separability hypothesis. In the K listing, an unwanted erase was placed at the end of a help sequence; in the T listing, a necessary inhibit erase was deleted. Of 13 K subjects, 8 performed well; of 14 T subjects, 8 also performed we! 1 . Two T sub- jects, however, placed inhibit erase in the wrong procedure; apparently, they understood the problem but not the rules of inhibit erase . This bug was probably too simple to reveal any significant information about the separability hypothesis. The third bug considered program structure and exception hand- ling and tested hypotheses relating to uniformity and locality. Coding CAI lesson sequences with K constructs seemed to demand a level of in- direction ( goto label 1 . . . 1: call procedure p), but T sequencing (continue at procedure p) did not. Figure 19 shows Ks indirect method of going back to the previous frame; Figure 20 shows Ts direct method. 116 ttl: title; indx: index; • • • procedure title; • • • end title; procedure index; on back goto ttl ; • • • end index; Figure 19. Handling back in K procedure title; • • • end title; procedure index; back title; • • • end index; Figure 20. Handling back in T 117 The bug was seeded into the programs by deleting on back goto ttl ; from Ks listing and back title; from Ts listing. Only 1 of 10 K subjects who completed the problem solved it correctly, whereas 7 of 12 T subjects solved it correctly (Table 4). An exact test showed this was significant (p £ .026). Of the 10 K subjects, 8 entered goto procedure-name. On the average, the T subjects scored 8.1 (10 possible), and the K subjects scored 5.6 (df = 24, t = 2.69, p 5 .013). Moreover, even though the amount of code to be inserted for a correct solution was nearly identical for both languages, K subjects spent more time inserting (df = 22, t = 2.33, p < .030), perhaps because they were unsure of theirselves. More traditional sequencing in K re- sulted in poorer performance. The K language subjects seemed confused by going one place to get somewhere else. On the other hand, 2 T subjects placed " back title" in the wrong procedure. In both instances, these subjects may have expected back to remain active once it was enabled since under this assumption they essentially solved the problem correctly. This lends support to both the uniformity and the locality hypothesis because the 2 T subjects seemed to be confused about the interaction between dynamic activation of exceptions and entrance into a main procedure. The indirect frame sequencing problem in K implies that this language should be modified to allow direct jumps to frame processors. 118 Table 4 Error Contingency Table for Bug 3 number of errors 1 > 2 1 3 6 7 5 The data does not imply, however, that all subroutines should be coded as frame processors as in T. One possibility is to introduce a new construct designed as a special frame processor. The fourth bug tested the separability hypotheses. In T, the jump statement, in addition to directing control to a designated proce- dure, erased the screen. The goto only directed control elsewhere and could often be used in place of a j ump . Bug 4 was a screen overwrite problem. An erase was left out of the K listing; a goto , instead of a jump , was used in the T listing. In T, 10 of 11 subjects gave a correct solution, but only 1 replaced the goto with a jump . The rest of the T subjects, 9 of 10, gave an alternate solution: they properly inserted erase in the appro- priate procedure. Clearly, T subjects found it easier to fix the bug directly. This result lends support to the separability hypothesis be- cause the explicit solution was preferred. Underlying implicit effects were more difficult to find. 119 The fifth bug also tested the separability hypothesis. Because help sequences in T return to the beginning of the base procedure, pro- cedures had to be broken at unusual junctures; and because the screen erases automatically when entering a main procedure, normal defaults had to be overridden. Solving both these problems at once resulted in par- ticularly awkward and unusual code. Figure 21 shows the essential state- ments in a section of T code where unusual coding was necessary. The inhibit erase prevented an unwanted screen erase on entrance into the help sequence, mrhelp. The erase in mixedr2 was essential because the automatic erase which normally would have occurred in falling through from mixedrl was inhibited. Figure 22 shows the corresponding section of K code. In both instances, the bug was created by removing the erase . Of the 8 T subjects who attempted to solve this problem, 3 re- moved inhibit erase from procedure mixedrl which fixed the immediate prob- lem but introduced another bug. All 5 K subjects who attempted this prob- lem solved it correctly. K subjects scored 14.0 of 15 in a subjective measure that adjusted the score to give more credit to those who did well quickly, but T subjects scored only 9.6 (df = 11, t = 2.22, p < .049). This supports the separability hypothesis and shows how combina- tions of hidden actions interfere with one another to produce unexpected bugs. In the modifying section of the experiment, 10 K subjects made 20 errors related to the separability hypothesis, and 13 T subjects made 36 such errors. The K subjects omitted necessary pauses and seemed confused about erasing after help messages. These errors seemed to be 120 procedure mixedrad; • • • end mixedrad; procedure mixedrl ; inhibit erase ; help mrhelp; • • • at 3020; write Press NEXT to continue.; end mixedrl ; procedure mixedr2; erase ; • • • end mixedr2; Figure 21. Section of T Code procedure mixedrad; • • • on help mrhelp; • • • at 3020; wri te Press NEXT to continue.; pause next ; erase ; • • • end mixedrad; Figure 22. Section of K Code 121 based on assumptions learned from TUTOR. The T subjects referred unneces- sarily to next , back , and back! , included extra pauses, and omitted a necessary inhibit erase . All T subjects who completed the modifying section of the experiment included a help procedure with an unwanted double pause in approximately the form shown in Figure 23. One pause occurred explicitly; the other occurred implicitly at the end of the help procedure. procedure sorry; at 3010; write sorry; pause next , back , back}/, at 3010; erase 5; endhelp sorry; Figure 23. Double pause Problem in T With regard to the uniformity hypothesis, T subjects positioned 5 of a possible 26 attached procedures in sequences of main procedures so that the attached procedures would also be executed as main procedures. The results of the sequencing experiment generally support the uniformity, separability, and locality principles. They also show how experiments can reveal design flaws as in the case of indirect frame sequencing in K. 122 7. SUMMARY AND CONCLUSIONS 7.1 Summary This thesis has explored an experimental and a formal approach to language design and has applied these design methods to an investiga- tion of control constructs for interactive computing. The KAIL selector unifies selection and iteration and handles CAI answer judging and structured flow in interactive dialogues. KAILs exception processing scheme is static and handles dialogue interruption. These control constructs were tested in two experiments con- ducted on-line in a CAI environment; the results indicate that the KAIL constructs are likely to be psychologically sound. In the selector experiment, subjects attempted to understand and answer questions about two short programs. For one group of subjects, these programs were written in S, a selector language; and for the other group, they were written in A, an ALGOL-like language. In the sequencing experiment, subjects debugged and modified a substantial CAI lesson about 500 lines in length. One version, T, was written in a TUTOR-like language, and the other version, K, was written in a KAIL-like language. These control constructs were also examined through a formal definition of their semantics, and their properties were clearly exposed. An axiomatic approach was applied to the KAIL selector, and the KAIL exception processing scheme was defined in terms of a behavioral model. 123 The TUTOR exception processing scheme was also formally defined and com- pared with the KAIL scheme. As a result of the investigation of experimental and formal approaches to language design, three basic design principles evolved: 1. Uniformity, 2. Separability, and 3. Locality. The uniformity principle suggests that languages ought to be designed with a one-to-one relationship between syntax and semantics. The separability principle suggests that special purpose composite structures may be harmful. The locality principle suggests that language features should be as high on the locality lattice as possible; that is, they should be as permanent and local as possible. These three principles are proposed as a possible basis for an informal approach to language design. 7.2 Conclusions 7.2.1 The Experimental Approach to Language Design The results of the selector experiment support the hypothesis that programmers understand the KAIL selector more easily than an equiv- alent set of traditional constructs. S language subjects answered more questions correctly than A language subjects, and also thought they initially better understood the S-favored program. Statistics on the number of questions initially answered correctly, average time taken to 124 obtain a correct answer, and initial and final self-evaluations are all in the direction of the S language. No performance statistics favor the A language. In the sequencing experiment, the results are mixed, but gen- erally support all three hypotheses. T subjects introduced errors when they improperly inserted attached procedures in sequences of main pro- cedures. This observation lends support to the uniformity hypothesis. Several observations support the separability hypothesis. All T subjects found the double pause problem particularly difficult to avoid, and sub- jects omitted necessary pauses, referred unnecessarily to next , back , and back! , preferred to fix problems directly rather than through hidden actions, and seemed confused by the inhibit erase/erase phenomenon. The experiment also lends support to the locality hypothesis because T sub- jects forgot to reset size . This approach to language design can produce scientific evi- dence to support claims about language features and design issues as shown in both experiments. It can also reveal design flaws as exhibited by the indirect sequencing problem in K. Through empirical tests, de- signers can gain assurance that their language features are psychologic- ally sound. 7.2.2 Experiments in the PLATO Environment The selector experiment made use of the following six advan- tages of the on-line PLATO environment: (1) controlled teaching environ- ment, (2) interactive dialogue, (3) individualized sequencing, (4) pre- cise data, (5) precise timing constraints, and (6) assistance in grading. 125 Subjects were taught on-line; they had had no previous experience with their assigned language. The system monitored each subject individually, gave appropriate assistance, imposed timing constraints, and gathered precise timing data. Moreover, the system graded each subject's response and gathered statistics during the question-answer part of the experi- ment; it also logged all necessary keypresses to permit redundant checks on performance measures. Two of the three disadvantages of running on- line experiments were also encountered: (1) system failures, and (2) unfamiliarity with the system. Partial data from two subjects was lost because of a buffer overflow in the monitoring program. During one ses- sion, the system crashed and data from eleven subjects was lost. In another session, two subjects encountered hardward screen-overwrite prob- lems with their terminals causing some of their data to be declared in- valid. 1 Few subjects encountered problems due to unfamiliarity with the system because the experiment had been designed to teach subjects about the use of PLATO and to familiarize them with the question-answer dia- logue before gathering performance statistics. As it turned out, a few subjects did enter "correct" responses that were judged incorrectly by the system. Typing in "_" (underline) for "-" (minus) was the most pre- valent of these few errors. In these cases, the raw data was updated to reflect the appropriate system logging as if these answers had been anticipated. The experiment could have been improved if less data had been lost due to system failures and if the subjects had been a little more familiar with PLATO. A further improvement in the experiment would have 126 been to gather more data during the instructional session and to have used this data in evaluating a subject's performance. Although precise timing data was gathered on how long each subject took to answer a ques- tion, it revealed little information about how well subjects understood the constructs. Perhaps if subjects had not been allowed to give up until after a much longer time, timing data might have been significant. The sequencing experiment took advantage of the interactive PLATO environment by making extensive use of editing and execution capa- bilities. Through interactive dialogue and individualized sequencing, the system allowed each subject to proceed at his own pace in debugging and modifying "intrep." The system recorded every keypress and supplied precise data on elapsed time between all significant actions. Although many observations led to conclusions that support the hypotheses, there were only a few statistically significant results. One possible reason is that cost prohibited the experiment from making full use of PLATO. Subjects were taught off-line. This factor was, therefore, harder to control for, and may have accounted in part for subjects committing more language interference errors than had been ex- pected. Subjects were not allowed to observe the results of their changes when executing "intrep." If they would have been able to recompile the program as desired, they would have been able to reconsider their fixes and modifications in light of whether or not they actually solved the problem. It is likely that more obscure constructs would have caused subjects to spend more time and do more work to solve the problems. As it was, however, the precise timing and editing data gathered by the system revealed few significant results. 127 In general, the experiments profitably took advantage of the PLATO environment. Time and space limitations, however, prevented full exploitation of potential advantages. 7.2.3 The Formal Approach to Language Design An application of the axiomatic formalism to the KAIL selector helped clarify and concisely specify several general observations. The KAIL if and case are semantical ly identical* and all is complex compared to the other control types. The formalism also revealed similarities and differences among rf, wh i 1 e , and until and led to an investigation of another possible control type. The semantics of both KAIL and TUTOR exception handling were defined in terms of a special purpose abstract machine. This behavioral definition shows that the KAIL features are nearer the top of the locality lattice than the TUTOR features. Moreover, the formalism shows how the TUTOR features violate the uniformity and separability principles. This approach to language design can objectively reveal proper- ties of language features as shown in the formal definition of the pro- posed KAIL control constructs. It can also objectively expose weaknesses and inconsistencies and provide insight into why some language features are better than others. 7.3 Suggestions for Further Research 1. The results indicate that further research in experimental and formal language design is likely to be fruitful. These methods can 128 be applied to obtain scientific evidence to support claims about lan- guage features and design issues in general. 2. It would be particularly valuable to apply these methods to obtain further support for the three basic design principles. These could then be confidently used as a partial basis for reasoning about and designing programming language features in general. 3. In order to administer the experiments reported in this thesis, automated methodologies for experiments in computer programming were developed. These should be refined. Because more and better data can be gathered and because computer resources can be exploited, on- line methods are likely to be useful for experiments in computer pro- gramming. 129 LIST OF REFERENCES 1. Alpert, D. and D. L. Bitzer, "Advances in Computer-based Education," Science , Vol. 167, No. 3924, March 1970, pp. 1582-1590. 2. Boehm, C. and G. Jacopini, "Flow Diagrams, Turing Machines and Lan- guages with Only Two Formation Rules," Communications of the ACM , Vol. 9, No. 5, May 1966, pp. 366-371. 3. CBEMA, "Draft Proposed American National Standard Programming Language PL/1," Basis/1-12, BSR X3.53, X3 Secretariat: Computer and Business Equipment Manifactures Association, February, 1975. 4. Dijkstra, E. W., "EWD316: A Short Introduction to the Art of Pro- gramming," Technical University, Eindhoven, The Netherlands, August, 1971. 5. Embley, D. W., "An Experiment on a Unified Control Construct," Techni- cal Report UIUCDCS-R-75-759, University of Illinois, August, 1975b. 6. Embley, D. W., "An Experiment on CAI Sequencing Constructs," Technical Report UIUCDCS-R- 76-771, University of Illinois, February, 1976. 7. Embley, D. W. , "An Introduction to KAIL," Lesson kaids on the PLATO System, University of Illinois, August, 1975a. 8. Embley, D. W. and W. J. Hansen, "The KAIL Selector - A Unified Control Construct," SIGPLAN Notices , Vol. 11, No. 1, January, 1976, pp. 22-29. 9. Floyd, R. W., "Assigning Meanings to Programs," Proceedings of a Sym- posium in Applied Mathematics of the American Mathematical Society, Vol. 19, April, 1966, pp. 19-31. 10. Gannon, J. D. , "Language Design to Enhance Programming Reliability," Technical Report CSRG-47, University of .Toronto, January, 1975. 11. Goodenough, J. B., "Structured Exception Handling," Conference Record of the Second ACM Symposium on Principles of Programming Languages, January, 1975, pp. 204-224. 12. Hoare, C. A. R., "An Axiomatic Approach to Computer Programming," Communications of the ACM, Vol. 12, No. 10, October, 1969, pp. 576- 580, 583. 13. Hoare, C. A. R. , "Hints on Programming Language Design," Technical Re- port, STAN-CS-73-403, Department of Computer Science, Stanford Univer- sity, December, 1973. 130 14. Hoare, C. A. R. and N. Wirth, "An Axiomatic Definition of the Program- ming Language PASCAL," Acta Informatica, Vol. 2, No. 4, 1973, pp. 335-355. 15. Horning, J., Department of Computer Science Colloquium, University of Illinois, April 17, 1975. 16. Johnston, H. B., "The Contour Model of Block Structured Processes," SIGPLAN Notices , Vol. 6, No. 2, February, 1971, pp. 55-82. 17. Knuth, D. E., "Structured Programming with Goto Statements," ACM Computing Surveys , Vol. 6, No. 4, December, 1974, pp. 261-301. 18. McCarthy, J., "Towards a Mathematical Science of Computation," Pro- ceedings of IFIP Congress 62, August, 1962, pp. 21-28. 19. McKeeman, W. M., "Programming Language Design," in Lecture Notes in Computer Science, Vol. 21, Compiler Construction - an Advanced Course , Springer-Verlag, 1974, pp. 514-524. 20. Newman, W. M., "An Experimental Display Programming Language for the PDP-10 Computer," Technical Report UTEC-CSc-70-104, University of Utah, July, 1970. 21. Sherwood, B. A., "The TUTOR Language," Computer-based Education Re- search Laboratory and Department of Physics, University of Illinois, June, 1974. 22. Shneiderman, B., "Experimental Testing in Programming Languages, Sty- listic Considerations and Design Techniques," Technical Report No. 16, Indiana University, October, 1974. 23. Shneiderman, B. and M. Ho, "Two Experiments in Programming Behavior," Technical Report No. 17, Indiana University, October, 1974. 24. Sime, M. E., T. R. G. Green, and D. J. Guest, "Psychological Evaluation of Two Conditional Constructions Used in Computer Languages," Interna - tional Journal of Man-Machine Studies , Vol. 5, No. 1, 1973, pp. 105- 113. 25. van Wigngaarden, A. (Editor), "Report on the Algorithmic Language ALG0L68," Numerische Mathematik , Vol. 14, No. 2, 1969, pp. 79-218. 26. Weinberg, G. M., The Psychology of Computer Programming , Van Nostrand Reinhold Company, 1971. 27. Weissman, L. M., "A Methodology for Studying the Psychological Com- plexity of Computer Programs," Technical Report CSRG-37, University of Toronto, August, 1974. 28. Wirth, N., Department of Computer Science Colloquium, University of Illinois, April 7, 1975. 131 VITA David Wayne Embley was born in Salt Lake City, Utah, on October 30, 1946. He graduated from the University of Utah with a Bachelor of Arts degree in mathematics in June 1970. At that time he was elected to Phi Kappa Phi. In December of 1971, he received a Master of Science degree in computer science also from the University of Utah. Before coming to the University of Illinois, he worked as a programmer and consultant while serving in the U.S. Air Force. Since August 1973, he has been a research assistant in the Department of Computer Science at the University of Illinois. 131 VITA David Wayne Embley was born in Salt Lake City, Utah, on October 30, 1946. He graduated from the University of Utah with a Bachelor of Arts degree in mathematics in June 1970. At that time he was elected to Phi Kappa Phi. In December of 1971, he received a Master of Science degree in computer science also from the University of Utah. Before coming to the University of Illinois, he worked as a programmer and consultant while serving in the U.S. Air Force. Since August 1973, he has been a research assistant in the Department of Computer Science at the University of Illinois. BIBLIOGRAPHIC DATA SHEET 1. Report No. UIUCDCS-R-76-811 3. Recipient's Accession No. 5- Report Date 4. Title and Subtitle EXPERIMENTAL AND FORMAL LANGUAGE DESIGN APPLIED TO CONTROL CONSTRUCTS FOR INTERACTIVE COMPUTING July 1976 6. 7. Author(s) David Wayne Embley 8. Performing Organization Rept. UIUCDCS-R-76-811 No. 9. Performing Organization Name and Address Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 10. Project/Task/Work Unit No. 11. Contract/Grant No. US-NSF-EC-41511 12. Sponsoring Organization Name and Address National Science Foundation Washington, D.C. 13. Type of Report & Period Covered Technical Report u. 15. Supplementary Notes 16. Abstracts This thesis explores two objective approaches to language design. An experi- mental approach to language design recognizes the human element in programming and attempts to achieve an optimal design by an empirical investigation of language constructs and design principles. A formal approach to language design recognizes the theoretical foundations of programming languages and attempts to achieve an optimal design by a specification of the properties of language constructs in order to expose weaknesses, inconsistencies, and design flaws. These approaches to language design are applied to an exploration of control constructs for interactive computing, particularly in computer-aided instruction (CAI). A control construct is examined that handles CAI answer judging and unifies selection and iteration. A static exception processing scheme is also examined. Two experiments tested these constructs and at the same time illustrated a methodology for conducting experiments on-line in a CAI environment. In the investigation, several design principles evolved. These are proposed as a partial basis for reasoning about language features in general. 17. Key Words and Document Analysis. 17a. Descriptors Programming Languages Programming Language Design Psychological Experiments in Programming Formal Semantics Computer-Aided Instruction 17b. Identifiers/Open-Ended Terms 17c. COSATI Field/Group 18. Availability Statement Release Unlimited 19. Security Class (This Report) UNCLASSIFIED 20. Security Class (This Page UNCLASSIFIED 21. No. of Pages 22. Price FORM NTIS-35 (10-70) USCOMM-DC 40329-P7 1 ^ .♦ sfr AUG 2 9 1980