UIUCDCS-R-72-549

PARALLEL TECHNIQUES

by

Harold Robert George Trout

October, 1972

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois

PARALLEL TECHNIQUES
Harold Robert George Trout, Ph.D.
Department of Computer Science
University of Illinois at Urbana-Champaign, 1972

This thesis develops several techniques for creating the parallel equivalent of a sequential algorithm. It is shown that any function which is linear in its memory variables can be coded in a parallel fashion. Furthermore, using a bit by bit strategy, parallel equivalents are developed for sequential algorithms in which the iterative function is subject only to a Lipschitz condition. The thesis considers the speed of convergence and the computational expense of these algorithms, and develops in detail the application of several of the algorithms to practical examples.

TABLE OF CONTENTS

INTRODUCTION 1
CHAPTER 1  DEFINITIONS 3
CHAPTER 2  LINEAR SEQUENTIAL ALGORITHMS AND EXTENDED ASSOCIATIVITY 8
CHAPTER 3  APPLICATIONS OF LINEAR SEQUENTIAL ALGORITHMS 14
CHAPTER 4  LIMITATIONS OF THE LINEAR SEQUENTIAL METHOD 20
CHAPTER 5  BIT BY BIT TECHNIQUES: RESIDUE METHOD 25
CHAPTER 6  ANOTHER BIT BY BIT TECHNIQUE: THE LIPSCHITZ METHOD 39
CHAPTER 7  LIMITATIONS TO THE LIPSCHITZ METHOD 48
REFERENCES 56
VITA 57

INTRODUCTION

Two divergent trends are distinguishable in the architecture of computing machines.
On the one hand, considerable increase in speed and reliability has been achieved by the duplication and parallel operation of many components within these machines. On the other hand, more and more machines have been built with specialized instruction sets for certain applications. Any marriage of these two broad strategies has been on an ad hoc basis, since in general very little has been known about the parallel equivalents of given sequential functions. If techniques for creating the parallel equivalent of a sequential process were known, it would be possible to connect many LSI circuits together to achieve a desired speed improvement. The tacit assumption has been that although certain techniques may work for certain algorithms, unified techniques, except of the most trivial kind, are unlikely to exist.

This paper unifies some of the parallel equivalents of linear sequential algorithms and demonstrates some further areas in which they can be used with success. In addition, it demonstrates parallel techniques of wide application to algorithms in which each step of the sequential process is subject to a Lipschitz condition. The strategy in finding a parallel equivalent to these non-linear systems is to evaluate the partial results from each stage in a bit by bit fashion. Thus the number of steps in evaluating the whole function is of the order of the number of bits of the final answer, rather than the length of the sequence. The machine parallelism implicit in these algorithms is many identical units operating simultaneously.

CHAPTER 1
DEFINITIONS

1.1 A SEQUENTIAL ALGORITHM is an algorithm in which the execution of the i-th step necessarily depends upon some values generated by the (i-1)-th step (with, of course, the zero-th or initial value(s) given).

1.2 THE ITERATIVE FUNCTION of such a sequential algorithm is the rule for obtaining the i-th value(s) from the (i-1)-th. Suppose f is pictured as a box
accepting inputs x_{i-1}, y_{i-1}, ..., z_{i-1} and a_i, b_i, ..., c_i, and generating outputs x_i, y_i, ..., z_i. The sequential algorithm may then be represented as a chain of n such boxes, which represents n repetitions of f on n sets of inputs.

The usual implementation of such a sequential algorithm would employ some memory variables X, Y, ..., Z and might be written as follows:

    X := x_0; Y := y_0; ...; Z := z_0;
    for i := 1 step 1 until n do
    begin
        X, Y, ..., Z := f(X, Y, ..., Z, A_i, B_i, ..., C_i)
    end;

(f returns one value for each memory variable in this example.) This algorithm will be written

    f/x_0, y_0, ..., z_0, A; B; ...; C

in the style of APL (reference 1), and is read "f reducing x_0, y_0, ..., z_0, A; B; ...; C", where A = a_1, a_2, ..., a_n, B = b_1, b_2, ..., b_n, etc.

Not infrequently the partial results are of as much interest as the final answer. The partial results are the sequence of values X, Y, ..., Z at i = 1; X, Y, ..., Z at i = 2; and so on. In some sense these are a sequence of snapshots of the sequential algorithm taken for each value of i. Continuing the APL-like notation,

    f\x_0, y_0, ..., z_0, A; B; ...; C

will be used to represent these partial results.

The notation throughout this thesis will bear some resemblance to APL (op. cit.) with the following exceptions and extensions:

1) A function call f with parameters a, b, c, d will be written f[a, b, c, d], or (a, b) f (c, d), or a f (b, c, d).
2) × will be used to represent scalar multiplication and often matrix multiplication.
3) The j-th result of a function f which returns several results is written f(...)[j].
4) A vector will generally be written with an underline: a or A.
5) The reverse of a vector a will be written ⌽a.
6) In f/z, A, the function f is permitted to have more than two parameters, as described above.
7) If A is an array, then A[1,2] = A[1], A[2]; so A[B] = A[B[1]], A[B[2]], ..., A[B[n]] when B is one-dimensional, and will be the array X, where X_ij = A[B_ij], when B is two-dimensional.
8) ρA = length of A.
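For readers more used to functional notation, f/z, A is a left fold and f\z, A its running partial results. A rough Python analogue (the particular function f here is an arbitrary illustration, not one drawn from the thesis):

```python
from functools import reduce
from itertools import accumulate

def f(x, a):
    return 2 * x + a          # an arbitrary iterative function of one memory variable

z, A = 1, [3, 5, 7]
final    = reduce(f, A, z)                        # f/z, A
partials = list(accumulate(A, f, initial=z))[1:]  # f\z, A (drop the initial value)

print(final, partials)   # 37 [5, 15, 37]
```

The whole thesis is concerned with when this left-to-right fold can be regrouped and evaluated in parallel.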
1.3 THE MEMORY VARIABLES of a sequential algorithm are the results (X, Y, ..., Z in the example above) which are remembered from step to step of the sequential algorithm. The function is specifically prohibited from having any memory and is thus a pure procedure (i.e., one yielding fixed results for fixed inputs).

1.4 THE EXTRANEOUS VARIABLES of a sequential algorithm are the non-memory inputs to f at each step (a_i, b_i, ..., c_i in the example above).

1.5 A BOUNDED SEQUENTIAL ALGORITHM is a sequential algorithm in which the number of memory variables is fixed (i.e., does not increase as later steps of the algorithm are computed). Since all sequential algorithms in this thesis will be bounded, the term will generally be dropped. The restriction to bounded algorithms excludes parallel equivalents of a sequential algorithm in which the calculations are deferred by remembering all relevant data until the last step, when one giant computation is performed to compute the final result. This restriction also eliminates from the discussion algorithms such as Gaussian elimination.

1.6 A LINEAR SEQUENTIAL ALGORITHM is a sequential algorithm in which the iterative function is linear in the memory variables. Thus, if

    f(x, y, a, b) = (a^2 x + y), (bx + y)

then f/x_0, y_0, A; B is a linear sequential algorithm, since f is linear in x and y.

1.7 DISCUSSION. Intuitively it would be expected that the evaluation of f/a_1, a_2, ..., a_n would take n sequential steps. Although ad hoc methods applicable to particular iterative functions f may serve to reduce the number of steps, it would be implausible to expect this to be generally possible. However, it is actually possible under quite wide circumstances to recast the calculation in a form that permits more parallelism to be employed. Thus, given sufficient hardware, the final result can be generated in something less than n times the time for one iteration.
The price of such an increase in parallelism is often a significant increase in the total amount of computation that is performed.

1.8 Several cases can be handled summarily. For example, if there are no memory variables, then there is no reason to evaluate the algorithm sequentially. Also, if there are no extraneous inputs to the sequence, the results are entirely dependent on the initial values and frequently admit of a closed algebraic solution. In cases where this closed solution is still difficult to compute, the techniques discussed here may still yield an improvement in computation time. Finally, there is the special case of f an associative function of one memory variable and one extraneous variable. An extended concept of associativity will be developed in the next chapter, and it will be shown that all linear sequential algorithms can be cast in this associative form. The condition of associativity grants the freedom to order the calculation in any manner whatsoever, so that parallel equivalents of the sequential algorithm can readily be found.

Chapter 3 is a discussion of some applications of linear sequential algorithms to problems of linear algebra.

Chapter 4 is a discussion of algorithms similar to linear algorithms and their inherent limitations.

Chapter 5 is a discussion of the residue method for the bit by bit evaluation of polynomials and related sequential algorithms.

Chapter 6 is a discussion of an alternate bit by bit technique, called the Lipschitz method, that has proven to be of quite wide application. The most stringent requirement of this strategy is that the function obey the Lipschitz condition with respect to each memory variable.

Chapter 7 is a discussion of various practical constraints on the Lipschitz method and a number of techniques for circumventing these restrictions.
CHAPTER 2
LINEAR SEQUENTIAL ALGORITHMS AND EXTENDED ASSOCIATIVITY

2.1 In Section 1.8 it was observed that an associative function reducing some vector can easily be recast in a parallel fashion. The classical definition of associativity is, however, more stringent than is required to be able to rearrange the computation in a more parallel fashion. The key property of associativity might be expressed as follows: if a and b are any two vectors, then + is associative means

    +/(a, b) = (+/a) + (+/b)

Accordingly, we make the following definition of weak associativity: a function f is said to have an ASSOCIATE f~ if there is some identity element(s) I with the property that, for any vectors A, B and initial value z,

    f/z, C = (f/z, A) f~ (f/I, B)    where C = A, B, the concatenation of A and B.

Any function f with an associate f~ is said to be WEAKLY ASSOCIATIVE.

The identity element(s) deserve some careful attention. For binary functions there is no identity element, since the number of memory variables is one and the first element of the vector is generally taken as the initial value of this memory. However, when there is more than one memory variable it is necessary to specify these to get the algorithm started. The identity elements are chosen so that they also satisfy the above requirements.

2.2 Patently, any binary function p that is associative is also weakly associative, with p~ = p. (The identity is null for binary functions.) Further, it is easy to see that for any function the associate is its own associate: f~~ = f~.

2.3 EXTENDED ASSOCIATIVITY. Frequently a function f which is not associative per se has an extended function F which is associative. This means that, in addition to the function of f, the extended function F computes some additional memory variables which enable an associate F~ to recover from some partition of the calculation. For example, if

    f(a, b, c) = (a × b) + c

then

    F(a, d, b, c) = ((a × b) + c), (d × b)

and clearly

    (F/z, 1, B; C)[1] = f/z, B; C

so that F is indeed an extension of f.
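The claims of Section 2.3 are easy to check numerically. A quick sketch (the values are illustrative only): the extension carries the running product of the b's alongside the ordinary result, and the associate recombines two half-reductions, the second of which starts from the identity pair 0, 1.

```python
# f(a,b,c) = a*b + c, extended as F with a second memory variable that
# accumulates the product of the b's: F(a,d,b,c) = (a*b + c, d*b).
def f_reduce(z, B, C):
    x = z
    for b, c in zip(B, C):
        x = x * b + c
    return x

def F_reduce(z, y, B, C):
    x, p = z, y
    for b, c in zip(B, C):
        x, p = x * b + c, p * b
    return x, p

def F_assoc(left, right):
    # the associate F~: combine two partial reductions
    (x1, p1), (x2, p2) = left, right
    return x1 * p2 + x2, p1 * p2

z, B, C = 7, [2, 3, 5, 4], [1, 6, 2, 9]
whole  = F_reduce(z, 1, B, C)
halves = F_assoc(F_reduce(z, 1, B[:2], C[:2]),
                 F_reduce(0, 1, B[2:], C[2:]))  # right half starts from the identity 0, 1
assert whole == halves
assert whole[0] == f_reduce(z, B, C)
print(whole)   # (1037, 120)
```

Because F_assoc is associative, the two halves (and, recursively, quarters, eighths, ...) can be evaluated on separate processors and merged in a logarithmic number of combining steps.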
Furthermore, F is an associative function with associate

    F~(a, b, c, d) = ((a × d) + c), (b × d)

and

    (F/z, y, B; C) F~ (F/0, 1, D; E) = F/z, y, (B, D); (C, E)

can be demonstrated in a straightforward fashion. It is not difficult to replace the operators +, × with more general functions, provided the following properties are retained:

i) that × distribute over +;
ii) that × have an associate ×~;
iii) that + be its own associate.

Also, the usage of f makes it apparent that any functions of b, c could be substituted for b, c, since they are used only once in the calculation. Another way of looking at this is as follows: if B and C are preprocessed element by element, using g and h, to give g[B] and h[C], then f/z, g[B]; h[C] is equivalent to k/z, B; C where k(a, b, c) = f(a, g[b], h[c]); i.e., replacing b or c with functions of b and c is equivalent to preprocessing the extraneous input. Thus we have the general associative theorem.

2.4 THE ASSOCIATIVE EXTENSION THEOREM. If

1) × is a function with associate ×~,
2) × distributes over another function +,
3) + is its own associate,
4) g, h are any functions,

then the function

    f(a, b, c) = (a × g[b]) + h[c]

has an associative extension

    F(a, b, c, d) = ((a × g[c]) + h[d]), (b × g[c])

which, in turn, has an associate

    F~(a, b, c, d) = ((a × d) + c), (b ×~ d)

with identity I = 0, 1, the identities of + and × respectively.

Proof. As observed above, the arbitrary functions g and h can be ignored, since they amount to some preprocessing of the extraneous inputs. Also, it is clear that the theorem need only be demonstrated on input vectors of length 3, since longer cases can be proven by repeated application of the theorem.
Thus, by straightforward expansion,

    F/z, y, A, B, C, D, E, G
      = F/(f/z, A, B), (×/y, A), C, D, E, G
      = ((f/z, A, B) × C + D) × E + G, (((×/y, A) × C) × E)

using the distributive property of × over +,

      = ((f/z, A, B) × C) × E + (D × E) + G, (((×/y, A) × C) × E)

and now using the associativity of ×,

      = (f/z, A, B) × (×/C, E) + (D × E) + G, ((×/y, A) × (×/C, E))

but

    (D × E) + G = (F/0, 1, C, D, E, G)[1]
    (×/C, E)    = (F/0, 1, C, D, E, G)[2]

so that

    F/z, y, A, B, C, D, E, G = F~[(F/z, y, A, B), (F/0, 1, C, D, E, G)]

where F~[a, b, c, d] = ((a × d) + c), (b × d), as stated.

2.5 LINEAR FUNCTIONS OF MORE THAN ONE MEMORY VARIABLE can also be extended to an associative form in a similar way. For example,

    f[x, y, a, b, c, d] = ((x × a) + (y × b) + c), ((x × b) + (y × c) + d)

Now f can be written as a matrix expression

    f[x, y, a, b, c, d] = (a b; b c) × (x, y) + (c, d)

which is, of course, of the form of Theorem 2.4, with × matrix multiplication, and so its associative extension carries along the running product of the matrices:

    F[x, y, P, a, b, c, d] = ((a b; b c) × (x, y) + (c, d)), ((a b; b c) × P)

with associate

    F~[x, y, P, X, Y, Q] = (Q × (x, y) + (X, Y)), (Q × P)

and identity element (0, 0), (1 0; 0 1). Then

    f/x, y, A; B; C; D = (F/x, y, (1 0; 0 1), A; B; C; D)[1, 2]

and

    (F/x, y, (1 0; 0 1), A; B; C; D) F~ (F/0, 0, (1 0; 0 1), L; M; N; P)
      = F/x, y, (1 0; 0 1), (A, L); (B, M); (C, N); (D, P).

2.6 QUASI-LINEAR FUNCTIONS. There is one final class of functions which can be computed in parallel using the linear strategy:

    f[x, a, b, c, d] = (ax + b) ÷ (cx + d)

Now, by splitting x into a numerator and a denominator (x = t ÷ p), it is possible to write

    g[t, p, a, b, c, d] = (at + bp), (ct + dp)

and then

    f[t ÷ p, a, b, c, d] = ÷/g[t, p, a, b, c, d]

so that, using the result of Section 2.5, the two components of x may be kept separate until the end of the algorithm, when the final result can be obtained by a single division.
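The quasi-linear device of Section 2.6 can be sketched as follows: each step becomes a 2x2 matrix acting on the pair (t, p), the matrix products are associative (and so could be grouped as a tree across processors), and one division finishes the job. A minimal sketch with illustrative coefficients:

```python
from fractions import Fraction

# x -> (a*x + b) / (c*x + d) carried as the pair (t, p) with x = t/p, so
# each step is the linear map (t, p) -> (a*t + b*p, c*t + d*p): a 2x2 matrix.
def step(M, abcd):
    a, b, c, d = abcd
    # compose the new step's matrix with the accumulated linear map M
    return [[a*M[0][0] + b*M[1][0], a*M[0][1] + b*M[1][1]],
            [c*M[0][0] + d*M[1][0], c*M[0][1] + d*M[1][1]]]

def quasi_linear(x0, steps):
    M = [[1, 0], [0, 1]]              # identity
    for s in steps:                   # associative: any grouping works
        M = step(M, s)
    t = M[0][0]*x0 + M[0][1]          # x0 = t/p with p = 1
    p = M[1][0]*x0 + M[1][1]
    return Fraction(t, p)             # the single final division

steps = [(1, 2, 3, 4), (2, 0, 1, 5), (3, 1, 1, 1)]
x = Fraction(2)
for a, b, c, d in steps:              # straightforward sequential evaluation
    x = Fraction(a*x + b, c*x + d)
assert quasi_linear(2, steps) == x
print(x)   # 39/31
```

The fourfold increase in work mentioned later (Section 3.4) is visible here: each step costs a full 2x2 matrix product instead of one multiply-add and one division.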
The same device can be used with functions of several memory variables; for example,

    f[x, y, a, b, c, d, e, g, h, j, k] = (ax + by + c) ÷ (dx + ey + g), (hx + jy + k) ÷ (dx + ey + g)

Assuming x = t ÷ p and y = u ÷ p, then after applying f the updated memory values are

    (at + bu + cp) ÷ (dt + eu + gp), (ht + ju + kp) ÷ (dt + eu + gp)

so that the function

    g[t, u, p, a, b, c, d, e, g, h, j, k] = (at + bu + cp), (ht + ju + kp), (dt + eu + gp)

can be used, and the answer that f would have generated can be obtained by appropriate division: if

    (g_1, g_2, g_3) = g/x, y, 1, A; B; C; D; E; G; H; J; K

then

    f/x, y, A; B; C; D; E; G; H; J; K = (g_1 ÷ g_3), (g_2 ÷ g_3)

and g has an associative extension (Section 2.5).

CHAPTER 3
APPLICATIONS OF LINEAR SEQUENTIAL ALGORITHMS

3.1 The most obvious application of linear sequential algorithms and their parallel equivalents is to linear sequential circuits (LSC). An LSC is defined by its state vector s(t) at time t, its inputs u(t), and characterizing matrices A, B, C, D. The subsequent state is defined by

    s(t+1) = (A × s(t)) + (B × u(t))

and the output by

    y(t) = (C × s(t)) + (D × u(t))

By judicious application of Theorem 2.4 it is possible to define an LSC which will accept k inputs and generate the corresponding k outputs (the so-called k-channel analogue of the LSC). Let

    f[s, u] = (A × s) + (B × u)

Then f\s(0), u(0), u(1), ..., u(k-1) is the sequence of states through which the LSC passes, and y can easily be calculated from these. Now f has the associative extension

    F[s, P, u] = ((A × s) + (B × u)), (A × P)

with associate

    F~[s, P, s', P'] = ((P' × s) + s'), (P' × P)

hence

    F/s(0), I, u_0, ..., u_{k-1} = (F/s(0), I) F~ (F/0, I, u_0, ..., u_{k-1})

Now, evaluating F/0, I, u_0, ..., u_{k-1} yields

    (A^{k-1} B, A^{k-2} B, ..., AB, B) × (u_0, ..., u_{k-1})    and    A^k

thus

    s_k = f/s(0), u_0, ..., u_{k-1} = (A^k × s(0)) + (B' × (u_0, ..., u_{k-1}))

where B' = (A^{k-1} B, A^{k-2} B, ..., AB, B); the k-channel matrices C' and D' follow trivially. This result was also obtained by Gill [2] in 1966.
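The k-channel construction above can be sketched by carrying each LSC step as an affine pair (M, v), meaning s -> M×s + v; composing two such pairs is associative, so the k steps collapse in a tree of combining rounds. A sketch with illustrative 2x2 integer matrices (no particular circuit is intended):

```python
# Each step s -> A*s + B*u(t) is carried as the pair (M, v): s -> M*s + v.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]

def compose(left, right):
    # apply `left` first, then `right`:  s -> M2*(M1*s + v1) + v2
    (M1, v1), (M2, v2) = left, right
    return matmul(M2, M1), [a + b for a, b in zip(matvec(M2, v1), v2)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [0, 2]]
inputs = [[1, 0], [0, 1], [2, 3], [1, 1]]

steps = [(A, matvec(B, u)) for u in inputs]
while len(steps) > 1:                      # tree combine: O(log k) rounds
    steps = [compose(steps[i], steps[i + 1]) if i + 1 < len(steps) else steps[i]
             for i in range(0, len(steps), 2)]
M, v = steps[0]                            # M = A^k, v = B' applied to the inputs

s = [3, 4]                                 # sequential check
for u in inputs:
    s = [a + b for a, b in zip(matvec(A, s), matvec(B, u))]
assert s == [a + b for a, b in zip(matvec(M, [3, 4]), v)]
print(M, v, s)
```

The final pair (M, v) is exactly the k-channel characterization: M = A^k and v collects the A^{k-1-i} B u_i terms.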
3.2 CARRY LOOK AHEAD ADDER. If the bits of the operands in a binary addition are a = (a_1, ..., a_n) and b = (b_1, ..., b_n), then the carry bit at each stage, c_i, is defined by

    c_{i+1} = (a_i ∧ b_i) ∨ (a_i ∧ c_i) ∨ (b_i ∧ c_i),    c_0 = 0

Hence, if

    f[c, a, b] = (a ∧ b) ∨ ((a ∨ b) ∧ c)

then f\0, a; b generates the carries and satisfies the conditions of Theorem 2.4. Therefore, the associative extension

    F[c, p, a, b] = ((a ∧ b) ∨ ((a ∨ b) ∧ c)), ((a ∨ b) ∧ p)

with associate

    F~[c, p, c', p'] = ((c ∧ p') ∨ c'), (p ∧ p')

can be used to break the chain of n carries (n = length of a) into several shorter chains, which are then combined using F~. This process may be applied recursively to further shorten the length of the propagate chain.

The flow of data is as follows:

    c_{n+1} = F~/ (F/c_0, 1, A_1; B_1), (F/0, 1, A_2; B_2), ..., (F/0, 1, A_m; B_m)

where A_1; B_1, ..., A_m; B_m partition the operand bits into m shorter chains.

3.3 STURM SEQUENCES are used to locate the roots of polynomials. The Sturm sequence f_0, f_1, ..., f_n is defined by the following rules:

    f_0 = 1
    f_1(u) = d_1 - u
    f_i(u) = (d_i - u) f_{i-1}(u) - S_i^2 f_{i-2}(u)    i ≥ 2

for some parameter u. Hence, if

    f[d, S, f1, f2] = (((d - u) × f1) + ((-S^2) × f2)), f1

then f\f_1, f_0, D; S will generate the Sturm sequence. Hence, using the technique of Section 2.5, f can be expected to yield an associative extension. In this case, however, it is possible to realize the same result without appeal to Section 2.5: f can be written in the form

    f[f1, f2, X] = X × (f1, f2)

where X is the sequence of 2 × 2 matrices

    X_i = (d_i - u   -S_i^2)
          (   1         0  )

Then f\f_1, f_0, X generates the sequence, and f has an associative extension trivially, since matrix multiplication is associative.

3.4 SOLUTION OF BAND MATRICES. Although the general back-substitution process is outside the scope of this thesis, the closely related question of the solution of band matrices is subject to the technique discussed. This is by virtue of the fact that the number of memory variables is limited by the width of the band.
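Before developing the band-matrix case, it is worth sketching the carry chain of Section 3.2 concretely: the pair carried along is the familiar (generate, propagate), and the combining operator is exactly the associate F~ above. A small sketch with 4-bit operands (the combine is shown as a sequential sweep, but any grouping, and hence a tree, gives the same carries):

```python
# Carries as a prefix computation over (generate, propagate) pairs,
# g_i = a_i & b_i and p_i = a_i | b_i, combined with the thesis's F~:
# left block (low-order bits) first, right block second.
def combine(left, right):
    g1, p1 = left
    g2, p2 = right
    return g2 | (p2 & g1), p2 & p1

def carries(a_bits, b_bits, c0=0):
    """Return c_1 .. c_n given operand bits a_1 .. a_n (low order first)."""
    pairs = [(ai & bi, ai | bi) for ai, bi in zip(a_bits, b_bits)]
    out, acc = [], (c0, 1)
    for pr in pairs:            # associativity is what permits a tree instead
        acc = combine(acc, pr)
        out.append(acc[0])
    return out

a, b = 0b1011, 0b0110           # 11 + 6 = 17
abits = [(a >> i) & 1 for i in range(4)]
bbits = [(b >> i) & 1 for i in range(4)]
cs = carries(abits, bbits)
sum_bits = [ai ^ bi ^ ci for ai, bi, ci in zip(abits, bbits, [0] + cs[:-1])]
value = sum(s << i for i, s in enumerate(sum_bits)) + (cs[-1] << 4)
assert value == a + b
print(cs, value)
```

This is the same structure found in modern carry-lookahead and prefix adders: the (g, p) pairs of adjacent blocks are merged in a logarithmic-depth tree.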
For concreteness, the case of a tri-diagonal matrix will be developed. Let

    A = ( d_1  u_1                         )
        ( l_2  d_2  u_2          (0)       )
        (      l_3  d_3  ...               )
        (  (0)      ...  ...     u_{n-1}   )
        (                l_n     d_n       )

and let the equation to be solved be A x = b. First eliminate the lower diagonal elements by

    row_i ← row_i - (l_i ÷ D_{i-1}) × row_{i-1}

so that the d_i are altered:

    D_1 = d_1
    D_i = d_i - (u_{i-1} × l_i) ÷ D_{i-1}    i = 2, ..., n

(The D_i cannot be zero if the matrix is non-singular.) Thus, if

    f[D, d, u, l] = d - ((u × l) ÷ D)

then, using the strategy of Section 2.6, the numerator and denominator of D can be separated, calculated independently using a parallel scheme based on the linear properties of the resulting pair of functions, and finally combined with n divisions to yield the D_i. The right hand side may then be easily computed.

Now the matrix can be solved explicitly for the x_i. Using the same notation as before, with b the updated right hand side,

    x_n = b_n ÷ D_n
    x_i = (b_i - u_i × x_{i+1}) ÷ D_i    i = n-1, ..., 1

Thus

    g[x, b, u, d] = (b - (u × x)) ÷ d

will generate the x_i when applied to the reverse of b, u, D:

    g\0, (⌽b); (⌽u); (⌽D)

It is worthwhile drawing attention to the computational expense that has been incurred by using this parallel technique. Use of the function f incurs a fourfold increase in the amount of computation that has to be performed (see Section 2.6); g incurs a twofold increase. The same techniques can be applied when the band matrix is 5, 7, 9, etc. in width; however, the amount of extra computation that must be performed increases with the square of the width from the diagonal and soon swamps the potential advantage of the parallel scheme. A case study for Illiac IV indicates that for a machine of width 64 processing elements the break even point is a band width of about 5. This break even point is, of course, dependent on the width of the machine.
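The pivot recurrence above can be sketched directly. Carrying D_i as a numerator/denominator pair (N_i, P_i), the nonlinear recurrence becomes the linear map (N_i, P_i) = ((d_i, -u_{i-1} l_i), (1, 0)) × (N_{i-1}, P_{i-1}), whose 2x2 products are associative and hence parallelizable; one division per pivot finishes. A sketch using exact rational arithmetic and illustrative numbers:

```python
from fractions import Fraction

# D_i = d_i - u_{i-1}*l_i / D_{i-1} made linear via D_i = N_i / P_i.
def matmul(X, Y):
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

d = [4, 5, 6, 7]          # diagonal
u = [1, 2, 1]             # upper diagonal
l = [2, 1, 3]             # lower diagonal

# prefix of 2x2 products (sequential here; a tree/scan works identically)
Ms = [[[d[i], -u[i-1]*l[i-1]], [1, 0]] for i in range(1, len(d))]
acc = [[d[0], 0], [1, 0]]                # first column encodes N_1 = d_1, P_1 = 1
pivots = [Fraction(acc[0][0], acc[1][0])]
for M in Ms:
    acc = matmul(M, acc)
    pivots.append(Fraction(acc[0][0], acc[1][0]))   # the n final divisions

# check against the direct nonlinear recurrence
D = Fraction(d[0])
for i in range(1, len(d)):
    D = d[i] - Fraction(u[i-1] * l[i-1], 1) / D
    assert pivots[i] == D
print(pivots)
```

The fourfold work increase the text mentions is the cost of these 2x2 products relative to the one multiply, one divide, and one subtract of the sequential sweep.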
CHAPTER 4
LIMITATIONS OF THE LINEAR SEQUENTIAL METHOD

4.1 In attempting to apply the linear sequential strategy to specific practical cases, it often becomes apparent that the exact conditions of Theorem 2.4 are not going to be met. The question that naturally arises is whether or not some perturbation of the theory can handle such cases. The next section is a discussion of some of the underlying algebraic properties that are employed in creating an associative extension. Specific ad hoc examples of associative extensions are given in this chapter. Their existence demonstrates that linear functions are not the only ones for which an associative extension can be found; however, a closed form bound on the kind of functions for which associative extensions exist is very hard to come by.

4.2 No matter what algebraic properties are used to achieve the final result, it must be possible to separate the initial values of a sequential algorithm from the repetitive computation. This is required for it to be possible to break the chain of dependence and introduce some parallelism into the execution of the algorithm. Then it is necessary to have some function, analogous to the associate, which reconstructs the expected answer from the partial results and the initial values. In essence, the requirement is that some useful computation must be possible at stages i, i+1, ..., i+m even though the values of the memory variables at stage i have not yet been determined. In the linear technique this separation was achieved by computing the factor by which the initial value was multiplied, and combining it with the additive part using the associate. If the general term of the sequential algorithm is expressible as an expression in which the initial value occurs in a bounded number of places, then a strategy akin to the linear strategy can be expected to yield an associative extension.
However, suppose the function involves powers of the memory variables, or some other expression which introduces the memory variable in several places, e.g.,

    f[x, a, b] = a x^2 + b

Now, clearly, f/z, A; B will involve powers of z up to z^(2^n), where n = length of A, and it is clear that there is no limit to the number of powers of z that would have to be remembered to contain the effect of the initial value.

4.3 An associate F~ of a function F is the most natural point at which to examine functions for associativity. Since F~ must be a function of 2n inputs yielding n outputs, and must be its own associate (F~~ = F~, Section 2.2), it is a simple matter to test a candidate F~. For n = 1, F~ may be +, ×, max, min, and many others. Functions for which n ≥ 2 are discussed below. The function F whose associate is a given F~ can quickly be determined; for instance, if

    F~[a, b, c, d] = ((a × d) + c), (b × d)

then

    F~[X, Y, F[e_1, e_2, Z]] = F[X, Y, Z]

(where e_1, e_2 are the identity elements), so that if F[e_1, e_2, Z] = G[Z], H[Z], the general form of F is

    F[X, Y, Z] = ((X × H[Z]) + G[Z]), (Y × H[Z])

for arbitrary functions H and G.

4.4 Consider

    f[x, y, ..., z, b] = (bx ÷ p(x, y, ..., z)), (by ÷ p(x, y, ..., z)), ..., (bz ÷ p(x, y, ..., z))

where p(x, y, ..., z) is some homogeneous function of x, y, ..., z of order, say, n. Each step scales all the memory variables by a common factor, and it is not difficult to show that the i-th term of f\x, y, ..., z, b is

    (g/1, b) × x, (g/1, b) × y, ..., (g/1, b) × z

where g[K, b] = b K ÷ p(Kx, Ky, ..., Kz) × p(x, y, ..., z), i.e., g[K, b] = b K^(1-n) ÷ p(x, y, ..., z). Taking a log transformation,

    log g = log b + (1-n) log K - log p

it is clear that g has an associative extension

    G[K, L, b] = (b K^(1-n) ÷ p), (L × (1-n))

with associate

    G~[K, L, M, N] = (K^N × M), (L × N)

Thus it is possible to generate associative extensions of functions such as

    f(x, y, b) = (bx ÷ (x^2 + y^2)), (by ÷ (x^2 + y^2))

or

    f(x, y, b) = (bx ÷ √(x^2 + y^2)), (by ÷ √(x^2 + y^2))

4.5 APL-TYPE SUBSCRIPTING is associative. For example, if

    A = (3, 9, 18, 27)
    B = (2, 3, 4)
    C = (2, 2, 1)

then

    A[B[C]] = A[3, 3, 2] = 18, 18, 9

but also

    (A[B])[C] = (9, 18, 27)[C] = 18, 18, 9

There are, of course, restrictions on the values that B and C can have in order that A[B[C]] be well defined: 1 ≤ B[i] ≤ ρA and 1 ≤ C[i] ≤ ρB.
CHAPTER 5
BIT BY BIT TECHNIQUES: RESIDUE METHOD

5.1 Let x_ji denote the i-th bit of the j-th partial result x_j. The sequential evaluation computes the partial results a word at a time: all the bits x_11, x_12, ... of x_1, then all the bits of x_2, and so on. A bit by bit technique instead computes one bit position of every partial result at a time: first x_11, x_21, x_31, ..., x_m1, then x_12, x_22, x_32, ..., x_m2, and so on. Thus the sequential technique may be regarded as a vertical slicing of the computation, and bit by bit techniques a horizontal one.

5.2 THE RESIDUE METHOD is a bit by bit technique in which progressively higher order bits of the partial results are derived. It is amenable to problems in which the iterative function f is a polynomial in the memory variables, with mod 2^n (for fixed n) also a permissible operation. This case was not solvable in general by a linear technique (Section 4.2). The algorithm proceeds as follows:

1) At step i the i low order bits of each partial result are known. Let these results be X_1i, X_2i, ..., X_mi.

2) The output of the next iteration will be X_1,i+1, X_2,i+1, ..., X_m,i+1, where

    X_j,i+1 = X_ji    or    X_j,i+1 = X_ji + 2^i

that is, the next bit of each answer will either be set or it will not be set.

3) Evaluate f(X_ji, a_j) and f(X_ji + 2^i, a_j) for j = 1, 2, .... From these two trials it is possible to decide what influence the introduction of another bit of the input to each stage of the sequence has on the output; i.e., the introduction of the 2^i bit of the input either causes the 2^i bit of the output to be switched on or it does not.

4) Hence proceed from left to right, selecting one of the two cases depending upon the input from the previous stage. Now repeat the whole process with i increased by 1.

The validity of this process depends on the fact that, for P(x, y, ..., z) a polynomial in x, y, ..., z,

    P(x, y, ..., z) mod n = A    implies    P(x, y, ..., z) mod 2n = A or A + n.

5.3 THE SELECTION PROCESS of Section 5.2, in which the choice between f(X_ji + 2^i, a_j) and f(X_ji, a_j) is made, would appear to be an inherently sequential process. However, as noted in Section 4.5, subscripting, which is equivalent to the selection process, is an associative function. Thus it is easily possible to reorganize the selection operation in a more parallel fashion.
Furthermore, in any hardware realization of this algorithm the selection function is a very simple logical operation that could be computed at quite high speed.

5.4 THE SIGN OF THE FINAL RESULT is easily obtained from the last partial result. The sign is lost in processing because the partial results are evaluated modulo 2^i. The signs are recovered by evaluating each partial result assuming first a positive and then a negative sign for its predecessor, and using the same selection operation as is used in Section 5.2.

5.5 EXAMPLE of the residue method. Let

    f(x, a, b) = ax - b

and evaluate f/z, A; B with

    z = 10
    A = -9, 4, -3, 2, 5, 7
    B = 7, 1, 4, 2, 8, 5

for which the partial results are

    -97, -389, 1163, 2324, 11612, 81279

A Burroughs ALGOL program was written to carry out the method. It declares the iterative function F and a procedure MODOF(A, B) returning the non-negative residue of A mod B, estimates the number of iterations needed from the magnitude of the largest partial result, and then, doubling the modulus at each iteration, computes for every stage the two trials

    TRYHIGH[I] := MODOF(F(OUTPUT[I-1] + MODULUS, A[I], B[I]), 2 × MODULUS);
    TRYLOW[I]  := MODOF(F(OUTPUT[I-1], A[I], B[I]), 2 × MODULUS);

TRYHIGH[I] contains the output expected from the I-th stage assuming the next bit of its input is set; TRYLOW[I] contains the output assuming that bit is not set. The selection loop, seeded by testing whether or not the bit is set for the initial value, is then

    OUTPUT[I] := IF BITSET THEN TRYHIGH[I] ELSE TRYLOW[I];
    BITSET := MODOF(OUTPUT[I], 2 × MODULUS) GEQ MODULUS;

A final pass evaluates the sign of each partial result by the device of Section 5.4: each stage is re-evaluated from its predecessor taken first as the stored residue and then as the residue minus the final modulus (the negative value was generated mod the modulus; hence, if the true value is negative, it is "value minus modulus"), and the same selection operation chooses between them.

The printed output, reproduced in the original for each modulus 1, 2, 4, ..., 131072, shows the stages converging: at the final modulus the outputs are 10, 130975, 130683, 1163, 2324, 11612, 81279, which are exactly the true partial results reduced modulo 131072.
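The logic of the example program can be sketched in Python (a reconstruction of the method, not a transcription of the original ALGOL):

```python
# Residue method (Sections 5.2-5.5) for f(x, a, b) = a*x - b.  After
# iteration i, out[j] holds the j-th partial result modulo 2**(i+1).
def f(x, a, b):
    return a * x - b

def residue_partials(z, A, B, nbits):
    n, out = len(A), [0] * len(A)
    for i in range(nbits):
        M, M2 = 1 << i, 2 << i
        prev = [z] + out[:-1]                      # predecessors, known mod M
        # the two trials for all stages are independent: the parallel part
        try_lo = [f(p % M,     a, b) % M2 for p, a, b in zip(prev, A, B)]
        try_hi = [f(p % M + M, a, b) % M2 for p, a, b in zip(prev, A, B)]
        bit = (z % M2) >= M                        # bit i of the initial value
        for j in range(n):                         # the selection sweep (5.3)
            out[j] = try_hi[j] if bit else try_lo[j]
            bit = out[j] >= M                      # bit i of this stage
    return out

def recover_signs(z, A, B, out, nbits):            # Section 5.4
    M = 1 << nbits
    prev = [z % M] + out[:-1]
    hi = [f(p,     a, b) for p, a, b in zip(prev, A, B)]  # predecessor >= 0
    lo = [f(p - M, a, b) for p, a, b in zip(prev, A, B)]  # predecessor < 0
    signed, neg = [], z < 0
    for j in range(len(out)):
        signed.append(lo[j] if neg else hi[j])
        neg = signed[-1] < 0
    return signed

z, A, B = 10, [-9, 4, -3, 2, 5, 7], [7, 1, 4, 2, 8, 5]
out = residue_partials(z, A, B, 18)
print(recover_signs(z, A, B, out, 18))  # [-97, -389, 1163, 2324, 11612, 81279]
```

Only the cheap selection sweep is shown sequentially here; as Section 5.3 notes, it too is an associative operation and can be reorganized in parallel.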
Finally, calculating the signs of the partial results yields

    -97, -389, 1163, 2324, 11612, 81279

which is, of course, the same result that was found by the straightforward sequential method.

5.6 It is possible to extend the residue method to include division of one polynomial by another, using the same technique that was used in Section 2.6 to handle quasi-linear functions. For example, for

    f[x, a] = (x^2 + a^2) ÷ (x^2 - a^2)

write x = t ÷ b and use

    g[t, b, a] = (t^2 + b^2 a^2), (t^2 - b^2 a^2)

5.7 The residue method as stated suffers from one major drawback: unless there is some guarantee that the partial results are bounded, it is likely that the number of bits in the final result will be at least as large as the number of iterations that would be required to perform the calculation in a straightforward sequential manner. This can often be avoided by evaluating higher order bits of later stages in the algorithm. For example, if the iterative function were

    f[x, a] = 2x + a

and the input vector were positive, then each stage of the algorithm would at least double the previous partial result. Accordingly, one could start the n-th stage of the algorithm at the n-th bit. This would lose the n low order bits of the n-th partial result, which would be set to zero. The details of this type of technique will be developed in the next chapter (Section 6.3). If this technique is to be used, however, there are some quite formidable difficulties to be overcome in any practical application. Furthermore, the technique is restricted to polynomial-like iterative functions and cannot be extended even to such comparatively mild functions as max or abs.

CHAPTER 6
ANOTHER BIT BY BIT TECHNIQUE: THE LIPSCHITZ METHOD

6.1 Most of the difficulties with the residue technique arise from the fact that it evaluates the bits of each partial result from least significant to most significant. This is counter to the usual practice in numerical computation, in which the most significant part of the answer is computed first. The Lipschitz technique discussed in this chapter evaluates partial results from most significant to least significant. Thus, after n iterations one would expect to know n high order bits of each partial result. This would normally be of much more interest than the n least significant bits that are obtained in the residue method. Furthermore, this technique is applicable to a very wide class of functions, including max, abs, and many continuous functions. These are not generally amenable to solution by the residue method.

6.2 THE LIPSCHITZ METHOD. Assume that the input to the j-th stage at the i-th iteration is X_ji. Then introducing a small change δ in X_ji will induce a change of at most Mδ in X_{j+1,i}, where M is the Lipschitz bound (effectively, the magnification by f of a change in x is at most M throughout the range of X, for each memory variable x).
To illustrate the point about higher-order bits: if the iterative function were f[x,a] = 2x + a and the input vector were positive, then each stage of the algorithm would at least double the previous partial result. Accordingly, one could start the nth stage of the algorithm at the nth bit. This would lose the n low-order bits of the nth partial result, which would be set to zero. The details of this type of technique will be developed in the next chapter (Section 6.3).

If this technique is to be used, however, there are some quite formidable difficulties to be overcome in any practical application. Furthermore, the technique is restricted to polynomial-like iterative functions and cannot be extended even to such comparatively mild functions as max or abs.

CHAPTER 6

ANOTHER BIT BY BIT TECHNIQUE: THE LIPSCHITZ METHOD

6.1 Most of the difficulties with the residue technique arise from the fact that it evaluates the bits of each partial result from least significant to most significant. This is counter to the usual practice in numerical computation, in which the most significant part of the answer is computed first. The Lipschitz technique discussed in this chapter evaluates partial results from most significant to least significant. Thus, after n iterations one would expect to know n high-order bits of each partial result. This would normally be of much more interest than the n least significant bits that are obtained in the residue method. Furthermore, this technique is applicable to a very wide class of functions, including max, abs, and many continuous functions, which are not generally amenable to solution by the residue method.

6.2 THE LIPSCHITZ METHOD. Assume that the input to the jth stage at the ith iteration is X_ji. Then introducing a small change δ in X_ji will induce a maximum change of Mδ in X_(j+1)i, where M is the Lipschitz bound (effectively |df/dx| <= M throughout the range of X for each memory variable x).
For convenience assume that δ is some power of 2, say δ = 2^r, and that M also is some power of 2, say M = 2^s. Then the introduction of δ = 2^r at stage j will produce at most a change in the 2^(r+s) bit of stage j+1. Accordingly, these effects may be anticipated by evaluating f(X_j + δ_j, a_j) and f(X_j - δ_j, a_j) for each stage j and using a selection process similar to that used in Section 5.2. In practice the selection is based on the closeness of the trial results f(X_ji, a_j), f(X_ji + δ_j, a_j), and f(X_ji - δ_j, a_j) to the trial parameters X_(j+1)i, X_(j+1)i + δ_(j+1), and X_(j+1)i - δ_(j+1). This criterion is chosen rather than a strict examination of the bits of each result for two reasons: first, it tends to speed the convergence, and second, it automatically compensates for unexpected overflow from accumulated lower-order bits as the algorithm proceeds.

6.3 INITIAL VALUES for the δ_j defined above must be chosen with some care. The δ_j are halved at each iteration, and the X_ji can be altered by ±δ_j at each iteration. Clearly, if these δ_j are too small initially, it will not be possible to converge to the true value of X_j. On the other hand, a choice of very large δ_j will cause the algorithm to waste much time without making any changes at all to X_ji.

For many cases an adequate set of initial values can be obtained directly from the Lipschitz bound M. A function f(x,a) satisfies the Lipschitz condition with respect to its memory variable x if, for any pair of values y, z in the domain of f,

    |f[y,a] - f[z,a]| <= M|y - z| .

Thus f[y,a] belongs to the interval f[0,a] ± M|y|. Hence if the X_i are the partial results of f/z,A then

    X_1 in f[0,A_1] ± M|z|
    X_2 in f[0,A_2] ± M|X_1|
    etc.

so that the function

    g[x,a] = max(abs(f[0,a] + M|x|), abs(f[0,a] - M|x|))

will produce suitable initial values when it reduces z,A.
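Gathering the pieces of Sections 6.2 and 6.3 — three trial evaluations per stage, selection by closeness, halving of the δ_j, and initial δ_j taken from the Lipschitz bound — the method can be sketched as follows. This is a hedged modern Python sketch, not the thesis's program: the names (`lipschitz_parallel`, `iters`) are hypothetical, and the selection rule (nearest trial result to the current estimate) is a simplification of the closeness criterion described above.

```python
import math

def lipschitz_parallel(f, x0, a, M, iters):
    """Most-significant-bits-first evaluation of x[j] = f(x[j-1], a[j])
    for f satisfying |f(y, t) - f(z, t)| <= M * |y - z|."""
    n = len(a)
    # Initial radii from the Lipschitz bound:
    # |x[j]| <= |f(0, a[j])| + M * |x[j-1]|.
    delta = [abs(x0)]
    for t in a:
        delta.append(abs(f(0.0, t)) + M * delta[-1])
    X = [x0] + [0.0] * n          # X[0] is exact; the rest are guesses
    for _ in range(iters):
        # Parallel phase: LOW, EVEN, HIGH trial results for every stage.
        trials = [(f(X[j - 1] - delta[j - 1], a[j - 1]),
                   f(X[j - 1], a[j - 1]),
                   f(X[j - 1] + delta[j - 1], a[j - 1]))
                  for j in range(1, n + 1)]
        # Selection: move each estimate to the nearest trial result.
        for j in range(1, n + 1):
            X[j] = min(trials[j - 1], key=lambda t: abs(t - X[j]))
        delta = [d / 2 for d in delta]   # halve every radius
    return X[1:]

# Example 6.4's function: f[x,a] = (1/2) Sin(2x + a), Lipschitz bound M = 1.
f = lambda x, t: 0.5 * math.sin(2 * x + t)
approx = lipschitz_parallel(f, 0.3, [0.5, -1.2, 2.0], M=1.0, iters=60)
```

After enough iterations every estimate agrees with the straightforward sequential evaluation: since the unchanged trial for stage j is computed from the current estimate of stage j-1, an exact estimate propagates one stage further each iteration.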
An obvious artifice can be used to generate a linear form of g:

    g[x,a] = abs(f[0,a]) + Mx ;

then g/abs(z),A produces the same initial values. Now g is linear in its memory variable and therefore has a parallel equivalent.

6.4 EXAMPLE. The following program simulates the Lipschitz algorithm for the function f[x,a] = 1/2 Sin(2x + a), reducing a randomly generated vector. The maximum derivative of f with respect to the memory variable x (the Lipschitz bound) is |df/dx| <= 1.

[The Burroughs ALGOL listing is too badly garbled in this scan to reproduce in full. Its structure: declarations; a procedure PRINT, which prints the array passed to it as parameter; a procedure GETA, which generates a sample vector A in which each element lies in the range -1000 to +1000; a real procedure F(X,A) = 0.5*SIN(2*X + A), the function which reduces the Z value and the vector A (F/Z,A); a sequential pass computing the correct answers TRU[I] and the initial values DELTA[I] := ABS(F(0,A[I])) + MAXDERIVATIVE*DELTA[I-1]; and the main loop, which at each iteration halves the DELTA[I], computes the trial vectors HIGH (the output of each stage assuming its input is increased by DELTA), LOW (the result assuming it is decreased by the same amount), and EVEN (the result assuming no change), and selects among them until the outputs change by less than EPSILON.]

PROGRAM FOR EXAMPLE 6.4, THE LIPSCHITZ METHOD.

NOTE: THE OUTPUT FROM THIS PROGRAM HAS BEEN EDITED BY HAND TO IMPROVE READABILITY.

F[X,A] = 0.5*SIN(2*X + A)    MAX DERIVATIVE = 1.000    INITIAL VALUE = 0.000

F REDUCES THE VECTOR A: "F/Z,A"

[The 100-element input vector A is printed here in columns; its values are garbled in this scan.]
[The true answers, computed sequentially, and the OUTPUT vector at the 20th iteration are printed here in the same column format. The figures are too garbled in this scan to reproduce, but by the 20th iteration the outputs agree with the true answers in their leading significant figures. The run ends:]

DID CONVERGE AFTER 26 ITERATIONS WITH RELATIVE ERROR 1.0000000E-05

[The final OUTPUT vector, equal to the true answers to five significant figures, is printed last.]
THE NUMBER OF ITERATIONS EXPECTED FOR THIS ALGORITHM IS: 3 1/3 TIMES 5, TO GET THE FIVE SIGNIFICANT FIGURES OF ACCURACY, = 17 ITERATIONS, PLUS 5 ITERATIONS FOR THE MAXIMUM VALUE OF DELTA[J] TO BECOME SMALL ENOUGH. THE ACTUAL NUMBER OF ITERATIONS REQUIRED IS 26.

6.5 CONVERGENCE OF THE LIPSCHITZ METHOD. Two questions need to be answered in this respect: first, under what conditions does the method converge, and second, how quickly does it converge? An examination of the algorithm in Section 6.4 reveals that the partial results of f/z,A propagate through the calculation; that is, after the ith iteration the ith partial result is known exactly. Referring to the variables used in the program, the trial input EVEN[I] will be the one chosen when the algorithm converges. Since EVEN[I] is computed using OUTPUT[I-1], it is certain that after k iterations EVEN[k] will be exact. Thus the algorithm converges to the sequential results under all conditions.

The number of iterations in which the algorithm will converge can be calculated by examining the initial values δ_j. If the maximum tolerable error is E_j, then the jth stage will continue log2(δ_j/E_j) iterations to convergence. It is clear that a large Lipschitz bound M will cause the δ_j to be very large and the convergence therefore slow. This issue will be taken up in the next chapter. In the example cited in Section 6.4 the δ_j are approximately j/3; thus the number of iterations expected for different vector lengths should be of the order of log2(j/3) + 25. This is borne out empirically:

    Size of A    Number of Iterations
       100               26
       500               30
      1000               32
      2000               33
     10000               36

6.6 Functions of more than one memory variable can also be handled with this technique. Two distinct approaches are possible.
For example, if f(x, y, a) is a function which yields two results, then the iterative process can proceed as follows.

1) Evaluate

    f(x_i, y_i, a_i)
    f(x_i + δ, y_i, a_i)
    f(x_i - δ, y_i, a_i)

and use the usual selection process to choose which of these three values is most appropriate at each stage;

2) repeat the process for the parameter y.

Or the two steps may be performed at once by evaluating the nine values f(x_i ± δ, y_i ± ε, a_i) and again choosing the best of these values using the straightforward selection process. The first of these techniques would appear to be the more economical in most instances.

Calculation of an initial sequence follows the style of single-memory-variable functions. If

    |df/dx| <= M    and    |df/dy| <= N

then f(x, y, a) lies in f(0, 0, a) ± M|x| ± N|y|, so that

    g[x,y,a] = abs(f(0, 0, a)) + Mx + Ny

predicts the initial sequence when applied to g/abs(z_1),abs(z_2),A.

CHAPTER 7

LIMITATIONS TO THE LIPSCHITZ METHOD

7.1 The initial values for the δ_j in Section 6.3 crucially affect the efficiency with which the parallel algorithm will work. If the Lipschitz bound M is greater than 1, then these numbers will be of order δ_j = O(M^j), which may be absurdly large for large j. However, it may not be possible to choose smaller values for the δ_j, since any change δ_j at stage j could produce a change Mδ_j at stage j+1; thus δ_(j+1) must be at least Mδ_j. For example, for

    f[x,a] = 1/2 Sin(3x + a),    M = 3/2,

and an extraneous input vector of length 1000,

    δ_1000 = (3/2)^1000 ≈ 2^580 .

Since the Sin function is never greater than 1, it is absurd to start looking at the 580th bit. An algorithm which should converge more quickly is described in this chapter.

First the δ_j are chosen as the maximum value of f. The algorithm proceeds as described in Chapter 6; however, if in the execution of the algorithm it happens that a stage k is converging to a point x_0 at which |df/dx| > 1, then δ_(k+1), δ_(k+2), ... are multiplied by M. The net effect of this is to delay the convergence of stages k+1, k+2, ...
for log2(M) iterations, after which the changes to stage k will have settled down.

7.2 EXAMPLE. The following example is based on the function f[x,a] = 3x^2 + a, subject to the arbitrary restriction that it not stray outside the range [-1,+1]. Thus

    f[x,a] = max(-1, min(1, 3x^2 + a))

and the extraneous input vector A is uniformly distributed in the range [-3,0]. The algorithm is essentially the same as Example 6.4, with two exceptions:

1) The initial δ_j are chosen to be the maximum value of f[x,a]: δ_j = 1.

2) In the loop which computes trial values for HIGH, LOW, and EVEN, a special test is made to see if the expected change at stage i is greater than the change being computed at stage i+1. If this is so, δ_(i+1), δ_(i+2), ... are multiplied by M, the maximum derivative of f with respect to x. Here

    df/dx = 6x ;

the maximum value of |df/dx| occurs at 3x^2 + a = 1 with a = -3, i.e. x = sqrt(4/3), so

    M = |df/dx| = 6*sqrt(4/3) ≈ 7 .

7.4 CONVERGENCE OF THE MODIFIED METHOD is guaranteed for the same reasons that were advanced in Section 6.5. The number of iterations that can be expected is easily found if the probability that |df/dx| > 1 is known. If this probability is p and the length of the extraneous input is n, then the last δ_j should be multiplied np times, which costs np*log2(M) iterations. Thus, to produce answers accurate to E will require log2(I/E) + np*log2(M) iterations, where I is the initial value assigned to the δ_j.
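The corrective action just described — re-widening the downstream δ's by a factor of M whenever a stage changes late — can be added to the basic method. Again this is a hedged modern Python sketch with hypothetical names (`modified_lipschitz`, `fmax`), not the thesis's listing; the trigger used here (the trial spread at a stage exceeding that stage's own radius) is a simplification of the listing's comparison against DELTA[I+1].

```python
def modified_lipschitz(f, x0, a, M, fmax, iters):
    """Lipschitz method with the Chapter 7 delta rescaling: initial
    radii start at fmax, the maximum value of |f|, and late changes
    re-widen the downstream radii, in no case past fmax."""
    n = len(a)
    delta = [float(fmax)] * (n + 1)
    X = [x0] + [0.0] * n                 # X[0] is exact
    for _ in range(iters):
        trials = [(f(X[j - 1] - delta[j - 1], a[j - 1]),
                   f(X[j - 1], a[j - 1]),
                   f(X[j - 1] + delta[j - 1], a[j - 1]))
                  for j in range(1, n + 1)]          # parallel phase
        for j in range(1, n + 1):
            lo, ev, hi = trials[j - 1]
            if max(abs(hi - ev), abs(lo - ev)) > delta[j]:
                # Late change observed: take corrective action by
                # multiplying this and each subsequent radius by M,
                # capped at the maximum value of the function.
                for k in range(j, n + 1):
                    if delta[k] >= fmax:
                        break
                    delta[k] = min(fmax, delta[k] * M)
            X[j] = min(trials[j - 1], key=lambda t: abs(t - X[j]))
        delta = [d / 2 for d in delta]
    return X[1:]

# Example 7.2's function: f[x,a] = 3x^2 + a clamped to [-1,+1], with M = 7.
f = lambda x, t: max(-1.0, min(1.0, 3 * x * x + t))
result = modified_lipschitz(f, 0.5, [-1.0, -2.0, -0.5, -3.0],
                            M=7.0, fmax=1.0, iters=100)
```

Per the estimate of Section 7.4, accuracy E from initial radius I should cost about log2(I/E) + n*p*log2(M) iterations, p being the probability that |df/dx| exceeds 1 at a stage.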
[The thesis's own Burroughs ALGOL listing, captioned PROGRAM FOR EXAMPLE 7.2, appears here; it differs from the program of Example 6.4 only in the initial DELTA values and in the corrective test, which multiplies each DELTA[J] subsequent to a late-changing stage by the maximum derivative, in no case letting it grow past the maximum value of the function. It is followed by snapshots of the OUTPUT vector at iterations 10, 20, 30, and 40. The numeric columns are too garbled in this scan to reproduce; many partial results sit at the clamped values ±1.000000, and successive snapshots agree in more and more of their figures as the outputs settle toward the true sequential answers.]
Conclusion

The techniques discussed in this thesis demonstrate a parallel technique of wide application to functions whose derivative with respect to each memory variable is comparatively small. The ultimate case of an unpredictable function is one in which the range of values is discrete. These functions are essentially unpredictable using the techniques discussed here, and one can offer convincing arguments (Chapter 4) that no parallel evaluation is possible. Such discrete functions are characterized by an infinite derivative at many points.
As the constraint on the derivative becomes more stringent, the class of functions admits of more and more accurate parallel evaluation.

VITA

Harold Robert George Trout. Born 18th December 1945. B.Sc. in Mathematics, Victoria University of Wellington, New Zealand, 1966. B.Sc. with 1st class honours in Mathematics, Victoria University, 1967. M.S. in Computer Science, University of Illinois, 1969. 1970 - present: vice-president, TMM Programming, a hardware, software, and consulting company. Mr. Trout is the author of 'A Compiler Generating System', 21st ACM National Conference, 1967.