UIUCDCS-R-72-549

PARALLEL TECHNIQUES

by

Harold Robert George Trout

October, 1972

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois

PARALLEL TECHNIQUES
Harold Robert George Trout, Ph.D.
Department of Computer Science
University of Illinois at Urbana-Champaign, 1972

This thesis develops several techniques for creating the parallel equivalent of a sequential algorithm. It is shown that any function which is linear in its memory variables can be coded in a parallel fashion. Furthermore, using a bit by bit strategy, parallel equivalents are developed for sequential algorithms in which the iterative function is subject only to a Lipschitz condition. The thesis considers the speed of convergence and the computational expense of these algorithms, and develops in detail the application of several of the algorithms to practical examples.

TABLE OF CONTENTS

INTRODUCTION 1
CHAPTER 1  DEFINITIONS 3
CHAPTER 2  LINEAR SEQUENTIAL ALGORITHMS AND EXTENDED ASSOCIATIVITY 8
CHAPTER 3  APPLICATIONS OF LINEAR SEQUENTIAL ALGORITHMS 14
CHAPTER 4  LIMITATIONS OF THE LINEAR SEQUENTIAL METHOD 20
CHAPTER 5  BIT BY BIT TECHNIQUES: RESIDUE METHOD 25
CHAPTER 6  ANOTHER BIT BY BIT TECHNIQUE: THE LIPSCHITZ METHOD 39
CHAPTER 7  LIMITATIONS TO THE LIPSCHITZ METHOD 48
REFERENCES 56
VITA 57

INTRODUCTION

Two divergent trends are distinguishable in the architecture of computing machines.
On the one hand, considerable increase in speed and reliability has been achieved by the duplication and parallel operation of many components within these machines. On the other hand, more and more machines have been built with specialized instruction sets for certain applications. Any marriage of these two broad strategies has been on an ad hoc basis, since in general very little has been known about the parallel equivalents of given sequential functions. If techniques for creating the parallel equivalent of a sequential process were known, it would be possible to connect many LSI circuits together to achieve a desired speed improvement. The tacit assumption has been that although certain techniques may work for certain algorithms, unified techniques, except of the most trivial kind, are unlikely to exist.

This paper unifies some of the parallel equivalents of linear sequential algorithms and demonstrates some further areas in which they can be used with success. In addition, it demonstrates parallel techniques of wide application to algorithms in which each step of the sequential process is subject to a Lipschitz condition. The strategy in finding a parallel equivalent to these non-linear systems is to evaluate the partial results from each stage in a bit by bit fashion. Thus the number of steps in evaluating the whole function is of the order of the number of bits of the final answer, rather than the length of the sequence. The machine parallelism implicit in these algorithms is many identical units operating simultaneously.

CHAPTER 1
DEFINITIONS

1.1 A SEQUENTIAL ALGORITHM is an algorithm in which the execution of the i-th step necessarily depends upon some values generated by the (i-1)-th step (with, of course, the zero-th or initial value(s) given).

1.2 THE ITERATIVE FUNCTION of such a sequential algorithm is the rule for obtaining the i-th value(s) from the (i-1)-th. Suppose f is pictured as a box
accepting inputs x_{i-1}, y_{i-1}, ..., z_{i-1} and a_i, b_i, ..., c_i, and generating outputs x_i, y_i, ..., z_i. The sequential algorithm may then be represented as a chain of n such boxes, which represents n repetitions of f on n sets of inputs.

The usual implementation of such a sequential algorithm would employ some memory variables X, Y, ..., Z and might be written as follows:

    X := x_0; Y := y_0; ...; Z := z_0;
    for i := 1 step 1 until n do
    begin
        X, Y, ..., Z := f(X, Y, ..., Z, A_i, B_i, ..., C_i)
    end;

(f returns one value for each memory variable in this example.) This algorithm will be written

    f/x_0, y_0, ..., z_0, A; B; ...; C

in the style of APL (reference 1), and is read "f reducing x_0, y_0, ..., z_0, A; B; ...; C", where A = a_1, a_2, ..., a_n, B = b_1, b_2, ..., b_n, etc.

Not infrequently the partial results are of as much interest as the final answer. The partial results are the sequence of values X, Y, ..., Z at i = 1; X, Y, ..., Z at i = 2; and so on. In some sense these are a sequence of snapshots of the sequential algorithm taken for each value of i. Continuing the APL-like notation,

    f\x_0, y_0, ..., z_0, A; B; ...; C

will be used to represent these partial results.

The notation throughout this thesis will bear some resemblance to APL (op. cit.) with the following exceptions and extensions:

1) A function call f with parameters a, b, c, d will be written f[a, b, c, d], or (a, b) f (c, d), or a f (b, c, d).
2) × will be used to represent scalar multiplication and often matrix multiplication.
3) The j-th result of a function f which returns several results is written f(...)[j].
4) A vector will generally be written with an underline: a or A.
5) The reverse of a vector a will be written ⌽a.
6) In f/z, A, the function f is permitted to have more than two parameters, as described above.
7) If A is an array, then A[1,2] = A[1], A[2]; so A[B] = A[B[1]], A[B[2]], ..., A[B[n]] when B is one-dimensional, and will be the array X, where X_ij = A[B_ij], when B is two-dimensional.
8) ρA = length of A.
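For readers more used to functional notation, f/z, A is a left fold and f\z, A its running partial results. A rough Python analogue (the particular function f here is an arbitrary illustration, not one drawn from the thesis):

```python
from functools import reduce
from itertools import accumulate

def f(x, a):
    return 2 * x + a          # an arbitrary iterative function of one memory variable

z, A = 1, [3, 5, 7]
final    = reduce(f, A, z)                        # f/z, A
partials = list(accumulate(A, f, initial=z))[1:]  # f\z, A (drop the initial value)

print(final, partials)   # 37 [5, 15, 37]
```

The whole thesis is concerned with when this left-to-right fold can be regrouped and evaluated in parallel.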
1.3 THE MEMORY VARIABLES of a sequential algorithm are the results (X, Y, ..., Z in the example above) which are remembered from step to step of the sequential algorithm. The function is specifically prohibited from having any memory and is thus a pure procedure (i.e., one yielding fixed results for fixed inputs).

1.4 THE EXTRANEOUS VARIABLES of a sequential algorithm are the non-memory inputs to f at each step (a_i, b_i, ..., c_i in the example above).

1.5 A BOUNDED SEQUENTIAL ALGORITHM is a sequential algorithm in which the number of memory variables is fixed (i.e., does not increase as later steps of the algorithm are computed). Since all sequential algorithms in this thesis will be bounded, the term will generally be dropped. The restriction to bounded algorithms excludes parallel equivalents of a sequential algorithm in which the calculations are deferred by remembering all relevant data until the last step, when one giant computation is performed to compute the final result. This restriction also eliminates from the discussion algorithms such as Gaussian elimination.

1.6 A LINEAR SEQUENTIAL ALGORITHM is a sequential algorithm in which the iterative function is linear in the memory variables. Thus, if

    f(x, y, a, b) = (a^2 x + y), (bx + y)

then f/x_0, y_0, A; B is a linear sequential algorithm, since f is linear in x and y.

1.7 DISCUSSION. Intuitively it would be expected that the evaluation of f/a_1, a_2, ..., a_n would take n sequential steps. Although ad hoc methods applicable to particular iterative functions f may serve to reduce the number of steps, it would be implausible to expect this to be generally possible. However, it is actually possible under quite wide circumstances to recast the calculation in a form that permits more parallelism to be employed. Thus, given sufficient hardware, the final result can be generated in something less than n times the time for one iteration.
The price of such an increase in parallelism is often a significant increase in the total amount of computation that is performed.

1.8 Several cases can be handled summarily. For example, if there are no memory variables, then there is no reason to evaluate the algorithm sequentially. Also, if there are no extraneous inputs to the sequence, the results are entirely dependent on the initial values and frequently admit of a closed algebraic solution. In cases where this closed solution is still difficult to compute, the techniques discussed here may still yield an improvement in computation time. Finally, there is the special case of f an associative function of one memory variable and one extraneous variable. An extended concept of associativity will be developed in the next chapter, and it will be shown that all linear sequential algorithms can be cast in this associative form. The condition of associativity grants the freedom to order the calculation in any manner whatsoever, so that parallel equivalents of the sequential algorithm can readily be found.

Chapter 3 is a discussion of some applications of linear sequential algorithms to problems of linear algebra.

Chapter 4 is a discussion of algorithms similar to linear algorithms and their inherent limitations.

Chapter 5 is a discussion of the residue method for the bit by bit evaluation of polynomials and related sequential algorithms.

Chapter 6 is a discussion of an alternate bit by bit technique, called the Lipschitz method, that has proven to be of quite wide application. The most stringent requirement of this strategy is that the function obey the Lipschitz condition with respect to each memory variable.

Chapter 7 is a discussion of various practical constraints on the Lipschitz method and a number of techniques for circumventing these restrictions.
CHAPTER 2
LINEAR SEQUENTIAL ALGORITHMS AND EXTENDED ASSOCIATIVITY

2.1 In Section 1.8 it was observed that an associative function reducing some vector can easily be recast in a parallel fashion. The classical definition of associativity is, however, more stringent than is required to be able to rearrange the computation in a more parallel fashion. The key property of associativity might be expressed as follows: if a and b are any two vectors, then + is associative means

    +/(a, b) = (+/a) + (+/b)

Accordingly, we make the following definition of weak associativity: a function f is said to have an ASSOCIATE f~ if there is some identity element(s) I with the property that, for any vectors A, B and initial value z,

    f/z, C = (f/z, A) f~ (f/I, B)    where C = A, B, the concatenation of A and B.

Any function f with an associate f~ is said to be WEAKLY ASSOCIATIVE.

The identity element(s) deserve some careful attention. For binary functions there is no identity element, since the number of memory variables is one and the first element of the vector is generally taken as the initial value of this memory. However, when there is more than one memory variable it is necessary to specify these to get the algorithm started. The identity elements are chosen so that they also satisfy the above requirements.

2.2 Patently, any binary function p that is associative is also weakly associative, with p~ = p. (The identity is null for binary functions.) Further, it is easy to see that for any function the associate is its own associate: f~~ = f~.

2.3 EXTENDED ASSOCIATIVITY. Frequently a function f which is not associative per se has an extended function F which is associative. This means that, in addition to the function of f, the extended function F computes some additional memory variables which enable an associate F~ to recover from some partition of the calculation. For example, if

    f(a, b, c) = (a × b) + c

then

    F(a, d, b, c) = ((a × b) + c), (d × b)

and clearly

    (F/z, 1, B; C)[1] = f/z, B; C

so that F is indeed an extension of f.
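The claims of Section 2.3 are easy to check numerically. A quick sketch (the values are illustrative only): the extension carries the running product of the b's alongside the ordinary result, and the associate recombines two half-reductions, the second of which starts from the identity pair 0, 1.

```python
# f(a,b,c) = a*b + c, extended as F with a second memory variable that
# accumulates the product of the b's: F(a,d,b,c) = (a*b + c, d*b).
def f_reduce(z, B, C):
    x = z
    for b, c in zip(B, C):
        x = x * b + c
    return x

def F_reduce(z, y, B, C):
    x, p = z, y
    for b, c in zip(B, C):
        x, p = x * b + c, p * b
    return x, p

def F_assoc(left, right):
    # the associate F~: combine two partial reductions
    (x1, p1), (x2, p2) = left, right
    return x1 * p2 + x2, p1 * p2

z, B, C = 7, [2, 3, 5, 4], [1, 6, 2, 9]
whole  = F_reduce(z, 1, B, C)
halves = F_assoc(F_reduce(z, 1, B[:2], C[:2]),
                 F_reduce(0, 1, B[2:], C[2:]))  # right half starts from the identity 0, 1
assert whole == halves
assert whole[0] == f_reduce(z, B, C)
print(whole)   # (1037, 120)
```

Because F_assoc is associative, the two halves (and, recursively, quarters, eighths, ...) can be evaluated on separate processors and merged in a logarithmic number of combining steps.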
Furthermore, F is an associative function with associate

    F~(a, b, c, d) = ((a × d) + c), (b × d)

and

    (F/z, y, B; C) F~ (F/0, 1, D; E) = F/z, y, (B, D); (C, E)

can be demonstrated in a straightforward fashion. It is not difficult to replace the operators +, × with more general functions, provided the following properties are retained:

i) that × distribute over +;
ii) that × have an associate ×~;
iii) that + be its own associate.

Also, the usage of f makes it apparent that any functions of b, c could be substituted for b, c, since they are used only once in the calculation. Another way of looking at this is as follows: if B and C are preprocessed element by element, using g and h, to give g[B] and h[C], then f/z, g[B]; h[C] is equivalent to k/z, B; C where k(a, b, c) = f(a, g[b], h[c]); i.e., replacing b or c with functions of b and c is equivalent to preprocessing the extraneous input. Thus we have the general associative theorem.

2.4 THE ASSOCIATIVE EXTENSION THEOREM. If

1) × is a function with associate ×~,
2) × distributes over another function +,
3) + is its own associate,
4) g, h are any functions,

then the function

    f(a, b, c) = (a × g[b]) + h[c]

has an associative extension

    F(a, b, c, d) = ((a × g[c]) + h[d]), (b × g[c])

which, in turn, has an associate

    F~(a, b, c, d) = ((a × d) + c), (b ×~ d)

with identity I = 0, 1, the identities of + and × respectively.

Proof. As observed above, the arbitrary functions g and h can be ignored, since they amount to some preprocessing of the extraneous inputs. Also, it is clear that the theorem need only be demonstrated on input vectors of length 3, since longer cases can be proven by repeated application of the theorem.
Thus, by straightforward expansion,

    F/z, y, A, B, C, D, E, G
      = F/(f/z, A, B), (×/y, A), C, D, E, G
      = ((f/z, A, B) × C + D) × E + G, (((×/y, A) × C) × E)

using the distributive property of × over +,

      = ((f/z, A, B) × C) × E + (D × E) + G, (((×/y, A) × C) × E)

and now using the associativity of ×,

      = (f/z, A, B) × (×/C, E) + (D × E) + G, ((×/y, A) × (×/C, E))

but

    (D × E) + G = (F/0, 1, C, D, E, G)[1]
    (×/C, E)    = (F/0, 1, C, D, E, G)[2]

so that

    F/z, y, A, B, C, D, E, G = F~[(F/z, y, A, B), (F/0, 1, C, D, E, G)]

where F~[a, b, c, d] = ((a × d) + c), (b × d), as stated.

2.5 LINEAR FUNCTIONS OF MORE THAN ONE MEMORY VARIABLE can also be extended to an associative form in a similar way. For example,

    f[x, y, a, b, c, d] = ((x × a) + (y × b) + c), ((x × b) + (y × c) + d)

Now f can be written as a matrix expression

    f[x, y, a, b, c, d] = (a b; b c) × (x, y) + (c, d)

which is, of course, of the form of Theorem 2.4, with × matrix multiplication, and so its associative extension carries along the running product of the matrices:

    F[x, y, P, a, b, c, d] = ((a b; b c) × (x, y) + (c, d)), ((a b; b c) × P)

with associate

    F~[x, y, P, X, Y, Q] = (Q × (x, y) + (X, Y)), (Q × P)

and identity element (0, 0), (1 0; 0 1). Then

    f/x, y, A; B; C; D = (F/x, y, (1 0; 0 1), A; B; C; D)[1, 2]

and

    (F/x, y, (1 0; 0 1), A; B; C; D) F~ (F/0, 0, (1 0; 0 1), L; M; N; P)
      = F/x, y, (1 0; 0 1), (A, L); (B, M); (C, N); (D, P).

2.6 QUASI-LINEAR FUNCTIONS. There is one final class of functions which can be computed in parallel using the linear strategy:

    f[x, a, b, c, d] = (ax + b) ÷ (cx + d)

Now, by splitting x into a numerator and a denominator (x = t ÷ p), it is possible to write

    g[t, p, a, b, c, d] = (at + bp), (ct + dp)

and then

    f[t ÷ p, a, b, c, d] = ÷/g[t, p, a, b, c, d]

so that, using the result of Section 2.5, the two components of x may be kept separate until the end of the algorithm, when the final result can be obtained by a single division.
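The quasi-linear device of Section 2.6 can be sketched as follows: each step becomes a 2x2 matrix acting on the pair (t, p), the matrix products are associative (and so could be grouped as a tree across processors), and one division finishes the job. A minimal sketch with illustrative coefficients:

```python
from fractions import Fraction

# x -> (a*x + b) / (c*x + d) carried as the pair (t, p) with x = t/p, so
# each step is the linear map (t, p) -> (a*t + b*p, c*t + d*p): a 2x2 matrix.
def step(M, abcd):
    a, b, c, d = abcd
    # compose the new step's matrix with the accumulated linear map M
    return [[a*M[0][0] + b*M[1][0], a*M[0][1] + b*M[1][1]],
            [c*M[0][0] + d*M[1][0], c*M[0][1] + d*M[1][1]]]

def quasi_linear(x0, steps):
    M = [[1, 0], [0, 1]]              # identity
    for s in steps:                   # associative: any grouping works
        M = step(M, s)
    t = M[0][0]*x0 + M[0][1]          # x0 = t/p with p = 1
    p = M[1][0]*x0 + M[1][1]
    return Fraction(t, p)             # the single final division

steps = [(1, 2, 3, 4), (2, 0, 1, 5), (3, 1, 1, 1)]
x = Fraction(2)
for a, b, c, d in steps:              # straightforward sequential evaluation
    x = Fraction(a*x + b, c*x + d)
assert quasi_linear(2, steps) == x
print(x)   # 39/31
```

The fourfold increase in work mentioned later (Section 3.4) is visible here: each step costs a full 2x2 matrix product instead of one multiply-add and one division.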
The same device can be used with functions of several memory variables; for example,

    f[x, y, a, b, c, d, e, g, h, j, k] = (ax + by + c) ÷ (dx + ey + g), (hx + jy + k) ÷ (dx + ey + g)

Assuming x = t ÷ p and y = u ÷ p, then after applying f the updated memory values are

    (at + bu + cp) ÷ (dt + eu + gp), (ht + ju + kp) ÷ (dt + eu + gp)

so that the function

    g[t, u, p, a, b, c, d, e, g, h, j, k] = (at + bu + cp), (ht + ju + kp), (dt + eu + gp)

can be used, and the answer that f would have generated can be obtained by appropriate division: if

    (g_1, g_2, g_3) = g/x, y, 1, A; B; C; D; E; G; H; J; K

then

    f/x, y, A; B; C; D; E; G; H; J; K = (g_1 ÷ g_3), (g_2 ÷ g_3)

and g has an associative extension (Section 2.5).

CHAPTER 3
APPLICATIONS OF LINEAR SEQUENTIAL ALGORITHMS

3.1 The most obvious application of linear sequential algorithms and their parallel equivalents is to linear sequential circuits (LSC). An LSC is defined by its state vector s(t) at time t, its inputs u(t), and characterizing matrices A, B, C, D. The subsequent state is defined by

    s(t+1) = (A × s(t)) + (B × u(t))

and the output by

    y(t) = (C × s(t)) + (D × u(t))

By judicious application of Theorem 2.4 it is possible to define an LSC which will accept k inputs and generate the corresponding k outputs (the so-called k-channel analogue of the LSC). Let

    f[s, u] = (A × s) + (B × u)

Then f\s(0), u(0), u(1), ..., u(k-1) is the sequence of states through which the LSC passes, and y can easily be calculated from these. Now f has the associative extension

    F[s, P, u] = ((A × s) + (B × u)), (A × P)

with associate

    F~[s, P, s', P'] = ((P' × s) + s'), (P' × P)

hence

    F/s(0), I, u_0, ..., u_{k-1} = (F/s(0), I) F~ (F/0, I, u_0, ..., u_{k-1})

Now, evaluating F/0, I, u_0, ..., u_{k-1} yields

    (A^{k-1} B, A^{k-2} B, ..., AB, B) × (u_0, ..., u_{k-1})    and    A^k

thus

    s_k = f/s(0), u_0, ..., u_{k-1} = (A^k × s(0)) + (B' × (u_0, ..., u_{k-1}))

where B' = (A^{k-1} B, A^{k-2} B, ..., AB, B); the k-channel matrices C' and D' follow trivially. This result was also obtained by Gill [2] in 1966.
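The k-channel construction above can be sketched by carrying each LSC step as an affine pair (M, v), meaning s -> M×s + v; composing two such pairs is associative, so the k steps collapse in a tree of combining rounds. A sketch with illustrative 2x2 integer matrices (no particular circuit is intended):

```python
# Each step s -> A*s + B*u(t) is carried as the pair (M, v): s -> M*s + v.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]

def compose(left, right):
    # apply `left` first, then `right`:  s -> M2*(M1*s + v1) + v2
    (M1, v1), (M2, v2) = left, right
    return matmul(M2, M1), [a + b for a, b in zip(matvec(M2, v1), v2)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [0, 2]]
inputs = [[1, 0], [0, 1], [2, 3], [1, 1]]

steps = [(A, matvec(B, u)) for u in inputs]
while len(steps) > 1:                      # tree combine: O(log k) rounds
    steps = [compose(steps[i], steps[i + 1]) if i + 1 < len(steps) else steps[i]
             for i in range(0, len(steps), 2)]
M, v = steps[0]                            # M = A^k, v = B' applied to the inputs

s = [3, 4]                                 # sequential check
for u in inputs:
    s = [a + b for a, b in zip(matvec(A, s), matvec(B, u))]
assert s == [a + b for a, b in zip(matvec(M, [3, 4]), v)]
print(M, v, s)
```

The final pair (M, v) is exactly the k-channel characterization: M = A^k and v collects the A^{k-1-i} B u_i terms.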
3.2 CARRY LOOK AHEAD ADDER. If the bits of the operands in a binary addition are a = (a_1, ..., a_n) and b = (b_1, ..., b_n), then the carry bit at each stage, c_i, is defined by

    c_{i+1} = (a_i ∧ b_i) ∨ (a_i ∧ c_i) ∨ (b_i ∧ c_i),    c_0 = 0

Hence, if

    f[c, a, b] = (a ∧ b) ∨ ((a ∨ b) ∧ c)

then f\0, a; b generates the carries and satisfies the conditions of Theorem 2.4. Therefore, the associative extension

    F[c, p, a, b] = ((a ∧ b) ∨ ((a ∨ b) ∧ c)), ((a ∨ b) ∧ p)

with associate

    F~[c, p, c', p'] = ((c ∧ p') ∨ c'), (p ∧ p')

can be used to break the chain of n carries (n = length of a) into several shorter chains, which are then combined using F~. This process may be applied recursively to further shorten the length of the propagate chain.

The flow of data is as follows:

    c_{n+1} = F~/ (F/c_0, 1, A_1; B_1), (F/0, 1, A_2; B_2), ..., (F/0, 1, A_m; B_m)

where A_1; B_1, ..., A_m; B_m partition the operand bits into m shorter chains.

3.3 STURM SEQUENCES are used to locate the roots of polynomials. The Sturm sequence f_0, f_1, ..., f_n is defined by the following rules:

    f_0 = 1
    f_1(u) = d_1 - u
    f_i(u) = (d_i - u) f_{i-1}(u) - S_i^2 f_{i-2}(u)    i ≥ 2

for some parameter u. Hence, if

    f[d, S, f1, f2] = (((d - u) × f1) + ((-S^2) × f2)), f1

then f\f_1, f_0, D; S will generate the Sturm sequence. Hence, using the technique of Section 2.5, f can be expected to yield an associative extension. In this case, however, it is possible to realize the same result without appeal to Section 2.5: f can be written in the form

    f[f1, f2, X] = X × (f1, f2)

where X is the sequence of 2 × 2 matrices

    X_i = (d_i - u   -S_i^2)
          (   1         0  )

Then f\f_1, f_0, X generates the sequence, and f has an associative extension trivially, since matrix multiplication is associative.

3.4 SOLUTION OF BAND MATRICES. Although the general back-substitution process is outside the scope of this thesis, the closely related question of the solution of band matrices is subject to the technique discussed. This is by virtue of the fact that the number of memory variables is limited by the width of the band.
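Before developing the band-matrix case, it is worth sketching the carry chain of Section 3.2 concretely: the pair carried along is the familiar (generate, propagate), and the combining operator is exactly the associate F~ above. A small sketch with 4-bit operands (the combine is shown as a sequential sweep, but any grouping, and hence a tree, gives the same carries):

```python
# Carries as a prefix computation over (generate, propagate) pairs,
# g_i = a_i & b_i and p_i = a_i | b_i, combined with the thesis's F~:
# left block (low-order bits) first, right block second.
def combine(left, right):
    g1, p1 = left
    g2, p2 = right
    return g2 | (p2 & g1), p2 & p1

def carries(a_bits, b_bits, c0=0):
    """Return c_1 .. c_n given operand bits a_1 .. a_n (low order first)."""
    pairs = [(ai & bi, ai | bi) for ai, bi in zip(a_bits, b_bits)]
    out, acc = [], (c0, 1)
    for pr in pairs:            # associativity is what permits a tree instead
        acc = combine(acc, pr)
        out.append(acc[0])
    return out

a, b = 0b1011, 0b0110           # 11 + 6 = 17
abits = [(a >> i) & 1 for i in range(4)]
bbits = [(b >> i) & 1 for i in range(4)]
cs = carries(abits, bbits)
sum_bits = [ai ^ bi ^ ci for ai, bi, ci in zip(abits, bbits, [0] + cs[:-1])]
value = sum(s << i for i, s in enumerate(sum_bits)) + (cs[-1] << 4)
assert value == a + b
print(cs, value)
```

This is the same structure found in modern carry-lookahead and prefix adders: the (g, p) pairs of adjacent blocks are merged in a logarithmic-depth tree.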
For concreteness, the case of a tri-diagonal matrix will be developed. Let

    A = ( d_1  u_1                         )
        ( l_2  d_2  u_2          (0)       )
        (      l_3  d_3  ...               )
        (  (0)      ...  ...     u_{n-1}   )
        (                l_n     d_n       )

and let the equation to be solved be A x = b. First eliminate the lower diagonal elements by

    row_i ← row_i - (l_i ÷ D_{i-1}) × row_{i-1}

so that the d_i are altered:

    D_1 = d_1
    D_i = d_i - (u_{i-1} × l_i) ÷ D_{i-1}    i = 2, ..., n

(The D_i cannot be zero if the matrix is non-singular.) Thus, if

    f[D, d, u, l] = d - ((u × l) ÷ D)

then, using the strategy of Section 2.6, the numerator and denominator of D can be separated, calculated independently using a parallel scheme based on the linear properties of the resulting pair of functions, and finally combined with n divisions to yield the D_i. The right hand side may then be easily computed.

Now the matrix can be solved explicitly for the x_i. Using the same notation as before, with b the updated right hand side,

    x_n = b_n ÷ D_n
    x_i = (b_i - u_i × x_{i+1}) ÷ D_i    i = n-1, ..., 1

Thus

    g[x, b, u, d] = (b - (u × x)) ÷ d

will generate the x_i when applied to the reverse of b, u, D:

    g\0, (⌽b); (⌽u); (⌽D)

It is worthwhile drawing attention to the computational expense that has been incurred by using this parallel technique. Use of the function f incurs a fourfold increase in the amount of computation that has to be performed (see Section 2.6); g incurs a twofold increase. The same techniques can be applied when the band matrix is 5, 7, 9, etc. in width; however, the amount of extra computation that must be performed increases with the square of the width from the diagonal and soon swamps the potential advantage of the parallel scheme. A case study for Illiac IV indicates that for a machine of width 64 processing elements the break even point is a band width of about 5. This break even point is, of course, dependent on the width of the machine.
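The pivot recurrence above can be sketched directly. Carrying D_i as a numerator/denominator pair (N_i, P_i), the nonlinear recurrence becomes the linear map (N_i, P_i) = ((d_i, -u_{i-1} l_i), (1, 0)) × (N_{i-1}, P_{i-1}), whose 2x2 products are associative and hence parallelizable; one division per pivot finishes. A sketch using exact rational arithmetic and illustrative numbers:

```python
from fractions import Fraction

# D_i = d_i - u_{i-1}*l_i / D_{i-1} made linear via D_i = N_i / P_i.
def matmul(X, Y):
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

d = [4, 5, 6, 7]          # diagonal
u = [1, 2, 1]             # upper diagonal
l = [2, 1, 3]             # lower diagonal

# prefix of 2x2 products (sequential here; a tree/scan works identically)
Ms = [[[d[i], -u[i-1]*l[i-1]], [1, 0]] for i in range(1, len(d))]
acc = [[d[0], 0], [1, 0]]                # first column encodes N_1 = d_1, P_1 = 1
pivots = [Fraction(acc[0][0], acc[1][0])]
for M in Ms:
    acc = matmul(M, acc)
    pivots.append(Fraction(acc[0][0], acc[1][0]))   # the n final divisions

# check against the direct nonlinear recurrence
D = Fraction(d[0])
for i in range(1, len(d)):
    D = d[i] - Fraction(u[i-1] * l[i-1], 1) / D
    assert pivots[i] == D
print(pivots)
```

The fourfold work increase the text mentions is the cost of these 2x2 products relative to the one multiply, one divide, and one subtract of the sequential sweep.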
CHAPTER 4
LIMITATIONS OF THE LINEAR SEQUENTIAL METHOD

4.1 In attempting to apply the linear sequential strategy to specific practical cases, it often becomes apparent that the exact conditions of Theorem 2.4 are not going to be met. The question that naturally arises is whether or not some perturbation of the theory can handle such cases. The next section is a discussion of some of the underlying algebraic properties that are employed in creating an associative extension. Specific ad hoc examples of associative extensions are given in this chapter. Their existence demonstrates that linear functions are not the only ones for which an associative extension can be found; however, a closed form bound on the kind of functions for which associative extensions exist is very hard to come by.

4.2 No matter what algebraic properties are used to achieve the final result, it must be possible to separate the initial values of a sequential algorithm from the repetitive computation. This is required for it to be possible to break the chain of dependence and introduce some parallelism into the execution of the algorithm. Then it is necessary to have some function, analogous to the associate, which reconstructs the expected answer from the partial results and the initial values. In essence, the requirement is that some useful computation must be possible at stages i, i+1, ..., i+m even though the values of the memory variables at stage i have not yet been determined. In the linear technique this separation was achieved by computing the factor by which the initial value was multiplied, and combining it with the additive part using the associate. If the general term of the sequential algorithm is expressible as an expression in which the initial value occurs in a bounded number of places, then a strategy akin to the linear strategy can be expected to yield an associative extension.
However, suppose the function involves powers of the memory variables, or some other expression which introduces the memory variable in several places, e.g.,

    f[x, a, b] = a x^2 + b

Now, clearly, f/z, A; B will involve powers of z up to z^(2^n), where n = length of A, and it is clear that there is no limit to the number of powers of z that would have to be remembered to contain the effect of the initial value.

4.3 An associate F~ of a function F is the most natural point at which to examine functions for associativity. Since F~ must be a function of 2n inputs yielding n outputs, and must be its own associate (F~~ = F~, Section 2.2), it is a simple matter to test a candidate F~. For n = 1, F~ may be +, ×, max, min, and many others. Functions for which n ≥ 2 are discussed below. The function F whose associate is a given F~ can quickly be determined; for instance, if

    F~[a, b, c, d] = ((a × d) + c), (b × d)

then

    F~[X, Y, F[e_1, e_2, Z]] = F[X, Y, Z]

(where e_1, e_2 are the identity elements), so that if F[e_1, e_2, Z] = G[Z], H[Z], the general form of F is

    F[X, Y, Z] = ((X × H[Z]) + G[Z]), (Y × H[Z])

for arbitrary functions H and G.

4.4 Consider

    f[x, y, ..., z, b] = (bx ÷ p(x, y, ..., z)), (by ÷ p(x, y, ..., z)), ..., (bz ÷ p(x, y, ..., z))

where p(x, y, ..., z) is some homogeneous function of x, y, ..., z of order, say, n. Each step scales all the memory variables by a common factor, and it is not difficult to show that the i-th term of f\x, y, ..., z, b is

    (g/1, b) × x, (g/1, b) × y, ..., (g/1, b) × z

where g[K, b] = b K ÷ p(Kx, Ky, ..., Kz) × p(x, y, ..., z), i.e., g[K, b] = b K^(1-n) ÷ p(x, y, ..., z). Taking a log transformation,

    log g = log b + (1-n) log K - log p

it is clear that g has an associative extension

    G[K, L, b] = (b K^(1-n) ÷ p), (L × (1-n))

with associate

    G~[K, L, M, N] = (K^N × M), (L × N)

Thus it is possible to generate associative extensions of functions such as

    f(x, y, b) = (bx ÷ (x^2 + y^2)), (by ÷ (x^2 + y^2))

or

    f(x, y, b) = (bx ÷ √(x^2 + y^2)), (by ÷ √(x^2 + y^2))

4.5 APL-TYPE SUBSCRIPTING is associative. For example, if

    A = (3, 9, 18, 27)
    B = (2, 3, 4)
    C = (2, 2, 1)

then

    A[B[C]] = A[3, 3, 2] = 18, 18, 9

but also

    (A[B])[C] = (9, 18, 27)[C] = 18, 18, 9

There are, of course, restrictions on the values that B and C can have in order that A[B[C]] be well defined: 1 ≤ B[i] ≤ ρA and 1 ≤ C[i] ≤ ρB.
CHAPTER 5
BIT BY BIT TECHNIQUES: RESIDUE METHOD

5.1 Let x_ji denote the i-th bit of the j-th partial result x_j. The sequential evaluation computes the partial results a word at a time: all the bits x_11, x_12, ... of x_1, then all the bits of x_2, and so on. A bit by bit technique instead computes one bit position of every partial result at a time: first x_11, x_21, x_31, ..., x_m1, then x_12, x_22, x_32, ..., x_m2, and so on. Thus the sequential technique may be regarded as a vertical slicing of the computation, and bit by bit techniques a horizontal one.

5.2 THE RESIDUE METHOD is a bit by bit technique in which progressively higher order bits of the partial results are derived. It is amenable to problems in which the iterative function f is a polynomial in the memory variables, with mod 2^n (for fixed n) also a permissible operation. This case was not solvable in general by a linear technique (Section 4.2). The algorithm proceeds as follows:

1) At step i the i low order bits of each partial result are known. Let these results be X_1i, X_2i, ..., X_mi.

2) The output of the next iteration will be X_1,i+1, X_2,i+1, ..., X_m,i+1, where

    X_j,i+1 = X_ji    or    X_j,i+1 = X_ji + 2^i

that is, the next bit of each answer will either be set or it will not be set.

3) Evaluate f(X_ji, a_j) and f(X_ji + 2^i, a_j) for j = 1, 2, .... From these two trials it is possible to decide what influence the introduction of another bit of the input to each stage of the sequence has on the output; i.e., the introduction of the 2^i bit of the input either causes the 2^i bit of the output to be switched on or it does not.

4) Hence proceed from left to right, selecting one of the two cases depending upon the input from the previous stage. Now repeat the whole process with i increased by 1.

The validity of this process depends on the fact that, for P(x, y, ..., z) a polynomial in x, y, ..., z,

    P(x, y, ..., z) mod n = A    implies    P(x, y, ..., z) mod 2n = A or A + n.

5.3 THE SELECTION PROCESS of Section 5.2, in which the choice between f(X_ji + 2^i, a_j) and f(X_ji, a_j) is made, would appear to be an inherently sequential process. However, as noted in Section 4.5, subscripting, which is equivalent to the selection process, is an associative function. Thus it is easily possible to reorganize the selection operation in a more parallel fashion.
Furthermore, in any hardware realization of this algorithm the selection function is a very simple logical operation that could be computed at quite high speed.

5.4 THE SIGN OF THE FINAL RESULT is easily obtained from the last partial result. The sign is lost in processing because the partial results are evaluated modulo 2^i. The signs are recovered by evaluating each partial result assuming first a positive and then a negative sign for its predecessor, and using the same selection operation as is used in Section 5.2.

5.5 EXAMPLE of the residue method. Let

    f(x, a, b) = ax - b

and evaluate f/z, A; B with

    z = 10
    A = -9, 4, -3, 2, 5, 7
    B = 7, 1, 4, 2, 8, 5

for which the partial results are

    -97, -389, 1163, 2324, 11612, 81279

A Burroughs ALGOL program was written to carry out the method. It declares the iterative function F and a procedure MODOF(A, B) returning the non-negative residue of A mod B, estimates the number of iterations needed from the magnitude of the largest partial result, and then, doubling the modulus at each iteration, computes for every stage the two trials

    TRYHIGH[I] := MODOF(F(OUTPUT[I-1] + MODULUS, A[I], B[I]), 2 × MODULUS);
    TRYLOW[I]  := MODOF(F(OUTPUT[I-1], A[I], B[I]), 2 × MODULUS);

TRYHIGH[I] contains the output expected from the I-th stage assuming the next bit of its input is set; TRYLOW[I] contains the output assuming that bit is not set. The selection loop, seeded by testing whether or not the bit is set for the initial value, is then

    OUTPUT[I] := IF BITSET THEN TRYHIGH[I] ELSE TRYLOW[I];
    BITSET := MODOF(OUTPUT[I], 2 × MODULUS) GEQ MODULUS;

A final pass evaluates the sign of each partial result by the device of Section 5.4: each stage is re-evaluated from its predecessor taken first as the stored residue and then as the residue minus the final modulus (the negative value was generated mod the modulus; hence, if the true value is negative, it is "value minus modulus"), and the same selection operation chooses between them.

The printed output, reproduced in the original for each modulus 1, 2, 4, ..., 131072, shows the stages converging: at the final modulus the outputs are 10, 130975, 130683, 1163, 2324, 11612, 81279, which are exactly the true partial results reduced modulo 131072.
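The logic of the example program can be sketched in Python (a reconstruction of the method, not a transcription of the original ALGOL):

```python
# Residue method (Sections 5.2-5.5) for f(x, a, b) = a*x - b.  After
# iteration i, out[j] holds the j-th partial result modulo 2**(i+1).
def f(x, a, b):
    return a * x - b

def residue_partials(z, A, B, nbits):
    n, out = len(A), [0] * len(A)
    for i in range(nbits):
        M, M2 = 1 << i, 2 << i
        prev = [z] + out[:-1]                      # predecessors, known mod M
        # the two trials for all stages are independent: the parallel part
        try_lo = [f(p % M,     a, b) % M2 for p, a, b in zip(prev, A, B)]
        try_hi = [f(p % M + M, a, b) % M2 for p, a, b in zip(prev, A, B)]
        bit = (z % M2) >= M                        # bit i of the initial value
        for j in range(n):                         # the selection sweep (5.3)
            out[j] = try_hi[j] if bit else try_lo[j]
            bit = out[j] >= M                      # bit i of this stage
    return out

def recover_signs(z, A, B, out, nbits):            # Section 5.4
    M = 1 << nbits
    prev = [z % M] + out[:-1]
    hi = [f(p,     a, b) for p, a, b in zip(prev, A, B)]  # predecessor >= 0
    lo = [f(p - M, a, b) for p, a, b in zip(prev, A, B)]  # predecessor < 0
    signed, neg = [], z < 0
    for j in range(len(out)):
        signed.append(lo[j] if neg else hi[j])
        neg = signed[-1] < 0
    return signed

z, A, B = 10, [-9, 4, -3, 2, 5, 7], [7, 1, 4, 2, 8, 5]
out = residue_partials(z, A, B, 18)
print(recover_signs(z, A, B, out, 18))  # [-97, -389, 1163, 2324, 11612, 81279]
```

Only the cheap selection sweep is shown sequentially here; as Section 5.3 notes, it too is an associative operation and can be reorganized in parallel.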
Finally, calculating the signs of the partial results yields

    -97, -389, 1163, 2324, 11612, 81279

which is, of course, the same result that was found by the straightforward sequential method.

5.6 It is possible to extend the residue method to include division of one polynomial by another, using the same technique that was used in Section 2.6 to handle quasi-linear functions. For example, for

    f[x, a] = (x^2 + a^2) ÷ (x^2 - a^2)

write x = t ÷ b and use

    g[t, b, a] = (t^2 + b^2 a^2), (t^2 - b^2 a^2)

5.7 The residue method as stated suffers from one major drawback: unless there is some guarantee that the partial results are bounded, it is likely that the number of bits in the final result will be at least as large as the number of iterations that would be required to perform the calculation in a straightforward sequential manner. This can often be avoided by evaluating higher order bits of later stages in the algorithm. For example, if the iterative function were

    f[x, a] = 2x + a

and the input vector were positive, then each stage of the algorithm would at least double the previous partial result. Accordingly, one could start the n-th stage of the algorithm at the n-th bit. This would lose the n low order bits of the n-th partial result, which would be set to zero. The details of this type of technique will be developed in the next chapter (Section 6.3). If this technique is to be used, however, there are some quite formidable difficulties to be overcome in any practical application. Furthermore, the technique is restricted to polynomial-like iterative functions and cannot be extended even to such comparatively mild functions as max or abs.

CHAPTER 6
ANOTHER BIT BY BIT TECHNIQUE: THE LIPSCHITZ METHOD

6.1 Most of the difficulties with the residue technique arise from the fact that it evaluates the bits of each partial result from least significant to most significant. This is counter to the usual practice in numerical computation, in which the most significant part of the answer is computed first. The Lipschitz technique discussed in this chapter evaluates partial results from most significant to least significant. Thus, after n iterations one would expect to know n high order bits of each partial result. This would normally be of much more interest than the n least significant bits that are obtained in the residue method. Furthermore, this technique is applicable to a very wide class of functions, including max, abs, and many continuous functions. These are not generally amenable to solution by the residue method.

6.2 THE LIPSCHITZ METHOD. Assume that the input to the j-th stage at the i-th iteration is X_ji. Then introducing a small change δ in X_ji will induce a change of at most Mδ in X_{j+1,i}, where M is the Lipschitz bound (effectively, the magnification by f of a change in x is at most M throughout the range of X, for each memory variable x).
To illustrate the point about higher-order bits: if the iterative function were f[x,a] = 2x + a and the input vector were positive, then each stage of the algorithm would at least double the previous partial result. Accordingly, one could start the nth stage of the algorithm at the nth bit. This would lose the n low-order bits of the nth partial result, which would be set to zero. The details of this type of technique will be developed in the next chapter (Section 6.3).

If this technique is to be used, however, there are some quite formidable difficulties to be overcome in any practical application. Furthermore, the technique is restricted to polynomial-like iterative functions and cannot be extended even to such comparatively mild functions as max or abs.

CHAPTER 6

ANOTHER BIT BY BIT TECHNIQUE: THE LIPSCHITZ METHOD

6.1 Most of the difficulties with the residue technique arise from the fact that it evaluates the bits of each partial result from least significant to most significant. This is counter to the usual practice in numerical computation, in which the most significant part of the answer is computed first. The Lipschitz technique discussed in this chapter evaluates partial results from most significant to least significant. Thus, after n iterations one would expect to know n high-order bits of each partial result. This would normally be of much more interest than the n least significant bits that are obtained in the residue method. Furthermore, this technique is applicable to a very wide class of functions, including max, abs, and many continuous functions, which are not generally amenable to solution by the residue method.

6.2 THE LIPSCHITZ METHOD. Assume that the input to the jth stage at the ith iteration is X_ji. Then introducing a small change δ in X_ji will induce a maximum change of Mδ in X_(j+1)i, where M is the Lipschitz bound (effectively |df/dx| <= M throughout the range of X for each memory variable x).
For convenience assume that δ is some power of 2, say δ = 2^r, and that M also is some power of 2, say M = 2^s. Then the introduction of δ = 2^r at stage j will produce at most a change in the 2^(r+s) bit of stage j+1. Accordingly, these effects may be anticipated by evaluating f(X_j + δ_j, a_j) and f(X_j - δ_j, a_j) for each stage j and using a selection process similar to that used in Section 5.2. In practice the selection is based on the closeness of the trial results f(X_ji, a_j), f(X_ji + δ_j, a_j), and f(X_ji - δ_j, a_j) to the trial parameters X_(j+1)i, X_(j+1)i + δ_(j+1), and X_(j+1)i - δ_(j+1). This criterion is chosen rather than a strict examination of the bits of each result for two reasons: first, it tends to speed the convergence, and second, it automatically compensates for unexpected overflow from accumulated lower-order bits as the algorithm proceeds.

6.3 INITIAL VALUES for the δ_j defined above must be chosen with some care. The δ_j are halved at each iteration, and the X_ji can be altered by ±δ_j at each iteration. Clearly, if these δ_j are too small initially, it will not be possible to converge to the true value of X_j. On the other hand, a choice of very large δ_j will cause the algorithm to waste much time without making any changes at all to X_ji.

For many cases an adequate set of initial values can be obtained directly from the Lipschitz bound M. A function f(x,a) satisfies the Lipschitz condition with respect to its memory variable x if, for any pair of values y, z in the domain of f,

    |f[y,a] - f[z,a]| <= M|y - z| .

Thus f[y,a] belongs to the interval f[0,a] ± M|y|. Hence if the X_i are the partial results of f/z,A then

    X_1 in f[0,A_1] ± M|z|
    X_2 in f[0,A_2] ± M|X_1|
    etc.

so that the function

    g[x,a] = max(abs(f[0,a] + M|x|), abs(f[0,a] - M|x|))

will produce suitable initial values when it reduces z,A.
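Gathering the pieces of Sections 6.2 and 6.3 — three trial evaluations per stage, selection by closeness, halving of the δ_j, and initial δ_j taken from the Lipschitz bound — the method can be sketched as follows. This is a hedged modern Python sketch, not the thesis's program: the names (`lipschitz_parallel`, `iters`) are hypothetical, and the selection rule (nearest trial result to the current estimate) is a simplification of the closeness criterion described above.

```python
import math

def lipschitz_parallel(f, x0, a, M, iters):
    """Most-significant-bits-first evaluation of x[j] = f(x[j-1], a[j])
    for f satisfying |f(y, t) - f(z, t)| <= M * |y - z|."""
    n = len(a)
    # Initial radii from the Lipschitz bound:
    # |x[j]| <= |f(0, a[j])| + M * |x[j-1]|.
    delta = [abs(x0)]
    for t in a:
        delta.append(abs(f(0.0, t)) + M * delta[-1])
    X = [x0] + [0.0] * n          # X[0] is exact; the rest are guesses
    for _ in range(iters):
        # Parallel phase: LOW, EVEN, HIGH trial results for every stage.
        trials = [(f(X[j - 1] - delta[j - 1], a[j - 1]),
                   f(X[j - 1], a[j - 1]),
                   f(X[j - 1] + delta[j - 1], a[j - 1]))
                  for j in range(1, n + 1)]
        # Selection: move each estimate to the nearest trial result.
        for j in range(1, n + 1):
            X[j] = min(trials[j - 1], key=lambda t: abs(t - X[j]))
        delta = [d / 2 for d in delta]   # halve every radius
    return X[1:]

# Example 6.4's function: f[x,a] = (1/2) Sin(2x + a), Lipschitz bound M = 1.
f = lambda x, t: 0.5 * math.sin(2 * x + t)
approx = lipschitz_parallel(f, 0.3, [0.5, -1.2, 2.0], M=1.0, iters=60)
```

After enough iterations every estimate agrees with the straightforward sequential evaluation: since the unchanged trial for stage j is computed from the current estimate of stage j-1, an exact estimate propagates one stage further each iteration.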
An obvious artifice can be used to generate a linear form of g:

    g[x,a] = abs(f[0,a]) + Mx ;

then g/abs(z),A produces the same initial values. Now g is linear in its memory variable and therefore has a parallel equivalent.

6.4 EXAMPLE. The following program simulates the Lipschitz algorithm for the function f[x,a] = 1/2 Sin(2x + a), reducing a randomly generated vector. The maximum derivative of f with respect to the memory variable x (the Lipschitz bound) is |df/dx| <= 1.

[The Burroughs ALGOL listing is too badly garbled in this scan to reproduce in full. Its structure: declarations; a procedure PRINT, which prints the array passed to it as parameter; a procedure GETA, which generates a sample vector A in which each element lies in the range -1000 to +1000; a real procedure F(X,A) = 0.5*SIN(2*X + A), the function which reduces the Z value and the vector A (F/Z,A); a sequential pass computing the correct answers TRU[I] and the initial values DELTA[I] := ABS(F(0,A[I])) + MAXDERIVATIVE*DELTA[I-1]; and the main loop, which at each iteration halves the DELTA[I], computes the trial vectors HIGH (the output of each stage assuming its input is increased by DELTA), LOW (the result assuming it is decreased by the same amount), and EVEN (the result assuming no change), and selects among them until the outputs change by less than EPSILON.]

PROGRAM FOR EXAMPLE 6.4, THE LIPSCHITZ METHOD.

NOTE: THE OUTPUT FROM THIS PROGRAM HAS BEEN EDITED BY HAND TO IMPROVE READABILITY.

F[X,A] = 0.5*SIN(2*X + A)    MAX DERIVATIVE = 1.000    INITIAL VALUE = 0.000

F REDUCES THE VECTOR A: "F/Z,A"

[The 100-element input vector A is printed here in columns; its values are garbled in this scan.]
[The true answers, computed sequentially, and the OUTPUT vector at the 20th iteration are printed here in the same column format. The figures are too garbled in this scan to reproduce, but by the 20th iteration the outputs agree with the true answers in their leading significant figures. The run ends:]

DID CONVERGE AFTER 26 ITERATIONS WITH RELATIVE ERROR 1.0000000E-05

[The final OUTPUT vector, equal to the true answers to five significant figures, is printed last.]
THE NUMBER OF ITERATIONS EXPECTED FOR THIS ALGORITHM IS: 3 1/3 TIMES 5, TO GET THE FIVE SIGNIFICANT FIGURES OF ACCURACY, = 17 ITERATIONS, PLUS 5 ITERATIONS FOR THE MAXIMUM VALUE OF DELTA[J] TO BECOME SMALL ENOUGH. THE ACTUAL NUMBER OF ITERATIONS REQUIRED IS 26.

6.5 CONVERGENCE OF THE LIPSCHITZ METHOD. Two questions need to be answered in this respect: first, under what conditions does the method converge, and second, how quickly does it converge? An examination of the algorithm in Section 6.4 reveals that the partial results of f/z,A propagate through the calculation; that is, after the ith iteration the ith partial result is known exactly. Referring to the variables used in the program, the trial input EVEN[I] will be the one chosen when the algorithm converges. Since EVEN[I] is computed using OUTPUT[I-1], it is certain that after k iterations EVEN[k] will be exact. Thus the algorithm converges to the sequential results under all conditions.

The number of iterations in which the algorithm will converge can be calculated by examining the initial values δ_j. If the maximum tolerable error is E_j, then the jth stage will continue log2(δ_j/E_j) iterations to convergence. It is clear that a large Lipschitz bound M will cause the δ_j to be very large and the convergence therefore slow. This issue will be taken up in the next chapter. In the example cited in Section 6.4 the δ_j are approximately j/3; thus the number of iterations expected for different vector lengths should be of the order of log2(j/3) + 25. This is borne out empirically:

    Size of A    Number of Iterations
       100               26
       500               30
      1000               32
      2000               33
     10000               36

6.6 Functions of more than one memory variable can also be handled with this technique. Two distinct approaches are possible.
For example, if f(x, y, a) is a function which yields two results, then the iterative process can proceed as follows.

1) Evaluate

    f(x_i, y_i, a_i)
    f(x_i + δ, y_i, a_i)
    f(x_i - δ, y_i, a_i)

and use the usual selection process to choose which of these three values is most appropriate at each stage;

2) repeat the process for the parameter y.

Or the two steps may be performed at once by evaluating the nine values f(x_i ± δ, y_i ± ε, a_i) and again choosing the best of these values using the straightforward selection process. The first of these techniques would appear to be the more economical in most instances.

Calculation of an initial sequence follows the style of single-memory-variable functions. If

    |df/dx| <= M    and    |df/dy| <= N

then f(x, y, a) lies in f(0, 0, a) ± M|x| ± N|y|, so that

    g[x,y,a] = abs(f(0, 0, a)) + Mx + Ny

predicts the initial sequence when applied to g/abs(z_1),abs(z_2),A.

CHAPTER 7

LIMITATIONS TO THE LIPSCHITZ METHOD

7.1 The initial values for the δ_j in Section 6.3 crucially affect the efficiency with which the parallel algorithm will work. If the Lipschitz bound M is greater than 1, then these numbers will be of order δ_j = O(M^j), which may be absurdly large for large j. However, it may not be possible to choose smaller values for the δ_j, since any change δ_j at stage j could produce a change Mδ_j at stage j+1; thus δ_(j+1) must be at least Mδ_j. For example, for

    f[x,a] = 1/2 Sin(3x + a),    M = 3/2,

and an extraneous input vector of length 1000,

    δ_1000 = (3/2)^1000 ≈ 2^580 .

Since the Sin function is never greater than 1, it is absurd to start looking at the 580th bit. An algorithm which should converge more quickly is described in this chapter.

First the δ_j are chosen as the maximum value of f. The algorithm proceeds as described in Chapter 6; however, if in the execution of the algorithm it happens that a stage k is converging to a point x_0 at which |df/dx| > 1, then δ_(k+1), δ_(k+2), ... are multiplied by M. The net effect of this is to delay the convergence of stages k+1, k+2, ...
for log2(M) iterations, after which the changes to stage k will have settled down.

7.2 EXAMPLE. The following example is based on the function f[x,a] = 3x^2 + a, subject to the arbitrary restriction that it not stray outside the range [-1,+1]. Thus

    f[x,a] = max(-1, min(1, 3x^2 + a))

and the extraneous input vector A is uniformly distributed in the range [-3,0]. The algorithm is essentially the same as Example 6.4, with two exceptions:

1) The initial δ_j are chosen to be the maximum value of f[x,a]: δ_j = 1.

2) In the loop which computes trial values for HIGH, LOW, and EVEN, a special test is made to see if the expected change at stage i is greater than the change being computed at stage i+1. If this is so, δ_(i+1), δ_(i+2), ... are multiplied by M, the maximum derivative of f with respect to x. Here

    df/dx = 6x ;

the maximum value of |df/dx| occurs at 3x^2 + a = 1 with a = -3, i.e. x = sqrt(4/3), so

    M = |df/dx| = 6*sqrt(4/3) ≈ 7 .

7.4 CONVERGENCE OF THE MODIFIED METHOD is guaranteed for the same reasons that were advanced in Section 6.5. The number of iterations that can be expected is easily found if the probability that |df/dx| > 1 is known. If this probability is p and the length of the extraneous input is n, then the last δ_j should be multiplied np times, which costs np*log2(M) iterations. Thus, to produce answers accurate to E will require log2(I/E) + np*log2(M) iterations, where I is the initial value assigned to the δ_j.
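The corrective action just described — re-widening the downstream δ's by a factor of M whenever a stage changes late — can be added to the basic method. Again this is a hedged modern Python sketch with hypothetical names (`modified_lipschitz`, `fmax`), not the thesis's listing; the trigger used here (the trial spread at a stage exceeding that stage's own radius) is a simplification of the listing's comparison against DELTA[I+1].

```python
def modified_lipschitz(f, x0, a, M, fmax, iters):
    """Lipschitz method with the Chapter 7 delta rescaling: initial
    radii start at fmax, the maximum value of |f|, and late changes
    re-widen the downstream radii, in no case past fmax."""
    n = len(a)
    delta = [float(fmax)] * (n + 1)
    X = [x0] + [0.0] * n                 # X[0] is exact
    for _ in range(iters):
        trials = [(f(X[j - 1] - delta[j - 1], a[j - 1]),
                   f(X[j - 1], a[j - 1]),
                   f(X[j - 1] + delta[j - 1], a[j - 1]))
                  for j in range(1, n + 1)]          # parallel phase
        for j in range(1, n + 1):
            lo, ev, hi = trials[j - 1]
            if max(abs(hi - ev), abs(lo - ev)) > delta[j]:
                # Late change observed: take corrective action by
                # multiplying this and each subsequent radius by M,
                # capped at the maximum value of the function.
                for k in range(j, n + 1):
                    if delta[k] >= fmax:
                        break
                    delta[k] = min(fmax, delta[k] * M)
            X[j] = min(trials[j - 1], key=lambda t: abs(t - X[j]))
        delta = [d / 2 for d in delta]
    return X[1:]

# Example 7.2's function: f[x,a] = 3x^2 + a clamped to [-1,+1], with M = 7.
f = lambda x, t: max(-1.0, min(1.0, 3 * x * x + t))
result = modified_lipschitz(f, 0.5, [-1.0, -2.0, -0.5, -3.0],
                            M=7.0, fmax=1.0, iters=100)
```

Per the estimate of Section 7.4, accuracy E from initial radius I should cost about log2(I/E) + n*p*log2(M) iterations, p being the probability that |df/dx| exceeds 1 at a stage.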
[The thesis's own Burroughs ALGOL listing, captioned PROGRAM FOR EXAMPLE 7.2, appears here; it differs from the program of Example 6.4 only in the initial DELTA values and in the corrective test, which multiplies each DELTA[J] subsequent to a late-changing stage by the maximum derivative, in no case letting it grow past the maximum value of the function. It is followed by snapshots of the OUTPUT vector at iterations 10, 20, 30, and 40. The numeric columns are too garbled in this scan to reproduce; many partial results sit at the clamped values ±1.000000, and successive snapshots agree in more and more of their figures as the outputs settle toward the true sequential answers.]
Conclusion

The techniques discussed in this thesis demonstrate a parallel technique of wide application to functions whose derivative with respect to each memory variable is comparatively small. The ultimate case of an unpredictable function is one in which the range of values is discrete. These functions are essentially unpredictable using the techniques discussed here, and one can offer convincing arguments (Chapter 4) that no parallel evaluation is possible. Such discrete functions are characterized by an infinite derivative at many points.
As the constraint on the derivative becomes more stringent, the class of functions admits of more and more accurate parallel evaluation.

VITA

Harold Robert George Trout. Born 18th December 1945. B.Sc. in Mathematics, Victoria University of Wellington, New Zealand, 1966. B.Sc. with 1st class honours in Mathematics, Victoria University, 1967. M.S. in Computer Science, University of Illinois, 1969. 1970 - present: vice-president, TMM Programming, a hardware, software, and consulting company. Mr. Trout is the author of 'A Compiler Generating System', 21st ACM National Conference, 1967.