mora rmSD BQDSIhISi MHraflBI $ H9BI1 mm Hi Hi 91 Hi mSBSsSM RMfRfiliMm? wHHIHBIDIlttin ii RsHllfiul LIBRARY OF THE UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN 510.84 I-fltor --no. 629-534 co/o. 2. CENTRAL CIRCULATION AND BOOKSTACKS The person borrowing this material is re- sponsible for its renewal or return before the Latest Date stamped below. You may be charged a minimum fee of $75.00 for each non-returned or lost item. Theft, mutilation, or defacement of library material* (an be causes for student disciplinary action. All materials owned by the University of Illinois Library are the property of the State of Illinois and are protected by Article 16B of Illinois Criminal Law and Procedure. TO RENEW, CALL (217) 333-8400. University of Illinois Library at Urbana-Champaign "JUL 2 6 20Q1 APR 5 2001 MAY 2 6 2002 When renewing by phone, write new due date below previous due date. L162 7U.S33 Or*. ^ UIUCDCS-R-72-533 yx^^tr COO-2118-0038 A STUDY OF VISUAL SHAPE PERCEPTION by Kiyoshi Maruyama «««i8i8r eejfce October 1972 1 UIUCDCS-R-72-533 A STUDY OF VISUAL SHAPE PERCEPTION by Kiyoshi Maruyama October 1972 DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF ILLINOIS URBANA, ILLINOIS 6l801 Supported in part by the Department of Computer Science and the Atomic Energy Commission under contract US AEC AT( 11-1)2118 and submitted in partial fulfillment of the requirements of the Graduate College for the degree of Doctor of Philosophy in Computer Science. Digitized by the Internet Archive in 2013 http://archive.org/details/studyofvisualsha533maru A STUDY OF VISUAL SHAPE PERCEPTION Kiyoshi Maruyama, Ph.D. Department of Computer Science University of Illinois at Urbana-Champaign, 1972 With the use of computers , this paper develops and studies algorithms involved in visual shape perception. The main interest is to solve the naming problem, i.e., the strategies by which map-like information is transferred to objects of the scene under analysis. The naming problem is specified by two input pictures: a sample picture and a text picture. The problem is to identify and name objects in the given sample picture using name information in the text picture. Here the most significant and necessary sub-problem to be solved is shape identification. Once possible individual shape identi- fication has been established, the next step is to consider the structural matchings between the given sample and text pictures. Introducing the concept of angularly simple curves and the decomposing of a complex region into simpler classes of regions , we give basic shape representations: polygonal, pattern sequence, and skeleton representations. From these representations both global and local features are extracted to channel to a structural analysis of complex pictures. An approach to the solution of shape matching problems — a procedure to determine an odd shape or form in a given set of shapes — is described. The "oddness" of a shape in a set of shape is defined on the basis of many detected features. The performance of the system is shown by comparing the system response to the average human response . It is shown how form perception in psychology can be modeled and studied by computers . To attain acceptable structural matchings between the sample and text pictures, graph representation of pictures is considered as a possible methodology. Relation graphs and approaches in finding relations are discussed; some basic graph transformations to achieve possible matchings are given. Connected to the graphical representation of pictures, sequential descriptions of complex figures are developed, and problems of shape extractions are discussed. Finally, a model is suggested for a solution of the naming problem by referring approaches and algorithms which are described in this work. Ill ACKNOWLEDGMENT The author expresses his deepest gratitude to his thesis advisor, Professor Bruce H. McCormick of the Department of Computer Science, University of Illinois, whose constant aid, encouragement, and ready availability as a source of dynamic ideas have led this work to its successful completion. The author also expresses his sincere appreciation for his former thesis advisor, Professor Jurg Nievergelt, presently on sabbatical leave, for his helpful discussion and encouragement. Special thanks should go to Mrs. Patricia C. Welch and Mrs. Barbara Armstrong, who typed the manuscript. Thanks are also extended to my wife, Michiko, for her patience. IV TABLE OF CONTENTS Page 1. INTRODUCTION 1 2. TOPOLOGICAL DISCUSSION 8 2 . 1 Basic Topology 8 2.2 Simple Closed Regions 10 2.2.1 Convex Polygon 10 2.2.2 Angularly Simple Region 10 2.2.3 Determination of Feasibility 13 2.2.U K-Convex Decomposition 18 2.2.5 K-Angularly Simple Decomposition 20 2.3 Generation of Polygonal Curves 2k 2.3.1 Boundary Detection 2k 2.3.2 Contour Smoothing 28 3. SHAPE REPRESENTATION 3l+ 3.1 Polygonal Representation 3I+ 3.2 Pattern Sequence Representation 39 3.2.1 Definition of Pattern Sequences 39 3.2.2 Generation of Pattern Sequences kk 3.2.3 Operations on Pattern Sequences 52 3.2.1+ Fourier Series 53 3.3 Generalization of Sequence Representation $k 3.3.1 Bi-Pattern Sequences 5I+ 3.3.2 Many Pattern Sequences 56 3.3.3 Interval Pattern Sequences 60 3-3.1+ Contour-Following Pattern Sequence 60 3.3.5 Convex-Concave Abstraction 63 V Page 3.U Skeleton of Shapes 66 3.1|.l Skeletonization 67 3.1*. 2 ^eck/Tail Decomposition 68 3.5 Characteristic Points of the Representation "J6 3.5 »1 Preliminary Discussion 76 3.5-2 Determination of Characteristic Point 77 3.5.3 Properties of Characteristic Points 79 3.6 Intercomparison of Shape Representations 8U k. FEATURE EXTRACTION 86 k.l Introduction 86 k.2 Description of Specified Features 87 1*.3 Feature Dependence on Encoding Shapes 93 k.k Extraction of Local Features 102 i+.U.l Preliminary Discussion 102 U.4.2 Determination of Local Lobes 10l+ 1+.U.3 Determination of Local Cavities 106 U.H.U Determination of Local Roughness 107 5. DETECTION OF ODD (SIMILAR) SHAPES 112 5.1 Basic Definitions 112 5.2 A System to Detect Odd Form 118 5.2.1 Analysis 118 5.2.2 Attention and Weight Modification 120 5.2.3 Decision and Selection 122 5.3 Experimental Results 123 5.3.1 Human Response 123 5-3.2 Attention Function 123 Page 5.3.3 System Response 125 5.3.^ Summary 130 6. GRAPHS AND GRAPH TRANSFORMATION 132 6.1 Graph Representation 132 6.2 Relation Graphs 135 6.2.1 Containment Relation 138 6.2.2 Neighborhood Relations 1^2 6.3 Graph Transformations lU6 6.3.1 SPLIT Transformation lk6 6.3.2 MERGE Transformation lU8 6.3.3 ELIM and INST Transformations 150 7. COMPLEX FIGURE DESCRIPTION 151 7.1 Extraction of Embedded Complex Shapes 151 7.1.1 Preliminary Discussion 151 7.1.2 Dissimilarity and Clustering Function 152 7.1.3 Pairing Function 153 7.1.^ Experimental Result 156 7.2 Extraction of Embedded Figures 160 7-2.1 Preliminary Discussion l60 7.2.2 Matching Between Nodes l62 7.3 Extraction of "Good" Figures 167 7.3.1 Introduction 167 7.3.2 Goodness of Figures l68 7.3.3 Illustrative Approach to Decomposition 172 VI 1 Page 8. SUGGESTED MODEL FOR THE NAMING PROBLEM 178 8.1 Introduction 178 8.2 A Model Description 180 8.3 Potential Application l8U 9. SUMMARY AND CONCLUSION I87. 9.1 Discussion of Results 187 9.2 Suggestions for Further Research 188 BIBLIOGRAPHY 189 APPENDIX A. LIST OF SHAPE DESCRIPTION SUBROUTINES 193 B. TWENTY SETS OF DATA STIMULI 202 VITA . 208 1. INTRODUCTION Stimulated by the publication of Wertheimer's "Principles of Perceptu- al Organization" in 1923, many psychologists have intensively studied human perception of shape and form. A good survey paper of these studies is found in Hake (1957); and a recent book, Visual Perception of Form by Zusne (1970), is an excellent summary and contains an extensive bibliography. A 1966 collection of papers edited by Leonard Uhr deals with various problems of pattern recog- nition by computers. However, many of the more crucial problems such as how humans encode shape information, detect near similarity or dissimilarity of shapes, focus attention upon particular aspects of shapes, and achieve percep- tual invariance are not yet adequately understood. A basic weakness of shape/form recognition by present computer systems may stem from the observation that shape/form recognition by humans has not yet been solved on the psychological level. Zusne (1970) says: In the history of modeling pattern perception on the computer, the task is really one of modeling discrimination rather than recog- nition. Recognition in a computer is recognition only in a trivial sense. Recognition by organisms implies previous experience, and with experience there accrue meanings to patterns that are uncorre- lated in any important way with the physical dimensions of patterns. Whether a machine will ever be able to cope with this problem is uncertain. (page Qk) The objective of this thesis are to investigate some intrinsic proper- ties in shapes and to provide some good ways to describe objects in such a way that the high level analysis of pictures, especially the "Naming Problem," can be done easily. Furthermore, we hope that this study provides the means to con- struct a computer system in which the system responses are demonstrably similar to an average human response on a specified task and to provide clues for ex- ploring some unsolved problems of shape perception in psychology. Let us first investigate the problems involved with defining shape and form. Webster's Third New International Dictionary (l96l) defines shape as "the visual make-up characteristic of a particular item or kind of item: characteristic appearance or visual form, or spatial form or contour tnat is used." The term "form" is somewhat more nebulous — it is used by different people to mean different things and by the same person to mean different things on different occasions. Shape, figure, structure, pattern, order, arrangement, configuration, plan, outline, and contour are similar terms without distinct meanings. This indefinite terminology is a source of confusion, and a more rigorous terminology may be very much needed. The Webster's Dictionary defines form as "the shape and structure of something as distinguished from the mater- ial of which it is composed," and it defines gestalt as "a structure or config- uration of physical, biological, or psychological phenomena." A number of explicit definitions of visual form have been given by Gibson (1951); they were proposed out of a conviction that "psychologists should come down to earth to say exactly what they mean when they talk about 2 3 form." According to his proposed definitions, solid forms and surface forms are realities. However, there is no such thing as form-in-general with the universal characteristics ascribed to it by gestalt theorists. The only kinds of visual form we shall undertake to deal with hereafter are those associated with or derived from physical objects. 1. task: detection, discrimination, recognition, identification, judgment (Hake 1957). 2. solid form: the closed physical surface enveloping a substance of some kind. The surface may be curved or it may be composed of adjoining flat surfaces with edges. 3. surface form: a flat physical surface with its edges. Let us describe briefly what kind of experimental studies have been made by psychologists on human shape perception. First, they have tried to define what is the quality of "goodness" in a shape or form. If a line forms a closed or almost closed figure, we may no longer see merely a line on a homo- geneous background, but a surface figure bounded by the line. It seems that a straight line is a more stable structure than a broken one and that, therefore, organization will occur in such a way that a straight line will be perceptually continued as a straight line (this idea was implemented by Guzman [1968] in his "Decomposition of Scenes Into Bodies"). Koffka (1935) generalizes this as: "Any curve will proceed in its own natural way; a circle as a circle, an ellipse as an ellipse, and so forth." Attneave (195M asserts that many of the gestalt principles of perceptual organization pertain essentially to information distri- k bution and the good gestalt is a figure with some high degree of internal redundancy. It may be that an important criteria for "goodness" of form, how- ever, is the relative importance of simplicity and complexity — a "good" config- uration being simple. The theory is that our perceptual and memory processes tend toward simplification of forms. To get a "good" figure from a given figure, Mowatt (19^0 ) obtained the statistical result that the large majority, about 9h% , of all changes to a given figure consist in the addition of lines or dots to the figure. He also shows that open figures are changed much more frequently than closed figures , and he concludes that closure and symmetry are considered "good" properties of a figure. Another way to determine goodness of shapes is to determine some k. It was demonstrated by Attneave (195M that information is concentrated at points where there is a change in an otherwise continuous gradient; at the contours of a form which mark the change from ground to figure; and at any inflection along the contour where the direction of the contour changes most rapidly. kind of threshold function (Bobbitt, 19^2). For example, the threshold may be stated in terms of the percentage of the total perimeter of the figure present at the point in the series representing the transition between the forms having the quality of twoness and those having the quality of oneness. In 1955 Deutsch observed that "the perceived form or shape of patterns must be invariant with changes in location and orientation of the patterns, and even minor changes in shape itself." However, recognition of forms and compar- ative judgment of forms deteriorates when forms are rotated with respect to the observer. In fact, there is considerable evidence to indicate that orientation plays a part in the definition of what we mean by the form of an object (Dod- well, 1961). Thus it may be so that the form of an object may not be invariant with rotation of the object and is partly defined in terms of orientation characteristics (see Zusne). Some of the stimuli, or target shapes, which have been used by psychol- ogists to study the problem of human form perception are: Dot Patterns (French, 195^), Metric Figures (Fitts and Leonard, 1957), Random Polygons (Attneave and Arnoult, 1956), and Computer-Generated Luminous Polygons (Poli- dora and Thompson, I96U). Because of the simplicity of generation by computers, polygons are used quite frequently in psychological experiments. While it is true that a curved shape may be approximated with straight line segments with- out appreciable loss of information, some information is irretrievably lost, depending on how crude or fine the approximation is. How such approximation affects perceived complexity is not known. The methods of polygonal shape representation which have been studied are mostly independent of shape orientation (see Deutsch, 1962, and Attneave and Arnoult, 1956). However, it may be that the perceptual independence of shape orientation may not play as significant a role for form perception in psychology. For the analysis of shapes and forms, a large number of physical characteristics and form invariances have been suggested and studied. Some of these are: compactness, jaggedness, skewness, and dominant axis. There have also been many other studies: theories of visual form perception, perceptual tasks, memory for form, form perception by animals, application, etc., have been done and are being continued by thousands of psychologists (see Zusne, 1970). However, as we stated earlier in this chap- ter, knowledge of how humans perceive shape and form on a psychological level is still exceedingly primitive. On the other hand, many engineers and mathematicians have been working on various shape recognition problems, from low-level image analysis to high-level semantic analysis, by constructing mathematical models or by computer simulations. The general and attractive summary of the study of picture processing can be seen in Rosenfeld (1969) • The most recent study by Schwebel (1972) surveys unstructured models , linguistic models, explicit graph representations of pictures, graph theoretic clustering, graph trans- formations and grammatical formulations, and graph structure processing languages. His own work is primarily concerned with the theoretical study of global-level picture processing using a graph structure representation. The plan of this thesis is as follows: In Chapter 2 the discussion of some basic definitions and properties in Topology to provide insight adequate for the following chapters is given. Elementary classes of curves — including the concept of angularly simple curves, and decomposition of a complex region into simpler classes of regions — are discussed. An approach for the detection of smooth curves from a given black and white image picture is also contained. Chapter 3 contains an extensive discussion of various types of shape representations. We introduce three basic shape representations: polygonal, pattern sequence, and skeleton representations. Pattern sequence representation is generalized to cover almost unlimited classes of shapes. Some characteristics of representations and comparison between rej:>resentations is also given. In Chapter k a discussion of feature extractions from those three basic shape representations discussed in Chapter 3 is given. Nineteen selected features are described, and feature dependence on encoding parameters is discussed. Some of the local features which should be extracted for shape perception problems are also discussed. Chapter 5 is provided to consider a shape perception problem as an application of concepts and approaches which are discussed in earlier chapters . A procedure to determine an odd shape or form in a given set of shapes, each set consisting of more than two shapes, is described. The "oddness" of a shape in a set of shapes is defined on the basis of a specified feature, and then the "most odd" shape on the basis of many detected features is defined. The performance of the system is shown by comparing the system response with the average human response. To make the system response resemble an average human response, modification of an assumed weight vector by means of an "attention mechanism" is considered. To attain acceptable solutions for a given recognition task, it is necessary to develop a graph structure model which makes it easy to access low-level as well as high-level picture information. In Chapter 6 we introduce graphical representation of pictures. Relation graphs are introduced to represent the structure of a given picture, and algorithms to generate relations as well as necessary graph structure transformations are also discussed. In Chapter 7 some problems in shape perception are introduced: extraction of complex shapes or figures from another picture, extraction of "good" figures or good decomposition. A description and an approach for a possible solution is given for each problem. Chapter 8 is provided to introduce a central problem in shape perception: the naming problem. As the summary of this work, a model is suggested for a solution of this problem by referencing approaches and algorithms which have been introduced in earlier chapters. Some potential application of the model is also discussed. Chapter 9 contains conclusions and suggestions for further studies on shape perception. In Appendix A some shape description subroutines are listed. The illustration of 20 sets of data stimuli which are used for the experiment of Chapter 5 appears in Appendix B. All algorithms described in this thesis have been written in PL/I language and implemented on an IBM 360/75 at the University of Illinois, Urban a -Champaign . 2. TOPOLOGICAL DISCUSSION In this chapter we discuss some basic definitions and properties in Topology to provide insight adequate for the following chapters. In Section 2.1 some basic concepts of curves are introduced. In Section 2.2 elementary classes of curves, including the concept of angularly simple curves, are examined, and decomposition of a complex region into simpler classes of regions will be dis- cussed. An approach for the detection of smooth contours from a given digitized image is also discussed in Section 2.3 2.1 Basic Topology Let us introduce some definitions and properties in elementary topology, where the proof of each property listed can be found in Cairns (1961) and Tondeur (1970 ). A curve y is regular if y is twice continuously differentiable map 2 y: I -* R with condition y("t) ^ f° r t e I. A curve y is simple , y: I = [a,b) -HR , if y(t) ^ y(t') for t, t' e I, which means a simple curve has no self-intersection. A simple curve y is called a piece-wise polynomial curve if it consists of a finite number of polynomial line segments. A simple curve y is called a polygonal curve if it consists of a finite number of straight-line segments. 2 The rotation index of a closed curve y: [a,b) ■> R with respect to a point x interior to the curve is the integer, W (y,x ) = 27/ e(s) dS = 27 (e(b) " e(a)) > where 6: I ■*■ R is an angle function. If the absolute value of the rotation index of a given closed curve is different from 1, then it must have self- intersections . From the above definitions, we have the following properties: i) A regular simple closed curve is convex if and only if the curvature function is non-negative or non-positive. ii) A regular closed curve with rotation index 1 (-l) and non-negative (non-positive) curvature is convex. Let S be a topological space, and consider the product space S x T, where T is the unit segment [0,l], Let ft also be a topological space, and let f: S x T -* ft be a continuous mapping of S x T into ft. The points of S x T are representable in the form (p,t), where (p,0) e S and t e T. We will denote with f : S -> ft (t e T) the mapping of S into ft defined by f (p) = f(p,t), (p e S, t e T). Then f is a continuous one-parameter family of continuous mappings (0 _< t _< l). It will be called a (homotopic) deformation of f into f . It is helpful, intuitively, to regard t as the time and to think of the image of a point p e S as "moving" from f n (p) to f (p) through the positions f (p) as t increases from to 1. Its deformation path is a continuous curve, defined by the restriction of f to the segment p x T. A mapping f : S -*■ ft is homotopic to a mapping f : S -> ft if it is possible to define a deformation of f into f . This relation will be symbol- ized by f = f . iii) Let y , y : I -*■ R - {x} be closed curves, then y n » Y-i are homotopic in R - {x} if and only if W(y n , x) = W(y-, , x). iv) The relationship - is an equivalence relation. As a consequence of the property iv) , the continuous mappings S -*- ft fall into homotopy classes, generally called merely classes of mappings. v) The space is simply connected if and only if each closed curve on S can be shrunk to be a point (or can be spanned). 10 2.2 Simple Closed Regions In this section we introduce several classes of simple closed regions: convex polygons, angularly simple polygons, and K-convex/K-angularly simple polygons. The Feasibility Problem involved in angular simplicity, as well as the decomposition of a complex region into K-convex or K-angularity simple polygons, is also discussed. 2.2.1 Convex Polygon The region bounded by a closed curve y is denoted by ft . A convex region ft is called a convex polygon if y is a closed polygonal curve. A class of convex polygons may be considered the simplest class of polygons, where the definition of convexity has been given in the previous section. However, we shall introduce another definition of convexity in terms of feasibility which will be discussed extensively in the following section. A point x in a closed region ft is called an angularly simple point if at x the curve y is totally visible. The collection of angularly simple points of a region ft is called the feasible set of ft and is denoted by X . A Y Y Y closed curve ft is convex if its feasible set is ft itself. In other words, ft Y Y Y is convex if at any point x in ft , y is totally visible. 2.2.2 Angularly Simple Region A region ft or its closed curve y is said to be angularly simple if its feasible set X is non-empty, and examples of angularly simple and not angularly simple regions are exemplified in Figure 2.1. Lemma 1: Let y be an angularly simple curve. Let us also assume that points x n and x' are angularly simple points of y» where x 4 x'. Then y is angularly 11 (a) An Angularly Simple Curve (t>) A Curve That Is Not Angularly- Simple (c) An Angularly Simple Polygon Figure 2.1. Angular Simplicity of Regions 12 simple at x*!, where xjj = vx Q + (1-w) x^, 0_ w . x . , where i=l m ^w. =1, w. > 0. i=l Theorem 1 : Let y be an angularly simple curve. Then the set of points which defines y to be angularly simple is a convex set. Proof : It is clear from Lemma 1. (Q.E.D.) Theorem 2 : Let y., i=l, 2, ..., n, be angularly simple curves whose regions and feasible sets are denoted by Q , X , and i=l, 2, ..., n, respectively. If 'i i n n X= p| X 4 0, then the perimeter of [J Si = Q , i=l Y i i=l Y i Y n which may be denoted by y = v y . , is angularly simple at any point in X. i=l X 13 Proof: By the assumption that each Y-» i = l» •••» n, is an angularly simple curve, then each y. is totally visible at any point in X . Thus at any point in X, Y- is totally visible for i=l, • ••» n, which implies the curve Y of ft is angularly simple at any point inside X. (Q.E.D.) n In the theorem, the condition ' I X 4- is a sufficient condi- 1-1 Y i tion for the curvature y to be angularly simple and clearly not a necessary condition. Corollary 1 : Let y., i-l» 2, ..., n, be angularly simple curves and let ft and X be the region and the feasible set of y. for i=l, 2, ..., n, i i x n n respectively. If y s \/ y. denotes the perimeter of \J ft , then the feasible i=l X i=l Y i n set X of y contains (~\ X Y ' Y 1 i=l Y i Proof : It is clear from Theorem 2. (Q.E.D.) 2.2.3 Determination of Feasibility It is clear that any convex region may be described by a finite number of line inequalities. We will extend this idea for the detection of feasibility. Let us assume that a simple closed curve y is angularly simple at a point x . Then the feasible set X of y can be denoted by the following set of inequalities Y(t) ^0 if the value of -f(t) at x is >_ 0, Y(t) <_ if the value of y(t) at x is <0, for t e I, where i(t) denotes the tangent line at Y(t). lU Before we extend our discussion on the general determination of feasi- bility, let us first introduce simple representations of a polygonal curve and a polygon — the extended discussion of alternative shape representations will be defined in the following chapter. A closed polygonal curve y which consists of 8 line segments, e. , e , ..., e„, is illustrated in Figure 2.2(a), where g. = denotes the line equality which corresponds to the edge e. for i=l, 2, ..., 8. One might use a sequence of coordinates to denote such a curve; however, if we order these line equalities in clockwise direction (or in counter-clockwise direction) , then such an ordered set of equalities clearly represents a unique polygonal curve, where each basic coordinate can be found as the intersection point of two adjacent equalities. For example, the curve in Figure 2.2(a) is denoted by G Y E {g l = ° 9 S 2 = °» ••*' S 8 = ° } 5 { gi = for i=l, 2, ..., 8}, and the basic point p , for example, is the intersection point of g =0 and g = 0. To denote the region Q , it is merely necessary to use inequalities instead of equalities. Without loss of generality, the right-hand side (left- hand side) of a line g. = in clockwise (counter-clockwise) direction may be denoted by an inequality g. >_ 0. Thus the polygon of Figure 2.2(b) can be denoted by the following ordered set of inequalities : Q = {g. > for i=l, .. . , 8}. Y i — Therefore, in general we have the following: A polygonal curve y is expressed by an ordered set G of equalities: G = {g. = for i=l, 2, ..., n} Y l where the basic coordinate p. is the intersection point between g. = and g. = 0, where j = i+1 (modn). Similarly, the corresponding polygon is J i 15 g 3 =o (a) A Representation of a Polygonal Curve g 3 ^o (b) A Representation of a Polygon Figure 2.2. Polygonal Curve and Polygon 16 expressed by an ordered set ft of inequalities: ft = {g. > for i=l, 2, ..., n} Y I — and g. >_ is applied only between p. and p. . Theorem 3 : Let y be a polygonal curve whose region is denoted by ft 5 {g. . for i=l, ..., n}. Then y as well as ft is angularly simple if and only if the set of inequalities g. >_ 0, for i=l, ..., n is feasible. Proof : Let us assume that y is angularly simple. Then there exists a point x~ inside ft such that the line segment between x_ and an intersection point of y any adjacent equalities has no intersection with any line segments of y. Thus g.(x ) >_ for all i, which implies that x is a solution for the given set of inequalities (feasibility of the set of inequalities). Let us assume {g. >_ for i=l, 2, ..., n} is feasible. Then there exists at least one point x such that g. (x ) >_ holds for all i. We consider the triangle of x , p. , p. in Figure 2.3(a) and show that neither line seg- ment p. , , x_ nor p. , x^ has an intersection with g. = for j 4 i-1, i, i+1. i-l' *i' J Assume g. = crosses the line segment p., x . Since the given curve is simple, the re exists another g = , k 4 j , i-l, i, i+1, such that g = intersects the line segment p., x . Since g. > is assumed to hold on the right-hand side of the line g. = 0, g.(x ) _> does not hold (see Figure 2.3[b]), which contradicts J J u the assumption that g. (x ) >_ holds for all i. (Q.E.D.) IT (a) (b) Figure 2.3. Illustration for the Proof of Theorem 3 18 2.2. k K-Convex Decomposition It is well known that any polygonal region can be represented by the union of a finite number of convex polygonal regions. We will consider the following decomposition of a polygonal region into unions of so-called "maximal" convex polygonal regions. Here we use the representation of curves and regions which were introduced in the previous section. m ft = [J ft is a convex maximal decomposition of ft if and only if for Y j=l Y j Y each convex ft , ft eft and ft cfc I J ft , and each ft is maximal. Y j Y j " Y Y J ~ *j Y k h Where G = {g. > for i=l, 2, .... n} and Y i — G = {g„ > for 1=1 9 2, . .., n.}, G cz G for all j , y- & - J y. - Y J J and m is called the convex complexity of ft . Y The above maximal decomposition is not unique for a given polygonal region, and the value m implies that at least m number of convex regions are required to cover the given decomposed polygonal region. In other words, let m ft , j=l s ..., m be convex polygons, and let ft = \J ft be connected. Then J j =1 J the complexity of ft is at most m. Let us consider all possible convex maximal decompositions of ft , and let the collection of all convex regions be {ft , j=l, 2, 3, ..., n). We J assign a cost C. for each convex cover ft , e.g., the number of inequalities to 3 Y j denote ft which is denoted by G : and we want to minimize the total cost to Y. Y. " J J cover ft . Y To solve this problem, let x., j=l, ..., n be boolean variables (i.e., J O-l) such that x.=l means the ,i-th convex region ft is chosen for covering ft j y. y 19 and x.=0 means it is not. Then we have the following Integer Programming formu- J n lation for the cost minimization, where YJ x. = K. j=l J n Minimize / . x.C . under the constraints of Y (x¥l) J Y (x. = l) Y j J J x. = or 1 for j=l, 2, ..., n. J Theorem h : Let ft "be a polygonal region having a convex maximal decomposition m m of [J ft Then ft is angularly simple if and only if f) ft ^ $. j=l Y j Y j=l Y j m Proof : Let us assume f~] ft ^ <£>, Then, since each ft is convex, any point j=l Y J Y j m x in P| ft implies that ft Y is angularly simple. 0=1 Y j m m Let us assume N ft is angularly simple; then G = ^J G is feasible, j=l Y j Y j=l Y j Since each G is a subset of G , the feasible region of G , i.e., ft , contains Y v Y Y m m the feasible region of G , I.e., X = Oft . Since X 4 4» (\ ft 7^ $. Y Y >i y j Y j-i y j (Q.E.D.) 20 2.2.5 K-Angularly Simple Decomposition In previous sections, we restricted our class of polygons to be angularly simple, which does not cover the general class of polygons. A K- angularly simple polygon is a polygon such that K distinct points, x. , i=l, ..., K, are necessary to cover ft with ft , i=l, 2, ..., K, which are Y Yj_ angularly simple at x. . In other words, for each basic point p . of ft , there i J Y exists at least one point x. (we call such a point "AS-point") such that x.p. l * i*j does not intersect any edges of ft . Simple examples of K-angularly simple poly- gons are illustrated in Figure 2.U. To determine whether a given polygonal region ft (= G ) is K-angularly simple or not, the following definition might be helpful. A set S is said to be mutually exclusive if and only if s . f\ s . = $ for all s . , s . e S , and j S | denotes the cardinality of S. m Theorem 5 • Let a convex maximal decomposition of ft be [J ft , and let S be Y 3=1 Y j the mutually exclusive set of {ft , j=l, ..., m} such that |s| = K is the maxi- Y j mum. Then ft is K-angularly simple. Proof : Let the maximum mutually exclusive set S be 1 i k K By the definition of angular simplicity, only interior points of ft. are AS-points of ft. . In other words, at any point xl in B, , k ^ i, x, does not imply that ft. is angularly simple because ft. H ft, = $• Since K = |s| is the number of dis- X IK joint convex polygons of ft , by Theorem k, K points are necessary to cover ft . (Q.E.D.) The class of regions which has been defined in Section 2.2.2 is called the class of 1-angularly simple regions. 21 (a] (b) (c) Figure 2.U. Examples of K-angularly Simple Polygons 22 Theorem 5 implies the following approach of decomposing a given polygon into K-angularly simple polygons. After the determination of K number of mutually exclusive polygons, B_, ..., ft R , ..., fi K , for each ft R we find all non-mutually exclusive polygons Q whose cumulative intersection is non-empty. The union of such polygons is an angularly simple polygon, and for all P* k we apply the above process. In Figure 2.5 a simple example of K-angularly simple decomposition, given by Theorem 5, is illustrated. The polygon in (a) can be decomposed into many convex polygons; denoted by R , R , ..., R,_, using only line equalities, i.e., the extension of appropriate edges which are indicated by dotted lines. Convex polygons, ft , ft , and ft_, attained by convex maximal decomposition, are shown in (b), where ft = R \j R , ft - R \j R \j R, and ft, = R, \j R . Since ft, fl ft ^ $ , ft fl fl_ = $ and ft_ f\ fi_ 4 $» "the maximum cardinality of the mutually exclusive set S is two, where S = {ft , ft_}. Thus, K = |s| =2, and the given polygon is 2- angularly simple. In (c) two angularly simple polygons; ft, \j ft_ and fi„ y L, are shown, where their feasible sets (hatched regions) are ft f\ ft and ft (\ ^q» respectively. 23 (a) Ob) s = {n , n }, k = |s| = 2 (c) Two Angularly Simple Polygons Figure 2.5. An Example of K-Angularly Simple Decomposition 2k 2.3 Generation of Polygonal Curves In this section we will discuss algorithms to generate smooth polygonal curves from a given black and white array picture. In Section 2.3.1 we will discuss the detection of boundary elements, and an approach for finding smoothed contours will be given in Section 2.3.1. 2.3.1 Boundary Detection Let us introduce algorithms to find contours on a black-and-white picture. Rosenfeld (1970 ) shows that the shrinking algorithm, edge-following algorithm, and border-following algorithm work. Among them, the border- following algorithm interests us. It has the advantage over the edge-following in that it requires fewer steps , whereas a border-following algorithm can proceed directly to that next element. On the other hand, as Rosenfeld mentions, the border-following algorithms tend to be somewhat more complicated. We, however, consider the black (white) -boundary-following algorithm which is quite simple compared to the border- following algorithm. Let us introduce some basic definitions: image array A consists of black region R and ground (white region) R. Thus A = R [J R, r. . e R, a. . e A. The set F. . of a. . is the set of four horizontal and vertical neighbors. Thus: ij ij F. . = {a..^-, ., a. , ., a. .. _ , a. . _}. The set E. . of a. . is the set of eight horizontal, vertical, and diagonal ij iJ neighbors. Thus: E.. = F. . i i {a. ... . . . . a. n . n , a..^ . .., a. .. .,-.}. ij ij U i+l, j+1' i-l, j-1' i+l, j-1 9 i-l, j+1 i) For a. .eR, ifF..AR^$, then a. . is a black-border element of ij ij ' ' ij R. Otherwise, it is a non-black-border element ii) For a. . eR, if F. . A R ^ $, then a. . is a white-border element l.l i.l M l.l 25 of R. Otherwise, it is a non-white-b order element. Examples of non-black, non-white border elements are illustrated in Figure 2.6(a) In (b) we show the numbering of eight neighbors of element a. .. In Figure 2.7 we show three types of boundaries for a black-white image array: geometric bound- ary, white boundary, and black boundary. Algorithm BB (Black-Boundary Following) Let x be a black-boundary element of R. Step 1. D = (direction index), x = x , B = {x } Step 2. Find n, = black and n, n = white in E , for k > D. * k k-1 x D = ^(k) where i\>{0) , ip(l) + 6, iK2), M3) + 0, 4»(U), i|>(5) + 2, 4>(6), ip(7) ■*■ h. B = B y { n v^» (ordered sequence to denote boundary elements). x = n n . k Step 3. If the first two elements of B are identical to the very last two elements of B, then eliminate the very last element from B and termi- nate. Otherwise go to Step 2. At the end of the Algorithm BB, the ordered sequence B has the following properties , where B = VW b m b_, 5 b , and b.'s are not necessarily distinct. 1 m l Theorem 6 . Algorithm BB generates a sequence B which consists of only and all black-boundary elements of R. Proof : Let us first show that B contains only black-boundary elements. In Step 2 , n n e R and n, ., e R, moreover, F P\ R 4 $ , thus by the ' k k-1 n ' ' k definition B contains only black-boundary elements. 26 Non Black-border Element a z /. z. & 7 ^ 7W< I Non White-border Element (a) Examples of Border Elements (b) Numbering of Eight Neighbors Figure 2.6. Border Elements and Neighbors 27 Geometric Boundary \Y\ \kn s^ ^ ^ $* ^ i^ XV ^ VVN ^ ^ vj\ Nlv ^ \ v\ s> vnN *X \\S ^y \ White Boundary Black Boundary Figure 2.7. Three Types of Boundaries for a Black-white Picture 28 Let us now prove that B contains all black-boundary elements. We assume a "black-boundary element n is missing from B. Without loss of J generality, let us assume that n should be located between b. and b. . From Step 2, it is clear that n . e E, and furthermore that b. , e E, J b. l+l b. and ii. e F. . However, in Step 2, n n , = white and n, = b. , = black, J b e 9 k-1 k l+l ' then n., . = n. e R which leads us to n n i R. k-1 j k (Q.E.D.) Corollary 2 . m= |B| is minimum. Proof : It is clear from Theorem 6. Algorithm WB (White-Boundary Following) Let x be a white-boundary element of R. Step 1. D = 0, x = x, B = {x }. Step 2. Find n = white, n = black in E , k > D. Set D = k + k (mod 8). K. ri* J- X B = B\J {n k > x = n n . k Step 3o If the first two elements in B are identical to the very last two elements of B, then eliminate the very last element from B and terminate. Otherwise, go to Step 2. Similar to Theorem 6 and corollary 2 for Algorithm BB, we have the following properties: Algorithm WB generates a sequence B which consists of only and all white-boundary elements of R, and m= |b| is minimum. 2.3.2 Contour Smoothing A rough polygonal contour can be obtained by determining the ordered sequence of all boundary elements and by connecting adjacent elements in the sequence with a segment of straight lines. The possible directed segments are 29 only eight; therefore, the contour can be simply represented as a chain of octal numbers. The properties of such chains have been studied by Freeman (1961). Freeman and Glass (1969) have studied the problem of finding a smooth line from such a chain. Another method for extracting a smooth polygonal contour from a digitized image is discussed by Montanari (1969, 1970). He first obtains the ordered se- quence of contour points and the connection graph (Rosenfeld and Pfalz, 1966) of the image by a modified Ledley algorithm in one image scan. Then he finds the approximating contour by defining a minimal perimeter polygon subjected to specified constraints. Montanari' s approach is interesting; however, it is fairly complex compared to the following approach which gives a reasonably smoothed contour in a reasonable amount of computation time. Let B'=b. ,b. ,...,b. ,...,b. be a connected subsequence of an ordered sequence B of black (or white) boundary elements. Then B' is said to be a ag-sequence if and only if for all k, k=2 , ..., n-1 the following condition holds. For b. and b. , let a. . and a. . „ be their corresponding image elements; then a, 3 are identical for all j, where |a|,|6| <_ 1. Thus a$-se- quences form diagonal or horizontal or vertical line segments on the contour. Let d denote the Euclidean distance. Then B' is said to be T-sequence if d(b. , l) <_ T for k=2, ... , n-1, X k where £ is the straight line which goes through b. and b. such that X l X n d(b. , £') > T for some k=2, ..., n, \ where £' is the straight line which goes through b. and b. .In other words, X l X n+1 T-sequence is a maximal subsequence of B which satisfies the given condition for each element between b. and b. . 1 n 30 Algorithm CS (Contour Smoothing) Let B be an ordered sequence obtained by Algorithm BB or WB. Step 1. Find all possible a3-sequences in B and connect b. and b. by a X l X n straight line. Step 2. Find a possible T-sequence in the sequence derived by Step 1, and connect b. and b. by a straight line. Set b. = b. and repeat this i n l i. i l In In step. In the above algorithm, Step 1 is negligible; however, Step 1 saves a great amount of computation time for Step 2. Computation examples using Algorithm BB and Algorithm CS are illustrated in Figures 2.8 and 2.9. Figure 2.8(a) shows a black and white digitized picture of "sleeping cat," (b) shows the black boundary obtained by Algorithm BB, and (c) shows the result obtained by Algorithm CS. Here the dots show the result after Step 1, and the dots with straight line segments show the result obtained after Step 2 of the Algorithm CS. To understand how well Algorithm CS works, the following analysis might help. There are 19^- black-border elements after running Algorithm BB, and then this is reduced to 91 elements by Step 1 of Algorithm CS. Thus the reduction ratio is about a half. However, after the application of Step 2, it is reduced to 38 elements, and thus the total reduction is about one-fifth. The greater reduction ratio is not necessarily always better, since, by increasing the thresh old value T in Step 2, we can achieve greater reduction ratio. In general the result derived on a large threshold value becomes far different from the originally given black-white image picture. Another example, Embryo, is illustrated in Figure 2.9. This has 2U0 black border elements, and it is deduced to 90 elements by Step 1. Then it is reduced to 36 elements. Thus the total reduction ratio is about one-seventh. (a) 31 I I o>; (c) Figure 2.8. Contour Approximation: Sleeping Cat 32 o 5 o •H •H X o u ft ft < g 1 o On c\j 0) bO •H fin 33 In this example, we achieve greater reduction ratio than in the previous one, because Embryo contains more smooth edges on the black-border elements than Sleeping Cat does. The threshold value used for the above two examples is slightly greater than 1/^2. We feel, after running Algorithms BB and CS on various examples, that these algorithms are simple and yet give good results in a reasonable amount of computation time. 3k 3. SHAPE REPRESENTATION We have discussed in Section 2.3 how one can obtain smoothed contours of objects from a given black-white picture. These contours are described as sequences of points coordinates; however, such description of objects is not an appropriate representation if we are concerned with the economical descrip- tion of the shape of objects. In this chapter we introduce three basic shape representations: polygonal, pattern sequence, and skeleton representations. Some characteristics of each representation and comparison between representa- tions will be discussed. 3.1 Polygonal Representation Smoothed contours detected from black-white pictures, considered as polygonal approximations, typically contain large numbers of vertices. We feel, however, that the number of vertices can be reduced significantly without destroying the shape information of objects being considered. In other words, with fewer vertices we can still retain a good approximate polygonal represen- tation of objects by an appropriate assignment of vertices. The problems involved here are: (l) how to reduce the number of ver- tices required for the representation, and (2) the appropriate representation or encoding of complex shapes. It is clear that almost all objects can be described by a finite number of vertices; typically the order of a hundred is sufficient for accurate repre- sentation. It is also clear that the number of vertices varies widely depending upon whether the given object is complex or simple. Let us consider two shapes— the sleeping cat and embryo of Figure 2.8 and Figure 2.9 — where their reduced number of vertices are 38 and 36, respectively. However, with even fewer ver- tices, about 15, we can adequately describe each figure without destroying its 35 shape by appropriate selection of vertex coordinates on or near the contour. This is exemplified in Figure 3.1, where the sleeping cat is denoted by l4 vertices and the embryo is denoted by 18 vertices. Let us define the best polygonal approximation with respect to the specified number of vertices, n, in terms of minimizing the difference between the polygonal approximation and the original (smoothed) curve. Let b , b , . .., b. , ..., b be boundary elements of a picture being polygonized, and let P, A and P , iL (k _< n) be the perimeter and the area of the original objects and k- vertex approximated polygon, respectively. Then our best poly- gonal approximation with k, k <_ n , vertices can be obtained by minimizing D, D = w x (P - P k ) + w 2 (A - \) + w T under d(b. , P, ) < T for i=l, . . . , m and k < n. 1 k — — Here, w. > denotes weight, T denotes a threshold for straight-line approxima- tion, and d(b. , P ) denotes the distance between element b. and the contour of IK 1 the approximated polygon with k vertices. As we can see from the above formulation, if we decrease T, then D becomes small; however, conditions d(b., P ) <_ T become critical. On the other hand, if we increase T, then we have to minimize w (P - P ) + w (A - A^ ) to reduce D. This means that we have to move k vertices around the contour of the figure so that the perimeter and the area of the approximated polygon are close to the original perimeter and area. In Figure 3.2 we show some examples of wry all three parameters, (P - P ), (A - A^ ) , and T, should be minimized sima: J - ae- ously to minimize the value D. Now, we shall discuss the representation of polygons. Let us assume the following sequence, 36 .a) Sleeping Cat: lU Vertices (Id) Embryo: 18 Vertices Figure 3.1. Polygonal Representation: Reduction of The Number of Vertices 37 (a) Small Areal Error and Small Threshold, But Large Perimeter Error (b) Large Areal Error, But Small Threshold and Small Perim. Error (c) Small Areal and Perim. Error, But Large Threshold (d) Small Areal Error , But Large Threshold and Perim. Error Figure 3.2. Topological Examples of Contour Approximation that Force the Function D to Become Large 38 (b 1 , 6 1 ), (s 2 fig), ..., (s^ 6J, ..., (s n , 6 ), where n is the number of vertices for shape description, s. denotes the distance and 6. denotes angles. This sequence can be derived in either one of the following manners : i) s. is an edge length of a polygon, and 6. is the rotation angle of the vector of edge s. ._ with respect to the vector of edge s.. o i+l * ° i ii) Let c be the center of the gravity of 0, , and let the vertices be p, , p_ , ..., p., ..., p . Then s. = d(c, p.), and 6. is the r l* r Z 9 l n l ' i l rotation angle of vector c, p..-, with respect to c, p.. i+l 39 Clearly, either sequence representation of objects is rotation and translation invariant. The only difference "between the encoded representations occurring under rotation transformation is that the sequence might he circularly shifted; however, this shifting does not effect shape perception. This polygonal representation has been used by many psychologists for their study of the human shape recognition system, but it is not easy to obtain shape information out of this sequence representation unless the whole sequence is first projected on the plane. 3.2 Pattern Sequence Representation 3.2.1 Definition of Pattern Sequences It is known from the previous chapter that any angularly simple polygon, ft , contains a point (or a set of points) x such that a vector from x to a point on y can trace y in one direction with nondecreasing rotation angle. We can use such a vector sequence to denote angularly simple curves as well. Such a vector sequence is illustrated in Figure 3.3, where the region covered by the vectors shows the polygon and the curve formed by connecting the tops of vec- tors (dotted lines) indicates the polygonal curve. Let us define such a sequence as follows: A pattern sequence , S, is an ordered set o f elementary patterns , q . R m O — S , S , S , ..., S., ..., S jt_t s.en where m denotes the dimension of the elementary patterns and N is called the circularity of S. To make S denote a unique polygon, it is possible that one A pattern sequence can be interpreted as a series of physiological measure- ments; IBM Research Reports, vol. 7, No. 2, 1971, by Jean F. Brennan. Uo Polygon Denoted by a Pattern Sequence Original Polygon (a) An Angularly Simple Polygon (b) A Pattern Sequence Corresponding to Polygon (a) Figure 3.3. An Angularly Simple Polygon and Its Pattern Sequence 1+1 might consider varying the angular difference between adjacent elementary patterns and provide the following sequence of angles : A = (6 6 V 0,1' i 2> 2,3' "•' i,i+l » •""» N-1,0 where 6. . . denotes the angular difference between elementary patterns s. l, l+l ' i and s. in . However, for simplicity, we assume 6. . ,_ is constant for all i, l+l i ,i+l ' i.e., 6. = 2tt/N. It is clear that with increasing N we can approximate with increasing accuracy a given angularly simple polygon. The dimension m of the elementary pattern varies with the intended application of the sequence. For example, it is sufficient to set m = 1 and use only the radial distance to denote polygonal regions; however, it is sometimes preferable to store additional information: first radial distance, then along each elementary pattern direction adjacent local attributes of texture, color, etc., especially when we use the sequence to describe the environment about a particular point. An example of m = 2 is illustrated in Figure 3.U. Here we consider one attribute to distinguish between ROCK, SAND, SNOW, and WATER. When we rotate or translate a polygon whose representation is given as a list of coordinates of basic points (or possibly by line equalities), it is usually necessary to change the point coordinates (and line equalities). However, a transformation of a polygon which is denoted by a pattern sequence, S , i s s imple : (i) a translation of an angularly simple polygon, Q, , corresponds simply to a translation of x„ of the pattern sequence, S, and (ii) a rotation of fi at x^ corresponds to circular shifting of indices y of elementary patterns, i.e., it is adequate to consider the rotation index , r, which will be defined later. U2 Figure 3.^. Free Space Representation: A Pattern Sequence of Multiple Dimension U3 Before we define the rotational transformation of S, let us define the canonical pattern sequence. A pattern sequence, S, is called a canonical pat- tern sequence if the first elementary pattern corresponds to the direction of the X-coordinate , and the ordering of the elementary patterns corresponds to the counter-clockwise rotation of corresponding vectors. Henceforth, we assume that each pattern sequence is canonical. This assigns the orientation of the corresponding polygon. For the rotation of a pattern sequence, S , at a point x , it is conve- nient to assume that the unit rotation angle, 6, is 2tt/N (or possibly, an integer multiple of 2tt/N). For example, if S ^ X = S 0' S l' S 2 5 ***' S i' ***' S N-2' S N-1 then the clockwise rotation of S(x ) through 6 degrees is given by and the counter-clockwise rotation of S(x ) through 6 degrees is given by S N-1' S 0' S l' ***' S i' ***' S N-2* We will define r to be the rotation index. If r > 0, S(x ) has been rotated in a clockwise direction through an angle of rS degrees. If r < 0, S(x ) has been rotated in a counter-clockwise direction through an angle of r6 degrees. Thus, in general, we have the following expression for a pattern sequence: MV = V S r+1' S r+2 J •*•' S r+i' '" 9 S r+N-1 where r+i is interpreted as r+i (mod N) for all i. This pattern sequence representation of objects has been applied for solutions of various problems: an approximate solution of the sofa problem, the determination of odd shapes and maze solvers with visual information (Maruyama, 1971 a, b, and c). It has been proved that this representation is appropriate for the representation of simple shapes. kk 3.2.2 Generation of Pattern Sequences Let us consider the generation of pattern sequences for given polygons. A polygon illustrated in Figure 3.5 is an AS-polygon since at any point in the convex polygonal region which is denoted by the dotted contour, the entire boundary of the given polygon is visible. The encoded configuration of the AS-polygon of Figure 3. 5 is illustrated in Figure 3.6. The vector which corresponds to the X-coordinate denotes the very first elementary pattern, s , of the pattern sequence, s (x ) ; the rotation index, r, is assumed to be at this stage, and each elementary pattern, s., follows in the counter-clockwise direction. In this example, the circularity, N of the pattern sequence is U8. Figure 3. 7 shows error dependence on the circularity N for the polygon of Figure 3. 5, where the values obtained at N = 180 are assumed to be true values. For small circularity, errors are heavily dependent upon the orientation of the given shape. However, for sufficiently large circularity, orientation errors become essentially negligible, To generate each elementary pattern, s., detect the innermost intersec- tion between the line segment at angle i2ir/N and an edge of the polygon. The pattern sequence generated in this manner is called the interior (or primal ) pattern. Let us consider the polygon which is illustrated in Figure 3. ° where x , the center of gravity, is used as the encoding point to generate a pattern sequence. Alternatively, to generate each elementary pattern, s., instead of measuring the distance from x to the innermost radial intersection, the follow- ing method called exterior (or dual ) pattern generation, is used. We first choose a circle of radius R which is large enough to contain the polygon, where the center of the circle is the center of gravity of the figure. Then we measure the distance, s . , to the outermost radial intersection h5 Figure 3.5. An AS-polygon 1*6 Figure 3.6. Example of An Encoded Representation By Pattern Sequence, S (x ) hi 0.0 V(180)- V(N) V(180) ♦ Z{s { -sf/N X s=(^Si)/N 40 60 CIRCULARITY N Figure 3.7. Error e vs. Circularity N for the Polygon in Figure 3.5 U8 Figure 3.8. An Example of a Non-Angularly Simple Polygon (x Denotes the Center of Gravity) h9 and the boundary of the encompassing circle of radius R. Finally the desired pattern sequence, S (x ) is derived "by s. = (R - s.) for i = 0, ..., N - 1. The above process of generating the pattern sequence, S (x ), of the polygon in Figure 3.8 is illustrated in Figure 3.9 » where the dotted arrows show the s.'s and the solid arrows emanating at x. show the s.'s. The contour obtained l i by the dual pattern generation is also illustrated in Figure 3.9. As we can see from the contour of the pattern sequence, we lost some information about the polygon of Figure 3.8. This loss of information can be compensated by the introduction of other pattern sequences which will be discussed later. However, we feel that the polygon denoted by a single pattern sequence is sufficiently accurate for the study of geometrical shapes. From a study of the above two methods of generating pattern sequences , we see that sharp corners of polygons do not show up in the abstracted contour. One way to avoid such information loss is to increase the circularity N. However, the problem here is the estimated error of encoding a given shape by a pattern sequence of circularity N. In Figure 3.10 we show abstracted con- tours obtained by the primal encoding method applied to the polygon of Figure 3.5 with different values of circularity N. If the circularity is larger than 12, then the abstracted contour holds the outline of the original polygon, even though some minor abrupt changes of the original polygon have disappeared from the abstracted contours. However, when the circularity becomes small, like 6, then the outline of the polygon given by a pattern sequence is completely different from the contour of the original polygon. The detailed discussion of the optimal choice of circularity is given in Chapter U. 50 -2»-X Figure 3.9. Dual Encoded Pattern Sequence of the Polygon in Figure 3.8 51 (a) N = 2k (b) N = 12 (c) N = 6 Figure 3.10. Contours Abstracted by Primal Encoding of the Polygon of Figure 3.6, Illustrating Dependence on Circularity N 52 3.2.3 Operations on Pattern Sequences Let S, S' be two pattern sequences of circularity N, X = S ' S l' ***' S i' •••» S N-1' S'(Xq) = Sq, B-, •••> s^, ..., s fj_}_* Then the union of S(x ) and S'(x Q ) is S(x Q ) = S(x Q ) (j S^(x Q ) = {max (s., s^ + .) for all i}, and the intersection of them is S(x Q ) = S(x Q ) ^ S^(x Q ) = (min (s., s^ + .) for all i}, where r is the rotation index. Clearly from the above definition, this class of pattern sequences is closed under union and intersection operations. A pattern sequence S is said to be 1-symmetric if there exists k, _< k _< N-l, N is even such that s. , . = s. . ... (mod N) for i = l s ...„-- 1. In other words. S is 1-symmetric k+i k-i+N 2 if there exists a line which goes through x to make S symmetric. In general, S(x ) is said to be K-symmetric (K <_ N/k) if there exists K distinct lines which go through x such that for each line S is symmetric. Thus, the class of sym- metric pattern sequences is closed under union and intersection operations, again . A pattern sequence S is said to be diagonally symmetric if and only if s. = S , (mod N). Thus, the class of diagonally symmetric pattern sequences is closed under union and intersection operations. Furthermore, it is trivial that 2k-symmetric (k >_ 1, integer) pattern sequences are diagonally symmetric. For CS(x ), C _> , it is said expansion if C > 1, and contraction if 1 > C _> 0. For S(x ) + e, if e > then it is said equi- expansion , otherwise it is said equi-contraction where lei < minis.}. If s.+(s -A)e, e > 0, A. _< (l+e)s./e is said to be proportional- expansion or contraction for all i. 53 3.2. k Fourier Series It becomes now clear that a pattern sequence can be described by a Fourier Series. The interval (0,2tt) is divided into N number of equal sub- intervals by points (k2TT/N) for k=0,l, . . . ,N-1. Thus the partial sum F (6) of Fourier Series is n F (0) = a /2 + Yl (a cos(k6) + b sin(k6)) s k=l under the minimization of the following square error: N-l E = £J(s - F (2Trk/»)) . k=0 n It is known that the solution to minimize E is given by: N-l a ™ = 2( Yy s cos(2mTTk/N))/N, (m=0,l,. . . ,n) m k=0 k N-l b = 2( Vs sin(2miTk/l))/N, (m=l,2,. . . ,n) . k=0 This Fourier Series for a pattern sequence might be interesting; however, we do not consider this type representation for objects. 5U 3.3 Generalization of Sequence Representation In Section 3.2 we discussed some sequential representations of angularly- simple regions. There we utilized the property that each AS-polygon has a set of points inside of the boundary such that from any point in the feasible set all edges of the boundary are visible. Thus we could use a sequence of radial vec- tors to denote such a region. If the region we want to describe is not angularly simple, then some portion of the boundary cannot be denoted by such a sequence. We were uncertain how accurately we should describe objects; however, we feel that it depends heavily upon user preference as well as his application. In this section, we introduce several approaches to describe general objects, yet they are quite simple compared with the representation by coordinate sequence or point encoding (Freeman, 196l). 3.3.1 Bi-Pattern Sequences This representation consists of two pattern sequences, primal S(x ) and dual S\X n ) pattern sequences. It is clear that these two pattern sequences are identical if and only if the encoded region is 1-angularly simple. Furthermore, dual pattern sequence covers regions at least as much as primal pattern sequences. This fact is denoted by S(x ) Csfx ), where each elementary pattern of S'Uq) is at least as great as the corresponding elementary pattern of S(x ). Let b(,X J — S , S , ... 3 S . , ..., S i\r_-]» S'(x Q ) = Sq, s^, ..., s^, ..., s N _ 1? s'. = R - s. for all i. The ambiguous regions implied by these two sequences are the regions which correspond to s! > s. for some i. In Figure 3.11 we illustrate a bi-pattern sequence encoding where (a) shows primal and dual encoding and (b) shows how two sequences are different and where some ambiguities occur. As we can see 55 (a) S(x ); i | i ; i | i j i i S(xJ R (*) Figure 3.11. Bi-Pattern Representation 56 from this example, we can eliminate some ambiguities by considering two sequences simultaneously and considering connections between elementary patterns in S(x_) and S(x ). For example, we can assume that s, and s are connected, which is denoted by a dotted line in Figure 3.11(b). The following algorithm eliminates as many ambiguities as possible from bi-pattern sequences and provides a more accurate description of regions. Algorithm BP (Bi-pattern Sequences) Let S' = s. , s. , ..., s. , ..., s. be a sub-sequence of i, 1 ' i . l 12 j m S(x ) such that s. +s. =R,s. +s. = R, and 3L- 1- 1 1 11 mm s . + s. < R for j ^ 1, m. Then s. is interpreted to be connected to s- . Repeat the above process until V-l 2 all such possible sub-sequences are exhausted. 3.3.2 Many Pattern Sequences This representation uses the necessary number of pattern sequences to encode the given figures relatively accurately. The pattern sequences, or set of pattern sequences, being utilized here are slightly different from the pattern sequences described in the previous section. The previous pattern sequence may be called a complete pattern sequence since all elementary patterns were con- sidered to be significant for a shape representation. However, pattern sequences considered here are so-called incomplete, i.e., partial pattern sequences in the sense that some elementary patterns are simply ignored, especially those ele- mentary patterns whose magnitude is greater than or equal to R, which is called the visible distance. 57 We can consider the following three approaches: a) S = {SUJ, i=l, 2, ..., K}, ■where the given object fi is at most a K-angularly simple region. In Figure 3.12, an example of this representation is illustrated, where (a) shows three complete pattern sequences generated at x , x , and x , respectively, and (t>) shows the modified pattern sequences of (a). This modification of pattern sequences, which is a reasonable consideration, can be achieved by detecting abrupt changes in the complete pattern sequences of (a). b) S = {S (x ), S 2 (x q ), ..., s n (x Q )} where each s.(x Q ) is a primal pattern sequence generated at point x , such that at the i-th generation, those portions of figures which have been viewed by S (x ) , S (x •) , ..., S. (x ) are eliminated and the remainder is considered to be the i-th encoding figure. Thus, each primal pattern sequence is incomplete. This approach is exemplified in Figure 3.13; the union of solid and dotted lines of (a) shows the originally given figure, and the solid lines show those lines which are encoded by the current pattern sequence. The dotted portion indicates the portion of the figure which is not encoded. Figure 3.13(b) shows the second pattern sequence S (x ) which covers the solid portion of the figure and is generated by eliminating the portion covered by S (x ). c) S = {S (x ), S (x ), ..., s (x )} where S.(x ) is a dual pattern sequence and each pattern sequence is generated at x . The rest of the explanation is similar to that in case b). This is the dual to the previous method, b), i.e., encoding starting inward from the exterior of the figure, while method b) encodes outward from 58 (a) Original Pattern Sequences (b) Modified Pattern Sequences Figure 3.12. Method a): Shape Representation with Multiple Patterns 59 (a) Si(Xo) (b) S 2 (X ) Figure 3.13. Method b): Shape Representation with Multiple Primal Patterns 60 the interior. This approach is exemplified in Figure 3.lU. By comparing this method and the method b), if the encoded object is fairly complex, then the method c) is preferable to method b). 3.3.3 Interval Pattern Sequences An interval pattern sequence S is defined similarly to the previous pattern sequences, except that each elementary pattern s. consists of many intervals: a b Q a ] _b 1 \ k. 1 1 ' 1 ' ' 1 ' a.b . l l where each X. indicates that the interval between a. and b. is occupied by the the object ^ • Y In Figure 3.15 an example of such a representation is illustrated. The advantage of this representation is that only one pattern sequence is sufficient to illustrate fairly complex closed regions; however, because each elementary pattern consists of various numbers of intervals, this makes the operations on pattern sequences fairly complex and time-consuming. Moreover, we need a fairly complex algorithm to recognize the shape which is denoted by this sequence. 3.3.^ Contour-Following Pattern Sequence This method is applicable for almost all simple closed regions and is similar to the Polygonal Representation of Section 3.1. At first the contour (or perimeter) is divided into N segments by N points: P — Pq s P-i » Po » • • • 9 P-i* • • ■ a PlM — T * Then each elementary pattern of this pattern sequence S(x ) is the distance between x and p. for all i, that is, s. = d(x_, p.) for all i. i ' ' i l 61 (a) Si(x ) \ \ \ Jo \ (*) S,(xJ Figure 3.1*+. Method c): Shape Representation with Multiple Dual Patterns Figure 3.15. Interval Pattern Sequence Figure 3.16. Contour-Following Pattern Sequence Convex (a) Sj-4 Sj_3 S[_ 2 Sj-j Sj Sj+i Sj + 2 Sj +3 S i + 4 c i+k =Sj -s i+k cos(kS) a - (b) , -l-~ II t- Convex Straight Concave Figure 3.17. Convexity and Concavity on Lines 63 Clearly, this sequence does not represent a unique region, "but a class of regions. To make this sequence denote a unique region, we will provide another sequence, called a rotation sequence R(x ), where R(x Q ) = r Q , r x , ..., r iS ..., r^^ r ± e {1, 0, -1}, r. : denotes angle rotation with respect to the vector x^ ,p . n . r. = 1 indicates counter-clockwise rotation, O'^i-l l ' and r. = -1 indicates clockwise rotation. i An example of a Contour-Following pattern sequence is illustrated in Figure 3.l6. A more detailed discussion as well as an application of this pattern sequence appears in Section 3,k, 3.3.5 Convex-Concave Abstraction It is appropriate to consider that the contour of a given object con- sists of some convex and concave curves. Such a sequence may be used as an abstract description of the global shape information. Here we briefly discuss how to reduce such a sequence from a given pattern sequence representation of an object. Let us consider a single pattern sequence, either primal or dual. In such a sequence, s n is said to be a maximal if s, > s, _, , s, , . , and s is said H ' k k k-1' k+1' k to be a minimal if s, < s, _ , s. ., . Clearly, each maximal elementary pattern k k-1 k+1 corresponds to a convex segment on a curve y. However, each minimal elementary pattern does not necessarily correspond to a concave segment on a curve y. In Figure 3.17(a) we illustrate how a straight line, a convex line, and a concave line appear as a subsequence between two adjacent maximal points. Here we assume that s. , and s , are maximal elementary patterns and the circularity N = 2tt/6. As we can see from these three lines, it is difficult to tell what line becomes straight and what line becomes a convex line or a 6U (a) ^^\ / < < / \^A 1"'? \ X \\ \ " v \\ \ — / ft / /' /\ f^s / / / /» 1 -"" 1 1 L^X^ 1 l^-- >r// \ V\ S3 O o 0) -p !h o o cd O u ft ft < < On H • on CD U bO •H 66 concave line. However, if we consider a new sequence, c. , = s. - s. , l+k 1 l+k cos(k6), then we can distinguish them. Three lines of Figure 3.17(a) are plotted in (b), where c. . < implies a concave curve between s. . and s. . , and c . . > implies a convex curve between s . , and s . . . l+k * i-U i+4 In Figure 3.18(a), three regions are illustrated along with their corresponding pattern sequences, and c. sequences are shown in (b) and (c), respectively. From a C sequence we can derive another sequence which only con- sists of convexes and concaves. In such a sequence, if more than one connected convex or concave appears, then such a subsequence is shrunk into a single convex or concave symbol. Thus, finally we get a (convex, concave) sequence, and such a sequence is used to describe the global shape of the region. This is exemplified in Figure 3.19« 3.4 Skeleton of Shapes In this section we describe an approach for the detection of skeletons and the decomposition of shapes in terms of the information extracted from the skeleton (see Figures 3.21 and 3.22). The basic idea of skeletonization of shapes is to describe a picture by using an intrinsic coordinate system. Thus, the originally given shape can be determined completely from skeleton points with their distance from the contour. The idea of skeletonization of shapes was originally developed by H. Blum many years ago as a new description of shape, and his recent works (1967, 1971) con- tain an extensive discussion of skeletons. Rosenfeld and Pfaltz (1966, 1967 ) have defined a discrete skeleton by us- ing a distance transformation and have illustrated a computing algorithm. The generalized and improved algorithm of this type appears in Stefanelli and Rosenfeld (1971). Another definition of skeleton is, for instance, the loci of 67 points having the same distance which can be seen as successive positions of a propagating wavefront. Wavefront superposition is not allowed, so that wave- fronts intersect on the skeleton and their propagation stops. The algorithm based on the wavefront propagation mechanism has been developed by Montanari (1969) where, in his algorithm, the main idea is the determination of the break points which occur when more than two wavefront s intersect. 3.^d Skeletonization Let ft be the region assumed by a curve y and let p = y(t), t e I. Then £(t) is said to be the pan normal at p if £(t) is perpendicular to y(t) and £(t) points inward. An example of a pannormal is illustrated in Figure 3.20(a). If y is a polygonal curve, then the pannormal for this case is similarly defined, and an example is illustrated in Figure 3.20(b). Let us consider a contour- following pattern sequence P^y) °f a shape which has been introduced in Section 3.3.^. P n (y) = P » Pji P 2 > •••> P i5 ••«•» %_i» where arc length p , p is equal to the arc length p. , p. . for i=0, ... , N-2. r i' l+l Then p. is said to be convex (concave) if the angle which extends inward of ft 1 y is less than or equal to it (greater than it). On such a concave point, the above definition of pannormality is not adequate, because at p. there are two nor- mals: one is perpendicular to e. and another is perpendicular to e. _. Thus, in this case we assume any direction between I and I of Figure 3.20(b) is normal at p.. A PM-disc of radius r. is the maximal disc on the i-th pannormal £(p. ) which touches point p. on a curve. The center of PM-disc of radius r. is denoted by c.o Therefore, for a given edge-following pattern sequence on a simple curve, 68 we get a unique sequence C (y): C N Y = C 0' C l' ***' °i' "**' C N-1' called the skeleton of ft . Similarly, we have the corresponding radii sequence R n (y) N " r ' r i s *••» r i » •••s r N-l* where we will measure the radius r. of the PM-disc at p. on y in the following manner. We assume a disc of radius r. which is very small. Then by increasing r. by S, which is also a small value, we detect the intersection between the disc of radius r. + 6 and the edges of Y. If there is an intersection, then r. is the radius of PM-disc, otherwise repeat the above process. Here r. = d(p.,c.) and c. is on the i-th pannormal. The above process is illustrated in Figure 3.20(c). The disc of radius r. at center c. is moved away from p. as much as l l l possible by increasing r. by 6 . Let us consider the case that p. is a concave point which consists of the intersection point between edges, e. and e. _. . Let us also assume that I. l l+l 1 and £ are two pannormals with respect to e. and e. , respectively, where this is illustrated in Figure 3.20(d). The angle between SL and I is assumed to be 6. Since we define any direction which is a linear combination of two vectors I and I to be a pannormal at p. , in a practical case we can consider several vec- tors as the pannormals at p. and generate a PM-disc for each such pannormal. 3.4.2 Neck/Tail Decomposition From the skeleton, it is possible to recognize a number of global properties of shapes, especially those relating to curvature (concave or convex, "neck" parts, "tail" parts, "hair" parts, etc.). 69 (a) Pannormal on a Simple Curve (b) Pannormal on a Polygonal Curve i, (c) PM-disc (d) Concave Point Figure 3.20. Pannormals and PM-discs TO Let us first consider the figures illustrated in Figure 3.22. (a) shows an example of a neck of a region, and (b) and (c) show examples which may not be interpreted as necks. From these it is said that a neck may exist on a con- cave shape, but not on a convex shape. The significance of locating this kind of property is that one connected region can be decomposed into more than two separate regions by cutting necks. This neck-finding approach is also important for the clustering problem (Zahn, 1971). By doing such a neck decomposition of connected regions into several disconnected regions, possible matching between two regions can be attained. In Figure 3.22(b) an example of a tail region is illustrated. Such a tail is quite often ignored by the main body or cut away from the main body, and it can be considered that the shape consists of two regions: body and tail. We will consider these neck/tail detection and decom- position problems. Let C (y) and R^Cy) be two sequences: center coordinates and radius, respectively, which are derived by the skeletonization of a curve y. C I\T Y = °0' C l' ***' C j ' ***' C N-1' R N Y = r 0' r i' ***' r j ' **"' r N-l' N-l r = (2 r.) / N, j=o J P N-l _ o j=0 J T = r + C a, C > 0. Then we have a sequence B (y), b n ( y ) = b Q , b x , ..., b j9 ..., b N _ lS where f b . = 1 for r . > T J J J - I b. = otherwise. 71 Figure 3.21. Skeleton of Shape NOT A NECK (c) NOT A NECK TAIL NOT A TAIL Figure 3.22. Necks and Tails 72 Let B' be a connected subsequence of B (y), and let |B'| denote the num- ber of elements in B' , i.e., the length of B' . For B' such that for all b, e B' b. = 0, it is said that B' forms a tail of region ft if Ib'I > C^N, 1 > C > 0. Here the selection of the value for C is essential for the detection of tails in a region, since for small C quite a few subsequences of B''s exist which satisfy the threshold. They do not necessarily correspond to a tail; however, they may be interpreted as hair parts. From our experience, C should be set to greater than 0.15. Of course, the value T is quite signi- ficant for the detection of both necks and tails. Let B 1 and B" be two distinct connected subsequences of B (y) such that b., b. = for all b. e B' and b. e B" . Then it is said that (p., p.) of (b. , b.) form a neck if d(p. , p.) < e, where e is a very small value. X 3 i J — In Figure 3.23, skeletons derived under the threshold values (a) T = 0, (b) T = s, and (c) T = s + 9c are illustrated. The boundary of the figure is denoted by character B, and its skeleton is denoted by the character S . An example of a region decomposition by detecting necks and tails is illustrated in Figure 3.2U. Here, the threshold T = s is used to detect the neck and tail, and the resulting decomposed figure is shown in (e). Another example of region decomposition by finding the tail of a complex figure is illustrated in Figure 3.25. Let us consider the following matching-in problem. Is Figure 3.2i+(a) embedded in Figure 3.25(a)? We can't find one region in the picture of Figure 3.25(a) which corresponds to the picture of Figure 3.2U(a); however, we can possibly make matching-in if we compare the decomposed picture of Figure 3.2Me) and the picture of Figure 3.25(d). Thus, it may be that possible matching can be attained by decomposing pictures using neck and tail properties. This matching-in problem or figure extraction problem will be discussed in Chapter 7. 73 i i i i i i I i i i I I (C f -Oro(DcoiD(Dincpcn CO fO to co V) 1 i l 1 1 1 1 I vt \S\ v» t/i VI io v> V» to 1 i i 1 Vj V) I I I I I I I I V) I CO I I I I I I I I I I I I I CO I I CD CO I I I I I I I I I I I I CQ cn CQ I I en co ro en en co i I i I i i i I I i i I co en. co i i i I i i I I I I i i i i I I f I G + I to V) I/) VI VI '» 4 I CD < CO CD CO CD CD CQ CO CCCDffiCQCOCQCDCDcDCD I I t I I I I I t I t | i i i i i i i i i i i i i i i i i i i i i i i i i i i i i co to co co co co co IA 1 1 1 | 1 t 1 1 1 1 V* 1 1 l I l 1 1 1 1 t 1 1 i l V> 1 1 V» 1 (/1(/> 1 > I 1 l i i 1 1 (/> 1 i 1 1 VI 1 i vt t/t v> V* V> 1 t I 1 VI *•> i i I CO 1 I I CO I CD CD I CO CO CO CO CO CO CO CD 1 I I I I I i I I I I V/"> I I I I I I I I I I I I I I 1 I I I I I I I I 1 I I I I I I I I I I I I I I t I I I I t I I I I 1 I I I I I I I I I I 1 I I I I I I flO CO CO CD CO I I I I I I I I I I I co co co ca co I i I i 1 I I I I I t 1 I I I I i I I I i I I i iiii V» vi V» i/i VI lyi > V» I I CO | I i CO L.-'J I CO I I CO I I I IIII 1(11 IIII CO CO CO CO V» <^ I l/t I I V» V> I I I t/» I it') ca co co co ta co co co co I I I VI v> Jk) V* Vt Vt V» VI VI VI i Vt I I I ai co co co co co i m a) iD co ii) l I i l I I i l I I I I 1 t Vt t/> I VI VI 'Wt/iwvu^i/iioW 1 o di io m ti co i i i i i i i i i i i i I i i i i I CO CO CO CD CD v> cn I I I t to I I I I I I I I I I I */i •H fe La ! •■ I ,a | ! i § •H hO « CD > •H 75 MS MB I • S t B I s b r s s sss«.s • IS » B61>i • ss a bb s ss MO a s i s ss i * Si 5 SSS I s ssss «&«;"HBe*Ba (a) Given Regions (b) T = IB SS IB S\ BB6 ssss bca B t ■ r la a 5 *S lis s its s v.s V.SS ss*--.ss *-s ss % sssss % is a fa • ee I B * I II t • ( (c) T = s (d) Decomposition Figure 3.25. Decomposition of Many Regions 76 3.5 Characteristic Points of the Representation 3.5.1 Preliminary Discussion In previous sections we discussed the generation of pattern sequences of polygonal regions; however, we have not mentioned yet how we obtain a point x , called an encoding point, of S(x ). In this section, we introduce several methods to find such a point in each given polygonal region. It is clear that for a given region Q , if encoding points x and x' are given, then two genera- ted pattern sequences S(x ) and S(x') denote the same region; however, the two sequences are quite different from each other. Thus, it is preferable to have a unique encoding point for the generation of a pattern sequence for a given region. Moreover, the choice x should not be affected either by the rotation or translation of fi . Y The center of gravity of the polygon might be used as the encoding point because of the uniqueness of the center of gravity. However, if the given region is not convex, then the center of gravity may not necessarily be located in the interior of the region. Here we are concerned with interior points such that their spacial distri- bution reflects some shape information of the given region. For example, points marked inside the polygons of Figure 2.U are such points, and these can be considered as "discrete skeleton" points. To assign these points in the region, we first find K- feasible regions by Theorem 5 of K-angularly simple decomposition and assign a point for each feasible region. Because of the convexity of feasible regions, the center of gravity of each region may be considered as a characteristic point. However, we will introduce another kind of characteristic point in convex regions in terms of primal pattern sequences. 77 3.5.2 Determination of Characteristic Point Let us consider the following approach: Algorithm CP (Characteristic Point) Find a point x in a feasible region X of ft "by maximizing the mean of elementary patterns in S(x), where S(x) is derived from X , rather than ft . Y Some other objective functions are : i) Minimize the deviation of elementary patterns, ii) Maximize the perimeter which is covered by S(x), iii) Maximize the area which is covered by S(x). To gain a little insight into Algorithm CP, let us consider a particular example of a circular region. Let x be the center of the circle, and let (k) (k+l) x , x be two points in the circle not identical to x . If d(x , x (k) ) > d(x , x (k+1) ), then Zs. (k) d(x , x ). Thus by the algorithm, s approaches x by iteration. In Figure 3.26 the characteristic point x attained by Algorithm CS in a convex polygon is illustrated. (a) shows the result determined by the maximi- zation of the mean of elementary patterns, and (b) shows the result determined by the minimization of the deviation of elementary patterns. Points x , x^, x , and x^ are trial points which were assigned as the initial points, and they approached to the point x . By the comparison of these results, the algorithm finds a fixed point x_ wherever the initial point is. Moreover, 78 O •rl Q Cm O a o •H -P s o bO ft O P-. H -P a •H •H 81 a 3 <4H O C o •H -P SI •H o -p •H PL, o o •H -p cd tsl •H •H CO CM ) (c) (a; Figure 3.29. Characteristic Points in Polygons: Maximization of Mean 83 i) Let fi be a convex region, and let the circularity N be sufficiently large. Then the point x which maximizes the mean of elementary patterns is unique. That is, if s(x) is the mean at interior point x, then, as we maximize s(x), x approaches x . ii) Let fi he a convex region, and let the circularity N he sufficiently large. Then the point x which minimizes the deviation of elemen- tary patterns is unique. iii) For a convex region fi , x maximizes the mean of elementary patterns Y if and only if x n minimizes the deviation of elementary patterns. iv) If fi is symmetric, then the point x_ which maximizes the mean (or y j- minimizes the deviation) of elementary patterns lies on the axis which makes fi symmetric. Y Theorem 7 . Let x and x be two points in an AS-polygon, and let P(x ) and P(x) be the perimeters of pattern sequences S(x ) and S(x), respectively. If x. e X , then y and P(x Q ) > P(x) P(x Q ) > P(x') > P(x) for x' = wx + (l-w)x, <_ w _< 1. Proof: It is sufficient to show that OP + PQ > OR + RQ > OQ. (Q.E.D.) 8U Corollary 3. P(x. ) = P(x) = P(x.) for x. ± x. if and only if x. , x. e X , where x = wx. + (l-w)x., < w < 1. i J Y x J - ~ Proof: It is clear from Theorem 7. 3.6 Intercomparison of Shape Representations In the previous sections we have introduced three basic shape represen- tations — polygonal, pattern sequence, and skeleton representations. Polygonal representation has been used by many psychologists because of its transforma- tion-free encoding. Pattern sequence representation has not been used until very recently; it has proved especially good for simple shape representation. Skeleton representation, however, is a distinct encoding orthogonal to the above contour representations and has unique properties for shape representa- tion. Each representation has its own advantages and disadvantages, and there is no general simple shape representation which preserves the advantages of all three representations. Thus for the best shape perception, we should utilize the advantages of all three representations. In this section we list the advantages and disadvantages of the three, although the listing is incomplete. Polygonal Representation Advantage: The representation is invariant under translation and rotation operations. Reconstruction of a shape from the representation is simple and can be used for unlimited classes of regions. Disadvantage: Because of its lack of spatial information (i.e., difficulty in establishing whether a trial point is con- tained in the polygon), this representation is not suitable 85 for shape recognition, perception, or identification problems . Pattern Sequence Representation Advantage: Translation and rotation operations on the representation are simple. Ease of detection of local and global shape features leads to a good encoding for shape recognition, perception, and identification problems. Disadvantage: The serious disadvantage of this representation is that it is not easily extended to non-angular ly simple shapes, especially thin-long shapes. Another disadvantage is that it requires large circularity to detect sharp changes in contours . Skeleton Representation Advantage: This is a very good representation of shapes and has few encoding restrictions; it is especially good for thin-long shapes. Local and global shape feature detection is easier than for polygonal representation. Thus, it is a suitable representation for shape recognition, perception, and identification problems. Disadvantage: Although this representation does not place restrictions on the class of shapes to be encoded, the detected skeleton does not readily convey shape information if shapes con- sidered are not long. Reconstruction of shapes from the representation is more difficult and time-consuming than for the other representations. Rotation of the represen- tation is difficult. 86 FEATURE EXTRACTION l+.l Introducti on In general, for pattern recognition various feature values are extracted from objects, and the performance of a pattern recognition system depends crucially on the selection of features. Methods by which features are selected, however, are often intuitive and empirical and depend upon the designer's experience with the problem. Both feature selection (Jones, 1971; Brown and Owen, 1967; etc.) and dimension reduction (Shepard, 1962, etc.) are important aspects of the design of a shape recognition system. The main guides to feature selection are: (l) features should be invariant to irrelevant variations, such as limited translation, rotation, changes of scale, etc., (2) objects must be pair-wise distinguishable, and (3) redundant features should be eliminated. Five major features and physical form measures most highly 2 correlated with these requirements are tabulated by Zusne (1970, p. 223). They are "Compactness," "Jaggedness ," "Skewness" of contours related to the Y-axis, "Skewness" of contours relative to the X-axis, and "Dominant Axis." However, several of these features are not independent of the orientation of the shape. In Section U.2, 19 selected features are described which are collections of features extracted from three basic shape representations which were discussed in Chapter 3. Section k.3 is provided to show how some features depend upon encoding parameters. Some special features Jones (1971) classified measures into 13 major features. 2 The parameters of dispersion, symmetry, and elongation are considered to be analogs of the second, third, and fourth moments of distribution of the area of a shape (p. 215). 87 which should be extracted for shape perception problems will be discussed in Section h.k. k .2 Description of Specified Features In Table k .1 we list 19 features which are independent of the orientation of a shape, except features Fk , F5, and F6 . These features are valid not only for a class of polygonal shapes, but also for the closely allied class of polynomial curved shapes. Not all these 19 features are equally extracted easily from all three shape representations previously discussed in Chapter 3; however, this is a comprehensive collection of features extracted from three basic shape representations: polygonal, pattern sequence ,' and skeleton representations. In Table k.2 we show which of these features are extracted from polygonal, pattern sequence, and skeleton representations, respectively. Any of these features can be extracted from each shape representa- tion. Features extracted from polygonal representation can be extracted from skeleton representation and vice versa, by transforming from one representa- tion to the other. However, we select features for each representation which are dependent upon the representation itself and are easy to extract directly from the representation. Some of these features in Table U.l are self-evident, and for those which are not, brief explanations are given below. We will explain features in terms of pattern sequence representation; however, the correspondence between the pattern sequence and skeleton representation becomes clear if we interprete the radii sequence Fvt(y) i- n Section 3.^.1 to a pattern sequence S (x^) in Section 3.2.1. r 88 NO. NAME Fl PERIMETER F2 AREA F3 PERIM/AREA FU M2 F5 M3 F6 Mil FT MEAN-R F8 MA.XAMP F9 DEV-R FIO ALT# Fll MAXIMAL/? F12 DEV-E F13 CONVEXITY FlU SYMMETRY F15 DEG-AS Fl6 DEG-F FIT DEV-ANG Fl8 ANG# F19 VERTEX# Table k.l. Selected Shape Features DESCRIPTION Perimeter P (unnormalized) Area A (unnormalized) (1 - 2/^A/P) Degree of variance Degree of skevness Degree of elongation Mean of unnormalized elementary patterns Difference between the maximum and the minimum in elementary patterns* Deviation of elementary patterns* Number of alternations in a pattern sequence wrt the mean Number of local maximal points in a pattern sequence Deviation of edge lengths* Degree of convexity Degree of symmetry Degree of angular simplicity Degree of feasibility Deviation of angles Number of angles less than i\/h and greater than l^/h Number of vertices in a polygonal representation * Each elementary pattern sequence is normalized by the mean of elementary patterns: FT ( MEAN-R ) . 89 Table k .2 . Extracted Features from Each Shape Representation POLYGONAL PATTERN SEQ. SKELETON REP. REP. REP. Fl * F2 * F3 * FU * F5 * F6 * F7 F8 F9 F10 Fll F12 * F13 FlU F15 Fl6 FIT * Fl8 * F19 * 90 Features Fk , F5 , and F6 : these correspond to the second, third, and fourth areal moments, respectively, where each moment has two values — the moment with respect to the X-axis and the moment with respect to the Y-axis. Feature F10 : half of the number of times the sequence S(x ) crosses the circle of radius s. For the polygon which is illustrated in Figure U.l(a), the number of alterations is k. Feature Fll: the number of local maxima in the sequence of S(x ) minus the number of alternations in F10. (Because the sequence is closed, the number of local maxima is identical to the number of local minima). In the example of Figure k.l, elementary patterns s , s, , Sq, and s correspond to local maxima, and s , s^, s , and s , correspond to local minima. Thus, this feature value is zero. Feature F12 : /z (e. - e) /(number of arcs) where distances between J J adjacent pairs of extrema (maximal and minimal points) are considered to be the e.'s. For example, in Figure H.l(c), s s is an edge and s s + s s p is a curve length for the shape of Figure U.l(b). In other words, the distance between two adjacent extrema or the arc length between two adjacent extrema is interpreted as e . . e is the mean of e.'s. J J Feature F13 : N /the circularity N, where N is determined in the following manner. For each pair of adjacent maxima, we assume an edge — an example is illustrated in Figure U.l(a) as a dotted straight line. Then we count the number of elementary patterns between the maxima whose elementary pattern length is greater than the distance between the encoding point and the edge. We repeat this process for all possible adjacent maxima. For example, in Figure U.l(a), 91 (a) A Polygon and a Circle of Radius s (b) A Polynomial Curved Shape S S t S 2 S3 S 4 S 5 S 6 S7 Sg S 9 S 10 Su S 12 S13 Sj4 S15 (c) The Sequence, S (x ), of the Above Poly V gon Figure k.l. Examples of Shapes Arid Their Pattern Sequences, S (x ) V 0' 92 elementary patterns s and s are greater than the distance to the edge. Each maximal point is also counted. Thus, we get the degree of convexity for both shapes, (a) and (b), as h/5. In Figure U.l(c), those elementary patterns which contribute to the degree of convexity are connected by bold lines. If we double the circularity N, i.e., 32, then the degree of convexity becomes 5/8. Feature FlU: the degree of symmetry for S (x ) is defined in the following manner. Let N be the number of elementary patterns which satisfy Is . - s . I < . lU* max( s . , s . ) 1 r+i r-i ' — r+i r-i for i = 1,2,..., N/2-1, where r+i and r-i are interpreted (mod N) for all i. Then the degree of symmetry is given by max {N }/half of the circularity N r r Thus, this feature value ranges between and 1. Feature F15: the degree of angular simplicity is defined by the ratio between the area covered by a primal pattern sequence and the actual area of the object. Thus, this feature value ranges between and 1. Feature Fl6 : the degree of feasibility is defined by the ratio between the area of the feasible region and the actual area of the object. This feature assumes values between and 1, and it is for any nonangularly simple region and 1 for convex regions. 3 The parameter of symmetry is considered to be the analog of the third moment of distribution of area (Brown and Owen, 1967). 93 k.3 Feature Dependence on Encoding Shapes This section is provided to show how some features: MAXAMP, MEAN-R, DEV-R, and PERIM/AREA in Table k.l, depend upon encodings. Those encoding parameters which we are concerned with here are: (i) array cell sizes, (ii) white/black encoding, (iii) primal, dual, or edge-following encoding, and (iv) circularity N, where the shape representation used for the measurement of feature dependence is the pattern sequence representation. Let us assume that a two-dimensional shape is given as an image array. An example of such a shape is illustrated in Figure U.2(a), where we assume the shape described here is the originally given accurate shape. In Figures (b), (c), and (d) we illustrate the black-boundary and white-boundary of the shape in (a) for different choices of array cell size denoted by 1 x 1, 2 x 2, and k x k, respectively. Thus, intuitively, it may be concluded that for a larger cell size, an object can be represented accurately, However, concerning boundaries, we can say nothing about whether a black or a white boundary is preferable with respect to feature values. In Table k.3 we show absolute errors for various features under the variation of array cell sizes, white-black boundaries, and primal, dual, or edge-following encoding methods. In this table we assumed that the feature values for a geometric boundary are true values for fixed circularity N = U8. From this table we understand that errors decrease as array cell size increases. Moreover, at least for a fixed circularity, whatever array cell sizes are used, absolute errors for black-boundary encoding are less than the absolute errors by white -boundary encoding. That is, the black-boundary 9h i i % M P v// W/ % SI (a) (b) 1x1 £1S 7rnY -^^rr^ZAJE—- *_5_. 7 r 17 ^ — -i Y "IT A:*! 7rV __j_ iizs~ V 3-Y- _,_:iZ Zr^ _ l_l S^E— — :___ j: -^ — -.V ;: - 5J_v_. v ~~ _- v 5JE — ±2 N ^ -^ — 5^ 2^— -i V _______ , = =:::i:!!iS53_=.sss«ii,j::: II - -_=- *t 7^_. — -^ 5 ^Y __. — ^ Dual E.F. .00 .00 .19 .19 .05 .05 .09 .09 .03 .03 .01 .02 .01+ .02 Primal .00 .1+7 .27 .25 .05 .09 .00 < On Dual .00 M .2k .29 .06 .13 .00 E.F. .00 M .22 .28 .05 .13 .01 Table k.k. Absolute Errors for Feature MEAN-R "for Different Circularity N, Feature Values for Geometric Boundary are Assumed to be True Values. Image Array Size is 2 x 2 N Primal Dual E. F. White Black White Black White Black 96 .09 .05 .07 .06 .05 .03 72 .08 .05 .06 .05 .01+ .03 1+8 .09 .01+ .06 .05 .01+ .03 36 .07 .06 .06 .05 .01+ .03 21+ .15 .01 .13 .03 .05 .03 18 .09 .03 .08 .03 .06 .02' 12 .21 .02 .15 .03 .05 .02 6 .03 .06 .03 .06 .13 .01 96 encoding is less influenced by the array cell size than the white- boundary encoding. To see how encodings, primal, dual, and edge-following, affect feature values, absolute errors for features, MEAN-R and DEV-R, are tabulated in Table k.k and Table U.5. Here feature values of the geometric boundary with corresponding encoding are assumed to be true feature values for each circularity N. From these two tables, it becomes clear that whichever primal, dual, or edge-following encoding method is used, the absolute error by black-boundary is less than the absolute error by white- boundary for various circularities. Because of its heavy dependence of encoding errors on individual shape — unless a shape to be encoded is analyzed — it is not appropriate to say which of the three encodings is preferable for a general shape representation. In Table k.6 absolute errors for features MEAN-R and DEV-R are tabulated for a black-boundary where feature values for circularity N = 96 are assumed to be true values for each encoding. From this table it seems very likely that absolute errors of edge-following encoding are less affected by the change of circularity N, especially when N >_ 2k. It is not appropriate to say which of the primal or dual encoding methods is preferable, since two generated sequences describe different regions, especially when the encoded object is not in the class of 1-AS-polygons . In Figure k.3 two relatively complex polygonal regions called "Embryo" and "Sleeping Cat" are illustrated. Using these two shapes, let us consider how the circularity N affects feature values. In Tables k.J and k.Q absolute errors for features of Embryo and Sleeping Cat are tabulated, respectively, where for each table feature values at circularity N = 96 are assumed to be true values. In Figures k.k and h. -5 some of the errors tabulated in Tables k."J and k.Q are plotted to see how circularity affects 97 Table 1+.5. Absolute Errors for Feature DEV-R for a Different Circularity N, Feature Values of Geometric Boundary are Assumed to be True Values. Image Array Size is 2 x 2 N Primal Dual E.I i White Black White Black White Black 96 .07 .00 .08 .0U .08 .03 72 .06 .02 .08 .05 .08 .01+ 1+8 .08 .01 .09 .03 .09 .03 36 .Ok .02 .07 .00 .08 .02 2k .Ok .00 .08 .02 .09 • 05 18 .01 .ok .05 .05 .06 .10 12 .18 .20 .07 .00 .05 .09 6 .03 .ok .03 .01+ .19 .23 Table 1+.6. Absolute Errors for Features MEAN-R and DEV-R. Feature Values at Circularity N = 96 are Assumed to be True Values for Each Encoding, where Image Array Size is 2 x 2 and Black-Boundary is used. MEAN-R DEV-R N Primal Dual E.F. Primal Dual E.F. 96 .00 .00 .00 .00 .00 .00 72 .00 .01 .00 .00 .02 .00 1+8 .01 .01 .00 .01 .01 .00 36 .00 .01 .00 .01 .02 .01 2k .03 .03 .03 .01 .01 .00 18 .11 .12 .03 .07 .10 .01 12 .00 .03 .01+ .08 .02 .01+ 6 .25 .22 .29 .11+ .17 .03 98 oding Point (a) Embryo Encoding Point (b) Sleeping Cat Figure k.3. Polygons: Embryo and Sleeping Cat 99 Table h.J. Absolute Errors for Features of Embryo, where Feature Values at Circularity N = 96 are Assumed to be True Values N MEAN-R DEV-R Primal Dual E.F. Primal Dual E.F. 96 .00 .00 .00 .00 .00 .00 72 .01 .01 .00 .00 .01 .00 1*8 .00 .01 .01 .01 .00 .01 36 .02 .00 .00 .00 .02 .00 2k .02 .02 .02 .00 .01 .01 18 .03 .01+ .01 .03 .06 .00 12 .02 .09 .05 .03 .00 .02 6 .3k .12 .05 .02 .15 .08 Table U.8. Absolute Errors for Features of Sleeping Cat, where Feature Values at Circularity N = 96 are Assumed to be True Values N MEAN-R DEV-R Primal Dual E.F. Primal Dual E.F. 96 .00 .00 .00 .00 .00 .00 72 .01 .02 .01 .00 .01 .00 1+8 .0U .01+ .01 ,0k .01+ .01 36 .03 .06 .01 .05 .07 .00 2k .03 .03 .01 .07 .02 .02 18 .06 .08 .01 • 07 .09 .00 12 .08 .07 .02 .23 .11+ .07 6 • 19 .01+ .10 .27 .31+ .27 100 0.4 0.3 a: o a: UJ .0.2 0.1 0.0 -• PRIMAL X -X DUAL O O E-F 20 ^*^T 40 60 CIRCULARITY N 80 100 Figure k .k . Errors v.s. Circularity N of Embryo: DEV-R 101 PRIMAL 20 40 60 CIRCULARITY N 80 100 Figure U.5. Errors v.s. Circularity N of Sleeping Cat: MEAN-R 102 features. From these tables and figures it can be said that feature values are not significantly affected by circularity. In other words, for N > 2h feature Yalues are stable and errors may be negligible. By the above arguments we may make the following summary: (i) Errors decrease as array cell size increases, (ii) Black-boundary encoding is preferable to white-boundary encoding, (iii) Errors on edge-following are less affected by the change of circularity than those of black or white boundaries, (iv) In general, errors increase as circularity decreases, and those errors for circularity N >_ 2U do not seem significant; of course, this is dependent on the complexity cf figures. h. h Extraction of Local Features U.U.I Preliminary Discussion In Section U.2 we introduced 19 features which were extracted from three shape representations: polygonal, pattern sequence, and skeleton representations. These features are some kind of physical measures of an object, and unfortunately, they do not directly describe its shape information. In a sense, the features in Table U.l are global features of a shape, and may not describe local features such as lobes, cavities, and roughness on curves . Let us consider the shapes illustrated in Figure U.6. Some interesting local features on each shape are indicated with a dotted circle where these local features may be called "lobes," "cavities," and "roughness," respectively. It is clear that humans detect these local features. Thus, for the description of objects, such features should also 103 (a) Local Lobe (b) Local Cavity (c) Local Roughness Figure U.6. Shapes with Interesting Local Features 10U be considered as well as those in Table k.l. Necks, tails, and hairs of shapes are also considered to be local features and have been discussed in Section 3. ^.2. In the following sections we will describe an approach for the extraction of each local feature from rather simple regions as shown in Figure H.6. Some other local features which are not discussed in the following sections, however, can be considered as significant local features for shape perception problems and are illustrated in Figure k.'J. For example, let us assume that a given object is represented by a pattern sequence S(x ). Then the extraction of local features may correspond to the determination of subsequences in S(x _) based on how and where the human pays attention. In other words, we are interested in determining a subsequence of a curve where a human eye's fixation time is longer than for another. k.k.2 Determination of Local Lobes We assume the bi-pattern sequences for the object to be encoded. We generate two kinds of pattern sequences: a primal pattern sequence S(x ) and a dual pattern sequence S'(x ). S ^ x = S 0' s i' #, *> S j_'---> s n_i' S{x ) = s^, s^,..., s^,..., s^, S ' ± = R -7 ± for all i. It is clear from the definition of these pattern sequences that s' > s for all i — 1 i, and this is denoted by S '(x ) r> S(x ) . An example of these sequences can - be found in Figure 3.11. If S'(x ) = S(x ), then the encoded region is 1-angularly simple, and the approach for the determination of local cavities in the following section will be considered for the determination of local lobes. 105 Figure h .J . Other Examples of Local Features 106 Let S' be given by O "— o. -i S . , . . . - S . » . . . . S . 11 1 1 1 2 j m which is a sub-sequence of S(x ) such that 6. +s~. =R, s. +7. = R 11 mm and s . + a ^ R for j = 2,..., m-1 . where R is the radius of a circle which contains the object completely. Then a sub-sequence S", which contains S', is a sub-sequence where some local lobes may be extracted. By this approach, the region or curvature indicated by a dotted circle of Figure U.6(a) can be determined. Let |S'| denote the length or cardinality of a sub-sequence S' , and let us defind D and D as follows: D = | S ' | /N and D = 1 - D . Then we assume that the greater the difference between D and D , the more the human tries to keep his attention on such a shape and tries to extract local features. This may be called the local oddness of a shape (c.f., odd shape detection in Maruyama, 1971)- This implies the determination of a portion of a curve on a shape which is different from the majority of curves, H.4.3 Determination of Local Cavities To detect the portion of the shape of Figure U.6(b) which is indicated by a dotted circle, we consider a sequence H which is derived from S(x_) in the following manner: 107 H = h Q , h 1 ,... 1 h.,..., h N _ lS where h . = I s . , _, - s.l + Is. - s. I for i = 1 , . . . , N-2 1 ' l+l i' ' i l-l 1 and h Q = \s ± - s Q | + |s - s^l, Vl = ' S " S N-ll + l S N-l " S N-2 ' * The H sequence magnifies elementary patterns which correspond to vertices on the closed curve. A sub-sequence H' of H can "be determined in the following way" h. e H' such that h. > T(h) , l l where T is a threshold based on h, N-l h - (£ h.)/N. i=0 X Thus, by determining the subscripts of H' it is possible to determine which of the elementary patterns s.'s in S(x ) denote local cavities U.U.U Determination of Local Roughness The dot-circled portion of the shape in Figure k.6(c) may be extracted by determining the angle changes between the adjacent vectors on the curve if the curve is smooth. This is illustrated in Figure U.8(a). To determine the rough portion on a polygonal curve, we distribute N number of points p.'s on the curve y in an equi -interval manner as mentioned in Section 3.h. And at each point p. on y , we consider a disc D of radius R. Then we measure the length 1. of the curve segments K l contained in the disc D^ at each point p.. R l L 1 Q , 1 ,..., l i ,... , 1 N _. 108 (a) Angle Changes on a Smooth Curve (t>) An Approach for Measuring Roughness Figure k.Q. Extraction of Local Roughness 109 This approach for the detection of the sequence L is illustrated in Figure U.8(b). We normalize l.'s in such a way that max {1'.} = lwhere i 1 ! = I . -2R so that we can consider the distribution of l.'s "between values *i l l and 1 as illustrated in Figure U.9. Distributions around imply smoothness, and distributions around 1 imply roughness, so smoothness or roughness can be determined by the distribution frequencies. This distribution is called SR-di st ribution . Some examples of SR-di st ribution are illustrated in Figure U.10 where one shape consists only of smooth curves and the other consists only of rough curves. Other examples are illustrated in Figures U.ll(a) and (b) where the roughness and smoothness between them is completely opposite. From these examples it becomes clear that SR-distribution can be used for the determination of local features. This can be used for some others such as a solution for odd shape detection, shape clustering, and a quick matching-in or quick shape extraction problem. Figure U.12 shows an example of how shape extraction can be accelerated by considering SR-distribution on curves. 110 m ox {I,} min [lj| +- Rough Figure k.9. SR-distribution: Smoothness and Roughness 1 a. (a) Smooth Curve (b) Rough Curve Figure U.10. Examples of SR-distribution Ill (a) (b) £222 Figure U.ll. Examples of SR-distri"bution Figure U.12. An Application of SR-distribution for Shape Extraction 112 5. DETECTION OF ODD (SIMILAR) SHAPES In the previous chapters we have discussed shape representation and feature extraction. As an application of these — a procedure to determine an odd shape or form in a given set of shapes— each set consisting of more than two shapes will be described. Here we utilize the pattern sequence represen- tation as shape description. In Section 5-1 the "oddness" of a shape in a set of shapes on the basis of a specified feature is defined, and then the "most odd" shape on the basis of many detected features is defined. A complete description of the system for odd shape detection appears in Section 5.2. To make the system response like an average human response, modification of an assumed weight vector by means of an "attention mechanism" is considered. Section 5.3 is provided to show the performance of the system by comparing the system response with the average human response. 5-1 Basic Definitions Let us first define discrimination within a given set of shapes on the basis of a single specified feature. We assume that the given set of shapes, ft , ft , . .., ft., . .., ft , to be presented to the subject (man /machine) consists of more than two shapes, i.e., n >_ 3. The specified feature, C, takes test results x , x , ..., x for the n shapes respectively. Definition 1: Odd Shape /Form on C We define the D- interval , d., for each ft. as: J J n n d. = max {x, } - min (x). J k=l K k=l K Then we define ft . as an odd. shape/form in the given set of n shapes on J 113 1 n the "basis of C if d. = min {d, }. J k=l ^ The following is an alternative definition of oddness based on a single specified feature: Definition 2: Odd Shape /Form on C We define deviation, d. for each ft. as: J J n n d. = zl x, - x. where x. = ( JJx^ )/(n-l) J k=l K J J k=l^ ¥i ¥i Again we define ft . as an odd shape/form in the given set of shapes on J n the basis of feature C if d. = min {cL }. J k=l ^ To see how the above definition of oddness reflects human response on the basis of a specified feature, let us consider the examples illustrated in Figure 5.1. We assume, for simplicity, that the specified features are the number of vertices and convexity. Table 5-1 shows the extracted feature values for the specified two features, and Table 5.2 shows the D-intervals, where we assume that the convex shape is denoted with value 1 and the concave one with 0. For the set A, ft. is said to he an odd shape on the basis of the number of vertices, while ft would be chosen as an odd shape on the basis of convexity since d = is the minimum. For set B, the number of vertices is not signifi- cant for the decision of oddness, since d., j = 1, 2, 3, h, assume identical J values. Similarly, for set B, the feature of convexity does not play any significant role for the decision of oddness since ft and ft p are concave and ft and ft, are convex. it is easy to see that d. is minimum only if x. is an extremal. llU a a. a- a, (a) Set A of Four Shapes a. a< a- a, (b) Set B of Four Shapes Figure 5.1. Trial Sets of Shapes 115 Table 5*1 • Feature Values of Shapes in Figure 5.1. flj tt 2 «3 «u Set A Number of Vertices 5 6 6 3 Convexity 1 l i Set B Number of Vertices 5 It 5 U Convexity 1 1 Table 5-2. D-Interval by Definition 1 i n 2 3 n Set A Number of Vertices 3 3 3 i* Convexity 1 1 0* i Set B Number of Vertices 1 1 1 i Convexity 1 1 1 i *This indicates the detected odd shape in the given set of shapes with respect to the single specified feature. 11C Table 5.3. Deviation by Definition 2 1 2 n 3 \ Set A Number of Vertices h 10/3 10/3 k/3* Convexity- U/3 U/3 0* h/3 Set B Number of Vertices U/3 U/3 h/3 k/3 Convexity U/3 U/3 h/3 k/3 Table 5.3 shows the deviation for shapes which are illustrated in Figure 5.1. Comparing Table 5*2 and Table 5-3, we see there is no significant difference between the odd shapes detected by Definition 1 and those detected by Definition 2. The only difference that appears in these two tables is in the very first rows for the set A. If we consider detection of two shapes from each set of shapes, one for the oddest shape and the other for the second oddest shape, then the Deviation approach gives the second oddest shape as either fi_ or Q_ for the set A on the basis of the number of vertices. However, the D-interval approach gives the second oddest shape as either Q , fl , or Q . At this stage, we have no criteria for which definition of oddness is preferable, especially when the number, n, of shapes is small (about h to 6). However, as n becomes large, then Definition 2 is preferable to Definition 1. These two definitions become identical when n is 3. Thus far we have discussed oddness of shape on the basis of a single feature, C. We will now expand the above definitions to enable choice of oddness on the basis of m features , C. , C^ , . . . , C . , . . . , C . ' 1' 2' i' ' m 117 Definition 3: Odd Shape/Form on C , . .., C., ..., C m Let us assume that d. . denotes the D-interval or deviation of the J. J j-th shape, Q . , on the basis of feature C\, and let J k/s - H. = v£ (W.d ) for ,5=1,2 n J i=l 1 J Then the shape ft is said to he the most odd shape in the given set of n shapes if H. = min {H. } Here W = (W_, , .... W. , .... W ), W. > 0, is called the modified weight vector, limi — which can he adaptively varied and will be discussed in Section 5.2.2. In the above definition, if K=l, then the method is called the Linear 2 Separation (LS); if K=2 , then this is called the Euclidean Distance Separation (EDS); and if K>_3, it is called the K-th degree Separation . Now we are ready to expand Definition 3 to determine similarity between a shape ft and each shape ft., j=l, ..., n. We define difference d. . for a specified feature C. as follows: d. . denotes the difference between ft and ft. i ij J on the basis of feature C. l d. . = |x. - x. . , x. : feature value of ft on C. , i l x. . : feature value of ft . on C. ij J i Then H . , defined in Definition 3, defines the most similar shape to ft as ft. if J J 2 Attneave (1950 ) asserts that psychological distances could not be fitted in Euclidean space since the psychological difference between any two stimuli is approximately equal to the sum of their differences with respect to the physical variables — not to the square root of the sum of the squared differences as demanded by an Euclidean space. A comparison between a Multiple Regression and Multiple Discriminant models can be found in Jones (1971). 118 n H. = min {H, }. J k=l k (k) These values H.'s correspond to the real values r. . assigned by function f\ i ij i which will be discussed in Section T.I- From the above argument, the detection of similar shape becomes identical to the detection of odd shape by assuming d. . denotes the difference instead of the D-interval or deviation. Thus, hereafter we consider only the problem of detecting odd shapes. 5.2. A System to Detect Odd Form We previously outlined our model for the detection of an odd shape/ form among a given set of shapes. This section develops the model rigorously. The block diagram of our model is illustrated in Figure 5.2, and we shall dis- cuss here the role of each block. Let us assume that for shapes ft., j=l, ..., n, that X., j=l, ... s n, are feature vectors of dimension m, respectively, where: X = - ( X , X , . . . , X , . . . , X ) = [x. . ] , x. . e R ij mxn' ij m: the number of features considered, n: the number of shapes presented, x. . : the feature value of the j-th shape l i ft . on the basis of the i-th feature, J 5.2.1 Analysis A D-Table is defined as : d = [a..] ij mxn where, Case 1 (Using Definition 1 of D-interval) w 119 X i X 2 T U)' (2) ATTENTION a WEIGHT MODIFICATION _1_ (1) ANALYSIS W L (3) DECISION a SELECTION OUTPUT j €{l,2,3,---,n} Figure 5.2. A System to Detect An Odd Shape/Form Amon* A Given Set of N Shapes 120 * v A d! . = d. . - d. . where d. . = Max {x., } ij xj ij ij k ^, ik V d. . = min {x. n } . 1J k^j lk Case 2 (Using Definition 2 of Deviation) n a* ■ TT j x., - x., ij ^i lk x j n where x. . = ( V* x )/(n-l) 1J k=l lk Thus, d! . denotes the original D-interval or deviation for the shape ft. on the basis of the i-th feature, C. Since we are dealing with m different features, we will normalize the D Table, i.e., d. . = d! ,/d. where d. = (Vd! -)/n. This ■'■J U ' u is quite important because each individual feature value is evaluated by quite different scales. The T Table, which is derived from the D Table, may be used to save storage for our model: T = [t..] ij mxn (1 if d. . = min {d. n } xj k xk otherwise Thus t. . is 1 if and only if shape ft. is defined as an odd shape, by either Definition 1 or 2 , on the basis of the i-th feature C. 5.2.2 Attention and Weight Modification Since humans pay attention to some features more than others (these attention centering features varying with the object), we will introduce a mecha- nism, called the attention mechanism, into our model. We call is a mapping from the D table into a vector A of 121 dimension m, called the attention vector. Thus $ : D + A where A = (A n , A . .... A. , . . . , A ) 1' 2' x m A. e R called the attention factor for the l i-th feature detection. Note that <$>(D) assumes different values for different sets of shapes. Without loss of generality, we write A. = . (d..., d. oJ ..., d.., ..., d. ) l Y i 11' i2' ij in for i = 1, 2, . .., m. To simplify our model we assume that $. = $ for all i, Then the modified weight vector W W = (w lt w 2 , ..., W., ..., w m ) is defined as W. = oj.A. for all i where l 11 u ~ (u 1 , Wo, . .. , uk, .. . , u m ) is the originally given (or assumed) weight vector (e.g., go = (l, 1, ..., l) implies that all features are considered to be equally significant). Let us give a simple example of such an attention function c|> ; : (a., 8.) -♦- A. for all i where a. = min {d., }, ill l , lk k 8. = min {d. n \4a. }. l , ik' i k More explicitly A. = $ (e i - a.) and has a property such that whenever 8. - a. > 8.' - a.' then 1111 <)> (8. - a.) > $ (8.' - a.') holds, l l — r l l 122 5.2.3 Decision and Selection The role of this block is to form a vector H of n dimensions and then select an odd shape according to Definition 3. H - (H , Hg, ..., H , ..., H ) J where: Case 1 (D table is used) K fm ; K rw J v i=l Case 2 (T Table is used) K rm ~ H. ■ 7E (W.t ) , K>1. i=l The selection of the most odd shape or form is accomplished in the following manner : Case 1 m £1 is the first choice odd shape if H. = min {H.} k i \ 3-1 J n P. is the second choice odd shape if H = min {H.|^H, } K 2 K 2 j=l J k l Case 2 n P» n is the first choice odd shape if H n = max {H.} k l k l j=l J n fi, is the second choice odd shape if H, = max {H.|^H } *2 k 2 o=l J K l 123 5.3 Experimental Results Here we compare human response and our system response to the task of detecting an odd shape from a given set of four shapes. The data stimuli, 20 sets of four shapes each, are illustrated in Appendix B. 5.3.1 Human Response To get an average human response to the 20 sets of shapes, we obtained exactly 100 student samples on campus at the University of Illinois . We asked each student to pick an odd shape from each set of shapes and we counted the frequency of response for each shape. Table 5.^ shows the human response for each set. For the first set of shapes, for example, 57 students chose the second shape, ft , as an odd shape, 22 students chose Q as an odd shape, and IT students chose ft as an odd shape. Thus the human response for the first set of shapes is ft as the oddest shape and ft or ft as the second oddest shape. This response is denoted as ft^/ft , ft- (for the sake of simplicity we write 2/1,3). Considering the 7-th set of shapes in Appendix A, 0_ , ft , and ft, are identical except for their rotation. Therefore, shape ft should be chosen most frequently as the odd shape. According to Table 5^> 98 students chose ft and two students chose ft, as the odd shape. The average human response is ft and is denoted by 2/_ . 5.3.2 Attention Function We focus our discussion on the following two types of attention functions : (i) $( Y ) « Y , k > Y = /3. - (y) = [C, sin (77 + C 2 y) +y!q , where [X] means if X > 1 then $(y) = 1 and if X < then (y) = 0, otherwise (y) = X. 12U Table 5.1*. HUMAN RESPONSE: The Numbers Represent the Number of Times Each Shape was Chosen as Odd by the Students. The Average Human Response is i/j (fl./p.) "l Q 2 "3 Q l+ Average H.R. 1 IT 5T 22 1+ 2/1,3 2 1+6 5 1+8 1 1.3/. 3 1+1+ 12 35 9 1.3/. 4 1 9T 2 3/_ 5 1 8 86 5 3L 6 9 65 1+ 22 2/1+ 7 98 2 2L 8 35 20 20 25 1.2 3.1+ 9 22 6 6T 5 3/1 10 2 11 31 56 4/3 11 88 11 1 1Z3 12 1+ 1+2 51+ 3,4/_ 13 1+1+ 1+2 1 13 1,2/4 lU 26 29 22 23 1.2 3,4 15 23 16 1+2 19 3/1,2,1+ 16 31] 35 1+ 30 1,2,3 IT 1+1+ 13 1 1+2 1,^/2 18 T 11+ 68 11 3/2,1+ 19 50 28 15 T 1/2,3 20 81 1+ 5 10 1/it 125 Figure 5-3 shows attention functions of the second type with C =0.2 and 5.23 ^ C > 3.93. We will see later — after running our model for various attention functions — that the second type of attention functions, which consist of trigonometric functions and linear functions with C = 0.2, are at least as good as those functions discussed above. 5.3.3 System Response To compare the behavior of the system with the average human response, we count the number of times the system response matches the average human response for the first choice of the odd shape. Let us first assume the weight vector No. 1 in Table 5-5, i.e., all these features which are considered for the system are equally significant. In the table weight, w. corresponds to the i-th feature of Table k.l. The weight co* is for a special feature which is extracted on the basis of a set of n shapes presented and is called an "interval difference sum." This feature reflects the degree of dissimilarity of (n-l) shapes. Let v s. denote the i-th elementary pattern of the k-th shape. Then the feature value of the j-th shape is defined as: W k k / „ (max{s.} - min{s.})/N. i-0 k^j X k^j X In Table 5.6 some system responses under different attention mechanisms are tabulated. Here weight vector No. 1 is assumed, and D and T indicate that the corresponding system response is determined by D table and T table, respectively. L26 Figure 5-3. Attention Function of the Form ♦ (y) = [0.2 sin(TT + C 2 y) + y]J. 3.93 < C 2 < 5.23 127 20 Table 5«5. Weight Vectors Weight Vector Weight Vector No. 1 No. 2 .617 .k62 .769 .381* .617 .U62 .307 .769 .617 .538 1.000 .38U '3 1 V 1 J 8 1 '9 1 '10 1 ) 11 1 12 1 13 1 19 1 * 1 Table 5.6. System Responses: Weight Vector No. 1 NO AM 1/2 Y -< 3 Y C =0.k ^=5.23 C =0.2 C;!;=^9 D T D T D T D T D T D T 70 65 70 65 75 75 75 75 75 70 75 75 From this table with no attention mechanism (NO AM), for example, the system response matches the average human response exactly Ik sets out of 20, i.e., 128 70% correct response by D table, while 65% correct response was given by T table. Let us consider the system response with an attention mechanism, •5 Y • Then the system response matches the average human response exactly 15 sets out of 20, i.e., 75% correct response. In general, the system response given by D table is at least as good as the system response given by T table. Furthermore, the system response with appropriate choice of an 1/2 attention mechanism (^ y ) is better than the system response without it. With an equally weighted vector, the system achieved 65% to 75% correct responses, i.e., 75% of the average human response was realized by the system. Those sets of five shapes for which the system did not match the average human response are Sets 1, k, 6, 15, and 16 of Appendix B. To improve the system response in terms of the average human response, let us consider the weight vector No. 2 in Table 5«5- This vector is obtained by solving a Linear Programming Problem to satisfy the average human response as closely as possible. The following is the approach to derive the vector. Let us consider the vector H in Section 5.2.3: H = ( H^ , . . . , H , . . . , H^ ) , where m H. = 7^ a), t. . for all j, w > for all i. Here u). is an unknown weight for the i-th feature, and we use T table instead of D table for analysis. Let us assume that the average human response for the k'-th set of n shapes is Q, , i.e., the k-th shape is determined as the most odd shape by the average human response. Then, without any modification of weight, the following set of inequalities should hold to respond 0, as the most odd shape by the system: 129 H - H. > for j 4 k. k j - This can be rewritten as C , CD >_ it Thus, if there are M number of stimuli, for each stimulus set we can derive the above set of inequalities. Consequently, we have a huge number of inequalities, each of them should be satisfied simultaneously for id to attain the exact average human responses. Such a huge system of inequalities can be written as : C cd >_ 0, a, u > o, k . — C.. aT > 0. M — Therefore, the problem of attaining the average human responses for M stimuli by the system becomes the problem of solving the above set of inequalities for cd . For 20 sets of data stimuli in Appendix B, we derive a set of inequalities using the average human responses in Table ^.h. The derived system contains 59 inequalities. Unfortunately, such a system turned oat to be infeasible, i.e., no cd exists to satisfy 59 inequalities simultaneously; _^. reduced to 20 inequalities, the problem was solved for a feasible solution cd. The weight vector No. 2 was obtained by this manner. In Table 5-7 some system responses attained with the weight vector No . 2 are shown . Table 5.7. System Responses: Weight Vector No. 2 NO AM 1/2 Y Y 3 Y C =0.U 0^=5.23 C =0.2 C^=U.U9 D T D T D T D T D T D T 75 75 75 75 85 80 90 80 8? 80 90 80 130 Without an attention mechanism the system achieved 75% correct response of the average human response. With attention mechanism, C =0.2, and C p = U.i+9, the system achieved 90% correct response with D table, which is a significant improvement compared to the system response given "by Table 5-6. Those sets of shapes whose average human response could not be achieved by the system are Sets 6 and 15 of Appendix B. For Set 6 the average human selects the second shape from the left while the system selects the fourth shape. For Set 15 the average human selects the third shape from the left while the system selects the fourth shape as the most odd shape. Many other system responses under various assumptions of system variables appear in Maruyama (1971 c). Since none of the students responded exactly the same as the average human responded, their correct response ranged from 65% to 90% (mostly 80% to 85%). Therefore, we see that the system response given by our model reflects the average human response on the odd shape/form detection. 5 .3>b Summary From our experiments we conclude that we can achieve a wide variety of system responses using different assumptions: different separation methods definitions of oddness, weight vectors and attention mechanisms. Overall experimental results are summarized as follows : (i) For the analysis of the system, the LS model is preferable to the EDS model because of its simplicity. According to our computational results, the two models have almost no significant difference in system response . Thus , we may conclude that the Linear System is an adequate model for human behavioral psychology (see Maruyama 1971c). 131 (ii) The system response given in terms of the D table is better than the system response given in terms of the T table regardless of the other parts of the system (e.g. weight vector, attention mechanism, separation method, etc.). The system analysis for the T table, however, is simpler than for the D table. (iii) We see that the system response using some kind of attention mechanism (in which the attention factor, A., is non-decreasing on the value 8. - a.) is better than the system response using no such mechanism. Thus, we may conclude that the attention function, <|> , : (8 i - a ± ) -> A i such that 8. -a. > 8.' - a! implies (8. -a.) > (8! - a!), which modifies l i l l l i=i l the given weight vector, is an adequate choice for our model. 132 6. GRAPHS AND GRAPH TRANSFORMATIONS 6.1 Graph Representation Graphs are widely used for the convenient representation of the two- dimensional geometry implicit in a picture which consists of many regions. By using a graph representation, many problems of picture interpretation become feasible with provision of appropriate heuristic algorithms. The study of Guzman (1968) demonstrated the significance of graphical representation of two- dimensional pictures, using this representation to analyze and cluster planar regions into three-dimensional objects by extracting certain properties on the vertices. Eastman (1970 ) finds a need for a better descriptive-capability to define the spatial relationships essential for two-dimensional space planning. Here again, a graph representation is evoked. Let us construct graphs for the picture of Figure 6.1(a), which con- sists of four regions (5 regions if we include the background as a region). Corresponding graphs (regions-nodes, regions-arcs, and arcs-nodes graphs) are exemplified in Figures 6.1(c), (d), and (e) respectively, using name informa- tion of (b) Which of these graphs should be used is dependent upon the kind of problem to be solved as well as the choice of heuristics. Usually graphs are described in various matrix forms ; however, it is convenient to describe a graph in a sequential manner. In that case, the following graph representation is useful. Let us again consider the picture of Figure 6.1(a), and let us assume that its regions, arcs, and nodes have been assigned names as (b). Each region is denoted in a way to preserve the shape of the region, and each node, i.e., intersection of more than one arc, is de- noted by the values of its X, Y coordinates. "TVlaruyama (1968) provides a graph I/O package which enables one to transform one matrix representation of a graph to other matrix representations. 133 (a) A Picture of Four Regions (b) Naming (Nodes-Arcs (c) Regions-Nodes (d) Regions-Arcs (e) Arcs-Nodes (f) Regions-Relations Figure 6.1. Examples of Graph Representation of a Picture 13U Let us assume the following expression for each node: ({()}*), 1 j K. where , V V. = node name = node name different from V = arc name ;= region name and its attributes the following semantics are implied: i) nodes V.'s are direct neighbors of the node V, ii) nodes V and V. are connected with an arc a., i J iii) region fi, is (partly) bounded by arc a., and ft is assumed by k j k a. in counter-clockwise direction at node V, and J iv) the sequence of V.'s is cyclic. By the application of the above representation, the picture of Figure 6.1(a) can be described as follows: V l ^ V l* ^ a 3 s %^ 9 V 2 ^ a l 9 *V' Y 3 ^ a 2» fi 2^ ^ V 2 ^ V h ^7' ^V' % ^ a 6 9 ^V' V 3 ^ a l+' ^V' V l ^ a l' %^ ^ v 3 ^ v l4 ^ a 5> fi 2^' V l ^ a 2' ^V> V 2 ^% 9 *V ^ V l* ^ V l ^ a 3' ^V 9 V 3 ^ a 5 s ^V' V 2 ^ a 6' %^ 9 V 2 ^ a T' fi 0^ ^ In the above expression, nodes V.'s are sequentially scanned in counter-clockwise order about node V. Thus the sequence of V.'s is cyclic. For example, the following two sequences are semantically identical: V l W ^ a 3' ^V' V 2 ^ a l' ^1^' V 3 ^ a 2' °2^ ^ V l ^ V 2 ^ a i' ^V' V 3 ^ a 2' fi 2^' Y h ^ a 3' ^V ^' From sequence expressions of each node, we can derive graphs which have been described previously. If arcs are simplified and approximated by 135 straight-line segments, then the above expression can be simplified by elimina- ting arc information. For example, V l ^ V U ^ a 3' n 0^' V 2 ^ a i' fi l^» V 3 ^ a 2' *V ^ can be simplified into v ± (v u (n Q ), v 2 (^j, v 3 (n 2 ) ) by assuming v and v., v and v and v and v are connected with straight line segments. This type of representation of pictures will be discussed extensively in the following chapter. 6.2 Relation Graphs Recently, Schwebel (1972) proposed a graph structure model for global level picture processing. He defines general graph structures and graph struc- ture transformations which are general mappings between structures. Graph- theoretic clustering techniques operate on relational graphs, attempting to merge nodes according to the relations between pairs of nodes. Zahn (l9Tl) considers clustering on graphs using weights and by forming a minimum spanning tree on a graph. Brice and Fennema (1970 ) apply region-merging heuristics. Explicit studies of graph transformations become necessary for structur- al inference. This area was initiated by McCormick (1967) in his presentation of illustrative transformations and his emphasis upon the concept of reversibi- lity. These ideas were further considered in Schwebel (1969) and led to defini- tions of properties of relations for composite formation in Schwebel and McCormick (1970). Schwebel' s study covers a wide area of the properties involved in graph structures, and it led to his definition of language, called Structure Operation 136 Language (SOL). SOL is proposed in order to simply and conveniently express operations on multi-relational graph structures which allow an arbitrary number of variables to be associated with any structure element. In this section we describe some relations. Some algorithms of extracting relations from a given picture and means to construct graphs are also described. A graph G, which is mixed and labeled, consists of a set of nodes V and a set of branches B, where G = (V, B) B C V x V : product space. Schwebel's graph structure, GS, was represented by a sextuple. We also assume a function g on B, such that for each element b in B, g(b) e R, where R is a set of relations between nodes. We consider regions-relation graphs, i.e., we assume that each node of a graph represents a closed or open region of a picture and branch, directed or not directed, denotes relations between two regions. Each node has parameters and attributes: the center of the gravity of the region or the AS-simple point, corresponding pattern sequences, textural and color attributes, etc., which preserve the shape of the region and its location in the picture. This informa- tion is accessed by an attached pointer for each node. Thus our graph is not a simple abstract graph, but a graph which describes the structure of pictures consisting of many regions. A simple example of a graph G = (V,B) is illustrated in Figure 6.2(a). The picture of (b) may be described as a graph of (c), where we introduced some 137 V = {v 4 ,v 2 ,v 3 ,v 4 ,v 5 } B={b lf b 2 ,b 31 b 4 ,b 5 ,b 6 } where G=(V,B) bi^V^Va) b« a (v lf v s ) b 3 = (v 3 ,v 2 ) b 4 =(v 4 ,Vi) b 5 =(v 4 ,v 5 ) b 6 =(v 3 ,v 5 ) (a) An Example of a Mixed Labeled Graph contained contained (b) A Picture (c) Graph Representation Figure 6.2'. Examples of Graphs: Regions-Relations 138 necessary relations to preserve the structure of the picture. These relations exemplified in (c) — "above" and "below," "left of" and "right of," "contains" and "contained in" — are pairs, and one of them is the inverse of the other rela- tion. Thus, clearly some relations can be eliminated from a graph without destroying the structure. Relations which we shall evoke are: i) r : contains, r : contained in, -L J *I * Lunuaiiu' * 4 I ' WJ ii) r : c connected, -1 r = r , c c iii) r A = above, -1 r = r B A iv) r R : right of, -1 r = r L R v) r s : partially surrounds, vi) r N : connected by neck, and vii) r T : tail of. below, left of. r : partially surrounded by, 6,2.1 Containment Relation The topology of a plain black-and-white figure or regional picture can be simply described by a tree, a containment tree. For example, the topological relation — "contains," "contained in" — of pictures illustrated in Figure 6.3 can be described simply by the containment trees as illustrated, where each node refers to a connected component of the figure or closed region. Thus such a tree is a subgraph of the graph which describes the structure of the picture. For array pictures, Buneman (1970 ) presented an algorithm which deter- mines this containment tree from information about the picture which is picked up during a single scan of array. His algorithm strongly resembles a generative grammar and a procedure for choosing which rule of the grammar to operate. 139 (a) Black and White Array Picture (b) Region Bounded Picture Figure 6.3. Containment Trees lUo Our algorithm of generating containment trees assumes that the given picture is described as a set of polygonal curves. We also assume that for each such closed curve a point inside the curve, the center of gravity or the AS-simple point, is available. Our procedure for deriving trees is quite similar to the determination of intersections between a straight line and with a set of curves. We emanate a ray from each given point which is inside the corresponding region and determine a sequence of intersections with the ray and the regional boundaries. Without loss of generality, as we can emanate the ray in any direction, we chose here to emanate the rays along X-coordinate. A sim- ple analysis on the generated intersection sequence shows the possible contain- ment relations. Let us consider the picture of Figure 6..U(a)» which consists of 5 closed regions. The regions are denoted by ft through ft , and their interior points are x through x , respectively. For example, region ft is contained in fi., and, in turn, ft is contained in ft . This can be determined by emanating a ray at x and examining the sequence of boundary intersections. In this case, the sequence is ft_, ft p , ft.. , and is an ordered sequence. It is interpreted as for each pair of adjacent elements, the left element contained in the right ele- ment. Thus, R c ftp and ft CL ft-,. Let us consider region ft . Its sequence is ft~, ft p , ft p , ftp., ft R , ftp, ft,. This means the emanated line at x doubly crosses regions ft and ft , which clearly implies that x is not contained in ft nor ft ; consequently ft is not contained in ft nor in ft . Thus any pair of identical elements should be eliminated from the sequence. After this shrinking process, the sequence becomes ft , ft , ft and is interpreted as ft C ft C ft-, . The region ft has its corresponding sequence as ft, , ft , ft , ft , ft and will be shrunk into ft, , ft , ft . However, since we are considering and determining which regions contain ft , the final sequence should be ft . lUl (a) X^ • AZ-4 , A&5 , AZ-5 , AZ«2 » «*i X 2 " i^4 , AZ.5 J itg , \l 2 j Aij X3 • aZ.3 , Ai 2 j A£ 2 j A/,5 j AZ.5 j A/>2 ) uij X 4 • A£ 4 , AZ, 5 , AZ> 5 , AZ, 2 » AZ-i X5 ' At-5 , AZ>2 > Jk«-i X, a t x 2 : ubo -j vit/j x 3 : Ai 3 , Ai 2 ?"i x 4 = u/'^ « u Zf p • AL x 5 : AZ 5 , Al 2 , AZ^ (b) (c) ii 2 C Ali & 3 C & 2 il 4 C & 2 & 5 C & 2 (d) Figure S.k. An Algorithm to Generate Containment Trees 1U2 From the above examples, we derive the following simple grammar. Let ft be a sequence, 1 * 1 ' 1 ' i 12 j n generated at a point x^ of the region ft . Let us also assume that symbols, a, 6 and y, denote any subsequences or null sequences. Then our grammar consists of the following two rules: Rl. aft. 3 ft. Y->a3y i . i . J J R2. Let x^ be the point of ft. , aft. 3 + ft. 3. j j j For any generated subsequence (or complete sequence), Rl is applied as many times as possible, and then, finally, R2 is applied. In Figure 6. Mb), generated sequences at x through x are listed, (c) shows the results obtained after the application of rules Rl and R2, and final containment relations as well as containment tree of picture (a) are illustra- ted in (d). In (d) only direct containment relations are exhibited, and tran- sitive containment relations such as ft~ c ft-i » ft-t, C ft-, and ft CZ ft-, have been eliminated. 6.2.2 Neighborhood Relations Relations, such as "connected," "above" or "below," "right of" or "left of," "partially surrounds" or "partially surrounded by," "neck" and "tail," are considered to be neighborhood relations . These relations can be defined only between brotherhood regions, where regions which have an identical direct father in the containment tree are said to be brothers. Thus, after construction a con- tainment tree for a given picture, we know which regions of the picture are brother The determination of relations between a pair of brotherhood regions is as simple as the construction of a containment tree. Let us assume that the 1U3 regions ft, , ft„ , >■•• ft. and ft are brothers, and let us assiime that we are & 1' 2' ' j n determining the relations between a region ft. and the rest of regions ft - { ft. }, J J where X 2 3 n We also assume that x. eft. for j = 1, ..., n. J J . To determine relations, a generalized primal pattern sequence is generated at x. considering a space of ft - { ft. }, where each elementary pattern has an attribute J to tell which of regions has been pierced. Thus , for this type of pattern se- quences at x., we may ignore measuring the distance between x. and the intersec- J J tion of elementary pattern with a region; however, we detect the region name where such an intersection occurs. The process of determining relations is illustrated in Figure 6.5 Here (a) shows a picture of eight regions and its containment tree. From the tree, regions ft-, ft_, ft. , ft,-, ft/-, and ft 7 are brothers, and their father is ft . Only brotherhood regions are illustrated in (b), and their possible relation graph is exemplified. In (c) a picture of four regions is shown, and the process of determining relations between ft and ft - {ft p } is illustrated as the genera- tion of a primal pattern sequence. From the sequence we can determine all possi- ble relations. The sequence generated at x is i^yx-p) ■- s , s , ........ s oo = 3, 3, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, k, k, k, h, k 9 3, 3, 3, where denotes empty space, 1 denotes ft , 3 denotes ft , and h denotes ft. . From the sequence S(x_), it becomes clear that ft p is connected to ft , ft_, and ft, . By the determination of the indices of elementary patterns, we derive further relationships: ft is left of ft , ft is below ft, and ft is above ft, . Ikk (a) A Picture and Its Containment Tree r R ,r T 1 © (b) Brotherhood Regions and Their Relation Graph S(x 2 H (c) S(X 2 ) = S ,S 1J ,S 23 S,e{il lf n si il 4 ,*} Figure 6.5. An Algorithm to Generate Neighborhood Relations 1^5 In general, S(x.) can be divided into four regions, J s(x.) - s Q , s is , s N/2 , , s N _ 1 • » i I below above J I . ,i u left of right of left of Relations such as "surrounds," "tail," and "neck" can be found by further in- spection of the pattern sequence S(x.) as well as shapes of ft. and its connec- ted regions. Let D. be the number of elementary patterns whose attributes are K. ft in S(x.). Then, if D /N > t , t , <_ t •: 1, is called the threshold, and K J K D /N is said to be the degree of connectedness; then we say ft. is surrounded by k J ft. . If D. /N is close to zero but ft. and ft, are connected, then we say these k k j k ' J two regions are connected by a neck if ft. and ft are both circle or square-like J k shapes. If either one (say ft ) is a long, snake-like shape, then it is said K. that ft, is a tail of ft . . k J For example, in Figure 6.5(b), Q is surrounded by ft^, ft_ and ft,- are connected by a neck, ft and ft^ are also connected by a neck, and ft„ is a tail of ft,-. 1U6 6.3 Graph Transformations The extraction of a specified picture from another complex picture corresponds to an extraction of the specified graph from another graph as a subgraph. It is necessary to apply some transformations on graphs to make possible matchings, especially if one graph is not completexy contained in another graph. Such graph transformations are necessary since at very low levels of picture processing, such as boundary finding, some regions which should be distinguished and be encoded differently may be determined as one large region. Similarly, several regions which have been determined to be separate regions should be considered as a one large region. There- fore, graph transformations such as splitting a node or merging nodes become significant to determine possible matchings between two graphs. 6.3.1 SPLIT Transformation Nodes can be split only if some properties hold in such nodes. This SPLIT transformation on a node corresponds to the decomposition of a region into several subregions. For example, if a region ft has a J neck in it, then Q. may be decomposed into two subregions, ft', and ft". J J J Thus the relations between these two subregions are connected, neck, and possibly above or below or right of or left of. Another example is if ft. has a tail-like region; then such a tail may be assumed to be a differ- o ent region from the main region of ft.. Thus, again, ft. can be split into J J a body region ft', and a tail region ft". J J This SPLIT transformation on a node is not a simple straightforward operation as illustrated in Figure 6.6. However, if the graph is not used to describe structural relations — nodes are not representing regions, and arcs iUt \ / ^> / \ Figure 6.6. A Straightforward SPLIT Transformation ©—MS) (a) A Complex Picture (b) The Corresponding Relation Graph Figure 6«7« A Complex Picture and Its Relation Graph: Containment Relations are Ignored 148 are not representing relations between regions— then such a straight- forward operation can be used. Let us consider a picture "Embryo" in Figure 6.7(a), where its relation graph may be illustrated as the graph of Figure 6.7(b). In the picture, the region ft^ has a possible neck and can be decomposed into two regions, ft' and ft". Let us apply the straightforward SPLIT transformation on a node ft of the relation graph of Embryo. Then the resulting graph can be illustrated in Figure 6.8(a), which does not make sense if one con- siders the Embryo picture. The correct transformation of splitting node ft is illustrated in Figure 6.8(b). From these two graphs, it becomes clear that whenever SPLIT transformations are applied on some node ft. J into ft 1 , and ft'.', then we should consider these as independent regions in J J the picture, and relations between ft', and ft" and with other neighborhood J J regions should be regenerated to preserve the new structure of the pic- ture. Which means whenever regions are decomposed, their structure also changes. . . 6.3.2 MERGE Transformation Merging of two nodes corresponds to the merging of two regions. This transformation is meaningless unless some neighborhood relations hold between these two regions. For example, merging between regions ft and ft in the graph of Figure 6»T(h) is impossible, since these two regions are completely separated by some other regions, a fact which is clear from the picture of Figure 6.7(a). However, merging between SV, ft_, and ft ft is possible, even if the merge of fig and ft fi is not clear, since ft^ and ft_ are connected and ft and ftg are connected. This merge transformations on nodes ft^, ft , and ftg leads the original graph to the graph of Figure 6.8(c), which is partially illustrated. Ik9 (a) A Wrong SPLIT Transformation (b) A Correct SPLIT Transformation £l 9 - Union of £l 6 ,£ly and £l e (c) MERGE Transformation © Figure 6.8. SPLIT and MERGE Transformations --*■ Insert & k with r' ► ® — - Eliminate XI, Figure 6.9. LNSTN and ELIMN Transformations 150 The important things here that one should notice for merge transfor- mations are that because the shape of merged regions is different from these being-merged nodes, the structure of the corresponding graph, namely relations between the merged region and others, changes. Consequently, one has to determine relations for a new graph, and this cannot simply be done by taking union of all relations for these being-merged nodes. Therefore, similar to SPLIT transformation, whenever MERGE transformations are applied between pairs of nodes, the algorithm should go down to the picture level analysis and should determine relations between the merged region and the rest. Of course, by such merge transformations, relations which should be determined are the relations between merged regions and those regions which had some relations with these being-merged nodes. 6.3.3 ELIM and INST Transformations ELIMN and INSTN transformations are complementary transformations provided to operate on nodes . The first transformation elimi- nates a specified node and related arcs from the current graph being trans- formed. The second transformation is provided to insert a new node between a specified pair of nodes by specified relations. Simple examples of these two transformations are illustrated in Figure 6.9. ELIME and INSTE transformations are provided to operate on relations (or edges) in graphs. The first transformation eliminates specified relations from between specified pairs of nodes if the pairs of nodes are explicitly given; otherwise, all such relations are eliminated from the current graph. The second transformation inserts specified relations between specified pairs of nodes. 151 7. COMPLEX FIGURE DESCRIPTION In this chapter we will discuss some problems in form perception and the corresponding descriptions of complex figures will "be given, also. It is clear that for an appropriate solution of these problems, some good heuristics are desperately needed. In Section 7-1 "we will introduce an approach for the extraction of embedded complex shapes by means of similar shape detection (which was discussed in Chapter 5) and relation graphs. Two other sections, J .2 and 7-3, are provided to introduce some other problems involved in complex figure drawings. For these, graph structural representation may not be an adequate description for the solution of problems described. 7.1 Extraction of Embedded Complex Shapes 7.1.1 Preliminary Discussion For two given pictures, P and P , both consisting of many regions, we assume that P can be embedded in Pp. These two pictures are described as relation graphs , where each node of the graph corresponds to a closed region of the picture. We assume that associated with each node is sufficient information to describe the shape, textural features, etc., of the corresponding closed region. Applying the decomposition procedures of necks and tails which have been discussed in Section 3.^.2, we can assume without loss of generality that 152 graphs G and G p of pictures P and P , respectively, correspond to graphs derived from fully decomposed regions. Thus, we assume that graph trans- formations on G and G separately have been done previously. Therefore, the problem of extracting embedded complex shapes reduces to the problem of determining whether the graph G exists in the graph G as a subgraph or not. Let graphs G and G consist of sets of shapes, ft and ft , respectively, where ft 1 = {ft^, ft 12 , fi 13 ,.-., 8 li »«"a n lm >> "»o = *■ 21' 22' ^p^'*"*' 2i'***' 2n m <_ n. Therefore, the task for the above problem is to determine for each shape ft a corresponding shape ft . in such a way that relations between ft . and the rest of the regions in ft hold between ft~ . and the rest of the regions in ft p . This correspondence is usually a one-to-many mapping. In other words, there are several regions in ft_ which satisfy the requirement of corresponding to the region ft and satisfying the necessary relations. We will develop here the property of this mapping function. 7.1.2 Dissimilarity and Clustering Function We define two functions, f and f , as follows. The function f assigns a real value to each shape, and the function f clusters shapes by considering values assigned by f . More specifically, f 1 is an algorithm that assigns real values to each shape of a picture by considering similarity (or dissimilarity) to a specified shape of another picture. For example, 153 f l ( "li> V = {r il' r i2'-"> r ij"*" r in } ' 2 where r. represents dissimilarity and is a real assigned by the function f., ij 1 to the j-th region of ft with respect to the region ft . of ft . This function has been described in Chapter 5 as a system to detect odd/similar shapes. The function f is a simple chain clustering function. After application of function f on ft with respect to the shape ft . , we obtain 2 the real value r. . for each ft^ . of ft^ as described above. These n real ij 2j 2 numbers can easily be sorted into a decreasing order, and we get a n > a^ > a„ > . . . > a . > . . . > a , 1—2—3— — j — — n' which is denoted by A. f is a function to divide A into two connected sub-sequences of a.'s, A and A , where shapes whose values are in A are dissimilar to the shape ft, . , and shapes whose values are in A^ are li 2 similar to o . . This separation of A into A and A can be attained by considering a threshold value T, where T = d + Co d = (a. - a )/(n - l), 1 n 2 n-1 - 2 a = H (& ± - a i+1 " d) /(n - 1). i=l Here C is a constant. Let el be an element in A such that a - a, >_ T and a. - a < T for i > k. Then A^^ = (a^ a g , . . . , a^) and A 2 = (a^ ,..., a ). 7.1.3 Pairing Function For simplicity let = f p o f be a product function of f_ and f and use for a solution to the given matching problem. We apply on each 15* shape ft of picture P and on the shapes of picture P . Thus, we get dwft tt ) i(tt tt ) d> (ft ft \ dwft ft ^ 1 11' 2 } > n 12' 2 ; '•••' V li' 2 y '••• , V lm' 2 1 ' Each (j) (fi . , JJ ) gives a set of shapes in ft , which are "similar" to the shape ft . Similarly, if we apply 6 on each region ft of picture P p and on the shapes of P , we get cbfft ft ) 6(0, ft ) is illustrated for two cases, 6 (ft , ft ) and , where it is defined as: (ft 1;L , ft 2 ) = {(ft liS ftp.) | ft 1;L e 4>(^4» tij) and ftp^ e 6 (ft-^, ftp)}. This function $ is illustrated in Figure 7.1(b). Each pair (ft , ftp.) denotes a similarity correspondence between ft and ft , where it is understood that ft.. . of ft, is embedded in ft~ as ft~ . . This function $ is li 1 2 2j quite useful to narrow down the possible matchings between shapes of two pictures and consequently to save wasteful comparisions. Now we have the following correspondence between a shape ft . and several shapes in ft p , <$(ft-]-]» ftp)» <£(ft-jp» ftp)**** 3>(ft-ji» ftp),..., 3>(ft-| m » ftp)* So the final step for a solution of the extraction of embedded shapes is basically the same as for locating subgraph G in G by considering a certain function, a function to measure goodness of correspondence. Let us assume that for each ft n . , a shape ft! £ ft~ such that li l 2 (ft , ft!) e £(ft . , ft ) is assigned for i = 1, 2,..., m. The necessary con- dition that G is a subgraph of G , where the subgraph consists of nodes 155 <£(&2j,Gl) (ft . , ft ? ), for i = 1,..., m, is derived. Finally, structural matching is determined. If the structure of ft and the structure of ft' in ft matches, then the global shape similarity between ft and ft is evaluated. In Figure 7*3 a pair of shapes is illustrated, and we will consider the extraction of the shape in (a) from the shape in (b). (a) Cb) Figure 7.3 A Pair of Shapes 157 p p 1 2 NECK/TAIL DECOMPOSITION in d FEATURE EXTRACTION GRAPH CONSTRUCTION X l' X 2 , °1' G 2 PAIRING ' v li» 2 1 TRIAL ASSIGNMENT G REORIENTATION G' (0') f 2 v 2' ^S^ NO YES DETERMINE GLOBAL SIMILARITY BETWEEN n and «' of n < TRI .T^N^ YES AL > ■ NO TERMINATE Figure 7.2. A Model for Complex Shape Extraction 158 The digitized picture and its decomposed picture of Figure 7.3(a) were shown in Figure 3.2U, (a) and (e), respectively. The digitized picture and its decom- position of Figure 7.3(b) are shown in Figure 7.U, (a) and (b), respectively. To see how the system of Figure 7.2 behaves, let us number the regions of each decomposed picture as in Figure 7-3. Thus, n i- {0 ll. 8 12. B 13 >' ma n 2 = {i W R 22 V' Pairs of regions, $(fi . , ft ), for i = 1, 2, 3, derived by the system are: • (Q U' V = { W VW W' • (Q 12 . 2 ) = ^ 12 :fi 23' ° 12 :n 2 6' fi i2 :fi 28 } ' *(n i3 , n 2 ) = {n i3 :R 22 , " 13 : ^. "i3 : ^7 } - These regions of ft Q which have been filtered by the pairing function $ are shown in Figure 7«Mc). Therefore, there are 18 possible assignments between regions of ft and the filtered regions of ft' . These 18 possible assignments have been reduced to the following four assignments after their structural matching has been determined. These are: B 3tr 5 "23' "28' fl 28' ft 28' ft 12 : ft 26 , ft 26 , ft 26 , ft 26 , ft : ft~ » ft~~, 0„i , ft • 13 27 22' 2U' 27 Finally, each global shape ft' is compared to the global shape of ft , and the best matching attained is: 159 Bl'RBRRR 8 B [I B 9 B bli B B BR B e. " B B * B _ U B a " b. b B B B 8 B B B 8 e >«bb BB B B B B HB P BR B B B B II B BR B BBt'BBB .8 R 8R B " 3 B K. R B B BB B B - B _B B B BBBB B B BB RB B 8B3 B p PR B 8P8BR _ 11 . _B._ B BBBB B B B B B B «_ B B B B B 5 RR_ 8 B B B 8 3 a 9_ 6 9 B ° 3 R PRBBRBPRBRR BBBBBBbBBbabrt8fi (a) Original Picture RbRpfiilR B B R B B R nR R B RR R R P R B B P R R fi R B B R 3 B B B B B B R RRH »B R R 9 R RB B R 3 P R B 3 9 3 BR B 8BPJP3 R 8 _ RB "3 '"" B P H 9 P __B_ fi " R fi R p B B B B B B P B B B B B R R B B 8 B B B BBBB 8 B B B B BB B RR P BR8 _B _ B _B B B 3 1 p 3 8 BTB BP B B_ BBBB 6 i* 9 B B B_ 8 B 3 B B R B PB BBRRPR B B IB R R fi _B_ _ BBPB9BRP9PR BBRBdPRBBBBBePB (b) Decomposition 3R3 R B 8 ( B B V B" BB» R B PR 9 PR 3 8 B B B B 8 3 R R B R H R b B B 3 B Bfir-.PRRR R BB R B P B 8 3 BB P B B B H P. B B P B BBBB B 8 B BB BBR B B R 8 R P. RPRPB B p BBBB d B 9 _R_ __B-_ ** B B R " " ' 1 B _ U _ _ _ ^ B b " b" "a" e BR R fi BBB _jMJ 8_ B B R R B B B_ B_ B ~ B BBBBBfib R Bb 6 P. R R B BB B9B3 R B _ B_ BB e 6 6 RB R _ _B _iJBB3B_ "B " BBBB I B R H R R B B fi B R BRBBBB RR e 3P iRHBBP (c) Elimination by Pairing (d) The Best. Shape Extracted Figure 7.U. An Example of Complex Shape Extraction by the Model of Figure 7-2 i6o "n : "28' fi 12 : R 26 , n 13 : V which is shown in Figure J.U(d). In this simulation only the connected relation, r«, has been extracted and considered for the relation graphs G. and G_. The computation time for the above example was 13 seconds. The program was written in PL/l and run on the IBM 360/75 • The model developed here can be generalized, and we will suggest a general model for shape extraction in Chapter 8. 7.2 Extraction of Embedded Figures 7.2.1 Preliminary Discussion In this section we consider the following problem: two pictures, P and P , are given, and the task is to determine whether or not the contour of P can be embedded in the picture P . Pairs of pictures are illustrated in Figure 7-5 where in each case picture P can be embedded in picture P . We will suggest an approach for the solution of this problem. The analogous problem of the extraction of embedded regions was discussed in Section 7-1 • Without loss of generality, we assume that the contour of a picture P is embedded in a picture P p . Thus, the task is to search for the corres- ponding vertices and arcs of P in P , where P may consist of a set of disjoint simply-closed curves, and P consists of overlapped closed curves. We also assume that each closed curve is a polygonal curve as exemplified in Figure 7- 5 Id). l6l (a) (b) (a) Pi Figure 7.5. Four Pairs of Picture Drawings, P n and P, 1 c where for Each Drawing P can be Embedded in P 162 The first step in the solution of this problem is to derive the contour expression of the picture P , where we assume that P consists of several connected regions. Then the second step is to search for the corresponding closed contour of P in P„. To determine a corresponding contour of P in P , an exhaustive type matching between two expressions of the pictures may be applied. Finding a clue for a solution of this matching problem is in general dependent upon how these two figures are described. 7.2.2 Matching Between Nodes Let us consider the pictures in Figure 7-6. As can be easily seen, the picture P is embedded in the picture F , where names for regions and vertices are given as specified. The input representation selected for these two pictures are Node -sequences as discussed in Section 6.1. The Node- sequences N and N for pictures P and Pg , respectively, are given in Table 7.1. There the numeric number denotes the subscript of a region, i.e., (n) means ft . From these two sequences, N and N , we derive the Angle-sequences A_^ and A , respectively, which are given in Table 7*2. From Node-sequence N and Angle -sequence A of the picture P , we can easily derive Node and Angle-sequences for the contour of P by using the fact that the region ft denotes the background region of the picture P . These are: 163 (a) (b) f? Figure 7-6 An Example Used to Extract P from P 16U Table 7-1 • Node-Sequences of Pictures in Figure 7.3 Node-sequence N. Node-sequence N, A(C(0), B(l)) B(C(1), A(0), D(2)) C(J(0), A(l), B(2), F(3)) D(E(2), B(0)) E(F(2), D(0), G(3)) F(C(2), E(3)) G(E(0), H(3)) H(G(0), 1(3)) I(J(3), H(0)) J(C(3), Ko)) a(b b(c c(n d(e e(n f(g g(h h(i i(e j(k k(l l(m m(n n(c o(j p(q. q(n 0) 0) 0) 3) k) 1) 5) 6) 6) 8) 9) 9) 7) k) 0) , f(l), , a(2), , b(U)) , b(2), , b(3), , a(0), , 4(1), , e(5), , h(7)) , f(0), , g(8), , k(l0) , h(9), , e(7), , p(io) 4(2)) 4(3), e(U)) a(l), g(5)) 4(5), h(6), i(T)) 3(8)) f(8), k(9)) g(9), m(7)) 0(10)) j(lO)) , p(iD) 1(H)) m(ll) , q(0)) ) 12), n(ll), 1(10), 0(0)) 12), p(0)) 165 Table 7.2. Angle-Sequences of N and N Angle-sequence A Angle-sequence A A ^ a CB ,a BC' ) B(a CA' a AD' a DC } C( °JA' a AB' °BF» a FJ ) D(a EB» a BE ) E(a FD»°DG' "OF 5 F(a CE' Q EC ) G(a EH' a HE ) H(a GI ,a IG ) l(a JH'°HJ ) J(a ci' a ic ) a(a bf» a fd' a db ) b(a ,a , o a ) ca' ad' de' ec ; c(ct ,a ) v nb' bn ; d(a a a a ) eb b a ag ge ^nb'V'V^hi' a in } f (a ,a a. ) ga' aj' jg e-( a .a ,a , a ) feV hd' df" fk* kh' h(a. ,a a a . ) le eg gm mi i( a .a ) v eh' he ; i ( a a a ) J kf fo' ok y k(a a a ) lg gJ Jl l( a , a , a ) * mk' kp' pm' m( a a a ) nh hi In n(a ,a a a a ) ce' em' mp' pq' qc o( a. a ) JP PJ p( a , a a , a ) qn nl eo oq q( a , a ) np pn 166 Node-sequence N' A(C(0) B(A(0) C(J(0) D(B(0) E(D(0) G(E(0) H(G(0) I(H(0) J(Ko) Angle-sequence A' A(a CB *AD C(a E( G( H( K J( l JA °BE ^G "EH °GI °HJ °IC Finally, the problem "becomes to search the Node -sequence N' in the Node- sequence N such that the Angle-sequence A 1 is preserved in the corresponding Node-sequence in N p . To solve this matching problem, the following concept might be helpful. Let a node of the sequence N' be "A" ; A(B(i)) A(a BC } and a node of the sequence N p be "a"; a(b(j), c(k), d(l)), a ( a bc > a cd » a db^' Then a pair of nodes (A, a) is said- to be a possible correspondence if and only if a BC ^ { V' a cd> a db } * where {o^, a^ t a }* = {o^, a^ a , a 1 + a,^ a.^ + a , a + c^} and e means that there exists at least one element a in {...}* such that Kc - a l - T - T is a predefined angle threshold. 167 Therefore, to determine the sequence N| in N , we find a corresponding node of N for each node of N' which satisfies the relation e, and, moreover, which preserves the ordering of Node-sequence, that is, the one-to-one correspondence between subscripts in sequences. We do not explicitly describe here an algorithm for the solution of the extraction of the embedded contour picture; however, the above definition of "possible correspondence" shows implicitly how to determine node-to-node matching between pictures. 7.3 Extraction of "Qood" Figures 7.3.1 Introduction In a complex drawing, a human sees instantly a figure which is embedded in it. What the human sees in the drawing or what he imagines out of the drawing might depend heavily upon the observer's past experiences, the orientation of the drawing, the brightness of lines in the drawing, etc. We will consider in this section an extraction of good figures from a complex two-dimensional drawing. A good figure may be a partial drawing in the complex drawing which comes up relatively clear and looks like an independent drawing in the complex picture. Since a human lives in three-dimensional space, it is likely that even if he sees two-dimensional drawings, he tends to see three-dimensional objects in it. Zusne (1970) says that whether a plane figure will be seen as two- or three-dimensional depends on the goodness of both shape and contour continuation. If both shape and continuation are good, a two- dimensional figure will be seen. If not, a three-dimensional shape will be seen. The study of three-dimensional perception of two-dimensional figures has been originated by Guzman (1968). Good figures in a two- dimensional drawing of three-dimensional objects are objects which can be 168 seen as three-dimensional objects. Thus, by using Guzman's program SEE (basically various analyses of vertices formed in the drawing), we can find which planes form an object, and consequently, the contour of the object which corresponds to a good figure in the given picture. To find the contour of a three-dimensional object which is embedded in a figure, consideration of only a few types of vertices is adequate enough. These are: (l) L, a vertex where two lines meet, (2) FORK, three lines forming angles smaller than it, and (3) ARROW, three lines meeting at a point with one of the angles bigger than j. Some other kinds of vertices are needed if the drawing contains some hidden lines. For the solution of the above problem, heuristics has been extensively studied. We assume hereafter that drawings being considered describe two- dimensional objects and not three-dimensional objects in the visual world. Thus, the heuristics developed by Guzman are not applicable. Moreover, by assuming the above, we may eliminate individual experiences of the visual world and consider the extraction of "good" shapes in a given complex drawing. 7-3.2 Goodness of Figures Before we define what we mean by "good" shape, let us introduce some principles of goodness of figures which have been considered by psychologists (also see Chaper l). (i) Law of Similarity : visual pattern elements that are alike tend to form groups. As to the law of similarity, it applies not only to similarity in shape among visual pattern elements, but also to similarity in color, position, and other variables. However, we only consider similarity of shape. 169 (ii) Lav of Proximity : visual pattern elements having smaller distances between them tend to group together more than elements separated by larger distances (Koffka, 1935; Mowatt, 19^0; Zahn, 1971). (iii) Principle of Symmetry : symmetrically located pattern elements will tend to organize themselves and associated elements into groups. Civ) Principle of Closure : visual form may be open or closed, complete or incomplete. An open form tends to change toward a closed form, etc. It often happens that several of the above principles hold. In such a case , the stronger one will predominate and lead to the perception of that form in which it predominates . We define a good figure or shape as a closed curve with respect to the principle of good continuation. In other words, a curve y is said to be a good figure if the curve is extracted from a complex drawing by connecting "good continuing" arcs or edges , and the resulting curve is closed. Good continuing arcs or edges are pairs of adjacent arcs or edges which are connected at a vertex and have the following property: a straight line proceeds straight, circle proceeds circle, etc. Let us consider several drawings, overlapped figures, of Figure 7.7. Each drawing can be thought of as consisting of a pair of closed curves and is basically classified by two methods : one is that a good figure means a good continued closed curve, and the other that a good figure is an outline (closed curve) of the given drawing. For example, (a) can be considered as the union of two triangles or as the union of a star-like shape and a hexagonal 170 (a) Case 1 Case 2 (*)■ Case 1 Case 2 Figure T-T- Decomposition of Overlapped Drawings 171 (c) Case Case 2 (a) Case 1 Case 2 Figure 7 .7 • Decomposition of Overlapped Drawings (continued) 172 shape. If we consider the good continuation of curves as a means for extract- ing a good figure, then a pair of triangles is obtained. However, if we assume that a good figure is an outline of a given overlapped drawing, then the second type of decomposition is obtained. Figures 7.7(b), (c), and (d) show some other examples where each case is considered for the two classes of decomposition. Figures 7.7(a) and (b) are examples of how both good continuation and outline approaches might work for good figure extraction. It may be said that the results obtained by good continuation are preferable to those derived by outline approach. Figures 7.7(c) and (d) are examples that the outline of figures cannot necessarily be thought of as good figures. In Figure 7.8(a) an overlapped complex figure is shown. By assuming that open curves are not good figures, we can eliminate hair-like curves and then obtain an overlapped drawing which consists of only closed curves (b). Next, using the extraction of good figures as good continued closed curves, we get two closed curves (c). As shown in the above examples, good figure extraction can be solved by using good continuation. The solution of this problem might simplify the extraction of a specified figure from a complex drawing of Section 7«2, and makes it easier to solve. 7-3.3 Illustrative Approach to Decomposition We assume that a given drawing consists of many open and closed curves and over lappings . Overlapping can be grouped into two classes: one is node-overlapping and the other is edge-overlapping . A vertex is called an even vertex if an even number of arcs or edges meet at the vertex; otherwise, it is called an odd vertex. It is easy to see that if odd vertices exist in a drawing of closed curves, then the number of odd vertices is even. If all vertices are even, then the given drawing consists only of 173 (a) (b) Figure 7.8. Elimination of Open Curves for the Decomposition of an Overlapped Drawing IT 1 * closed curves, and some vertices are common to several good figures in the drawing. However, if the drawing contains some odd vertices, then there are some edges or arcs which are used to describe more than one figure in the drawing. Whether a pair of arcs or edges has a good continuation at a vertex may he determined by some heuristics . In Figure J .9 some of the vertices used to extract a good continuation are illustrated. In the figure, dotted lines indicate a pair of edges or arcs which are assumed to have good continuation at the vertex. In Figure T-10 an example of a meaningless figure is shown, and this is used as a sample to show how decomposition of the figure into simpler shapes is carried out. The node sequence N which describes this complex figure is as follows: v 1 (v 2 (b,l) v 2 (v u (f,0) v 3 (v u (h,U) v u (v T (k,8) v 5 (v 3 (g,3) v 6 (v 8 (p,9) v 7 (v u (l,8) v 2 (a,0) v c (c,3) v (d,2)) v 1 (a,l), v 1 (b,2), v (e,U)) v 2 (e,2), v 1 (d,3), v 5 (g,5), v 6 (i,6), VgQ.T)) v (1,0), v 2 (f,M, v 3 (h,T)) v 1 (c,0), v (q,10), v 6 (m,5)) v T (n,T), v (j,6), v.(i,5), v (m,10), v (r,ll)) v^ds.T), v 6 (n,9), v g (o,0)) v 8 (v 10 (s,12), v 1Q (t,0), v T (o,9), v 6 (p,ll)) v 9 (v 10 (u,ll), v 6 (r,10), v 5 (q,0), v 1Q (v,13)) v 10 (v 8 (t,12), v 8 (s,ll), v (u,13), v 9 (v,0)). 175 / r 77 rv sj 7 Figure 7.9. Vertices Used to Extract Good Continuation 176 Figure 7«10. Meaningless Drawing 177 From this node sequence we derive a sequence called an "arc-sequence." This sequence tells us which arcs meet at each vertex. Thus, by applying good continuation of Figure 7-9 for each vertex, we can find pairs of good continuing arcs. These two sequences are shown below: Arc-Sequence A Good Continuation v (b,a,c,d) v 1 (a:d, b:c) v 2 (f,a,b,e) v 2 (f:a, b:e) v (h,e,d,g,i,j) v 3 (h:g, e:i, d:j) v u (k,l,f,h) v^krf, l:h) v 5 (g,c,q,m) v (g:m, c:q) Vg(p,n,j,i,m,r) v^nrm, j:r, i:p) v (l,k,n,o) v^(l:n, k:o) v Q (s,t,o,p) Vg(o:s, p:t) v (u,r,q,v) v 9 (r:v, q:u) v 1Q (t,s,u,v) v 1Q (t:u, s:v) To determine good closed curves, starting at v we select a pair of arcs a:d and look for a vertex which contains "a" or "d". In this example, v p has a pair f :a, so we move to v~. Repeat this procedure as many times as possible by visiting each vertex at most once. If, at the end, the vertex has "d", then the sequence of arcs visited by this method is closed; otherwise, it is open. By the above approach we get the following three sequences of arcs, each of which is a good closed curve or shape. 1. a,f ,k,o,s,v,r,j ,d, 2. b,e,i,p,t,u,q,c, 3. h,l,n,m,g. 178 8. SUGGESTED MODEL FOR THE NAMING PROBLEM 8.1 Introduction As a summary of the algorithms for high-level picture processing described in earlier chapters, we propose a model for the solution of the "naming" problems. A naming problem is specified by two input pictures: a sample picture where the objects in the picture are to be named at the completion of the analysis, and a text picture which describes an abstract drawing of the sample picture being considered where various regions have assigned names. The problem is to name the objects in the sample picture using name information from the text picture. A very simple data example for this naming problem is illustrated in Figure 8.1 where (a) is a sample picture and (b ) is the corresponding text picture. Using the name information in the text picture, we can name the people in the sample picture "Dad," "Me," and "Bob," and, consequently, we derive the result illustrated in (c). This kind of naming problem is frequently solved in the human visual world, and, more- over, the human solves it almost instantly. Our objective here is to propose a model for a solution of general naming problems as an extended application of the studies which have been done in this thesis. The model which will be described in the following section is a generalization of the model for complex shape extraction, which has been discussed in Section J.X. 179 BOB (a) Sample Picture (b) Text (c) Output of Naming Problem Figure 8.1. A Simple Example f the Naming Problem 180 8.2 A Model Description In developing a model for the solution of naming problems, the most significant and necessary problem which should be solved is the shape- identification problem. Without solving this, the solution for the naming problem cannot be attained. We ignore the texture or color or material information of objects constructed, and we assume only geometrical shape as the clue of identifying objects. Once the possible individual shape indentifications have been computed, the next step is to consider those shapes in the sample picture whose structure matches the structure of the text picture. If both structure and individual shape identification have been done successfully, then the final remaining task is to give a name to each object in the sample picture according to the individual shape matchings In Figure 8.2 we show a simplified block diagram of our model for the solution of the naming problem. "SAMPLE" is given as a gray-level picture and "TEXT" is given as a black-and-white noiseless clear picture or as a higher level representation with name information. In this diagram only program flows (no information flows) are shown. We will describe the context or the tasks of each block as follows : 1. First Level Analysis This block is applied only for a given sample picture to obtain the same information level as a text picture. Here, most of the low- level picture analyses are carried out, for example: (i) scanning and digitizing the input picture, (ii) differentiation of the digitized picture, (iii) transformation into a black-and-white picture. TEXT CO o Si o o I o M Eh H CO O 8 181 SAMPLE (1) FIRST LEVEL ANALYSIS (ARRAY LEVEL) (2) SECOND LEVEL ANALYSIS (REPRESENTATION) (3) THIRD LEVEL ANALYSIS (STRUCTURAL LEVEL) (k) INDIVIDUAL MAPPING (5) ASSIGNMENT (6) STRUCTURAL MATCHING (7) NAMING o EH K O 3 o M EH O w CO TERMINATE Figure 8.2. A Model for a Solution of Naming Probl em 182 (iv) black-boundary finding and the center of the gravity of each region, and (v) contour smoothing (ref. 2.3). All of these are described in Section 2.3. We will use ref. to show the corresponding reference in this thesis. At the end of these analyses, clear boundary representations of each object in the sample picture are obtained. 2. Second Level Analysis This block is provided to analyze each region of two pictures and to extract characteristics regarding the geometrical shape of regions. This includes: (i) analysis of characteristic points (ref. 3.5), (ii) polygonal and pattern sequence representation (ref. 3.1, 3.2, and 3.3), (iii) local and global feature extraction (ref. Chapter h) , (iv) contour abstraction (ref. 3-3-5) 5 and (v) node/angle representation (ref. 6.1, 7.2, and T«3). 3. Third Level Analysis Thus far, individual regions of pictures are analyzed, and at this level the structure of pictures is determined and graph representations are obtained. This analysis completes all possible processing of both samples and text pictures to provide necessary information for the solution of the naming problem. This final level analysis includes: (i) skeletonization of regions (ref. 3.*+.l), (ii) neck/tail decomposition or composition (ref. 3.^.2), (iii) derivation of containment trees (ref. 6.2.1), (iv) derivation of neighborhood relations (ref. 6.2.2), and (v) the construction of relation graphs (ref. 6.2 and 6.3). 183 k. Individual Mapping This block is provided to find for each region of the sample picture some possible corresponding regions in the text picture. This is done by first detecting an odd shape in the sample picture, and for the detected odd shape we find some similar shapes in the text picture. If reorientation of the sample picture is required to attain possible matching between regions in two pictures, then such operations will be done and then the process goes back to the second level analysis for appropriate change of analyzed data. For the remainder of regions in the sample picture, we again find the odd shape in it and then find several shapes of the text picture similar to it. We repeat this process as many times as necessary to complete the assignment of each shape of the sample picture to several shapes of the text picture. In this process it is also necessary to apply the concepts of extraction of embedded figures (ref. 1 .1 and 7.2) and extraction of good shapes (ref. 7-3) • This block includes: (i) odd/similar shape detection and reorientation (ref. Chapter 5), and (ii) extraction of embedded complex regions (ref. 7^1)) especially the function $ (one -to-many mapping) 5. Assignment For each region of the sample picture, several regions of the text picture are selected by the previous sub-task: Individual Mapping . This sub-task is provided to compute a one-to-one mapping, i.e., for each region of the sample picture it assigns a region of the text picture where such a region is included in the set of regions derived by the previous one-to-many mapping. A good assignment is obtained first by individual assignment, and then the whole assignment can be improved further by the 181* considering certain optimization functions (ref. 7-1) . At the end, one of the "best individual matchings is obtained, however, the structure of one picture may not be preserved in the other. 6. Structural Matching This determines whether the structure of the sample picture is preserved between these regions of the text picture which are assigned or selected for matching. In other words, this determines whether one graph can be a subgraph of another based upon relations between nodes. To attain a possible matching, we not only need another selection of regions for an assignment, but also the operation of composition of several regions of one graph or the decomposition of a region of the other graph. That is, some graph transformations (ref. 6.3) are required to attain possible matching. 7. Nairn' ng After obtaining individual regional matchings and structural matching, the final step is to give names for regions of the sample picture and output the result. Other possible matchings are also considered if requested. 8.3 Potential Application Not much research has been done on this kind of pattern recognition problem. However, the solution of this naming problem becomes significant in the fields of remote manipulation, medical diagnosis and research, elementary school education, etc. In remote manipulation, the world may be given as a textbook (two-dimensional projections of three- dimensional visual space) , and by identifying the objects in the visual world using the information described in the textbook, the system may further execute the assigned task in the world. In the medical field 185 we have good textbooks for organs, cells, etc., however, it is not an easy- task to identify between, for example, an X-rayed picture of the human brain and the textbook. Thus, it is very important to have such naming systems to ease the burden and to give some aid to doctors or to students in medical training. It is also possible to apply this naming system to elementary education as well as to the study of human shape/form perception. Disadvantaged or young students may use the system to learn language with shape and some other information. Figure 8.3 shows two other complex examples of textbooks which may be used for the naming problem, (a) shows an embryo of a pig and (b) shows an embryo of a chick. These examples are very complex, however, we hope that we can attain solutions of naming problems down to this level . 186 V u < & ori •p C 0) o H > CD Q S O U : /» THIS FlNCTTON GIVES. INVERSE PICTURE HE P +£_ %miAmi±£R&2i .:. ....._ USAGE: PIC = $0R(P1,P2) : /» THIS FUNCTION GIVES OR OF TWO P ICTURES PI AND P2 */_ „_JANJli^AGJL*A_RG2J__ ... USAGF: PIC = $AND(P1,P2): /♦ THIS FUNCTION GIVES AND OF TWO PICTURE? PI AND P2 */_ ARC? ) LSAGF: PIC = $SUB(P1,P2): X*_ThLLS. EliNCTIQJM GI VES $.ANC( PI , $NEGAJ $AfiDJ PLLi.P2L.il */_ 195 $FX0R(ARG1, APG2) USAGE: /* THIS PIC = $EXOR(Pl FUNCTION GIVES ,P2>; $OR($AND(Pl,$NEGA(P2l) ♦ */ /* SAND(SNF.GA(P1).P2) ) ) */ $BNDPY(ARG. USAGE: TYPF) PIC = SBNDRYiP iNl; /* /* THIS FUNCTION FINOS MATRIX ACCORDING TO SMOOTHED BOUNCARY OF THE THE SPECIFIED TYPE. GIVEN BIT */ */ /* /* TYPE = = 1 BLACK BOUNCARY WHITE BOUNDARY */ */ $#OFR(ARG.TYPE) USAGE: #TFR = $#OFR(PtN): /* THIS FUNCTION COUNTS THE NUMBER OF REGIONS OF PICTURE */ /* P ACCORDING TO THE SPECIFIED TYPE. /* TYPF = BOTH OPEN AND CLOSED REGIONS */ */ /* = 1 CNLY CLOSEC REGICNS */ $NAMFR(ARG1. I.ARG2) USAGE: CALL SNAHER ( P, I* • CEREBRUM* ) : /* THIS PROCEDURE GIVES NAME 'CEREBRUM* TO THE I-TH REGION*/ /* OF THE PICTURE P. */ $G( ARGl,ARG2.X,Y) USAGF: CAtl $G( P, 'CER EBRUM' , X, Y ) ; /* THIS PROCEDURE GIVES TFE CENTER OF GRAVITY (X,Y) OF */ /* RFGTON NAMED 'CEREBRUM' OF PICTURE P. */ ilS GINJARG1 .ARG2) USAGE: IF SIS GIN( P, • CEREBRUM' ) THEN - - - - /* THIS FUNCTION GIVES YES/NO ANSWER DEPENDING UPON /* WHETHFR THE CENTER CF GRAVITY OF THE REGION NANED */ */ /* 'CEREBRUM' OF PICTURE P IS IN THE REGION OR NOT. */ STS AS(ARG1 »ARG2> USAGE: IF SIS AS (P. 'CEREBRUM' ) THEN - - - - /* THIS FUNCTION GIUES YES/NO ANSWER CEPENDING UPON */ /♦ WHETHER THE RFGION NAMED 'CEREBRUM' OF PICTURE P IS ♦/ /* ANGULARLY SIMPLE OR NCT. */ SASPf ARG1 .ARG2.X.Y) USAGE: CALL $ASP(P, 'CEREBRUM* ,X,Y) /* THIS PROCEDURE GIVES THE ANGULARLY SIMPLE POINT (X,Y) */ /* OF THF REGION NAMED 'CEREBRUM' OF THE PICTURE P. */_ 196 SIS ATIARGJ..APG2.X.Y) USAGE: TF SIS AT(P, «CERFBRUM» ,X,Y) THEN - - - - / » WHFTHFR ThP CIVFN PTINT (X.Y) I S A NCUIARY SIMP1F FOR __*/_ /* THE RFGION NAMED •CEREBRUM' OF PICTURE P, OR NOT. */ $FNCnOE( flRG.K.TYPE) _ USANCE: CALL ENCODE I Pil^TYPEJ.;. /* THIS PROCEDURE ENCCCES EACH REGION OF PICTURE P INTO */ /.»-J>F G I VF N T Y_P E_n£_£CRil_kH£lS£ . £J RCUI AR I T V IS N %£_ /* TYPE = PRIME FNCfDINGI INNER HOST INTERSECTION) */ ./* . _ .. = 1_ .DUAL ENCODING (.OUTER MOST IN TERSECT ION) */ /* = ? EDGF FOLLOWING ENCODING */ ? ND LEVFL FUNCTIONS SF OF(ARG1 •ARG?«I) USAGES _L_XfcL FEATURE = SF_OF IP. LCEPJEBPUMA.^LLl /* THIS FUNCTION RETURNS THE I..TH FEATURE VALUE OF THE */ /* REGLCN_J^AMFD ! CEREER.UMJ OF .PICTURE P *7 */ *L /* I = 1 MAXAMP = ? MFANVPN /* = 3 DIVE* = 4 Af T* — /* /*— /* /* = 5 VAXIMAL* = 6. ...rCN-VEXVit = 7 CCNCAVE# = « n pnnps /* /* = 9 PFRIMETFR = 10 ARFA4 /* /» — /* /* =11 P AS _= X2 CCUVE XI T.Y.. = 13 SYMMETRY = u nFn AS /* = 15 CCNVEXNESS */ */ */ JLL */ _*/ */ *Z */ JLL */ $PRTNTF(ARG1.ARG2) U SAGF; CALL-SPJLI NTE USAGE; CALL SSKLTNZ ( P !♦' CEREBRUM ' , N ) ; /* THIS PROCEDURE SKELTONIZE THE REGION 'CEREBRUM' OF */ /♦ PICTURE PI USING GIVEN INTEGER N. IE ARG2=$ALL THEN ♦/ /* THE PROCEDURE SKELTCNIZE ALL REGICNS OF THE PICTURE. */ *PRINTS(ARG1 ,ARG?1 USAGF: CA 1 L SPRINTS A LLOCG ( I ,ARG) USAGF: CALL $ALLOCG( I .GNAME ) ; /» THIS ALLOCATES A GR IPH STRUCTURE CF N NODES WHOSE NAME »/ /* IS GNAME */ SFRFFG(ARG) USAGE: CALL $FREEG«GN AME ) ; /* THIS FREES THE DATA STRUCTURE OF NAME GNAME */ *TRANSG< ARG1.ARG2) USAGF: CALL STRANSG(PtG) /* THIS PROCEDURE TRANSFORMS THE PICTURE P INTO A RELATION*/ /♦ GRAPH G. */ /* RFLATICNS : SUBREG I CM CCNT A INMENT ) /* NEIGHBCRHOCCICCNNECTED REGION) */ */ /* NECK /* TAIL */ */ /* ABOVE / BELCW /* RIGHT / LFFT */ */ /* SURRCUNCS */ JSPL ITIARG1.ARG2. RELATION) USAGE: CALt SSPLITI GNAME .NNAME.R ELATIGN ) ; /* THIS SPLITS THE NODF NNAME CF GRAPH GNAME IF THE /* RFLATICN HOLD */ */ /* IF ARG2=$ALL THEN EACH NODE OF GNAME WHICH SATISFY /* THF RFIATTflN IS SPLIT */ */ 200 SM£RGEjLAJlGLL.A£G2 i ARG3,R£LATlCN) USAGE: CAIL SMERGE (G. M . N2 • REL AT ION ) : /♦ T HI S P F R GFS TW O N ODES Nl AND N? OF GRAPH G IF THF */_ /* SPECIFIED RELATION IS HCLC */ *PLTN?(ARG1 ,£PG2,ARC3) JJ.SA£f-X _J.E._tRLT.N2.IG4.Nl_iM2J...i; RELA T « REL AT THEN - - - /* THIS FUNCTION RETURNS RELATIONS BETWEEN TWO NODES Nl */ L* A ND N ? O F GRAPJbL-G *Z_ $ELIPJlLARG_l.AfiG2J USAGF: CALL $EL IMN( G • NCDE ) ; /♦ THLS_RPOCE DURE ELIMIN ATES .Jh£ NDD_E_ FROM GRAPH G *1_ %F\ IMF(ARK1 . ARn?.ARr,3.RF< ATIPM USAC-F: CALL $ELIME(G.N1«N2. RELATION); JJL JCH_LS___P£nC EDUR^__ELXM1NAT_E_S_ R£ LA TICN CALLED RELATIONS *J_ /* FROM TFF RELATION BETWEEN NODES Nl AND N2 OF GRAPH G */ JINSTNl ARG1 ,APG2,ARG3.ARG4 ) US AGE; CAll flNSTN(&*JVltN2.N2) /* THIS PROCEDURE INSERTS NODE N3 BETWEEN NODES Nl AND N2 */ 1* QE_GRAPH CL._THJE-RELAJJCNS. BFT WFFN Nl AND N? ARF +±_ /* PRESERVED FOR RELATIONS BETWEEN Nl AND N3 , AND N3 AND */ V*_M2_. *7_ -&.INSTFI ARG1 .ARG7. ARG3«.R-FI ATIOlL USAGE: CALL $1 NSTE ( G . M .N2 *R EL ) : ^J*_IH1£ _ELRnC_EXURE^IWS£RT S RELATIONS CALLE .. R EL B FTW FFN, *L /* NODES M AND N2 CF GRAPE G */ $MST(ARG1 .ARG2) USA GF: CAM SPST ( G. MS1ALAMEU /* THTS PROCEDURE FINDS A PIMPUM SPANNING TREE FROM GRAPH*/ /J»_„G AMO-_LT-JS CALLED PSTtvAPf _ *JL iSPTPSTI ARG.NJ USAGF: CALL SSLTMST (G «N ) : /» THIS PRnCFDMOF SPt T TS CRAPH R TNTO N + l DISJOINT TREES *J_ /* BY ELIPINATING N NUMBER OF THE MOST WEIGHTED EDGES. */ *PINFNC(ARG,N ) L1SAGE^_ .CALL. IMlNFNCiG »M1 ; /* THIS PPOCEDURE MINIMIZES CLUSTERING FUNCTION OF AT */ /* MfSJ_ N CI ISTFRS . *L 201 »CLSTR. REASON FOR RECOMMENDED RESTRICTIONS: i. SUBMITTED BY: NAME AND POSITION (Please print or type) Kiyoshi Maruyama Research Assistant Organization Department of Computer Science University of Illinois Urbana, Illinois 6l801 Signature * (/OR AEC USE ONI/ Date October 1972 'OR AEC USE ONLY AEC CONTRACT ADMINISTRATOR'S COMMENTS, IF ANY, ON ABOVE ANNOUNCEMENT AND DISTRIBUTION RECOMMENDATION: PATENT CLEARANCE: □ a. AEC patent clearance has been granted by responsible AEC patent group. M b. Report has been sent to responsible AEC patent group for clearance. I I c. Patent clearance not required. BLIOGRAPHIC DATA 1EET 1. Report No. UIUCDCS-R-72-533 3. Recipient's Accession No. Title and Subtitle A STUDY OF VISUAL SHAPE PERCEPTION 5. Report Date October 1972 Author(s ) Kiyoshi Maruyama 8. Performing Organization Rept. No. Performing Organization Name and Address Department of Computer Science University of Illinois Urbana, Illinois 6l801 10. Project/Task/Work Unit No. US AEC AT( 11-1) 2118 11. Contract /Grant No. 1+6-26-15-303 !. Sponsoring Organization Name and Address US AEC Chicago Operations Office 9800 South Cass Avenue Argonne, Illinois 60k39 13. Type of Report & Period Covered Thesis Research 14. i. Supplementary Notes Abstracts With the use of computers, this paper develops and studies algorithms involved in visual shape perception. The main interest is to solve the naming problem, i.e., the strategies by which map-like infor- mation is transferred to objects of the scene under analysis. Key Words and Document Analysis. 17a. Descriptors Shape, form, perception, feature extraction, oddness , relation graph, figure extraction, naming b. Identifiers/Open-Ended Terms c. COSATI Field/Group .Availability Statement unlimited distribution RM NTIS-3S ( 10-70) 19. Security Class (This Report) UNCLASSIFIED 20. Security Class (This Page UNCLASSIFIED 21. No. of Pages 22. Price USCOMM-DC 40329-P71 > ^mti H ■ I ».v rili, 9 ■ MP $ •MkI ■ ■ 1 ■ I ■ ■ I i'^Wfa khRhH M KlraflHBBIH IHHH QiHtflBwa m MMHI • » H raHBHBfl in ■ 11 . • h!£8bhHHHI^H