mora rmSD 
 
 BQDSIhISi 
 
 MHraflBI 
 
 $ H9BI1 
 
 mm 
 
 Hi 
 
 Hi 
 
 91 Hi 
 mSBSsSM 
 
 RMfRfiliMm? 
 
 wHHIHBIDIlttin 
 
 ii RsHllfiul 
 
LIBRARY OF THE 
 
 UNIVERSITY OF ILLINOIS 
 
 AT URBANA-CHAMPAIGN 
 
 510.84 
 I-fltor 
 
 --no. 629-534 
 co/o. 2. 
 
CENTRAL CIRCULATION AND BOOKSTACKS 
 
 The person borrowing this material is re- 
 sponsible for its renewal or return before 
 the Latest Date stamped below. You may 
 be charged a minimum fee of $75.00 for 
 each non-returned or lost item. 
 
 Theft, mutilation, or defacement of library material* (an be 
 causes for student disciplinary action. All materials owned by 
 the University of Illinois Library are the property of the State 
 of Illinois and are protected by Article 16B of Illinois Criminal 
 Law and Procedure. 
 
 TO RENEW, CALL (217) 333-8400. 
 University of Illinois Library at Urbana-Champaign 
 
 "JUL 2 6 20Q1 
 APR 5 2001 
 
 MAY 2 6 2002 
 
 When renewing by phone, write new due date 
 below previous due date. L162 
 
7U.S33 
 
 Or*. ^ 
 
 UIUCDCS-R-72-533 
 
 yx^^tr 
 
 COO-2118-0038 
 
 A STUDY OF VISUAL SHAPE PERCEPTION 
 
 by 
 
 Kiyoshi Maruyama 
 
 «««i8i8r 
 
 eejfce 
 
 October 1972 
 
 
 1 
 
UIUCDCS-R-72-533 
 
 A STUDY OF VISUAL SHAPE PERCEPTION 
 
 by 
 
 Kiyoshi Maruyama 
 
 October 1972 
 
 DEPARTMENT OF COMPUTER SCIENCE 
 UNIVERSITY OF ILLINOIS 
 URBANA, ILLINOIS 6l801 
 
 Supported in part by the Department of Computer Science and the Atomic 
 Energy Commission under contract US AEC AT( 11-1)2118 and submitted in 
 partial fulfillment of the requirements of the Graduate College for the 
 degree of Doctor of Philosophy in Computer Science. 
 
Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/studyofvisualsha533maru 
 
A STUDY OF VISUAL SHAPE PERCEPTION 
 
 Kiyoshi Maruyama, Ph.D. 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign, 1972 
 
 With the use of computers , this paper develops and studies 
 algorithms involved in visual shape perception. The main interest 
 is to solve the naming problem, i.e., the strategies by which map-like 
 information is transferred to objects of the scene under analysis. 
 
 The naming problem is specified by two input pictures: a 
 sample picture and a text picture. The problem is to identify and name 
 objects in the given sample picture using name information in the text 
 picture. Here the most significant and necessary sub-problem to be 
 solved is shape identification. Once possible individual shape identi- 
 fication has been established, the next step is to consider the structural 
 matchings between the given sample and text pictures. 
 
 Introducing the concept of angularly simple curves and the 
 decomposing of a complex region into simpler classes of regions , we give 
 basic shape representations: polygonal, pattern sequence, and skeleton 
 representations. From these representations both global and local features 
 are extracted to channel to a structural analysis of complex pictures. 
 
 An approach to the solution of shape matching problems — a 
 procedure to determine an odd shape or form in a given set of shapes — is 
 described. The "oddness" of a shape in a set of shape is defined on the 
 basis of many detected features. The performance of the system is shown 
 by comparing the system response to the average human response . It is 
 shown how form perception in psychology can be modeled and studied 
 by computers . 
 
To attain acceptable structural matchings between the sample and 
 text pictures, graph representation of pictures is considered as a possible 
 methodology. Relation graphs and approaches in finding relations are 
 discussed; some basic graph transformations to achieve possible matchings 
 are given. Connected to the graphical representation of pictures, 
 sequential descriptions of complex figures are developed, and problems of 
 shape extractions are discussed. 
 
 Finally, a model is suggested for a solution of the naming 
 problem by referring approaches and algorithms which are described in 
 this work. 
 
Ill 
 
 ACKNOWLEDGMENT 
 
 The author expresses his deepest gratitude to his thesis advisor, 
 Professor Bruce H. McCormick of the Department of Computer Science, University 
 of Illinois, whose constant aid, encouragement, and ready availability as a 
 source of dynamic ideas have led this work to its successful completion. 
 
 The author also expresses his sincere appreciation for his former 
 thesis advisor, Professor Jurg Nievergelt, presently on sabbatical leave, 
 for his helpful discussion and encouragement. 
 
 Special thanks should go to Mrs. Patricia C. Welch and 
 Mrs. Barbara Armstrong, who typed the manuscript. Thanks are also extended 
 to my wife, Michiko, for her patience. 
 
 
IV 
 
 TABLE OF CONTENTS 
 
 
 Page 
 
 1. INTRODUCTION 1 
 
 2. TOPOLOGICAL DISCUSSION 8 
 
 2 . 1 Basic Topology 8 
 
 2.2 Simple Closed Regions 10 
 
 2.2.1 Convex Polygon 10 
 
 2.2.2 Angularly Simple Region 10 
 
 2.2.3 Determination of Feasibility 13 
 
 2.2.U K-Convex Decomposition 18 
 
 2.2.5 K-Angularly Simple Decomposition 20 
 
 2.3 Generation of Polygonal Curves 2k 
 
 2.3.1 Boundary Detection 2k 
 
 2.3.2 Contour Smoothing 28 
 
 3. SHAPE REPRESENTATION 3l+ 
 
 3.1 Polygonal Representation 3I+ 
 
 3.2 Pattern Sequence Representation 39 
 
 3.2.1 Definition of Pattern Sequences 39 
 
 3.2.2 Generation of Pattern Sequences kk 
 
 3.2.3 Operations on Pattern Sequences 52 
 
 3.2.1+ Fourier Series 53 
 
 3.3 Generalization of Sequence Representation $k 
 
 3.3.1 Bi-Pattern Sequences 5I+ 
 
 3.3.2 Many Pattern Sequences 56 
 
 3.3.3 Interval Pattern Sequences 60 
 
 3-3.1+ Contour-Following Pattern Sequence 60 
 
 3.3.5 Convex-Concave Abstraction 63 
 
V 
 
 Page 
 
 3.U Skeleton of Shapes 66 
 
 3.1|.l Skeletonization 67 
 
 3.1*. 2 ^eck/Tail Decomposition 68 
 
 3.5 Characteristic Points of the Representation "J6 
 
 3.5 »1 Preliminary Discussion 76 
 
 3.5-2 Determination of Characteristic Point 77 
 
 3.5.3 Properties of Characteristic Points 79 
 
 3.6 Intercomparison of Shape Representations 8U 
 
 k. FEATURE EXTRACTION 86 
 
 k.l Introduction 86 
 
 k.2 Description of Specified Features 87 
 
 1*.3 Feature Dependence on Encoding Shapes 93 
 
 k.k Extraction of Local Features 102 
 
 i+.U.l Preliminary Discussion 102 
 
 U.4.2 Determination of Local Lobes 10l+ 
 
 1+.U.3 Determination of Local Cavities 106 
 
 U.H.U Determination of Local Roughness 107 
 
 5. DETECTION OF ODD (SIMILAR) SHAPES 112 
 
 5.1 Basic Definitions 112 
 
 5.2 A System to Detect Odd Form 118 
 
 5.2.1 Analysis 118 
 
 5.2.2 Attention and Weight Modification 120 
 
 5.2.3 Decision and Selection 122 
 
 5.3 Experimental Results 123 
 
 5.3.1 Human Response 123 
 
 5-3.2 Attention Function 123 
 
Page 
 
 5.3.3 System Response 125 
 
 5.3.^ Summary 130 
 
 6. GRAPHS AND GRAPH TRANSFORMATION 132 
 
 6.1 Graph Representation 132 
 
 6.2 Relation Graphs 135 
 
 6.2.1 Containment Relation 138 
 
 6.2.2 Neighborhood Relations 1^2 
 
 6.3 Graph Transformations lU6 
 
 6.3.1 SPLIT Transformation lk6 
 
 6.3.2 MERGE Transformation lU8 
 
 6.3.3 ELIM and INST Transformations 150 
 
 7. COMPLEX FIGURE DESCRIPTION 151 
 
 7.1 Extraction of Embedded Complex Shapes 151 
 
 7.1.1 Preliminary Discussion 151 
 
 7.1.2 Dissimilarity and Clustering Function 152 
 
 7.1.3 Pairing Function 153 
 
 7.1.^ Experimental Result 156 
 
 7.2 Extraction of Embedded Figures 160 
 
 7-2.1 Preliminary Discussion l60 
 
 7.2.2 Matching Between Nodes l62 
 
 7.3 Extraction of "Good" Figures 167 
 
 7.3.1 Introduction 167 
 
 7.3.2 Goodness of Figures l68 
 
 7.3.3 Illustrative Approach to Decomposition 172 
 
VI 1 
 
 Page 
 
 8. SUGGESTED MODEL FOR THE NAMING PROBLEM 178 
 
 8.1 Introduction 178 
 
 8.2 A Model Description 180 
 
 8.3 Potential Application l8U 
 
 9. SUMMARY AND CONCLUSION I87. 
 
 9.1 Discussion of Results 187 
 
 9.2 Suggestions for Further Research 188 
 
 BIBLIOGRAPHY 189 
 
 APPENDIX 
 
 A. LIST OF SHAPE DESCRIPTION SUBROUTINES 193 
 
 B. TWENTY SETS OF DATA STIMULI 202 
 
 VITA . 208 
 
1. INTRODUCTION 
 
 Stimulated by the publication of Wertheimer's "Principles of Perceptu- 
 al Organization" in 1923, many psychologists have intensively studied human 
 perception of shape and form. A good survey paper of these studies is found in 
 Hake (1957); and a recent book, Visual Perception of Form by Zusne (1970), is 
 an excellent summary and contains an extensive bibliography. A 1966 collection 
 of papers edited by Leonard Uhr deals with various problems of pattern recog- 
 nition by computers. However, many of the more crucial problems such as how 
 humans encode shape information, detect near similarity or dissimilarity of 
 shapes, focus attention upon particular aspects of shapes, and achieve percep- 
 tual invariance are not yet adequately understood. 
 
 A basic weakness of shape/form recognition by present computer systems 
 
 may stem from the observation that shape/form recognition by humans has not yet 
 
 been solved on the psychological level. Zusne (1970) says: 
 
 In the history of modeling pattern perception on the computer, 
 the task is really one of modeling discrimination rather than recog- 
 nition. Recognition in a computer is recognition only in a trivial 
 sense. Recognition by organisms implies previous experience, and 
 with experience there accrue meanings to patterns that are uncorre- 
 lated in any important way with the physical dimensions of patterns. 
 Whether a machine will ever be able to cope with this problem is 
 uncertain. (page Qk) 
 
 The objective of this thesis are to investigate some intrinsic proper- 
 ties in shapes and to provide some good ways to describe objects in such a way 
 that the high level analysis of pictures, especially the "Naming Problem," can be 
 done easily. Furthermore, we hope that this study provides the means to con- 
 struct a computer system in which the system responses are demonstrably similar 
 
to an average human response on a specified task and to provide clues for ex- 
 ploring some unsolved problems of shape perception in psychology. 
 
 Let us first investigate the problems involved with defining shape and 
 form. Webster's Third New International Dictionary (l96l) defines shape as 
 "the visual make-up characteristic of a particular item or kind of item: 
 characteristic appearance or visual form, or spatial form or contour tnat is 
 used." The term "form" is somewhat more nebulous — it is used by different 
 people to mean different things and by the same person to mean different things 
 on different occasions. Shape, figure, structure, pattern, order, arrangement, 
 configuration, plan, outline, and contour are similar terms without distinct 
 meanings. This indefinite terminology is a source of confusion, and a more 
 rigorous terminology may be very much needed. The Webster's Dictionary defines 
 form as "the shape and structure of something as distinguished from the mater- 
 ial of which it is composed," and it defines gestalt as "a structure or config- 
 uration of physical, biological, or psychological phenomena." 
 
 A number of explicit definitions of visual form have been given by 
 Gibson (1951); they were proposed out of a conviction that "psychologists 
 
 should come down to earth to say exactly what they mean when they talk about 
 
 2 3 
 
 form." According to his proposed definitions, solid forms and surface forms 
 
 are realities. However, there is no such thing as form-in-general with the 
 
 universal characteristics ascribed to it by gestalt theorists. The only kinds 
 
 of visual form we shall undertake to deal with hereafter are those associated 
 
 with or derived from physical objects. 
 
 1. task: detection, discrimination, recognition, identification, judgment 
 (Hake 1957). 
 
 2. solid form: the closed physical surface enveloping a substance of some 
 kind. The surface may be curved or it may be composed of adjoining flat 
 surfaces with edges. 
 
 3. surface form: a flat physical surface with its edges. 
 
Let us describe briefly what kind of experimental studies have been 
 made by psychologists on human shape perception. First, they have tried to 
 define what is the quality of "goodness" in a shape or form. If a line forms 
 a closed or almost closed figure, we may no longer see merely a line on a homo- 
 geneous background, but a surface figure bounded by the line. It seems that a 
 straight line is a more stable structure than a broken one and that, therefore, 
 organization will occur in such a way that a straight line will be perceptually 
 continued as a straight line (this idea was implemented by Guzman [1968] in his 
 "Decomposition of Scenes Into Bodies"). Koffka (1935) generalizes this as: 
 "Any curve will proceed in its own natural way; a circle as a circle, an ellipse 
 as an ellipse, and so forth." Attneave (195M asserts that many of the gestalt 
 
 principles of perceptual organization pertain essentially to information distri- 
 
 k 
 bution and the good gestalt is a figure with some high degree of internal 
 
 redundancy. It may be that an important criteria for "goodness" of form, how- 
 ever, is the relative importance of simplicity and complexity — a "good" config- 
 uration being simple. The theory is that our perceptual and memory processes 
 tend toward simplification of forms. 
 
 To get a "good" figure from a given figure, Mowatt (19^0 ) obtained the 
 statistical result that the large majority, about 9h% , of all changes to a 
 given figure consist in the addition of lines or dots to the figure. He also 
 shows that open figures are changed much more frequently than closed figures , 
 and he concludes that closure and symmetry are considered "good" properties of 
 a figure. Another way to determine goodness of shapes is to determine some 
 
 k. It was demonstrated by Attneave (195M that information is concentrated at 
 points where there is a change in an otherwise continuous gradient; at the 
 contours of a form which mark the change from ground to figure; and at any 
 inflection along the contour where the direction of the contour changes 
 most rapidly. 
 
kind of threshold function (Bobbitt, 19^2). For example, the threshold may be 
 stated in terms of the percentage of the total perimeter of the figure present 
 at the point in the series representing the transition between the forms having 
 the quality of twoness and those having the quality of oneness. 
 
 In 1955 Deutsch observed that "the perceived form or shape of patterns 
 must be invariant with changes in location and orientation of the patterns, and 
 even minor changes in shape itself." However, recognition of forms and compar- 
 ative judgment of forms deteriorates when forms are rotated with respect to the 
 observer. In fact, there is considerable evidence to indicate that orientation 
 plays a part in the definition of what we mean by the form of an object (Dod- 
 well, 1961). Thus it may be so that the form of an object may not be invariant 
 with rotation of the object and is partly defined in terms of orientation 
 characteristics (see Zusne). 
 
 Some of the stimuli, or target shapes, which have been used by psychol- 
 ogists to study the problem of human form perception are: Dot Patterns 
 (French, 195^), Metric Figures (Fitts and Leonard, 1957), Random Polygons 
 (Attneave and Arnoult, 1956), and Computer-Generated Luminous Polygons (Poli- 
 dora and Thompson, I96U). Because of the simplicity of generation by computers, 
 polygons are used quite frequently in psychological experiments. While it is 
 true that a curved shape may be approximated with straight line segments with- 
 out appreciable loss of information, some information is irretrievably lost, 
 depending on how crude or fine the approximation is. How such approximation 
 affects perceived complexity is not known. 
 
 The methods of polygonal shape representation which have been studied 
 are mostly independent of shape orientation (see Deutsch, 1962, and Attneave 
 and Arnoult, 1956). However, it may be that the perceptual independence of 
 shape orientation may not play as significant a role for form perception in 
 
psychology. For the analysis of shapes and forms, a large number of physical 
 characteristics and form invariances have been suggested and studied. Some of 
 these are: compactness, jaggedness, skewness, and dominant axis. 
 
 There have also been many other studies: theories of visual form 
 perception, perceptual tasks, memory for form, form perception by animals, 
 application, etc., have been done and are being continued by thousands of 
 psychologists (see Zusne, 1970). However, as we stated earlier in this chap- 
 ter, knowledge of how humans perceive shape and form on a psychological level 
 is still exceedingly primitive. 
 
 On the other hand, many engineers and mathematicians have been 
 working on various shape recognition problems, from low-level image analysis 
 to high-level semantic analysis, by constructing mathematical models or by 
 computer simulations. The general and attractive summary of the study of 
 picture processing can be seen in Rosenfeld (1969) • The most recent study by 
 Schwebel (1972) surveys unstructured models , linguistic models, explicit 
 graph representations of pictures, graph theoretic clustering, graph trans- 
 formations and grammatical formulations, and graph structure processing 
 languages. His own work is primarily concerned with the theoretical study of 
 global-level picture processing using a graph structure representation. 
 
 The plan of this thesis is as follows: 
 
 In Chapter 2 the discussion of some basic definitions and properties 
 in Topology to provide insight adequate for the following chapters is given. 
 Elementary classes of curves — including the concept of angularly simple curves, 
 and decomposition of a complex region into simpler classes of regions — are 
 discussed. An approach for the detection of smooth curves from a given black 
 and white image picture is also contained. 
 
Chapter 3 contains an extensive discussion of various types 
 of shape representations. We introduce three basic shape representations: 
 polygonal, pattern sequence, and skeleton representations. Pattern sequence 
 representation is generalized to cover almost unlimited classes of shapes. 
 Some characteristics of representations and comparison between rej:>resentations 
 is also given. 
 
 In Chapter k a discussion of feature extractions from those three 
 basic shape representations discussed in Chapter 3 is given. Nineteen 
 selected features are described, and feature dependence on encoding parameters 
 is discussed. Some of the local features which should be extracted for shape 
 perception problems are also discussed. 
 
 Chapter 5 is provided to consider a shape perception problem as an 
 application of concepts and approaches which are discussed in earlier 
 chapters . A procedure to determine an odd shape or form in a given set of 
 shapes, each set consisting of more than two shapes, is described. The 
 "oddness" of a shape in a set of shapes is defined on the basis of a specified 
 feature, and then the "most odd" shape on the basis of many detected features 
 is defined. The performance of the system is shown by comparing the system 
 response with the average human response. To make the system response resemble 
 an average human response, modification of an assumed weight vector by means 
 of an "attention mechanism" is considered. 
 
 To attain acceptable solutions for a given recognition task, it is 
 necessary to develop a graph structure model which makes it easy to access 
 low-level as well as high-level picture information. In Chapter 6 we 
 introduce graphical representation of pictures. Relation graphs are 
 introduced to represent the structure of a given picture, and algorithms to 
 generate relations as well as necessary graph structure transformations are 
 also discussed. 
 
In Chapter 7 some problems in shape perception are introduced: 
 extraction of complex shapes or figures from another picture, extraction of 
 "good" figures or good decomposition. A description and an approach for a 
 possible solution is given for each problem. 
 
 Chapter 8 is provided to introduce a central problem in shape 
 perception: the naming problem. As the summary of this work, a model is 
 suggested for a solution of this problem by referencing approaches and 
 algorithms which have been introduced in earlier chapters. Some potential 
 application of the model is also discussed. 
 
 Chapter 9 contains conclusions and suggestions for further studies 
 on shape perception. In Appendix A some shape description subroutines are 
 listed. The illustration of 20 sets of data stimuli which are used for the 
 experiment of Chapter 5 appears in Appendix B. 
 
 All algorithms described in this thesis have been written in PL/I 
 language and implemented on an IBM 360/75 at the University of Illinois, 
 Urban a -Champaign . 
 
2. TOPOLOGICAL DISCUSSION 
 
 In this chapter we discuss some basic definitions and properties in 
 Topology to provide insight adequate for the following chapters. In Section 2.1 
 some basic concepts of curves are introduced. In Section 2.2 elementary classes 
 of curves, including the concept of angularly simple curves, are examined, and 
 decomposition of a complex region into simpler classes of regions will be dis- 
 cussed. An approach for the detection of smooth contours from a given digitized 
 image is also discussed in Section 2.3 
 
 2.1 Basic Topology 
 
 Let us introduce some definitions and properties in elementary topology, 
 where the proof of each property listed can be found in Cairns (1961) and 
 Tondeur (1970 ). 
 
 A curve y is regular if y is twice continuously differentiable map 
 
 2 
 y: I -* R with condition y("t) ^ f° r t e I. A curve y is simple , 
 
 y: I = [a,b) -HR , if y(t) ^ y(t') for t, t' e I, which means a simple curve 
 
 has no self-intersection. A simple curve y is called a piece-wise polynomial 
 
 curve if it consists of a finite number of polynomial line segments. A simple 
 
 curve y is called a polygonal curve if it consists of a finite number of 
 
 straight-line segments. 
 
 2 
 The rotation index of a closed curve y: [a,b) ■> R with respect to a 
 
 point x interior to the curve is the integer, 
 
 W 
 
 (y,x ) = 27/ e(s) dS = 27 (e(b) " e(a)) > 
 
 where 6: I ■*■ R is an angle function. If the absolute value of the rotation 
 index of a given closed curve is different from 1, then it must have self- 
 intersections . 
 

 From the above definitions, we have the following properties: 
 i) A regular simple closed curve is convex if and only if the 
 curvature function is non-negative or non-positive. 
 
 ii) A regular closed curve with rotation index 1 (-l) and non-negative 
 (non-positive) curvature is convex. 
 
 Let S be a topological space, and consider the product space S x T, 
 where T is the unit segment [0,l], Let ft also be a topological space, and let 
 f: S x T -* ft be a continuous mapping of S x T into ft. The points of S x T 
 are representable in the form (p,t), where (p,0) e S and t e T. We will denote 
 with f : S -> ft (t e T) the mapping of S into ft defined by f (p) = f(p,t), 
 (p e S, t e T). Then f is a continuous one-parameter family of continuous 
 mappings (0 _< t _< l). It will be called a (homotopic) deformation of f into 
 f . It is helpful, intuitively, to regard t as the time and to think of the 
 image of a point p e S as "moving" from f n (p) to f (p) through the positions 
 f (p) as t increases from to 1. Its deformation path is a continuous curve, 
 defined by the restriction of f to the segment p x T. 
 
 A mapping f : S -*■ ft is homotopic to a mapping f : S -> ft if it is 
 possible to define a deformation of f into f . This relation will be symbol- 
 ized by f = f . 
 
 iii) Let y , y : I -*■ R - {x} be closed curves, then y n » Y-i are 
 homotopic in R - {x} if and only if W(y n , x) = W(y-, , x). 
 
 iv) The relationship - is an equivalence relation. 
 
 As a consequence of the property iv) , the continuous mappings S -*- ft 
 fall into homotopy classes, generally called merely classes of mappings. 
 
 v) The space is simply connected if and only if each closed curve 
 on S can be shrunk to be a point (or can be spanned). 
 
10 
 
 2.2 Simple Closed Regions 
 
 In this section we introduce several classes of simple closed regions: 
 convex polygons, angularly simple polygons, and K-convex/K-angularly simple 
 polygons. The Feasibility Problem involved in angular simplicity, as well as 
 the decomposition of a complex region into K-convex or K-angularity simple 
 polygons, is also discussed. 
 
 2.2.1 Convex Polygon 
 
 The region bounded by a closed curve y is denoted by ft . A convex region 
 ft is called a convex polygon if y is a closed polygonal curve. A class of 
 convex polygons may be considered the simplest class of polygons, where the 
 definition of convexity has been given in the previous section. However, we 
 shall introduce another definition of convexity in terms of feasibility which 
 will be discussed extensively in the following section. 
 
 A point x in a closed region ft is called an angularly simple point if 
 
 at x the curve y is totally visible. The collection of angularly simple points 
 
 of a region ft is called the feasible set of ft and is denoted by X . A 
 Y Y Y 
 
 closed curve ft is convex if its feasible set is ft itself. In other words, ft 
 Y Y Y 
 
 is convex if at any point x in ft , y is totally visible. 
 
 2.2.2 Angularly Simple Region 
 
 A region ft or its closed curve y is said to be angularly simple if its 
 feasible set X is non-empty, and examples of angularly simple and not angularly 
 simple regions are exemplified in Figure 2.1. 
 
 Lemma 1: Let y be an angularly simple curve. Let us also assume that points 
 x n and x' are angularly simple points of y» where x 4 x'. Then y is angularly 
 
11 
 
 (a) An Angularly Simple Curve 
 
 (t>) A Curve That Is Not Angularly- 
 Simple 
 
 (c) An Angularly Simple Polygon 
 
 Figure 2.1. Angular Simplicity of Regions 
 
12 
 
 simple at x*!, where 
 
 xjj = vx Q + (1-w) x^, 0_<w<l. 
 
 Proof: Let us consider a point y(t) on y(l) , where t e I. Since y is angularly 
 
 simple at x and x ' , the line segments x , y(t) and x^ y(t ) have no intersec- 
 
 tion with y(t') for all t' 4- t, t' e I. Furthermore, the line segment x n ,x' 
 has no intersection with y(l) because, if it does, then neither x or x' can 
 be an angularly simple point. Therefore, the triangle x , x' y(t) contains no 
 points of y(t'), for t' ^ t, t' e I, which implies that the line segment x" y (t ) 
 has no intersection with Y(t') for t' 4 t. By the definition of angular simpli- 
 city, Y is angularly simple at x". 
 
 (Q.E.D.) 
 Lemma 1 implies that if y is angularly simple at points 
 x. , i=l, 2, ..., m, then y is angularly simple at any point which is the linear 
 
 m 
 convex sum of x . ' s , i.e., > w . x . , where 
 
 i=l 
 
 m 
 ^w. =1, w. > 0. 
 i=l 
 
 Theorem 1 : Let y be an angularly simple curve. Then the set of points which 
 
 defines y to be angularly simple is a convex set. 
 Proof : It is clear from Lemma 1. 
 
 (Q.E.D.) 
 
 Theorem 2 : Let y., i=l, 2, ..., n, be angularly simple curves whose regions 
 
 and feasible sets are denoted by Q , X , and i=l, 2, ..., n, respectively. If 
 
 'i i 
 
 n n 
 
 X= p| X 4 0, then the perimeter of [J Si = Q , 
 i=l Y i i=l Y i Y 
 
 n 
 which may be denoted by y = v y . , is angularly simple at any point in X. 
 
 i=l X 
 
13 
 
 Proof: By the assumption that each Y-» i = l» •••» n, is an angularly simple 
 curve, then each y. is totally visible at any point in X . Thus at any point 
 in X, Y- is totally visible for i=l, • ••» n, which implies the curve 
 Y of ft is angularly simple at any point inside X. 
 
 (Q.E.D.) 
 n 
 
 In the theorem, the condition ' I X 4- is a sufficient condi- 
 
 1-1 Y i 
 
 tion for the curvature y to be angularly simple and clearly not a necessary 
 
 condition. 
 
 Corollary 1 : Let y., i-l» 2, ..., n, be angularly simple curves and let 
 
 ft and X be the region and the feasible set of y. for i=l, 2, ..., n, 
 i i x 
 
 n n 
 
 respectively. If y s \/ y. denotes the perimeter of \J ft , then the feasible 
 
 i=l X i=l Y i 
 
 n 
 
 set X of y contains (~\ X 
 Y ' Y 
 
 1 i=l Y i 
 
 Proof : It is clear from Theorem 2. 
 
 (Q.E.D.) 
 
 2.2.3 Determination of Feasibility 
 
 It is clear that any convex region may be described by a finite 
 number of line inequalities. We will extend this idea for the detection of 
 feasibility. Let us assume that a simple closed curve y is angularly simple at a 
 point x . Then the feasible set X of y can be denoted by the following set 
 
 of inequalities 
 
 Y(t) ^0 if the value of -f(t) at x is >_ 0, 
 Y(t) <_ if the value of y(t) at x is <0, 
 for t e I, 
 
 where i(t) denotes the tangent line at Y(t). 
 
lU 
 
 Before we extend our discussion on the general determination of feasi- 
 bility, let us first introduce simple representations of a polygonal curve and 
 a polygon — the extended discussion of alternative shape representations will be 
 defined in the following chapter. A closed polygonal curve y which consists of 
 8 line segments, e. , e , ..., e„, is illustrated in Figure 2.2(a), where g. = 
 
 denotes the line equality which corresponds to the edge e. for i=l, 2, ..., 8. 
 One might use a sequence of coordinates to denote such a curve; however, if we 
 order these line equalities in clockwise direction (or in counter-clockwise 
 direction) , then such an ordered set of equalities clearly represents a unique 
 polygonal curve, where each basic coordinate can be found as the intersection 
 point of two adjacent equalities. For example, the curve in Figure 2.2(a) is 
 denoted by 
 
 G Y E {g l = ° 9 S 2 = °» ••*' S 8 = ° } 
 5 { gi = for i=l, 2, ..., 8}, 
 
 and the basic point p , for example, is the intersection point of g =0 and 
 g = 0. To denote the region Q , it is merely necessary to use inequalities 
 instead of equalities. Without loss of generality, the right-hand side (left- 
 hand side) of a line g. = in clockwise (counter-clockwise) direction may be 
 denoted by an inequality g. >_ 0. Thus the polygon of Figure 2.2(b) can be 
 denoted by the following ordered set of inequalities : 
 
 Q = {g. > for i=l, .. . , 8}. 
 Y i — 
 
 Therefore, in general we have the following: 
 
 A polygonal curve y is expressed by an ordered set G of equalities: 
 
 G = {g. = for i=l, 2, ..., n} 
 Y l 
 
 where the basic coordinate p. is the intersection point between g. = and 
 g. = 0, where j = i+1 (modn). Similarly, the corresponding polygon is 
 
 J 
 
 i 
 
15 
 
 g 3 =o 
 
 (a) A Representation of a Polygonal Curve 
 
 g 3 ^o 
 
 (b) A Representation of a Polygon 
 
 Figure 2.2. Polygonal Curve and Polygon 
 
16 
 
 expressed by an ordered set ft of inequalities: 
 
 ft = {g. > for i=l, 2, ..., n} 
 Y I — 
 
 and g. >_ is applied only between p. and p. . 
 
 Theorem 3 : Let y be a polygonal curve whose region is denoted by 
 
 ft 5 {g. . for i=l, ..., n}. Then y as well as ft is angularly simple if and 
 
 only if the set of inequalities g. >_ 0, for i=l, ..., n is feasible. 
 
 Proof : Let us assume that y is angularly simple. Then there exists a point 
 
 x~ inside ft such that the line segment between x_ and an intersection point of 
 y 
 
 any adjacent equalities has no intersection with any line segments of y. Thus 
 g.(x ) >_ for all i, which implies that x is a solution for the given set of 
 inequalities (feasibility of the set of inequalities). 
 
 Let us assume {g. >_ for i=l, 2, ..., n} is feasible. Then there 
 exists at least one point x such that g. (x ) >_ holds for all i. We consider 
 the triangle of x , p. , p. in Figure 2.3(a) and show that neither line seg- 
 
 ment p. , , x_ nor p. , x^ has an intersection with g. = for j 4 i-1, i, i+1. 
 i-l' *i' J 
 
 Assume g. = crosses the line segment p., x . Since the given curve is simple, 
 
 the 
 
 re exists another g = , k 4 j , i-l, i, i+1, such that g = intersects the 
 
 line segment p., x . Since g. > is assumed to hold on the right-hand side of 
 
 the line g. = 0, g.(x ) _> does not hold (see Figure 2.3[b]), which contradicts 
 J J u 
 
 the assumption that g. (x ) >_ holds for all i. 
 
 (Q.E.D.) 
 
IT 
 
 (a) 
 
 (b) 
 
 Figure 2.3. Illustration for the Proof of Theorem 3 
 
18 
 
 2.2. k K-Convex Decomposition 
 
 It is well known that any polygonal region can be represented by the 
 
 union of a finite number of convex polygonal regions. We will consider the 
 
 following decomposition of a polygonal region into unions of so-called "maximal" 
 
 convex polygonal regions. Here we use the representation of curves and regions 
 
 which were introduced in the previous section. 
 m 
 ft = [J ft is a convex maximal decomposition of ft if and only if for 
 Y j=l Y j Y 
 
 each convex ft , ft eft and ft cfc I J ft , and each ft is maximal. 
 
 Y j Y j " Y Y J ~ *j Y k h 
 
 Where G = {g. > for i=l, 2, .... n} and 
 
 Y i — 
 
 G = {g„ > for 1=1 9 2, . .., n.}, G cz G for all j , 
 y- & - J y. - Y 
 
 J J 
 
 and m is called the convex complexity of ft . 
 
 Y 
 
 The above maximal decomposition is not unique for a given polygonal 
 
 region, and the value m implies that at least m number of convex regions are 
 
 required to cover the given decomposed polygonal region. In other words, let 
 
 m 
 ft , j=l s ..., m be convex polygons, and let ft = \J ft be connected. Then 
 J j =1 J 
 
 the complexity of ft is at most m. 
 
 Let us consider all possible convex maximal decompositions of ft , and 
 
 let the collection of all convex regions be {ft , j=l, 2, 3, ..., n). We 
 
 J 
 assign a cost C. for each convex cover ft , e.g., the number of inequalities to 
 3 Y j 
 
 denote ft which is denoted by G : and we want to minimize the total cost to 
 Y. Y. " 
 
 J J 
 
 cover ft . 
 Y 
 
 To solve this problem, let x., j=l, ..., n be boolean variables (i.e., 
 
 J 
 
 O-l) such that x.=l means the ,i-th convex region ft is chosen for covering ft 
 
 j y. y 
 
19 
 
 and x.=0 means it is not. Then we have the following Integer Programming formu- 
 
 J n 
 
 lation for the cost minimization, where YJ x. = K. 
 
 j=l J 
 
 n 
 Minimize / . x.C . 
 
 under the constraints of 
 
 Y (x¥l) J Y (x. = l) Y j 
 
 J J 
 
 x. = or 1 for j=l, 2, ..., n. 
 J 
 
 Theorem h : Let ft "be a polygonal region having a convex maximal decomposition 
 
 m m 
 
 of [J ft Then ft is angularly simple if and only if f) ft ^ $. 
 
 j=l Y j Y j=l Y j 
 
 m 
 
 Proof : Let us assume f~] ft ^ <£>, Then, since each ft is convex, any point 
 
 j=l Y J Y j 
 
 m 
 x in P| ft implies that ft Y is angularly simple. 
 0=1 Y j 
 
 m m 
 
 Let us assume N ft is angularly simple; then G = ^J G is feasible, 
 j=l Y j Y j=l Y j 
 
 Since each G is a subset of G , the feasible region of G , i.e., ft , contains 
 
 Y v Y Y 
 
 m 
 
 m 
 
 the feasible region of G , I.e., X = Oft . Since X 4 4» (\ ft 7^ $. 
 
 Y Y >i y j Y j-i y j 
 
 (Q.E.D.) 
 
20 
 
 2.2.5 K-Angularly Simple Decomposition 
 
 In previous sections, we restricted our class of polygons to be angularly 
 simple, which does not cover the general class of polygons. 
 
 A K- angularly simple polygon is a polygon such that K distinct points, 
 
 x. , i=l, ..., K, are necessary to cover ft with ft , i=l, 2, ..., K, which are 
 
 Y Yj_ 
 
 angularly simple at x. . In other words, for each basic point p . of ft , there 
 
 i J Y 
 
 exists at least one point x. (we call such a point "AS-point") such that x.p. 
 
 l * i*j 
 
 does not intersect any edges of ft . Simple examples of K-angularly simple poly- 
 gons are illustrated in Figure 2.U. 
 
 To determine whether a given polygonal region ft (= G ) is K-angularly 
 simple or not, the following definition might be helpful. A set S is said to 
 be mutually exclusive if and only if s . f\ s . = $ for all s . , s . e S , and j S | 
 denotes the cardinality of S. 
 
 m 
 Theorem 5 • Let a convex maximal decomposition of ft be [J ft , and let S be 
 
 Y 3=1 Y j 
 
 the mutually exclusive set of {ft , j=l, ..., m} such that |s| = K is the maxi- 
 
 Y j 
 mum. Then ft is K-angularly simple. 
 
 Proof : Let the maximum mutually exclusive set S be 
 
 1 i k K 
 
 By the definition of angular simplicity, only interior points of ft. are AS-points 
 of ft. . In other words, at any point xl in B, , k ^ i, x, does not imply that 
 ft. is angularly simple because ft. H ft, = $• Since K = |s| is the number of dis- 
 
 X IK 
 
 joint convex polygons of ft , by Theorem k, K points are necessary to cover ft . 
 
 (Q.E.D.) 
 
 The class of regions which has been defined in Section 2.2.2 is called the 
 class of 1-angularly simple regions. 
 
21 
 
 (a] 
 
 (b) 
 
 (c) 
 
 Figure 2.U. Examples of K-angularly Simple Polygons 
 
22 
 
 Theorem 5 implies the following approach of decomposing a given polygon 
 into K-angularly simple polygons. After the determination of K number of 
 mutually exclusive polygons, B_, ..., ft R , ..., fi K , for each ft R we find all 
 non-mutually exclusive polygons Q whose cumulative intersection is non-empty. 
 
 The union of such polygons is an angularly simple polygon, and for all P* k we 
 apply the above process. 
 
 In Figure 2.5 a simple example of K-angularly simple decomposition, 
 given by Theorem 5, is illustrated. The polygon in (a) can be decomposed into 
 many convex polygons; denoted by R , R , ..., R,_, using only line equalities, 
 i.e., the extension of appropriate edges which are indicated by dotted lines. 
 Convex polygons, ft , ft , and ft_, attained by convex maximal decomposition, are 
 shown in (b), where ft = R \j R , ft - R \j R \j R, and ft, = R, \j R . Since 
 ft, fl ft ^ $ , ft fl fl_ = $ and ft_ f\ fi_ 4 $» "the maximum cardinality of the 
 mutually exclusive set S is two, where S = {ft , ft_}. Thus, K = |s| =2, and the 
 given polygon is 2- angularly simple. In (c) two angularly simple polygons; 
 ft, \j ft_ and fi„ y L, are shown, where their feasible sets (hatched regions) are 
 ft f\ ft and ft (\ ^q» respectively. 
 
23 
 
 (a) 
 
 Ob) s = {n , n }, k = |s| = 2 
 
 (c) Two Angularly Simple Polygons 
 Figure 2.5. An Example of K-Angularly Simple Decomposition 
 
2k 
 
 2.3 Generation of Polygonal Curves 
 
 In this section we will discuss algorithms to generate smooth polygonal 
 curves from a given black and white array picture. In Section 2.3.1 we will 
 discuss the detection of boundary elements, and an approach for finding smoothed 
 contours will be given in Section 2.3.1. 
 
 2.3.1 Boundary Detection 
 
 Let us introduce algorithms to find contours on a black-and-white picture. 
 Rosenfeld (1970 ) shows that the shrinking algorithm, edge-following algorithm, 
 and border-following algorithm work. Among them, the border- following algorithm 
 interests us. It has the advantage over the edge-following in that it requires 
 fewer steps , whereas a border-following algorithm can proceed directly to that 
 next element. On the other hand, as Rosenfeld mentions, the border-following 
 algorithms tend to be somewhat more complicated. We, however, consider the black 
 (white) -boundary-following algorithm which is quite simple compared to the border- 
 following algorithm. 
 
 Let us introduce some basic definitions: image array A consists of 
 
 black region R and ground (white region) R. Thus A = R [J R, r. . e R, a. . e A. 
 
 The set F. . of a. . is the set of four horizontal and vertical neighbors. Thus: 
 ij ij 
 
 F. . = {a..^-, ., a. , ., a. .. _ , a. . _}. 
 
 The set E. . of a. . is the set of eight horizontal, vertical, and diagonal 
 ij iJ 
 
 neighbors. Thus: 
 
 E.. = F. . i i {a. ... . . . . a. n . n , a..^ . .., a. .. .,-.}. 
 
 ij ij U i+l, j+1' i-l, j-1' i+l, j-1 9 i-l, j+1 
 
 i) For a. .eR, ifF..AR^$, then a. . is a black-border element of 
 ij ij ' ' ij 
 
 R. Otherwise, it is a non-black-border element 
 
 ii) For a. . eR, if F. . A R ^ $, then a. . is a white-border element 
 l.l i.l M l.l 
 
25 
 
 of R. Otherwise, it is a non-white-b order element. 
 Examples of non-black, non-white border elements are illustrated in Figure 2.6(a) 
 In (b) we show the numbering of eight neighbors of element a. .. In Figure 2.7 we 
 show three types of boundaries for a black-white image array: geometric bound- 
 ary, white boundary, and black boundary. 
 Algorithm BB (Black-Boundary Following) 
 
 Let x be a black-boundary element of R. 
 Step 1. D = (direction index), x = x , B = {x } 
 
 Step 2. Find n, = black and n, n = white in E , for k > D. 
 * k k-1 x 
 
 D = ^(k) where i\>{0) , ip(l) + 6, 
 
 iK2), M3) + 0, 
 
 4»(U), i|>(5) + 2, 
 
 4>(6), ip(7) ■*■ h. 
 
 B = B y { n v^» (ordered sequence to denote boundary elements). 
 
 x = n n . 
 k 
 
 Step 3. If the first two elements of B are identical to the very last two 
 
 elements of B, then eliminate the very last element from B and termi- 
 nate. Otherwise go to Step 2. 
 
 At the end of the Algorithm BB, the ordered sequence B has the following 
 properties , where 
 
 B = VW b m 
 
 b_, 5 b , and b.'s are not necessarily distinct. 
 1 m l 
 
 Theorem 6 . Algorithm BB generates a sequence B which consists of only and 
 
 all black-boundary elements of R. 
 
 Proof : Let us first show that B contains only black-boundary elements. In 
 
 Step 2 , n n e R and n, ., e R, moreover, F P\ R 4 $ , thus by the 
 ' k k-1 n ' ' 
 
 k 
 
 definition B contains only black-boundary elements. 
 
26 
 
 Non Black-border Element 
 
 a 
 
 z 
 
 /. 
 
 z. 
 
 & 
 
 
 7 
 
 ^ 
 
 7W< 
 
 I 
 
 Non White-border 
 Element 
 
 (a) Examples of Border Elements 
 
 (b) Numbering of Eight Neighbors 
 
 Figure 2.6. Border Elements and Neighbors 
 
27 
 
 Geometric Boundary 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 \Y\ 
 
 
 
 
 
 
 
 
 
 
 
 
 
 \kn 
 
 s^ 
 
 ^ 
 
 ^ 
 
 
 
 
 
 
 $* 
 
 
 
 
 ^ 
 
 
 i^ 
 
 XV 
 
 ^ 
 
 
 
 
 VVN 
 
 
 
 ^ 
 
 ^ 
 
 vj\ 
 
 
 
 
 
 Nlv 
 
 ^ 
 
 \ v\ 
 
 
 
 s> 
 
 
 
 
 vnN 
 
 *X 
 
 \\S 
 
 ^y 
 
 
 
 \ 
 
 
 
 
 
 
 
 
 
 
 
 
 White 
 Boundary 
 
 Black Boundary 
 
 Figure 2.7. Three Types of Boundaries for a Black-white Picture 
 
28 
 
 Let us now prove that B contains all black-boundary elements. We 
 
 assume a "black-boundary element n is missing from B. Without loss of 
 
 J 
 
 generality, let us assume that n should be located between b. and b. . 
 
 From Step 2, it is clear that n . e E, and furthermore that b. , e E, 
 
 J b. l+l b. 
 
 and ii. e F. . However, in Step 2, n n , = white and n, = b. , = black, 
 J b e 9 k-1 k l+l ' 
 
 then n., . = n. e R which leads us to n n i R. 
 k-1 j k 
 
 (Q.E.D.) 
 
 Corollary 2 . m= |B| is minimum. 
 Proof : It is clear from Theorem 6. 
 
 Algorithm WB (White-Boundary Following) 
 
 Let x be a white-boundary element of R. 
 Step 1. D = 0, x = x, B = {x }. 
 Step 2. Find n = white, n = black in E , k > D. Set D = k + k (mod 8). 
 
 K. ri* J- X 
 
 B = B\J {n k > 
 
 x = n n . 
 k 
 
 Step 3o If the first two elements in B are identical to the very last two 
 elements of B, then eliminate the very last element from B and 
 terminate. Otherwise, go to Step 2. 
 
 Similar to Theorem 6 and corollary 2 for Algorithm BB, we have the following 
 
 properties: Algorithm WB generates a sequence B which consists of only and all 
 
 white-boundary elements of R, and m= |b| is minimum. 
 
 2.3.2 Contour Smoothing 
 
 A rough polygonal contour can be obtained by determining the ordered 
 sequence of all boundary elements and by connecting adjacent elements in the 
 sequence with a segment of straight lines. The possible directed segments are 
 
29 
 
 only eight; therefore, the contour can be simply represented as a chain of octal 
 numbers. The properties of such chains have been studied by Freeman (1961). 
 Freeman and Glass (1969) have studied the problem of finding a smooth line from 
 such a chain. 
 
 Another method for extracting a smooth polygonal contour from a digitized 
 image is discussed by Montanari (1969, 1970). He first obtains the ordered se- 
 quence of contour points and the connection graph (Rosenfeld and Pfalz, 1966) of 
 the image by a modified Ledley algorithm in one image scan. Then he finds the 
 approximating contour by defining a minimal perimeter polygon subjected to 
 specified constraints. 
 
 Montanari' s approach is interesting; however, it is fairly complex 
 compared to the following approach which gives a reasonably smoothed contour in 
 a reasonable amount of computation time. 
 
 Let B'=b. ,b. ,...,b. ,...,b. be a connected subsequence of an 
 
 ordered sequence B of black (or white) boundary elements. Then B' is said to 
 be a ag-sequence if and only if for all k, k=2 , ..., n-1 the following condition 
 
 holds. For b. and b. , let a. . and a. . „ be their corresponding image 
 
 elements; then a, 3 are identical for all j, where |a|,|6| <_ 1. Thus a$-se- 
 
 quences form diagonal or horizontal or vertical line segments on the contour. 
 
 Let d denote the Euclidean distance. Then B' is said to be T-sequence if 
 
 d(b. , l) <_ T for k=2, ... , n-1, 
 X k 
 
 where £ is the straight line which goes through b. and b. such that 
 
 X l X n 
 
 d(b. , £') > T for some k=2, ..., n, 
 \ 
 
 where £' is the straight line which goes through b. and b. .In other words, 
 
 X l X n+1 
 
 T-sequence is a maximal subsequence of B which satisfies the given condition for 
 
 each element between b. and b. . 
 
 1 n 
 
30 
 
 Algorithm CS (Contour Smoothing) 
 
 Let B be an ordered sequence obtained by Algorithm BB or WB. 
 
 Step 1. Find all possible a3-sequences in B and connect b. and b. by a 
 
 X l X n 
 
 straight line. 
 
 Step 2. Find a possible T-sequence in the sequence derived by Step 1, and 
 
 connect b. and b. by a straight line. Set b. = b. and repeat this 
 i n l i. i l 
 
 In In 
 
 step. 
 
 In the above algorithm, Step 1 is negligible; however, Step 1 saves a 
 great amount of computation time for Step 2. 
 
 Computation examples using Algorithm BB and Algorithm CS are illustrated 
 in Figures 2.8 and 2.9. Figure 2.8(a) shows a black and white digitized picture 
 of "sleeping cat," (b) shows the black boundary obtained by Algorithm BB, and 
 (c) shows the result obtained by Algorithm CS. Here the dots show the result 
 after Step 1, and the dots with straight line segments show the result obtained 
 after Step 2 of the Algorithm CS. 
 
 To understand how well Algorithm CS works, the following analysis might 
 help. There are 19^- black-border elements after running Algorithm BB, and then 
 this is reduced to 91 elements by Step 1 of Algorithm CS. Thus the reduction 
 ratio is about a half. However, after the application of Step 2, it is reduced 
 to 38 elements, and thus the total reduction is about one-fifth. The greater 
 reduction ratio is not necessarily always better, since, by increasing the thresh 
 old value T in Step 2, we can achieve greater reduction ratio. In general the 
 result derived on a large threshold value becomes far different from the 
 originally given black-white image picture. 
 
 Another example, Embryo, is illustrated in Figure 2.9. This has 2U0 
 black border elements, and it is deduced to 90 elements by Step 1. Then it is 
 reduced to 36 elements. Thus the total reduction ratio is about one-seventh. 
 
(a) 
 
 31 
 
 I I 
 
 o>; 
 
 (c) 
 
 Figure 2.8. Contour Approximation: Sleeping Cat 
 
32 
 
 o 
 5 
 
 o 
 
 •H 
 
 
 
 •H 
 
 X 
 
 o 
 u 
 
 ft 
 ft 
 
 < 
 
 g 
 
 1 
 
 o 
 
 On 
 
 c\j 
 
 0) 
 
 bO 
 •H 
 fin 
 
33 
 
 In this example, we achieve greater reduction ratio than in the previous one, 
 because Embryo contains more smooth edges on the black-border elements than 
 Sleeping Cat does. 
 
 The threshold value used for the above two examples is slightly greater 
 than 1/^2. We feel, after running Algorithms BB and CS on various examples, 
 that these algorithms are simple and yet give good results in a reasonable 
 amount of computation time. 
 
3k 
 
 3. SHAPE REPRESENTATION 
 
 We have discussed in Section 2.3 how one can obtain smoothed contours 
 of objects from a given black-white picture. These contours are described 
 as sequences of points coordinates; however, such description of objects is not 
 an appropriate representation if we are concerned with the economical descrip- 
 tion of the shape of objects. In this chapter we introduce three basic shape 
 representations: polygonal, pattern sequence, and skeleton representations. 
 Some characteristics of each representation and comparison between representa- 
 tions will be discussed. 
 
 3.1 Polygonal Representation 
 
 Smoothed contours detected from black-white pictures, considered as 
 polygonal approximations, typically contain large numbers of vertices. We feel, 
 however, that the number of vertices can be reduced significantly without 
 destroying the shape information of objects being considered. In other words, 
 with fewer vertices we can still retain a good approximate polygonal represen- 
 tation of objects by an appropriate assignment of vertices. 
 
 The problems involved here are: (l) how to reduce the number of ver- 
 tices required for the representation, and (2) the appropriate representation 
 or encoding of complex shapes. 
 
 It is clear that almost all objects can be described by a finite number 
 of vertices; typically the order of a hundred is sufficient for accurate repre- 
 sentation. It is also clear that the number of vertices varies widely depending 
 upon whether the given object is complex or simple. Let us consider two shapes— 
 the sleeping cat and embryo of Figure 2.8 and Figure 2.9 — where their reduced 
 number of vertices are 38 and 36, respectively. However, with even fewer ver- 
 tices, about 15, we can adequately describe each figure without destroying its 
 
35 
 
 shape by appropriate selection of vertex coordinates on or near the contour. 
 This is exemplified in Figure 3.1, where the sleeping cat is denoted by l4 
 vertices and the embryo is denoted by 18 vertices. 
 
 Let us define the best polygonal approximation with respect to the 
 specified number of vertices, n, in terms of minimizing the difference between 
 the polygonal approximation and the original (smoothed) curve. Let 
 b , b , . .., b. , ..., b be boundary elements of a picture being polygonized, 
 and let P, A and P , iL (k _< n) be the perimeter and the area of the original 
 objects and k- vertex approximated polygon, respectively. Then our best poly- 
 gonal approximation with k, k <_ n , vertices can be obtained by minimizing D, 
 
 D = w x (P - P k ) + w 2 (A - \) + w T 
 
 under 
 
 d(b. , P, ) < T for i=l, . . . , m and k < n. 
 1 k — — 
 
 Here, w. > denotes weight, T denotes a threshold for straight-line approxima- 
 tion, and d(b. , P ) denotes the distance between element b. and the contour of 
 
 IK 1 
 
 the approximated polygon with k vertices. 
 
 As we can see from the above formulation, if we decrease T, then D 
 becomes small; however, conditions d(b., P ) <_ T become critical. On the other 
 hand, if we increase T, then we have to minimize w (P - P ) + w (A - A^ ) to 
 reduce D. This means that we have to move k vertices around the contour of the 
 figure so that the perimeter and the area of the approximated polygon are close 
 to the original perimeter and area. In Figure 3.2 we show some examples of wry 
 all three parameters, (P - P ), (A - A^ ) , and T, should be minimized sima: J - ae- 
 ously to minimize the value D. 
 
 Now, we shall discuss the representation of polygons. Let us assume 
 the following sequence, 
 
36 
 
 .a) Sleeping Cat: lU Vertices 
 
 (Id) Embryo: 18 Vertices 
 Figure 3.1. Polygonal Representation: Reduction of The Number of Vertices 
 
37 
 
 (a) Small Areal Error and Small 
 Threshold, But Large Perimeter Error 
 
 (b) Large Areal Error, 
 But Small Threshold and 
 Small Perim. Error 
 
 (c) Small Areal and Perim. Error, 
 But Large Threshold 
 
 (d) Small Areal Error , But 
 Large Threshold and 
 Perim. Error 
 
 Figure 3.2. Topological Examples of Contour Approximation that 
 Force the Function D to Become Large 
 
38 
 
 (b 1 , 6 1 ), (s 2 fig), ..., (s^ 6J, ..., (s n , 6 ), 
 
 where n is the number of vertices for shape description, s. denotes the distance 
 
 and 6. denotes angles. This sequence can be derived in either one of the 
 
 following manners : 
 
 i) s. is an edge length of a polygon, and 6. is the rotation angle 
 
 of the vector of edge s. ._ with respect to the vector of edge s.. 
 
 o i+l * ° i 
 
 ii) Let c be the center of the gravity of 0, , and let the vertices be 
 
 p, , p_ , ..., p., ..., p . Then s. = d(c, p.), and 6. is the 
 r l* r Z 9 l n l ' i l 
 
 rotation angle of vector c, p..-, with respect to c, p.. 
 
 i+l 
 
39 
 
 Clearly, either sequence representation of objects is rotation and translation 
 invariant. The only difference "between the encoded representations occurring 
 under rotation transformation is that the sequence might he circularly shifted; 
 however, this shifting does not effect shape perception. This polygonal 
 representation has been used by many psychologists for their study of the 
 human shape recognition system, but it is not easy to obtain shape information 
 out of this sequence representation unless the whole sequence is first 
 projected on the plane. 
 
 3.2 Pattern Sequence Representation 
 
 3.2.1 Definition of Pattern Sequences 
 
 It is known from the previous chapter that any angularly simple polygon, 
 ft , contains a point (or a set of points) x such that a vector from x to a 
 point on y can trace y in one direction with nondecreasing rotation angle. We 
 can use such a vector sequence to denote angularly simple curves as well. Such 
 a vector sequence is illustrated in Figure 3.3, where the region covered by the 
 vectors shows the polygon and the curve formed by connecting the tops of vec- 
 tors (dotted lines) indicates the polygonal curve. Let us define such a 
 sequence as follows: A pattern sequence , S, is an ordered set o f elementary 
 
 patterns , 
 
 q . R m 
 
 O — S , S , S , ..., S., ..., S jt_t s.en 
 
 where m denotes the dimension of the elementary patterns and N is called the 
 circularity of S. To make S denote a unique polygon, it is possible that one 
 
 
 A pattern sequence can be interpreted as a series of physiological measure- 
 ments; IBM Research Reports, vol. 7, No. 2, 1971, by Jean F. Brennan. 
 
Uo 
 
 Polygon Denoted by 
 a Pattern Sequence 
 
 Original Polygon 
 
 (a) An Angularly Simple Polygon 
 
 (b) A Pattern Sequence Corresponding to Polygon (a) 
 
 Figure 3.3. An Angularly Simple Polygon and Its Pattern Sequence 
 
1+1 
 
 might consider varying the angular difference between adjacent elementary 
 patterns and provide the following sequence of angles : 
 
 A = (6 6 
 
 V 0,1' i 
 
 2> 2,3' "•' i,i+l » •""» N-1,0 
 
 where 6. . . denotes the angular difference between elementary patterns s. 
 l, l+l ' i 
 
 and s. in . However, for simplicity, we assume 6. . ,_ is constant for all i, 
 l+l i ,i+l ' 
 
 i.e., 6. = 2tt/N. It is clear that with increasing N we can approximate 
 with increasing accuracy a given angularly simple polygon. 
 
 The dimension m of the elementary pattern varies with the intended 
 application of the sequence. For example, it is sufficient to set m = 1 and 
 use only the radial distance to denote polygonal regions; however, it is 
 sometimes preferable to store additional information: first radial distance, 
 then along each elementary pattern direction adjacent local attributes of 
 texture, color, etc., especially when we use the sequence to describe the 
 environment about a particular point. An example of m = 2 is illustrated in 
 Figure 3.U. Here we consider one attribute to distinguish between ROCK, SAND, 
 SNOW, and WATER. 
 
 When we rotate or translate a polygon whose representation is given as 
 a list of coordinates of basic points (or possibly by line equalities), it 
 
 is usually necessary to change the point coordinates (and line equalities). 
 
 However, a transformation of a polygon which is denoted by a pattern sequence, 
 
 S , i s s imple : 
 
 (i) a translation of an angularly simple polygon, Q, , corresponds 
 
 simply to a translation of x„ of the pattern sequence, S, and 
 
 (ii) a rotation of fi at x^ corresponds to circular shifting of indices 
 
 y 
 
 of elementary patterns, i.e., it is adequate to consider the rotation index , r, 
 which will be defined later. 
 
U2 
 
 Figure 3.^. Free Space Representation: A Pattern 
 Sequence of Multiple Dimension 
 
U3 
 
 Before we define the rotational transformation of S, let us define the 
 canonical pattern sequence. A pattern sequence, S, is called a canonical pat- 
 tern sequence if the first elementary pattern corresponds to the direction of 
 the X-coordinate , and the ordering of the elementary patterns corresponds to 
 the counter-clockwise rotation of corresponding vectors. Henceforth, we assume 
 that each pattern sequence is canonical. This assigns the orientation of the 
 corresponding polygon. 
 
 For the rotation of a pattern sequence, S , at a point x , it is conve- 
 nient to assume that the unit rotation angle, 6, is 2tt/N (or possibly, an 
 integer multiple of 2tt/N). For example, if 
 
 S ^ X = S 0' S l' S 2 5 ***' S i' ***' S N-2' S N-1 
 then the clockwise rotation of S(x ) through 6 degrees is given by 
 
 and the counter-clockwise rotation of S(x ) through 6 degrees is given by 
 
 S N-1' S 0' S l' ***' S i' ***' S N-2* 
 We will define r to be the rotation index. If r > 0, S(x ) has been rotated in 
 a clockwise direction through an angle of rS degrees. If r < 0, S(x ) has been 
 rotated in a counter-clockwise direction through an angle of r6 degrees. Thus, 
 in general, we have the following expression for a pattern sequence: 
 
 MV = V S r+1' S r+2 J •*•' S r+i' '" 9 S r+N-1 
 where r+i is interpreted as r+i (mod N) for all i. 
 
 This pattern sequence representation of objects has been applied for 
 solutions of various problems: an approximate solution of the sofa problem, 
 the determination of odd shapes and maze solvers with visual information 
 (Maruyama, 1971 a, b, and c). It has been proved that this representation is 
 appropriate for the representation of simple shapes. 
 
kk 
 
 3.2.2 Generation of Pattern Sequences 
 
 Let us consider the generation of pattern sequences for given polygons. 
 A polygon illustrated in Figure 3.5 is an AS-polygon since at any point in the 
 convex polygonal region which is denoted by the dotted contour, the entire 
 boundary of the given polygon is visible. 
 
 The encoded configuration of the AS-polygon of Figure 3. 5 is illustrated 
 in Figure 3.6. The vector which corresponds to the X-coordinate denotes the 
 very first elementary pattern, s , of the pattern sequence, s (x ) ; the rotation 
 index, r, is assumed to be at this stage, and each elementary pattern, s., 
 follows in the counter-clockwise direction. In this example, the circularity, 
 N of the pattern sequence is U8. Figure 3. 7 shows error dependence on the 
 circularity N for the polygon of Figure 3. 5, where the values obtained at 
 N = 180 are assumed to be true values. For small circularity, errors are 
 heavily dependent upon the orientation of the given shape. However, for 
 sufficiently large circularity, orientation errors become essentially negligible, 
 
 To generate each elementary pattern, s., detect the innermost intersec- 
 tion between the line segment at angle i2ir/N and an edge of the polygon. The 
 pattern sequence generated in this manner is called the interior (or primal ) 
 pattern. 
 
 Let us consider the polygon which is illustrated in Figure 3. ° where 
 x , the center of gravity, is used as the encoding point to generate a pattern 
 sequence. Alternatively, to generate each elementary pattern, s., instead of 
 measuring the distance from x to the innermost radial intersection, the follow- 
 ing method called exterior (or dual ) pattern generation, is used. 
 
 We first choose a circle of radius R which is large enough to contain 
 the polygon, where the center of the circle is the center of gravity of the 
 figure. Then we measure the distance, s . , to the outermost radial intersection 
 
h5 
 
 Figure 3.5. An AS-polygon 
 
1*6 
 
 Figure 3.6. Example of An Encoded Representation 
 By Pattern Sequence, S (x ) 
 
hi 
 
 0.0 
 
 
 V(180)- V(N) 
 V(180) 
 
 ♦ Z{s { -sf/N 
 
 X s=(^Si)/N 
 
 40 60 
 
 CIRCULARITY N 
 
 Figure 3.7. Error e vs. Circularity N 
 for the Polygon in Figure 3.5 
 
U8 
 
 Figure 3.8. An Example of a Non-Angularly Simple 
 
 Polygon (x Denotes the Center of Gravity) 
 
h9 
 
 and the boundary of the encompassing circle of radius R. Finally the desired 
 
 pattern sequence, S (x ) is derived "by s. = (R - s.) for i = 0, ..., N - 1. The 
 
 above process of generating the pattern sequence, S (x ), of the polygon in 
 
 Figure 3.8 is illustrated in Figure 3.9 » where the dotted arrows show the 
 
 s.'s and the solid arrows emanating at x. show the s.'s. The contour obtained 
 l i 
 
 by the dual pattern generation is also illustrated in Figure 3.9. As we can 
 see from the contour of the pattern sequence, we lost some information about 
 the polygon of Figure 3.8. This loss of information can be compensated by the 
 introduction of other pattern sequences which will be discussed later. However, 
 we feel that the polygon denoted by a single pattern sequence is sufficiently 
 accurate for the study of geometrical shapes. 
 
 From a study of the above two methods of generating pattern sequences , 
 we see that sharp corners of polygons do not show up in the abstracted contour. 
 One way to avoid such information loss is to increase the circularity N. 
 However, the problem here is the estimated error of encoding a given shape by 
 a pattern sequence of circularity N. In Figure 3.10 we show abstracted con- 
 tours obtained by the primal encoding method applied to the polygon of Figure 
 3.5 with different values of circularity N. If the circularity is larger than 
 12, then the abstracted contour holds the outline of the original polygon, even 
 though some minor abrupt changes of the original polygon have disappeared from 
 the abstracted contours. However, when the circularity becomes small, like 6, 
 then the outline of the polygon given by a pattern sequence is completely 
 different from the contour of the original polygon. The detailed discussion of 
 the optimal choice of circularity is given in Chapter U. 
 
50 
 
 -2»-X 
 
 Figure 3.9. Dual Encoded Pattern Sequence 
 of the Polygon in Figure 3.8 
 
51 
 
 (a) N = 2k 
 
 (b) N = 12 
 
 (c) N = 6 
 
 Figure 3.10. Contours Abstracted by Primal Encoding of the 
 Polygon of Figure 3.6, Illustrating Dependence on Circularity N 
 
52 
 
 3.2.3 Operations on Pattern Sequences 
 
 Let S, S' be two pattern sequences of circularity N, 
 
 X = S ' S l' ***' S i' •••» S N-1' 
 
 S'(Xq) = Sq, B-, •••> s^, ..., s fj_}_* 
 Then the union of S(x ) and S'(x Q ) is 
 
 S(x Q ) = S(x Q ) (j S^(x Q ) = {max (s., s^ + .) for all i}, 
 and the intersection of them is 
 
 S(x Q ) = S(x Q ) ^ S^(x Q ) = (min (s., s^ + .) for all i}, 
 where r is the rotation index. 
 
 Clearly from the above definition, this class of pattern sequences is 
 
 closed under union and intersection operations. A pattern sequence S is said 
 
 to be 1-symmetric if there exists k, _< k _< N-l, N is even such that 
 
 s. , . = s. . ... (mod N) for i = l s ...„-- 1. In other words. S is 1-symmetric 
 k+i k-i+N 2 
 
 if there exists a line which goes through x to make S symmetric. In general, 
 S(x ) is said to be K-symmetric (K <_ N/k) if there exists K distinct lines which 
 go through x such that for each line S is symmetric. Thus, the class of sym- 
 metric pattern sequences is closed under union and intersection operations, 
 again . 
 
 A pattern sequence S is said to be diagonally symmetric if and only if 
 s. = S , (mod N). Thus, the class of diagonally symmetric pattern sequences 
 is closed under union and intersection operations. Furthermore, it is trivial 
 that 2k-symmetric (k >_ 1, integer) pattern sequences are diagonally symmetric. 
 
 For CS(x ), C _> , it is said expansion if C > 1, and contraction if 
 1 > C _> 0. For S(x ) + e, if e > then it is said equi- expansion , otherwise 
 it is said equi-contraction where lei < minis.}. If s.+(s -A)e, e > 0, 
 A. _< (l+e)s./e is said to be proportional- expansion or contraction for all i. 
 
53 
 
 3.2. k Fourier Series 
 
 It becomes now clear that a pattern sequence can be described by a 
 Fourier Series. The interval (0,2tt) is divided into N number of equal sub- 
 intervals by points (k2TT/N) for k=0,l, . . . ,N-1. Thus the partial sum F (6) of 
 
 Fourier Series is 
 
 n 
 F (0) = a /2 + Yl (a cos(k6) + b sin(k6)) s 
 k=l 
 
 under the minimization of the following square error: 
 
 N-l 
 E = £J(s - F (2Trk/»)) . 
 
 k=0 n 
 
 It is known that the solution to minimize E is given by: 
 
 N-l 
 a ™ = 2( Yy s cos(2mTTk/N))/N, (m=0,l,. . . ,n) 
 m k=0 k 
 
 N-l 
 b = 2( Vs sin(2miTk/l))/N, (m=l,2,. . . ,n) . 
 k=0 
 
 This Fourier Series for a pattern sequence might be interesting; however, 
 we do not consider this type representation for objects. 
 
5U 
 
 3.3 Generalization of Sequence Representation 
 
 In Section 3.2 we discussed some sequential representations of angularly- 
 simple regions. There we utilized the property that each AS-polygon has a set 
 of points inside of the boundary such that from any point in the feasible set all 
 edges of the boundary are visible. Thus we could use a sequence of radial vec- 
 tors to denote such a region. If the region we want to describe is not angularly 
 simple, then some portion of the boundary cannot be denoted by such a sequence. 
 We were uncertain how accurately we should describe objects; however, we feel 
 that it depends heavily upon user preference as well as his application. 
 
 In this section, we introduce several approaches to describe general 
 objects, yet they are quite simple compared with the representation by coordinate 
 sequence or point encoding (Freeman, 196l). 
 
 3.3.1 Bi-Pattern Sequences 
 
 This representation consists of two pattern sequences, primal S(x ) and 
 dual S\X n ) pattern sequences. It is clear that these two pattern sequences are 
 identical if and only if the encoded region is 1-angularly simple. Furthermore, 
 dual pattern sequence covers regions at least as much as primal pattern sequences. 
 
 This fact is denoted by S(x ) Csfx ), where each elementary pattern of S'Uq) is 
 at least as great as the corresponding elementary pattern of S(x ). 
 Let 
 
 b(,X J — S , S , ... 3 S . , ..., S i\r_-]» 
 
 S'(x Q ) = Sq, s^, ..., s^, ..., s N _ 1? s'. = R - s. for all i. 
 
 The ambiguous regions implied by these two sequences are the regions which 
 correspond to s! > s. for some i. In Figure 3.11 we illustrate a bi-pattern 
 sequence encoding where (a) shows primal and dual encoding and (b) shows how 
 two sequences are different and where some ambiguities occur. As we can see 
 
 
55 
 
 (a) 
 
 S(x ); 
 
 i | i ; i | i j i i 
 
 S(xJ 
 
 R 
 
 (*) 
 
 Figure 3.11. Bi-Pattern Representation 
 
56 
 
 from this example, we can eliminate some ambiguities by considering two sequences 
 simultaneously and considering connections between elementary patterns in S(x_) 
 and S(x ). For example, we can assume that s, and s are connected, which is 
 denoted by a dotted line in Figure 3.11(b). 
 
 The following algorithm eliminates as many ambiguities as possible from 
 bi-pattern sequences and provides a more accurate description of regions. 
 
 Algorithm BP (Bi-pattern Sequences) 
 
 Let S' = s. , s. , ..., s. , ..., s. be a sub-sequence of 
 
 i, 1 ' i . l 
 12 j m 
 
 S(x ) such that 
 
 s. +s. =R,s. +s. = R, and 
 
 3L- 1- 1 1 
 
 11 mm 
 s . + s. < R for j ^ 1, m. 
 
 Then s. is interpreted to be connected to s- . Repeat the above process until 
 V-l 2 
 
 all such possible sub-sequences are exhausted. 
 
 3.3.2 Many Pattern Sequences 
 
 This representation uses the necessary number of pattern sequences to 
 encode the given figures relatively accurately. The pattern sequences, or set of 
 pattern sequences, being utilized here are slightly different from the pattern 
 sequences described in the previous section. The previous pattern sequence may 
 be called a complete pattern sequence since all elementary patterns were con- 
 sidered to be significant for a shape representation. However, pattern sequences 
 considered here are so-called incomplete, i.e., partial pattern sequences in the 
 sense that some elementary patterns are simply ignored, especially those ele- 
 mentary patterns whose magnitude is greater than or equal to R, which is called 
 the visible distance. 
 
57 
 
 We can consider the following three approaches: 
 
 a) S = {SUJ, i=l, 2, ..., K}, 
 
 ■where the given object fi is at most a K-angularly simple region. 
 In Figure 3.12, an example of this representation is illustrated, 
 where (a) shows three complete pattern sequences generated at x , 
 x , and x , respectively, and (t>) shows the modified pattern 
 sequences of (a). This modification of pattern sequences, which is 
 a reasonable consideration, can be achieved by detecting abrupt 
 changes in the complete pattern sequences of (a). 
 
 b) S = {S (x ), S 2 (x q ), ..., s n (x Q )} where each s.(x Q ) is a primal 
 pattern sequence generated at point x , such that at the i-th 
 generation, those portions of figures which have been viewed 
 
 by S (x ) , S (x •) , ..., S. (x ) are eliminated and the remainder is 
 considered to be the i-th encoding figure. Thus, each primal 
 pattern sequence is incomplete. This approach is exemplified in 
 Figure 3.13; the union of solid and dotted lines of (a) shows the 
 originally given figure, and the solid lines show those 
 lines which are encoded by the current pattern sequence. The dotted 
 portion indicates the portion of the figure which is not encoded. 
 Figure 3.13(b) shows the second pattern sequence S (x ) which covers 
 the solid portion of the figure and is generated by eliminating the 
 portion covered by S (x ). 
 
 c) S = {S (x ), S (x ), ..., s (x )} where S.(x ) is a dual pattern 
 sequence and each pattern sequence is generated at x . The rest 
 
 of the explanation is similar to that in case b). This is the dual 
 to the previous method, b), i.e., encoding starting inward from 
 the exterior of the figure, while method b) encodes outward from 
 
58 
 
 (a) Original Pattern Sequences 
 
 (b) Modified Pattern Sequences 
 
 Figure 3.12. Method a): Shape Representation with Multiple Patterns 
 
59 
 
 (a) Si(Xo) 
 
 (b) S 2 (X ) 
 
 Figure 3.13. Method b): Shape Representation with Multiple Primal Patterns 
 
60 
 
 the interior. This approach is exemplified in Figure 3.lU. 
 
 By comparing this method and the method b), if the encoded object is 
 fairly complex, then the method c) is preferable to method b). 
 
 3.3.3 Interval Pattern Sequences 
 
 An interval pattern sequence S is defined similarly to the previous 
 pattern sequences, except that each elementary pattern s. consists of many 
 
 intervals: 
 
 a b Q a ] _b 1 \ k. 
 
 1 1 ' 1 ' ' 1 ' 
 
 a.b . 
 l l 
 
 where each X. indicates that the interval between a. and b. is occupied by the 
 
 the object ^ • 
 Y 
 
 In Figure 3.15 an example of such a representation is illustrated. The 
 advantage of this representation is that only one pattern sequence is sufficient 
 to illustrate fairly complex closed regions; however, because each elementary 
 pattern consists of various numbers of intervals, this makes the operations on 
 pattern sequences fairly complex and time-consuming. Moreover, we need a fairly 
 complex algorithm to recognize the shape which is denoted by this sequence. 
 
 3.3.^ Contour-Following Pattern Sequence 
 
 This method is applicable for almost all simple closed regions and is 
 similar to the Polygonal Representation of Section 3.1. At first the contour 
 (or perimeter) is divided into N segments by N points: 
 
 P — Pq s P-i » Po » • • • 9 P-i* • • ■ a PlM — T * 
 
 Then each elementary pattern of this pattern sequence S(x ) is the distance 
 
 between x and p. for all i, that is, s. = d(x_, p.) for all i. 
 i ' ' i l 
 
61 
 
 (a) Si(x ) 
 
 \ 
 
 \ 
 
 \ Jo 
 
 \ 
 
 (*) S,(xJ 
 
 Figure 3.1*+. Method c): Shape Representation with Multiple Dual Patterns 
 
Figure 3.15. Interval Pattern Sequence 
 
 Figure 3.16. Contour-Following Pattern Sequence 
 
 Convex 
 
 (a) 
 
 Sj-4 Sj_3 S[_ 2 Sj-j Sj Sj+i Sj + 2 Sj +3 S 
 
 i + 4 
 
 c i+k =Sj -s i+k cos(kS) 
 
 a - 
 
 (b) 
 
 
 , -l-~ II 
 
 t- 
 
 Convex 
 Straight 
 
 Concave 
 
 Figure 3.17. Convexity and Concavity on Lines 
 
63 
 
 Clearly, this sequence does not represent a unique region, "but a class 
 of regions. To make this sequence denote a unique region, we will provide 
 another sequence, called a rotation sequence R(x ), where 
 
 R(x Q ) = r Q , r x , ..., r iS ..., r^^ r ± e {1, 0, -1}, 
 r. : denotes angle rotation with respect to the vector 
 
 x^ ,p . n . r. = 1 indicates counter-clockwise rotation, 
 O'^i-l l ' 
 
 and r. = -1 indicates clockwise rotation. 
 
 i 
 
 An example of a Contour-Following pattern sequence is illustrated in Figure 
 3.l6. A more detailed discussion as well as an application of this pattern 
 sequence appears in Section 3,k, 
 
 3.3.5 Convex-Concave Abstraction 
 
 It is appropriate to consider that the contour of a given object con- 
 sists of some convex and concave curves. Such a sequence may be used as an 
 abstract description of the global shape information. Here we briefly discuss 
 how to reduce such a sequence from a given pattern sequence representation of 
 an object. 
 
 Let us consider a single pattern sequence, either primal or dual. In 
 
 such a sequence, s n is said to be a maximal if s, > s, _, , s, , . , and s is said 
 H ' k k k-1' k+1' k 
 
 to be a minimal if s, < s, _ , s. ., . Clearly, each maximal elementary pattern 
 k k-1 k+1 
 
 corresponds to a convex segment on a curve y. However, each minimal elementary 
 pattern does not necessarily correspond to a concave segment on a curve y. 
 
 In Figure 3.17(a) we illustrate how a straight line, a convex line, 
 and a concave line appear as a subsequence between two adjacent maximal points. 
 Here we assume that s. , and s , are maximal elementary patterns and the 
 circularity N = 2tt/6. As we can see from these three lines, it is difficult to 
 tell what line becomes straight and what line becomes a convex line or a 
 
6U 
 
 (a) 
 
 ^^\ 
 
 
 
 
 / < < 
 
 / 
 
 \^A 
 
 1"'? 
 
 \ 
 
 X 
 
 \\ \ " 
 v \\ \ 
 
 — / ft 
 / /' 
 
 /\ 
 
 f^s 
 
 
 / / / 
 
 /» 
 
 1 
 
 
 
 -"" 1 
 
 1 
 
 L^X^ 
 
 
 1 
 
 l^-- 
 
 >r// 
 
 \ V\ 
 
 
 </ 
 
 / / 
 
 \\ V\ 
 \ V 
 
 y 1 * 
 
 \l'J 
 
 \ ^ 
 
 **"• — » • 
 
 A»-* 
 
 (b) — 
 
 (c) 
 
 / 
 
 / Concave 
 
 c i + k =Sj -s i+k cos(kS) 
 
 S=2tt/N 
 
 Figure 3.18. Convexity and Concavity on Simple Closed Curves 
 
 
65 
 
 A 
 
 A 
 
 a 
 
 o 
 
 •H 
 -P 
 U 
 
 cd 
 U 
 
 -P 
 
 u 
 o 
 
 I 
 
 X 
 
 > 
 S3 
 
 O 
 
 o 
 
 0) 
 
 -p 
 
 !h 
 o 
 
 o 
 cd 
 O 
 u 
 ft 
 ft 
 < 
 
 < 
 
 On 
 H 
 
 • 
 
 on 
 
 CD 
 
 U 
 
 bO 
 •H 
 
66 
 
 concave line. However, if we consider a new sequence, c. , = s. - s. , 
 
 l+k 1 l+k 
 
 cos(k6), then we can distinguish them. Three lines of Figure 3.17(a) are 
 plotted in (b), where c. . < implies a concave curve between s. . and s. . , 
 
 and c . . > implies a convex curve between s . , and s . . . 
 
 l+k * i-U i+4 
 
 In Figure 3.18(a), three regions are illustrated along with their 
 corresponding pattern sequences, and c. sequences are shown in (b) and (c), 
 respectively. From a C sequence we can derive another sequence which only con- 
 sists of convexes and concaves. In such a sequence, if more than one connected 
 convex or concave appears, then such a subsequence is shrunk into a single 
 convex or concave symbol. Thus, finally we get a (convex, concave) sequence, 
 and such a sequence is used to describe the global shape of the region. This 
 is exemplified in Figure 3.19« 
 
 3.4 Skeleton of Shapes 
 
 In this section we describe an approach for the detection of skeletons 
 and the decomposition of shapes in terms of the information extracted from the 
 skeleton (see Figures 3.21 and 3.22). 
 
 The basic idea of skeletonization of shapes is to describe a picture by 
 using an intrinsic coordinate system. Thus, the originally given shape can be 
 determined completely from skeleton points with their distance from the contour. 
 The idea of skeletonization of shapes was originally developed by H. Blum many 
 years ago as a new description of shape, and his recent works (1967, 1971) con- 
 tain an extensive discussion of skeletons. 
 
 Rosenfeld and Pfaltz (1966, 1967 ) have defined a discrete skeleton by us- 
 ing a distance transformation and have illustrated a computing algorithm. The 
 generalized and improved algorithm of this type appears in Stefanelli and 
 Rosenfeld (1971). Another definition of skeleton is, for instance, the loci of 
 
67 
 
 points having the same distance which can be seen as successive positions of a 
 propagating wavefront. Wavefront superposition is not allowed, so that wave- 
 fronts intersect on the skeleton and their propagation stops. The algorithm 
 based on the wavefront propagation mechanism has been developed by Montanari 
 (1969) where, in his algorithm, the main idea is the determination of the break 
 points which occur when more than two wavefront s intersect. 
 
 3.^d Skeletonization 
 
 Let ft be the region assumed by a curve y and let p = y(t), t e I. Then 
 £(t) is said to be the pan normal at p if £(t) is perpendicular to y(t) and £(t) 
 points inward. An example of a pannormal is illustrated in Figure 3.20(a). If 
 y is a polygonal curve, then the pannormal for this case is similarly defined, 
 and an example is illustrated in Figure 3.20(b). 
 
 Let us consider a contour- following pattern sequence P^y) °f a shape 
 which has been introduced in Section 3.3.^. 
 
 P n (y) = P » Pji P 2 > •••> P i5 ••«•» %_i» 
 
 where arc length p , p is equal to the arc length 
 
 p. , p. . for i=0, ... , N-2. 
 r i' l+l 
 
 Then p. is said to be convex (concave) if the angle which extends inward of ft 
 1 y 
 
 is less than or equal to it (greater than it). On such a concave point, the above 
 definition of pannormality is not adequate, because at p. there are two nor- 
 mals: one is perpendicular to e. and another is perpendicular to e. _. Thus, 
 in this case we assume any direction between I and I of Figure 3.20(b) is 
 normal at p.. 
 
 A PM-disc of radius r. is the maximal disc on the i-th pannormal £(p. ) 
 which touches point p. on a curve. The center of PM-disc of radius r. is denoted 
 by c.o Therefore, for a given edge-following pattern sequence on a simple curve, 
 
68 
 
 we get a unique sequence C (y): 
 
 C N Y = C 0' C l' ***' °i' "**' C N-1' 
 called the skeleton of ft . Similarly, we have the corresponding radii sequence 
 
 R n (y) 
 
 N " r ' r i s *••» r i » •••s r 
 
 N-l* 
 
 where we will measure the radius r. of the PM-disc at p. on y in the following 
 
 manner. 
 
 We assume a disc of radius r. which is very small. Then by increasing 
 
 r. by S, which is also a small value, we detect the intersection between the 
 
 disc of radius r. + 6 and the edges of Y. If there is an intersection, then r. is 
 
 the radius of PM-disc, otherwise repeat the above process. Here r. = d(p.,c.) 
 
 and c. is on the i-th pannormal. The above process is illustrated in Figure 
 
 3.20(c). The disc of radius r. at center c. is moved away from p. as much as 
 
 l l l 
 
 possible by increasing r. by 6 . 
 
 Let us consider the case that p. is a concave point which consists of 
 
 the intersection point between edges, e. and e. _. . Let us also assume that I. 
 
 l l+l 1 
 
 and £ are two pannormals with respect to e. and e. , respectively, where this 
 is illustrated in Figure 3.20(d). The angle between SL and I is assumed to be 
 6. Since we define any direction which is a linear combination of two vectors 
 I and I to be a pannormal at p. , in a practical case we can consider several vec- 
 tors as the pannormals at p. and generate a PM-disc for each such pannormal. 
 
 3.4.2 Neck/Tail Decomposition 
 
 From the skeleton, it is possible to recognize a number of global properties 
 of shapes, especially those relating to curvature (concave or convex, "neck" 
 parts, "tail" parts, "hair" parts, etc.). 
 
 
69 
 
 (a) Pannormal on a Simple Curve 
 
 (b) Pannormal on a Polygonal Curve 
 
 i, 
 
 (c) PM-disc 
 
 (d) Concave Point 
 
 Figure 3.20. Pannormals and PM-discs 
 
TO 
 
 Let us first consider the figures illustrated in Figure 3.22. (a) shows 
 an example of a neck of a region, and (b) and (c) show examples which may not 
 be interpreted as necks. From these it is said that a neck may exist on a con- 
 cave shape, but not on a convex shape. The significance of locating this kind 
 of property is that one connected region can be decomposed into more than two 
 separate regions by cutting necks. This neck-finding approach is also important 
 for the clustering problem (Zahn, 1971). By doing such a neck decomposition of 
 connected regions into several disconnected regions, possible matching between 
 two regions can be attained. In Figure 3.22(b) an example of a tail region is 
 illustrated. Such a tail is quite often ignored by the main body or cut away 
 from the main body, and it can be considered that the shape consists of two 
 regions: body and tail. We will consider these neck/tail detection and decom- 
 position problems. 
 
 Let C (y) and R^Cy) be two sequences: center coordinates and radius, 
 respectively, which are derived by the skeletonization of a curve y. 
 
 C I\T Y = °0' C l' ***' C j ' ***' C N-1' 
 
 R N Y = r 0' r i' ***' r j ' **"' r N-l' 
 
 N-l 
 r = (2 r.) / N, 
 
 j=o J 
 
 P N-l _ o 
 j=0 J 
 T = r + C a, C > 0. 
 
 Then we have a sequence B (y), 
 
 b n ( y ) = b Q , b x , ..., b j9 ..., b N _ lS 
 
 where f b . = 1 for r . > T 
 
 J J J - 
 
 I b. = otherwise. 
 
71 
 
 Figure 3.21. Skeleton of Shape 
 
 NOT A NECK 
 
 (c) 
 
 NOT A NECK 
 
 TAIL 
 
 NOT A TAIL 
 
 Figure 3.22. Necks and Tails 
 
72 
 
 Let B' be a connected subsequence of B (y), and let |B'| denote the num- 
 ber of elements in B' , i.e., the length of B' . For B' such that for all 
 b, e B' b. = 0, it is said that B' forms a tail of region ft if Ib'I > C^N, 
 1 > C > 0. Here the selection of the value for C is essential for the 
 detection of tails in a region, since for small C quite a few subsequences of 
 B''s exist which satisfy the threshold. They do not necessarily correspond to 
 a tail; however, they may be interpreted as hair parts. From our experience, 
 C should be set to greater than 0.15. Of course, the value T is quite signi- 
 ficant for the detection of both necks and tails. 
 
 Let B 1 and B" be two distinct connected subsequences of B (y) such that 
 
 b., b. = for all b. e B' and b. e B" . Then it is said that (p., p.) of 
 
 (b. , b.) form a neck if d(p. , p.) < e, where e is a very small value. 
 X 3 i J — 
 
 In Figure 3.23, skeletons derived under the threshold values 
 (a) T = 0, (b) T = s, and (c) T = s + 9c are illustrated. The boundary of the 
 figure is denoted by character B, and its skeleton is denoted by the character 
 S . An example of a region decomposition by detecting necks and tails is 
 illustrated in Figure 3.2U. Here, the threshold T = s is used to detect the 
 neck and tail, and the resulting decomposed figure is shown in (e). Another 
 example of region decomposition by finding the tail of a complex figure 
 is illustrated in Figure 3.25. 
 
 Let us consider the following matching-in problem. Is Figure 3.2i+(a) 
 embedded in Figure 3.25(a)? We can't find one region in the picture of Figure 
 3.25(a) which corresponds to the picture of Figure 3.2U(a); however, we can 
 possibly make matching-in if we compare the decomposed picture of Figure 3.2Me) 
 and the picture of Figure 3.25(d). Thus, it may be that possible matching can 
 be attained by decomposing pictures using neck and tail properties. This 
 matching-in problem or figure extraction problem will be discussed in Chapter 7. 
 
73 
 
 i i i i i i 
 I i i i I I 
 
 (C f -Oro(DcoiD(Dincpcn 
 
 CO fO to co <d to w 
 
 i i i i 
 
 i i i i 
 
 1 (/> 
 
 V) 
 
 1 
 
 i 
 
 l 
 
 1 1 
 1 1 
 
 I vt 
 
 \S\ 
 
 v» 
 
 t/i 
 
 VI 
 
 io v> 
 
 V» to 
 
 1 
 
 i 
 
 i 
 
 1 
 
 Vj V) 
 
 I I I I I I I I V) I 
 
 CO I I I 
 
 I I I I I 
 
 I I I I I 
 
 CO I 
 
 I CD CO 
 
 I I I I I I 
 
 I I I I I I 
 
 CQ cn CQ I I 
 
 en co ro en en co i 
 
 I i I i i i I I i i 
 
 I co en. co 
 
 i i i I i 
 
 i I I I I i i i i 
 
 I I f I 
 
 G 
 
 + 
 
 I to 
 
 V) I/) VI VI '» 4 
 
 I CD < 
 CO CD 
 
 CO CD CD CQ CO 
 
 CCCDffiCQCOCQCDCDcDCD I I 
 
 t I I I I 
 I t I t | 
 
 i i i i i i i 
 
 i i i i i i i 
 
 i i i i i i i 
 i i i i i i i 
 
 i co to co co co co co 
 
 IA 1 1 1 | 
 
 1 
 
 t 
 
 1 1 1 
 
 1 V* 1 1 l 
 
 I l 1 1 1 
 
 1 
 t 
 
 1 
 1 
 
 i l V> 
 1 1 V» 
 
 1 (/1(/> 1 > 
 
 I 
 
 1 
 
 l i i 
 
 1 1 (/></> 1 
 
 i 
 
 1 
 
 1 VI 1 
 
 i vt </i vt v) 
 
 VI 
 
 in 
 
 v> t/t v> 
 
 V* V> 1 t I 
 
 1 
 
 VI 
 
 *•> i i 
 
 I CO 1 I I 
 
 CO I 
 
 CD CD 
 I CO CO 
 
 CO CO CO CO CO CD 1 
 
 I I I I I i I I I I V/"> 
 
 I I I I I I I I I I I 
 
 I I I 1 I I I I I I I 
 
 I 1 I I I I I I I I I 
 
 I I I I I t I I I I t 
 
 I I I I 1 I I I I I I 
 
 I I I I 1 I I I I I I 
 
 flO CO CO CD CO 
 
 I I I I I I 
 
 I I I I I 
 
 co co co ca co 
 
 I i I i 1 I I I I I t 1 
 I I I I i I I I i I I i 
 
 iiii 
 
 V» vi V» i/i VI lyi ><? 
 
 Vt I/O t/> V» 
 
 I I CO | 
 
 I i CO L.-'J 
 
 I CO I I 
 CO I I I 
 
 IIII 
 1(11 
 
 IIII 
 
 CO CO CO CO 
 
 V» <^ I 
 
 l/t I I 
 
 V» V> I 
 
 I I t/» 
 
 I it') ca co co co ta co co co co I 
 
 I I VI v> 
 
 Jk) V* Vt Vt 
 
 V» VI VI VI i 
 
 Vt I I I 
 
 ai co co co co co 
 
 i m a) iD co ii) 
 
 l I i l I I i 
 l I I I I 1 t 
 
 Vt t/> I 
 VI VI 
 
 'Wt/iwvu^i/iioW 1 
 
 o di io m ti co 
 
 i i i i i i 
 
 i i i i i i 
 
 I i i i i I 
 
 CO CO CO CD CD v> cn 
 
 I I I t to I 
 
 I I I I I I I 
 I I I </l I I I 
 
 i vt </> */i <v» v» vt 
 
 VI I 1 I I I 
 
 VI 
 
 1 
 
 1 
 
 VI 
 
 1 
 
 1 
 
 VI 
 
 1 
 
 1 
 
 vt 
 
 Vt 
 
 1 
 
 
 1/1 
 
 CO 
 
 1 
 1 
 
 VI 
 
 1 
 1 
 1 
 
 1 
 1 
 
 EH 
 
 H 
 O 
 ,G 
 
 co 
 0) 
 
 -P 
 
 •H 
 I* 
 
 cn 
 
 o 
 -p 
 
 CD 
 H 
 CD 
 
 ^! 
 
 cn 
 
 CM 
 
 ro 
 pi 
 
 ■H 
 
lh 
 
 c 
 o 
 
 •H 
 
 -p 
 
 a 
 
 i i 
 
 i i 
 
 | 5— 1 
 
 * m 
 
 
 
 
 CT\ 
 
 
 
 
 
 
 9. 
 
 
 
 
 ? 
 
 
 + 
 
 
 ■ 
 
 
 
 
 
 a 
 
 
 
 
 
 
 
 
 | - 
 
 
 
 
 
 1 CO 
 
 
 
 
 "5 
 
 
 
 
 
 
 «.! 
 
 II 
 
 
 a 
 
 
 
 «£"" 
 
 
 c 
 
 ** £e 
 
 
 
 
 EH 
 
 o 
 
 S* 
 
 
 , 
 
 
 
 •H 
 
 
 
 
 
 T5 
 
 -p 
 
 •H 
 W 
 
 O 
 
 | 
 
 
 
 
 
 o 
 
 
 
 
 o 
 
 
 
 
 <u 
 
 
 
 
 O 
 
 
 
 
 H 
 
 
 
 
 •H 
 
 
 
 
 rt 
 
 *« 
 
 
 
 EH 
 
 CD 
 
 
 
 
 S) 
 
 
 
 u 
 
 ti 
 
 
 
 
 
 q 
 
 *«o 1 tice 
 
 *• « • * *" *• 
 
 *iff 1 
 
 II 
 
 3 
 
 1 
 
 
 
 
 
 1 
 
 1 
 
 ; 
 
 <rl ' 
 
 &H 
 
 ^ 
 
 
 on c •« * 
 
 ^ iflfil ' 
 
 
 o 
 
 - "^ i 
 
 
 
 
 <D 
 
 
 1 
 
 1 
 
 s — , 
 
 s 
 
 
 
 1 
 1 
 
 1 
 i 
 
 ! 
 
 O 
 
 CM 
 
 on 
 
 
 
 
 CD 
 
 
 
 
 u 
 
 
 
 
 S> 
 
 
 
 
 •H 
 
 
 
 
 
 
 fe 
 
 La ! 
 
 •■ I 
 
 ,a | 
 
 ! i 
 
 § 
 
 •H 
 
 hO 
 « 
 
 CD 
 > 
 •H 
 
75 
 
 MS MB I 
 • S t B I 
 
 s b r s s sss«.s 
 
 • IS » B61>i 
 
 • ss a bb s 
 
 ss MO a s i 
 
 s ss 
 
 i * Si 
 5 SSS 
 
 I s ssss 
 
 «&«;"HBe*Ba 
 
 
 (a) Given Regions 
 
 (b) T = 
 
 IB SS IB 
 
 S\ BB6 
 
 ssss bca 
 
 B t 
 ■ r 
 la a 
 
 5 *S 
 
 lis 
 s its 
 
 s v.s 
 
 V.SS 
 
 ss*--.ss 
 *-s ss 
 % sssss 
 
 % is 
 
 a fa 
 
 • ee 
 
 I B 
 
 * I II 
 
 t • ( 
 
 (c) T = s 
 
 (d) Decomposition 
 
 Figure 3.25. Decomposition of Many Regions 
 
76 
 
 3.5 Characteristic Points of the Representation 
 
 3.5.1 Preliminary Discussion 
 
 In previous sections we discussed the generation of pattern sequences of 
 polygonal regions; however, we have not mentioned yet how we obtain a point x , 
 called an encoding point, of S(x ). In this section, we introduce several 
 methods to find such a point in each given polygonal region. It is clear that 
 for a given region Q , if encoding points x and x' are given, then two genera- 
 ted pattern sequences S(x ) and S(x') denote the same region; however, the two 
 sequences are quite different from each other. Thus, it is preferable to have 
 a unique encoding point for the generation of a pattern sequence for a given 
 region. Moreover, the choice x should not be affected either by the 
 
 rotation or translation of fi . 
 
 Y 
 
 The center of gravity of the polygon might be used as the encoding 
 point because of the uniqueness of the center of gravity. However, if the 
 given region is not convex, then the center of gravity may not necessarily be 
 located in the interior of the region. 
 
 Here we are concerned with interior points such that their spacial distri- 
 bution reflects some shape information of the given region. For 
 example, points marked inside the polygons of Figure 2.U are such points, and 
 these can be considered as "discrete skeleton" points. To assign these points 
 in the region, we first find K- feasible regions by Theorem 5 of K-angularly 
 simple decomposition and assign a point for each feasible region. Because of 
 the convexity of feasible regions, the center of gravity of each region may be 
 considered as a characteristic point. 
 
 However, we will introduce another kind of characteristic point in 
 convex regions in terms of primal pattern sequences. 
 
77 
 
 3.5.2 Determination of Characteristic Point 
 
 Let us consider the following approach: 
 
 Algorithm CP (Characteristic Point) 
 
 Find a point x in a feasible region X of ft "by maximizing the mean of 
 
 elementary patterns in S(x), where S(x) is derived from X , rather than 
 
 ft . 
 Y 
 
 Some other objective functions are : 
 
 i) Minimize the deviation of elementary patterns, 
 
 ii) Maximize the perimeter which is covered by S(x), 
 
 iii) Maximize the area which is covered by S(x). 
 
 To gain a little insight into Algorithm CP, let us consider a particular 
 
 example of a circular region. Let x be the center of the circle, and let 
 
 (k) (k+l) 
 x , x be two points in the circle not identical to x . If 
 
 d(x , x (k) ) > d(x , x (k+1) ), then 
 
 Zs. (k) <Es. (k+1) . 
 
 1 1 
 
 -.. . . . - , . / (k) (k)v i (k+l) (k+lh 
 This is clear because for each l, (s. + s , ) < (s. + s , ) 
 
 (k) (k+l) (k) 
 
 if d(x , x ) > d(x , x ). Thus by the algorithm, s approaches x by 
 
 iteration. 
 
 In Figure 3.26 the characteristic point x attained by Algorithm CS in 
 a convex polygon is illustrated. (a) shows the result determined by the maximi- 
 zation of the mean of elementary patterns, and (b) shows the result determined 
 by the minimization of the deviation of elementary patterns. Points x , x^, 
 x , and x^ are trial points which were assigned as the initial points, and 
 
 they approached to the point x . By the comparison of these results, 
 
 the algorithm finds a fixed point x_ wherever the initial point is. Moreover, 
 
78 
 
 O 
 •rl 
 
 Q 
 
 Cm 
 O 
 
 a 
 o 
 
 •H 
 -P 
 
 s 
 
 o 
 
 bO 
 
 ft 
 
 O 
 
 P-. 
 
 H 
 
 
 -P 
 a 
 
 •H 
 
 <M 
 
 o 
 
 (3 
 
 o 
 
 ■H 
 
 O 
 
 •H 
 -P 
 
 ra 
 
 •H 
 
 M 
 
 (D 
 
 -p 
 o 
 
 H 
 
 Xi 
 o 
 
 OJ 
 
 on 
 
 0) 
 
 M 
 
 •H 
 
79 
 
 x attained by maximizing the mean is almost identical to x attained by mini- 
 mizing the deviation, even though the paths taken by two objective functions 
 are slightly different. 
 
 Let us modify S(x) in Algorithm CP as the pattern sequence derived from 
 
 ft , instead of the pattern sequence derived from the feasible region X of ft , 
 Y Y Y 
 
 and let us apply this algorithm for non-convex regions. 
 
 In Figure 3.27 and 3.28, the behavior of the algorithm applied on 1-angu- 
 larly simple polygons is illustrated. In these polygons, feasible regions 
 are denoted by dotted lines, and the computation results are shown for three 
 objective functions: (a) the maximization of the mean, (b) the maximization of 
 the perimeter, and (c) the maximization of the area. The points determined 
 are not identical; however, they are in the feasible region of each polygon. 
 Therefore, after finding a point in the feasible region X by maximizing the 
 mean of elementary patterns generated from ft , we then maximize the mean of 
 elementary patterns generated from the feasible region X of ft . This will 
 determine the characteristic point x in X , even if the trial points are 
 exterior of X . From the results illustrated in (b) and (c) of Figure 3.27 and 
 Figure 3.28, it can be said that the maximization of the mean (or the perimeter) 
 is preferable to the maximization of the area type objective function. 
 
 Other examples are illustrated in Figure 3.29, where these polygons are 
 not feasible or, even if they are feasible, the feasible set is very small com- 
 pared to the area of polygons. These examples indicate the existence of multiple 
 characteristic points in a non-convex region. 
 
 3.5*3 Properties of Characteristic Points 
 
 In this section, we list some properties of characteristic points which 
 are derived experimentally by Algorithm CP. 
 
8o 
 
 
 o 
 
 a 
 o 
 
 •H 
 -P 
 
 a 
 
 •H 
 
 S 
 
 U 
 
 a 
 p 
 
 <D 
 
 s 
 
 •H 
 fH 
 
 CD 
 
 O 
 
 a 
 o 
 
 ■H 
 
 P 
 
 ctf 
 N 
 
 O 
 
 B 
 
 •H 
 +3 
 
 cd 
 ts] 
 
 O 
 bO 
 
 o 
 ft 
 
 i 
 
 p 
 
 •H 
 
 O 
 •H 
 P 
 
 03 
 
 •H 
 
 <D 
 
 p 
 o 
 c<3 
 Sh 
 
 CM 
 
 00 
 
 CD 
 U 
 
 §> 
 
 •H 
 
81 
 
 a 
 3 
 
 <4H 
 O 
 
 C 
 
 o 
 
 •H 
 -P 
 
 SI 
 
 •H 
 
 o 
 -p 
 
 •H 
 
 PL, 
 
 o 
 o 
 
 •H 
 
 -p 
 
 cd 
 
 tsl 
 •H 
 
 •H 
 
 CO 
 CM 
 
 <D 
 O 
 
 d 
 o 
 
 •H 
 
 -p 
 
 SI 
 
 •H 
 
 s 
 
 •H 
 
 •H 
 
82 
 
 (a) 
 
 0>) 
 
 (c) 
 
 (a; 
 
 Figure 3.29. Characteristic Points in Polygons: 
 Maximization of Mean 
 
83 
 
 i) Let fi be a convex region, and let the circularity N be sufficiently 
 large. Then the point x which maximizes the mean of elementary 
 patterns is unique. That is, if s(x) is the mean at interior point 
 x, then, as we maximize s(x), x approaches x . 
 ii) Let fi he a convex region, and let the circularity N he sufficiently 
 large. Then the point x which minimizes the deviation of elemen- 
 tary patterns is unique. 
 
 iii) For a convex region fi , x maximizes the mean of elementary patterns 
 Y 
 
 if and only if x n minimizes the deviation of elementary patterns. 
 
 iv) If fi is symmetric, then the point x_ which maximizes the mean (or 
 y j- 
 
 minimizes the deviation) of elementary patterns lies on the axis 
 
 which makes fi symmetric. 
 Y 
 
 Theorem 7 . Let x and x be two points in an AS-polygon, and let P(x ) and P(x) 
 be the perimeters of pattern sequences S(x ) and S(x), respectively. 
 
 If x. e X , then 
 y 
 
 and 
 
 P(x Q ) > P(x) 
 
 P(x Q ) > P(x') > P(x) 
 
 for x' = wx + (l-w)x, <_ w _< 1. 
 
 Proof: It is sufficient to show that OP + PQ > OR + RQ > OQ. 
 
 (Q.E.D.) 
 
8U 
 
 Corollary 3. P(x. ) = P(x) = P(x.) for x. ± x. if and only if 
 
 x. , x. e X , where x = wx. + (l-w)x., < w < 1. 
 i J Y x J - ~ 
 
 Proof: It is clear from Theorem 7. 
 
 3.6 Intercomparison of Shape Representations 
 
 In the previous sections we have introduced three basic shape represen- 
 tations — polygonal, pattern sequence, and skeleton representations. Polygonal 
 representation has been used by many psychologists because of its transforma- 
 tion-free encoding. Pattern sequence representation has not been used until 
 very recently; it has proved especially good for simple shape representation. 
 Skeleton representation, however, is a distinct encoding orthogonal to the 
 above contour representations and has unique properties for shape representa- 
 tion. 
 
 Each representation has its own advantages and disadvantages, and there 
 is no general simple shape representation which preserves the advantages of all 
 three representations. Thus for the best shape perception, we should utilize 
 the advantages of all three representations. In this section we list the 
 advantages and disadvantages of the three, although the listing is incomplete. 
 
 Polygonal Representation 
 
 Advantage: The representation is invariant under translation and 
 
 rotation operations. Reconstruction of a shape from the 
 representation is simple and can be used for unlimited 
 classes of regions. 
 
 Disadvantage: Because of its lack of spatial information (i.e., 
 
 difficulty in establishing whether a trial point is con- 
 tained in the polygon), this representation is not suitable 
 
85 
 
 for shape recognition, perception, or identification 
 problems . 
 
 Pattern Sequence Representation 
 
 Advantage: Translation and rotation operations on the representation 
 are simple. Ease of detection of local and global shape 
 features leads to a good encoding for shape recognition, 
 perception, and identification problems. 
 
 Disadvantage: The serious disadvantage of this representation is that 
 it is not easily extended to non-angular ly simple shapes, 
 especially thin-long shapes. Another disadvantage is that 
 it requires large circularity to detect sharp changes in 
 contours . 
 
 Skeleton Representation 
 
 Advantage: This is a very good representation of shapes and has few 
 
 encoding restrictions; it is especially good for thin-long 
 shapes. Local and global shape feature detection is easier 
 than for polygonal representation. Thus, it is a suitable 
 representation for shape recognition, perception, and 
 identification problems. 
 Disadvantage: Although this representation does not place restrictions 
 on the class of shapes to be encoded, the detected skeleton 
 does not readily convey shape information if shapes con- 
 sidered are not long. Reconstruction of shapes from the 
 representation is more difficult and time-consuming than 
 for the other representations. Rotation of the represen- 
 tation is difficult. 
 
86 
 
 FEATURE EXTRACTION 
 
 l+.l Introducti 
 
 on 
 
 In general, for pattern recognition various feature values are 
 extracted from objects, and the performance of a pattern recognition system 
 depends crucially on the selection of features. Methods by which features 
 are selected, however, are often intuitive and empirical and depend upon the 
 designer's experience with the problem. Both feature selection (Jones, 1971; 
 Brown and Owen, 1967; etc.) and dimension reduction (Shepard, 1962, etc.) are 
 important aspects of the design of a shape recognition system. The main 
 guides to feature selection are: (l) features should be invariant to 
 irrelevant variations, such as limited translation, rotation, changes of 
 scale, etc., (2) objects must be pair-wise distinguishable, and 
 (3) redundant features should be eliminated. 
 
 Five major features and physical form measures most highly 
 
 2 
 correlated with these requirements are tabulated by Zusne (1970, p. 223). 
 
 They are "Compactness," "Jaggedness ," "Skewness" of contours related to 
 
 the Y-axis, "Skewness" of contours relative to the X-axis, and "Dominant 
 
 Axis." However, several of these features are not independent of the 
 
 orientation of the shape. 
 
 In Section U.2, 19 selected features are described which are 
 
 collections of features extracted from three basic shape representations 
 
 which were discussed in Chapter 3. Section k.3 is provided to show how 
 some features depend upon encoding parameters. Some special features 
 
 Jones (1971) classified measures into 13 major features. 
 
 2 
 The parameters of dispersion, symmetry, and elongation are considered to be 
 
 analogs of the second, third, and fourth moments of distribution of the area 
 
 of a shape (p. 215). 
 
87 
 
 which should be extracted for shape perception problems will be discussed 
 in Section h.k. 
 
 k .2 Description of Specified Features 
 
 In Table k .1 we list 19 features which are independent of the 
 orientation of a shape, except features Fk , F5, and F6 . These features are 
 valid not only for a class of polygonal shapes, but also for the closely 
 allied class of polynomial curved shapes. Not all these 19 features are equally 
 extracted easily from all three shape representations previously discussed in 
 Chapter 3; however, this is a comprehensive collection of features extracted from 
 three basic shape representations: polygonal, pattern sequence ,' and skeleton 
 representations. In Table k.2 we show which of these features are extracted 
 from polygonal, pattern sequence, and skeleton representations, respectively. 
 
 Any of these features can be extracted from each shape representa- 
 tion. Features extracted from polygonal representation can be extracted from 
 skeleton representation and vice versa, by transforming from one representa- 
 tion to the other. However, we select features for each representation 
 which are dependent upon the representation itself and are easy to extract 
 directly from the representation. 
 
 Some of these features in Table U.l are self-evident, and for those 
 
 which are not, brief explanations are given below. We will explain features 
 
 in terms of pattern sequence representation; however, the correspondence 
 
 between the pattern sequence and skeleton representation becomes clear if 
 
 we interprete the radii sequence Fvt(y) i- n Section 3.^.1 to a pattern 
 
 sequence S (x^) in Section 3.2.1. 
 r 
 
88 
 
 NO. 
 
 NAME 
 
 Fl 
 
 PERIMETER 
 
 F2 
 
 AREA 
 
 F3 
 
 PERIM/AREA 
 
 FU 
 
 M2 
 
 F5 
 
 M3 
 
 F6 
 
 Mil 
 
 FT 
 
 MEAN-R 
 
 F8 
 
 MA.XAMP 
 
 F9 
 
 DEV-R 
 
 FIO 
 
 ALT# 
 
 Fll 
 
 MAXIMAL/? 
 
 F12 
 
 DEV-E 
 
 F13 
 
 CONVEXITY 
 
 FlU 
 
 SYMMETRY 
 
 F15 
 
 DEG-AS 
 
 Fl6 
 
 DEG-F 
 
 FIT 
 
 DEV-ANG 
 
 Fl8 
 
 ANG# 
 
 F19 
 
 VERTEX# 
 
 Table k.l. Selected Shape Features 
 
 DESCRIPTION 
 
 Perimeter P (unnormalized) 
 
 Area A (unnormalized) 
 
 (1 - 2/^A/P) 
 
 Degree of variance 
 
 Degree of skevness 
 
 Degree of elongation 
 
 Mean of unnormalized elementary patterns 
 
 Difference between the maximum and the minimum in 
 elementary patterns* 
 
 Deviation of elementary patterns* 
 
 Number of alternations in a pattern sequence wrt the mean 
 
 Number of local maximal points in a pattern sequence 
 
 Deviation of edge lengths* 
 
 Degree of convexity 
 
 Degree of symmetry 
 
 Degree of angular simplicity 
 
 Degree of feasibility 
 
 Deviation of angles 
 
 Number of angles less than i\/h and greater than l^/h 
 
 Number of vertices in a polygonal representation 
 
 * Each elementary pattern sequence is normalized by the mean of elementary 
 patterns: FT ( MEAN-R ) . 
 
89 
 
 Table k .2 . Extracted Features from Each Shape Representation 
 
 POLYGONAL PATTERN SEQ. SKELETON 
 REP. REP. REP. 
 
 Fl 
 
 * 
 
 F2 
 
 * 
 
 F3 
 
 * 
 
 FU 
 
 * 
 
 F5 
 
 * 
 
 F6 
 
 * 
 
 F7 
 
 
 F8 
 
 
 F9 
 
 
 F10 
 
 
 Fll 
 
 
 F12 
 
 * 
 
 F13 
 
 
 FlU 
 
 
 F15 
 
 
 Fl6 
 
 
 FIT 
 
 * 
 
 Fl8 
 
 * 
 
 F19 
 
 * 
 
90 
 
 Features Fk , F5 , and F6 : these correspond to the second, third, and fourth 
 areal moments, respectively, where each moment has two values — 
 the moment with respect to the X-axis and the moment with respect 
 to the Y-axis. 
 
 Feature F10 : half of the number of times the sequence S(x ) crosses the 
 circle of radius s. For the polygon which is illustrated in 
 Figure U.l(a), the number of alterations is k. 
 
 Feature Fll: the number of local maxima in the sequence of S(x ) minus 
 the number of alternations in F10. (Because the sequence is 
 closed, the number of local maxima is identical to the number of 
 local minima). In the example of Figure k.l, elementary patterns 
 s , s, , Sq, and s correspond to local maxima, and s , s^, s , 
 and s , correspond to local minima. Thus, this feature value 
 is zero. 
 
 Feature F12 : /z (e. - e) /(number of arcs) where distances between 
 
 J J 
 adjacent pairs of extrema (maximal and minimal points) are 
 
 considered to be the e.'s. For example, in Figure H.l(c), 
 
 s s is an edge and s s + s s p is a curve length for the shape 
 
 of Figure U.l(b). In other words, the distance between two 
 
 adjacent extrema or the arc length between two adjacent extrema 
 
 is interpreted as e . . e is the mean of e.'s. 
 
 J J 
 
 Feature F13 : N /the circularity N, where N is determined in the following 
 manner. For each pair of adjacent maxima, we assume an edge — 
 an example is illustrated in Figure U.l(a) as a dotted straight 
 line. Then we count the number of elementary patterns between the 
 maxima whose elementary pattern length is greater than the distance 
 between the encoding point and the edge. We repeat this process 
 for all possible adjacent maxima. For example, in Figure U.l(a), 
 
91 
 
 (a) A Polygon and a Circle of Radius s 
 
 (b) A Polynomial Curved Shape 
 
 S S t S 2 S3 S 4 S 5 S 6 S7 Sg S 9 S 10 Su S 12 S13 Sj4 S15 
 
 (c) The Sequence, S (x ), of the Above Poly 
 
 V 
 
 gon 
 
 Figure k.l. Examples of Shapes Arid Their Pattern Sequences, S (x ) 
 
 V 0' 
 
92 
 
 elementary patterns s and s are greater than the distance to 
 the edge. Each maximal point is also counted. Thus, we get the 
 degree of convexity for both shapes, (a) and (b), as h/5. In 
 Figure U.l(c), those elementary patterns which contribute to the 
 degree of convexity are connected by bold lines. If we double the 
 circularity N, i.e., 32, then the degree of convexity becomes 5/8. 
 
 Feature FlU: the degree of symmetry for S (x ) is defined in the 
 
 following manner. Let N be the number of elementary patterns 
 
 which satisfy 
 
 Is . - s . I < . lU* max( s . , s . ) 
 1 r+i r-i ' — r+i r-i 
 
 for i = 1,2,..., N/2-1, where r+i and r-i are interpreted 
 
 (mod N) for all i. Then the degree of symmetry is given by 
 
 max {N }/half of the circularity N 
 r 
 r 
 
 Thus, this feature value ranges between and 1. 
 Feature F15: the degree of angular simplicity is defined by the ratio 
 
 between the area covered by a primal pattern sequence and the 
 
 actual area of the object. Thus, this feature value ranges 
 
 between and 1. 
 Feature Fl6 : the degree of feasibility is defined by the ratio between the 
 
 area of the feasible region and the actual area of the object. 
 
 This feature assumes values between and 1, and it is for any 
 
 nonangularly simple region and 1 for convex regions. 
 
 3 
 
 The parameter of symmetry is considered to be the analog of the third moment 
 
 of distribution of area (Brown and Owen, 1967). 
 
93 
 
 k.3 Feature Dependence on Encoding Shapes 
 
 This section is provided to show how some features: MAXAMP, 
 MEAN-R, DEV-R, and PERIM/AREA in Table k.l, depend upon encodings. Those 
 encoding parameters which we are concerned with here are: 
 (i) array cell sizes, 
 (ii) white/black encoding, 
 
 (iii) primal, dual, or edge-following encoding, and 
 (iv) circularity N, 
 where the shape representation used for the measurement of feature dependence 
 is the pattern sequence representation. 
 
 Let us assume that a two-dimensional shape is given as an image 
 array. An example of such a shape is illustrated in Figure U.2(a), where we 
 assume the shape described here is the originally given accurate shape. In 
 Figures (b), (c), and (d) we illustrate the black-boundary and white-boundary 
 of the shape in (a) for different choices of array cell size denoted by 
 1 x 1, 2 x 2, and k x k, respectively. Thus, intuitively, it may be 
 concluded that for a larger cell size, an object can be represented accurately, 
 However, concerning boundaries, we can say nothing about whether a black or a 
 
 white boundary is preferable with respect to feature values. 
 
 In Table k.3 we show absolute errors for various features under the 
 variation of array cell sizes, white-black boundaries, and primal, dual, or 
 edge-following encoding methods. In this table we assumed that the feature 
 values for a geometric boundary are true values for fixed circularity N = U8. 
 From this table we understand that errors decrease as array cell size 
 increases. Moreover, at least for a fixed circularity, whatever array cell 
 sizes are used, absolute errors for black-boundary encoding are less than 
 the absolute errors by white -boundary encoding. That is, the black-boundary 
 
9h 
 
 
 
 
 
 
 
 
 
 
 
 i 
 
 
 
 
 
 
 i 
 
 
 
 % 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 M 
 
 
 P 
 
 
 
 
 
 
 
 v// 
 
 
 
 
 
 
 
 
 W/ 
 
 % 
 
 
 
 
 
 
 
 
 
 
 SI 
 
 (a) 
 
 (b) 1x1 
 
 
 £1S 
 
 7rnY 
 
 -^^rr^ZAJE—- *_5_. 
 
 7 r 17 ^ — -i Y 
 
 "IT A:*! 
 
 7rV 
 
 __j_ iizs~ 
 
 V 3-Y- 
 
 _,_:iZ 
 
 Zr^ _ l_l 
 
 S^E— — :___ j: 
 
 -^ — -.V ;: 
 
 - 5J_v_. v ~~ 
 
 _- v 
 
 5JE — ±2 
 
 N ^ 
 
 
 
 
 
 -^ — 5^ 
 
 2^— -i V 
 
 
 _______ , 
 
 = =:::i:!!iS53_=.sss«ii,j::: 
 
 II - 
 
 -_=- *t 
 
 7^_. — -^ 
 
 
 
 
 
 
 
 
 
 
 
 5 ^Y 
 
 
 
 __. — ^ 
 
 
 
 <i___,. — **- 
 
 
 
 (c) 2x2 
 
 (d) U x k 
 
 Figure U.2. Black and White Boundaries 
 on Different Array Cell Sizes 
 
95 
 
 Table U.3. Absolute Errors for Various Features Under the Variation of Array 
 Sizes, White-Black Boundaries, and Primal or Dual or Edge- 
 Following Encoding. Feature Values for Geometric Encoding which 
 is the Exact Representation of Image is Assumed Error Zero. 
 N = 1+8 
 
 
 Geo- 
 metric 
 
 1 . 
 
 K 1 
 
 2x2 
 
 1+ 
 
 x 1+ 
 
 White 
 
 Black 
 
 White 
 
 Black 
 
 White 
 
 Black 
 
 
 Primal 
 
 .00 
 
 .20 
 
 .09 
 
 .12 
 
 .01+ 
 
 .06 
 
 .02 
 
 1 
 
 Dual 
 
 .00 
 
 .20 
 
 .09 
 
 .12 
 
 .0U 
 
 .06 
 
 .02 
 
 1 
 
 E.F. 
 
 .00 
 
 .21 
 
 .05 
 
 .08 
 
 .01 
 
 .05 
 
 .00 
 
 
 Primal 
 
 .00 
 
 .12 
 
 .05 
 
 .09 
 
 .01+ 
 
 .07 
 
 .02 
 
 1 
 
 Dual 
 
 .00 
 
 .08 
 
 .06 
 
 .06 
 
 .05 
 
 .05 
 
 .02 
 
 1 
 
 E.F. 
 
 .00 
 
 .09 
 
 .01* 
 
 .01+ 
 
 .03 
 
 .03 
 
 .01 
 
 K 
 
 Primal 
 
 .00 
 
 .20 
 
 .02 
 
 .07 
 
 .01 
 
 .01 
 
 .01 
 
 1 
 
 > 
 
 Dual 
 E.F. 
 
 .00 
 .00 
 
 .19 
 .19 
 
 .05 
 .05 
 
 .09 
 .09 
 
 .03 
 .03 
 
 .01 
 .02 
 
 .01+ 
 .02 
 
 
 Primal 
 
 .00 
 
 .1+7 
 
 .27 
 
 .25 
 
 .05 
 
 .09 
 
 .00 
 
 < 
 
 On 
 
 Dual 
 
 .00 
 
 M 
 
 .2k 
 
 .29 
 
 .06 
 
 .13 
 
 .00 
 
 
 E.F. 
 
 .00 
 
 M 
 
 .22 
 
 .28 
 
 .05 
 
 .13 
 
 .01 
 
 Table k.k. Absolute Errors for Feature MEAN-R "for Different Circularity N, 
 Feature Values for Geometric Boundary are Assumed to be True 
 Values. Image Array Size is 2 x 2 
 
 N 
 
 Primal 
 
 Dual 
 
 E. 
 
 F. 
 
 White 
 
 Black 
 
 White 
 
 Black 
 
 White 
 
 Black 
 
 96 
 
 .09 
 
 .05 
 
 .07 
 
 .06 
 
 .05 
 
 .03 
 
 72 
 
 .08 
 
 .05 
 
 .06 
 
 .05 
 
 .01+ 
 
 .03 
 
 1+8 
 
 .09 
 
 .01+ 
 
 .06 
 
 .05 
 
 .01+ 
 
 .03 
 
 36 
 
 .07 
 
 .06 
 
 .06 
 
 .05 
 
 .01+ 
 
 .03 
 
 21+ 
 
 .15 
 
 .01 
 
 .13 
 
 .03 
 
 .05 
 
 .03 
 
 18 
 
 .09 
 
 .03 
 
 .08 
 
 .03 
 
 .06 
 
 .02' 
 
 12 
 
 .21 
 
 .02 
 
 .15 
 
 .03 
 
 .05 
 
 .02 
 
 6 
 
 .03 
 
 .06 
 
 .03 
 
 .06 
 
 .13 
 
 .01 
 
96 
 
 encoding is less influenced by the array cell size than the white- 
 boundary encoding. 
 
 To see how encodings, primal, dual, and edge-following, affect 
 feature values, absolute errors for features, MEAN-R and DEV-R, are 
 tabulated in Table k.k and Table U.5. Here feature values of the geometric 
 boundary with corresponding encoding are assumed to be true feature values 
 for each circularity N. From these two tables, it becomes clear that 
 whichever primal, dual, or edge-following encoding method is used, the 
 absolute error by black-boundary is less than the absolute error by white- 
 boundary for various circularities. Because of its heavy dependence of 
 encoding errors on individual shape — unless a shape to be encoded is 
 analyzed — it is not appropriate to say which of the three encodings is 
 preferable for a general shape representation. 
 
 In Table k.6 absolute errors for features MEAN-R and DEV-R are 
 tabulated for a black-boundary where feature values for circularity N = 96 
 are assumed to be true values for each encoding. From this table it seems 
 very likely that absolute errors of edge-following encoding are less affected 
 by the change of circularity N, especially when N >_ 2k. It is not appropriate 
 to say which of the primal or dual encoding methods is preferable, since two 
 generated sequences describe different regions, especially when the encoded 
 object is not in the class of 1-AS-polygons . 
 
 In Figure k.3 two relatively complex polygonal regions called 
 "Embryo" and "Sleeping Cat" are illustrated. Using these two shapes, let 
 us consider how the circularity N affects feature values. In Tables k.J and 
 k.Q absolute errors for features of Embryo and Sleeping Cat are tabulated, 
 respectively, where for each table feature values at circularity N = 96 are 
 assumed to be true values. In Figures k.k and h. -5 some of the errors 
 tabulated in Tables k."J and k.Q are plotted to see how circularity affects 
 
97 
 
 Table 1+.5. Absolute Errors for Feature DEV-R for a Different Circularity N, 
 Feature Values of Geometric Boundary are Assumed to be True 
 Values. Image Array Size is 2 x 2 
 
 N 
 
 Primal 
 
 Dual 
 
 E.I 
 
 i 
 
 White 
 
 Black 
 
 White 
 
 Black 
 
 White 
 
 Black 
 
 96 
 
 .07 
 
 .00 
 
 .08 
 
 .0U 
 
 .08 
 
 .03 
 
 72 
 
 .06 
 
 .02 
 
 .08 
 
 .05 
 
 .08 
 
 .01+ 
 
 1+8 
 
 .08 
 
 .01 
 
 .09 
 
 .03 
 
 .09 
 
 .03 
 
 36 
 
 .Ok 
 
 .02 
 
 .07 
 
 .00 
 
 .08 
 
 .02 
 
 2k 
 
 .Ok 
 
 .00 
 
 .08 
 
 .02 
 
 .09 
 
 • 05 
 
 18 
 
 .01 
 
 .ok 
 
 .05 
 
 .05 
 
 .06 
 
 .10 
 
 12 
 
 .18 
 
 .20 
 
 .07 
 
 .00 
 
 .05 
 
 .09 
 
 6 
 
 .03 
 
 .ok 
 
 .03 
 
 .01+ 
 
 .19 
 
 .23 
 
 Table 1+.6. Absolute Errors for Features MEAN-R and DEV-R. Feature Values 
 at Circularity N = 96 are Assumed to be True Values for Each 
 Encoding, where Image Array Size is 2 x 2 and Black-Boundary 
 is used. 
 
 
 MEAN-R 
 
 DEV-R 
 
 
 N 
 
 Primal 
 
 Dual 
 
 E.F. 
 
 Primal 
 
 Dual 
 
 E.F. 
 
 96 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 72 
 
 .00 
 
 .01 
 
 .00 
 
 .00 
 
 .02 
 
 .00 
 
 1+8 
 
 .01 
 
 .01 
 
 .00 
 
 .01 
 
 .01 
 
 .00 
 
 36 
 
 .00 
 
 .01 
 
 .00 
 
 .01 
 
 .02 
 
 .01 
 
 2k 
 
 .03 
 
 .03 
 
 .03 
 
 .01 
 
 .01 
 
 .00 
 
 18 
 
 .11 
 
 .12 
 
 .03 
 
 .07 
 
 .10 
 
 .01 
 
 12 
 
 .00 
 
 .03 
 
 .01+ 
 
 .08 
 
 .02 
 
 .01+ 
 
 6 
 
 .25 .22 
 
 .29 
 
 .11+ 
 
 .17 
 
 .03 
 
98 
 
 oding Point 
 
 (a) Embryo 
 
 Encoding Point 
 
 (b) Sleeping Cat 
 
 Figure k.3. Polygons: Embryo and Sleeping Cat 
 
99 
 
 Table h.J. Absolute Errors for Features of Embryo, where Feature Values at 
 Circularity N = 96 are Assumed to be True Values 
 
 N 
 
 MEAN-R 
 
 DEV-R 
 
 Primal 
 
 Dual 
 
 E.F. 
 
 Primal 
 
 Dual 
 
 E.F. 
 
 96 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 72 
 
 .01 
 
 .01 
 
 .00 
 
 .00 
 
 .01 
 
 .00 
 
 1*8 
 
 .00 
 
 .01 
 
 .01 
 
 .01 
 
 .00 
 
 .01 
 
 36 
 
 .02 
 
 .00 
 
 .00 
 
 .00 
 
 .02 
 
 .00 
 
 2k 
 
 .02 
 
 .02 
 
 .02 
 
 .00 
 
 .01 
 
 .01 
 
 18 
 
 .03 
 
 .01+ 
 
 .01 
 
 .03 
 
 .06 
 
 .00 
 
 12 
 
 .02 
 
 .09 
 
 .05 
 
 .03 
 
 .00 
 
 .02 
 
 6 
 
 .3k 
 
 .12 
 
 .05 
 
 .02 
 
 .15 
 
 .08 
 
 Table U.8. Absolute Errors for Features of Sleeping Cat, where Feature 
 Values at Circularity N = 96 are Assumed to be True Values 
 
 N 
 
 MEAN-R 
 
 DEV-R 
 
 Primal 
 
 Dual 
 
 E.F. 
 
 Primal 
 
 Dual 
 
 E.F. 
 
 96 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 .00 
 
 72 
 
 .01 
 
 .02 
 
 .01 
 
 .00 
 
 .01 
 
 .00 
 
 1+8 
 
 .0U 
 
 .01+ 
 
 .01 
 
 ,0k 
 
 .01+ 
 
 .01 
 
 36 
 
 .03 
 
 .06 
 
 .01 
 
 .05 
 
 .07 
 
 .00 
 
 2k 
 
 .03 
 
 .03 
 
 .01 
 
 .07 
 
 .02 
 
 .02 
 
 18 
 
 .06 
 
 .08 
 
 .01 
 
 • 07 
 
 .09 
 
 .00 
 
 12 
 
 .08 
 
 .07 
 
 .02 
 
 .23 
 
 .11+ 
 
 .07 
 
 6 
 
 • 19 
 
 .01+ 
 
 .10 
 
 .27 
 
 .31+ 
 
 .27 
 
100 
 
 0.4 
 
 0.3 
 
 a: 
 o 
 a: 
 
 UJ 
 
 .0.2 
 
 0.1 
 
 0.0 
 
 -• PRIMAL 
 
 X -X DUAL 
 
 O O E-F 
 
 
 
 20 
 
 ^*^T 
 
 40 60 
 
 CIRCULARITY N 
 
 80 
 
 100 
 
 Figure k .k . Errors v.s. Circularity N of Embryo: DEV-R 
 
101 
 
 PRIMAL 
 
 20 
 
 40 60 
 
 CIRCULARITY N 
 
 80 
 
 100 
 
 Figure U.5. Errors v.s. Circularity N of Sleeping Cat: MEAN-R 
 
102 
 
 features. From these tables and figures it can be said that feature values 
 are not significantly affected by circularity. In other words, for N > 2h 
 feature Yalues are stable and errors may be negligible. 
 
 By the above arguments we may make the following summary: 
 (i) Errors decrease as array cell size increases, 
 (ii) Black-boundary encoding is preferable to white-boundary 
 encoding, 
 (iii) Errors on edge-following are less affected by the change 
 of circularity than those of black or white boundaries, 
 (iv) In general, errors increase as circularity decreases, and 
 those errors for circularity N >_ 2U do not seem 
 significant; of course, this is dependent on the complexity 
 cf figures. 
 
 h. h Extraction of Local Features 
 U.U.I Preliminary Discussion 
 
 In Section U.2 we introduced 19 features which were extracted 
 from three shape representations: polygonal, pattern sequence, and skeleton 
 representations. These features are some kind of physical measures of an 
 object, and unfortunately, they do not directly describe its shape 
 information. In a sense, the features in Table U.l are global features of 
 a shape, and may not describe local features such as lobes, cavities, 
 and roughness on curves . 
 
 Let us consider the shapes illustrated in Figure U.6. Some 
 interesting local features on each shape are indicated with a dotted circle 
 where these local features may be called "lobes," "cavities," and 
 "roughness," respectively. It is clear that humans detect these local 
 features. Thus, for the description of objects, such features should also 
 
103 
 
 (a) Local Lobe 
 
 (b) Local Cavity 
 
 (c) Local Roughness 
 
 Figure U.6. Shapes with Interesting Local Features 
 
10U 
 
 be considered as well as those in Table k.l. Necks, tails, and hairs of 
 shapes are also considered to be local features and have been discussed 
 in Section 3. ^.2. 
 
 In the following sections we will describe an approach for the 
 extraction of each local feature from rather simple regions as shown in 
 Figure H.6. Some other local features which are not discussed in the 
 following sections, however, can be considered as significant local features 
 for shape perception problems and are illustrated in Figure k.'J. 
 
 For example, let us assume that a given object is represented by 
 a pattern sequence S(x ). Then the extraction of local features may 
 correspond to the determination of subsequences in S(x _) based on how and 
 where the human pays attention. In other words, we are interested in 
 determining a subsequence of a curve where a human eye's fixation time is 
 longer than for another. 
 
 k.k.2 Determination of Local Lobes 
 
 We assume the bi-pattern sequences for the object to be encoded. 
 We generate two kinds of pattern sequences: a primal pattern sequence 
 S(x ) and a dual pattern sequence S'(x ). 
 
 S ^ x = S 0' s i' #, *> S j_'---> s n_i' 
 
 S{x ) = s^, s^,..., s^,..., s^, S ' ± = R -7 ± for all i. 
 
 It is clear from the definition of these pattern sequences that s' > s for all 
 
 i — 1 
 
 i, and this is denoted by S '(x ) r> S(x ) . An example of these sequences can 
 
 - 
 
 be found in Figure 3.11. If S'(x ) = S(x ), then the encoded region is 
 1-angularly simple, and the approach for the determination of local cavities 
 in the following section will be considered for the determination of 
 local lobes. 
 
105 
 
 
 Figure h .J . Other Examples of Local Features 
 
106 
 
 Let S' be given by 
 
 O "— o. -i S . , . . . - S . » . . . . S . 
 11 1 1 
 
 1 2 j m 
 
 which is a sub-sequence of S(x ) such that 
 
 
 
 6. +s~. =R, s. +7. = R 
 
 11 mm 
 
 and s . + a ^ R for j = 2,..., m-1 . 
 
 where R is the radius of a circle which contains the object 
 
 completely. Then a sub-sequence S", which contains S', is a sub-sequence 
 
 where some local lobes may be extracted. By this approach, the 
 
 region or curvature indicated by a dotted circle of Figure U.6(a) can be 
 
 determined. 
 
 Let |S'| denote the length or cardinality of a sub-sequence S' , 
 and let us defind D and D as follows: 
 
 D = | S ' | /N and D = 1 - D . 
 
 Then we assume that the greater the difference between D and D , the more 
 the human tries to keep his attention on such a shape and tries to extract 
 local features. This may be called the local oddness of a shape (c.f., odd 
 shape detection in Maruyama, 1971)- This implies the determination of a 
 portion of a curve on a shape which is different from the majority of curves, 
 
 H.4.3 Determination of Local Cavities 
 
 To detect the portion of the shape of Figure U.6(b) which is 
 indicated by a dotted circle, we consider a sequence H which is derived 
 from S(x_) in the following manner: 
 
 
107 
 
 H = h Q , h 1 ,... 1 h.,..., h N _ lS 
 
 where h . = I s . , _, - s.l + Is. - s. I for i = 1 , . . . , N-2 
 
 1 ' l+l i' ' i l-l 1 
 
 and h Q = \s ± - s Q | + |s - s^l, 
 
 Vl = ' S " S N-ll + l S N-l " S N-2 ' * 
 
 The H sequence magnifies elementary patterns which correspond to vertices 
 
 on the closed curve. A sub-sequence H' of H can "be determined in the 
 
 following way" 
 
 h. e H' such that h. > T(h) , 
 l l 
 
 where T is a threshold based on h, 
 
 N-l 
 
 h - (£ h.)/N. 
 i=0 X 
 
 Thus, by determining the subscripts of H' it is possible to 
 
 determine which of the elementary patterns s.'s in S(x ) denote local cavities 
 
 U.U.U Determination of Local Roughness 
 
 The dot-circled portion of the shape in Figure k.6(c) may be 
 extracted by determining the angle changes between the adjacent vectors on 
 the curve if the curve is smooth. This is illustrated in Figure U.8(a). 
 
 To determine the rough portion on a polygonal curve, we 
 
 distribute N number of points p.'s on the curve y in an equi -interval 
 
 manner as mentioned in Section 3.h. And at each point p. on y , we consider 
 
 a disc D of radius R. Then we measure the length 1. of the curve segments 
 K l 
 
 contained in the disc D^ at each point p.. 
 
 R l 
 
 L 1 Q , 1 ,..., l i ,... , 1 N _. 
 
108 
 
 (a) Angle Changes on a Smooth Curve 
 
 (t>) An Approach for Measuring Roughness 
 
 Figure k.Q. Extraction of Local Roughness 
 
109 
 
 This approach for the detection of the sequence L is illustrated in 
 
 Figure U.8(b). We normalize l.'s in such a way that max {1'.} = lwhere 
 
 i 
 
 1 ! = I . -2R so that we can consider the distribution of l.'s "between values 
 *i l l 
 
 and 1 as illustrated in Figure U.9. Distributions around imply smoothness, 
 and distributions around 1 imply roughness, so smoothness or roughness can 
 be determined by the distribution frequencies. This distribution is 
 called SR-di st ribution . 
 
 Some examples of SR-di st ribution are illustrated in Figure U.10 
 where one shape consists only of smooth curves and the other consists only 
 of rough curves. Other examples are illustrated in Figures U.ll(a) and (b) 
 where the roughness and smoothness between them is completely opposite. 
 
 From these examples it becomes clear that SR-distribution can be 
 used for the determination of local features. This can be used for some 
 others such as a solution for odd shape detection, shape clustering, and 
 a quick matching-in or quick shape extraction problem. Figure U.12 shows 
 an example of how shape extraction can be accelerated by considering 
 SR-distribution on curves. 
 
110 
 
 m 
 
 ox {I,} 
 
 min [lj| 
 
 +- Rough 
 
 Figure k.9. SR-distribution: Smoothness and Roughness 
 
 1 
 
 a. 
 
 (a) Smooth Curve 
 
 (b) Rough Curve 
 
 Figure U.10. Examples of SR-distribution 
 
Ill 
 
 (a) 
 
 (b) 
 
 £222 
 
 Figure U.ll. Examples of SR-distri"bution 
 
 Figure U.12. An Application of SR-distribution for Shape Extraction 
 
112 
 5. DETECTION OF ODD (SIMILAR) SHAPES 
 
 In the previous chapters we have discussed shape representation and 
 feature extraction. As an application of these — a procedure to determine an 
 odd shape or form in a given set of shapes— each set consisting of more than 
 two shapes will be described. Here we utilize the pattern sequence represen- 
 tation as shape description. 
 
 In Section 5-1 the "oddness" of a shape in a set of shapes on the 
 basis of a specified feature is defined, and then the "most odd" shape on the 
 basis of many detected features is defined. A complete description of the 
 system for odd shape detection appears in Section 5.2. To make the system 
 response like an average human response, modification of an assumed weight 
 vector by means of an "attention mechanism" is considered. Section 5.3 is 
 provided to show the performance of the system by comparing the system 
 response with the average human response. 
 
 5-1 Basic Definitions 
 
 Let us first define discrimination within a given set of shapes on the 
 basis of a single specified feature. We assume that the given set of shapes, 
 ft , ft , . .., ft., . .., ft , to be presented to the subject (man /machine) consists 
 of more than two shapes, i.e., n >_ 3. 
 
 The specified feature, C, takes test results x , x , ..., x for the 
 n shapes respectively. 
 
 Definition 1: Odd Shape /Form on C 
 
 We define the D- interval , d., for each ft. as: 
 J J 
 
 n n 
 
 d. = max {x, } - min (x). 
 
 J k=l K k=l K 
 
 Then we define ft . as an odd. shape/form in the given set of n shapes on 
 
 J 
 
113 
 
 1 n 
 the "basis of C if d. = min {d, }. 
 
 J k=l ^ 
 The following is an alternative definition of oddness based on a single 
 specified feature: 
 
 Definition 2: Odd Shape /Form on C 
 
 We define deviation, d. for each ft. as: 
 J J 
 
 n n 
 
 d. = zl x, - x. where x. = ( JJx^ )/(n-l) 
 J k=l K J J k=l^ 
 
 ¥i ¥i 
 
 Again we define ft . as an odd shape/form in the given set of shapes on 
 
 J n 
 
 the basis of feature C if d. = min {cL }. 
 
 J k=l ^ 
 
 To see how the above definition of oddness reflects human response on 
 the basis of a specified feature, let us consider the examples illustrated in 
 Figure 5.1. We assume, for simplicity, that the specified features are the 
 number of vertices and convexity. Table 5-1 shows the extracted feature values 
 for the specified two features, and Table 5.2 shows the D-intervals, where we 
 assume that the convex shape is denoted with value 1 and the concave one with 
 0. For the set A, ft. is said to he an odd shape on the basis of the number of 
 vertices, while ft would be chosen as an odd shape on the basis of convexity 
 since d = is the minimum. For set B, the number of vertices is not signifi- 
 cant for the decision of oddness, since d., j = 1, 2, 3, h, assume identical 
 
 J 
 
 values. Similarly, for set B, the feature of convexity does not play any 
 significant role for the decision of oddness since ft and ft p are concave and 
 ft and ft, are convex. 
 
 it is easy to see that d. is minimum only if x. is an extremal. 
 
llU 
 
 a 
 
 a. 
 
 a- 
 
 a, 
 
 (a) Set A of Four Shapes 
 
 a. 
 
 a< 
 
 a- 
 
 a, 
 
 (b) Set B of Four Shapes 
 
 Figure 5.1. Trial Sets of Shapes 
 
115 
 
 Table 5*1 • Feature Values of Shapes in Figure 5.1. 
 
 
 
 flj 
 
 tt 2 
 
 «3 
 
 «u 
 
 Set 
 
 A 
 
 Number of 
 Vertices 
 
 5 
 
 6 
 
 6 
 
 3 
 
 Convexity 
 
 1 
 
 l 
 
 
 
 i 
 
 Set 
 
 B 
 
 Number of 
 Vertices 
 
 5 
 
 It 
 
 5 
 
 U 
 
 Convexity 
 
 
 
 
 
 1 
 
 1 
 
 Table 5-2. D-Interval by Definition 1 
 
 
 
 i 
 
 n 
 
 2 
 
 3 
 
 n 
 
 Set 
 A 
 
 Number of 
 Vertices 
 
 3 
 
 3 
 
 3 
 
 i* 
 
 Convexity 
 
 1 
 
 1 
 
 0* 
 
 i 
 
 Set 
 
 B 
 
 Number of 
 Vertices 
 
 1 
 
 1 
 
 1 
 
 i 
 
 Convexity 
 
 1 
 
 1 
 
 1 
 
 i 
 
 *This indicates the detected odd shape in the given set of shapes with 
 respect to the single specified feature. 
 
11C 
 
 Table 5.3. Deviation by Definition 2 
 
 
 
 1 
 
 2 
 
 n 
 
 3 
 
 \ 
 
 Set 
 
 A 
 
 Number of 
 Vertices 
 
 h 
 
 10/3 
 
 10/3 
 
 k/3* 
 
 Convexity- 
 
 U/3 
 
 U/3 
 
 0* 
 
 h/3 
 
 Set 
 
 B 
 
 Number of 
 Vertices 
 
 U/3 
 
 U/3 
 
 h/3 
 
 k/3 
 
 Convexity 
 
 U/3 
 
 U/3 
 
 h/3 
 
 k/3 
 
 Table 5.3 shows the deviation for shapes which are illustrated in 
 Figure 5.1. Comparing Table 5*2 and Table 5-3, we see there is no significant 
 difference between the odd shapes detected by Definition 1 and those detected 
 by Definition 2. The only difference that appears in these two tables is in 
 the very first rows for the set A. If we consider detection of two shapes 
 from each set of shapes, one for the oddest shape and the other for the 
 second oddest shape, then the Deviation approach gives the second oddest 
 shape as either fi_ or Q_ for the set A on the basis of the number of vertices. 
 However, the D-interval approach gives the second oddest shape as either Q , 
 fl , or Q . At this stage, we have no criteria for which definition of oddness 
 is preferable, especially when the number, n, of shapes is small (about h to 
 6). However, as n becomes large, then Definition 2 is preferable to Definition 
 1. These two definitions become identical when n is 3. 
 
 Thus far we have discussed oddness of shape on the basis of a single 
 feature, C. We will now expand the above definitions to enable choice of 
 
 oddness on the basis of m features , C. , C^ , . . . , C . , . . . , C . 
 
 ' 1' 2' i' ' m 
 
117 
 
 Definition 3: Odd Shape/Form on C , . .., C., ..., C 
 
 m 
 
 Let us assume that d. . denotes the D-interval or deviation of the 
 
 J. J 
 
 j-th shape, Q . , on the basis of feature C\, and let 
 
 J 
 
 k/s - 
 
 H. = v£ (W.d ) for ,5=1,2 n 
 
 J i=l 1 J 
 
 Then the shape ft is said to he the most odd shape in the given 
 set of n shapes 
 
 if H. = min {H. } 
 
 Here W = (W_, , .... W. , .... W ), W. > 0, is called the modified weight vector, 
 limi — 
 
 which can he adaptively varied and will be discussed in Section 5.2.2. 
 
 In the above definition, if K=l, then the method is called the Linear 
 
 2 
 Separation (LS); if K=2 , then this is called the Euclidean Distance Separation 
 
 (EDS); and if K>_3, it is called the K-th degree Separation . 
 
 Now we are ready to expand Definition 3 to determine similarity between 
 
 a shape ft and each shape ft., j=l, ..., n. We define difference d. . for a 
 
 specified feature C. as follows: d. . denotes the difference between ft and ft. 
 
 i ij J 
 
 on the basis of feature C. 
 
 l 
 
 d. . = |x. - x. . , 
 
 x. : feature value of ft on C. , 
 i l 
 
 x. . : feature value of ft . on C. 
 
 ij J i 
 
 Then H . , defined in Definition 3, defines the most similar shape to ft as ft. if 
 J J 
 
 2 
 Attneave (1950 ) asserts that psychological distances could not be fitted in 
 
 Euclidean space since the psychological difference between any two stimuli is 
 
 approximately equal to the sum of their differences with respect to the physical 
 
 variables — not to the square root of the sum of the squared differences as 
 
 demanded by an Euclidean space. A comparison between a Multiple Regression and 
 
 Multiple Discriminant models can be found in Jones (1971). 
 
118 
 
 n 
 
 H. = min {H, }. 
 
 J k=l k 
 
 (k) 
 These values H.'s correspond to the real values r. . assigned by function f\ 
 
 i ij i 
 
 which will be discussed in Section T.I- 
 
 From the above argument, the detection of similar shape becomes identical 
 to the detection of odd shape by assuming d. . denotes the difference instead of 
 the D-interval or deviation. Thus, hereafter we consider only the problem of 
 detecting odd shapes. 
 
 5.2. A System to Detect Odd Form 
 
 We previously outlined our model for the detection of an odd shape/ 
 form among a given set of shapes. This section develops the model rigorously. 
 
 The block diagram of our model is illustrated in Figure 5.2, and we shall dis- 
 cuss here the role of each block. 
 
 Let us assume that for shapes ft., j=l, ..., n, that X., j=l, ... s n, are 
 feature vectors of dimension m, respectively, where: 
 
 X = 
 
 - ( X , X , . . . , X , . . . , X ) 
 
 = [x. . ] , x. . e R 
 ij mxn' ij 
 
 m: the number of features considered, 
 
 n: the number of shapes presented, 
 
 x. . : the feature value of the j-th shape 
 l i 
 
 ft . on the basis of the i-th feature, 
 J 
 
 5.2.1 Analysis 
 
 A D-Table is defined as : 
 
 d = [a..] 
 
 ij mxn 
 where, Case 1 (Using Definition 1 of D-interval) 
 

 w 
 
 119 
 
 X i X 2 
 
 T 
 
 U)' 
 
 (2) 
 ATTENTION 
 
 a 
 
 WEIGHT 
 MODIFICATION 
 
 _1_ 
 
 (1) 
 
 ANALYSIS 
 
 W 
 
 L 
 
 (3) 
 
 DECISION 
 
 a 
 
 SELECTION 
 
 OUTPUT 
 j €{l,2,3,---,n} 
 
 Figure 5.2. A System to Detect An Odd Shape/Form Amon* 
 A Given Set of N Shapes 
 
120 
 
 * v A 
 
 d! . = d. . - d. . where d. . = Max {x., } 
 
 ij xj ij ij k ^, ik 
 
 V 
 
 d. . = min {x. n } . 
 
 1J k^j lk 
 
 Case 2 (Using Definition 2 of Deviation) 
 
 n 
 
 a* ■ TT j x., - x., 
 
 ij ^i lk x j 
 
 n 
 where x. . = ( V* x )/(n-l) 
 1J k=l lk 
 
 Thus, d! . denotes the original D-interval or deviation for the shape ft. on the 
 basis of the i-th feature, C. Since we are dealing with m different features, 
 we will normalize the D Table, i.e., d. . = d! ,/d. where d. = (Vd! -)/n. This 
 
 ■'■J U ' u 
 
 is quite important because each individual feature value is evaluated by quite 
 different scales. 
 
 The T Table, which is derived from the D Table, may be used to save 
 storage for our model: 
 
 T = [t..] 
 
 ij mxn 
 
 (1 if d. . = min {d. n } 
 xj k xk 
 
 otherwise 
 Thus t. . is 1 if and only if shape ft. is defined as an odd shape, by either 
 Definition 1 or 2 , on the basis of the i-th feature C. 
 
 5.2.2 Attention and Weight Modification 
 
 Since humans pay attention to some features more than others (these 
 
 attention centering features varying with the object), we will introduce a mecha- 
 nism, called the attention mechanism, into our model. We call <j) the attention 
 function, and we assume <j> is a mapping from the D table into a vector A of 
 
121 
 
 dimension m, called the attention vector. 
 Thus 
 
 $ : D + A 
 
 where A = (A n , A . .... A. , . . . , A ) 
 
 1' 2' x m 
 
 A. e R called the attention factor for the 
 
 l 
 
 i-th feature detection. 
 Note that <$>(D) assumes different values for different sets of shapes. 
 Without loss of generality, we write 
 
 A. = <j>. (d..., d. oJ ..., d.., ..., d. ) 
 
 l Y i 11' i2' ij in 
 
 for i = 1, 2, . .., m. To simplify our model we assume that $. = $ for all i, 
 
 Then the modified weight vector W 
 
 W = (w lt w 2 , ..., W., ..., w m ) 
 
 is defined as W. = oj.A. for all i where 
 l 11 
 
 u ~ (u 1 , Wo, . .. , uk, .. . , u m ) 
 
 is the originally given (or assumed) weight vector (e.g., go = (l, 1, ..., l) 
 
 implies that all features are considered to be equally significant). 
 
 Let us give a simple example of such an attention function c|> ; 
 
 <J> : (a., 8.) -♦- A. for all i where a. = min {d., }, 
 ill l , lk 
 
 k 
 
 8. = min {d. n \4a. }. 
 
 l , ik' i 
 
 k 
 
 More explicitly 
 
 A. = $ (e i - a.) 
 
 and <j> has a property such that whenever 8. - a. > 8.' - a.' then 
 
 1111 
 
 <)> (8. - a.) > $ (8.' - a.') holds, 
 l l — r l l 
 
122 
 
 5.2.3 Decision and Selection 
 
 The role of this block is to form a vector H of n dimensions and 
 then select an odd shape according to Definition 3. 
 
 H - (H , Hg, ..., H , ..., H ) 
 
 J 
 where: Case 1 (D table is used) 
 
 K fm ; 
 
 K rw 
 J v i=l 
 
 Case 2 (T Table is used) 
 
 K rm ~ 
 
 H. ■ 7E (W.t ) , K>1. 
 
 i=l 
 The selection of the most odd shape or form is accomplished in the following 
 
 manner : 
 
 Case 1 
 
 m 
 £1 is the first choice odd shape if H. = min {H.} 
 
 k i \ 3-1 J 
 
 n 
 P. is the second choice odd shape if H = min {H.|^H, } 
 K 2 K 2 j=l J k l 
 
 Case 2 
 
 n 
 
 P» n is the first choice odd shape if H n = max {H.} 
 k l k l j=l J 
 
 n 
 fi, is the second choice odd shape if H, = max {H.|^H } 
 *2 k 2 o=l J K l 
 
123 
 
 5.3 Experimental Results 
 
 Here we compare human response and our system response to the task 
 of detecting an odd shape from a given set of four shapes. The data stimuli, 
 20 sets of four shapes each, are illustrated in Appendix B. 
 
 5.3.1 Human Response 
 
 To get an average human response to the 20 sets of shapes, we 
 obtained exactly 100 student samples on campus at the University of Illinois . 
 We asked each student to pick an odd shape from each set of shapes and we 
 counted the frequency of response for each shape. Table 5.^ shows the human 
 response for each set. For the first set of shapes, for example, 57 students 
 chose the second shape, ft , as an odd shape, 22 students chose Q as an odd 
 shape, and IT students chose ft as an odd shape. Thus the human response for 
 the first set of shapes is ft as the oddest shape and ft or ft as the second 
 oddest shape. This response is denoted as ft^/ft , ft- (for the sake of 
 simplicity we write 2/1,3). Considering the 7-th set of shapes in Appendix A, 
 0_ , ft , and ft, are identical except for their rotation. Therefore, shape ft 
 should be chosen most frequently as the odd shape. According to Table 5^> 
 98 students chose ft and two students chose ft, as the odd shape. The average 
 human response is ft and is denoted by 2/_ . 
 
 5.3.2 Attention Function 
 
 We focus our discussion on the following two types of attention 
 functions : 
 
 (i) $( Y ) « Y , k > Y = /3. - <X, 
 (ii) <j>(y) = [C, sin (77 + C 2 y) +y!q , 
 
 where [X] means if X > 1 then $(y) = 1 and if X < then 
 <t>(y) = 0, otherwise <j>(y) = X. 
 
12U 
 
 Table 5.1*. HUMAN RESPONSE: The Numbers Represent the Number of Times Each 
 Shape was Chosen as Odd by the Students. The Average Human 
 Response is i/j (fl./p.) 
 
 
 "l 
 
 Q 2 
 
 "3 
 
 Q l+ 
 
 Average 
 
 H.R. 
 
 1 
 
 IT 
 
 5T 
 
 22 
 
 1+ 
 
 2/1,3 
 
 2 
 
 1+6 
 
 5 
 
 1+8 
 
 1 
 
 1.3/. 
 
 3 
 
 1+1+ 
 
 12 
 
 35 
 
 9 
 
 1.3/. 
 
 4 
 
 1 
 
 
 
 9T 
 
 2 
 
 3/_ 
 
 5 
 
 1 
 
 8 
 
 86 
 
 5 
 
 3L 
 
 6 
 
 9 
 
 65 
 
 1+ 
 
 22 
 
 2/1+ 
 
 7 
 
 
 
 98 
 
 
 
 2 
 
 2L 
 
 8 
 
 35 
 
 20 
 
 20 
 
 25 
 
 1.2 
 
 3.1+ 
 
 9 
 
 22 
 
 6 
 
 6T 
 
 5 
 
 3/1 
 
 10 
 
 2 
 
 11 
 
 31 
 
 56 
 
 4/3 
 
 11 
 
 88 
 
 
 
 11 
 
 1 
 
 1Z3 
 
 12 
 
 1+ 
 
 
 
 1+2 
 
 51+ 
 
 3,4/_ 
 
 13 
 
 1+1+ 
 
 1+2 
 
 1 
 
 13 
 
 1,2/4 
 
 lU 
 
 26 
 
 29 
 
 22 
 
 23 
 
 1.2 
 3,4 
 
 15 
 
 23 
 
 16 
 
 1+2 
 
 19 
 
 3/1,2,1+ 
 
 16 
 
 31] 
 
 35 
 
 1+ 
 
 30 
 
 1,2,3 
 
 IT 
 
 1+1+ 
 
 13 
 
 1 
 
 1+2 
 
 1,^/2 
 
 18 
 
 T 
 
 11+ 
 
 68 
 
 11 
 
 3/2,1+ 
 
 19 
 
 50 
 
 28 
 
 15 
 
 T 
 
 1/2,3 
 
 20 
 
 81 
 
 1+ 
 
 5 
 
 10 
 
 1/it 
 
125 
 
 Figure 5-3 shows attention functions of the second type with 
 C =0.2 and 5.23 ^ C > 3.93. We will see later — after running our model 
 for various attention functions — that the second type of attention functions, 
 which consist of trigonometric functions and linear functions with C = 0.2, 
 are at least as good as those functions discussed above. 
 
 5.3.3 System Response 
 
 To compare the behavior of the system with the average human 
 response, we count the number of times the system response matches the average 
 human response for the first choice of the odd shape. 
 
 Let us first assume the weight vector No. 1 in Table 5-5, i.e., all 
 these features which are considered for the system are equally significant. 
 In the table weight, w. corresponds to the i-th feature of Table k.l. The 
 weight co* is for a special feature which is extracted on the basis of a set 
 of n shapes presented and is called an "interval difference sum." This 
 
 feature reflects the degree of dissimilarity of (n-l) shapes. Let 
 
 v 
 
 s. denote the i-th elementary pattern of the k-th shape. Then the feature 
 
 value of the j-th shape is defined as: 
 
 W k k 
 
 / „ (max{s.} - min{s.})/N. 
 i-0 k^j X k^j X 
 
 In Table 5.6 some system responses under different attention 
 
 mechanisms are tabulated. Here weight vector No. 1 is assumed, and D and 
 
 T indicate that the corresponding system response is determined by D table 
 
 and T table, respectively. 
 
L26 
 
 Figure 5-3. Attention Function of the Form 
 ♦ (y) = [0.2 sin(TT + C 2 y) + y]J. 
 3.93 < C 2 < 5.23 
 
127 
 
 20 
 
 Table 5«5. Weight Vectors 
 
 Weight Vector Weight Vector 
 No. 1 No. 2 
 
 .617 
 .k62 
 .769 
 .381* 
 .617 
 .U62 
 
 .307 
 .769 
 .617 
 .538 
 1.000 
 .38U 
 
 '3 
 
 1 
 
 V 
 
 1 
 
 J 8 
 
 1 
 
 '9 
 
 1 
 
 '10 
 
 1 
 
 ) 
 11 
 
 1 
 
 12 
 
 1 
 
 13 
 
 1 
 
 19 
 
 1 
 
 * 
 
 1 
 
 Table 5.6. System Responses: Weight Vector No. 1 
 
 NO AM 
 
 1/2 
 Y 
 
 -< 
 
 3 
 
 Y 
 
 C =0.k 
 ^=5.23 
 
 C =0.2 
 C;!;=^9 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 70 
 
 65 
 
 70 
 
 65 
 
 75 
 
 75 
 
 75 
 
 75 
 
 75 
 
 70 
 
 75 
 
 75 
 
 From this table with no attention mechanism (NO AM), for example, the system 
 response matches the average human response exactly Ik sets out of 20, i.e., 
 
128 
 
 70% correct response by D table, while 65% correct response was given by 
 
 T table. Let us consider the system response with an attention mechanism, 
 
 •5 
 Y • Then the system response matches the average human response exactly 
 
 15 sets out of 20, i.e., 75% correct response. In general, the system 
 
 response given by D table is at least as good as the system response given 
 
 by T table. Furthermore, the system response with appropriate choice of an 
 
 1/2 
 attention mechanism (^ y ) is better than the system response without it. 
 
 With an equally weighted vector, the system achieved 65% to 75% 
 correct responses, i.e., 75% of the average human response was realized 
 by the system. Those sets of five shapes for which the system did not match 
 the average human response are Sets 1, k, 6, 15, and 16 of Appendix B. 
 
 To improve the system response in terms of the average human 
 response, let us consider the weight vector No. 2 in Table 5«5- This vector 
 is obtained by solving a Linear Programming Problem to satisfy the average 
 human response as closely as possible. The following is the approach to 
 derive the vector. 
 
 Let us consider the vector H in Section 5.2.3: 
 H = ( H^ , . . . , H , . . . , H^ ) , 
 
 where 
 
 m 
 H. = 7^ a), t. . for all j, w > for all i. 
 
 Here u). is an unknown weight for the i-th feature, and we use T table 
 instead of D table for analysis. Let us assume that the average human response 
 for the k'-th set of n shapes is Q, , i.e., the k-th shape is determined as the 
 most odd shape by the average human response. Then, without any modification 
 of weight, the following set of inequalities should hold to respond 0, as the 
 most odd shape by the system: 
 
129 
 
 H - H. > for j 4 k. 
 k j - 
 
 This can be rewritten as 
 
 C , CD >_ 
 it 
 
 Thus, if there are M number of stimuli, for each stimulus set we can derive 
 the above set of inequalities. Consequently, we have a huge number of 
 inequalities, each of them should be satisfied simultaneously for id to attain 
 the exact average human responses. Such a huge system of inequalities can be 
 written as : 
 
 C cd >_ 0, 
 
 a, u > o, 
 
 k . — 
 
 C.. aT > 0. 
 
 M — 
 
 Therefore, the problem of attaining the average human responses for M 
 stimuli by the system becomes the problem of solving the above set of 
 inequalities for cd . 
 
 For 20 sets of data stimuli in Appendix B, we derive a set of 
 inequalities using the average human responses in Table ^.h. The derived 
 system contains 59 inequalities. Unfortunately, such a system turned oat 
 
 to be infeasible, i.e., no cd exists to satisfy 59 inequalities simultaneously; 
 
 _^. 
 reduced to 20 inequalities, the problem was solved for a feasible solution cd. 
 
 The weight vector No. 2 was obtained by this manner. 
 
 In Table 5-7 some system responses attained with the weight vector 
 
 No . 2 are shown . 
 
 Table 5.7. System Responses: Weight Vector No. 2 
 
 NO AM 
 
 1/2 
 
 Y 
 
 Y 
 
 3 
 
 Y 
 
 C =0.U 
 0^=5.23 
 
 C =0.2 
 C^=U.U9 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 D 
 
 T 
 
 75 
 
 75 
 
 75 
 
 75 
 
 85 
 
 80 
 
 90 
 
 80 
 
 8? 
 
 80 
 
 90 
 
 80 
 
130 
 
 Without an attention mechanism the system achieved 75% correct response 
 of the average human response. With attention mechanism, C =0.2, 
 and C p = U.i+9, the system achieved 90% correct response with D table, which 
 is a significant improvement compared to the system response given "by 
 Table 5-6. Those sets of shapes whose average human response could not be 
 achieved by the system are Sets 6 and 15 of Appendix B. 
 
 For Set 6 the average human selects the second shape from the 
 left while the system selects the fourth shape. For Set 15 the average 
 human selects the third shape from the left while the system selects the 
 fourth shape as the most odd shape. 
 
 Many other system responses under various assumptions of system 
 variables appear in Maruyama (1971 c). Since none of the students 
 responded exactly the same as the average human responded, their correct 
 response ranged from 65% to 90% (mostly 80% to 85%). Therefore, we see that 
 the system response given by our model reflects the average human response 
 on the odd shape/form detection. 
 
 5 .3>b Summary 
 
 From our experiments we conclude that we can achieve a wide variety 
 of system responses using different assumptions: different separation methods 
 definitions of oddness, weight vectors and attention mechanisms. Overall 
 experimental results are summarized as follows : 
 
 (i) For the analysis of the system, the LS model is preferable 
 to the EDS model because of its simplicity. According to our computational 
 results, the two models have almost no significant difference in system 
 response . Thus , we may conclude that the Linear System is an adequate model 
 for human behavioral psychology (see Maruyama 1971c). 
 
131 
 
 (ii) The system response given in terms of the D table is better 
 than the system response given in terms of the T table regardless of the other 
 parts of the system (e.g. weight vector, attention mechanism, separation 
 method, etc.). The system analysis for the T table, however, is simpler than 
 for the D table. 
 
 (iii) We see that the system response using some kind of attention 
 mechanism (in which the attention factor, A., is non-decreasing on the value 
 8. - a.) is better than the system response using no such mechanism. Thus, 
 we may conclude that the attention function, <|> , 
 
 <j>: (8 i - a ± ) -> A i 
 
 such that 8. -a. > 8.' - a! implies (8. -a.) > (8! - a!), which modifies 
 l i l l l i=i l 
 
 the given weight vector, is an adequate choice for our model. 
 
132 
 
 6. GRAPHS AND GRAPH TRANSFORMATIONS 
 
 6.1 Graph Representation 
 
 Graphs are widely used for the convenient representation of the two- 
 dimensional geometry implicit in a picture which consists of many regions. By 
 using a graph representation, many problems of picture interpretation become 
 feasible with provision of appropriate heuristic algorithms. The study of 
 Guzman (1968) demonstrated the significance of graphical representation of two- 
 dimensional pictures, using this representation to analyze and cluster planar 
 regions into three-dimensional objects by extracting certain properties on the 
 vertices. Eastman (1970 ) finds a need for a better descriptive-capability to 
 define the spatial relationships essential for two-dimensional space planning. 
 Here again, a graph representation is evoked. 
 
 Let us construct graphs for the picture of Figure 6.1(a), which con- 
 sists of four regions (5 regions if we include the background as a region). 
 Corresponding graphs (regions-nodes, regions-arcs, and arcs-nodes graphs) are 
 exemplified in Figures 6.1(c), (d), and (e) respectively, using name informa- 
 tion of (b) Which of these graphs should be used is dependent upon the kind 
 of problem to be solved as well as the choice of heuristics. 
 
 Usually graphs are described in various matrix forms ; however, it is 
 convenient to describe a graph in a sequential manner. In that case, the 
 following graph representation is useful. Let us again consider the picture 
 of Figure 6.1(a), and let us assume that its regions, arcs, and nodes have been 
 assigned names as (b). Each region is denoted in a way to preserve the shape 
 of the region, and each node, i.e., intersection of more than one arc, is de- 
 noted by the values of its X, Y coordinates. 
 
 "TVlaruyama (1968) provides a graph I/O package which enables one to transform 
 one matrix representation of a graph to other matrix representations. 
 
133 
 
 (a) A Picture of Four Regions 
 
 (b) Naming (Nodes-Arcs 
 
 (c) Regions-Nodes 
 
 (d) Regions-Arcs 
 
 (e) Arcs-Nodes 
 
 (f) Regions-Relations 
 
 Figure 6.1. Examples of Graph Representation of a Picture 
 
13U 
 
 Let us assume the following expression for each node: 
 
 <V>({<V.>(<a.><ft >)}*), 
 1 j K. 
 
 where , 
 
 V 
 
 V. 
 
 = node name 
 
 = node name different from V 
 
 = arc name 
 
 ;= region name and its attributes 
 
 the following semantics are implied: 
 
 i) nodes V.'s are direct neighbors of the node V, 
 
 ii) nodes V and V. are connected with an arc a., 
 i J 
 
 iii) region fi, is (partly) bounded by arc a., and ft is assumed by 
 k j k 
 
 a. in counter-clockwise direction at node V, and 
 J 
 
 iv) the sequence of V.'s is cyclic. 
 By the application of the above representation, the picture of Figure 
 6.1(a) can be described as follows: 
 
 V l ^ V l* ^ a 3 s %^ 9 V 2 ^ a l 9 *V' Y 3 ^ a 2» fi 2^ ^ 
 
 V 2 ^ V h ^7' ^V' % ^ a 6 9 ^V' V 3 ^ a l+' ^V' V l ^ a l' %^ ^ 
 
 v 3 ^ v l4 ^ a 5> fi 2^' V l ^ a 2' ^V> V 2 ^% 9 *V ^ 
 
 V l* ^ V l ^ a 3' ^V 9 V 3 ^ a 5 s ^V' V 2 ^ a 6' %^ 9 V 2 ^ a T' fi 0^ ^ 
 In the above expression, nodes V.'s are sequentially scanned in counter-clockwise 
 order about node V. Thus the sequence of V.'s is cyclic. For example, the 
 following two sequences are semantically identical: 
 
 V l W ^ a 3' ^V' V 2 ^ a l' ^1^' V 3 ^ a 2' °2^ ^ 
 
 V l ^ V 2 ^ a i' ^V' V 3 ^ a 2' fi 2^' Y h ^ a 3' ^V ^' 
 From sequence expressions of each node, we can derive graphs which have 
 been described previously. If arcs are simplified and approximated by 
 
135 
 
 straight-line segments, then the above expression can be simplified by elimina- 
 ting arc information. For example, 
 
 V l ^ V U ^ a 3' n 0^' V 2 ^ a i' fi l^» V 3 ^ a 2' *V ^ 
 can be simplified into 
 
 v ± (v u (n Q ), v 2 (^j, v 3 (n 2 ) ) 
 
 by assuming v and v., v and v and v and v are connected with straight line 
 segments. This type of representation of pictures will be discussed extensively 
 in the following chapter. 
 
 6.2 Relation Graphs 
 
 Recently, Schwebel (1972) proposed a graph structure model for global 
 level picture processing. He defines general graph structures and graph struc- 
 ture transformations which are general mappings between structures. Graph- 
 theoretic clustering techniques operate on relational graphs, attempting to 
 merge nodes according to the relations between pairs of nodes. Zahn (l9Tl) 
 considers clustering on graphs using weights and by forming a minimum spanning 
 tree on a graph. Brice and Fennema (1970 ) apply region-merging heuristics. 
 
 Explicit studies of graph transformations become necessary for structur- 
 al inference. This area was initiated by McCormick (1967) in his presentation 
 of illustrative transformations and his emphasis upon the concept of reversibi- 
 lity. These ideas were further considered in Schwebel (1969) and led to defini- 
 tions of properties of relations for composite formation in Schwebel and 
 McCormick (1970). 
 
 Schwebel' s study covers a wide area of the properties involved in graph 
 structures, and it led to his definition of language, called Structure Operation 
 
136 
 
 Language (SOL). SOL is proposed in order to simply and conveniently express 
 operations on multi-relational graph structures which allow an arbitrary number 
 of variables to be associated with any structure element. 
 
 In this section we describe some relations. Some algorithms of 
 extracting relations from a given picture and means to construct graphs 
 are also described. 
 
 A graph G, which is mixed and labeled, consists of a set of nodes V and 
 a set of branches B, where 
 G = (V, B) 
 
 B C V x V : product space. 
 Schwebel's graph structure, GS, was represented by a sextuple. We also assume 
 a function g on B, such that for each element b in B, g(b) e R, where R is a 
 set of relations between nodes. 
 
 We consider regions-relation graphs, i.e., we assume that each node of 
 a graph represents a closed or open region of a picture and branch, directed or 
 not directed, denotes relations between two regions. Each node has parameters 
 and attributes: the center of the gravity of the region or the AS-simple point, 
 corresponding pattern sequences, textural and color attributes, etc., which 
 preserve the shape of the region and its location in the picture. This informa- 
 tion is accessed by an attached pointer for each node. Thus our graph is not a 
 simple abstract graph, but a graph which describes the structure of pictures 
 consisting of many regions. 
 
 A simple example of a graph G = (V,B) is illustrated in Figure 6.2(a). 
 The picture of (b) may be described as a graph of (c), where we introduced some 
 
137 
 
 V = {v 4 ,v 2 ,v 3 ,v 4 ,v 5 } 
 
 B={b lf b 2 ,b 31 b 4 ,b 5 ,b 6 } 
 
 where 
 
 G=(V,B) 
 
 bi^V^Va) 
 
 b« a (v lf v s ) 
 
 b 3 = (v 3 ,v 2 ) 
 b 4 =(v 4 ,Vi) 
 b 5 =(v 4 ,v 5 ) 
 b 6 =(v 3 ,v 5 ) 
 
 (a) An Example of a Mixed Labeled Graph 
 
 contained 
 
 contained 
 
 (b) A Picture 
 
 (c) Graph Representation 
 
 Figure 6.2'. Examples of Graphs: Regions-Relations 
 
138 
 
 necessary relations to preserve the structure of the picture. These relations 
 exemplified in (c) — "above" and "below," "left of" and "right of," "contains" 
 and "contained in" — are pairs, and one of them is the inverse of the other rela- 
 tion. Thus, clearly some relations can be eliminated from a graph without 
 destroying the structure. 
 
 Relations which we shall evoke are: 
 
 i) r : contains, r : contained in, 
 
 -L J 
 
 *I * 
 
 Lunuaiiu' * 
 
 4 I ' WJ 
 
 ii) 
 
 r : 
 
 c 
 
 connected, 
 
 -1 
 r = r , 
 
 c c 
 
 iii) 
 
 r A = 
 
 above, 
 
 -1 
 r = r 
 B A 
 
 iv) 
 
 r R : 
 
 right of, 
 
 -1 
 r = r 
 L R 
 
 v) 
 
 r s : 
 
 partially surrounds, 
 
 vi) 
 
 r N : 
 
 connected by 
 
 neck, and 
 
 vii) 
 
 r T : 
 
 tail of. 
 
 
 below, 
 
 left of. 
 
 r : partially surrounded by, 
 
 6,2.1 Containment Relation 
 
 The topology of a plain black-and-white figure or regional picture can 
 be simply described by a tree, a containment tree. For example, the topological 
 relation — "contains," "contained in" — of pictures illustrated in Figure 6.3 can 
 be described simply by the containment trees as illustrated, where each node 
 refers to a connected component of the figure or closed region. Thus such a 
 tree is a subgraph of the graph which describes the structure of the picture. 
 
 For array pictures, Buneman (1970 ) presented an algorithm which deter- 
 mines this containment tree from information about the picture which is picked 
 up during a single scan of array. His algorithm strongly resembles a generative 
 grammar and a procedure for choosing which rule of the grammar to operate. 
 
139 
 
 (a) Black and White Array Picture 
 
 (b) Region Bounded Picture 
 
 Figure 6.3. Containment Trees 
 
lUo 
 
 Our algorithm of generating containment trees assumes that the given 
 picture is described as a set of polygonal curves. We also assume that 
 for each such closed curve a point inside the curve, the center of gravity or 
 the AS-simple point, is available. Our procedure for deriving trees is quite 
 similar to the determination of intersections between a straight line and with 
 a set of curves. We emanate a ray from each given point which is inside the 
 corresponding region and determine a sequence of intersections with the ray and 
 the regional boundaries. Without loss of generality, as we can emanate the ray 
 in any direction, we chose here to emanate the rays along X-coordinate. A sim- 
 ple analysis on the generated intersection sequence shows the possible contain- 
 ment relations. 
 
 Let us consider the picture of Figure 6..U(a)» which consists of 5 
 closed regions. The regions are denoted by ft through ft , and their interior 
 points are x through x , respectively. For example, region ft is contained in 
 fi., and, in turn, ft is contained in ft . This can be determined by emanating a 
 ray at x and examining the sequence of boundary intersections. In this case, 
 the sequence is ft_, ft p , ft.. , and is an ordered sequence. It is interpreted as 
 for each pair of adjacent elements, the left element contained in the right ele- 
 ment. Thus, R c ftp and ft CL ft-,. Let us consider region ft . Its sequence is 
 ft~, ft p , ft p , ftp., ft R , ftp, ft,. This means the emanated line at x doubly crosses 
 regions ft and ft , which clearly implies that x is not contained in ft nor ft ; 
 consequently ft is not contained in ft nor in ft . Thus any pair of identical 
 elements should be eliminated from the sequence. After this shrinking process, 
 the sequence becomes ft , ft , ft and is interpreted as ft C ft C ft-, . The region 
 ft has its corresponding sequence as ft, , ft , ft , ft , ft and will be shrunk into 
 ft, , ft , ft . However, since we are considering and determining which regions 
 contain ft , the final sequence should be ft . 
 
lUl 
 
 (a) 
 
 X^ • AZ-4 , A&5 , AZ-5 , AZ«2 » «*i 
 
 X 2 " i^4 , AZ.5 J itg , \l 2 j Aij 
 
 X3 • aZ.3 , Ai 2 j A£ 2 j A/,5 j AZ.5 j A/>2 ) uij 
 
 X 4 • A£ 4 , AZ, 5 , AZ> 5 , AZ, 2 » AZ-i 
 
 X5 ' At-5 , AZ>2 > Jk«-i 
 
 X, 
 
 a t 
 
 x 2 : 
 
 ubo -j vit/j 
 
 x 3 : 
 
 Ai 3 , Ai 2 ?"i 
 
 x 4 = 
 
 u/'^ « u Zf p • AL 
 
 x 5 : 
 
 AZ 5 , Al 2 , AZ^ 
 
 (b) 
 
 (c) 
 
 ii 2 C Ali 
 
 & 3 C & 2 
 
 il 4 C & 2 
 
 & 5 C & 2 
 
 (d) 
 
 Figure S.k. An Algorithm to Generate Containment Trees 
 
1U2 
 
 From the above examples, we derive the following simple grammar. Let ft 
 be a sequence, 
 
 1 * 1 ' 1 ' i 
 12 j n 
 
 generated at a point x^ of the region ft . Let us also assume that symbols, 
 
 a, 6 and y, denote any subsequences or null sequences. Then our grammar 
 
 consists of the following two rules: 
 
 Rl. aft. 3 ft. Y->a3y 
 i . i . 
 J J 
 
 R2. Let x^ be the point of ft. , aft. 3 + ft. 3. 
 
 j j j 
 
 For any generated subsequence (or complete sequence), Rl is applied as many 
 times as possible, and then, finally, R2 is applied. 
 
 In Figure 6. Mb), generated sequences at x through x are listed, (c) 
 shows the results obtained after the application of rules Rl and R2, and final 
 containment relations as well as containment tree of picture (a) are illustra- 
 ted in (d). In (d) only direct containment relations are exhibited, and tran- 
 sitive containment relations such as ft~ c ft-i » ft-t, C ft-, and ft CZ ft-, have been 
 eliminated. 
 
 6.2.2 Neighborhood Relations 
 
 Relations, such as "connected," "above" or "below," "right of" or "left 
 of," "partially surrounds" or "partially surrounded by," "neck" and "tail," are 
 considered to be neighborhood relations . These relations can be defined only 
 between brotherhood regions, where regions which have an identical direct father 
 in the containment tree are said to be brothers. Thus, after construction a con- 
 tainment tree for a given picture, we know which regions of the picture are brother 
 
 The determination of relations between a pair of brotherhood regions is 
 as simple as the construction of a containment tree. Let us assume that the 
 
 
1U3 
 
 regions ft, , ft„ , >■•• ft. and ft are brothers, and let us assiime that we are 
 & 1' 2' ' j n 
 
 determining the relations between a region ft. and the rest of regions ft - { ft. }, 
 
 J J 
 
 where 
 
 X 2 3 n 
 
 We also assume that x. eft. for j = 1, ..., n. 
 
 J J . 
 
 To determine relations, a generalized primal pattern sequence is generated at x. 
 considering a space of ft - { ft. }, where each elementary pattern has an attribute 
 
 J 
 
 to tell which of regions has been pierced. Thus , for this type of pattern se- 
 quences at x., we may ignore measuring the distance between x. and the intersec- 
 J J 
 
 tion of elementary pattern with a region; however, we detect the region name 
 where such an intersection occurs. 
 
 The process of determining relations is illustrated in Figure 6.5 
 Here (a) shows a picture of eight regions and its containment tree. From the 
 tree, regions ft-, ft_, ft. , ft,-, ft/-, and ft 7 are brothers, and their father is ft . 
 Only brotherhood regions are illustrated in (b), and their possible relation 
 graph is exemplified. In (c) a picture of four regions is shown, and the process 
 of determining relations between ft and ft - {ft p } is illustrated as the genera- 
 tion of a primal pattern sequence. From the sequence we can determine all possi- 
 ble relations. The sequence generated at x is 
 
 i^yx-p) ■- s , s , ........ s oo 
 
 = 3, 3, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, k, k, k, h, k 9 3, 3, 3, 
 
 where denotes empty space, 1 denotes ft , 3 denotes ft , and h denotes 
 ft. . From the sequence S(x_), it becomes clear that ft p is connected to ft , ft_, 
 and ft, . By the determination of the indices of elementary patterns, we derive 
 further relationships: ft is left of ft , ft is below ft, and ft is above ft, . 
 
Ikk 
 
 (a) A Picture and Its Containment Tree 
 
 r R ,r T 
 
 1 © 
 
 (b) Brotherhood Regions and Their Relation Graph 
 
 S(x 2 H 
 
 (c) S(X 2 ) = S ,S 1J ,S 23 S,e{il lf n si il 4 ,*} 
 
 Figure 6.5. An Algorithm to Generate Neighborhood Relations 
 
1^5 
 
 In general, S(x.) can be divided into four regions, 
 J 
 
 s(x.) - s Q , s is , s N/2 , , s N _ 1 
 
 • » i I 
 
 below above 
 
 J I . ,i u 
 
 
 left of right of left of 
 
 Relations such as "surrounds," "tail," and "neck" can be found by further in- 
 spection of the pattern sequence S(x.) as well as shapes of ft. and its connec- 
 ted regions. Let D. be the number of elementary patterns whose attributes are 
 
 K. 
 
 ft in S(x.). Then, if D /N > t , t , <_ t •: 1, is called the threshold, and 
 
 K J K 
 
 D /N is said to be the degree of connectedness; then we say ft. is surrounded by 
 k J 
 
 ft. . If D. /N is close to zero but ft. and ft, are connected, then we say these 
 
 k k j k ' J 
 
 two regions are connected by a neck if ft. and ft are both circle or square-like 
 
 J k 
 
 shapes. If either one (say ft ) is a long, snake-like shape, then it is said 
 
 K. 
 
 that ft, is a tail of ft . . 
 k J 
 
 For example, in Figure 6.5(b), Q is surrounded by ft^, ft_ and ft,- are 
 connected by a neck, ft and ft^ are also connected by a neck, and ft„ is a tail 
 of ft,-. 
 
1U6 
 
 6.3 Graph Transformations 
 
 The extraction of a specified picture from another complex picture 
 corresponds to an extraction of the specified graph from another graph as 
 a subgraph. It is necessary to apply some transformations on graphs to make 
 possible matchings, especially if one graph is not completexy contained 
 in another graph. 
 
 Such graph transformations are necessary since at very low levels 
 of picture processing, such as boundary finding, some regions which should 
 be distinguished and be encoded differently may be determined as one 
 large region. Similarly, several regions which have been determined to 
 be separate regions should be considered as a one large region. There- 
 fore, graph transformations such as splitting a node or merging nodes 
 become significant to determine possible matchings between two graphs. 
 
 6.3.1 SPLIT Transformation 
 
 Nodes can be split only if some properties hold in such nodes. 
 
 This SPLIT transformation on a node corresponds to the decomposition of 
 
 a region into several subregions. For example, if a region ft has a 
 
 J 
 
 neck in it, then Q. may be decomposed into two subregions, ft', and ft". 
 
 J J J 
 
 Thus the relations between these two subregions are connected, neck, and 
 
 possibly above or below or right of or left of. Another example is if 
 
 ft. has a tail-like region; then such a tail may be assumed to be a differ- 
 
 o 
 
 ent region from the main region of ft.. Thus, again, ft. can be split into 
 
 J J 
 
 a body region ft', and a tail region ft". 
 J J 
 
 This SPLIT transformation on a node is not a simple straightforward 
 operation as illustrated in Figure 6.6. However, if the graph is not used to 
 describe structural relations — nodes are not representing regions, and arcs 
 
iUt 
 
 \ / 
 
 ^> 
 
 / \ 
 
 Figure 6.6. A Straightforward SPLIT Transformation 
 
 ©—MS) 
 
 (a) A Complex Picture 
 
 (b) The Corresponding Relation Graph 
 
 Figure 6«7« A Complex Picture and Its Relation Graph: 
 Containment Relations are Ignored 
 
148 
 
 are not representing relations between regions— then such a straight- 
 forward operation can be used. 
 
 Let us consider a picture "Embryo" in Figure 6.7(a), where its 
 relation graph may be illustrated as the graph of Figure 6.7(b). In the 
 picture, the region ft^ has a possible neck and can be decomposed into two 
 regions, ft' and ft". Let us apply the straightforward SPLIT transformation 
 on a node ft of the relation graph of Embryo. Then the resulting graph 
 can be illustrated in Figure 6.8(a), which does not make sense if one con- 
 siders the Embryo picture. The correct transformation of splitting node 
 ft is illustrated in Figure 6.8(b). From these two graphs, it becomes 
 
 clear that whenever SPLIT transformations are applied on some node ft. 
 
 J 
 
 into ft 1 , and ft'.', then we should consider these as independent regions in 
 J J 
 
 the picture, and relations between ft', and ft" and with other neighborhood 
 
 J J 
 
 regions should be regenerated to preserve the new structure of the pic- 
 ture. Which means whenever regions are decomposed, their structure also 
 changes. . . 
 
 6.3.2 MERGE Transformation 
 
 Merging of two nodes corresponds to the merging of two regions. 
 This transformation is meaningless unless some neighborhood relations hold 
 between these two regions. For example, merging between regions ft and 
 ft in the graph of Figure 6»T(h) is impossible, since these two regions 
 are completely separated by some other regions, a fact which is clear 
 from the picture of Figure 6.7(a). However, merging between SV, ft_, and ft ft 
 is possible, even if the merge of fig and ft fi is not clear, since ft^ and 
 ft_ are connected and ft and ftg are connected. This merge transformations 
 on nodes ft^, ft , and ftg leads the original graph to the graph of Figure 
 6.8(c), which is partially illustrated. 
 
Ik9 
 
 (a) A Wrong SPLIT Transformation (b) A Correct SPLIT Transformation 
 
 £l 9 - Union of £l 6 ,£ly and £l e 
 
 (c) MERGE Transformation 
 
 © 
 
 Figure 6.8. SPLIT and MERGE Transformations 
 --*■ Insert & k with r' ► 
 
 ® — - 
 
 Eliminate XI, 
 
 Figure 6.9. LNSTN and ELIMN Transformations 
 
150 
 
 The important things here that one should notice for merge transfor- 
 mations are that because the shape of merged regions is different from these 
 being-merged nodes, the structure of the corresponding graph, namely relations 
 between the merged region and others, changes. Consequently, one has to 
 determine relations for a new graph, and this cannot simply be done by taking 
 union of all relations for these being-merged nodes. Therefore, similar to 
 SPLIT transformation, whenever MERGE transformations are applied between pairs 
 of nodes, the algorithm should go down to the picture level analysis and 
 should determine relations between the merged region and the rest. Of course, 
 by such merge transformations, relations which should be determined are the 
 relations between merged regions and those regions which had some relations 
 with these being-merged nodes. 
 
 6.3.3 ELIM and INST Transformations 
 
 ELIMN and INSTN transformations are complementary transformations 
 
 provided to operate on nodes . The first transformation elimi- 
 nates a specified node and related arcs from the current graph being trans- 
 formed. The second transformation is provided to insert a new node between 
 a specified pair of nodes by specified relations. Simple examples of these 
 two transformations are illustrated in Figure 6.9. 
 
 ELIME and INSTE transformations are provided to operate on relations 
 (or edges) in graphs. The first transformation eliminates specified relations 
 from between specified pairs of nodes if the pairs of nodes are explicitly 
 given; otherwise, all such relations are eliminated from the current graph. 
 The second transformation inserts specified relations between specified pairs 
 of nodes. 
 
151 
 7. COMPLEX FIGURE DESCRIPTION 
 
 In this chapter we will discuss some problems in form perception and 
 the corresponding descriptions of complex figures will "be given, also. It is 
 clear that for an appropriate solution of these problems, some good heuristics 
 are desperately needed. 
 
 In Section 7-1 "we will introduce an approach for the extraction of 
 embedded complex shapes by means of similar shape detection (which was 
 discussed in Chapter 5) and relation graphs. Two other sections, J .2 and 7-3, 
 are provided to introduce some other problems involved in complex figure 
 drawings. For these, graph structural representation may not be an adequate 
 description for the solution of problems described. 
 
 7.1 Extraction of Embedded Complex Shapes 
 7.1.1 Preliminary Discussion 
 
 For two given pictures, P and P , both consisting of many regions, 
 we assume that P can be embedded in Pp. These two pictures are described 
 as relation graphs , where each node of the graph corresponds to a closed 
 region of the picture. We assume that associated with each node is 
 sufficient information to describe the shape, textural features, etc., of 
 the corresponding closed region. 
 
 Applying the decomposition procedures of necks and tails which have 
 been discussed in Section 3.^.2, we can assume without loss of generality that 
 
152 
 
 graphs G and G p of pictures P and P , respectively, correspond to graphs 
 derived from fully decomposed regions. Thus, we assume that graph trans- 
 formations on G and G separately have been done previously. Therefore, 
 the problem of extracting embedded complex shapes reduces to the problem of 
 determining whether the graph G exists in the graph G as a subgraph or not. 
 
 Let graphs G and G consist of sets of shapes, ft and ft , 
 respectively, where 
 
 ft 1 = {ft^, ft 12 , fi 13 ,.-., 8 li »«"a n lm >> 
 
 "»o = *■ 21' 22' ^p^'*"*' 2i'***' 2n 
 m <_ n. 
 Therefore, the task for the above problem is to determine for each shape ft 
 a corresponding shape ft . in such a way that relations between ft . and the 
 rest of the regions in ft hold between ft~ . and the rest of the regions in ft p . 
 This correspondence is usually a one-to-many mapping. In other words, there 
 are several regions in ft_ which satisfy the requirement of corresponding to the 
 region ft and satisfying the necessary relations. We will develop here the 
 property of this mapping function. 
 
 7.1.2 Dissimilarity and Clustering Function 
 
 We define two functions, f and f , as follows. The function f 
 assigns a real value to each shape, and the function f clusters shapes by 
 considering values assigned by f . More specifically, f 1 is an algorithm that 
 assigns real values to each shape of a picture by considering similarity (or 
 dissimilarity) to a specified shape of another picture. For example, 
 
153 
 
 f l ( "li> V = {r il' r i2'-"> r ij"*" r in } ' 
 
 2 
 where r. represents dissimilarity and is a real assigned by the function f., 
 ij 1 
 
 to the j-th region of ft with respect to the region ft . of ft . This function 
 has been described in Chapter 5 as a system to detect odd/similar shapes. 
 The function f is a simple chain clustering function. After 
 
 application of function f on ft with respect to the shape ft . , we obtain 
 
 2 
 the real value r. . for each ft^ . of ft^ as described above. These n real 
 ij 2j 2 
 
 numbers can easily be sorted into a decreasing order, and we get 
 
 a n > a^ > a„ > . . . > a . > . . . > a , 
 
 1—2—3— — j — — n' 
 
 which is denoted by A. f is a function to divide A into two connected 
 
 sub-sequences of a.'s, A and A , where shapes whose values are in A are 
 
 dissimilar to the shape ft, . , and shapes whose values are in A^ are 
 
 li 2 
 
 similar to o . . This separation of A into A and A can be 
 attained by considering a threshold value T, where 
 
 T = d + Co 
 
 d = (a. - a )/(n - l), 
 
 1 n 
 
 2 n-1 - 2 
 
 a = H (& ± - a i+1 " d) /(n - 1). 
 i=l 
 
 Here C is a constant. Let el be an element in A such that 
 
 a - a, >_ T and a. - a < T for i > k. 
 
 Then A^^ = (a^ a g , . . . , a^) and A 2 = (a^ ,..., a ). 
 
 7.1.3 Pairing Function 
 
 For simplicity let <f> = f p o f be a product function of f_ and f 
 and use <J> for a solution to the given matching problem. We apply <J> on each 
 
15* 
 
 shape ft of picture P and on the shapes of picture P . Thus, we get 
 
 dwft tt ) i(tt tt ) d> (ft ft \ dwft ft ^ 
 
 1 11' 2 } > n 12' 2 ; '•••' V li' 2 y '••• , V lm' 2 1 ' 
 
 Each (j) (fi . , JJ ) gives a set of shapes in ft , which are "similar" to the shape 
 
 ft . Similarly, if we apply 6 on each region ft of picture P p and on the 
 
 shapes of P , we get 
 
 cbfft ft ) 6(0, ft ) <ti(ft ft ) <Wft ft ) 
 
 yV 21' l'» V 22' !'»•••■ TV 2j' !'»—•» Y ^ 2n' V' 
 
 In Figure 7.1(a), the function <J> is illustrated for two cases, 
 
 6 (ft , ft ) and <M fi p., ^,). It is appropriate to combine these two cases so 
 
 that a function gives pairs of shapes which are similar. This function is 
 
 called a "pairing" function and is denoted by 3>, where it is defined as: 
 
 <j>(ft 1;L , ft 2 ) = {(ft liS ftp.) | ft 1;L e 4>(^4» tij) and ftp^ e 6 (ft-^, ftp)}. 
 
 This function $ is illustrated in Figure 7.1(b). Each pair 
 
 (ft , ftp.) denotes a similarity correspondence between ft and ft , where 
 
 it is understood that ft.. . of ft, is embedded in ft~ as ft~ . . This function $ is 
 
 li 1 2 2j 
 
 quite useful to narrow down the possible matchings between shapes of two 
 pictures and consequently to save wasteful comparisions. 
 
 Now we have the following correspondence between a shape ft . and 
 several shapes in ft p , 
 
 <$(ft-]-]» ftp)» <£(ft-jp» ftp)**** 3>(ft-ji» ftp),..., 3>(ft-| m » ftp)* 
 
 So the final step for a solution of the extraction of embedded shapes is 
 
 basically the same as for locating subgraph G in G by considering a certain 
 
 function, a function to measure goodness of correspondence. 
 
 Let us assume that for each ft n . , a shape ft! £ ft~ such that 
 
 li l 2 
 
 (ft , ft!) e £(ft . , ft ) is assigned for i = 1, 2,..., m. The necessary con- 
 dition that G is a subgraph of G , where the subgraph consists of nodes 
 
155 
 
 <£(&2j,Gl) 
 
 <M& U ,ft 2 ) 
 
 (a) 
 
 (b) 
 
 Figure 7-1- Pairing Function 
 
156 
 
 whose regions are ft',.*., ft' , is that the relations between any two nodes of G. 
 should be held between the corresponding nodes of the subgraph of G . That 
 is, 
 
 R(fi 1:L , ft^) CR(D», 0£) 
 
 for all i y k, where R denotes a set of relations between a pair of shapes, 
 and C. denotes relations R(ft , ft ) are hold between ft. 1 and ft' . 
 
 _L1 XK X 1c 
 
 7.1.^ Experimental Result 
 
 In Figure 7-2 a model for complex shape extraction is illustrated. 
 For given two pictures, P and P_, neck/tail decomposition, feature extraction 
 and graph construction will be done separately. Then the pairing function is 
 applied, and ij>(ft . , ft ? ), for i = 1,..., m, is derived. Finally, structural 
 matching is determined. If the structure of ft and the structure of ft' 
 in ft matches, then the global shape similarity between ft and ft 
 is evaluated. 
 
 In Figure 7*3 a pair of shapes is illustrated, and we will consider 
 the extraction of the shape in (a) from the shape in (b). 
 
 (a) 
 
 Cb) 
 
 Figure 7.3 A Pair of Shapes 
 
157 
 
 p p 
 
 1 2 
 
 
 NECK/TAIL DECOMPOSITION 
 
 
 in d 
 
 
 FEATURE EXTRACTION 
 GRAPH CONSTRUCTION 
 
 
 X l' X 2 , °1' G 2 
 
 
 PAIRING 
 
 
 ' 
 
 v li» 2 
 
 1 
 
 
 TRIAL ASSIGNMENT 
 G REORIENTATION 
 
 
 
 
 
 
 
 G' (0') 
 
 f 2 v 2' 
 
 ^S^ NO 
 
 
 YES 
 
 
 
 DETERMINE GLOBAL 
 
 SIMILARITY BETWEEN 
 
 n and «' of n 
 
 
 < TRI 
 
 .T^N^ YES 
 
 AL > ■ 
 
 
 
 NO 
 
 TERMINATE 
 
 Figure 7.2. A Model for Complex Shape Extraction 
 
158 
 
 The digitized picture and its decomposed picture of Figure 7.3(a) were shown 
 in Figure 3.2U, (a) and (e), respectively. The digitized picture and its decom- 
 position of Figure 7.3(b) are shown in Figure 7.U, (a) and (b), respectively. 
 
 To see how the system of Figure 7.2 behaves, let us number the regions 
 of each decomposed picture as in Figure 7-3. Thus, 
 
 n i- {0 ll. 8 12. B 13 >' ma 
 
 n 2 = {i W R 22 V' 
 
 Pairs of regions, $(fi . , ft ), for i = 1, 2, 3, derived by the system are: 
 
 • (Q U' V = { W VW W' 
 
 • (Q 12 . 2 ) = ^ 12 :fi 23' ° 12 :n 2 6' fi i2 :fi 28 } ' 
 
 *(n i3 , n 2 ) = {n i3 :R 22 , " 13 : ^. "i3 : ^7 } - 
 These regions of ft Q which have been filtered by the pairing function $ are shown 
 in Figure 7«Mc). Therefore, there are 18 possible assignments between regions 
 of ft and the filtered regions of ft' . These 18 possible assignments have been 
 reduced to the following four assignments after their structural matching has 
 been determined. These are: 
 
 B 3tr 5 "23' "28' fl 28' ft 28' 
 ft 12 : ft 26 , ft 26 , ft 26 , ft 26 , 
 
 ft : ft~ » ft~~, 0„i , ft • 
 
 13 27 22' 2U' 27 
 
 Finally, each global shape ft' is compared to the global shape of ft , and the best 
 matching attained is: 
 
159 
 
 Bl'RBRRR 
 
 8 B 
 
 [I B 
 
 9 B 
 
 bli B B 
 
 BR B e. " 
 
 B B * 
 
 B _ U B 
 
 a " b. b 
 
 B B B 
 
 8 B B 
 
 B 
 
 8 
 
 e >«bb 
 
 BB B 
 
 B B 
 
 B HB 
 
 P BR 
 
 B B 
 
 B 
 
 B II B BR 
 
 B BBt'BBB .8 R 
 
 8R B " 3 B 
 
 K. R B B 
 
 BB 
 B 
 
 B - B 
 
 _B B 
 
 B BBBB 
 
 B B BB 
 
 RB 
 
 B 
 
 8B3 
 B 
 
 p PR 
 
 B 8P8BR _ 11 . _B._ 
 
 B BBBB B B B 
 
 B B B «_ 
 
 B B B B 
 
 B 5 RR_ 
 
 8 B B B 
 
 8 3 a 9_ 
 
 6 9 B ° 
 
 3 R PRBBRBPRBRR 
 
 BBBBBBbBBbabrt8fi 
 
 (a) Original Picture 
 
 RbRpfiilR 
 
 B B 
 
 R B 
 
 B R 
 
 nR R B 
 
 RR R R P 
 
 R B B 
 
 P R R 
 
 fi R B 
 
 B R 3 
 
 B B B 
 
 B B B 
 
 R RRH 
 
 »B R 
 
 R 9 
 
 R RB 
 
 B R 3 
 
 P R B 
 
 3 9 3 BR 
 
 B 8BPJP3 R 8 _ 
 
 RB "3 '"" B P 
 
 H 9 P __B_ 
 
 fi " R fi R 
 
 p 
 
 B 
 
 B 
 
 B 
 
 B 
 
 B 
 
 B 
 
 P 
 
 B 
 
 B 
 
 B 
 
 B 
 
 B 
 
 R 
 
 R 
 
 B 
 
 B 
 
 8 
 
 B 
 
 B 
 
 B BBBB 
 
 8 
 
 B 
 
 B 
 
 B B 
 
 BB 
 
 B RR 
 
 P 
 
 BR8 
 
 _B _ 
 B 
 
 _B 
 
 B 
 B 
 
 3 1 p 3 
 
 8 BTB BP B B_ 
 
 BBBB 6 i* 9 
 
 B B B_ 8 
 
 B 3 B B 
 
 R B PB 
 
 BBRRPR 
 B B 
 
 IB R R 
 
 fi _B_ _ BBPB9BRP9PR 
 
 BBRBdPRBBBBBePB 
 
 (b) Decomposition 
 
 3R3 R B 
 
 8 ( B 
 
 B V B" BB» 
 
 R B PR 9 
 
 
 PR 3 8 
 
 
 
 B B B B 
 
 
 
 8 3 R R 
 
 
 
 B R H R 
 
 
 
 b B B 3 
 
 
 
 B Bfir-.PRRR R 
 
 
 
 BB R B 
 
 
 
 P B 8 
 
 
 
 3 BB 
 
 
 
 P B 
 
 
 
 B B 
 
 H P. 
 
 
 
 B B 
 
 
 
 P B 
 
 
 
 BBBB B 8 
 
 
 
 B BB BBR 
 
 
 
 B B R 8 
 R P. RPRPB B 
 
 
 p BBBB d B 
 
 9 _R_ __B-_ ** 
 
 B B R " " ' 1 
 
 B _ U _ _ _ ^ B 
 
 b " b" "a" e 
 
 BR R fi 
 
 BBB 
 
 _jMJ 8_ 
 
 B B 
 
 R R 
 
 B B 
 B_ B_ 
 
 B ~ B 
 
 BBBBBfib R 
 
 Bb 6 P. 
 R R B 
 
 BB 
 
 B9B3 R B 
 
 _ B_ BB e 
 
 6 6 RB 
 
 R _ _B _iJBB3B_ 
 "B " BBBB 
 I B 
 
 R 
 
 H R 
 
 R B 
 
 B fi 
 
 B R 
 BRBBBB 
 
 RR e 
 
 3P 
 
 iRHBBP 
 
 (c) Elimination by Pairing (d) The Best. Shape Extracted 
 
 Figure 7.U. An Example of Complex Shape Extraction 
 by the Model of Figure 7-2 
 
i6o 
 
 "n : "28' 
 fi 12 : R 26 , 
 
 n 13 : V 
 
 which is shown in Figure J.U(d). In this simulation only the connected 
 relation, r«, has been extracted and considered for the relation graphs G. and 
 G_. The computation time for the above example was 13 seconds. The 
 program was written in PL/l and run on the IBM 360/75 • The model developed 
 here can be generalized, and we will suggest a general model for shape 
 extraction in Chapter 8. 
 
 7.2 Extraction of Embedded Figures 
 7.2.1 Preliminary Discussion 
 
 In this section we consider the following problem: two pictures, 
 P and P , are given, and the task is to determine whether or not the contour 
 of P can be embedded in the picture P . Pairs of pictures are illustrated 
 in Figure 7-5 where in each case picture P can be embedded in picture P . 
 We will suggest an approach for the solution of this problem. The analogous 
 problem of the extraction of embedded regions was discussed in Section 7-1 • 
 
 Without loss of generality, we assume that the contour of a picture 
 P is embedded in a picture P p . Thus, the task is to search for the corres- 
 ponding vertices and arcs of P in P , where P may consist of a set of 
 disjoint simply-closed curves, and P consists of overlapped closed curves. 
 We also assume that each closed curve is a polygonal curve as exemplified in 
 Figure 7- 5 Id). 
 
l6l 
 
 (a) 
 
 (b) 
 
 
 (a) 
 
 Pi 
 
 Figure 7.5. Four Pairs of Picture Drawings, P n and P, 
 
 1 c 
 
 where for Each Drawing P can be Embedded in P 
 
162 
 
 The first step in the solution of this problem is to derive the 
 contour expression of the picture P , where we assume that P consists of 
 several connected regions. Then the second step is to search for the 
 corresponding closed contour of P in P„. To determine a corresponding 
 contour of P in P , an exhaustive type matching between two expressions of 
 the pictures may be applied. Finding a clue for a solution of this matching 
 problem is in general dependent upon how these two figures are described. 
 
 7.2.2 Matching Between Nodes 
 
 Let us consider the pictures in Figure 7-6. As can be easily seen, 
 the picture P is embedded in the picture F , where names for regions and 
 vertices are given as specified. The input representation selected for these 
 two pictures are Node -sequences as discussed in Section 6.1. The Node- 
 sequences N and N for pictures P and Pg , respectively, are given in 
 Table 7.1. There the numeric number denotes the subscript of a region, i.e., 
 (n) means ft . 
 
 From these two sequences, N and N , we derive the Angle-sequences 
 A_^ and A , respectively, which are given in Table 7*2. From Node-sequence 
 N and Angle -sequence A of the picture P , we can easily derive Node and 
 Angle-sequences for the contour of P by using the fact that the region ft 
 denotes the background region of the picture P . These are: 
 
163 
 
 (a) 
 
 (b) 
 
 f? 
 
 Figure 7-6 
 
 An Example Used to Extract P from P 
 
16U 
 
 Table 7-1 • Node-Sequences of Pictures in Figure 7.3 
 
 Node-sequence N. 
 
 Node-sequence N, 
 
 A(C(0), 
 
 B(l)) 
 
 
 B(C(1), 
 
 A(0), 
 
 D(2)) 
 
 C(J(0), 
 
 A(l), 
 
 B(2), F(3)) 
 
 D(E(2), 
 
 B(0)) 
 
 
 E(F(2), 
 
 D(0), 
 
 G(3)) 
 
 F(C(2), 
 
 E(3)) 
 
 
 G(E(0), 
 
 H(3)) 
 
 
 H(G(0), 
 
 1(3)) 
 
 
 I(J(3), 
 
 H(0)) 
 
 
 J(C(3), 
 
 Ko)) 
 
 
 a(b 
 b(c 
 c(n 
 d(e 
 e(n 
 
 f(g 
 
 g(h 
 h(i 
 i(e 
 j(k 
 k(l 
 l(m 
 m(n 
 n(c 
 o(j 
 
 p(q. 
 
 q(n 
 
 0) 
 0) 
 0) 
 3) 
 k) 
 1) 
 5) 
 6) 
 6) 
 8) 
 9) 
 9) 
 7) 
 k) 
 0) 
 
 , f(l), 
 
 , a(2), 
 
 , b(U)) 
 
 , b(2), 
 
 , b(3), 
 
 , a(0), 
 
 , 4(1), 
 
 , e(5), 
 
 , h(7)) 
 
 , f(0), 
 
 , g(8), 
 
 , k(l0) 
 
 , h(9), 
 
 , e(7), 
 
 , p(io) 
 
 4(2)) 
 
 4(3), e(U)) 
 
 a(l), g(5)) 
 4(5), h(6), i(T)) 
 3(8)) 
 
 f(8), k(9)) 
 g(9), m(7)) 
 
 0(10)) 
 
 j(lO)) 
 
 , p(iD) 
 
 1(H)) 
 
 m(ll) , q(0)) 
 ) 
 
 12), n(ll), 1(10), 0(0)) 
 12), p(0)) 
 
165 
 
 Table 7.2. Angle-Sequences of N and N 
 
 Angle-sequence A 
 
 Angle-sequence A 
 
 A ^ a CB ,a BC' ) 
 
 
 B(a CA' a AD' 
 
 a DC } 
 
 C( °JA' a AB' 
 
 °BF» a FJ ) 
 
 D(a EB» a BE ) 
 
 
 E(a FD»°DG' 
 
 "OF 5 
 
 F(a CE' Q EC ) 
 
 
 G(a EH' a HE ) 
 
 
 H(a GI ,a IG ) 
 
 
 l(a JH'°HJ ) 
 
 
 J(a ci' a ic ) 
 
 
 a(a bf» a fd' a db ) 
 
 b(a ,a , o a ) 
 ca' ad' de' ec ; 
 
 c(ct ,a ) 
 v nb' bn ; 
 
 d(a a a a ) 
 eb b a ag ge 
 
 ^nb'V'V^hi' a in } 
 
 f (a ,a a. ) 
 
 ga' aj' jg 
 
 e-( a .a ,a , a ) 
 feV hd' df" fk* kh' 
 
 h(a. ,a a a . ) 
 le eg gm mi 
 
 i( a .a ) 
 v eh' he ; 
 
 i ( a a a ) 
 J kf fo' ok y 
 
 k(a a a ) 
 
 lg gJ Jl 
 
 l( a , a , a ) 
 * mk' kp' pm' 
 
 m( a a a ) 
 nh hi In 
 
 n(a ,a a a a ) 
 
 ce' em' mp' pq' qc 
 
 o( a. a ) 
 
 JP PJ 
 
 p( a , a a , a ) 
 qn nl eo oq 
 
 q( a , a ) 
 np pn 
 
166 
 
 Node-sequence N' 
 
 A(C(0) 
 B(A(0) 
 C(J(0) 
 D(B(0) 
 E(D(0) 
 G(E(0) 
 H(G(0) 
 I(H(0) 
 
 J(Ko) 
 
 Angle-sequence A' 
 
 A(a 
 
 CB 
 
 *AD 
 
 C(a 
 
 E( 
 G( 
 H( 
 K 
 J( 
 
 l JA 
 
 °BE 
 ^G 
 "EH 
 °GI 
 °HJ 
 °IC 
 
 Finally, the problem "becomes to search the Node -sequence N' in the Node- 
 sequence N such that the Angle-sequence A 1 is preserved in the corresponding 
 Node-sequence in N p . To solve this matching problem, the following concept 
 might be helpful. 
 
 Let a node of the sequence N' be "A" ; 
 
 A(B(i)) 
 
 A(a BC } 
 
 and a node of the sequence N p be "a"; 
 
 a(b(j), c(k), d(l)), a ( a bc > a cd » a db^' 
 
 Then a pair of nodes (A, a) is said- to be a possible correspondence if and 
 only if 
 
 a BC ^ { V' a cd> a db } * 
 where {o^, a^ t a }* = {o^, a^ a , a 1 + a,^ a.^ + a , a + c^} 
 
 and e means that there exists at least one element a in {...}* such that 
 
 Kc - a l - T - 
 
 T is a predefined angle threshold. 
 
167 
 
 Therefore, to determine the sequence N| in N , we find a corresponding node 
 of N for each node of N' which satisfies the relation e, and, moreover, 
 which preserves the ordering of Node-sequence, that is, the one-to-one 
 correspondence between subscripts in sequences. We do not explicitly 
 describe here an algorithm for the solution of the extraction of the embedded 
 contour picture; however, the above definition of "possible correspondence" 
 shows implicitly how to determine node-to-node matching between pictures. 
 
 7.3 Extraction of "Qood" Figures 
 7.3.1 Introduction 
 
 In a complex drawing, a human sees instantly a figure which is 
 embedded in it. What the human sees in the drawing or what he imagines out 
 of the drawing might depend heavily upon the observer's past experiences, 
 the orientation of the drawing, the brightness of lines in the drawing, etc. 
 We will consider in this section an extraction of good figures from a complex 
 two-dimensional drawing. A good figure may be a partial drawing in the 
 complex drawing which comes up relatively clear and looks like an 
 independent drawing in the complex picture. 
 
 Since a human lives in three-dimensional space, it is likely that 
 even if he sees two-dimensional drawings, he tends to see three-dimensional 
 objects in it. Zusne (1970) says that whether a plane figure will be seen 
 as two- or three-dimensional depends on the goodness of both shape and 
 contour continuation. If both shape and continuation are good, a two- 
 dimensional figure will be seen. If not, a three-dimensional shape will 
 be seen. 
 
 The study of three-dimensional perception of two-dimensional 
 figures has been originated by Guzman (1968). Good figures in a two- 
 dimensional drawing of three-dimensional objects are objects which can be 
 
168 
 
 seen as three-dimensional objects. Thus, by using Guzman's program SEE 
 (basically various analyses of vertices formed in the drawing), we can find 
 which planes form an object, and consequently, the contour of the object 
 which corresponds to a good figure in the given picture. To find the 
 contour of a three-dimensional object which is embedded in a figure, 
 consideration of only a few types of vertices is adequate enough. These 
 are: (l) L, a vertex where two lines meet, (2) FORK, three lines forming 
 angles smaller than it, and (3) ARROW, three lines meeting at a point with 
 one of the angles bigger than j. Some other kinds of vertices are needed 
 if the drawing contains some hidden lines. For the solution of the above 
 problem, heuristics has been extensively studied. 
 
 We assume hereafter that drawings being considered describe two- 
 dimensional objects and not three-dimensional objects in the visual world. 
 Thus, the heuristics developed by Guzman are not applicable. Moreover, 
 by assuming the above, we may eliminate individual experiences of the 
 visual world and consider the extraction of "good" shapes in a given 
 complex drawing. 
 
 7-3.2 Goodness of Figures 
 
 Before we define what we mean by "good" shape, let us introduce 
 some principles of goodness of figures which have been considered by 
 psychologists (also see Chaper l). 
 
 (i) Law of Similarity : visual pattern elements that are alike 
 tend to form groups. As to the law of similarity, it applies not only to 
 similarity in shape among visual pattern elements, but also to similarity in 
 color, position, and other variables. However, we only consider similarity 
 of shape. 
 
169 
 
 (ii) Lav of Proximity : visual pattern elements having smaller 
 distances between them tend to group together more than elements separated 
 by larger distances (Koffka, 1935; Mowatt, 19^0; Zahn, 1971). 
 
 (iii) Principle of Symmetry : symmetrically located pattern 
 elements will tend to organize themselves and associated elements into groups. 
 
 Civ) Principle of Closure : visual form may be open or closed, 
 
 complete or incomplete. An open form tends to change toward a closed form, etc. 
 
 It often happens that several of the above principles hold. In 
 such a case , the stronger one will predominate and lead to the perception 
 of that form in which it predominates . We define a good figure or shape as 
 a closed curve with respect to the principle of good continuation. In 
 other words, a curve y is said to be a good figure if the curve is extracted 
 from a complex drawing by connecting "good continuing" arcs or edges , and 
 the resulting curve is closed. Good continuing arcs or edges are pairs of 
 adjacent arcs or edges which are connected at a vertex and have the 
 following property: a straight line proceeds straight, circle proceeds 
 circle, etc. 
 
 Let us consider several drawings, overlapped figures, of Figure 7.7. 
 Each drawing can be thought of as consisting of a pair of closed curves and is 
 basically classified by two methods : one is that a good figure means a 
 good continued closed curve, and the other that a good figure is an outline 
 (closed curve) of the given drawing. For example, (a) can be considered as the 
 union of two triangles or as the union of a star-like shape and a hexagonal 
 
170 
 
 (a) 
 
 Case 1 
 
 Case 2 
 
 (*)■ 
 
 Case 1 
 
 Case 2 
 
 Figure T-T- Decomposition of Overlapped Drawings 
 
171 
 
 (c) 
 
 Case 
 
 Case 2 
 
 (a) 
 
 Case 1 
 
 Case 2 
 
 Figure 7 .7 • Decomposition of Overlapped Drawings 
 (continued) 
 
172 
 
 shape. If we consider the good continuation of curves as a means for extract- 
 ing a good figure, then a pair of triangles is obtained. However, if we assume 
 that a good figure is an outline of a given overlapped drawing, then the 
 second type of decomposition is obtained. Figures 7.7(b), (c), and (d) show 
 some other examples where each case is considered for the two classes of 
 decomposition. Figures 7.7(a) and (b) are examples of how both good 
 continuation and outline approaches might work for good figure extraction. 
 It may be said that the results obtained by good continuation are preferable 
 to those derived by outline approach. Figures 7.7(c) and (d) are examples 
 that the outline of figures cannot necessarily be thought of as good figures. 
 
 In Figure 7.8(a) an overlapped complex figure is shown. By 
 assuming that open curves are not good figures, we can eliminate hair-like 
 curves and then obtain an overlapped drawing which consists of only closed 
 curves (b). Next, using the extraction of good figures as good continued 
 closed curves, we get two closed curves (c). 
 
 As shown in the above examples, good figure extraction can be 
 solved by using good continuation. The solution of this problem might 
 simplify the extraction of a specified figure from a complex drawing of 
 Section 7«2, and makes it easier to solve. 
 
 7-3.3 Illustrative Approach to Decomposition 
 
 We assume that a given drawing consists of many open and closed 
 curves and over lappings . Overlapping can be grouped into two classes: one 
 is node-overlapping and the other is edge-overlapping . A vertex is called 
 an even vertex if an even number of arcs or edges meet at the vertex; 
 otherwise, it is called an odd vertex. It is easy to see that if odd 
 vertices exist in a drawing of closed curves, then the number of odd vertices 
 is even. If all vertices are even, then the given drawing consists only of 
 
173 
 
 (a) 
 
 (b) 
 
 Figure 7.8. Elimination of Open Curves for the 
 Decomposition of an Overlapped Drawing 
 
IT 1 * 
 
 closed curves, and some vertices are common to several good figures in the 
 drawing. However, if the drawing contains some odd vertices, then there are 
 some edges or arcs which are used to describe more than one figure in 
 the drawing. 
 
 Whether a pair of arcs or edges has a good continuation 
 at a vertex may he determined by some heuristics . In Figure J .9 
 some of the vertices used to extract a good continuation are illustrated. 
 In the figure, dotted lines indicate a pair of edges or arcs which are 
 assumed to have good continuation at the vertex. 
 
 In Figure T-10 an example of a meaningless figure is shown, and 
 this is used as a sample to show how decomposition of the figure into simpler 
 shapes is carried out. The node sequence N which describes this complex 
 figure is as follows: 
 
 v 1 (v 2 (b,l) 
 
 v 2 (v u (f,0) 
 v 3 (v u (h,U) 
 v u (v T (k,8) 
 v 5 (v 3 (g,3) 
 v 6 (v 8 (p,9) 
 v 7 (v u (l,8) 
 
 v 2 (a,0) 
 
 v c (c,3) v (d,2)) 
 
 v 1 (a,l), v 1 (b,2), v (e,U)) 
 
 v 2 (e,2), v 1 (d,3), v 5 (g,5), v 6 (i,6), VgQ.T)) 
 
 v (1,0), v 2 (f,M, v 3 (h,T)) 
 
 v 1 (c,0), v (q,10), v 6 (m,5)) 
 
 v T (n,T), 
 
 v (j,6), v.(i,5), 
 
 v (m,10), v (r,ll)) 
 
 v^ds.T), v 6 (n,9), v g (o,0)) 
 
 v 8 (v 10 (s,12), v 1Q (t,0), v T (o,9), v 6 (p,ll)) 
 v 9 (v 10 (u,ll), v 6 (r,10), v 5 (q,0), v 1Q (v,13)) 
 
 v 10 (v 8 (t,12), v 8 (s,ll), v (u,13), v 9 (v,0)). 
 
175 
 
 / 
 
 r 77 
 
 rv sj 7 
 
 Figure 7.9. Vertices Used to Extract Good Continuation 
 
176 
 
 Figure 7«10. Meaningless Drawing 
 
177 
 
 From this node sequence we derive a sequence called an "arc-sequence." 
 This sequence tells us which arcs meet at each vertex. Thus, by applying 
 good continuation of Figure 7-9 for each vertex, we can find pairs of good 
 continuing arcs. These two sequences are shown below: 
 
 Arc-Sequence A Good Continuation 
 
 v (b,a,c,d) v 1 (a:d, b:c) 
 
 v 2 (f,a,b,e) v 2 (f:a, b:e) 
 
 v (h,e,d,g,i,j) v 3 (h:g, e:i, d:j) 
 
 v u (k,l,f,h) v^krf, l:h) 
 
 v 5 (g,c,q,m) v (g:m, c:q) 
 
 Vg(p,n,j,i,m,r) v^nrm, j:r, i:p) 
 
 v (l,k,n,o) v^(l:n, k:o) 
 
 v Q (s,t,o,p) Vg(o:s, p:t) 
 
 v (u,r,q,v) v 9 (r:v, q:u) 
 
 v 1Q (t,s,u,v) v 1Q (t:u, s:v) 
 
 To determine good closed curves, starting at v we select a pair of arcs 
 a:d and look for a vertex which contains "a" or "d". In this example, v p 
 has a pair f :a, so we move to v~. Repeat this procedure as many times as 
 possible by visiting each vertex at most once. If, at the end, the vertex 
 has "d", then the sequence of arcs visited by this method is closed; 
 otherwise, it is open. 
 
 By the above approach we get the following three sequences of arcs, 
 each of which is a good closed curve or shape. 
 
 1. a,f ,k,o,s,v,r,j ,d, 
 
 2. b,e,i,p,t,u,q,c, 
 
 3. h,l,n,m,g. 
 
178 
 8. SUGGESTED MODEL FOR THE NAMING PROBLEM 
 
 8.1 Introduction 
 
 As a summary of the algorithms for high-level picture processing 
 described in earlier chapters, we propose a model for the solution of the 
 "naming" problems. A naming problem is specified by two input pictures: 
 a sample picture where the objects in the picture are to be named at the 
 completion of the analysis, and a text picture which describes an abstract 
 drawing of the sample picture being considered where various regions have 
 assigned names. The problem is to name the objects in the sample picture 
 using name information from the text picture. 
 
 A very simple data example for this naming problem is illustrated 
 in Figure 8.1 where (a) is a sample picture and (b ) is the 
 
 corresponding text picture. Using the name information in the text picture, 
 we can name the people in the sample picture "Dad," "Me," and "Bob," and, 
 consequently, we derive the result illustrated in (c). This kind of 
 naming problem is frequently solved in the human visual world, and, more- 
 over, the human solves it almost instantly. Our objective here is to propose 
 a model for a solution of general naming problems as an extended application 
 of the studies which have been done in this thesis. The model which 
 will be described in the following section is a generalization of the model 
 for complex shape extraction, which has been discussed in Section J.X. 
 
179 
 
 BOB 
 
 (a) Sample Picture 
 
 (b) Text 
 
 (c) Output of Naming Problem 
 
 Figure 8.1. A Simple Example f the Naming Problem 
 
180 
 
 8.2 A Model Description 
 
 In developing a model for the solution of naming problems, the 
 most significant and necessary problem which should be solved is the shape- 
 identification problem. Without solving this, the solution for the naming 
 problem cannot be attained. We ignore the texture or color or material 
 information of objects constructed, and we assume only geometrical shape 
 as the clue of identifying objects. Once the possible individual shape 
 indentifications have been computed, the next step is to consider those 
 shapes in the sample picture whose structure matches the structure of the 
 text picture. If both structure and individual shape identification have 
 been done successfully, then the final remaining task is to give a name to 
 each object in the sample picture according to the individual shape matchings 
 
 In Figure 8.2 we show a simplified block diagram of our model for 
 the solution of the naming problem. "SAMPLE" is given as a gray-level 
 picture and "TEXT" is given as a black-and-white noiseless clear picture or 
 as a higher level representation with name information. In this diagram 
 only program flows (no information flows) are shown. We will describe the 
 context or the tasks of each block as follows : 
 
 1. First Level Analysis 
 
 This block is applied only for a given sample picture to 
 obtain the same information level as a text picture. Here, most of the low- 
 level picture analyses are carried out, for example: 
 
 (i) scanning and digitizing the input picture, 
 (ii) differentiation of the digitized picture, 
 
 (iii) transformation into a black-and-white picture. 
 
TEXT 
 
 CO 
 
 o 
 
 Si 
 
 o 
 o 
 
 I 
 
 o 
 
 M 
 Eh 
 H 
 CO 
 O 
 
 8 
 
 181 
 
 SAMPLE 
 
 (1) FIRST LEVEL ANALYSIS 
 (ARRAY LEVEL) 
 
 (2) SECOND LEVEL ANALYSIS 
 (REPRESENTATION) 
 
 (3) THIRD LEVEL ANALYSIS 
 (STRUCTURAL LEVEL) 
 
 (k) INDIVIDUAL MAPPING 
 
 (5) ASSIGNMENT 
 
 (6) STRUCTURAL MATCHING 
 
 (7) NAMING 
 
 o 
 
 EH 
 
 K 
 
 O 
 
 3 
 
 o 
 
 M 
 EH 
 
 O 
 
 w 
 
 CO 
 
 TERMINATE 
 
 Figure 8.2. A Model for a Solution of Naming Probl 
 
 em 
 
182 
 
 (iv) black-boundary finding and the center of the 
 gravity of each region, and 
 (v) contour smoothing (ref. 2.3). 
 All of these are described in Section 2.3. We will use ref. to show the 
 corresponding reference in this thesis. At the end of these analyses, clear 
 boundary representations of each object in the sample picture are obtained. 
 
 2. Second Level Analysis 
 
 This block is provided to analyze each region of two pictures 
 and to extract characteristics regarding the geometrical shape of regions. 
 This includes: 
 
 (i) analysis of characteristic points (ref. 3.5), 
 (ii) polygonal and pattern sequence representation 
 (ref. 3.1, 3.2, and 3.3), 
 (iii) local and global feature extraction (ref. Chapter h) , 
 (iv) contour abstraction (ref. 3-3-5) 5 and 
 (v) node/angle representation (ref. 6.1, 7.2, and T«3). 
 
 3. Third Level Analysis 
 
 Thus far, individual regions of pictures are analyzed, and at 
 this level the structure of pictures is determined and graph representations 
 are obtained. This analysis completes all possible processing of both 
 samples and text pictures to provide necessary information for the solution 
 of the naming problem. This final level analysis includes: 
 (i) skeletonization of regions (ref. 3.*+.l), 
 (ii) neck/tail decomposition or composition (ref. 3.^.2), 
 (iii) derivation of containment trees (ref. 6.2.1), 
 (iv) derivation of neighborhood relations (ref. 6.2.2), and 
 (v) the construction of relation graphs (ref. 6.2 and 6.3). 
 
183 
 
 k. Individual Mapping 
 
 This block is provided to find for each region of the sample 
 picture some possible corresponding regions in the text picture. This is 
 done by first detecting an odd shape in the sample picture, and for the 
 detected odd shape we find some similar shapes in the text picture. If 
 reorientation of the sample picture is required to attain possible matching 
 between regions in two pictures, then such operations will be done and then 
 the process goes back to the second level analysis for appropriate change of 
 analyzed data. For the remainder of regions in the sample picture, we again 
 find the odd shape in it and then find several shapes of the text picture 
 similar to it. We repeat this process as many times as necessary to complete 
 the assignment of each shape of the sample picture to several shapes of the 
 text picture. In this process it is also necessary to apply the concepts 
 of extraction of embedded figures (ref. 1 .1 and 7.2) and extraction of 
 good shapes (ref. 7-3) • This block includes: 
 
 (i) odd/similar shape detection and reorientation 
 (ref. Chapter 5), and 
 (ii) extraction of embedded complex regions (ref. 7^1)) 
 especially the function $ (one -to-many mapping) 
 5. Assignment 
 
 For each region of the sample picture, several regions of 
 the text picture are selected by the previous sub-task: Individual Mapping . 
 This sub-task is provided to compute a one-to-one mapping, i.e., for each 
 region of the sample picture it assigns a region of the text picture where 
 such a region is included in the set of regions derived by the previous 
 one-to-many mapping. A good assignment is obtained first by individual 
 assignment, and then the whole assignment can be improved further by the 
 
181* 
 
 considering certain optimization functions (ref. 7-1) . At the end, 
 
 one of the "best individual matchings is obtained, however, the structure of 
 
 one picture may not be preserved in the other. 
 
 6. Structural Matching 
 
 This determines whether the structure of the sample picture 
 is preserved between these regions of the text picture which are assigned 
 or selected for matching. In other words, this determines whether one graph 
 can be a subgraph of another based upon relations between nodes. To 
 attain a possible matching, we not only need another selection of regions for 
 an assignment, but also the operation of composition of several regions 
 of one graph or the decomposition of a region of the other graph. That is, some 
 graph transformations (ref. 6.3) are required to attain possible matching. 
 
 7. Nairn' ng 
 
 After obtaining individual regional matchings and structural 
 matching, the final step is to give names for regions of the sample picture 
 and output the result. Other possible matchings are also considered 
 if requested. 
 
 8.3 Potential Application 
 
 Not much research has been done on this kind of pattern recognition 
 problem. However, the solution of this naming problem becomes significant 
 in the fields of remote manipulation, medical diagnosis and 
 research, elementary school education, etc. In remote manipulation, the 
 world may be given as a textbook (two-dimensional projections of three- 
 dimensional visual space) , and by identifying the objects in the visual world 
 using the information described in the textbook, the system may 
 further execute the assigned task in the world. In the medical field 
 
185 
 
 we have good textbooks for organs, cells, etc., however, it is not an easy- 
 task to identify between, for example, an X-rayed picture of the human brain 
 and the textbook. Thus, it is very important to have such naming systems 
 to ease the burden and to give some aid to doctors or to students in medical 
 training. It is also possible to apply this naming system to 
 elementary education as well as to the study of human shape/form perception. 
 Disadvantaged or young students may use the system to learn language with 
 shape and some other information. 
 
 Figure 8.3 shows two other complex examples of textbooks which may 
 be used for the naming problem, (a) shows an embryo of a pig and (b) shows 
 an embryo of a chick. These examples are very complex, however, we hope that 
 we can attain solutions of naming problems down to this level . 
 
186 
 
 V 
 
 u 
 
 < 
 
 & 
 
 ori 
 
 •p 
 C 
 
 0) 
 
 o 
 H 
 
 > 
 
 CD 
 
 Q 
 
 S 
 O 
 U 
 
 <H 
 
 TJ 
 
 0) 
 « 
 
187 
 
 9. SUMMARY AND CONCLUSION 
 
 9.1 Discussion of Results 
 
 This work may be divided into two groups: low-level picture 
 analyses and high-level picture analyses of complex pictures. 
 
 The low- level picture analysis includes topological analyses 
 of curves, regions, shape representations, and global and local feature 
 extractions. These led to work on a problem in shape perception: odd 
 shape detection problems. 
 
 The high-level picture analysis includes graph representation 
 of the structure of pictures: relation graphs and graph structure 
 transformations, and some shape matching-in problems. As a potential 
 application of both low and high-level picture processings, we suggested 
 a model for the solution of naming problems. 
 
 In this work we introduce some studies which have been done by 
 psychologists on human shape/form perception and show how these studies can 
 be used for mechanical form perception. We also discuss how form perception 
 in psychology can be modeled and studied with computers. It is hoped that 
 this work will provide a method for describing figures efficiently so that 
 one can construct a computer system in which the system behavior is similar 
 to that of an average human. It is also hoped that this work will give 
 some clues in the study of human shape perception, especially in the study 
 of how and where the human pays attention to shapes. Thus, in the future a 
 computer system might be used in place of psychologists for various 
 psychological problems. 
 
188 
 
 9.2 Suggestions for Further Research 
 
 The study of shape perception "by computers involves the following 
 areas : 
 
 Shape representations 
 
 Feature extraction 
 
 Graph structural representation of complex 
 pictures, and graph, transformations 
 
 Heuristics on shape identification 
 Feature extraction is dependent upon shape representations. An appropriate 
 selection of features which describe shapes adequately is one of the most 
 significant problems which should be studied extensively. Local features — 
 lobes, cavities, roughness, etc. — should be extracted and should be combined 
 with global features for shape description. 
 
 Graph structure introduced in this work should be studied more on 
 various types of tasks and should be combined with Schwebel's work (1972). 
 A language, SOL, for graph structure analyses is being implemented at the 
 University of Illinois. 
 
 An extension of this work is to complete the overall implementation 
 of algorithms described in such a way that one can easily combine algorithms 
 to make models for various tasks. The model for the naming problem described 
 in Chapter 8 should be implemented and examined for various examples. 
 
189 
 
 BIBLIOGRAPHY 
 
 Arnoult, M. D. , "Shape Discrimination as a Function of the Angular Orientation 
 of the Stimuli," J. Exp. Psychol. , vol. kf 9 No. 5, 195 1 *. 
 
 Attneave, F. , "Some Informational Aspects of Visual Perception," Psychological 
 Review , vol. 6l, No. 3, 195 1 *, PP» 183-193. 
 
 Attneave, F. and Arnoult, M. D. , "The Quantitative Study of Shape and Pattern 
 Perception," Psychological Bulletin , vol D 53, No. 6, November 1956, 
 pp. U52-U71. 
 
 Baird, J. C. , Psychological Analysis of Visual Space , Pergamon Press, 
 Elms ford, New York, 1970. 
 
 Bartley, S. H. , Principle of Perception , Harper & Row Publishers, New York, 
 1969, pp. 21U-2U0. 
 
 Blum, H., "A Transformation for Extracting New Descriptors of Shape," 
 
 Models for the Perception of Speech and Visual Forms , M.I.T. Press, 
 Cambridge, Mass., 1967, pp. 362-380. 
 
 Blum, H. , "Biological Shape and Visual Scene," NMSB.FIGINC , July 1971. 
 
 Bobbitt, J. M. "An Experimental Study of the Phenomenon of Closure as a 
 Threshold Function," J. Exp. Psychol. , vol. 30, 19^2, pp. 273-29U. 
 
 Brice, C. R. and Fennema, C. L. , "Scene Analysis Using Regions," SRI Projects, 
 April 1970. 
 
 Brown, D c R. and Owen, D. H. , "The Metrics of Visual Form: Methodological 
 Dyspepsia," Psychological Bulletin , vol. 68, No. U, 1967, pp. 2^3-259. 
 
 Buneman, 0. P., "A Grammar for the Topological Analysis of Plane Figures," 
 
 Machine Intelligence 5 , ed. by Meltzer and Michie, Edinburgh University 
 Press, 1970. 
 
 Cairns, S. S., Introductory Topology , The Ronald Press Company, New York, 196l, 
 
 Calvert, T. W. , "Nonorthogonal Projections for Feature Extraction in Pattern 
 Recognition," IEEE Trans, on Computers , May 1970, pp. kkl-k32. 
 
 Casperson, R. C, "The Visual Discrimination of Geometric Forms," J. Exp. 
 Psychol. , vol. U0, 1950, pp. 668-681. 
 
 Cohen, G. , "Pattern Recognition: Difference Between Matching Patterns to 
 
 Patterns and Matching Descriptions to Patterns," Journal of Exp. Psychol- 
 ogy , vol. 82, No. 3, 1969, PP. k2 r J-h3h. 
 
 Deutsch, J. A., "A Theory of Shape Recognition," British Journal of Psychol- 
 ogy , vol. U6, 1955, PP. 30-37. 
 
 Deutsch, J. A., "A System for Shape Recognition," Psychological Review , vol. 
 69, No. 6, 1962, pp. U92-500. 
 
190 
 
 Dodwell, P. C, "Coding and Learning in Shape Discrimination," Psychological 
 Review , vol. 68, 196l, pp. 373-382. 
 
 Duda R. 0. and Hart, P. E., "Experiments in Scene Analysis," Artificial 
 
 Intelligence Group, Technical Note 20, SRI Project 8259, January 1970. 
 
 Eastman, C. M. , "Representation for Space Planning," CACM, vol. 13, No. U, 
 April 1970, pp. 2U2-250. 
 
 Fitts, P. M. and Leonard, J. A., "A Stimulus Correlates of Visual Pattern 
 Recognition," Ohio State University, Columbus, Ohio, 1957- 
 
 Freeman, H., "On the Encoding of Arbitrary Geometric Configurations," IRE. 
 Trans, on Electronic Computers , vol. EC-10 , June 1961 , pp. 260-268. 
 
 Freeman, H. and Glass, M. G. , "On the Quantization of Line Drawing Data," 
 IEEE Transactions , SSC-5, No. 1, 1969, PP- 70-79- 
 
 French, R. S., "The Pattern Recognition in the Presence of Visual Noise," 
 J I Exp. Psychol. , vol. kl 9 195^, PP- 27-31. 
 
 Gibson, J. J., The Perception of the Visual World , Houghton Mifflin Company, 
 Boston, 1950. 
 
 Gibson, J. J., "What is a Form," Psychol . Rev, , vol. 58, 1951, pp. U03-U12. 
 Gregory, R. L , The Intelligent Eye , McGraw-Hill Book Company, 1970. 
 
 Guzman, A., "Decomposition of Scenes into Bodies," AFIPS Conference Proc , 
 vol. 33, Part One, December 1968, pp. 291-304. 
 
 Hake, H. W. , "Form Discrimination and The Invariance of Form," WADC Technical 
 Report 57-621, October 1959, PP» 60-79. 
 
 Harman, L. D and Knowlton, K. C. , "Picture Processing by Computer," Science , 
 volo 16U, No. 3875, April 1969. 
 
 Jones, L. E. , "Pattern Recognition In Ciological and Technical Systems," 
 
 Department of Computer Science Seminar, University of Illinois, Urbana, 
 November 17, 1971. 
 
 Julesz, B. , "Visual Pattern Discrimination," IRE Trans. Inform. Theory , vol. 
 IT-8, February 1962, pp. 84-92. '""" ~ ~ '" -~'~" "~" 
 
 Koffka, K. , "Points and Lines as Stimuli," from Principles of Gestalt Psychol- 
 ogy , Harcourt Brace, Inc., New York, 1935. 
 
 Lyusternik, L. A., Convex Figures and Polyhedra , translated and adopted from 
 the first Russian edition (1956) by Donald L. Barnett : D. C. Heath 
 and Company, Boston, 1966. 
 
 McCormick, B. H. "Syntax-Directed Recognition of Pictures: A New Theoretical 
 Model," University of Illinois at Urbana-Champaign, November 1967. 
 
191 
 
 Maruyama, K. , "A Procedure for Detecting Intersections and Its Application," 
 DCS Report No. U^9 , Department of Computer Science, University of Illi- 
 nois, May 1971 (part appears in the International Journal of Computer and 
 Information Sciences , Issue No. 3), (a). 
 
 Maruyama, K. , "An Approximation Method for Solving the Sofa Problem," DCS 
 
 Report No. U89 » Department of Computer Science, University of Illinois, 
 October 1971 (also appears in the International Journal of Computer and 
 Information Sciences), (b). 
 
 Maruyama, K. , "A Problem in Form Perception: ODD Shape Detection," DCS Report 
 No. U90, Department of Computer Science, University of Illinois, December 
 1971, (c). 
 
 Mayeda, W. "Graph Theory and Its Application," CS/EE HlU Lecture Note, Univer- 
 sity of Illinois, 1969. 
 
 Montanari , G. U. , "Continuous Skeletons from Digitized Images," Journal of the 
 ACM , vo. 16, No. h, October 1969 , pp. 53 i +-5 I +9. 
 
 Montanari, G. U. , "A Note on Minimal Length Polygonal Approximation to a Digi- 
 tized Contour," Comm. of the ACM , vol. 13, No. 1, January 1970, pp. ^1- 
 hi. 
 
 Mowatt, M. H. , "Configurational Properties Considered 'Good' by Naive Sub- 
 jects," Amer. J. Psychol. , vol. 53, 19^0, pp. U6-69. 
 
 Pfaltz, J. L. and Rosenfeld, A., "Computer Representation of Planar Regions by 
 Their Skeletons," Comm. of the ACM , vol. 10, No. 2, February 1967 , 
 pp. 119-125. 
 
 Rosenfeld, A., Picture Processing by Computer , Academic Press, New York, 1969. 
 
 Rosenfeld, A., "Connectivity in Digital Pictures," Journal of the ACM , vol. 17, 
 No. 1, January 1970, pp. 1U6-160. 
 
 Rosenfeld, A. and Pfaltz, J. L. , "Sequential Operations in Digital Picture 
 
 Processing," Journal of the ACM , vol. 13, No. U, October 1966 , pp. U71- 
 k9k. 
 
 Schwebel, J. C, "Graph Transformation for Composite Formation," DCS Report 
 No. 368, University of Illinois, Champaign-Urbana, December 1969. 
 
 Schvebel, J. C, "A Graph Transformation Model for Picture Processing," Ph.D. 
 Thesis, Report No. 5lU, Department of Computer Science, University of 
 Illinois, 1972. 
 
 Schwebel, J. C and McCormick, B. H. , "Consistent Properties of Composite 
 
 Formation Under a Binary Relation," Information Science, vol. 2, No. 2, 
 April 1970. 
 
 Shepard, R. N. , "The Analysis of Proximities: Multidimensional Scaling With 
 
 an'unknown Distance Function I," Psychometrika , vol. 27, No. 2, June 1962, 
 
 pp. 125-lUO. 
 
192 
 
 Shepard, R. N. and Metzler, J., "Mental Rotation of Three-Dimensional Objects," 
 Science , vol. 171, No. 3972, February 1971, pp. 701-703. 
 
 Stef anelli , R. and Rosenfeld, A., "Some Parallel Thinning Algorithms for Digi 
 tal Pictures," Journal of the ACM , vol. 18, No. 2, April 1972, pp. 255- 
 26U. 
 
 Tondeur, P., Topology and Geometry , Lecture Note for Math. 323, University of 
 Illinois, 1970. 
 
 Uhr, L. (Ed.), Pattern Recognition , New York, Wiley, 1966. 
 
 Whitney, D. E. , "State Space Models of Remote Manipulation Tasks," Ph.D. 
 Thesis, M.I.T. , January 1968. 
 
 Zahn, C. T. , "Graph-Theoretic Methods for Detecting and Describing Gestalt 
 Clusters," IEEE Trans, on Computers , vol. C-20 , No. 1, January 1971, 
 pp. 68-86. 
 
 Zusne , L. , Visual Perception of Form , Academic Press, New York, 1970. 
 
193 
 
 APPENDIX A 
 LIST OF SHAPE DESCRIPTION SUBROUTINES 
 
19 U 
 
 /* TH p LIST INC. CF THE PROCfCURES PPCV IOEC„.*/. 
 TNI TI Al I7&TIPN 
 
 JWAKFUP 
 
 USAGF.L ._. CALL _$W AKFUP ; 
 
 /* THIS PROCEDURE INITIALIZES NECESSARY VARIABLES, AND */ 
 /♦ TH IS SH OULD AP PEAR AT TFE B.EGINLNG OF EACH PROGRAM */ 
 
 . LO W IFV FL FUNCTICNS 
 
 SGFTP(TYPF) 
 USAGF: P = $GFTP(NL 
 
 /* THIS FUNCTION GETS 128 BY 128 GRAY PICTURE, */ 
 
 /* TYP F =__0__ FRCP CARt READER */ 
 
 /* = 1 FROM DISC */ 
 JJt = ? FROM TAPF *J_ 
 
 JPR_LNTPLARG.JYP£L__ 
 
 USAGE: CALL $PR INTP < P ,N ) : 
 
 /* THIS £POCEnURE PUTS 128 BY 128 PICTURE NAMED P, */ 
 
 /* TYPF = INTO PRINTER */ 
 /* =_J INTQ DISC */_ 
 
 /* =2 INTO TAPE */ 
 
 $DIFF(ARG) 
 
 L1SA£E: £!£_„= _$EIFF( PL; 
 
 /* THIS FUNCTION DIFFERENTIATES 128 BY 128 PICTURE NAMED P*/ 
 
 SWINDOWLARG.I ,NI «J.NJi 
 liSAiiJEX__PLC =_$kINDQW(F.l,NJ, J_,NJJ; 
 
 /* THIS FUNCTION SELECTS WINDOW OF NI BY NJ AT ELEMENT */ 
 
 1* LLtdl^ ....__ _. */ 
 
 JNF OA( ARG) 
 
 USAGF: PTC = $f\FGA(P>: 
 
 /» THIS FlNCTTON GIVES. INVERSE PICTURE HE P +£_ 
 
 %miAmi±£R&2i .:. ....._ 
 
 USAGE: PIC = $0R(P1,P2) : 
 
 /» THIS FUNCTION GIVES OR OF TWO P ICTURES PI AND P2 */_ 
 
 „_JANJli^AGJL*A_RG2J__ ... 
 
 USAGF: PIC = $AND(P1,P2): 
 /♦ THIS FUNCTION GIVES AND OF TWO PICTURE? PI AND P2 */_ 
 
 ARC? ) 
 
 LSAGF: PIC = $SUB(P1,P2): 
 X*_ThLLS. EliNCTIQJM GI VES $.ANC( PI , $NEGAJ $AfiDJ PLLi.P2L.il */_ 
 
195 
 
 $FX0R(ARG1, 
 
 APG2) 
 
 
 
 
 
 
 
 USAGE: 
 /* THIS 
 
 PIC = $EXOR(Pl 
 FUNCTION GIVES 
 
 ,P2>; 
 
 $OR($AND(Pl,$NEGA(P2l) ♦ 
 
 
 
 */ 
 
 /* 
 
 SAND(SNF.GA(P1).P2) ) ) 
 
 
 
 
 */ 
 
 $BNDPY(ARG. 
 USAGE: 
 
 TYPF) 
 PIC = 
 
 SBNDRYiP 
 
 iNl; 
 
 
 
 
 
 /* 
 /* 
 
 THIS FUNCTION FINOS 
 MATRIX ACCORDING TO 
 
 SMOOTHED BOUNCARY OF THE 
 THE SPECIFIED TYPE. 
 
 GIVEN 
 
 BIT 
 
 */ 
 */ 
 
 /* 
 /* 
 
 TYPE 
 
 = 
 
 = 1 
 
 BLACK BOUNCARY 
 WHITE BOUNDARY 
 
 
 
 
 */ 
 */ 
 
 $#OFR(ARG.TYPE) 
 
 USAGE: #TFR = $#OFR(PtN): 
 
 /* THIS FUNCTION COUNTS THE NUMBER OF REGIONS OF PICTURE 
 
 */ 
 
 /* P ACCORDING TO THE SPECIFIED TYPE. 
 
 /* TYPF = BOTH OPEN AND CLOSED REGIONS 
 
 */ 
 */ 
 
 /* = 1 CNLY CLOSEC REGICNS 
 
 */ 
 
 $NAMFR(ARG1. I.ARG2) 
 
 USAGE: CALL SNAHER ( P, I* • CEREBRUM* ) : 
 
 /* THIS PROCEDURE GIVES NAME 'CEREBRUM* TO THE I-TH REGION*/ 
 /* OF THE PICTURE P. */ 
 
 $G( ARGl,ARG2.X,Y) 
 
 USAGF: CAtl $G( P, 'CER EBRUM' , X, Y ) ; 
 
 /* THIS PROCEDURE GIVES TFE CENTER OF GRAVITY (X,Y) OF 
 
 */ 
 
 /* RFGTON NAMED 'CEREBRUM' OF PICTURE P. 
 
 */ 
 
 ilS GINJARG1 .ARG2) 
 
 USAGE: IF SIS GIN( P, • CEREBRUM' ) THEN - - - - 
 
 /* THIS FUNCTION GIVES YES/NO ANSWER DEPENDING UPON 
 /* WHETHFR THE CENTER CF GRAVITY OF THE REGION NANED 
 
 */ 
 */ 
 
 /* 'CEREBRUM' OF PICTURE P IS IN THE REGION OR NOT. */ 
 
 STS AS(ARG1 »ARG2> 
 
 USAGE: IF SIS AS (P. 'CEREBRUM' ) THEN - - - - 
 
 /* THIS FUNCTION GIUES YES/NO ANSWER CEPENDING UPON */ 
 /♦ WHETHER THE RFGION NAMED 'CEREBRUM' OF PICTURE P IS ♦/ 
 /* ANGULARLY SIMPLE OR NCT. */ 
 
 SASPf ARG1 .ARG2.X.Y) 
 
 USAGE: CALL $ASP(P, 'CEREBRUM* ,X,Y) 
 
 /* THIS PROCEDURE GIVES THE ANGULARLY SIMPLE POINT (X,Y) */ 
 /* OF THF REGION NAMED 'CEREBRUM' OF THE PICTURE P. */_ 
 
196 
 
 SIS ATIARGJ..APG2.X.Y) 
 
 USAGE: TF SIS AT(P, «CERFBRUM» ,X,Y) THEN - - - - 
 
 / » WHFTHFR ThP CIVFN PTINT (X.Y) I S A NCUIARY SIMP1F FOR __*/_ 
 
 /* THE RFGION NAMED •CEREBRUM' OF PICTURE P, OR NOT. */ 
 
 $FNCnOE( flRG.K.TYPE) 
 
 _ USANCE: CALL ENCODE I Pil^TYPEJ.;. 
 
 /* THIS PROCEDURE ENCCCES EACH REGION OF PICTURE P INTO */ 
 
 /.»-J>F G I VF N T Y_P E_n£_£CRil_kH£lS£ . £J RCUI AR I T V IS N %£_ 
 
 /* TYPE = PRIME FNCfDINGI INNER HOST INTERSECTION) */ 
 
 ./* . _ .. = 1_ .DUAL ENCODING (.OUTER MOST IN TERSECT ION) */ 
 
 /* = ? EDGF FOLLOWING ENCODING */ 
 
 ? ND LEVFL FUNCTIONS 
 
 SF OF(ARG1 •ARG?«I) 
 
 USAGES _L_XfcL FEATURE = SF_OF IP. LCEPJEBPUMA.^LLl 
 
 /* THIS FUNCTION RETURNS THE I..TH FEATURE VALUE OF THE */ 
 
 /* REGLCN_J^AMFD ! CEREER.UMJ OF .PICTURE P *7 
 
 */ 
 *L 
 
 
 /* I 
 
 = 1 MAXAMP 
 = ? MFANVPN 
 
 
 /* 
 
 = 3 DIVE* 
 = 4 Af T* 
 
 — 
 
 /* 
 
 /*— 
 
 /* 
 /* 
 
 = 5 VAXIMAL* 
 
 = 6. ...rCN-VEXVit 
 
 = 7 CCNCAVE# 
 = « n pnnps 
 
 
 /* 
 /* 
 
 = 9 PFRIMETFR 
 = 10 ARFA4 
 
 
 /* 
 
 /» — 
 
 /* 
 /* 
 
 =11 P AS 
 _= X2 CCUVE XI T.Y.. 
 = 13 SYMMETRY 
 = u nFn AS 
 
 
 /* 
 
 = 15 CCNVEXNESS 
 
 */ 
 */ 
 
 */ 
 
 JLL 
 
 */ 
 _*/ 
 */ 
 *Z 
 */ 
 JLL 
 
 */ 
 
 $PRTNTF(ARG1.ARG2) 
 U SAGF; CALL-SPJLI NTE <JU_» CERE E RUM* li 
 
 /* THIS PPOCEDURF PRINTS OUT FEATURE VALUES OF THE REGION */ 
 _ /♦ N A EFO TFRFR RH M ' O F PICTURE P^ . LE ARG? = SAI 1 THFN THIS*/ 
 
 /* PRINTS OUT FEATURE VALUES OF ALL REGIONS OF P. */ 
 
 $PRTNTR(ARGl .ARC?) 
 
 -USAGE: CALL. SPRINTJU P » 'CEREBRUM • } 
 
 /* THIS PROCEDURE PRINTS OUT THE SHAPE OF THE REGION */ 
 
 -/* NAMED -*CEP.E-BiUmi— OE- PICTUR E- -P— IF ARG2=1iAH — IHEN- THIS ♦/ 
 
 /* PRINTS OUT ALL REGICNS OF P */ 
 
197 
 
 SELIMR(ARG I.ARG2,ARG3) 
 
 USAGE: CALL $EL IMR < PI , 'CEREBRUM' , P2 ) : 
 
 /♦ THIS PROCEDURE ELIMINATES REGION NAMED 'CEREBRUM' *V 
 
 /* ERCM PICTURE PI ANC THE REMINDER IS NAMED AS P2. */ 
 
 $SKLTNZ(ARG1 .ARG2»N> 
 
 USAGE; CALL SSKLTNZ ( P !♦' CEREBRUM ' , N ) ; 
 
 /* THIS PROCEDURE SKELTONIZE THE REGION 'CEREBRUM' OF */ 
 /♦ PICTURE PI USING GIVEN INTEGER N. IE ARG2=$ALL THEN ♦/ 
 /* THE PROCEDURE SKELTCNIZE ALL REGICNS OF THE PICTURE. */ 
 
 *PRINTS(ARG1 ,ARG?1 
 
 USAGF: CA 1 L SPRINTS <P_1 , 'CEREBRUM' ); 
 
 /* THIS PROCEDERE PRINTS OUT THE SKELTON OF THE REGION */ 
 
 /♦ ■r.FRFBRUM* OF PICTURE P. IF AR,G 2 = *AL L THEN BUS */_ 
 
 /* PRINTS OUT SKELTCN CF ALL REGICNS OF THE PICTURE ♦/ 
 
 STAIL?(ARG1 .ARG2) 
 
 USAGE: I F $T AIL? ( Pt 'CER EBRU M •_ )_ THEN 
 
 /* THIS FUNCTION GIVES YES/NO ANSWER DEPENDING UPON */ 
 /♦ WHETHER THE REGION 'CEREBRUM' OF PICTURE P HAS A TAIL ♦/ 
 /* OR NOT. */ 
 
 $NECK?(ARG1 .ARG2) 
 
 USAGE: IF SNECK 7 CP. 'CER E BRUM' ) T H EN - 
 
 /* THIS FUNCTION GIVES YES/NC ANSWER DEPENDING UPON 
 
 /* WHFTHFP THE REGION 'CEREBRUM' OF PICTURE P HAS A NECK 
 
 */ 
 
 /* OR NOT. 
 
 
 */ 
 
 4DECO T1ARG1.ARG2) 
 
 USAGE: CALL SDECC T ( P 1. « CERE BRUM' ) : 
 
 /* THIS PROfFnURF DECOMPOSES THE REGION 
 
 OF PI BY TAIL 
 
 */ 
 
 /* PROPERTY 
 
 /* IF ARC? = (All THEN ALL REGICNS WILL 
 
 BE DECOMPOSED 
 
 */ 
 */ 
 
 *PFCf NIARG1.ARG?) 
 
 USAGE: CALL $DECC _N< P 1 ♦ • CEREERUM' ) ; 
 
 /* THIS PROCFDURE DECOMPOSES THE REGION 
 
 OF PI BY NECK 
 
 */ 
 
 /* PROPERTY 
 
 /* IF ARG? = 4ALL THEN ALL REGICNS WILL 
 
 BE DECOMPOSED 
 
 */ 
 
 
198 
 
 3 RD L EV-EL-FU&CT I ON fCR SPECIAL L-SAGE 
 
 JSFT BURC1 .ARG7.ARG3.TYPF ) 
 
 USAGF: CALl $SET D( PI , »CERE ERUM • ,P2,TYPE ) : 
 
 /♦ if arh? = tnrn sHAPF thfn the PRnfFnnRF sfts p_tari f »/ 
 
 /* FOR THE PURPOSE OF CDC SHAPE DETECTION. OTHERWISE */ 
 
 /A_IHJ_S -B P Of. EDUPJE S FT S CITABLE- FOR <; T M 1 1 AR SHAPF DETECTION*/ 
 
 /* THAT IS A SIMILAR REGICN TO 'CEREBRUM* OF »/ 
 
 /♦ PTCTIIR£.P1 IS rONSIPFRn ON P-2^. £i_ 
 
 /* TYPE =0 C INTERVAL */ 
 X* .=_J CJSPER_S10N +J- 
 
 tFVAl-FURGl . A T TEJVl ICh ) - 
 
 USAGE: CALL $E VAL, E < P .ATTENT ICN ) : 
 /* SLIM OF wm+rtl.J) FCR All. I . WHFRF H(I) TS THF tZ. 
 
 /* ORIGINALLY GIVEN WEIGHT FCR I_TH FEATURE FOR ATTENTION */ 
 / * =; AND TH E -I, TH M CCIFIEIL ..HE IG HT F OR ATTFNT I O N 5 1 */_ 
 
 PRO E RJ ARG^N^XY-P^E-l 
 
 USAGE: I 1ST = SORDER < P.N, TYPE) 
 /♦ THIS FUNCTION RFTURhS THE POINTER TO THF FOI I QMJ NG 4JL 
 
 /* LIST. THIS FIND FIRST N NUMBER OF MAXIMUM OR MINIMUM */ 
 /* RFGI-ONS FRCM-IHF-gJXT4l&F SL* *X- 
 
 /* JYPF = MAXIMUM */ 
 £* =_L R1&IMUM *J- 
 
 HSFt FCTM 1ST. U 
 
 USAGE: I_ TH REGION = SSELECT ( L I ST, I ) : 
 /♦ T hLLS- FL N C T lQN-JiEXURNS T hE -MAME-QE_TiiF U T H R FGION IN */_ 
 
 /* THF LIST. */ 
 
 $FRFFL (LIST) 
 USAGE: CAM tFBFFL-tLISTJ- 
 
 /* THIS PROCEDURE FREES ThE POINTER CALLED LIST */ 
 
 $ODO<ARG,K, ATTENTION) 
 USAGE-: CALL- SCCXU P.H.-ATXENJ ICNU- 
 
 /* THIS PPOCFDURF PICKS UP N NUMBER OF THE MOST ODD SHAPES*/ 
 
 /* FRCM TFE PICTURE P. IT IS HR ITTEN AS FPU OWS *JL 
 
 /* $ODD: PROCEDURE(P,N. ATTENTION); */ 
 
 -X* DC1 P £ilA&i2XU- VAAIKG^ *X 
 
 /* N FIXED BIN, */ 
 
 /Jl AJXENJI&N BXTtlli *X 
 
 /* CALL $SET„D( • • • . $ODD„SHAPE , P,l ) ; */ 
 
 /* CALL HEVAL^F(P.ATTFNTinN): *Z_ 
 
 /* LIST = SORDEP (P,N,0); »/ 
 
 ..- /*- ca_lsX_Tq-Nr..- JODDS-LI ) -= -SSELFCT.fl I ST, I ); END; *X 
 
 /* RETURN: */ 
 */ 
 
199 
 
 S SIM(ARG1.ARG2,ARG3»N,ATTEMICN ) 
 
 USAGE: CALL SSI M< PI . 'CEREBRUM* , P3. N,ATTENTI ON) ; 
 
 /» THIS PROCEDURE PICKS UP N NUMBER CF THE MOST SIMILAR ♦/ 
 
 /* SHAPES TO 'CEREBRUM OF PICTURE PI FROM P3. IT IS */ 
 
 /* WRITTEN AS FOL LOWS tL 
 
 /* $SIM: PROCEOURE<Pl,P2.P3tN,ATTENTICN); */ 
 
 /* OCL <P1»P2,P3) CHAR(20) VARING, */ 
 
 /* N BIN FIXED, */ 
 
 /♦ ATTENTICN ami); *L 
 
 /* CALL $SET..D(PlfP2*P3.1); */ 
 
 /♦ CALL SFVA L_F (P3.ATTE NTI O N); ♦/ 
 
 /* LIST $CRDER(P3.N,1): */ 
 
 / * D O T = l TO N: ISIM S(I) = $SELECT( L I ST, I) ; END; ♦/ 
 
 /* RETURN: */ 
 
 /♦ END JSIM: V 
 
 GRAPH LEVFL FUNCTION 
 
 >A LLOCG ( I ,ARG) 
 
 USAGF: CALL $ALLOCG( I .GNAME ) ; 
 
 /» THIS ALLOCATES A GR IPH STRUCTURE CF N NODES WHOSE NAME »/ 
 
 /* IS GNAME */ 
 
 SFRFFG(ARG) 
 
 USAGE: CALL $FREEG«GN AME ) ; 
 
 /* THIS FREES THE DATA STRUCTURE OF NAME GNAME */ 
 
 *TRANSG< ARG1.ARG2) 
 USAGF: CALL STRANSG(PtG) 
 
 /* THIS PROCEDURE TRANSFORMS THE PICTURE P INTO A RELATION*/ 
 /♦ GRAPH G. */ 
 
 /* RFLATICNS : SUBREG I CM CCNT A INMENT ) 
 
 /* NEIGHBCRHOCCICCNNECTED REGION) 
 
 */ 
 */ 
 
 /* NECK 
 /* TAIL 
 
 */ 
 */ 
 
 /* ABOVE / BELCW 
 /* RIGHT / LFFT 
 
 */ 
 */ 
 
 /* SURRCUNCS 
 
 */ 
 
 JSPL ITIARG1.ARG2. RELATION) 
 
 USAGE: CALt SSPLITI GNAME .NNAME.R ELATIGN ) ; 
 
 /* THIS SPLITS THE NODF NNAME CF GRAPH GNAME IF THE 
 /* RFLATICN HOLD 
 
 */ 
 */ 
 
 /* IF ARG2=$ALL THEN EACH NODE OF GNAME WHICH SATISFY 
 /* THF RFIATTflN IS SPLIT 
 
 */ 
 
 */ 
 
200 
 
 SM£RGEjLAJlGLL.A£G2 i ARG3,R£LATlCN) 
 
 USAGE: CAIL SMERGE (G. M . N2 • REL AT ION ) : 
 /♦ T HI S P F R GFS TW O N ODES Nl AND N? OF GRAPH G IF THF */_ 
 
 /* SPECIFIED RELATION IS HCLC */ 
 
 *PLTN?(ARG1 ,£PG2,ARC3) 
 
 JJ.SA£f-X _J.E._tRLT.N2.IG4.Nl_iM2J...i; RELA T « REL AT THEN - - - 
 
 /* THIS FUNCTION RETURNS RELATIONS BETWEEN TWO NODES Nl */ 
 L* A ND N ? O F GRAPJbL-G *Z_ 
 
 $ELIPJlLARG_l.AfiG2J 
 
 USAGF: CALL $EL IMN( G • NCDE ) ; 
 /♦ THLS_RPOCE DURE ELIMIN ATES .Jh£ NDD_E_ FROM GRAPH G *1_ 
 
 %F\ IMF(ARK1 . ARn?.ARr,3.RF< ATIPM 
 
 USAC-F: CALL $ELIME(G.N1«N2. RELATION); 
 
 JJL JCH_LS___P£nC EDUR^__ELXM1NAT_E_S_ R£ LA TICN CALLED RELATIONS *J_ 
 
 /* FROM TFF RELATION BETWEEN NODES Nl AND N2 OF GRAPH G */ 
 
 JINSTNl ARG1 ,APG2,ARG3.ARG4 ) 
 US AGE; CAll flNSTN(&*JVltN2.N2) 
 
 /* THIS PROCEDURE INSERTS NODE N3 BETWEEN NODES Nl AND N2 */ 
 
 1* QE_GRAPH CL._THJE-RELAJJCNS. BFT WFFN Nl AND N? ARF +±_ 
 
 /* PRESERVED FOR RELATIONS BETWEEN Nl AND N3 , AND N3 AND */ 
 V*_M2_. *7_ 
 
 -&.INSTFI ARG1 .ARG7. ARG3«.R-FI ATIOlL 
 
 USAGE: CALL $1 NSTE ( G . M .N2 *R EL ) : 
 
 ^J*_IH1£ _ELRnC_EXURE^IWS£RT S RELATIONS CALLE .. R EL B FTW FFN, *L 
 
 /* NODES M AND N2 CF GRAPE G */ 
 
 $MST(ARG1 .ARG2) 
 
 USA GF: CAM SPST ( G. MS1ALAMEU 
 
 /* THTS PROCEDURE FINDS A PIMPUM SPANNING TREE FROM GRAPH*/ 
 /J»_„G AMO-_LT-JS CALLED PSTtvAPf _ *JL 
 
 iSPTPSTI ARG.NJ 
 
 USAGF: CALL SSLTMST (G «N ) : 
 
 /» THIS PRnCFDMOF SPt T TS CRAPH R TNTO N + l DISJOINT TREES *J_ 
 
 /* BY ELIPINATING N NUMBER OF THE MOST WEIGHTED EDGES. */ 
 
 *PINFNC(ARG,N ) 
 L1SAGE^_ .CALL. IMlNFNCiG »M1 ; 
 
 /* THIS PPOCEDURE MINIMIZES CLUSTERING FUNCTION OF AT */ 
 /* MfSJ_ N CI ISTFRS . *L 
 
201 
 
 »CLSTR<ARG,N) 
 
 USAGE: CALl $CLSTR(G»N): 
 
 /» THIS PPOCEDUPE FINDS CLLSTERS OF AT MOST N . THIS CAN »/ 
 
 /* BE WRITTEN AS FOLLOWS */ 
 
 /♦ SCLSTP: P ROCEDU R EIG.N U ±/_ 
 
 /* CCL G CHARC20) VARING, */ 
 
 /* N BIN FIXEC; */ 
 
 /* CALL SMST(G.Gl): */ 
 
 /» CALL SSPTMST(Gl.N); */_ 
 
 /* CALL SMINFNCCGUK): */ 
 
 /* RETURN: */ 
 
 /* END SCLSTR: */ 
 iSUBG?(ARGl,A PG2,PELATI ONI 
 
 USAGE: IF $SUBG(G1.G2, RELATION) THEN 
 
 /♦ THIS FUN CTION FINDS WHETHER G2 CAN BE A SUBGRAPH OF »/ 
 /* Gl OR NOT ANC THEN FETUPNS YES/NO ANSWER. IF THERE ARE */ 
 /♦ MORE THAN CNF SUBGRAPHS IN Gl, THEN THIS FUNCTION FINDS*/ 
 
 /* A SUBGRAPH WHICH SATISFIES A PREDEFINED FUNCTION */ 
 /* IF THERE IS NO SUBGPAPH THEN THIS FINDS POSSIBLE */ 
 
 /* SUBGRAPHS IN Gl. */ 
 
 /♦ IF RELATION IS SPECIFIED THEN THE FUNCTION F I NO ONLY ♦/ 
 /* A SUBGRAPH IN G2 WHCSF RELATION IN Gl IS STRICTLY HOLD */ 
 
202 
 
 APPENDIX B 
 TWENTY SETS OF DATA STIMULI 
 
203 
 
 SET- 1 
 
 SET- 2 
 
 SET - 3 
 
 SET- 4 
 
20U 
 
 SET-5 
 
 SET- 6 
 
 SET- 7 
 
 SET-8 
 
 
205 
 
 SET- 9 
 
 SET - 10 
 
 SET-11 
 
 SET - 12 
 
206 
 
 SET -13 
 
 SET -14 
 
 SET -15 
 
 SET -16 
 
20? 
 
 SET -17 
 
 
 SET -18 
 
 SET -19 
 
 SET- 20 
 
208 
 
 VITA 
 
 Kiyoshi Maruyama was born in Nagano-ken, Japan, on December 5, 19*+5. 
 He graduated from Nihon University, Tokyo, Japan, with a Bachelor of Electri- 
 cal Engineering degree in 1968. He started his graduate work at Tohoku Univer- 
 sity, Sendai , Japan, in April 1968. 
 
 From 1968 to 1972, he was a research assistant in the Department of 
 Computer Science at the University of Illinois. He received his Master of 
 Science degree in Computer Science in 1970. 
 
 Mr. Maruyama is the author of papers entitled "A Procedure To Determine 
 Intersections Between Polyhedral Objects," "An Approximation Method For Solv- 
 ing The Sofa Problem," "On The Parallel Evaluation of Polynomials," and the 
 co-author of a paper entitled "The Parallel Evaluation of Arithmetic Expres- 
 sions Without Divisions." 
 
 Mr. Maruyama is a student member of the Association for Computing 
 Machinery. 
 
orm AEC-427 
 
 (6/68) 
 AECM 3201 
 
 U.S. ATOMIC ENERGY COMMISSION 
 
 UNIVERSITY-TYPE CONTRACTOR'S RECOMMENDATION FOR 
 
 DISPOSITION OF SCIENTIFIC AND TECHNICAL DOCUMENT 
 
 ( Sm Instruction* on Rtvtrm Sid* ) 
 
 I. AEC REPORT NO. 
 
 COO-2118-0038 
 
 2. TITLE 
 
 A STUDY OF VISUAL SHAPE PERCEPTION 
 
 3. TYPE OF DOCUMENT (Check one): 
 
 [33 a. Scientific and technical report 
 
 [~1 b. Conference paper not to be published in a journal: 
 
 Title of conference 
 
 Date of conference 
 
 Exact location of conference. 
 
 Sponsoring organization 
 
 □ c. Other (Specify) 
 
 *. RECOMMENDED ANNOUNCEMENT AND DISTRIBUTION (Check one): 
 
 yg a. AEC's normal announcement and distribution procedures may be followed. 
 
 I I b. Make available only within AEC and to AEC contractors and other U.S. Government agencies and their contractors. 
 "2 c. Make no announcement or distrubution. 
 
 >. REASON FOR RECOMMENDED RESTRICTIONS: 
 
 i. SUBMITTED BY: NAME AND POSITION (Please print or type) 
 
 Kiyoshi Maruyama 
 Research Assistant 
 
 Organization 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 6l801 
 
 Signature 
 
 * (/OR AEC USE ONI/ 
 
 Date 
 
 October 1972 
 
 'OR AEC USE ONLY 
 
 AEC CONTRACT ADMINISTRATOR'S COMMENTS, IF ANY, ON ABOVE ANNOUNCEMENT AND DISTRIBUTION 
 RECOMMENDATION: 
 
 PATENT CLEARANCE: 
 
 □ a. AEC patent clearance has been granted by responsible AEC patent group. 
 M b. Report has been sent to responsible AEC patent group for clearance. 
 I I c. Patent clearance not required. 
 
BLIOGRAPHIC DATA 
 1EET 
 
 1. Report No. 
 
 UIUCDCS-R-72-533 
 
 3. Recipient's Accession No. 
 
 Title and Subtitle 
 
 A STUDY OF VISUAL SHAPE PERCEPTION 
 
 5. Report Date 
 
 October 1972 
 
 Author(s ) 
 
 Kiyoshi Maruyama 
 
 8. Performing Organization Rept. 
 No. 
 
 Performing Organization Name and Address 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 6l801 
 
 10. Project/Task/Work Unit No. 
 
 US AEC AT( 11-1) 2118 
 
 11. Contract /Grant No. 
 
 1+6-26-15-303 
 
 !. Sponsoring Organization Name and Address 
 
 
 US AEC Chicago Operations Office 
 9800 South Cass Avenue 
 Argonne, Illinois 60k39 
 
 13. Type of Report & Period 
 Covered 
 
 Thesis Research 
 
 14. 
 
 i. Supplementary Notes 
 
 Abstracts 
 
 With the use of computers, this paper develops and studies 
 algorithms involved in visual shape perception. The main interest is to 
 solve the naming problem, i.e., the strategies by which map-like infor- 
 mation is transferred to objects of the scene under analysis. 
 
 Key Words and Document Analysis. 17a. Descriptors 
 
 Shape, form, perception, feature extraction, oddness , relation graph, 
 figure extraction, naming 
 
 b. Identifiers/Open-Ended Terms 
 
 c. COSATI Field/Group 
 
 .Availability Statement 
 
 unlimited distribution 
 
 RM NTIS-3S ( 10-70) 
 
 19. Security Class (This 
 Report) 
 
 UNCLASSIFIED 
 
 20. Security Class (This 
 Page 
 
 UNCLASSIFIED 
 
 21. No. of Pages 
 
 22. Price 
 
 USCOMM-DC 40329-P71 
 

 > 
 
^mti 
 
 
 
 
 H 
 
 ■ I ».v rili, 9 
 
 ■ MP 
 
 $ 
 
 
 
 •MkI 
 
 ■ 
 
 ■ 
 
 
 1 ■ 
 
 I 
 
 ■ 
 
 ■ 
 I 
 
 i'^Wfa 
 
 khRhH M 
 
 KlraflHBBIH IHHH 
 
 QiHtflBwa m MMHI • 
 
 » H 
 
 raHBHBfl 
 
 in ■ 11 
 
 . • h!£8bhHHHI^H